Author: Smin Rana

  • Swift 6 Deep Dive: Concurrency, Ownership, and Performance in Real Apps

    Table of Contents

    • Objective and Success Metrics
    • The Swift 6 Landscape
    • Structured Concurrency: async/await, Task, TaskGroup
    • Actor Isolation: Shared Mutable State (without over-actorizing)
    • Ownership and Sendable: Value Types, Isolation, Intent
    • SwiftUI and MainActor Boundaries
    • Backpressure and Cancellation
    • Networking Reliability: Retry, Timeout, Priority
    • Performance Budgets and Instruments
    • Storage: SwiftData/SQLite Patterns for Production
    • Navigation and State: Router Patterns in SwiftUI
    • Interop: Combine, AsyncStream, and Observable
    • Background Work: Reliability in a Post-Fetch World
    • Security and Privacy: Modern Defaults
    • Implementation Checklist
    • FAQs

    Objective and Success Metrics

    Build faster, safer, more reliable iOS/macOS apps using Swift 6’s concurrency and ownership model.

    • Crash-free sessions: improve stability and reduce heisenbugs
    • Performance: lower latency, fewer context switches, smoother UI
    • Predictability: explicit boundaries, lifetimes, and cancellation

    The Swift 6 Landscape

    Swift 6 strengthens structured concurrency and compiler diagnostics:

    • Stricter Sendable checks and improved isolation
    • Ownership semantics that reduce copies and clarify intent
    • Runtime scheduling and standard library performance work

    Structured Concurrency: async/await, Task, TaskGroup

    Use async/await for straightforward flows; TaskGroup for aggregation and coordinated cancellation.

    enum DashboardPart: Sendable {
        case profile(Profile)
        case stats(Stats)
        case feed([Item])
    }

    func loadDashboard(userID: String) async throws -> DashboardData {
        try await withThrowingTaskGroup(of: DashboardPart.self) { group in
            group.addTask { .profile(try await fetchProfile(userID)) }
            group.addTask { .stats(try await fetchStats(userID)) }
            group.addTask { .feed(try await fetchFeed(userID)) }

            var profile: Profile?
            var stats: Stats?
            var feed: [Item] = []

            // A typed child result avoids Any (which isn't Sendable) and runtime casts.
            for try await part in group {
                switch part {
                case .profile(let p): profile = p
                case .stats(let s): stats = s
                case .feed(let f): feed = f
                }
            }
            guard let p = profile, let s = stats else { throw AppError.missingData }
            return DashboardData(profile: p, stats: s, feed: feed)
        }
    }

    Guidelines

    • Keep groups shallow; prefer clarity over cleverness
    • Propagate errors; don’t swallow and log later
    • Prefer structured lifetimes; avoid Task.detached for app logic

    Actor Isolation: Shared Mutable State (without over-actorizing)

    Actorize the mutable core — caches, session state, in-memory stores — and keep computation outside.

    actor ImageCache {
        private var store: [URL: Image] = [:]
        nonisolated let capacity: Int // immutable state is safe to read without isolation

        init(capacity: Int = 100) { self.capacity = capacity }
        func put(_ url: URL, image: Image) { store[url] = image }
        func get(_ url: URL) -> Image? { store[url] }
    }

    Principles

    • Isolate mutation; permit pure reads via nonisolated when safe
    • Avoid actorizing everything; measure throughput and latency
    • Centralize cancellation in your pipeline
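
    On centralizing cancellation: a minimal sketch that checks for cancellation between pipeline stages (the stages array is an assumption for illustration):

    func runPipeline(_ stages: [() async throws -> Void]) async throws {
        for stage in stages {
            try Task.checkCancellation() // throws CancellationError if cancelled upstream
            try await stage()
        }
    }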

    Ownership and Sendable: Value Types, Isolation, Intent

    Sendable annotations and ownership semantics let the compiler catch cross-thread mutation bugs before they ship.

    struct Profile: Sendable { /* value-type fields only */ }
    final class Session: @unchecked Sendable { /* safe only if you document and enforce the locking rules */ }

    Rules

    • Prefer structs for cross-task data
    • Confine class mutation to actors or MainActor
    • Reach for @unchecked Sendable only with explicit rules

    SwiftUI and MainActor Boundaries

    UI mutations belong on MainActor; async work does not.

    @MainActor
    final class DashboardViewModel: ObservableObject {
        @Published private(set) var items: [Item] = []
        @Published private(set) var isLoading = false

        func refresh() {
            isLoading = true
            Task { // inherits MainActor isolation, so the state mutations below are safe
                defer { isLoading = false }
                do {
                    items = try await fetchItems()
                } catch {
                    // Surface the failure (banner, retry affordance); don't swallow it silently.
                }
            }
        }
    }

    Backpressure and Cancellation

    Unbounded fan-out burns battery and starves the UI. Add gates and cancellation.

    actor RequestGate {
        private let maxConcurrent: Int
        private var running = 0
        private var waiters: [CheckedContinuation<Void, Never>] = []

        init(maxConcurrent: Int = 6) { self.maxConcurrent = maxConcurrent }

        func acquire() async {
            if running < maxConcurrent { running += 1; return }
            await withCheckedContinuation { cont in waiters.append(cont) }
        }

        func release() {
            if waiters.isEmpty {
                running = max(0, running - 1)
            } else {
                // Hand the slot to the oldest waiter (FIFO) so no request starves.
                waiters.removeFirst().resume()
            }
        }
    }
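
    A small convenience (my addition, not part of the gate above) keeps acquire/release balanced even when the work throws; actor reentrancy lets gated work interleave while it awaits:

    extension RequestGate {
        func withSlot<T>(_ op: () async throws -> T) async rethrows -> T {
            await acquire()
            defer { release() } // runs even if op throws
            return try await op()
        }
    }

    Call sites then read as try await gate.withSlot { try await fetchJSON(from: url) }.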

    Checklist

    • Link task lifetimes to view lifetimes
    • Cancel on route changes; dedupe by ID
    • Coalesce repeated requests
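
    On linking task lifetimes to view lifetimes: SwiftUI's .task modifier cancels its work automatically when the view disappears or its id changes. A sketch, reusing fetchItems from earlier and assuming Item has id and title fields:

    import SwiftUI

    struct ResultsView: View {
        let query: String
        @State private var items: [Item] = []

        var body: some View {
            List(items, id: \.id) { Text($0.title) }
                // Cancelled on disappear; restarted when `query` changes.
                .task(id: query) {
                    if let fetched = try? await fetchItems() { items = fetched }
                }
        }
    }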

    Networking Reliability: Retry, Timeout, Priority

    Keep the network layer predictable.

    struct RequestOptions { let priority: TaskPriority; let timeout: TimeInterval; let retries: Int }
    
    func fetchJSON<T: Decodable>(from url: URL, options: RequestOptions = .init(priority: .medium, timeout: 15, retries: 2)) async throws -> T {
        // Apply options.priority when spawning the task that calls this function.
        return try await withTimeout(seconds: options.timeout) {
            try await retry(times: options.retries, jitter: 0.3) {
                let (data, _) = try await URLSession.shared.data(from: url)
                return try JSONDecoder().decode(T.self, from: data)
            }
        }
    }
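
    withTimeout and retry are not standard library functions. Minimal sketches follow; the signatures are assumptions chosen to match the call sites above, and Swift 6 strict checking will additionally want T to be Sendable:

    import Foundation

    func retry<T>(times: Int, jitter: Double, _ op: () async throws -> T) async throws -> T {
        var lastError: Error?
        for attempt in 0...times {
            do { return try await op() } catch {
                lastError = error
                guard attempt < times else { break }
                // Exponential backoff with +/- jitter; Task.sleep propagates cancellation.
                let backoff = pow(2.0, Double(attempt)) * (1 + Double.random(in: -jitter...jitter))
                try await Task.sleep(nanoseconds: UInt64(backoff * 1_000_000_000))
            }
        }
        throw lastError ?? URLError(.unknown)
    }

    func withTimeout<T>(seconds: TimeInterval, _ op: @escaping @Sendable () async throws -> T) async throws -> T {
        try await withThrowingTaskGroup(of: T.self) { group in
            group.addTask { try await op() }
            group.addTask {
                try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
                throw URLError(.timedOut)
            }
            // Whichever child finishes first wins; cancel the other.
            let result = try await group.next()!
            group.cancelAll()
            return result
        }
    }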

    Performance Budgets and Instruments

    Measure and enforce budgets.

    • Concurrency caps per screen
    • Memory caps (vector length, cache size)
    • Energy caps (no long tasks on input)
    • Instruments: Energy, Time Profiler, Allocations, Concurrency; signpost hot paths

    import os

    let signposter = OSSignposter(subsystem: "com.app", category: "perf")

    func signposted<T>(_ name: StaticString, _ op: () async throws -> T) async rethrows -> T {
        let state = signposter.beginInterval(name)
        defer { signposter.endInterval(name, state) }
        return try await op()
    }

    Storage: SwiftData/SQLite Patterns for Production

    • Plan migrations; don’t rely on magic
    • Batch operations; avoid per-row loops
    • Avoid N+1 queries; index and measure faulting
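
    On batching: a sketch with SwiftData, assuming a Note model and a NotePayload DTO (both hypothetical). The point is one save per chunk rather than one per row:

    import SwiftData

    @Model final class Note {
        var title: String
        var body: String
        init(title: String, body: String) { self.title = title; self.body = body }
    }

    struct NotePayload { let title: String; let body: String }

    func importNotes(_ payloads: [NotePayload], into context: ModelContext) throws {
        let chunkSize = 500
        var index = 0
        while index < payloads.count {
            let end = min(index + chunkSize, payloads.count)
            for payload in payloads[index..<end] {
                context.insert(Note(title: payload.title, body: payload.body))
            }
            try context.save() // one write transaction per chunk, not per row
            index = end
        }
    }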

    Navigation and State: Router Patterns in SwiftUI

    • Central Router that owns navigation state
    • Dependencies via Environment for testability
    • Avoid global singletons that keep tasks alive
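
    A minimal router along these lines (the Route cases are assumptions); inject it through the environment so tests can substitute their own:

    import SwiftUI

    enum Route: Hashable { case detail(id: String), settings }

    @MainActor
    final class Router: ObservableObject {
        @Published var path = NavigationPath()

        func push(_ route: Route) { path.append(route) }
        func popToRoot() { path = NavigationPath() }
    }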

    Interop: Combine, AsyncStream, and Observable

    Bridge pragmatically; don't rewrite code that already works.

    import Combine

    func combineToAsync<T>(_ publisher: some Publisher<T, Error>) -> AsyncThrowingStream<T, Error> {
        AsyncThrowingStream { continuation in
            let cancellable = publisher.sink(receiveCompletion: { completion in
                if case .failure(let error) = completion { continuation.finish(throwing: error) } else { continuation.finish() }
            }, receiveValue: { value in continuation.yield(value) })
            // Tear down the Combine subscription when the consumer stops iterating.
            continuation.onTermination = { _ in cancellable.cancel() }
        }
    }
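
    Consumption then reads like any async sequence (pricePublisher is a hypothetical Publisher<Double, Error>):

    for try await price in combineToAsync(pricePublisher) {
        print("latest:", price)
    }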

    Background Work: Reliability in a Post-Fetch World

    • Idempotent, resumable tasks
    • Short, progressive units; re-schedule if needed
    • Checkpoints for long operations
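
    A sketch with BGTaskScheduler under these principles; the identifier is an assumption and must also appear in Info.plist under BGTaskSchedulerPermittedIdentifiers:

    import BackgroundTasks
    import Foundation

    let refreshID = "com.app.refresh" // hypothetical identifier

    func scheduleRefresh() {
        let request = BGAppRefreshTaskRequest(identifier: refreshID)
        request.earliestBeginDate = Date(timeIntervalSinceNow: 15 * 60)
        try? BGTaskScheduler.shared.submit(request)
    }

    func registerRefreshHandler() {
        BGTaskScheduler.shared.register(forTaskWithIdentifier: refreshID, using: nil) { task in
            scheduleRefresh() // re-arm first; completing does not guarantee another launch
            let work = Task {
                // One short, idempotent unit that resumes from the last checkpoint.
                task.setTaskCompleted(success: true)
            }
            task.expirationHandler = { work.cancel() }
        }
    }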

    Security and Privacy: Modern Defaults

    • Passkeys first, passwords fallback
    • DeviceCheck for abuse signals
    • Private Relay-aware networking

    Implementation Checklist

    • [ ] Define budgets (concurrency, memory, energy)
    • [ ] Actorize shared mutable state; keep computation outside
    • [ ] Adopt async/await and TaskGroup for structured flows
    • [ ] Implement RequestGate and cancellation patterns
    • [ ] Predictable networking (retry, timeout, priority)
    • [ ] Add signposts; profile with Instruments
    • [ ] Plan storage migrations; batch writes
    • [ ] Router-driven navigation; avoid global singletons
    • [ ] Pragmatic interop: Combine ↔︎ AsyncStream; prefer Observable for new state
    • [ ] Background reliability with checkpoints

    FAQs

    • Do I need to rewrite my app for Swift 6?
      • No. Incrementally adopt structured concurrency and actors where it pays off.
    • Should I actorize all classes?
      • No. Isolate shared mutable state; keep pure computation outside actors.
    • Is Task.detached bad?
      • Use it sparingly. Prefer structured lifetimes and cancellation with Task/TaskGroup.
    • How do I fix flaky concurrency tests?
      • Test actors and isolated state directly; avoid wall-clock sleeps; add hooks.
    • What’s the fastest way to see performance problems?
      • Instruments + signposts; measure before guessing.

  • Building AI‑Native iOS Features: On‑device LLMs with Core ML and MLX

    Ship a semantic search feature that feels instant, works offline, preserves privacy, and raises engagement — built natively on iOS with Core ML and MLX. Success looks like faster findability, longer sessions, and users trusting the app with their notes, docs, or content.

    What Success Looks Like

    • Latency: <100ms for typical queries on modern iPhones
    • Privacy: zero network calls for core interactions; clear user consent
    • Engagement: +20–30% more successful searches; +10–15% longer sessions
    • Reliability: graceful degradation when indexing is interrupted; safe cancellation

    The User Problem

    Users don’t remember exact words — they remember ideas. Literal search makes them feel clumsy and slow. We want the app to understand meaning: “winter boarding checklist,” “Swift actor pattern,” “sound design notes” — even if phrased differently.

    Our Constraints

    • On‑device by default (ANE/GPU/CPU), no per‑keystroke backend calls
    • Battery‑aware and memory‑bounded; performance budgets per screen
    • Simple, testable architecture; no mystery schedulers or hidden queues

    The Plan

    1. Represent meaning with embeddings
    2. Make retrieval fast with a local index
    3. Keep it private and instant with aggressive caching
    4. Respect energy and memory budgets
    5. Tell a clear UX story with tight feedback loops

    Prototyping on Mac with MLX

    We began on Apple Silicon, where iteration speed wins. MLX let us test small transformer‑based embedding models, tune dimensions, and measure throughput across real text (notes, code snippets, short docs). We weren’t chasing leaderboard scores — we optimized for consistency and speed in our domain.

    Core ML Conversion for iOS

    Once our embedding model behaved well, we converted it to Core ML (coremltools) and compiled it into .mlmodelc assets. This brought ANE acceleration and stable APIs. We wrapped the model in a boring interface: “give me a string, I’ll return a float vector.” No surprises.

    A Boring, Reliable Wrapper

    import CoreML
    
    final class TextEmbeddingModel {
        enum ModelError: Error { case outputMissing }
        private let model: MLModel
    
        init() {
            // The model ships in the bundle; failing fast here surfaces a packaging error immediately.
            guard let url = Bundle.main.url(forResource: "TextEmbed", withExtension: "mlmodelc"),
                  let model = try? MLModel(contentsOf: url) else {
                fatalError("TextEmbed.mlmodelc is missing from the bundle or failed to load")
            }
            self.model = model
        }
    
        func embed(_ text: String) throws -> [Float] {
            let input = try MLDictionaryFeatureProvider(dictionary: ["text": text])
            let output = try model.prediction(from: input)
            guard let arr = output.featureValue(for: "embedding")?.multiArrayValue else { throw ModelError.outputMissing }
            var result = [Float](repeating: 0, count: arr.count)
            for i in 0..<arr.count { result[i] = arr[i].floatValue }
            return result
        }
    }

    Pipeline: Normalize, Cache, Index

    The model is a component; the pipeline is the feature. We normalized inputs, cached aggressively, and isolated mutation with actors.

    struct EmbeddingResult: Sendable { let vector: [Float]; let key: String }
    
    actor EmbeddingCache {
        private var store: [String: [Float]] = [:]
        func get(_ key: String) -> [Float]? { store[key] }
        func put(_ key: String, _ vector: [Float]) { store[key] = vector }
    }
    
    struct TextPreprocessor {
        static func normalize(_ s: String) -> String { s.lowercased().trimmingCharacters(in: .whitespacesAndNewlines) }
        static func key(for s: String) -> String { String(s.hashValue) } // hashValue is seeded per launch; see the stable CryptoKit key below
    }
    
    actor EmbeddingService {
        private let cache = EmbeddingCache()
        private let model = TextEmbeddingModel()
    
        func embed(_ text: String) async throws -> EmbeddingResult {
            let clean = TextPreprocessor.normalize(text)
            let key = TextPreprocessor.key(for: clean)
            if let cached = await cache.get(key) { return EmbeddingResult(vector: cached, key: key) }
            let vector = try model.embed(clean)
            await cache.put(key, vector)
            return EmbeddingResult(vector: vector, key: key)
        }
    }
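
    Because String.hashValue is randomly seeded per launch, the key above will not survive restarts; a stable SHA-256 content hash via CryptoKit is a drop-in replacement:

    import CryptoKit
    import Foundation

    extension TextPreprocessor {
        // Deterministic across launches, unlike String.hashValue.
        static func stableKey(for s: String) -> String {
            SHA256.hash(data: Data(s.utf8)).map { String(format: "%02x", $0) }.joined()
        }
    }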

    Local Retrieval with Cosine Similarity

    Cosine similarity is simple and effective for semantic search. We kept writes serialized and reads fast.

    actor VectorIndex {
        struct Item: Sendable { let id: String; let vector: [Float] }
        private var items: [Item] = []
    
        func upsert(_ item: Item) {
            if let idx = items.firstIndex(where: { $0.id == item.id }) { items[idx] = item } else { items.append(item) }
        }
    
        func topK(query: [Float], k: Int = 10) -> [Item] {
            let scored = items.map { ($0, cosine($0.vector, query)) }
            return scored.sorted(by: { $0.1 > $1.1 }).prefix(k).map { $0.0 }
        }
    
        private func cosine(_ a: [Float], _ b: [Float]) -> Float {
            var dot: Float = 0, na: Float = 0, nb: Float = 0
            for i in 0..<min(a.count, b.count) { dot += a[i]*b[i]; na += a[i]*a[i]; nb += b[i]*b[i] }
            return dot / (sqrt(na) * sqrt(nb) + 1e-6)
        }
    }
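
    For larger indexes, the dot products vectorize well with Accelerate; a sketch, assuming equal-length vectors:

    import Accelerate

    func cosineSIMD(_ a: [Float], _ b: [Float]) -> Float {
        let dot = vDSP.dot(a, b) // vectorized dot product
        let norm = vDSP.sumOfSquares(a).squareRoot() * vDSP.sumOfSquares(b).squareRoot()
        return dot / (norm + 1e-6)
    }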

    UX: Instant Feedback, Honest Ranking

    Users type; results update. We debounced input, embedded the query, fetched top matches locally, and updated the UI — no waiting on a network round‑trip. We surfaced “why” explanations next to results (“matched concepts: winter boarding, checklist”).

    @MainActor
    final class SearchViewModel: ObservableObject {
        @Published var query: String = ""
        @Published var results: [Doc] = []
        private let embedder = EmbeddingService()
        private let index = VectorIndex()
        private var searchTask: Task<Void, Never>?
    
        func search(_ q: String) {
            query = q
            searchTask?.cancel() // newest keystroke wins; stale work is cancelled
            searchTask = Task { [weak self] in
                guard let self else { return }
                try? await Task.sleep(nanoseconds: 150_000_000) // debounce typing
                guard !Task.isCancelled else { return }
                do {
                    let emb = try await self.embedder.embed(q)
                    let local = await self.index.topK(query: emb.vector, k: 20)
                    self.results = local.map(toDoc) // already on MainActor; no hop needed
                } catch {
                    // Handle gracefully (keep previous results, surface an error state).
                }
            }
        }
    }

    Budgets and Profiling (ANE/Metal)

    We set measurable budgets and stuck to them:

    • Concurrency: 4 tasks for search, 6 for background indexing
    • Memory: cap vector lengths; compress idle caches
    • Energy: no long‑running work triggered by typing

    We used Instruments — Energy, Time Profiler, Allocations, Concurrency — and added signposts around embedding and retrieval.

    import os

    let signposter = OSSignposter(subsystem: "com.app", category: "ai")

    func signposted<T>(_ name: StaticString, _ op: () async throws -> T) async rethrows -> T {
        let state = signposter.beginInterval(name)
        defer { signposter.endInterval(name, state) }
        return try await op()
    }

    Guardrails and Trust (Consent, Accessibility, Explainability)

    We treated AI like a respectful assistant:

    • Transparent consent and on‑device defaults
    • Clear controls to pause/stop generation
    • Constrained prompts and output lengths
    • Simple “why” explanations to reduce surprise

    Persistence and Resilience

    We persisted embeddings and outputs with lightweight indexing, batched writes, and versioned caches. When the model changed, we invalidated cleanly and rebuilt in the background. Checkpoints let long jobs resume.

    App Intents (Shortcuts)

    We exposed quick actions and Shortcuts so users could jump directly to “ideas about audio” or “notes on actors,” making the feature feel native beyond the app.

    Keeping the Binary Lean

    We shipped a base embedding model, downloaded larger variants on demand, and audited assets ruthlessly. Smaller apps install more, start faster, and crash less.

    Testing and CI/CD

    We tested actor‑isolated caches and indexes, verified cancellation, used fixtures for embeddings, and avoided sleeps. In CI, we staged model assets and gated releases with end‑to‑end tests. Budgets were checked on physical devices before TestFlight.

    Results

    • Latency consistently under 100ms for typical queries
    • Dramatic increase in successful searches and longer sessions
    • Fewer support tickets about “can’t find my note”
    • Positive reviews citing speed and trust (“works offline, feels instant”)

    Lessons

    • The model is not the feature; the pipeline is
    • Ownership and isolation prevent heisenbugs and copy storms
    • Budgets make performance a product choice, not luck
    • On‑device by default earns trust and word‑of‑mouth

    Implementation Checklist

    • [ ] Define objective and metrics (latency, privacy, engagement)
    • [ ] Prototype embeddings with MLX on Mac (tune dimensions/tokenization)
    • [ ] Convert to Core ML (.mlmodelc) and wrap a stable API
    • [ ] Build pipeline: normalize, cache, index
    • [ ] Implement local retrieval and “why” explanations
    • [ ] Set concurrency/memory/energy budgets; add signposts; profile on device
    • [ ] Persist vectors; batch writes; version caches; checkpoints
    • [ ] Integrate App Intents (Shortcuts) for quick actions
    • [ ] Keep binary lean; stage assets; test on physical devices
    • [ ] Monitor results; iterate

    FAQs

    • What is on‑device AI for iOS?
      • Running models locally on iPhone/iPad using Core ML/Metal/ANE, keeping latency low and data private.
    • Core ML vs MLX — which should I use?
      • Use MLX on Mac for rapid prototyping and custom layers; convert to Core ML for production iOS deployment with ANE acceleration.
    • Can iPhones run LLMs?
      • Yes, small distilled models are practical for templated generation, short summaries, and classification with rationale.
    • How do I keep battery usage low?
      • Cap concurrency, use ANE where available, measure with Instruments, avoid long tasks on user input.
    • How do I ensure privacy?
      • Avoid per‑keystroke network calls; keep embeddings and retrieval on device; offer opt‑in for remote expansion.
    • How do I tune search quality?
      • Normalize inputs, cache aggressively, and tune embedding dimensions/tokenization for your domain; surface “why” explanations.

    Where This Goes Next

    We’ve reused the pipeline to power intent suggestions, lightweight categorization, and short previews. The same embedding cache and index became a platform inside the app. Small, reliable pieces compound.

  • Case Study 5: Dev Tool MVP

    I built a CLI tool intended to standardize local development setup across microservices. The promise: one command—dev bootstrap—that discovers services, generates .env files, and starts containers via Docker Compose. In demos, it was magical. In real teams, it broke in 40% of setups due to bespoke scripts, Compose version drift, OS differences, and odd edge cases. The MVP automated too much, too early, and eroded trust.

    This article explains what I built, why it failed, and how I would rebuild the MVP around a clear compatibility contract and a validator-first workflow that earns trust before automating.

    The Context: Diverse Stacks, Fragile Automation

    Microservice repos evolve organically. Teams glue together language-specific tools, local caches, custom scripts, and different container setups. A tool that tries to own the entire “bootstrap and run” flow without a shared contract is brittle.

    What I Built (MVP Scope)

    • Discovery: Scan repos for services via file patterns.
    • Env Generation: Infer env keys from docker-compose.yml and sample .env.example files; produce unified .env.
    • Compose Orchestration: Start all services locally with one command.
    • Opinionated Defaults: Assume standard port ranges and common service names.
    • Metrics: Time to first run, number of successful bootstraps per team.

    Launch and Early Results

    • Solo demos worked spectacularly.
    • Team pilots revealed fragility: custom scripts, non-standard Compose naming, and OS-specific quirks caused frequent failures.
    • Trust dropped quickly; teams reverted to their known scripts.

    Why It Failed: Over-Automation Without a Contract

    I tried to automate the whole workflow without agreeing on a small, stable contract that teams could satisfy. Without a shared “dev.json” or similar spec, guessing env keys and start commands led to errors. Reliability suffered, and with dev tools, reliability is the MVP.

    Root causes:

    • Inference Errors: Guessing configurations from heterogeneous repos is error-prone.
    • Hidden Assumptions: Opinionated defaults clashed with local reality.
    • No Validation Step: Users couldn’t see or fix mismatches before automation ran.

    The MVP I Should Have Built: Validate and Guide

    Start with a minimal compatibility contract and a validator that helps teams conform incrementally.

    • Contract: Each service exposes a dev.json containing ports, env keys, and start command.
    • Validator CLI: dev validate checks conformance, explains gaps, and suggests fixes.
    • Linter: Provide a linter for dev.json with clear error messages.
    • Guided Setup: Generate .env from dev.json and start one service at a time.
    • Telemetry: Track validation pass rate, categories of errors, and time to first successful run.

    How It Would Work (Still MVP)

    • Step 1: Teams add dev.json to each service with minimal fields.
    • Step 2: Run dev validate; fix issues based on actionable messages.
    • Step 3: Use dev env to generate environment files deterministically.
    • Step 4: Start one service with dev run service-a; expand to orchestration only after a high pass rate.

    This builds trust by making the tool predictable and by exposing mismatches early.

    Technical Shape

    • Schema: dev.json with fields { name, port, env: [KEY], start: "cmd" }.
    • Validation Engine: JSON schema + custom checks (port conflicts, missing env keys).
    • Compose Adapter: Optional; reads from dev.json to generate Compose fragments rather than infer from arbitrary files.
    • Cross-Platform Tests: Simple checks for OS differences (path separators, shell commands).
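
    A hypothetical dev.json for a service matching this schema:

    {
      "name": "service-a",
      "port": 8080,
      "env": ["DATABASE_URL", "REDIS_HOST"],
      "start": "npm run dev"
    }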

    Measuring Trust

    • Validation Pass Rate: Percentage of services passing dev validate.
    • First Successful Run: Time from install to one service running.
    • Error Categories: Distribution helps prioritize adapters and docs.
    • Rollback Incidents: Track how often teams abandon the tool mid-setup.

    Onboarding and Documentation

    • Quick Start: Create dev.json with a template; run dev validate.
    • Troubleshooting: Clear guides for common errors with copy-paste fixes.
    • Contracts Over Recipes: Emphasize the compatibility contract and why it exists.

    Personal Reflections

    I wanted the “it just works” moment so much that I skipped the steps that make “it just works” possible: a shared spec and a validator. Dev teams reward predictability over magic; trust is the currency.

    Counterfactual Outcomes

    With a validator-first MVP:

    • Validation pass rate climbs from ~40% to ~80% in two months.
    • Time to first successful run drops significantly.
    • Teams adopt the tool gradually, and orchestration becomes feasible.

    Iteration Path

    • Add adapters for common stacks (Node, Python, Go).
    • Introduce a dev doctor command that diagnoses OS and toolchain issues.
    • Expand the contract only as needed; resist auto-inference beyond the spec.

    Closing Thought

    For dev tools, the smallest viable product is a trust-building tool: define a minimal contract, validate it, and guide teams to conformance. Automate only after reliability is demonstrated. Magic is delightful, but trust is what sticks.
