The LocalKin Team · April 2026

Embedded Dylib: A Distribution-Ready Pattern for Pure-Go Bindings to Native macOS Frameworks

The LocalKin Team

Position Paper — April 2026

Abstract

Programmatic access to native platform frameworks (Apple's ScreenCaptureKit, AVFoundation, Accessibility, Vision) has been functionally inaccessible to Go developers for a decade. Historical Go libraries such as kbinani/screenshot (2.3k★) and go-vgo/robotgo (10.2k★) relied on CGDisplayCreateImage, which Apple deprecated in macOS 15 (Sequoia, 2024) and removed entirely in macOS 26 (Tahoe, 2026). The standard replacement path requires Objective-C blocks, delegate protocols, and ARC-managed CMSampleBuffer lifecycles — historically accessible only via cgo, which imposes a C toolchain on every downstream user, breaks cross-compilation, and adds significant build complexity. We present Embedded Dylib, a repeatable template for pure-Go bindings to native macOS frameworks that combines three matured building blocks — ebitengine/purego for function-pointer-level Foreign Function Interface (FFI), Go's //go:embed directive for binary asset bundling, and a thin Objective-C shim that synchronizes the framework's asynchronous block-based APIs behind plain C ABI. The template ships a universal (arm64 + x86_64) Mach-O dynamic library of 147-190 KB embedded inside the Go module; on first call, the binary is extracted to the user's cache directory and dynamically loaded — requiring no clang, no Makefile, and no manual steps from downstream consumers. We validate the template by shipping two production libraries in a single 13-hour development window: sckit-go, a screen-capture library covering the full surface area of Apple's ScreenCaptureKit (five target kinds, streaming and one-shot capture at sub-20ms frame latency), and kinrec, an end-user screen recorder that mixes system audio and microphone into a single AAC track via AVAudioEngine. Both libraries ship with 0 cgo invocations in downstream user code, 81 tests combined (73-79% coverage), 0 linter warnings across staticcheck + golangci-lint, and verified correctness on macOS 14-26. 
We argue that Embedded Dylib unblocks a previously-frozen quadrant of the Go ecosystem: the ~30 Apple frameworks for which Rust has bindings via objc2 but Go had essentially none. With AI-assisted Objective-C authorship removing the final skill-gap barrier, the marginal cost of a new binding drops to 2-7 days. We catalog 12 high-value candidates, present their coordination as the KinKit program, and report the failure modes encountered during our own validation, including a sample-buffer PTS desynchronization bug that produces structurally-valid but effectively-unplayable mp4 files — documented here as a regression test every such pipeline should include.

Keywords: foreign function interface, macOS, ScreenCaptureKit, AVFoundation, Go, Objective-C, distribution, developer experience

1. Introduction

Consider a Go developer in 2026 who needs to capture a screenshot on macOS. Perhaps they are building a computer-use agent that feeds screen pixels to Claude Vision. Perhaps they are extending a CLI tool with a "save screenshot" flag. Perhaps they want to record a five-second screencast from a CI pipeline to include in a bug report. The task is routine. The task is impossible.

The top search result, kbinani/screenshot, has 2,300 stars on GitHub and has not been updated since 2023. Its CaptureDisplay function returns nil on macOS 15. The more comprehensive go-vgo/robotgo (10.2k stars) exhibits partial failure on the same system: mouse movement works, keyboard events work, but screen capture silently returns empty bitmaps. Both libraries depend, transitively, on CGDisplayCreateImage — an API that Apple marked deprecated in macOS 15 (Sequoia) and removed in macOS 26 (Tahoe). The replacement is ScreenCaptureKit [Apple, 2022], a modern, permission-gated, asynchronous API built around SCStream, SCScreenshotManager, and AVAssetWriter. ScreenCaptureKit works beautifully — in Swift, and in Objective-C, and in Rust (via the community-maintained screencapturekit-rs crate powering mediar-ai/screenpipe). In Go, there is nothing. A GitHub search for repositories combining the terms "Go" and "ScreenCaptureKit" returns exactly one result: a two-day experiment abandoned in August 2024 with three stars and nine kilobytes of code.

This is not a story about one library. It is a story about a category. The macOS frameworks that are important to modern applications — ScreenCaptureKit (screen), AVFoundation (audio-video), Accessibility (UI tree), Vision (OCR and image recognition), Speech (transcription), FSEvents (file watching), Core Location, EventKit, Core Bluetooth, CoreML, and approximately twenty more — are each a few thousand lines of Objective-C away from any Go program. The few historical attempts that bridge the gap rely almost universally on cgo, which transforms the downstream developer experience from go get to a Makefile, an installed C toolchain, an LLVM version pin, a cross-compilation arrangement, and frequently a private Go module proxy to avoid rebuilding native code on every CI job.

The consequence is that the Go ecosystem on macOS is stuck at Layer 0 (the operating system) with nothing between it and Layer 3 (application-level tools). The entire Layer 1 — primitive bindings that expose platform capabilities to higher layers — is nearly empty. Meanwhile, the Rust ecosystem has been steadily filling its equivalent Layer 1 via the objc2 crate and derivatives, which is why the most visible modern Mac developer tools that are neither products (CleanShot X) nor Electron wrappers (Kap, abandoned since 2022) are predominantly Rust (Raycast extensions, Screenpipe, Asahi's macOS-Linux interop work).

This paper presents a template — we call it Embedded Dylib — for closing that gap. The template is not a single library; it is a repeatable engineering pattern that uses three mature building blocks to produce a distribution experience indistinguishable from a pure-Go library while providing full, first-class access to any Apple framework. We validate the template by shipping two production libraries end-to-end in a thirteen-hour window. We document the one pipeline-correctness bug the validation process surfaced — a presentation-timestamp misalignment between the synthesized audio mixer and video stream — because it is the kind of bug every future application of this template is likely to hit, and our regression test is worth sharing.

The paper is structured as follows. Section 2 frames the problem precisely: what "Layer 1" means, why it is empty in Go, and why that emptiness is not incidental. Section 3 describes the three building blocks (purego, //go:embed, and the synchronous Objective-C shim) and the invariants the template enforces. Section 4 and Section 5 present the two validating case studies. Section 6 addresses a meta-concern that has only recently become relevant to a paper of this kind: AI-assisted authorship of the Objective-C layer. Section 7 provides benchmarks and compares against existing approaches. Section 8 introduces the KinKit program, twelve high-value candidates to which the template naturally extends. Section 9 addresses limitations and the cases where the template is the wrong choice. Section 10 concludes.

2. The Layer 1 Drought: Why Go's macOS Ecosystem Stalled

2.1 An Inventory of the Empty Quadrant

We define Layer 1 as the set of libraries that expose platform-native APIs directly to application code, without attempting to abstract across platforms. A Layer 1 library on macOS is a thin Go surface over a specific Apple framework; the same library does not compile on Linux, and it does not attempt to. By this definition, Linux's Layer 1 for Go is well-populated (golang.org/x/sys/unix, github.com/prometheus/procfs, github.com/fsnotify/fsnotify, dozens more). macOS's Layer 1 is not.

Table 1 enumerates the major Apple frameworks and their Go-binding status as of April 2026. "Go binding" here means a maintained library with at least one released version, a non-trivial test suite, and published documentation. "Broken" indicates a historical binding that no longer functions on a supported macOS version.

| Apple Framework | Rust binding | Go binding | Notes |
|---|---|---|---|
| ScreenCaptureKit | screencapturekit-rs | None | Subject of this paper; historical kbinani/screenshot broken |
| AVFoundation | objc2-av-foundation | None | Historical robotgo partial, broken on macOS 15+ |
| Accessibility (AX) | accessibility-sys | None | Critical for agent automation |
| Vision | vision-sys (partial) | None | OCR, barcode, face detection |
| Speech / SFSpeechRecognizer | Third-party crates | None | On-device transcription |
| EventKit | objc2-event-kit | None | Calendar and reminders |
| CoreBluetooth | bluest | None | BLE peripheral access |
| CoreLocation | objc2-core-location | None | Location services |
| FSEvents | Direct C bindings | fsnotify (uses cgo) | Only one of Go's few bindings, and cgo-based |
| Keychain | keychain-services | keybase/go-keychain | Uses cgo |
| NSPasteboard | arboard, clipboard | atotto/clipboard | Uses cgo |
| UserNotifications | notify-rust | gen2brain/beeep | Uses multi-platform fallback, not Apple-native |
| CoreML | coreml-rs | None | On-device inference |
| Metal / MPS | metal-rs | None | GPU compute |
| Vision and other on-device ML | Various | None | |

Table 1: Binding coverage for major macOS frameworks, Rust vs. Go, April 2026. "None" means no pure-Go or cgo-based maintained binding. Libraries marked with cgo represent the state of the art prior to this work: functional but imposing a downstream toolchain requirement.

Of the fourteen frameworks surveyed, Rust has maintained bindings for all fourteen (in varying states of maturity). Go has bindings for only four — and every one of the four imposes cgo on the downstream user.

2.2 Why cgo Is the Wrong Escape Hatch

cgo is Go's official mechanism for calling C code. It works. It is documented. It is also, on balance, hostile to the distribution model that made Go successful in the first place.

Consider the experience of installing a pure-Go library:

$ go get github.com/user/library

The Go module proxy fetches the module, the checksum database verifies it, the build cache compiles it, and the user can begin writing code. Cross-compilation to a different architecture works transparently: GOOS=linux GOARCH=arm64 go build produces a Linux arm64 binary even on an x86_64 Mac, because the toolchain contains all necessary compilers for every supported target.

Now consider the experience of installing a cgo-based library:

$ go get github.com/user/cgo-library
$ go build
# github.com/user/cgo-library
./c_stub.go:12:3: could not determine kind of name for C.my_function

The user must install clang (or gcc), install the Apple Command Line Tools (or the full Xcode for framework headers), set CGO_ENABLED=1, and pray that their system toolchain matches the one the library author tested against. Cross-compilation ceases to be transparent: building a Linux binary from a Mac now requires a Linux cross-compiler for C, which is not provided by the Go toolchain. Continuous Integration pipelines must install and cache platform-specific C toolchains. Docker build layers balloon from 50 MB to 500 MB. go install github.com/user/cgo-library/cmd/tool@latest — the commonly-assumed idiom for installing a CLI written in Go — often fails silently on systems where the C toolchain is absent.

These are not hypothetical complaints. The cgo overhead is cited in virtually every postmortem of a Go library that struggled to gain adoption despite solid technical foundations. github.com/mattn/go-sqlite3, which uses cgo to bundle SQLite, is widely used but is also widely cited as the library that forced teams to switch from Alpine Linux to Debian-based container bases. The pure-Go SQLite port modernc.org/sqlite is preferred in cloud-native workflows specifically to escape the cgo tax.

For Layer 1 bindings to platform frameworks, cgo is particularly ill-matched. The bindings are small (a few thousand lines). They change infrequently. They would be ideal candidates for distribution as pure-Go modules if not for the one small problem of needing to invoke Objective-C methods and Core Foundation types. The Embedded Dylib pattern is a response to that one small problem.

2.3 The Skill Gap: Objective-C Is Not Go

There is a second reason the Go ecosystem on macOS stalled, and it is not technical. Objective-C is a skill that most Go developers do not have. The language has idiosyncratic syntax ([object method:argument]), its own memory management model (manual retain/release prior to ARC, automatic but non-trivial under ARC), and a radically different approach to concurrency (Grand Central Dispatch, blocks, delegates). Reading Objective-C well enough to write a 400-line framework wrapper is a week-long project for a Go developer who has not done it before. Writing it well enough for production use is a month-long project.

The overlap between "developers who are fluent in Go and shipping production code" and "developers who are fluent in Objective-C and willing to debug retain-count issues at 2 AM" is small, and it is aging. Apple has been steering new development toward Swift since 2014. A Swift-fluent developer might write a clean wrapper, but Swift does not interop with C as cleanly as Objective-C does, and purego does not yet support Swift's calling conventions. Objective-C remains the right tool for the shim layer described in this paper, and also a tool that few Go-ecosystem contributors volunteer to use.

This skill gap is the reason that even after purego matured to the point where pure-Go FFI became practical, no one used it to build ScreenCaptureKit bindings. The FFI mechanism was the easy part. Writing the Objective-C was the hard part. Section 6 addresses what changed in 2024-2026 to remove that barrier, but the short version is: Claude can write Objective-C, and Claude can write it well enough to ship to production on the first try. This is a meta-observation about the field, not a product pitch, but the empirical fact is that the two libraries described in this paper were both written by a Go developer who last touched Objective-C in 2013, in a single day, with AI assistance for the Objective-C layer exclusively.

3. The Embedded Dylib Pattern

The Embedded Dylib pattern composes three mature building blocks to produce a distribution experience indistinguishable from a pure-Go library while providing full access to any Apple framework. This section describes each block and the invariants the template enforces.

3.1 Building Block One: purego for Cgo-Free FFI

The github.com/ebitengine/purego library, initiated in 2021 and stabilized at version 0.8 in late 2024, provides a pure-Go implementation of the parts of cgo needed for dynamic library loading. Its critical APIs are purego.Dlopen, which opens a shared library at runtime, and purego.RegisterLibFunc, which binds a function pointer by symbol name to a Go function value. The effect is that Go code can call C-ABI functions exported from a .dylib (or .so or .dll) with no cgo directives anywhere in the module.

A canonical use of purego for our purposes looks like this:

var captureFn func(uint32, unsafe.Pointer, int32, ...) int32

h, err := purego.Dlopen(dylibPath, purego.RTLD_LAZY|purego.RTLD_GLOBAL)
if err != nil {
    return err
}
purego.RegisterLibFunc(&captureFn, h, "sckit_capture_display")

// Now captureFn is callable like any Go function:
n := captureFn(displayID, unsafe.Pointer(&buffer[0]), bufCap, ...)

Four properties of this approach matter for the pattern:

  1. No cgo toolchain on the downstream user's machine. The Go module importing this code compiles with plain go build, no CGO_ENABLED=1 required, no C compiler required. The .dylib is loaded at runtime.
  2. Cross-compilation is preserved. A linux/amd64 binary can still be produced from a macOS host, because there is no C code being compiled.
  3. The ABI boundary is plain C function pointers with C struct arguments. This is much simpler than cgo's type-mapping layer, which must invent Go equivalents for every C type.
  4. Failure modes are runtime, not compile-time. If the dylib is missing, we get a clear error at first call; if a symbol is missing, we get a clear error at registration. Neither is worse than cgo's cryptic linker failures.

The limitation of purego is that it cannot (in Go 1.22) create Objective-C block callbacks from the Go side. This is the reason we still need an Objective-C shim: to expose the framework's block-accepting APIs as synchronous C functions.

3.2 Building Block Two: //go:embed for Binary Asset Bundling

Go 1.16 (March 2021) introduced the //go:embed directive, which bundles arbitrary files into the compiled binary. The directive is typically advertised for embedding HTML templates, SQL migrations, or static web assets. For our purposes, it bundles a compiled universal Mach-O dynamic library:

package dylib

import _ "embed"

//go:embed libsckit_sync.dylib
var Bytes []byte

At compile time, the 147 KB file libsckit_sync.dylib (a universal binary containing both arm64 and x86_64 slices, produced by clang with -arch arm64 -arch x86_64) is read into a byte slice accessible from Go code. Because the dylib is an ordinary file in the repository, go get transfers it as part of the module's zip archive. No separate distribution channel is required: no installer scripts, no release attachment tied to specific platform versions.

Downstream users' first call into the library triggers extraction: a SHA-256 prefix of the embedded bytes becomes a cache directory name (~/Library/Caches/<name>/<hash-prefix>/), the dylib is written there via atomic temp-file-plus-rename, and the path is passed to purego.Dlopen. Subsequent process launches detect the existing cached file and skip extraction — an operation that amortizes to zero cost after the first invocation.

The content-hash-indexed cache directory has a useful property: multiple versions of the library can coexist without conflict. If a project vendors two versions of sckit-go via different transitive dependencies, each version's dylib bytes produce a different hash and therefore a different cache directory. There is no global mutex, no shared path to lock, no upgrade procedure.
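The extraction step described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the library's actual code: the function name extractOnce, the 16-hex-character hash prefix, and the file names are hypothetical, and a throwaway temp directory stands in for the real cache root (on macOS, os.UserCacheDir resolves to ~/Library/Caches).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// extractOnce writes data to <root>/<hash-prefix>/<name> unless an identical
// file is already cached there, and returns the final path.
// The function name and the 16-hex-character prefix length are assumptions.
func extractOnce(root, name string, data []byte) (string, error) {
	sum := sha256.Sum256(data)
	dir := filepath.Join(root, hex.EncodeToString(sum[:8]))
	dst := filepath.Join(dir, name)

	// Fast path: an earlier call or process already extracted these bytes.
	if fi, err := os.Stat(dst); err == nil && fi.Size() == int64(len(data)) {
		return dst, nil
	}
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", err
	}
	// Stage in the destination directory so the rename stays on one filesystem.
	tmp, err := os.CreateTemp(dir, name+".tmp-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name()) // harmless after a successful rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}
	if err := os.Chmod(tmp.Name(), 0o755); err != nil {
		return "", err
	}
	// Rename is atomic on POSIX filesystems: readers see either no file or the
	// complete file, never a partial write.
	if err := os.Rename(tmp.Name(), dst); err != nil {
		return "", err
	}
	return dst, nil
}

// demo extracts one payload twice and a second payload once, then reports
// whether the cache was reused and whether the two versions were kept apart.
func demo() (bool, bool, error) {
	root, err := os.MkdirTemp("", "embedded-dylib-demo-*")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)

	a1, err := extractOnce(root, "libdemo.dylib", []byte("version one"))
	if err != nil {
		return false, false, err
	}
	a2, err := extractOnce(root, "libdemo.dylib", []byte("version one"))
	if err != nil {
		return false, false, err
	}
	b, err := extractOnce(root, "libdemo.dylib", []byte("version two"))
	if err != nil {
		return false, false, err
	}
	return a1 == a2, filepath.Dir(a1) != filepath.Dir(b), nil
}

func main() {
	reused, separate, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("cache reused:", reused, "versions kept separate:", separate)
}
```

Staging the temp file inside the destination directory is what keeps the rename atomic; a temp file on another volume would turn the rename into a copy.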

3.3 Building Block Three: The Synchronous Objective-C Shim

ScreenCaptureKit, like most modern Apple frameworks, exposes its APIs as asynchronous methods that take Objective-C blocks as completion handlers:

[SCShareableContent getShareableContentWithCompletionHandler:
    ^(SCShareableContent* content, NSError* error) {
        // ... this block runs on a background dispatch queue ...
    }];

purego cannot directly supply Go functions as Objective-C blocks in the general case (a block is a runtime object whose layout includes an isa pointer, flags, an invoke function pointer, and a descriptor), so the shim layer's job is to convert this asynchronous pattern into a synchronous C-ABI function call that Go can invoke. The pattern is straightforward:

int sckit_list_displays(sckit_display_t* out, int max, char* err, int err_len) {
    __block int count = -1;
    __block NSError* cap_err = nil;
    dispatch_semaphore_t sem = dispatch_semaphore_create(0);

    [SCShareableContent getShareableContentWithCompletionHandler:
        ^(SCShareableContent* content, NSError* error) {
            if (error) { cap_err = error; }
            else { /* copy displays to `out`, set `count` */ }
            dispatch_semaphore_signal(sem);
        }];

    dispatch_semaphore_wait(sem, DISPATCH_TIME_FOREVER);
    // ... copy error message to `err` if needed ...
    return count;
}

The dispatch_semaphore_wait call blocks the calling goroutine's OS thread (via purego's thread-pinning behavior) until the asynchronous block fires dispatch_semaphore_signal. From the Go side, this is a plain blocking C function call. From the Objective-C side, it correctly handles the async delivery model Apple expects.

For streams (persistent capture sessions delivering frames over time), the pattern generalizes: a frame-sink class conforming to SCStreamOutput retains the most recent CMSampleBuffer in a synchronized slot and signals a semaphore when new data arrives. The Go side's NextFrame call maps to a single semaphore wait plus a memory copy, matching the latency of a kernel-mediated shared memory queue.
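In the shim this frame sink is written in Objective-C. As a rough Go analog only (not the shim's code), the "latest-frame slot plus semaphore" idea might look like the sketch below. The type frameSlot and its methods are hypothetical names, and a one-element buffered channel plays the role of the dispatch semaphore.

```go
package main

import (
	"fmt"
	"sync"
)

// frameSlot keeps only the newest frame; an unread older frame is overwritten.
type frameSlot struct {
	mu     sync.Mutex
	latest []byte
	ready  chan struct{} // capacity 1, used as a binary semaphore
}

func newFrameSlot() *frameSlot {
	return &frameSlot{ready: make(chan struct{}, 1)}
}

// publish is what the producer side (the framework callback in a real shim)
// would call for every delivered frame.
func (s *frameSlot) publish(frame []byte) {
	s.mu.Lock()
	s.latest = frame
	s.mu.Unlock()
	select {
	case s.ready <- struct{}{}: // wake a waiting consumer
	default: // already signaled; the consumer will simply see the newer frame
	}
}

// next blocks until at least one frame has been published since the last
// call, then returns a copy of the newest one.
func (s *frameSlot) next() []byte {
	<-s.ready
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make([]byte, len(s.latest))
	copy(out, s.latest)
	return out
}

func main() {
	slot := newFrameSlot()
	slot.publish([]byte("frame-1"))
	slot.publish([]byte("frame-2")) // overwrites frame-1 before it was read
	fmt.Println(string(slot.next()))

	go slot.publish([]byte("frame-3"))
	fmt.Println(string(slot.next())) // blocks until the goroutine publishes
}
```

The consumer pays for one wait and one copy per call, which is the cost profile the paragraph above attributes to NextFrame.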

3.4 Invariants the Template Enforces

Any library built with the Embedded Dylib pattern maintains the following invariants, which together produce the distribution experience the pattern is named for:

  1. Zero cgo in any file of the Go module. Not in the public API, not in internal packages, not in test files. The Go build system never invokes a C compiler.
  2. Universal binary for the dylib. Built with clang -arch arm64 -arch x86_64, verified via file output showing two slices. Runs on both Intel and Apple Silicon without conditional compilation.
  3. //go:embed-bundled dylib committed to the repository. The repository tracks the binary artifact under version control so go get deterministically delivers the working dylib without triggering a rebuild. This trades a small amount of Git history growth (approximately 150-200 KB per dylib version) for dramatically simplified distribution.
  4. Auto-extract to user cache directory on first call. No manual steps. No LD_LIBRARY_PATH manipulation. No DYLD_FRAMEWORK_PATH. The first call to any exported function in the Go API triggers extraction if needed; subsequent calls are zero-overhead.
  5. context.Context on every blocking public function. Even when the underlying synchronous shim cannot be interrupted mid-call, the Go API exposes cancellation semantics at the call boundary. A future refinement of the shim can add cooperative cancellation by signaling a flag the shim checks between callback events.
  6. Functional options pattern for all tunables. Adding a new knob in a subsequent version does not require modifying any existing caller. See sckit.WithFrameRate, kinrec.WithMic, etc.
  7. Sealed interfaces for extensibility points. When the API exposes a type-switchable concept (e.g., the Target interface in sckit-go), the interface is sealed via an unexported method. Only the library can add new implementations; external packages cannot satisfy the interface with invalid filter descriptors.
  8. Platform version checks at load time. The dylib loads only on runtime.GOOS == "darwin". Other platforms produce a clear error at package initialization rather than a cryptic symbol-not-found error at first use.
  9. Test coverage ≥70% with pure-Go unit tests separated from //go:build integration tests. Unit tests exercise the Go layer without requiring the TCC (Transparency, Consent, and Control) permission dialogs that CI runners cannot grant; integration tests verify end-to-end behavior on developer machines.

These invariants are not aspirational. They are enforced by make verify in every library following the template.
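Invariant 5 is the least obvious to implement when the underlying call cannot be interrupted. One possible shape, sketched here under assumptions (the helper name callWithContext is hypothetical and is not taken from either library), runs the blocking call on its own goroutine and abandons the wait when the context ends:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// result carries the outcome of a blocking call across a channel.
type result[T any] struct {
	val T
	err error
}

// callWithContext returns as soon as either the blocking call finishes or ctx
// is done. The blocking call itself is NOT interrupted: on cancellation it
// keeps running in the background and its result is discarded.
func callWithContext[T any](ctx context.Context, blocking func() (T, error)) (T, error) {
	done := make(chan result[T], 1) // buffered so the goroutine can always exit
	go func() {
		v, err := blocking()
		done <- result[T]{v, err}
	}()
	select {
	case r := <-done:
		return r.val, r.err
	case <-ctx.Done():
		var zero T
		return zero, ctx.Err()
	}
}

// demo exercises both outcomes: a fast call that completes, and a slow call
// (standing in for an uninterruptible shim function) that outlives its deadline.
func demo() (int, error) {
	fast, _ := callWithContext(context.Background(), func() (int, error) {
		return 42, nil
	})
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	_, slowErr := callWithContext(ctx, func() (int, error) {
		time.Sleep(200 * time.Millisecond)
		return 0, nil
	})
	return fast, slowErr
}

func main() {
	fast, slowErr := demo()
	fmt.Println(fast, slowErr)
}
```

The trade-off is that a cancelled call still occupies a goroutine and an OS thread until the native function returns, which is why the invariant mentions cooperative cancellation as a later refinement.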

4. Case Study 1: sckit-go

sckit-go is a Go library providing pure-Go bindings to macOS ScreenCaptureKit. It is the validating instance of the Embedded Dylib pattern for a library-layer offering — a primitive that higher-level tools consume.

4.1 Scope

The library exposes five target kinds, all satisfying a sealed Target interface: Display (a whole attached display by CGDirectDisplayID), Window (a single window by SCWindow.windowID), App (all on-screen windows of a bundle ID composited onto one display), Region (a sub-rectangle of a display specified in points), and Exclude (wrap any of the above and mask out a list of windows, for use cases such as excluding one's own UI from a screen capture). Each target supports two modes: Capture(ctx, target, opts...) returning an image.Image for one-shot screenshots, and NewStream(ctx, target, opts...) returning a *Stream for persistent capture at a configured frame rate.

The public API shape, reduced to its signatures, is small enough to fit in 20 lines:

func ListDisplays(ctx context.Context) ([]Display, error)
func ListWindows(ctx context.Context)  ([]Window, error)
func ListApps(ctx context.Context)     ([]App, error)

func Capture(ctx context.Context, t Target, opts ...Option) (image.Image, error)
func CaptureToFile(ctx context.Context, t Target, path string, opts ...Option) error

func NewStream(ctx context.Context, t Target, opts ...Option) (*Stream, error)
func (*Stream) NextFrame(ctx context.Context)     (image.Image, error)
func (*Stream) NextFrameBGRA(ctx context.Context) (Frame, error)  // zero-copy
func (*Stream) Frames(ctx context.Context) (<-chan image.Image, <-chan error)
func (*Stream) Close() error

func WithResolution(w, h int) Option
func WithFrameRate(fps int)   Option
func WithCursor(show bool)    Option
func WithColorSpace(cs ColorSpace) Option
func WithQueueDepth(n int)    Option
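The two API conventions visible in these signatures, a sealed Target interface and functional options, can be illustrated in isolation. The sketch below is not sckit-go's source: the struct fields, the defaults of 60 fps with a visible cursor, and the helper describe are assumptions made for the example.

```go
package main

import "fmt"

// Target is sealed: isTarget is unexported, so only types declared in this
// package can implement it.
type Target interface {
	isTarget()
}

type Display struct{ ID uint32 }

type Region struct {
	Display    Display
	X, Y, W, H int
}

func (Display) isTarget() {}
func (Region) isTarget()  {}

// config collects every tunable; the fields and defaults are illustrative.
type config struct {
	fps    int
	cursor bool
}

// Option mutates a config. New options can be added without touching callers.
type Option func(*config)

func WithFrameRate(fps int) Option { return func(c *config) { c.fps = fps } }
func WithCursor(show bool) Option  { return func(c *config) { c.cursor = show } }

// resolve applies options on top of the assumed defaults.
func resolve(opts ...Option) config {
	c := config{fps: 60, cursor: true}
	for _, o := range opts {
		o(&c)
	}
	return c
}

// describe shows the exhaustive type switch a sealed interface makes safe.
func describe(t Target, opts ...Option) string {
	c := resolve(opts...)
	switch v := t.(type) {
	case Display:
		return fmt.Sprintf("display %d @ %d fps cursor=%t", v.ID, c.fps, c.cursor)
	case Region:
		return fmt.Sprintf("region %dx%d on display %d @ %d fps cursor=%t",
			v.W, v.H, v.Display.ID, c.fps, c.cursor)
	default:
		return "unknown target"
	}
}

func main() {
	fmt.Println(describe(Display{ID: 1}))
	fmt.Println(describe(Region{Display: Display{ID: 1}, W: 640, H: 480},
		WithFrameRate(30), WithCursor(false)))
}
```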

4.2 Internal Architecture

The Go module contains five non-test files totaling 826 lines, organized by concept: sckit.go (package-level Load and dylib resolution), target.go (the sealed Target interface and its five concrete implementations), options.go (the functional options), capture.go (one-shot operations), and stream.go (persistent streams). The dylib layer is a single 900-line Objective-C file exposing eleven plain C-ABI functions: three enumeration (sckit_list_displays, sckit_list_windows, sckit_list_apps), three capture (sckit_capture_display, sckit_capture_window, sckit_capture_app), three stream-start (sckit_stream_start, sckit_window_stream_start, sckit_app_stream_start), and two generic stream operations (sckit_stream_next_frame, sckit_stream_stop).

A shared 56-byte sckit_config_t struct, whose layout is mirrored in a private Go cfgC struct verified by an unsafe.Sizeof assertion in the unit tests, carries all configuration (resolution, frame rate, cursor preference, color space, queue depth, source rectangle for Region targets, and an exclude window ID list) across the C ABI. Extending the configuration without breaking the ABI is accomplished by reserving fields in the struct tail, permitting future versions to add parameters that older dylibs ignore.
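A layout check of this kind can be very small. The field list below is hypothetical: it was chosen only so that the struct totals 56 bytes on a 64-bit platform and is not the actual sckit_config_t definition. The point of the sketch is the assertion itself.

```go
package main

import (
	"fmt"
	"unsafe"
)

// cfgC mirrors a hypothetical 56-byte C configuration struct. Field names,
// order, and types are assumptions; offsets are noted for a 64-bit target.
type cfgC struct {
	Width, Height int32   // offsets 0, 4
	FrameRate     int32   // offset 8
	ShowCursor    int32   // offset 12
	ColorSpace    int32   // offset 16
	QueueDepth    int32   // offset 20
	SrcX, SrcY    float32 // offsets 24, 28
	SrcW, SrcH    float32 // offsets 32, 36
	ExcludeIDs    *uint32 // offset 40, pointer to a window ID array
	ExcludeCount  int32   // offset 48
	Reserved      int32   // offset 52, tail space for future fields
}

const wantSize = 56

// checkLayout fails if the Go mirror drifts from the size and pointer offset
// the C side is assumed to use.
func checkLayout() error {
	var c cfgC
	if got := unsafe.Sizeof(c); got != wantSize {
		return fmt.Errorf("cfgC is %d bytes, want %d", got, wantSize)
	}
	if off := unsafe.Offsetof(c.ExcludeIDs); off != 40 {
		return fmt.Errorf("ExcludeIDs at offset %d, want 40", off)
	}
	return nil
}

func main() {
	if err := checkLayout(); err != nil {
		fmt.Println("layout mismatch:", err)
		return
	}
	fmt.Println("layout ok")
}
```

In a real module the same check would live in a _test.go file, so that a field added on only one side of the ABI fails the test suite before it corrupts memory at runtime.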

4.3 Performance

We measured sckit-go on a 2024 MacBook Air M3 (16 GB unified memory) capturing the built-in Liquid Retina display at its native 2560×1664 and a connected 1920×1080 external monitor. All measurements use the sckit bench command included with the library.

| Operation | p50 | p95 | p99 |
|---|---|---|---|
| NextFrame steady-state @ 60 fps | 17.4 ms | 18.2 ms | 19.1 ms |
| NextFrame steady-state @ 30 fps | 34.0 ms | 41.0 ms | 42.5 ms |
| NextFrame steady-state @ 10 fps | 100.9 ms | 102.0 ms | 102.8 ms |
| NextFrame minimum (frame already queued) | < 1 ms | | |
| NewStream cold open | 81 ms | 85 ms | 88 ms |
| Capture(Display) one-shot | 132 ms | 225 ms | 248 ms |
| Capture(Window) one-shot | 89 ms | 108 ms | 115 ms |
| ListDisplays | 45 ms | 75 ms | 82 ms |
| ListWindows | 40 ms | 60 ms | 71 ms |
| BGRA→RGBA conversion (1920×1080) | 2.4 ms | 2.8 ms | 3.1 ms |

Table 2: sckit-go latency measurements. The NextFrame p50 floor is the display refresh interval (1/60 second = 16.67 ms); no library can deliver frames faster than this on a 60 Hz display. The < 1 ms minimum represents the purego boundary plus memcpy when a frame is already queued at the time of call.

Two observations matter. First, steady-state streaming latency tracks the configured frame interval to within about 1 ms at every tested rate (for example, 17.4 ms p50 against a 16.67 ms interval at 60 fps), so the purego boundary, the Objective-C shim, and the CMSampleBuffer-to-image.RGBA conversion together add little measurable overhead. The library is close to as fast as Apple's own framework permits. Second, the sub-millisecond minimum NextFrame latency demonstrates the fast path when capture and consumption are in lockstep: the underlying queue depth prevents the consumer from stalling when the producer fires a new frame within a millisecond of the consumer's request.
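The relationship between the steady-state rows of Table 2 and the frame interval is simple arithmetic, reproduced here as a small sketch. The p50 values are copied from the table; the helper names are illustrative.

```go
package main

import "fmt"

// frameIntervalMs is the ideal time between frames at a given rate.
func frameIntervalMs(fps float64) float64 { return 1000 / fps }

// overheadMs is the measured p50 minus the ideal interval.
func overheadMs(fps, p50 float64) float64 { return p50 - frameIntervalMs(fps) }

func main() {
	// Steady-state NextFrame p50 values from Table 2.
	rows := []struct{ fps, p50 float64 }{
		{60, 17.4},
		{30, 34.0},
		{10, 100.9},
	}
	for _, r := range rows {
		fmt.Printf("%3.0f fps: interval %6.2f ms, p50 overhead %.2f ms\n",
			r.fps, frameIntervalMs(r.fps), overheadMs(r.fps, r.p50))
	}
}
```

At all three rates the median sits less than 1 ms above the ideal interval.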

4.4 Testing and Verification

The library ships 43 pure-Go unit tests exercising the public API surface, the options pattern, pixel format conversion correctness, sealed-interface properties, and sentinel error behavior. An additional 19 integration tests, gated by the integration build tag to avoid requiring Screen Recording permission on CI runners that cannot grant it, verify live capture of all five target types. Combined line coverage as measured by go test -cover is 78.8%.

The staticcheck static analyzer reports zero warnings across the module. The golangci-lint v2 configuration enabling nine linters (errcheck, govet, ineffassign, staticcheck, unused, goconst, misspell, unconvert, unparam) reports zero issues. gofmt -l reports no files requiring formatting. All lint results are reproducible via make lint.

4.5 Distribution Evidence

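The library is published at github.com/LocalKinAI/sckit-go, tagged v0.1.0. A fresh go get github.com/LocalKinAI/sckit-go@v0.1.0 on a machine with no prior sckit-go installation and no C toolchain produces a working capture from a short main package:

package main

import (
    "context"
    "log"

    "github.com/LocalKinAI/sckit-go"
)

func main() {
    ctx := context.Background()
    // Enumerate displays; fail loudly instead of indexing an empty slice.
    d, err := sckit.ListDisplays(ctx)
    if err != nil || len(d) == 0 {
        log.Fatalf("no displays available: %v", err)
    }
    // Capture the first display and write it to disk as a PNG.
    if err := sckit.CaptureToFile(ctx, d[0], "screenshot.png"); err != nil {
        log.Fatal(err)
    }
}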

go run . produces screenshot.png after approximately 270 ms, of which approximately 150 ms is the first-call dylib extraction to ~/Library/Caches/sckit-go/<hash>/libsckit_sync.dylib. Subsequent runs of the same program reuse the cached dylib and complete the capture in approximately 80 ms.

5. Case Study 2: kinrec — and a Cautionary Tale

kinrec is a second library built on top of the same template, targeting AVFoundation for screen recording with mixed system audio and microphone capture. It was built in parallel with sckit-go on the same day, in approximately five hours of focused work once sckit-go's template was validated.

5.1 Scope and Product Positioning

kinrec records the screen to an H.264 or HEVC encoded MP4 or MOV file, with four audio modes selectable by CLI flags: none (video only), --audio (system audio captured via ScreenCaptureKit's capturesAudio flag), --mic (microphone captured via AVCaptureSession), and --audio --mic (both sources mixed into a single AAC track via AVAudioEngine). The command-line interface follows the conventions established in sckit-go:

kinrec record -o demo.mp4 --audio --mic --duration 30s --fps 60

From a product perspective, kinrec is notable because it fills a second, adjacent ecosystem gap. The most visible open-source Mac screen recorder, Kap (17,000 stars), was abandoned in 2022 and relies on Electron. The commercial alternatives — CleanShot X at $29, Screen Studio at $89, Loom on subscription — are closed-source. The existing open-source options for Mac screen recording are either dormant (aperture-node, 2020) or Linux-first with Mac support treated as a second-class compile target (OBS, sufficient but a 200 MB download). A 5 MB Go binary that ships via go install and produces playable MP4s with system audio and microphone mixed on the fly fits a different user profile entirely.

5.2 The Audio Mix Pipeline

The four-mode audio architecture is worth describing in detail because the mix mode exercises a mature Apple API — AVAudioEngine in manual rendering mode — that few Go bindings have touched. The pipeline is:

SCStream (video, system audio if --audio)
    │
    ├─→ AVAssetWriterInput(video)  ──┐
    │                                  │
    └─→ (system audio path)            │
            ↓                          │
        AVAudioConverter               │
            ↓                          │
        AVAudioPlayerNode(sys) ────┐  │
                                    │  │
AVCaptureSession (mic if --mic)    │  │
    ↓                               │  │
AVAudioConverter                    │  ├──→ AVAssetWriter ──→ out.mp4
    ↓                               │  │
AVAudioPlayerNode(mic) ────────────┤  │
                                    ↓  │
                        mainMixerNode  │
                                    ↓  │
                    manual 20ms tick → AVAssetWriterInput(audio)

The engine is configured in AVAudioEngineManualRenderingModeOffline, which allows the Go-side code to pull mixed PCM in fixed-size chunks without requiring a live output device — a critical property for CLI binaries that cannot assume a default audio output exists. A 20 ms dispatch_source_t timer triggers renderOffline:toBuffer:error:, producing 960 frames of mixed float32 stereo PCM per tick. Each mixed buffer is wrapped as a CMSampleBuffer with a presentation timestamp anchored to the host clock and appended to the writer's audio input.

5.3 The Bug That Shipped and Was Caught

During informal user testing (the library author attempted to record a YouTube video playing alongside microphone narration), the mixed-audio mode produced MP4 files that were audible and correctly recorded — but the video was completely missing from playback. ffprobe analysis revealed the cause: the video track's packet presentation timestamps ranged from 876,467.86 seconds to 876,472.88 seconds, while the audio track's packets were in the range -0.04 to 5.10 seconds. The mp4 container was structurally valid; the absolute video timestamps simply placed the video frames approximately ten days into the future relative to the audio track, so media players correctly rendered what was actually written: 5 seconds of audio followed by silence.

The root cause was a PTS clock mismatch. ScreenCaptureKit delivers video CMSampleBuffers with presentation timestamps on the mach host clock, which counts time elapsed since system boot. The AVAudioEngine manual-rendering mixer, by contrast, produced output with timestamps starting at kCMTimeZero. The two timelines share no common origin, and the 10-day gap is simply the host-clock reading (the value behind mach_absolute_time()) at the instant recording began.

The fix, committed prior to release, anchors the mixer's first output timestamp to the first input sample buffer's presentation timestamp. Because both the system audio (from SCStream) and microphone audio (from AVCaptureSession) use the mach host clock as their PTS basis, the anchor is always in the same timeline as the video track. The corrected pipeline produces files where video first-PTS and audio first-PTS are within 200 milliseconds of each other across all four audio modes.

The lesson generalizes beyond this library. Any pipeline that synthesizes output sample buffers from a clock distinct from the input sources must anchor to the input clock at first-buffer arrival, not at pipeline start. The regression test we added, which extracts first-packet timestamps with ffprobe and asserts they are within 1.5 seconds, is included here so that future applications of the Embedded Dylib pattern touching AVFoundation's synthesis APIs can include it from day one:

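check_pts_drift() {
  local file="$1" v a
  # First-packet PTS in seconds for the first video and first audio stream.
  v=$(ffprobe -v error -select_streams v:0 -show_entries packet=pts_time -of csv=p=0 "$file" | head -n 1 | cut -d',' -f1)
  a=$(ffprobe -v error -select_streams a:0 -show_entries packet=pts_time -of csv=p=0 "$file" | head -n 1 | cut -d',' -f1)
  # A missing track is a failure in its own right; report it before comparing.
  if [ -z "$v" ] || [ -z "$a" ]; then
    echo "check_pts_drift: missing video or audio packets in $file" >&2
    return 1
  fi
  # Exits non-zero (failing the test) when the tracks start 1.5 s or more apart.
  python3 -c "d = abs($v - $a); assert d < 1.5, f'PTS drift {d:.2f}s > 1.5s'"
}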

Our initial smoke tests — ffprobe to verify duration, ffmpeg volumedetect to verify audio content — all passed. They did not verify PTS alignment. Future validation of Embedded Dylib libraries touching multi-track media pipelines should include this check.

5.4 Metrics

kinrec ships 28 unit tests and 10 integration tests, combined line coverage 73.3%. Zero staticcheck warnings, zero golangci-lint issues. The embedded dylib is 190 KB universal. A three-second capture at 30 fps with mixed audio produces a 327 KB MP4; a thirty-second capture at 60 fps with both audio sources produces approximately 9 MB, corresponding to a 2.3 Mbps average bitrate in H.264 high profile.

5.5 Private-First Distribution

kinrec is distributed as a private GitHub repository with a planned public release window of one to two weeks after initial development. This deviates from sckit-go's same-day public release and reflects the different risk profile of product-layer software: library APIs, once published, are nearly impossible to retract; product-layer software can tolerate more iteration before first public exposure. The private window is allocated to dogfooding: the library author uses kinrec to produce actual screencasts and files bug reports against real-world usage patterns that automated testing cannot cover (audio device hotswap, display configuration change during recording, long-duration recording exceeding 30 minutes).

6. AI-Assisted Objective-C: The Final Barrier Falls

The Embedded Dylib pattern has been technically feasible since purego 0.8 was released in late 2024. The pattern was not widely adopted because the bottleneck was not the Go-side FFI plumbing; it was the Objective-C shim itself. Writing 400-900 lines of correct Objective-C (with proper ARC semantics, dispatch queue management, block retain-and-release hygiene, and correct handling of Core Foundation types) is a task that typically takes a Go developer a week or more on a first attempt. The pattern could not spread through the Go ecosystem as long as that cost remained.

This cost has been effectively eliminated. A Go developer working alongside a frontier LLM — the authors of this paper used Claude Opus 4.7 — can produce correct, idiomatic Objective-C in the same flow as writing Go. The 900-line sckit_sync.m shim was written end-to-end with the Go developer describing intent in Go terms ("capture the screen once, return BGRA bytes") and the LLM translating to the appropriate ObjC idioms (SCShareableContent getShareableContentWithCompletionHandler: wrapped in dispatch_semaphore, ARC-managed __block variables, CFBridgingRetain to transfer ownership from ObjC to a C-held handle). The developer's role was to verify correctness, request specific idioms when they knew them, and drive the overall structure. The LLM's role was to translate each Go-described operation into ObjC that compiled and ran correctly.

Three capabilities of the frontier model mattered specifically:

  1. Fluency across Core Foundation idioms. CFBridgingRetain, CFBridgingRelease, and the __bridge cast variants are subtle and historically a source of bugs in ObjC code written by newcomers. The model produced correct bridging casts without prompting, and flagged ownership transfers that required the retained variant.
  2. Knowledge of framework-specific gotchas. The NSApplicationLoad call required to initialize Core Graphics Services for window-scoped capture in a CLI binary is documented only in Apple's sample code and a handful of StackOverflow answers. The model identified the CGS_REQUIRE_INIT assertion at first encounter and proposed the correct fix.
  3. Dispatch-semaphore synchronization patterns. Wrapping async APIs with dispatch_semaphore_create plus __block variable capture for error and result passing is a specific, repeatable pattern. The model applied it consistently across all eleven exported C functions without template drift.

We consider this a methodological observation, not a commercial endorsement. The important fact is that the skill-gap answer to "why hasn't anyone done this in Go?" (that Objective-C is a language Go developers don't speak) is no longer valid. The skill gap has been collapsed by tools that are now table stakes among solo developers and small teams. This is the reason the Layer 1 drought in Go's macOS ecosystem can be addressed now, and specifically now; it could not have been addressed even two years ago at comparable velocity.

7. Evaluation

We compare the Embedded Dylib pattern against three alternative approaches for bridging Go to native macOS frameworks.

7.1 Approach Comparison

Dimension | cgo + headers | Pure purego (no dylib) | Shell-out to system binary | Embedded Dylib
--- | --- | --- | --- | ---
Downstream user needs C toolchain | Yes | No | No | No
Cross-compilation preserved | No | Yes | Yes | Yes
Access to async block-based APIs | Yes | Partial (experimental) | No | Yes
Access to delegate protocols | Yes | No | No | Yes
Distribution size overhead | ~0 (compiled from source) | ~0 | ~0 | 100-200 KB per library
First-call latency | 0 (statically linked) | ~50 ms (dlopen) | Process spawn (~10 ms) | ~150 ms (extract + dlopen)
Subsequent-call latency | Same as C | Same as C | Process spawn overhead | Same as C
Correctness guarantees | Strong (native) | Weak (experimental block support) | Brittle (parsing stdout) | Strong (native)
API evolvability | Strong (recompile) | Limited | Strong | Strong (ABI via struct)

Table 3: Comparative analysis of macOS binding approaches. The "Pure purego" column refers to approaches that attempt to call ObjC class methods and construct blocks entirely from Go, bypassing a shim. This is possible for some APIs but not for the delegate-heavy ones we target.

The Embedded Dylib pattern trades a small, amortized first-call latency and 100-200 KB of binary size for preservation of the distribution model Go users expect (go get works), full access to the framework's native API surface including blocks and delegates, and correctness guarantees that match cgo. The tradeoff is the right one for Layer 1 bindings specifically, where downstream adoption is gated by ease of installation.

7.2 Developer Velocity

The headline result of this work is that two production libraries — 826 lines of Go plus 900 lines of Objective-C for sckit-go, and 816 lines of Go plus 970 lines of Objective-C for kinrec — were written, tested, documented, and released in a single thirteen-hour window on April 22, 2026. This figure includes not only the code itself but: README files, CHANGELOG files, three Architectural Decision Records (ADRs) for sckit-go, unit and integration test suites, CI configuration, Makefile, golangci-lint configuration, one full CLI per library, and debug-and-fix of the PTS alignment bug described in Section 5.3.

We do not claim this velocity as a baseline for all applications of the pattern. The first library is always the most expensive, because the template itself is being invented; the second library benefits from copy-paste of the invariants, the Makefile, the CI workflow, and the directory structure. A reasonable calibration is that the first application of the pattern to a new framework costs a full working day; the second costs two-thirds of a day; subsequent applications run two to four days each as framework-specific complexity comes to dominate the template's invariants.

7.3 Binary Size Analysis

An installed sckit-go CLI binary (sckit), produced via go install github.com/LocalKinAI/sckit-go/cmd/sckit@latest, is 5.4 MB. A similar kinrec binary is 5.8 MB. These figures include the Go runtime (~1.8 MB), the embedded universal dylib (147-190 KB), and the application code. For comparison, the abandoned Electron-based Kap recorder was approximately 180 MB installed; modern Electron alternatives range from 150 MB to 300 MB. The size advantage over Electron-based tooling is roughly 30-50x; the size advantage over native bundles (CleanShot X at 46 MB, Screen Studio at 115 MB) is smaller but still material for CLI-first distribution.

8. The KinKit Program

The Embedded Dylib pattern is a template, and templates exist to be applied repeatedly. We sketch here a program of twelve candidate libraries — collectively designated KinKit — that would, if completed, provide pure-Go access to the major macOS frameworks currently inaccessible to the Go ecosystem.

Priority | Name | Framework | Primary use case | Estimated effort
--- | --- | --- | --- | ---
S | sckit-go | ScreenCaptureKit | Screen / window / app capture | ✅ Shipped
S | kinrec | AVFoundation | Screen recording to mp4 | ✅ Private, public in one week
S | input-go | CGEvent + HID | Mouse / keyboard synthesis | 3-4 days
S | kinax-go | Accessibility (AX) | UI tree read / write | 5-7 days
A | vision-go | Vision | OCR, barcode, face detection | 3-4 days
A | speech-go | SFSpeechRecognizer | On-device transcription | 2-3 days
A | fsx-go | FSEvents | File system monitoring | 2 days
A | kinnotify-go | UserNotifications | Native notifications | 1-2 days
B | keychain-go | Keychain | Secure credential storage | 2 days
B | clipboard-go | NSPasteboard | Rich clipboard access | 1-2 days
B | workspace-go | NSWorkspace | App launching, window ops | 2 days
B | eventkit-go | EventKit | Calendar / reminders | 2-3 days

Table 4: Twelve KinKit candidates, grouped by priority (S: required for computer-use agent stack; A: high-leverage independent; B: opportunistic). Effort estimates assume the Embedded Dylib template has been internalized and do not include the first-library overhead.

We emphasize that KinKit is not a meta-package. There is no github.com/LocalKinAI/kinkit dependency to import; each library is released as an independent Go module, versioned independently, with its own release cadence and breaking-change policy. KinKit is a documentation convention and a shared quality bar, not a runtime dependency.

The S-tier libraries, once complete, form the sensor-and-actuator substrate required for a computer-use agent on macOS: sckit-go reads pixels, kinax-go reads the UI tree, input-go writes mouse and keyboard events. The combination is the missing link that has kept Go out of the computer-use agent wave, which is currently dominated by Python (LangGraph, Open Interpreter) and TypeScript (Claude Computer Use SDK).

8.1 A Specific Thesis About Ecosystem Unblocking

We hypothesize that completing KinKit's S-tier within the next 90 days will produce a measurable shift in where Go agents run. Specifically:

These are hypotheses, not guarantees. The 90-day horizon is chosen specifically because it is short enough to falsify — if the Embedded Dylib pattern has the unblocking effect we claim, the evidence will be visible by August 2026. If not, the thesis deserves reexamination.

9. Limitations

We discuss four classes of limitation that users of the Embedded Dylib pattern should be aware of.

9.1 Platform Specificity

The pattern is macOS-only by design. Applying it to Windows would require a separate DLL with a different Objective-C-equivalent shim (in C++ for COM interop, or in a mix of C# and C++/WinRT). Applying it to Linux would be simpler (no ObjC required; just purego against system libraries) but also less useful, because the Go ecosystem on Linux is well-served by stdlib and golang.org/x/sys/unix. A cross-platform facade library on top of macOS-Embedded-Dylib, Windows-Embedded-DLL, and Linux-purego-only implementations is conceivable but introduces the abstraction costs of cross-platform design — exactly the costs the Go community has historically been willing to pay for runtime ubiquity and unwilling to pay for native platform fidelity.

9.2 Binary Size

Each library in the KinKit program contributes 100-200 KB of universal dylib bytes to downstream binaries that depend on it. This is small in absolute terms but non-trivial when composed: a downstream tool importing six KinKit libraries carries approximately 1 MB of embedded binary data. For command-line tools this is imperceptible; for embedded scenarios with strict size budgets, it is a cost to monitor. Future work on de-duplicating shared Apple runtime code across multiple embedded dylibs (for instance, a shared kinkit_runtime.dylib loaded once per process rather than per-library) could reduce this.

9.3 Version Coupling

A library using the Embedded Dylib pattern ships a specific compiled dylib bound to a specific Go module version. If the downstream user's macOS version later introduces an incompatible change to the Apple framework the dylib targets, the downstream user may encounter runtime failures that would be compile-time failures in a cgo-based binding. This has not occurred in our validation (both libraries compiled in April 2026 continue to run on every tested macOS 14-26 point release), but the class of failure is real and requires monitoring as new macOS releases ship.
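One possible mitigation, offered here as a sketch and not as something either library is stated to do, is to compare the running macOS version against the validated range at load time, so that an untested release produces a clear message instead of an obscure runtime failure. The function name and the policy (error below the range, warning above it) are assumptions. The version string is expected in the format printed by sw_vers -productVersion.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Major versions covered by the validation described above.
const (
	minTestedMajor = 14
	maxTestedMajor = 26
)

// checkMacOSVersion inspects a product version such as "15.4.1". It
// returns an error for releases older than the tested range and
// untested=true for newer ones, so callers can log a warning up front.
func checkMacOSVersion(version string) (untested bool, err error) {
	majorStr, _, _ := strings.Cut(strings.TrimSpace(version), ".")
	major, err := strconv.Atoi(majorStr)
	if err != nil {
		return false, fmt.Errorf("unrecognized macOS version %q: %w", version, err)
	}
	if major < minTestedMajor {
		return false, fmt.Errorf("macOS %d is older than the minimum tested release %d",
			major, minTestedMajor)
	}
	return major > maxTestedMajor, nil
}

func main() {
	for _, v := range []string{"13.6", "15.4.1", "27.0"} {
		untested, err := checkMacOSVersion(v)
		switch {
		case err != nil:
			fmt.Println(v, "->", err)
		case untested:
			fmt.Println(v, "-> newer than any tested release; proceed with a warning")
		default:
			fmt.Println(v, "-> within the tested range")
		}
	}
}
```

A check of this kind does not prevent the failure class; it only moves the report to a predictable place and gives the user a version number to include in a bug report.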

9.4 The Objective-C Skill Gap Has Only Shifted, Not Vanished

Section 6 argues that AI assistance makes Objective-C authorship accessible to Go developers. We reiterate that the word "accessible" is carefully chosen. Maintaining, debugging, and extending an Objective-C shim over years still requires either a maintainer with genuine Objective-C fluency or continued access to frontier LLM assistance at the same quality level. For a library that ships and then stagnates, this is fine. For a library that must evolve alongside Apple framework changes, it is a durable dependency on the current state of AI tooling. Institutional maintainers should plan for the possibility that LLM assistance may become less available, more expensive, or lower quality over the library's lifetime.

10. Conclusion

The Go ecosystem on macOS has been stuck at Layer 0 for a decade, bounded on one side by a deprecated and removed Apple API (CGDisplayCreateImage) and on the other by the distribution cost of cgo. We have presented the Embedded Dylib pattern as a way out of that bind, and validated it by shipping two production libraries (one library-layer, one product-layer) in a single thirteen-hour development window. The pattern's central insight is that the building blocks for cgo-free native bindings have existed for several years but have not been composed: purego for the function-pointer FFI, //go:embed for the binary asset bundling, and an Objective-C shim that exposes asynchronous frameworks as synchronous C-ABI functions. Composing them produces a distribution experience that downstream users cannot distinguish from a pure-Go library. A user runs go get, writes five lines of Go, and screen capture works. No clang, no Makefile, no TCC manifest manipulation beyond the one-time permission grant that every Mac recording application requires.

We have further argued that the final barrier to wide adoption of this pattern — the skill gap between Go developers and Objective-C authorship — has been effectively collapsed by frontier large language models acting as collaborative ObjC authors. This is an observation about 2026, not a prediction about 2030, but it is the observation that makes the KinKit program feasible on a 90-day timeline rather than a three-year one.

The Layer 1 drought in Go's macOS ecosystem was never a technical impossibility. It was a composition of three mild costs — purego maturity, go:embed adoption, Objective-C skill-gap — each of which separately was not prohibitive, but which compounded to stall the category. All three costs have now come down. What remains is the work of actually shipping the libraries, one framework at a time, with the discipline to keep each library focused, well-tested, and maintained. If the 90-day thesis in Section 8 holds, then the era in which a Go developer had to reach for Python or shell-out or Electron to touch the Mac platform will end, quietly, sometime this summer. This paper is an argument for beginning that work immediately, and a template for doing so correctly.

References

Apple Inc. (2022). ScreenCaptureKit Framework Reference. developer.apple.com/documentation/screencapturekit.

Apple Inc. (2024). CGDisplayCreateImage Deprecation Notice. macOS 15 Release Notes.

ebitengine authors. (2024). purego v0.8.0. github.com/ebitengine/purego.

Go team. (2021). Go 1.16 Release Notes — Embed Directive. go.dev/doc/go1.16.

kbinani. (2016-2023). screenshot: Capture images of displays. github.com/kbinani/screenshot.

mediar-ai. (2024-2026). screenpipe: Build AI apps that have the full context of the user. github.com/mediar-ai/screenpipe.

wulkano. (2016-2022). Kap: An open-source screen recorder built with web technology. github.com/wulkano/Kap.

LocalKin Team. (2026). Thin Soul, Fat Skill: A Token-Efficient Architecture for Production Multi-Agent Systems. LocalKin Papers, April 2026.

LocalKin Team. (2026). Grep is All You Need: Zero-Preprocessing Knowledge Retrieval for LLM Agents. LocalKin Papers, April 2026.

The source code for sckit-go is available at github.com/LocalKinAI/sckit-go under the MIT license. kinrec will be published at github.com/LocalKinAI/kinrec following its one-week private dogfood period.