We have talked before about how I like running local models on macOS through the MLX framework: it is fast, and it does not require an Ollama daemon running in the background. simonw/llm provides good CLI ergonomics on top (to give a couple of examples: setting a system prompt, the temperature, and so on).
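
For instance, a typical invocation looks something like this (-s sets the system prompt and -o passes model-specific options such as the temperature; which options are honored depends on the plugin serving the model):

echo "What is the capital of France?" | llm -m mlx-community/Llama-3.2-3B-Instruct-4bit -s "Answer in one word" -o temperature 0.2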

I also like my pipelines to be deterministic, so that upstream releases don't break them unexpectedly. For this, I use Nix to package things up.
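
Concretely, everything lives behind a flake, so the exact nixpkgs revision is recorded in flake.lock and only moves when I bump it. A minimal sketch of that shape (the ./llm.nix file referenced here is a placeholder for the packaging described below):

{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";

  outputs = { self, nixpkgs }:
    let
      # Apple Silicon; the revision actually used is pinned in flake.lock.
      pkgs = nixpkgs.legacyPackages.aarch64-darwin;
    in {
      # Exposed as .#llm, matching the `nix shell` invocation at the end of this post.
      packages.aarch64-darwin.llm = pkgs.callPackage ./llm.nix { };
    };
}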

Unfortunately, when using the mlx Nix derivation available in nixpkgs, the CLI would stall and the inference process would eventually die with a SIGSEGV. That's because that derivation does not build with Metal support:

# NOTE The `metal` command-line utility used to build the Metal kernels is not open-source.
# To build mlx with Metal support in Nix, you'd need to use one of the sandbox escape
# hatches which let you interact with a native install of Xcode, such as `composeXcodeWrapper`
# or by changing the upstream (e.g., https://github.com/zed-industries/zed/discussions/7016).

The sandbox escape hatch sounds scary. Instead, I fixed it by pulling pre-built wheels from PyPI (it turns out that both mlx and mlx-metal are required) and patching them to work with Nix.
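
In broad strokes, that means treating the PyPI wheels as the source and letting buildPythonPackage install them, rather than compiling anything. A sketch of the idea, not the exact derivation (versions, hashes, and platform tags are placeholders, and the patching a given wheel needs may differ):

{ lib, fetchPypi, python3Packages }:

let
  version = "0.22.0"; # placeholder: pin whatever version you actually need

  # mlx-metal ships the pre-compiled Metal kernels that cannot be built
  # inside the Nix sandbox.
  mlx-metal = python3Packages.buildPythonPackage {
    pname = "mlx-metal";
    inherit version;
    format = "wheel";

    src = fetchPypi {
      pname = "mlx_metal"; # wheel filenames use an underscore
      inherit version;
      format = "wheel";
      dist = "py3";
      python = "py3";
      abi = "none";
      platform = "macosx_14_0_arm64"; # placeholder platform tag
      hash = lib.fakeHash; # replace with the real hash
    };
  };
in
python3Packages.buildPythonPackage {
  pname = "mlx";
  inherit version;
  format = "wheel";

  src = fetchPypi {
    pname = "mlx";
    inherit version;
    format = "wheel";
    # dist/python/abi must match the interpreter python3Packages provides.
    dist = "cp312";
    python = "cp312";
    abi = "cp312";
    platform = "macosx_14_0_arm64"; # placeholder platform tag
    hash = lib.fakeHash; # replace with the real hash
  };

  # mlx expects the kernels shipped by mlx-metal at runtime.
  propagatedBuildInputs = [ mlx-metal ];

  # The patching goes here; what is needed (e.g. rewriting dylib install
  # names) depends on the wheel, so it is elided in this sketch.
  # postFixup = ''...'';
}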

If anyone needs it, here is the resulting derivation. Below, you can see it in action:

echo "Hi" | nix shell .#llm -c llm prompt -m mlx-community/Llama-3.2-3B-Instruct-4bit

How can I assist you today?