Running opencode with Local LLMs: From Ollama to oMLX on Apple Silicon

A hands-on guide to running AI coding agents entirely on local models, from Ollama's llama.cpp to oMLX's native MLX backend, reaching 63 tok/s and a 128K context window on an M4 Pro.