Ollama

ollama/ollama · last checked 2026-06-18 16:02 UTC · 191 releases · recent

Release notes

v0.30.8 · recent

What's Changed

Fixed ollama launch selecting the wrong provider in some cases
Improved prompt caching by decoupling it from context shift for better KV cache reuse
More stable MLX inference with hardened linear and embedding layers
MLX runner now creates snapshots during prompt processing and speculative decoding for improved reliability
Improved recurrent model support with per-boundary states from the gated-delta kernels
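
For readers unfamiliar with the KV cache reuse mentioned in the prompt caching item, the general idea can be sketched as prefix matching: tokens shared between the previously cached sequence and the new prompt keep their cached state, and only the remaining suffix is processed again. The sketch below is a generic illustration, not Ollama's implementation; the function names (common_prefix_len, plan_reuse) and the token values are hypothetical.

```python
from typing import List, Sequence, Tuple


def common_prefix_len(cached: Sequence[int], prompt: Sequence[int]) -> int:
    """Count the leading token IDs shared by the cached sequence and the new prompt."""
    n = 0
    for a, b in zip(cached, prompt):
        if a != b:
            break
        n += 1
    return n


def plan_reuse(cached: Sequence[int], prompt: Sequence[int]) -> Tuple[int, List[int]]:
    """Return (number of cached positions to keep, tokens that still need processing)."""
    keep = common_prefix_len(cached, prompt)
    return keep, list(prompt[keep:])


# Hypothetical token IDs: the first three tokens match, so their cached
# state could be kept and only the last three tokens would be evaluated.
cached_tokens = [1, 15, 22, 7, 9]
new_prompt = [1, 15, 22, 8, 3, 4]
kept, to_process = plan_reuse(cached_tokens, new_prompt)
print(kept, to_process)  # 3 [8, 3, 4]
```

The longer the shared prefix, the less prompt processing is repeated, which is why a change that preserves more of the cache across requests can reduce latency.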

Full Changelog: https://github.com/ollama/ollama/compare/v0.30.7...v0.30.8