Ollama
Notes
Release notes
v0.15.5
New models
- GLM-OCR: GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.
- Qwen3-Coder-Next: a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development.
Improvements to ollama launch
- `ollama launch` can now be provided arguments, for example `ollama launch claude -- --resume`
- `ollama launch` will now run subagents when using `ollama launch claude`
- Ollama will now set context limits for a set of models when using `ollama launch opencode`
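As a minimal sketch of the argument form shown above, the following Python helper assembles the command line for `ollama launch` with extra arguments after the `--` separator. The function names `build_launch_command` and `launch` are hypothetical, and the assumption that everything after `--` is handed to the launched tool is inferred from the `ollama launch claude -- --resume` example rather than stated in these notes.

```python
import shlex
import subprocess
from typing import Sequence


def build_launch_command(integration: str, extra_args: Sequence[str] = ()) -> list[str]:
    """Build the argv list for `ollama launch <integration> [-- <extra args>]`.

    Hypothetical helper: it only mirrors the command shape from the release notes.
    """
    command = ["ollama", "launch", integration]
    if extra_args:
        # Assumption: arguments after `--` are passed on to the launched tool.
        command += ["--", *extra_args]
    return command


def launch(integration: str, extra_args: Sequence[str] = ()) -> int:
    """Run the command and return its exit code (requires `ollama` on PATH)."""
    return subprocess.run(build_launch_command(integration, extra_args)).returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_launch_command("claude", ["--resume"])))
```

Passing the command as a list rather than a single string avoids shell quoting issues when the forwarded arguments contain spaces.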
What's Changed
- Sub-agent support for `ollama launch` for planning, deep research, and similar tasks
- `ollama signin` will now open a browser window to make signing in easier
- Ollama will now default to the following context lengths based on VRAM:
  - < 24 GiB VRAM: 4,096 context
  - 24-48 GiB VRAM: 32,768 context
  - >= 48 GiB VRAM: 262,144 context
- GLM-4.7-Flash support on Ollama's experimental MLX engine
- `ollama signin` will now open the browser to the connect page
- Fixed off-by-one error when using `num_predict` in the API
- Fixed issue where tokens from a previous sequence would be returned when hitting `num_predict`
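The VRAM tiers and the `num_predict` option mentioned above can be illustrated with a short Python sketch. This is not Ollama's implementation: `default_context_length`, `build_generate_payload`, and `generate` are hypothetical names, the notes do not say how VRAM is measured (total or free, single or multiple GPUs), and the server address is assumed to be the usual local default. The tier boundaries follow the list as written, with exactly 48 GiB falling into the largest tier.

```python
import json
import urllib.request

GIB = 1024 ** 3
# Assumption: a local Ollama server listening on its usual default address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def default_context_length(vram_bytes: int) -> int:
    """Map a VRAM size to the default context length listed in these notes."""
    if vram_bytes < 24 * GIB:
        return 4_096
    if vram_bytes < 48 * GIB:
        return 32_768
    return 262_144


def build_generate_payload(model: str, prompt: str, num_predict: int) -> dict:
    """Build a non-streaming generate request that caps output at num_predict tokens."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_predict": num_predict},
    }


def generate(model: str, prompt: str, num_predict: int, url: str = OLLAMA_URL) -> dict:
    """Send the request and return the decoded JSON response (needs a running server)."""
    body = json.dumps(build_generate_payload(model, prompt, num_predict)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

With a server running and a model pulled, calling `generate` with `num_predict=16` and comparing the `eval_count` field of the response against 16 is one way to check that the token cap behaves as the fixes above describe.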
New Contributors
- @avukmirovich made their first contribution in https://github.com/ollama/ollama/pull/13934
Full Changelog: https://github.com/ollama/ollama/compare/v0.15.4...v0.15.5