Ollama
Notes
Release notes
v0.13.2
· 6m+
What's Changed
- Flash attention is now enabled by default for vision models such as
mistral-3,gemma3,qwen3-vland more. This improves memory utilization and performance when providing images as input. - Fixed GPU detection on multi-GPU CUDA machines
- Fixed issue where
deepseek-v3.1would always think even with thinking is disabled in Ollama's app
New Contributors
- @chengcheng84 made their first contribution in https://github.com/ollama/ollama/pull/13265
- @nathan-hook made their first contribution in https://github.com/ollama/ollama/pull/13256
Full Changelog: https://github.com/ollama/ollama/compare/v0.13.1...v0.13.2-rc0