Ollama

Release notes

v0.1.32

[Image: Ollama levelling up]

New models

WizardLM 2: A state-of-the-art large language model from Microsoft AI with improved performance on complex chat, multilingual, reasoning, and agent use cases.
- wizardlm2:8x22b: large 8x22B model based on Mixtral 8x22B
- wizardlm2:7b: fast, high-performing model based on Mistral 7B
Snowflake Arctic Embed: A suite of text embedding models by Snowflake, optimized for performance.
Command R+: A powerful, scalable large language model purpose-built for RAG use cases.
DBRX: A large 132B open, general-purpose LLM created by Databricks.
Mixtral 8x22B: The new leading Mixture of Experts (MoE) base model by Mistral AI.
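
As an illustration of how an embedding model such as Snowflake Arctic Embed might be used, the sketch below builds a request for Ollama's `/api/embeddings` endpoint and compares two vectors with cosine similarity. It assumes a local Ollama server at the default address `http://localhost:11434` and that the model has been pulled under the name `snowflake-arctic-embed`. Both are assumptions, not details stated in these notes, and the helper names are hypothetical.

```python
import json
import math
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def build_embedding_request(text, model="snowflake-arctic-embed"):
    """Build (but do not send) a POST request asking for one embedding.

    The model name is an assumption; use whatever name you pulled.
    """
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text, model="snowflake-arctic-embed"):
    """Send the request and return the embedding as a list of floats."""
    with urllib.request.urlopen(build_embedding_request(text, model)) as resp:
        return json.load(resp)["embedding"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

With a server running, `cosine_similarity(embed("llamas are camelids"), embed("alpacas are camelids"))` would give a rough measure of how closely related the two sentences are. Nothing is sent over the network until `embed` is called.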

Ollama now makes better use of available VRAM, leading to fewer out-of-memory errors and better GPU utilization.
When running larger models that don't fit into VRAM on macOS, Ollama will now split the model between GPU and CPU to maximize performance.
Fixed several issues where Ollama would hang upon encountering an error.
Fixed an issue where using quotes in OLLAMA_ORIGINS would cause an error.
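
To make the OLLAMA_ORIGINS fix concrete, the sketch below shows one way a comma-separated origin list could be normalized so that stray quotes do not end up inside the origin values. This is not Ollama's implementation (Ollama is written in Go, and these notes do not describe the fix itself). `parse_origins` is a hypothetical helper used only for illustration.

```python
def parse_origins(raw):
    """Split a comma-separated origin list, tolerating surrounding quotes.

    Hypothetical helper: strips whitespace and single or double quotes
    from each entry and drops entries that end up empty.
    """
    origins = []
    for item in raw.split(","):
        cleaned = item.strip().strip("\"'").strip()
        if cleaned:
            origins.append(cleaned)
    return origins
```

For example, a value such as `"http://localhost:3000","https://example.com"` (with the quotes included in the environment variable) would be reduced to the two bare origins.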

@sugarforever made their first contribution in https://github.com/ollama/ollama/pull/3400
@yaroslavyaroslav made their first contribution in https://github.com/ollama/ollama/pull/3378
@Nagi-ovo made their first contribution in https://github.com/ollama/ollama/pull/3423
@ParisNeo made their first contribution in https://github.com/ollama/ollama/pull/3436
@philippgille made their first contribution in https://github.com/ollama/ollama/pull/3437
@cesto93 made their first contribution in https://github.com/ollama/ollama/pull/3461
@ThomasVitale made their first contribution in https://github.com/ollama/ollama/pull/3515
@writinwaters made their first contribution in https://github.com/ollama/ollama/pull/3539
@alexmavr made their first contribution in https://github.com/ollama/ollama/pull/3555

Full Changelog: https://github.com/ollama/ollama/compare/v0.1.31...v0.1.32