Ollama

ollama/ollama last check 2026-06-19 01:01 UTC 191 releases recent

Release notes

v0.1.23 · 1y+

New vision models

The LLaVA model family on Ollama has been updated to version 1.6 and now includes a new 34B variant:

`ollama run llava`: a new 7B LLaVA model based on Mistral
`ollama run llava:13b`: the 13B LLaVA model
`ollama run llava:34b`: the 34B LLaVA model, one of the most powerful open-source vision models available

All three models share the following improvements:

More permissive licenses: LLaVA 1.6 models are distributed via the Apache 2.0 license or the LLaMA 2 Community License.
Higher image resolution: support for up to 4x more pixels, allowing the model to grasp more details.
Improved text recognition and reasoning capabilities: these models are trained on additional document, chart and diagram data sets.
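
As a rough illustration of using one of these models through the HTTP API rather than `ollama run`, the sketch below builds a request body with an attached image. It assumes the `/api/generate` endpoint accepts an `images` list of base64-encoded strings and a `stream` flag, as described in the Ollama API documentation. The helper name `build_vision_payload` and the placeholder bytes are hypothetical and exist only for this example.

```python
import base64
import json


def build_vision_payload(model, prompt, image_bytes):
    """Return a JSON-serialisable /api/generate body with one attached image.

    Hypothetical helper: the "images" and "stream" fields are assumed
    from the Ollama API documentation, not from these release notes.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "prompt": prompt,
        "images": [encoded],  # base64 strings, one per image
        "stream": False,      # ask for a single JSON response
    }


# Placeholder bytes stand in for a real file read with open(path, "rb").
payload = build_vision_payload(
    "llava:13b", "What is in this picture?", b"placeholder image bytes"
)
print(sorted(payload))
print(len(json.dumps(payload)) > 0)
```

The resulting dictionary could then be serialised with `json.dumps` and posted to a local server, for example at the `http://localhost:11434/api/generate` address used elsewhere in these notes.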

`keep_alive` parameter: control how long models stay loaded

When making API requests, the new keep_alive parameter can be used to control how long a model stays loaded in memory:

curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "keep_alive": "30s"
}'

If set to a positive duration (e.g. 20m, 1h, or 30 for 30 seconds), the model stays loaded for that duration.
If set to a negative duration (e.g. -1), the model stays loaded indefinitely.
If set to 0, the model is unloaded immediately after the response finishes.
If not set, the model stays loaded for 5 minutes by default.
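
The same request can be sketched in Python using only the standard library. This is a minimal example, not an official client: the function names `build_generate_payload` and `generate` are hypothetical, and `"stream": False` is an assumption taken from the Ollama API documentation so that the server returns one JSON object instead of a stream of lines. The URL matches the curl example above.

```python
import json
import urllib.request

# Default local endpoint, taken from the curl example in these notes.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_payload(model, prompt, keep_alive=None):
    """Build the JSON body for /api/generate (hypothetical helper).

    keep_alive is omitted when None so the server applies its default.
    A value of 0 is still sent, because 0 means "unload immediately".
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    if keep_alive is not None:
        payload["keep_alive"] = keep_alive
    return payload


def generate(model, prompt, keep_alive=None, url=OLLAMA_URL):
    """POST the request to a running Ollama server and return parsed JSON."""
    body = json.dumps(build_generate_payload(model, prompt, keep_alive))
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running locally, `generate("mistral", "Why is the sky blue?", keep_alive=-1)` would ask for the model to stay loaded indefinitely, while `keep_alive=0` would ask for it to be unloaded as soon as the response is complete.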

Support for more Nvidia GPUs

GeForce GTX: TITAN X, 980 Ti, 980, 970, 960, 950, 750 Ti, 750
GeForce GTX: 980M, 970M, 965M, 960M, 950M, 860M, 850M
GeForce: 940M, 930M, 910M, 840M, 830M
Quadro: M6000, M5500M, M5000, M2200, M1200, M620, M520
Tesla: M60, M40
NVS: 810

What's Changed

New keep_alive API parameter to control how long models stay loaded
Image paths can now be provided to ollama run when running multimodal models
Fixed issue where downloading models via ollama pull would slow down at 99%
Fixed error when running Ollama with Nvidia GPUs and CPUs without AVX instructions
Support for additional Nvidia GPUs (compute capability 5)
Fixed issue where the system prompt would be repeated in subsequent messages
ollama serve will now print the prompt when OLLAMA_DEBUG=1 is set
Fixed issue where exceeding the context size would cause erroneous responses in ollama run and the /api/chat API
ollama run will now allow sending messages without images to multimodal models

New Contributors

@jaglinux made their first contribution in https://github.com/ollama/ollama/pull/2224
@textspur made their first contribution in https://github.com/ollama/ollama/pull/2252
@rjmacarthy made their first contribution in https://github.com/ollama/ollama/pull/1950
@hugo53 made their first contribution in https://github.com/ollama/ollama/pull/1957
@RussellCanfield made their first contribution in https://github.com/ollama/ollama/pull/2313

Full Changelog: https://github.com/ollama/ollama/compare/v0.1.22...v0.1.23
