Ollama

ollama/ollama
Release notes
v0.12.10

ollama run now works with embedding models

ollama run now accepts embedding models and generates vector embeddings from the text passed to it:

ollama run embeddinggemma "Hello world"

Content can also be provided to ollama run via standard input:

echo "Hello world" | ollama run embeddinggemma
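The standard-input form lends itself to scripting. As a minimal sketch, the helper below embeds each non-empty line of a text stream by piping it to ollama run, one invocation per line. The function name embed_lines and the OLLAMA_BIN variable are hypothetical names introduced for this illustration (OLLAMA_BIN is not an Ollama setting), and the output is simply whatever ollama run prints for each input.

```shell
#!/bin/sh
# Hypothetical helper: embed each non-empty line read from standard input.
# OLLAMA_BIN is an assumed override for the binary path, used here only so
# the function can be pointed at a different executable; it is not part of Ollama.
embed_lines() {
  model="$1"
  # The "|| [ -n "$line" ]" part handles a final line without a trailing newline
  while IFS= read -r line || [ -n "$line" ]; do
    # Skip blank lines so no empty prompt is sent
    [ -n "$line" ] || continue
    # Provide the text on standard input, as in the echo example above
    printf '%s\n' "$line" | "${OLLAMA_BIN:-ollama}" run "$model"
  done
}
```

For example, with a hypothetical file sentences.txt holding one sentence per line, embed_lines embeddinggemma < sentences.txt writes one result per sentence to standard output. Starting a separate ollama run per line keeps the sketch simple; it is not tuned for large inputs.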

What's Changed

  • Fixed errors when running qwen3-vl:235b and qwen3-vl:235b-instruct
  • Enabled flash attention for Vulkan (currently requires building from source)
  • Added Vulkan memory detection for Intel GPUs using DXGI+PDH
  • Ollama will now return tool call IDs from the /api/chat API
  • Fixed a hang caused by CPU discovery
  • Ollama will now show login instructions when switching to a cloud model in interactive mode
  • Fixed reading of stale VRAM data
  • ollama run now works with embedding models

Full Changelog: https://github.com/ollama/ollama/compare/v0.12.9...v0.12.10