Ollama

Release notes: v0.3.4

New embedding models

  • BGE-M3: a large embedding model from BAAI distinguished for its versatility in Multi-Functionality, Multi-Linguality, and Multi-Granularity.
  • BGE-Large: a large embedding model trained in English.
  • Paraphrase-Multilingual: a multilingual embedding model trained on parallel data for 50+ languages.

New embedding API with batch support

Ollama now supports a new API endpoint /api/embed for embedding generation:

curl http://localhost:11434/api/embed -d '{
  "model": "all-minilm",
  "input": ["Why is the sky blue?", "Why is the grass green?"]
}'
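
The same batch request can be sketched in Python using only the standard library. The helper names build_embed_request and embed are hypothetical, and the sketch assumes an Ollama server is listening on localhost:11434 with the all-minilm model available.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
OLLAMA_URL = "http://localhost:11434/api/embed"


def build_embed_request(model, texts, url=OLLAMA_URL):
    """Build a POST request for /api/embed that carries a batch of inputs."""
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(model, texts, url=OLLAMA_URL):
    """Send the batch and return the parsed JSON response (needs a running server)."""
    with urllib.request.urlopen(build_embed_request(model, texts, url)) as resp:
        return json.load(resp)
```

Calling embed("all-minilm", ["Why is the sky blue?", "Why is the grass green?"]) sends both documents in one request. Where the vectors sit in the response is not stated here, so check the API documentation before indexing into the result.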

This API endpoint supports new features:

  • Batches: generate embeddings for several documents in one request
  • Normalized embeddings: embeddings are now normalized, improving similarity results
  • Truncation: a new truncate parameter; when set to false, the request returns an error if the input exceeds the model's context length instead of being truncated
  • Metrics: responses include load_duration, total_duration and prompt_eval_count metrics
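
One practical consequence of normalized embeddings, assuming "normalized" means each vector is scaled to unit L2 length, is that cosine similarity reduces to a plain dot product. The sketch below illustrates this with small made-up vectors rather than real model output; all function names are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed the general way, with explicit norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def l2_normalize(v):
    """Scale a vector to unit L2 length (assumed to match the server's normalization)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


# Made-up vectors standing in for embeddings.
a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

# For unit-length vectors the two measures agree.
print(math.isclose(dot(a, b), cosine(a, b)))
```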

See the API documentation for more details and examples.
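
As a rough sketch of how the truncate parameter and the metrics fields might be used from client code: the function names are hypothetical, and the conversion assumes the durations are reported in nanoseconds, which should be checked against the API documentation.

```python
def embed_payload(model, texts, truncate=True):
    """Request body for /api/embed; truncate=False asks for an error
    instead of silently shortening over-long input."""
    return {"model": model, "input": list(texts), "truncate": truncate}


def summarize_metrics(response):
    """Extract the three metrics named in the release notes from a response dict."""
    # Assumption: load_duration and total_duration are in nanoseconds.
    return {
        "load_ms": response["load_duration"] / 1e6,
        "total_ms": response["total_duration"] / 1e6,
        "prompt_tokens": response["prompt_eval_count"],
    }
```

The dict returned by embed_payload can be serialized with json.dumps and sent as the request body.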

What's Changed

  • Fixed initial slow download speeds on Windows
  • NUMA support will now be autodetected by Ollama to improve performance
  • Fixed an issue where /api/embed would sometimes return embedding results out of order
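
With result order fixed, a client can match each input to its embedding by position. A minimal sketch, assuming the response holds one vector per input in request order; pair_inputs is a hypothetical helper.

```python
def pair_inputs(texts, embeddings):
    """Pair each input text with the embedding at the same position."""
    if len(texts) != len(embeddings):
        # A length mismatch means positional matching cannot be trusted.
        raise ValueError("expected one embedding per input")
    return list(zip(texts, embeddings))
```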

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.3...v0.3.4