Ollama Release Notes
v0.1.19
This release focuses on performance and on fixing a number of issues and crashes related to memory allocation.
New Models
- LLaMa-Pro: An expansion of LLaMa by Tencent to an 8B model that specializes in language, programming, and mathematics.
What's Changed
- Fixed "out of memory" errors when running models such as llama2, mixtral or llama2:13b with limited GPU memory
- Fixed CUDA errors when running on older GPUs that aren't yet supported
- Increasing context size with num_ctx will now work (up to a model's supported context window).
To use a 32K context window with Mistral:
# ollama run
/set parameter num_ctx 32768
# api
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "options": {"num_ctx": 32768}
}'
}'
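The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_generate_request` and `generate` are hypothetical, the endpoint is the default local address from the curl example, and `"stream": False` is an addition not shown in these notes, assumed here so that the server returns a single JSON object rather than a stream of chunks. The value 32768 is 32 × 1024 tokens.

```python
import json
import urllib.request

# Default local endpoint, taken from the curl example above.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt, num_ctx):
    """Build the JSON body for /api/generate with a custom context size."""
    return {
        "model": model,
        "prompt": prompt,
        # Assumption: request one JSON object instead of a chunked stream.
        "stream": False,
        # num_ctx only takes effect up to the model's supported context window.
        "options": {"num_ctx": num_ctx},
    }


def generate(model, prompt, num_ctx, url=OLLAMA_URL):
    """POST the request to a running Ollama server and return the generated text."""
    body = json.dumps(build_generate_request(model, prompt, num_ctx)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Requires a local Ollama server with the mistral model pulled.
    print(generate("mistral", "Why is the sky blue?", 32768))
```

Keeping the payload construction separate from the network call makes it easy to inspect or log the options before anything is sent to the server.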
- Larger models such as mixtral can now be run on Macs with less memory
- Fixed an issue where pressing up or down arrow keys would cause the wrong prompt to show in ollama run
- Fixed performance issues on Intel Macs
- Fixed an error that would occur with old Nvidia GPUs
- OLLAMA_ORIGINS now supports browser extension URLs
- Ollama will now offload more processing to the GPU where possible
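As a rough sketch of how the OLLAMA_ORIGINS change might be used, the snippet below starts the server with browser extension origins allowed. Several details are assumptions rather than facts stated in these notes: that OLLAMA_ORIGINS takes a comma-separated list, that wildcard patterns such as `chrome-extension://*` and `moz-extension://*` are accepted, and that `ollama serve` is on the PATH. The helper name `ollama_env` is hypothetical.

```python
import os
import subprocess


def ollama_env(origins, base_env=None):
    """Return a copy of the environment with OLLAMA_ORIGINS set.

    Assumption: the server reads OLLAMA_ORIGINS as a comma-separated list.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OLLAMA_ORIGINS"] = ",".join(origins)
    return env


if __name__ == "__main__":
    # Assumption: these wildcard patterns are accepted for extension URLs.
    env = ollama_env(["chrome-extension://*", "moz-extension://*"])
    # Runs the server in the foreground until it is interrupted.
    subprocess.run(["ollama", "serve"], env=env, check=True)
```

In practice, listing a specific extension origin is likely safer than a wildcard, since any allowed origin can call the local API from the browser.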
New Contributors
- @sublimator made their first contribution in https://github.com/jmorganca/ollama/pull/1797
- @gbaptista made their first contribution in https://github.com/jmorganca/ollama/pull/1830
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.18...v0.1.19