Ollama

0.23.1
Release notes:

Gemma 4 MTP (Multi-Token Prediction) for the MLX runner

Gemma 4 MTP speculative decoding is now supported on Macs through the MLX runner. It can more than double generation speed for the Gemma 4 31B model on coding tasks.

ollama run gemma4:31b-coding-mtp-bf16
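
To check the speedup on your own machine, one option is to read the timing fields that Ollama's `/api/generate` endpoint returns with a non-streaming response. The following is a minimal sketch, assuming a local server on the default port 11434 and that the model above has already been pulled. The helper names `tokens_per_second` and `generate` are illustrative and not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model tag taken from the release notes above.
MODEL = "gemma4:31b-coding-mtp-bf16"


def tokens_per_second(response: dict) -> float:
    """Decode throughput from a non-streaming /api/generate response.

    eval_count is the number of generated tokens and eval_duration is
    the time spent generating them, in nanoseconds.
    """
    return response["eval_count"] * 1e9 / response["eval_duration"]


def generate(model: str, prompt: str, url: str = OLLAMA_URL) -> dict:
    """Send one non-streaming generation request and return the parsed JSON."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())
```

Calling `tokens_per_second(generate(MODEL, "Write a function that reverses a linked list."))` returns the decode rate in tokens per second. Running the same prompt against a Gemma 4 31B tag without MTP gives a baseline to compare against. Results will vary with prompt, hardware, and output length.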

What's Changed

  • Update MLX and MLX-C with threading fixes by @dhiltgen in https://github.com/ollama/ollama/pull/15845
  • go: bump to 1.26 by @ParthSareen in https://github.com/ollama/ollama/pull/15904
  • Add Gemma 4 MTP speculative decoding by @pdevine in https://github.com/ollama/ollama/pull/15980

Full Changelog: https://github.com/ollama/ollama/compare/v0.23.0...v0.23.1-rc0
