GGUF

TurboQuant in gguf-runner: roughly half the memory at nearly the same speed

Implementing TurboQuant in gguf-runner cuts KV-cache memory roughly in half while staying close to Q8 throughput.
Created 2026

gguf-runner updates: vision support, releases, and many small improvements

gguf-runner gained vision support, ships GitHub release binaries, and received many usability and performance improvements.
Created 2026