1. TurboQuant in gguf-runner: roughly half the memory at nearly the same speed

    Implementing TurboQuant in gguf-runner cuts KV-cache memory roughly in half while staying close to Q8 throughput.

  2. gguf-runner updates: vision support, releases, and many small improvements

    gguf-runner gained vision support, ships GitHub release binaries, and received many usability and performance improvements.

  3. gguf-runner: a minimal GGUF CLI

    A small Rust CLI to run GGUF models locally: mmap loading, CPU-only inference, and a general-purpose terminal runner that can lean on RAM (and swap) for large models.

  4. An e-paper picture frame

  5. A Special-Purpose HTTP Proxy in Rust

  6. An HTTP Server-Timing Header for axum

  7. Write a KEDA external Scaler for Oracle in Rust

  8. How to access Azure Key Vault in Rust

  9. A hello world Kubernetes operator in Rust