Speculative KV coding losslessly compresses the key-value (KV) cache by up to 4x, potentially cutting memory costs and allowing larger models to run on existing hardware.
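To give a rough sense of what an up-to-4x reduction could mean in memory terms, here is a minimal sketch of the standard KV cache size arithmetic. The model configuration (32 layers, 8 KV heads, head dimension 128, 32,768-token context, 16-bit values) and the helper name `kv_cache_bytes` are hypothetical and chosen only for illustration. The sketch does not implement speculative KV coding; it simply applies the best-case 4x ratio from the summary to an uncompressed size.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch=1, bytes_per_value=2):
    """Uncompressed KV cache size in bytes for a decoder-only transformer."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value


# Hypothetical configuration (an assumption, not taken from the source).
baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=32768)

# Best-case ratio quoted in the summary ("up to 4x"); actual ratios may be lower.
compressed = baseline // 4

GIB = 2**30
print(f"baseline:   {baseline / GIB:.1f} GiB")
print(f"compressed: {compressed / GIB:.1f} GiB")
```

Under these assumptions, the cache for a single sequence shrinks from 4 GiB to 1 GiB, and the freed memory could in principle go toward a longer context, a larger batch, or a larger model.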