Pre-computed Q-Filters for efficient KV cache compression.
This model has been pushed to the Hub using the PyTorchModelHubMixin integration:
- Library: [More Information Needed]
- Docs: [More Information Needed]
