Topic: "neural-compressor"
intel/auto-round
Advanced Quantization Algorithm for LLMs/VLMs.
Language: Python - Size: 10.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 457 - Forks: 37
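
AutoRound performs weight-only quantization of LLMs/VLMs via a tuning loop over rounding values. A minimal usage sketch following the project's documented Python workflow is below; the model name is a placeholder and exact argument defaults may differ across releases.

```python
# Hedged sketch: 4-bit weight-only quantization with AutoRound.
# Assumes auto-round, transformers, and torch are installed; argument names
# follow the project's README-style API and may vary between versions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "facebook/opt-125m"  # small model chosen only for illustration
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Configure weight-only quantization: 4 bits, symmetric, group size 128.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
autoround.quantize()

# Export the quantized checkpoint for later inference.
autoround.save_quantized("./opt-125m-autoround", format="auto_round")
```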

huggingface/optimum-benchmark
🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support for Optimum's hardware optimizations & quantization schemes.
Language: Python - Size: 8.28 MB - Last synced at: about 22 hours ago - Pushed at: 2 days ago - Stars: 300 - Forks: 58
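
optimum-benchmark can be driven from the CLI (Hydra configs) or from Python. The sketch below follows the Python API shown in the project's examples; the config classes and fields used here are assumptions and may differ between versions.

```python
# Hedged sketch: measure inference latency/memory of a PyTorch backend with
# optimum-benchmark. Class and argument names follow the project's documented
# examples and may vary between releases.
from optimum_benchmark import (
    Benchmark,
    BenchmarkConfig,
    InferenceConfig,
    ProcessConfig,
    PyTorchConfig,
)

if __name__ == "__main__":
    launcher = ProcessConfig()                              # run the benchmark in a separate process
    scenario = InferenceConfig(latency=True, memory=True)   # what to measure
    backend = PyTorchConfig(model="gpt2", device="cpu", no_weights=True)  # backend under test

    config = BenchmarkConfig(
        name="pytorch_gpt2",
        launcher=launcher,
        scenario=scenario,
        backend=backend,
    )
    report = Benchmark.launch(config)  # returns a report with the collected measurements
    report.log()                       # print the measurements via the library's logger
```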
