An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: neural-compressor

huggingface/optimum-benchmark

🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support for Optimum's hardware optimizations & quantization schemes.

Language: Python - Size: 8.3 MB - Last synced at: 4 days ago - Pushed at: 20 days ago - Stars: 303 - Forks: 58

intel/auto-round

Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU. Seamlessly integrated with Torchao, Transformers, and vLLM.

Language: Python - Size: 10.9 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 497 - Forks: 41