GitHub topics: neural-compressor

Repositories

huggingface/optimum-benchmark

🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support for Optimum's hardware optimizations and quantization schemes.

Language: Python - Size: 8.3 MB - Last synced at: 4 days ago - Pushed at: 20 days ago - Stars: 303 - Forks: 58

intel/auto-round

Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU. Seamlessly integrated with Torchao, Transformers, and vLLM.

Language: Python - Size: 10.9 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 497 - Forks: 41

Related Keywords

neural-compressor (2), benchmark (1), onnxruntime (1), openvino (1), pytorch (1), tensorrt-llm (1), text-generation-inference (1), awq (1), gptq (1), int4 (1), quantization (1), rounding (1)
