An open API service providing repository metadata for many open source software ecosystems.

Topic: "mxformat"

intel/neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Language: Python - Size: 469 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2,414 - Forks: 269

intel/neural-speed (archived)

An innovative library for efficient LLM inference via low-bit quantization

Language: C++ - Size: 16.2 MB - Last synced at: 7 days ago - Pushed at: 9 months ago - Stars: 348 - Forks: 38