Topic: "mxformat"
intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Language: Python - Size: 469 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2,414 - Forks: 269
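The "mxformat" topic refers to the OCP Microscaling (MX) formats these libraries target: values are grouped into fixed-size blocks (32 elements in the MX spec), and each block shares a single power-of-two scale while elements are stored in a narrow type such as INT8 or FP4. The sketch below illustrates that block-scaling idea in plain Python with INT8 elements; it is a minimal illustration of the concept, not neural-compressor's API, and the function names are made up for this example.

```python
import math

BLOCK = 32  # MX block size per the OCP Microscaling spec


def quantize_mx_int8(values):
    """Toy MX-style block quantization (illustrative, not a library API).

    Each block of up to 32 values shares one power-of-two scale chosen so
    the block's largest magnitude fits in INT8; elements are stored as
    rounded integers in [-128, 127].
    """
    blocks = []
    for i in range(0, len(values), BLOCK):
        block = values[i:i + BLOCK]
        amax = max(abs(v) for v in block) or 1.0  # avoid log2(0) on all-zero blocks
        # Shared scale: smallest power of two such that amax / scale <= 127
        scale = 2.0 ** math.ceil(math.log2(amax / 127.0))
        q = [max(-128, min(127, round(v / scale))) for v in block]
        blocks.append((scale, q))
    return blocks


def dequantize_mx_int8(blocks):
    """Reconstruct floats by multiplying each element by its block's scale."""
    return [v * scale for scale, q in blocks for v in q]
```

Because the scale is a power of two, hardware can apply it with an exponent adjustment rather than a multiply, which is one motivation for the MX design; per-element error is bounded by half the block's scale.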

intel/neural-speed 📦
An innovative library for efficient LLM inference via low-bit quantization
Language: C++ - Size: 16.2 MB - Last synced at: 7 days ago - Pushed at: 9 months ago - Stars: 348 - Forks: 38
