GitHub topics: int2
intel/neural-speed 📦
An innovative library for efficient LLM inference via low-bit quantization
Language: C++ - Size: 16.2 MB - Last synced at: about 17 hours ago - Pushed at: 9 months ago - Stars: 348 - Forks: 38
