Topic: "llm-quantization"
snu-mllab/GuidedQuant
Official PyTorch implementation of "GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance" (ICML 2025)
Language: Python - Size: 3.38 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 31 - Forks: 0
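The GuidedQuant idea of "end loss guidance" suggests weighting quantization error by how much each weight matters to the final loss, rather than minimizing plain weight MSE. A minimal sketch of that flavor of saliency-weighted quantization, assuming squared loss gradients as the per-weight saliency (the function name `quantize_guided` and the grid-search scale selection are illustrative, not the paper's actual algorithm):

```python
import numpy as np

def quantize_guided(w, saliency, bits=4, n_grid=80):
    # Search a per-tensor scale that minimizes the saliency-weighted
    # squared error, instead of the usual unweighted MSE.
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.abs(w).max()
    best_scale, best_err = max_abs / qmax, np.inf
    for frac in np.linspace(0.3, 1.0, n_grid):
        scale = frac * max_abs / qmax
        q = np.clip(np.round(w / scale), -qmax - 1, qmax)
        err = np.sum(saliency * (w - q * scale) ** 2)
        if err < best_err:
            best_err, best_scale = err, scale
    q = np.clip(np.round(w / best_scale), -qmax - 1, qmax)
    return q * best_scale, best_scale

rng = np.random.default_rng(0)
w = rng.normal(size=512)
g = rng.normal(size=512) ** 2   # stand-in for per-weight end-loss gradients
w_hat, s = quantize_guided(w, g)
```

Because the plain max-abs scale is one of the searched candidates, the guided result is never worse than naive round-to-nearest under the weighted error.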
GongCheng1919/bias-compensation
[CAAI AIR'24] Minimize Quantization Output Error with Bias Compensation
Language: Python - Size: 918 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 7 - Forks: 1
t81dev/ternary
Ternary quantization for LLMs: balanced ternary (T3_K) weights yielding roughly 2.63 bits per weight, presented as a working scheme for modern large language models.
Language: C++ - Size: 31.3 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0
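Balanced ternary quantization maps each weight to {-1, 0, +1} times a shared scale. A minimal sketch using the common TWN-style heuristic (threshold at 0.7 times the mean absolute weight, scale from the surviving weights); this illustrates ternarization in general, not the repo's T3_K packing, and the 0.7 ratio is an assumption:

```python
import numpy as np

def ternarize(w, delta_ratio=0.7):
    # Zero out small weights, map the rest to +/-1, and fit a single
    # scale alpha to the surviving magnitudes.
    delta = delta_ratio * np.abs(w).mean()
    t = (np.sign(w) * (np.abs(w) > delta)).astype(np.int8)  # {-1, 0, +1}
    mask = t != 0
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return t, alpha

rng = np.random.default_rng(2)
w = rng.normal(size=1024)
t, alpha = ternarize(w)           # reconstruct with alpha * t
```

The sub-2-bit storage figures quoted for such schemes come from packing trits densely (log2(3) ≈ 1.58 bits each) plus per-group scale overhead.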
MagicTeaMC/AutoGGUF
A tool for quickly converting models to GGUF files.
Language: Python - Size: 19.5 KB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0
paraglondhe098/sentiment-classification-llm
Implements and fine-tunes BERT for a custom sequence-classification task, using LoRA adapters for parameter-efficient updates and 4-bit quantization to reduce memory and compute cost.
Language: Jupyter Notebook - Size: 6.66 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
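The LoRA-plus-4-bit recipe keeps the base weight frozen in quantized form and trains only a low-rank update. A minimal numpy sketch of the mechanics, assuming a symmetric round-to-nearest 4-bit quantizer as a stand-in for NF4 (the class `LoRALinear` and its hyperparameters are illustrative, not the notebook's code):

```python
import numpy as np

def quantize_4bit(W):
    # Symmetric round-to-nearest into 16 levels (stand-in for NF4).
    scale = np.abs(W).max() / 7
    return np.clip(np.round(W / scale), -8, 7) * scale

class LoRALinear:
    # y = x @ (Wq + scaling * B @ A).T, with Wq frozen and only the
    # low-rank factors A, B trainable (rank r << min(d_out, d_in)).
    def __init__(self, W, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.Wq = quantize_4bit(W)                              # frozen
        self.A = rng.normal(scale=0.01, size=(r, W.shape[1]))   # trainable
        self.B = np.zeros((W.shape[0], r))                      # trainable
        self.scaling = alpha / r

    def forward(self, x):
        return x @ (self.Wq + self.scaling * self.B @ self.A).T

layer = LoRALinear(np.random.default_rng(3).normal(size=(32, 64)))
y = layer.forward(np.ones((2, 64)))
```

Zero-initializing `B` makes the adapter a no-op at the start of fine-tuning, so the first forward pass reproduces the frozen quantized layer exactly.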
nagababumo/Quantization-in-Depth
Language: Jupyter Notebook - Size: 5.81 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0