GitHub topics: llm-quantization

Repositories

snu-mllab/GuidedQuant

Official PyTorch implementation of "GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance" (ICML 2025)

Language: Python - Size: 3.38 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 31 - Forks: 0

MagicTeaMC/AutoGGUF

Let me make GGUF files quickly

Language: Python - Size: 19.5 KB - Last synced at: 27 days ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

GongCheng1919/bias-compensation

[CAAI AIR'24] Minimize Quantization Output Error with Bias Compensation

Language: Python - Size: 918 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 7 - Forks: 1

paraglondhe098/sentiment-classification-llm

Implemented and fine-tuned BERT for a custom sequence classification task, leveraging LoRA adapters for efficient parameter updates and 4-bit quantization to optimize performance and resource utilization.

Language: Jupyter Notebook - Size: 6.66 MB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

nagababumo/Quantization-in-Depth

Language: Jupyter Notebook - Size: 5.81 MB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Related Keywords

llm-quantization (5), quantization (3), llm (2), torch-quantization (1), pytorch (1), hugging-face-hub (1), hugging-face (1), dequantization (1), 2-bit (1), qlora (1), peft-fine-tuning-llm (1), nlp-augmentation (1), nlp (1), lora (1), llm-fine-tuning (1), data-augmentation (1), post-training-quantization (1), output-error-optimization (1), llm-compression (1), bias-compensation (1), llamacpp (1), llama-cpp (1), gguf (1), llm-inference (1), large-language-models (1), efficient-inference (1)
