Topic: "llm-compression"
horseee/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
Language: Python - Size: 62.3 MB - Last synced at: 7 days ago - Pushed at: 14 days ago - Stars: 1,637 - Forks: 130

pprp/Pruner-Zero
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs
Language: Python - Size: 1.07 MB - Last synced at: 27 days ago - Pushed at: 5 months ago - Stars: 80 - Forks: 6
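For context on what a "pruning metric" is: unstructured pruning scores each weight and zeroes the lowest-scoring fraction. The sketch below uses plain weight magnitude as the score; this is a generic baseline, not the evolved symbolic metric that Pruner-Zero searches for.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(6, 6))  # toy weight matrix

def prune_by_magnitude(W, sparsity=0.5):
    """Zero out the `sparsity` fraction of weights with smallest |w|."""
    k = int(W.size * sparsity)
    # k-th smallest absolute value acts as the pruning threshold
    thresh = np.sort(np.abs(W), axis=None)[k]
    return np.where(np.abs(W) >= thresh, W, 0.0)

Wp = prune_by_magnitude(W, sparsity=0.5)
```

Pruner-Zero replaces the `|w|` score with a symbolic expression (over weights, gradients, etc.) discovered by evolutionary search, but the thresholding step stays the same.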

lliai/D2MoE
D^2-MoE: Delta Decompression for MoE-based LLM Compression
Language: Python - Size: 1.89 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 36 - Forks: 3
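The core idea behind delta compression for MoE models is that each expert's weights sit close to a shared base matrix, so only the (small) delta needs to be stored, typically in low-rank form. A minimal sketch under that assumption, using truncated SVD on the delta (illustrative only, not D^2-MoE's exact procedure):

```python
import numpy as np

rng = np.random.default_rng(2)
W_base = rng.normal(size=(8, 8))                 # shared base weights
W_expert = W_base + 0.1 * rng.normal(size=(8, 8))  # expert = base + small delta

# Compress: store base once, plus a rank-r factorization of the delta
U, s, Vt = np.linalg.svd(W_expert - W_base, full_matrices=False)
r = 2
delta_lr = (U[:, :r] * s[:r]) @ Vt[:r]           # best rank-r approximation

# Decompress: reconstruct the expert on the fly
W_approx = W_base + delta_lr
```

Storing `U[:, :r]`, `s[:r]`, and `Vt[:r]` costs O(r·(m+n)) per expert instead of O(m·n), and the Eckart–Young theorem guarantees the rank-r delta is the best such approximation in Frobenius norm.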

VITA-Group/llm-kick
[ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing LLMs: The Truth Is Rarely Pure and Never Simple.
Language: Python - Size: 7.11 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 23 - Forks: 5

Picovoice/llm-compression-benchmark
LLM Compression Benchmark
Language: Python - Size: 13.7 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 21 - Forks: 0

Picovoice/serverless-picollm
LLM Inference on AWS Lambda
Language: Python - Size: 21.8 MB - Last synced at: 17 days ago - Pushed at: 11 months ago - Stars: 10 - Forks: 0

GongCheng1919/bias-compensation
[CAAI AIR'24] Minimize Quantization Output Error with Bias Compensation
Language: Python - Size: 918 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 7 - Forks: 1
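Bias compensation in this setting means absorbing the systematic part of the quantization output error into an additive bias term, calibrated on sample inputs. A minimal numpy sketch of that idea (a generic illustration, not the CAAI AIR'24 paper's exact method):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))    # full-precision weights
X = rng.normal(size=(16, 8))   # calibration inputs

# Uniform symmetric 4-bit quantization of the weights (sketch)
scale = np.abs(W).max() / 7
Wq = np.clip(np.round(W / scale), -8, 7) * scale

# Bias compensation: measure the mean output error on calibration
# data and fold it into a per-output-channel bias.
err = X @ (W - Wq).T           # output error caused by quantization
b_comp = err.mean(axis=0)      # compensation bias

y_ref = X @ W.T                # full-precision output
y_naive = X @ Wq.T             # quantized, no compensation
y_comp = X @ Wq.T + b_comp     # quantized + bias compensation
```

Since subtracting the mean error leaves only its variance, the compensated output's mean squared error can never exceed the naive quantized output's on the calibration set.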

bupt-ai-club/llm-compression-papers
Papers on LLM compression
Size: 103 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0
