Topic: "llm-compression"
horseee/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
Language: Python - Size: 62.3 MB - Last synced at: 7 days ago - Pushed at: 14 days ago - Stars: 1,637 - Forks: 130

pprp/Pruner-Zero
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs
Language: Python - Size: 1.07 MB - Last synced at: 27 days ago - Pushed at: 5 months ago - Stars: 80 - Forks: 6
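For context on what a "pruning metric" is: unstructured pruning scores each weight and zeroes the lowest-scoring fraction. The sketch below uses plain weight magnitude as the score; this is a generic baseline, not the evolved symbolic metric that Pruner-Zero searches for.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(6, 6))  # toy weight matrix

def prune_by_magnitude(W, sparsity=0.5):
    """Zero out the `sparsity` fraction of weights with smallest |w|."""
    k = int(W.size * sparsity)
    # k-th smallest absolute value acts as the pruning threshold
    thresh = np.sort(np.abs(W), axis=None)[k]
    return np.where(np.abs(W) >= thresh, W, 0.0)

Wp = prune_by_magnitude(W, sparsity=0.5)
```

Pruner-Zero replaces the `|w|` score with a symbolic expression (over weights, gradients, etc.) discovered by evolutionary search, but the thresholding step stays the same.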

lliai/D2MoE
D^2-MoE: Delta Decompression for MoE-based LLM Compression
Language: Python - Size: 1.89 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 36 - Forks: 3
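The core idea behind delta compression for MoE models is that each expert's weights sit close to a shared base matrix, so only the (small) delta needs to be stored, typically in low-rank form. A minimal sketch under that assumption, using truncated SVD on the delta (illustrative only, not D^2-MoE's exact procedure):

```python
import numpy as np

rng = np.random.default_rng(2)
W_base = rng.normal(size=(8, 8))                 # shared base weights
W_expert = W_base + 0.1 * rng.normal(size=(8, 8))  # expert = base + small delta

# Compress: store base once, plus a rank-r factorization of the delta
U, s, Vt = np.linalg.svd(W_expert - W_base, full_matrices=False)
r = 2
delta_lr = (U[:, :r] * s[:r]) @ Vt[:r]           # best rank-r approximation

# Decompress: reconstruct the expert on the fly
W_approx = W_base + delta_lr
```

Storing `U[:, :r]`, `s[:r]`, and `Vt[:r]` costs O(r·(m+n)) per expert instead of O(m·n), and the Eckart–Young theorem guarantees the rank-r delta is the best such approximation in Frobenius norm.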

VITA-Group/llm-kick
[ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing LLMs: The Truth Is Rarely Pure and Never Simple.
Language: Python - Size: 7.11 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 23 - Forks: 5

Picovoice/llm-compression-benchmark
LLM Compression Benchmark
Language: Python - Size: 13.7 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 21 - Forks: 0

Picovoice/serverless-picollm
LLM Inference on AWS Lambda
Language: Python - Size: 21.8 MB - Last synced at: 17 days ago - Pushed at: 11 months ago - Stars: 10 - Forks: 0

GongCheng1919/bias-compensation
[CAAI AIR'24] Minimize Quantization Output Error with Bias Compensation
Language: Python - Size: 918 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 7 - Forks: 1
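Bias compensation in this setting means absorbing the systematic part of the quantization output error into an additive bias term, calibrated on sample inputs. A minimal numpy sketch of that idea (a generic illustration, not the CAAI AIR'24 paper's exact method):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))    # full-precision weights
X = rng.normal(size=(16, 8))   # calibration inputs

# Uniform symmetric 4-bit quantization of the weights (sketch)
scale = np.abs(W).max() / 7
Wq = np.clip(np.round(W / scale), -8, 7) * scale

# Bias compensation: measure the mean output error on calibration
# data and fold it into a per-output-channel bias.
err = X @ (W - Wq).T           # output error caused by quantization
b_comp = err.mean(axis=0)      # compensation bias

y_ref = X @ W.T                # full-precision output
y_naive = X @ Wq.T             # quantized, no compensation
y_comp = X @ Wq.T + b_comp     # quantized + bias compensation
```

Since subtracting the mean error leaves only its variance, the compensated output's mean squared error can never exceed the naive quantized output's on the calibration set.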

bupt-ai-club/llm-compression-papers
Papers on LLM compression
Size: 103 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0
