GitHub topics: efficient-llm-inference
hao-ai-lab/Consistency_LLM
[ICML 2024] CLLMs: Consistency Large Language Models
Language: Python - Size: 16.9 MB - Last synced at: 19 days ago - Pushed at: 7 months ago - Stars: 391 - Forks: 16

GATECH-EIC/AmoebaLLM
[NeurIPS 2024] "AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment" by Yonggan Fu, Zhongzhi Yu, Junwei Li, Jiayi Qian, Yongan Zhang, Xiangchi Yuan, Dachuan Shi, Roman Yakunin, and Yingyan (Celine) Lin.
Language: Python - Size: 28.8 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

snu-mllab/Context-Memory
PyTorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24)
Language: Python - Size: 2.1 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 43 - Forks: 1
