An open API service providing repository metadata for many open-source software ecosystems.

GitHub topics: efficient-llm-inference

hao-ai-lab/Consistency_LLM

[ICML 2024] CLLMs: Consistency Large Language Models

Language: Python - Size: 16.9 MB - Last synced at: 19 days ago - Pushed at: 7 months ago - Stars: 391 - Forks: 16

GATECH-EIC/AmoebaLLM

[NeurIPS 2024] "AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment" by Yonggan Fu, Zhongzhi Yu, Junwei Li, Jiayi Qian, Yongan Zhang, Xiangchi Yuan, Dachuan Shi, Roman Yakunin, and Yingyan (Celine) Lin.

Language: Python - Size: 28.8 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

snu-mllab/Context-Memory

PyTorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24)

Language: Python - Size: 2.1 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 43 - Forks: 1