Topic: "inference-speed"
Ki6an/fastT5
⚡ Boost the inference speed of T5 models by 5x and reduce the model size by 3x.
Language: Python - Size: 277 KB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 578 - Forks: 73
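
A rough usage sketch based on fastT5's documented quickstart (the model name, prompt, and beam count below are arbitrary examples): the library exports the T5 encoder/decoder to ONNX, typically applies 8-bit quantization, and hands back a drop-in model that still supports `generate()`.

```python
from fastT5 import export_and_get_onnx_model
from transformers import AutoTokenizer

model_name = "t5-small"                      # example checkpoint
model = export_and_get_onnx_model(model_name)  # export to ONNX + quantize
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "translate English to French: The weather is nice today."
inputs = tokenizer(prompt, return_tensors="pt")

# The returned object mimics the Hugging Face generation API.
tokens = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    num_beams=2,
)
print(tokenizer.decode(tokens.squeeze(), skip_special_tokens=True))
```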

HKUDS/SepLLM
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
Language: Python - Size: 170 MB - Last synced at: 23 days ago - Pushed at: 5 months ago - Stars: 68 - Forks: 3
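
The repository accompanies the SepLLM paper, whose core observation is that attention mass concentrates on separator tokens, so a whole segment's KV cache can be condensed into its trailing separator. The sketch below is purely conceptual and is not the repository's actual API; the function names, separator list, and window sizes are hypothetical placeholders for the idea of keeping only sink, separator, and recent-window entries.

```python
# Conceptual illustration only (not SepLLM's code): prune a KV cache so that
# just the initial "sink" tokens, separator tokens, and a recent window remain.
import torch

def select_kept_positions(token_ids, separator_ids, n_initial=4, n_recent=64):
    """Return sorted cache indices to keep (illustrative heuristic)."""
    seq_len = token_ids.shape[-1]
    keep = set(range(min(n_initial, seq_len)))                # attention-sink tokens
    keep.update(range(max(0, seq_len - n_recent), seq_len))   # local recent window
    for i, tok in enumerate(token_ids.tolist()):
        if tok in separator_ids:                              # separators stand in
            keep.add(i)                                       # for their segments
    return torch.tensor(sorted(keep), dtype=torch.long)

def compress_kv_cache(keys, values, kept):
    """Gather kept positions along the sequence axis.
    keys/values: (batch, heads, seq_len, head_dim)."""
    return keys[:, :, kept, :], values[:, :, kept, :]
```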

renebidart/text-classification-benchmark
Inference speed vs. accuracy tradeoff for text classification with transformer models such as BERT, RoBERTa, DeBERTa, SqueezeBERT, MobileBERT, and Funnel Transformer.
Language: Jupyter Notebook - Size: 1.49 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0
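
A hedged sketch of what such a speed/accuracy comparison typically looks like with Hugging Face transformers (this is not the notebook's code; the checkpoint names, batch size, and repetition count are illustrative): time the same batch through several classifier backbones and report mean per-batch latency.

```python
import time
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoints = [                                   # example checkpoints only
    "bert-base-uncased",
    "squeezebert/squeezebert-uncased",
    "google/mobilebert-uncased",
]
texts = ["This movie was surprisingly good."] * 32  # one fixed batch

for name in checkpoints:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name).eval()
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        model(**batch)                            # warm-up pass
        start = time.perf_counter()
        for _ in range(10):                       # timed repetitions
            model(**batch)
    print(f"{name}: {(time.perf_counter() - start) / 10 * 1000:.1f} ms/batch")
```

Accuracy would be measured separately on a labeled test set with fine-tuned checkpoints; plotting latency against accuracy per model gives the tradeoff curve the benchmark describes.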
