Topic: "model-acceleration"
he-y/Awesome-Pruning
A curated list of neural network pruning resources.
Size: 605 KB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 2,446 - Forks: 330
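
As a quick orientation to the kind of technique this list covers, here is a minimal, hedged sketch of unstructured magnitude pruning using PyTorch's built-in torch.nn.utils.prune utilities; the toy model and the 50% sparsity level are illustrative placeholders, not anything prescribed by the list.

```python
# Sketch: unstructured magnitude (L1) pruning with PyTorch's pruning utilities.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Zero out the 50% smallest-magnitude weights in every Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the pruning mask into the weight tensor

# Measure the resulting sparsity of the weight matrices.
weights = [m.weight for m in model.modules() if isinstance(m, nn.Linear)]
total = sum(w.numel() for w in weights)
zeros = sum((w == 0).sum().item() for w in weights)
print(f"weight sparsity: {zeros / total:.2%}")  # roughly 50%
```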

Efficient-ML/Awesome-Model-Quantization
A list of papers, docs, and code about model quantization. The repo aims to provide information for model quantization research and is continuously being improved; PRs adding works (papers, repositories) it has missed are welcome.
Size: 61.5 MB - Last synced at: 29 days ago - Pushed at: 3 months ago - Stars: 2,084 - Forks: 221
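
For context on the simplest end of the quantization spectrum, a hedged sketch of post-training dynamic quantization in PyTorch, where Linear weights are stored in int8 and dequantized on the fly at inference time; the toy model is an assumption for illustration only.

```python
# Sketch: post-training dynamic quantization of Linear layers to int8.
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).eval()

quantized = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface; weights now stored in int8
```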

guan-yuan/awesome-AutoML-and-Lightweight-Models
A list of high-quality, recent AutoML works and lightweight models, covering (1) Neural Architecture Search, (2) Lightweight Structures, (3) Model Compression, Quantization and Acceleration, (4) Hyperparameter Optimization, and (5) Automated Feature Engineering.
Size: 150 KB - Last synced at: 29 days ago - Pushed at: almost 4 years ago - Stars: 853 - Forks: 160

chester256/Model-Compression-Papers
Papers for deep neural network compression and acceleration
Size: 8.79 KB - Last synced at: 10 months ago - Pushed at: almost 4 years ago - Stars: 393 - Forks: 78

xuyang-liu16/Awesome-Generation-Acceleration
📚 Collection of awesome generation acceleration resources.
Size: 637 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 215 - Forks: 6

xuyang-liu16/Awesome-Token-level-Model-Compression
📚 Collection of token-level model compression resources.
Size: 1.71 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 97 - Forks: 4
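
To make the "token-level" idea concrete, below is a hedged sketch of dropping the least important tokens from a transformer sequence before the remaining blocks run; the mean-absolute-activation score is a stand-in for whatever signal a given method actually uses (e.g. attention to the [CLS] token), and the shapes are illustrative.

```python
# Sketch: keep only the top-scoring tokens of a (batch, seq_len, dim) sequence.
import torch

def prune_tokens(hidden: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    """Return the highest-scoring tokens, preserving their original order."""
    batch, seq_len, dim = hidden.shape
    keep = max(1, int(seq_len * keep_ratio))
    scores = hidden.abs().mean(dim=-1)                     # (batch, seq_len)
    idx = scores.topk(keep, dim=1).indices.sort(dim=1).values
    return hidden.gather(1, idx.unsqueeze(-1).expand(-1, -1, dim))

tokens = torch.randn(2, 197, 768)   # e.g. a ViT sequence: [CLS] + 196 patch tokens
print(prune_tokens(tokens).shape)   # torch.Size([2, 98, 768])
```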

czg1225/CoDe
[CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient
Language: Python - Size: 30.3 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 93 - Forks: 3

wangxb96/Awesome-EdgeAI
Resources of our survey paper "Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies"
Size: 3.64 MB - Last synced at: 29 days ago - Pushed at: 5 months ago - Stars: 83 - Forks: 8

musco-ai/musco-pytorch
MUSCO: MUlti-Stage COmpression of neural networks
Language: Jupyter Notebook - Size: 681 KB - Last synced at: 17 days ago - Pushed at: over 4 years ago - Stars: 72 - Forks: 16
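
MUSCO builds on low-rank tensor decompositions applied in stages with fine-tuning in between. As a hedged illustration of the underlying building block only (rank selection and the multi-stage fine-tuning loop are omitted), here is a truncated-SVD factorization of a single Linear layer into two thinner ones.

```python
# Sketch: replace one Linear layer with a rank-r factorization of its weight.
import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    W = layer.weight.data                                  # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data = S[:rank].sqrt().unsqueeze(1) * Vh[:rank]   # (rank, in)
    second.weight.data = U[:, :rank] * S[:rank].sqrt()             # (out, rank)
    if layer.bias is not None:
        second.bias.data = layer.bias.data.clone()
    return nn.Sequential(first, second)

layer = nn.Linear(512, 512)
approx = factorize_linear(layer, rank=64)
x = torch.randn(4, 512)
print((layer(x) - approx(x)).abs().max())  # approximation error of the rank-64 layer
```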

Alpha-Innovator/AdaptiveDiffusion
[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
Language: Python - Size: 8.63 MB - Last synced at: 17 days ago - Pushed at: 4 months ago - Stars: 65 - Forks: 3

sdc17/CrossGET
[ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.
Language: Python - Size: 11.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 26 - Forks: 0

Lee-Gihun/MicroNet_OSI-AI
(NeurIPS 2019 MicroNet Challenge, 3rd-place winner) Open-source code for "SIPA: A simple framework for efficient networks"
Language: Python - Size: 14.8 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 6

signalogic/SigDL
Deep Learning Compression and Acceleration SDK -- deep model compression for edge and IoT embedded systems, and deep model acceleration for cloud and private servers
Size: 14.9 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 17 - Forks: 10

StargazerX0/ScaleKV
ScaleKV: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
Language: Python - Size: 3.48 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 15 - Forks: 0
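
ScaleKV's own policy is scale-aware and specific to visual autoregressive models; as a generic, hedged illustration of the budgeted-KV-cache mechanism it belongs to, the sketch below simply keeps a few early "sink" tokens plus a recent window and evicts everything in between.

```python
# Sketch: cap a KV cache at a fixed token budget (sink tokens + recent window).
import torch

def compress_kv(keys: torch.Tensor, values: torch.Tensor,
                budget: int = 256, sink: int = 4):
    """keys/values: (batch, heads, seq_len, head_dim) -> at most `budget` tokens."""
    seq_len = keys.size(2)
    if seq_len <= budget:
        return keys, values
    recent = budget - sink
    idx = torch.cat([torch.arange(sink), torch.arange(seq_len - recent, seq_len)])
    return keys[:, :, idx], values[:, :, idx]

k = torch.randn(1, 8, 1024, 64)
v = torch.randn(1, 8, 1024, 64)
k_c, v_c = compress_kv(k, v)
print(k_c.shape)  # torch.Size([1, 8, 256, 64])
```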

cantbebetter2/Awesome-Diffusion-Distillation
A list of papers, docs, and code about diffusion distillation. The repo collects various distillation methods for diffusion models; PRs adding works (papers, repositories) it has missed are welcome.
Size: 1000 Bytes - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 1
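
The distillation methods collected there differ in what exactly is matched (progressive step-halving, consistency targets, adversarial objectives, and so on). The hedged sketch below only shows the common core: a student denoiser trained to match a frozen teacher's prediction on noisy inputs, with both networks reduced to placeholders.

```python
# Sketch: the basic teacher-student matching loss behind diffusion distillation.
import torch
import torch.nn as nn

teacher = nn.Linear(16, 16).eval()   # stands in for a pretrained denoiser
student = nn.Linear(16, 16)          # stands in for the distilled (faster) denoiser
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

for step in range(100):
    x_t = torch.randn(32, 16)                  # noisy samples at random timesteps
    with torch.no_grad():
        target = teacher(x_t)                  # teacher's denoising prediction
    loss = nn.functional.mse_loss(student(x_t), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```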

MingSun-Tse/Caffe_IncReg
[IJCNN'19, IEEE JSTSP'19] Caffe code for our paper "Structured Pruning for Efficient ConvNets via Incremental Regularization"; [BMVC'18] "Structured Probabilistic Pruning for Convolutional Neural Network Acceleration"
Language: Makefile - Size: 19.2 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 13 - Forks: 5
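
As a generic, hedged illustration of regularization-driven structured pruning (not the papers' exact incremental schemes), the sketch below adds a group penalty on whole conv filters whose coefficient grows during training, after which filters with negligible norm can be removed.

```python
# Sketch: grow a group penalty on conv filters, then count the surviving ones.
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
opt = torch.optim.SGD(conv.parameters(), lr=0.05)

x = torch.randn(8, 16, 32, 32)
target = torch.randn(8, 32, 32, 32)            # toy regression target

for step in range(200):
    task_loss = nn.functional.mse_loss(conv(x), target)
    filter_norms = conv.weight.flatten(1).norm(dim=1)   # one norm per output filter
    reg_coeff = 1e-4 * (step + 1)                       # gradually increasing penalty
    loss = task_loss + reg_coeff * filter_norms.sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

keep = conv.weight.flatten(1).norm(dim=1) > 1e-2
print(f"filters kept: {int(keep.sum())}/{conv.out_channels}")
```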

ksm26/Efficiently-Serving-LLMs
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.
Language: Jupyter Notebook - Size: 2.34 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 3
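
Of the techniques the course mentions, LoRA is easy to show in isolation. Below is a hedged, minimal sketch (not the LoRAX or Predibase implementation): a frozen base Linear layer plus a low-rank update B @ A that is the only trained part.

```python
# Sketch: a LoRA-style adapter wrapped around a frozen Linear layer.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                       # freeze base weights
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
print(layer(torch.randn(2, 768)).shape)  # torch.Size([2, 768])
```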

bnabis93/vision-language-examples
Vision-language model example code.
Language: Python - Size: 2.99 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

likholat/openvino_quantization
This sample shows how to convert a TensorFlow model to an OpenVINO IR model and how to quantize the OpenVINO model.
Language: Python - Size: 13.7 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0
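
A rough, hedged sketch of the same TensorFlow -> OpenVINO IR -> quantized IR flow using the current openvino and nncf Python APIs; the sample itself may rely on older tooling (e.g. the Model Optimizer CLI and POT), and the SavedModel path, input shape, and calibration data below are hypothetical.

```python
# Sketch: convert a TF SavedModel to OpenVINO IR, then quantize it with NNCF.
import numpy as np
import openvino as ov
import nncf

# 1. Convert a TensorFlow SavedModel directory (hypothetical path) and save as IR.
model = ov.convert_model("saved_model_dir")
ov.save_model(model, "model_fp32.xml")

# 2. Post-training quantization with a small (here random) calibration set.
calib = nncf.Dataset([np.random.rand(1, 224, 224, 3).astype(np.float32)
                      for _ in range(100)])
quantized = nncf.quantize(model, calib)
ov.save_model(quantized, "model_int8.xml")
```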

dhingratul/Model-Compression
Reduces model complexity by 612x and memory footprint by 19.5x compared to the base model, while meeting a worst-case accuracy threshold.
Language: Jupyter Notebook - Size: 11 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 2

bhllx/On-Efficient-Variants-of-Segment-Anything-Model
On Efficient Variants of Segment Anything Model
Size: 18.6 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0
