Topic: "model-acceleration"
he-y/Awesome-Pruning
A curated list of neural network pruning resources.
Size: 605 KB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 2,446 - Forks: 330
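
As a quick orientation to the kind of technique this list covers, here is a minimal, hedged sketch of unstructured magnitude pruning using PyTorch's built-in torch.nn.utils.prune utilities; the toy model and the 50% sparsity level are illustrative placeholders, not anything prescribed by the list.

```python
# Sketch: unstructured magnitude (L1) pruning with PyTorch's pruning utilities.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Zero out the 50% smallest-magnitude weights in every Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the pruning mask into the weight tensor

# Measure the resulting sparsity of the weight matrices.
weights = [m.weight for m in model.modules() if isinstance(m, nn.Linear)]
total = sum(w.numel() for w in weights)
zeros = sum((w == 0).sum().item() for w in weights)
print(f"weight sparsity: {zeros / total:.2%}")  # roughly 50%
```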

Efficient-ML/Awesome-Model-Quantization
A list of papers, docs, and code about model quantization. The repo aims to provide information for model quantization research and is continuously being improved; PRs adding works (papers, repositories) it has missed are welcome.
Size: 61.5 MB - Last synced at: 29 days ago - Pushed at: 3 months ago - Stars: 2,084 - Forks: 221
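
For context on the simplest end of the quantization spectrum, a hedged sketch of post-training dynamic quantization in PyTorch, where Linear weights are stored in int8 and dequantized on the fly at inference time; the toy model is an assumption for illustration only.

```python
# Sketch: post-training dynamic quantization of Linear layers to int8.
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).eval()

quantized = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface; weights now stored in int8
```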

guan-yuan/awesome-AutoML-and-Lightweight-Models
A list of high-quality, recent AutoML works and lightweight models, covering (1) Neural Architecture Search, (2) Lightweight Structures, (3) Model Compression, Quantization and Acceleration, (4) Hyperparameter Optimization, and (5) Automated Feature Engineering.
Size: 150 KB - Last synced at: 29 days ago - Pushed at: almost 4 years ago - Stars: 853 - Forks: 160

chester256/Model-Compression-Papers
Papers for deep neural network compression and acceleration
Size: 8.79 KB - Last synced at: 10 months ago - Pushed at: almost 4 years ago - Stars: 393 - Forks: 78

xuyang-liu16/Awesome-Generation-Acceleration
📚 Collection of awesome generation acceleration resources.
Size: 637 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 215 - Forks: 6

xuyang-liu16/Awesome-Token-level-Model-Compression
📚 Collection of token-level model compression resources.
Size: 1.71 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 97 - Forks: 4
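
To make the "token-level" idea concrete, below is a hedged sketch of dropping the least important tokens from a transformer sequence before the remaining blocks run; the mean-absolute-activation score is a stand-in for whatever signal a given method actually uses (e.g. attention to the [CLS] token), and the shapes are illustrative.

```python
# Sketch: keep only the top-scoring tokens of a (batch, seq_len, dim) sequence.
import torch

def prune_tokens(hidden: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    """Return the highest-scoring tokens, preserving their original order."""
    batch, seq_len, dim = hidden.shape
    keep = max(1, int(seq_len * keep_ratio))
    scores = hidden.abs().mean(dim=-1)                     # (batch, seq_len)
    idx = scores.topk(keep, dim=1).indices.sort(dim=1).values
    return hidden.gather(1, idx.unsqueeze(-1).expand(-1, -1, dim))

tokens = torch.randn(2, 197, 768)   # e.g. a ViT sequence: [CLS] + 196 patch tokens
print(prune_tokens(tokens).shape)   # torch.Size([2, 98, 768])
```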

czg1225/CoDe
[CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient
Language: Python - Size: 30.3 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 93 - Forks: 3

wangxb96/Awesome-EdgeAI
Resources of our survey paper "Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies"
Size: 3.64 MB - Last synced at: 29 days ago - Pushed at: 5 months ago - Stars: 83 - Forks: 8

musco-ai/musco-pytorch
MUSCO: MUlti-Stage COmpression of neural networks
Language: Jupyter Notebook - Size: 681 KB - Last synced at: 17 days ago - Pushed at: over 4 years ago - Stars: 72 - Forks: 16
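
MUSCO builds on low-rank tensor decompositions applied in stages with fine-tuning in between. As a hedged illustration of the underlying building block only (rank selection and the multi-stage fine-tuning loop are omitted), here is a truncated-SVD factorization of a single Linear layer into two thinner ones.

```python
# Sketch: replace one Linear layer with a rank-r factorization of its weight.
import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    W = layer.weight.data                                  # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data = S[:rank].sqrt().unsqueeze(1) * Vh[:rank]   # (rank, in)
    second.weight.data = U[:, :rank] * S[:rank].sqrt()             # (out, rank)
    if layer.bias is not None:
        second.bias.data = layer.bias.data.clone()
    return nn.Sequential(first, second)

layer = nn.Linear(512, 512)
approx = factorize_linear(layer, rank=64)
x = torch.randn(4, 512)
print((layer(x) - approx(x)).abs().max())  # approximation error of the rank-64 layer
```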

Alpha-Innovator/AdaptiveDiffusion
[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
Language: Python - Size: 8.63 MB - Last synced at: 17 days ago - Pushed at: 4 months ago - Stars: 65 - Forks: 3

sdc17/CrossGET
[ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.
Language: Python - Size: 11.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 26 - Forks: 0

Lee-Gihun/MicroNet_OSI-AI
(NeurIPS 2019 MicroNet Challenge, 3rd-place winner) Open-source code for "SIPA: A simple framework for efficient networks"
Language: Python - Size: 14.8 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 6

signalogic/SigDL
Deep Learning Compression and Acceleration SDK -- deep model compression for edge and IoT embedded systems, and deep model acceleration for cloud and private servers
Size: 14.9 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 17 - Forks: 10

StargazerX0/ScaleKV
ScaleKV: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
Language: Python - Size: 3.48 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 15 - Forks: 0
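
ScaleKV's own policy is scale-aware and specific to visual autoregressive models; as a generic, hedged illustration of the budgeted-KV-cache mechanism it belongs to, the sketch below simply keeps a few early "sink" tokens plus a recent window and evicts everything in between.

```python
# Sketch: cap a KV cache at a fixed token budget (sink tokens + recent window).
import torch

def compress_kv(keys: torch.Tensor, values: torch.Tensor,
                budget: int = 256, sink: int = 4):
    """keys/values: (batch, heads, seq_len, head_dim) -> at most `budget` tokens."""
    seq_len = keys.size(2)
    if seq_len <= budget:
        return keys, values
    recent = budget - sink
    idx = torch.cat([torch.arange(sink), torch.arange(seq_len - recent, seq_len)])
    return keys[:, :, idx], values[:, :, idx]

k = torch.randn(1, 8, 1024, 64)
v = torch.randn(1, 8, 1024, 64)
k_c, v_c = compress_kv(k, v)
print(k_c.shape)  # torch.Size([1, 8, 256, 64])
```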

cantbebetter2/Awesome-Diffusion-Distillation
A list of papers, docs, and code about diffusion distillation. The repo collects various distillation methods for diffusion models; PRs adding works (papers, repositories) it has missed are welcome.
Size: 1000 Bytes - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 1
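
The distillation methods collected there differ in what exactly is matched (progressive step-halving, consistency targets, adversarial objectives, and so on). The hedged sketch below only shows the common core: a student denoiser trained to match a frozen teacher's prediction on noisy inputs, with both networks reduced to placeholders.

```python
# Sketch: the basic teacher-student matching loss behind diffusion distillation.
import torch
import torch.nn as nn

teacher = nn.Linear(16, 16).eval()   # stands in for a pretrained denoiser
student = nn.Linear(16, 16)          # stands in for the distilled (faster) denoiser
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

for step in range(100):
    x_t = torch.randn(32, 16)                  # noisy samples at random timesteps
    with torch.no_grad():
        target = teacher(x_t)                  # teacher's denoising prediction
    loss = nn.functional.mse_loss(student(x_t), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```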

MingSun-Tse/Caffe_IncReg
[IJCNN'19, IEEE JSTSP'19] Caffe code for our paper "Structured Pruning for Efficient ConvNets via Incremental Regularization"; [BMVC'18] "Structured Probabilistic Pruning for Convolutional Neural Network Acceleration"
Language: Makefile - Size: 19.2 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 13 - Forks: 5
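
As a generic, hedged illustration of regularization-driven structured pruning (not the papers' exact incremental schemes), the sketch below adds a group penalty on whole conv filters whose coefficient grows during training, after which filters with negligible norm can be removed.

```python
# Sketch: grow a group penalty on conv filters, then count the surviving ones.
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
opt = torch.optim.SGD(conv.parameters(), lr=0.05)

x = torch.randn(8, 16, 32, 32)
target = torch.randn(8, 32, 32, 32)            # toy regression target

for step in range(200):
    task_loss = nn.functional.mse_loss(conv(x), target)
    filter_norms = conv.weight.flatten(1).norm(dim=1)   # one norm per output filter
    reg_coeff = 1e-4 * (step + 1)                       # gradually increasing penalty
    loss = task_loss + reg_coeff * filter_norms.sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

keep = conv.weight.flatten(1).norm(dim=1) > 1e-2
print(f"filters kept: {int(keep.sum())}/{conv.out_channels}")
```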

ksm26/Efficiently-Serving-LLMs
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.
Language: Jupyter Notebook - Size: 2.34 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 3
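
Of the techniques the course mentions, LoRA is easy to show in isolation. Below is a hedged, minimal sketch (not the LoRAX or Predibase implementation): a frozen base Linear layer plus a low-rank update B @ A that is the only trained part.

```python
# Sketch: a LoRA-style adapter wrapped around a frozen Linear layer.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                       # freeze base weights
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
print(layer(torch.randn(2, 768)).shape)  # torch.Size([2, 768])
```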

bnabis93/vision-language-examples
Vision-language model example code.
Language: Python - Size: 2.99 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

likholat/openvino_quantization
This sample shows how to convert a TensorFlow model to an OpenVINO IR model and how to quantize the OpenVINO model.
Language: Python - Size: 13.7 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0
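
A rough, hedged sketch of the same TensorFlow -> OpenVINO IR -> quantized IR flow using the current openvino and nncf Python APIs; the sample itself may rely on older tooling (e.g. the Model Optimizer CLI and POT), and the SavedModel path, input shape, and calibration data below are hypothetical.

```python
# Sketch: convert a TF SavedModel to OpenVINO IR, then quantize it with NNCF.
import numpy as np
import openvino as ov
import nncf

# 1. Convert a TensorFlow SavedModel directory (hypothetical path) and save as IR.
model = ov.convert_model("saved_model_dir")
ov.save_model(model, "model_fp32.xml")

# 2. Post-training quantization with a small (here random) calibration set.
calib = nncf.Dataset([np.random.rand(1, 224, 224, 3).astype(np.float32)
                      for _ in range(100)])
quantized = nncf.quantize(model, calib)
ov.save_model(quantized, "model_int8.xml")
```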

dhingratul/Model-Compression
Reduces model complexity by 612x and memory footprint by 19.5x compared to the base model, while meeting a worst-case accuracy threshold.
Language: Jupyter Notebook - Size: 11 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 2

bhllx/On-Efficient-Variants-of-Segment-Anything-Model
On Efficient Variants of Segment Anything Model
Size: 18.6 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0
