Topic: "model-compression"
microsoft/nni 📦
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
Language: Python - Size: 127 MB - Last synced at: 3 days ago - Pushed at: 10 months ago - Stars: 14,163 - Forks: 1,819

huawei-noah/Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Language: Python - Size: 98.4 MB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 4,184 - Forks: 718

dkozlov/awesome-knowledge-distillation
Awesome Knowledge Distillation
Size: 171 KB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 3,639 - Forks: 509

huawei-noah/Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Language: Python - Size: 29 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 3,080 - Forks: 635

VainF/Torch-Pruning
[CVPR 2023] DepGraph: Towards Any Structural Pruning
Language: Python - Size: 10 MB - Last synced at: 13 days ago - Pushed at: 29 days ago - Stars: 2,962 - Forks: 347

Tencent/PocketFlow
An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
Language: Python - Size: 1.13 MB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 2,862 - Forks: 492

FLHonker/Awesome-Knowledge-Distillation
Awesome Knowledge-Distillation. Categorized knowledge distillation papers (2014-2021).
Size: 457 KB - Last synced at: 12 days ago - Pushed at: almost 2 years ago - Stars: 2,571 - Forks: 338

he-y/Awesome-Pruning
A curated list of neural network pruning resources.
Size: 605 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 2,435 - Forks: 330

666DZY666/micronet
micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), High-Bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), Low-Bit (≤2b) / Ternary and Binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); 2. pruning: normal, regular, and group convolutional channel pruning; 3. group convolution structure; 4. batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape.
Language: Python - Size: 6.58 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2,239 - Forks: 476

Efficient-ML/Awesome-Model-Quantization
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. PRs for works (papers, repositories) missing from the repo are welcome.
Size: 61.5 MB - Last synced at: 10 days ago - Pushed at: about 2 months ago - Stars: 2,053 - Forks: 220

haitongli/knowledge-distillation-pytorch
A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility
Language: Python - Size: 22.1 MB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 1,851 - Forks: 344

tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Language: Python - Size: 2.22 MB - Last synced at: 13 days ago - Pushed at: 2 months ago - Stars: 1,530 - Forks: 325

AberHu/Knowledge-Distillation-Zoo
PyTorch implementation of various Knowledge Distillation (KD) methods.
Language: Python - Size: 90.8 KB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 1,526 - Forks: 261

microsoft/NeuronBlocks
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
Language: Python - Size: 14.9 MB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 1,454 - Forks: 195

huawei-noah/Efficient-Computing
Efficient computing methods developed by Huawei Noah's Ark Lab
Language: Jupyter Notebook - Size: 100 MB - Last synced at: 10 days ago - Pushed at: 6 months ago - Stars: 1,257 - Forks: 217

ethanhe42/channel-pruning
Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)
Language: Python - Size: 548 KB - Last synced at: 6 days ago - Pushed at: 12 months ago - Stars: 1,083 - Forks: 311

MingSun-Tse/Efficient-Deep-Learning
Collection of recent methods on (deep) neural network compression and acceleration.
Size: 700 KB - Last synced at: 3 days ago - Pushed at: 18 days ago - Stars: 945 - Forks: 131

horseee/DeepCache
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
Language: Python - Size: 102 MB - Last synced at: 11 days ago - Pushed at: 10 months ago - Stars: 883 - Forks: 43

guan-yuan/awesome-AutoML-and-Lightweight-Models
A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.
Size: 150 KB - Last synced at: 11 months ago - Pushed at: almost 4 years ago - Stars: 827 - Forks: 160

alibaba/TinyNeuralNetwork
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
Language: Python - Size: 25.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 809 - Forks: 122

lhyfst/knowledge-distillation-papers
knowledge distillation papers
Size: 321 KB - Last synced at: 5 months ago - Pushed at: about 2 years ago - Stars: 741 - Forks: 83

SqueezeAILab/SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Language: Python - Size: 1.5 MB - Last synced at: 9 days ago - Pushed at: 8 months ago - Stars: 685 - Forks: 45

cnkuangshi/LightCTR
A lightweight and scalable framework that combines mainstream Click-Through-Rate prediction algorithms based on a computational DAG with the Parameter Server philosophy and Ring-AllReduce collective communication.
Language: C++ - Size: 9.41 MB - Last synced at: 9 months ago - Pushed at: almost 6 years ago - Stars: 674 - Forks: 142

SforAiDl/KD_Lib
A PyTorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
Language: Python - Size: 22.2 MB - Last synced at: 10 days ago - Pushed at: about 2 years ago - Stars: 622 - Forks: 59

he-y/filter-pruning-geometric-median
Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (CVPR 2019 Oral)
Language: Python - Size: 2.17 MB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 608 - Forks: 113

cedrickchee/awesome-ml-model-compression
Awesome machine learning model compression research papers, quantization, tools, and learning material.
Size: 213 KB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 510 - Forks: 61

iamhankai/ghostnet.pytorch 📦
[CVPR2020] GhostNet: More Features from Cheap Operations
Language: Python - Size: 607 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 509 - Forks: 118

Zhen-Dong/Awesome-Quantization-Papers
List of papers related to neural network quantization in recent AI conferences and journals.
Size: 309 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 478 - Forks: 39

microsoft/archai
Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.
Language: Python - Size: 48.3 MB - Last synced at: 8 days ago - Pushed at: 6 months ago - Stars: 475 - Forks: 90

mit-han-lab/amc
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Language: Python - Size: 17.6 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 416 - Forks: 108

1duo/awesome-ai-infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
Size: 11.8 MB - Last synced at: 10 days ago - Pushed at: almost 6 years ago - Stars: 411 - Forks: 73

chester256/Model-Compression-Papers
Papers for deep neural network compression and acceleration
Size: 8.79 KB - Last synced at: 9 months ago - Pushed at: almost 4 years ago - Stars: 393 - Forks: 78

pratyushasharma/laser
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
Language: Python - Size: 2.25 MB - Last synced at: 3 days ago - Pushed at: 10 months ago - Stars: 386 - Forks: 32

he-y/soft-filter-pruning
Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks
Language: Python - Size: 59.6 KB - Last synced at: 15 days ago - Pushed at: over 5 years ago - Stars: 380 - Forks: 73

Zhen-Dong/HAWQ
Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.
Language: Python - Size: 691 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 361 - Forks: 80

Xiuyu-Li/q-diffusion
[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.
Language: Python - Size: 5.97 MB - Last synced at: 16 days ago - Pushed at: about 1 year ago - Stars: 347 - Forks: 24

SqueezeAILab/KVQuant
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Language: Python - Size: 19.8 MB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 339 - Forks: 30

czg1225/SlimSAM
[NeurIPS 2024] SlimSAM: 0.1% Data Makes Segment Anything Slim
Language: Python - Size: 36 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 323 - Forks: 17

JetRunner/BERT-of-Theseus
⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).
Language: Python - Size: 1.04 MB - Last synced at: 20 days ago - Pushed at: almost 2 years ago - Stars: 310 - Forks: 38

tianyic/only_train_once_personal_footprint
OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM
Language: Python - Size: 2.94 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 300 - Forks: 46

datawhalechina/awesome-compression
A beginner's introductory tutorial on model compression.
Size: 302 MB - Last synced at: 11 days ago - Pushed at: 5 months ago - Stars: 265 - Forks: 35

THU-MIG/torch-model-compression
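An automated toolkit for analyzing and modifying the structure of PyTorch models, including a library of model compression algorithms that analyze model structure automatically.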
Language: Python - Size: 132 KB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 250 - Forks: 41

kssteven418/I-BERT
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
Language: Python - Size: 6.38 MB - Last synced at: 16 days ago - Pushed at: about 2 years ago - Stars: 241 - Forks: 34

HanXinzi-AI/awesome-computer-vision-resources
A collection of computer vision projects and tools.
Size: 49.8 MB - Last synced at: 11 days ago - Pushed at: 11 months ago - Stars: 238 - Forks: 33

yehuitang/Pruning
Code for "Co-Evolutionary Compression for Unpaired Image Translation" (ICCV 2019), "SCOP: Scientific Control for Reliable Neural Network Pruning" (NeurIPS 2020), and "Manifold Regularized Dynamic Network Pruning" (CVPR 2021).
Language: Python - Size: 1.57 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 237 - Forks: 47

Picovoice/picollm
On-device LLM Inference Powered by X-Bit Quantization
Language: Python - Size: 94.2 MB - Last synced at: 3 days ago - Pushed at: 11 days ago - Stars: 233 - Forks: 13

vinhkhuc/JFastText
Java interface for fastText
Language: Java - Size: 57.6 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 226 - Forks: 144

changlin31/DS-Net
(CVPR 2021, Oral) Dynamic Slimmable Network
Language: Python - Size: 83 KB - Last synced at: 10 months ago - Pushed at: over 3 years ago - Stars: 225 - Forks: 19

Sharpiless/Yolov5-distillation-train-inference
Yolov5 knowledge distillation training, with support for training on your own data.
Language: Python - Size: 2.36 MB - Last synced at: 1 day ago - Pushed at: over 2 years ago - Stars: 216 - Forks: 33

princeton-nlp/CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
Language: Python - Size: 1.79 MB - Last synced at: 17 days ago - Pushed at: almost 2 years ago - Stars: 196 - Forks: 32

Peterisfar/YOLOV3
YOLOv3 in PyTorch
Language: Python - Size: 17.3 MB - Last synced at: 2 days ago - Pushed at: almost 3 years ago - Stars: 195 - Forks: 53

liuziwei7/mobile-id
Deep Face Model Compression
Language: Matlab - Size: 3.62 MB - Last synced at: 11 months ago - Pushed at: over 6 years ago - Stars: 195 - Forks: 102

HoyTta0/KnowledgeDistillation
Knowledge distillation for Chinese text classification with PyTorch. Teacher models: BERT and XLNet; student model: biLSTM.
Language: Python - Size: 2.05 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 191 - Forks: 49

VainF/Diff-Pruning
[NeurIPS 2023] Structural Pruning for Diffusion Models
Language: Python - Size: 25.2 MB - Last synced at: 11 days ago - Pushed at: 10 months ago - Stars: 185 - Forks: 12

MingSun-Tse/Collaborative-Distillation
[CVPR'20] Collaborative Distillation for Ultra-Resolution Universal Style Transfer (PyTorch)
Language: Python - Size: 54.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 181 - Forks: 23

wanglouis49/pytorch-weights_pruning
PyTorch Implementation of Weights Pruning
Language: Python - Size: 2.48 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 179 - Forks: 35

Efficient-ML/Awesome-Efficient-AIGC
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, including language and vision, and we are continuously improving the project. PRs for works (papers, repositories) missing from the repo are welcome.
Size: 63.5 KB - Last synced at: 10 days ago - Pushed at: 2 months ago - Stars: 176 - Forks: 11

ChanChiChoi/awesome-model-compression
papers about model compression
Size: 504 KB - Last synced at: 11 days ago - Pushed at: about 2 years ago - Stars: 166 - Forks: 38

CASE-Lab-UMD/LLM-Drop
The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".
Language: Python - Size: 90.3 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 165 - Forks: 19

mit-han-lab/amc-models
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Language: Python - Size: 37.1 KB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 164 - Forks: 27

DwangoMediaVillage/keras_compressor
Model Compression CLI Tool for Keras.
Language: Python - Size: 19.5 KB - Last synced at: about 7 hours ago - Pushed at: almost 6 years ago - Stars: 156 - Forks: 38

NVlabs/condensa
Programmable Neural Network Compression
Language: Python - Size: 16.2 MB - Last synced at: 13 days ago - Pushed at: almost 3 years ago - Stars: 148 - Forks: 26

TF2-Engine/TF2
An Open Source Deep Learning Inference Engine Based on FPGA
Language: Python - Size: 110 MB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 146 - Forks: 61

LiyuanLucasLiu/LD-Net
Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
Language: Python - Size: 599 KB - Last synced at: 6 days ago - Pushed at: about 5 years ago - Stars: 146 - Forks: 13

jim-schwoebel/allie
🤖 An automated machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers). Python 3.6 required.
Language: Python - Size: 275 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 143 - Forks: 35

diaoenmao/HeteroFL-Computation-and-Communication-Efficient-Federated-Learning-for-Heterogeneous-Clients
[ICLR 2021] HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients
Language: Python - Size: 23.8 MB - Last synced at: 8 months ago - Pushed at: about 2 years ago - Stars: 141 - Forks: 33

cuguilke/microexpnet
MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Frontal Face Images
Language: Python - Size: 2.76 MB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 139 - Forks: 26

AIoT-MLSys-Lab/SVD-LLM
Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"
Language: Python - Size: 744 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 135 - Forks: 10

deep-fry/mayo
Mayo: Auto-generation of hardware-friendly deep neural networks. Dynamic Channel Pruning: Feature Boosting and Suppression.
Language: Python - Size: 33.2 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 113 - Forks: 21

thu-nics/MoA
The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression".
Language: Python - Size: 532 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 103 - Forks: 6

ziplab/SPViT
[TPAMI 2024] This is the official repository for our paper: "Pruning Self-attentions into Convolutional Layers in Single Path".
Language: Python - Size: 198 KB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 99 - Forks: 14

kssteven418/LTP
[KDD'22] Learned Token Pruning for Transformers
Language: Python - Size: 40.1 MB - Last synced at: 15 days ago - Pushed at: about 2 years ago - Stars: 96 - Forks: 18

VainF/Data-Free-Adversarial-Distillation
Code and pretrained models for paper: Data-Free Adversarial Distillation
Language: Python - Size: 1.53 MB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 96 - Forks: 18

MahmoudWahdan/dialog-nlu
TensorFlow and Keras implementation of state-of-the-art research in Dialog System NLU
Language: Jupyter Notebook - Size: 806 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 94 - Forks: 40

TobiasLee/Awesome-Efficient-PLM
Must-read papers on improving efficiency for pre-trained language models.
Size: 74.2 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 91 - Forks: 12

onnx/neural-compressor
Model compression for ONNX
Language: Python - Size: 2.35 MB - Last synced at: 12 days ago - Pushed at: 5 months ago - Stars: 90 - Forks: 9

VITA-Group/SViTE
[NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang
Language: Python - Size: 615 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 12

archsyscall/aquvitae
Knowledge Distillation Toolkit
Language: Python - Size: 170 MB - Last synced at: 17 days ago - Pushed at: almost 5 years ago - Stars: 89 - Forks: 10

microsoft/Moonlit
This is a collection of our research on efficient AI, covering hardware-aware NAS and model compression.
Language: Python - Size: 12 MB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 82 - Forks: 7

wangxb96/Awesome-EdgeAI
Resources of our survey paper "Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies"
Size: 3.64 MB - Last synced at: 10 days ago - Pushed at: 3 months ago - Stars: 81 - Forks: 7

iamhankai/Versatile-Filters
PyTorch code for the paper: Learning Versatile Filters for Efficient Convolutional Neural Networks (NeurIPS 2018)
Language: Python - Size: 121 KB - Last synced at: 16 days ago - Pushed at: over 5 years ago - Stars: 79 - Forks: 16

wenwei202/iss-rnns
Sparse Recurrent Neural Networks -- Pruning Connections and Hidden Sizes (TensorFlow)
Language: Python - Size: 8.79 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 72 - Forks: 22

UMBCvision/CompRess
Compressing Representations for Self-Supervised Learning
Language: Python - Size: 22 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 71 - Forks: 12

musco-ai/musco-pytorch
MUSCO: MUlti-Stage COmpression of neural networks
Language: Jupyter Notebook - Size: 681 KB - Last synced at: 7 days ago - Pushed at: about 4 years ago - Stars: 71 - Forks: 16

HankYe/PAGCP
PAGCP for the compression of YOLOv5
Language: Python - Size: 1.13 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 70 - Forks: 8

MingSun-Tse/Regularization-Pruning
[ICLR'21] Neural Pruning via Growing Regularization (PyTorch)
Language: Python - Size: 2.17 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 69 - Forks: 17

zju-vipa/CMI
[IJCAI-2021] Contrastive Model Inversion for Data-Free Knowledge Distillation
Language: Python - Size: 2.56 MB - Last synced at: 12 months ago - Pushed at: about 3 years ago - Stars: 65 - Forks: 15

htqin/BiPointNet
This project is the official implementation of our accepted ICLR 2021 paper BiPointNet: Binary Neural Network for Point Clouds.
Language: Python - Size: 32.2 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 65 - Forks: 10

FLHonker/ZAQ-code
CVPR 2021: Zero-shot Adversarial Quantization (ZAQ)
Language: Python - Size: 188 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 64 - Forks: 16

Irtza/Keras_model_compression
Model compression based on Geoffrey Hinton's logit regression method in Keras, applied to MNIST: 16x compression, over 0.95 percent accuracy. An implementation of "Distilling the Knowledge in a Neural Network" (Geoffrey Hinton et al.).
Language: Jupyter Notebook - Size: 4.45 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 64 - Forks: 5

bloomberg/minilmv2.bb
Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188)
Language: Python - Size: 30.3 KB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 61 - Forks: 5

CASE-Lab-UMD/Unified-MoE-Compression
The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques (TMLR)".
Language: Python - Size: 47.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 60 - Forks: 5

pvti/Awesome-Tensor-Decomposition
😎 A curated list of tensor decomposition resources for model compression.
Size: 318 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 59 - Forks: 7

hnuzhy/CV_DL_Gather
A collection of research papers, corresponding code (if available), reading notes, and other related materials on hot 🔥🔥🔥 fields in deep-learning-based computer vision.
Size: 37.6 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 59 - Forks: 6

horseee/LLaMA-Pruning 📦
Structural Pruning for LLaMA
Language: Python - Size: 83 KB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 54 - Forks: 4

kxytechnologies/kxy-python
A toolkit to boost the productivity of machine learning engineers.
Language: Python - Size: 38.6 MB - Last synced at: 3 days ago - Pushed at: almost 3 years ago - Stars: 52 - Forks: 11

MingSun-Tse/Awesome-Pruning-at-Initialization
[IJCAI'22 Survey] Recent Advances on Neural Network Pruning at Initialization.
Size: 71.3 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 49 - Forks: 2

VITA-Group/ATMC
[NeurIPS'2019] Shupeng Gui, Haotao Wang, Haichuan Yang, Chen Yu, Zhangyang Wang, Ji Liu, “Model Compression with Adversarial Robustness: A Unified Optimization Framework”
Language: Python - Size: 50.6 MB - Last synced at: 3 days ago - Pushed at: over 3 years ago - Stars: 49 - Forks: 10

xuyang-liu16/Awesome-Token-level-Model-Compression
📚 Collection of token reduction for model compression resources.
Size: 598 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 48 - Forks: 2

MingSun-Tse/ASSL
[NeurIPS'21 Spotlight] Aligned Structured Sparsity Learning for Efficient Image Super-Resolution (PyTorch)
Language: Python - Size: 4.32 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 48 - Forks: 7
