model-compression | Topic | Ecosyste.ms: Repos

Topic: "model-compression"

microsoft/nni 📦

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

Language: Python - Size: 127 MB - Last synced at: 6 days ago - Pushed at: 12 months ago - Stars: 14,205 - Forks: 1,822

huawei-noah/Efficient-AI-Backbones

Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

Language: Python - Size: 98.4 MB - Last synced at: 13 days ago - Pushed at: 3 months ago - Stars: 4,240 - Forks: 723

dkozlov/awesome-knowledge-distillation

Awesome Knowledge Distillation

Size: 215 KB - Last synced at: 7 days ago - Pushed at: 16 days ago - Stars: 3,686 - Forks: 512

huawei-noah/Pretrained-Language-Model

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Language: Python - Size: 29 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 3,103 - Forks: 637

VainF/Torch-Pruning

[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.

Language: Python - Size: 10 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 3,038 - Forks: 351

Tencent/PocketFlow

An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.

Language: Python - Size: 1.13 MB - Last synced at: 28 days ago - Pushed at: about 2 years ago - Stars: 2,884 - Forks: 490

FLHonker/Awesome-Knowledge-Distillation

Awesome Knowledge-Distillation. A categorized collection of knowledge distillation papers (2014-2021).

Size: 457 KB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 2,601 - Forks: 338

he-y/Awesome-Pruning

A curated list of neural network pruning resources.

Size: 605 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 2,446 - Forks: 330

micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), low-bit (≤2b) / ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape.

Language: Python - Size: 6.68 MB - Last synced at: 17 days ago - Pushed at: about 2 months ago - Stars: 2,250 - Forks: 476

Efficient-ML/Awesome-Model-Quantization

A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. PRs adding works (papers, repositories) missing from the repo are welcome.

Size: 61.5 MB - Last synced at: 12 days ago - Pushed at: 4 months ago - Stars: 2,131 - Forks: 224

haitongli/knowledge-distillation-pytorch

A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility

Language: Python - Size: 22.1 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 1,938 - Forks: 352

tensorflow/model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.

Language: Python - Size: 2.22 MB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 1,534 - Forks: 327

AberHu/Knowledge-Distillation-Zoo

PyTorch implementation of various Knowledge Distillation (KD) methods.

Language: Python - Size: 90.8 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 1,526 - Forks: 261

microsoft/NeuronBlocks

NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego

Language: Python - Size: 14.9 MB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 1,454 - Forks: 195

huawei-noah/Efficient-Computing

Efficient computing methods developed by Huawei Noah's Ark Lab

Language: Jupyter Notebook - Size: 100 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 1,273 - Forks: 218

ethanhe42/channel-pruning

Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)

Language: Python - Size: 548 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 1,082 - Forks: 310

MingSun-Tse/Efficient-Deep-Learning

Collection of recent methods on (deep) neural network compression and acceleration.

Size: 700 KB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 945 - Forks: 131

horseee/DeepCache

[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

Language: Python - Size: 102 MB - Last synced at: 28 days ago - Pushed at: 12 months ago - Stars: 893 - Forks: 43

guan-yuan/Awesome-AutoML-and-Lightweight-Models

A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.

Size: 150 KB - Last synced at: 6 days ago - Pushed at: about 4 years ago - Stars: 854 - Forks: 160

alibaba/TinyNeuralNetwork

TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.

Language: Python - Size: 25.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 825 - Forks: 126

lhyfst/knowledge-distillation-papers

knowledge distillation papers

Size: 321 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 753 - Forks: 87

SqueezeAILab/SqueezeLLM

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

Language: Python - Size: 1.5 MB - Last synced at: 3 days ago - Pushed at: 11 months ago - Stars: 692 - Forks: 46

cnkuangshi/LightCTR

A lightweight and scalable framework that combines mainstream Click-Through-Rate prediction algorithms, built on a computational DAG, with the Parameter Server philosophy and Ring-AllReduce collective communication.

Language: C++ - Size: 9.41 MB - Last synced at: 11 months ago - Pushed at: about 6 years ago - Stars: 674 - Forks: 142

SforAiDl/KD_Lib

A PyTorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.

Language: Python - Size: 22.2 MB - Last synced at: about 24 hours ago - Pushed at: over 2 years ago - Stars: 634 - Forks: 61

he-y/filter-pruning-geometric-median

Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (CVPR 2019 Oral)

Language: Python - Size: 2.17 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 614 - Forks: 114

cedrickchee/awesome-ml-model-compression

Awesome machine learning model compression research papers, quantization, tools, and learning material.

Size: 213 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 523 - Forks: 60

iamhankai/ghostnet.pytorch 📦

[CVPR2020] GhostNet: More Features from Cheap Operations

Language: Python - Size: 607 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 509 - Forks: 118

Zhen-Dong/Awesome-Quantization-Papers

List of papers related to neural network quantization in recent AI conferences and journals.

Size: 309 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 478 - Forks: 39

microsoft/archai

Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.

Language: Python - Size: 48.3 MB - Last synced at: 6 days ago - Pushed at: 8 months ago - Stars: 477 - Forks: 89

mit-han-lab/amc

[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices

Language: Python - Size: 17.6 KB - Last synced at: 25 days ago - Pushed at: over 1 year ago - Stars: 441 - Forks: 115

1duo/awesome-ai-infrastructures

Infrastructures™ for Machine Learning Training/Inference in Production.

Size: 11.8 MB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 416 - Forks: 74

chester256/Model-Compression-Papers

Papers for deep neural network compression and acceleration

Size: 8.79 KB - Last synced at: 11 months ago - Pushed at: about 4 years ago - Stars: 393 - Forks: 78

pratyushasharma/laser

The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction

Language: Python - Size: 2.25 MB - Last synced at: 3 days ago - Pushed at: 12 months ago - Stars: 388 - Forks: 34

he-y/soft-filter-pruning

Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks

Language: Python - Size: 59.6 KB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 380 - Forks: 74

Zhen-Dong/HAWQ

Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.

Language: Python - Size: 691 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 361 - Forks: 80

SqueezeAILab/KVQuant

[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Language: Python - Size: 19.8 MB - Last synced at: 3 days ago - Pushed at: 11 months ago - Stars: 359 - Forks: 31

Xiuyu-Li/q-diffusion

[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.

Language: Python - Size: 5.97 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 347 - Forks: 24

czg1225/SlimSAM

[NeurIPS 2024] SlimSAM: 0.1% Data Makes Segment Anything Slim

Language: Python - Size: 36 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 323 - Forks: 17

JetRunner/BERT-of-Theseus

⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).

Language: Python - Size: 1.04 MB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 310 - Forks: 38

tianyic/only_train_once_personal_footprint

OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM

Language: Python - Size: 2.94 MB - Last synced at: 21 days ago - Pushed at: 9 months ago - Stars: 302 - Forks: 48

datawhalechina/awesome-compression

A beginner-friendly introductory tutorial on model compression. PDF download: https://github.com/datawhalechina/awesome-compression/releases

Size: 311 MB - Last synced at: 7 days ago - Pushed at: 12 days ago - Stars: 295 - Forks: 36

HanXinzi-AI/awesome-computer-vision-resources

A collection of computer vision projects and tools.

Size: 49.8 MB - Last synced at: about 5 hours ago - Pushed at: about 1 year ago - Stars: 273 - Forks: 34

Picovoice/picollm

On-device LLM Inference Powered by X-Bit Quantization

Language: Python - Size: 98 MB - Last synced at: 3 days ago - Pushed at: 15 days ago - Stars: 250 - Forks: 14

THU-MIG/torch-model-compression

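A toolkit for automated structure analysis and modification of PyTorch models, including a library of model compression algorithms that analyze model structure automatically.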

Language: Python - Size: 132 KB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 249 - Forks: 41

kssteven418/I-BERT

[ICML'21 Oral] I-BERT: Integer-only BERT Quantization

Language: Python - Size: 6.38 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 246 - Forks: 36

vinhkhuc/JFastText

Java interface for fastText

Language: Java - Size: 57.6 KB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 237 - Forks: 98

yehuitang/Pruning

Code for "Co-Evolutionary Compression for Unpaired Image Translation" (ICCV 2019), "SCOP: Scientific Control for Reliable Neural Network Pruning" (NeurIPS 2020) and "Manifold Regularized Dynamic Network Pruning" (CVPR 2021).

Language: Python - Size: 1.57 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 237 - Forks: 47

changlin31/DS-Net

(CVPR 2021, Oral) Dynamic Slimmable Network

Language: Python - Size: 83 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 225 - Forks: 19

Sharpiless/Yolov5-distillation-train-inference

Yolov5 knowledge distillation training, with support for training on your own data.

Language: Python - Size: 2.36 MB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 220 - Forks: 33

princeton-nlp/CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

Language: Python - Size: 1.79 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 195 - Forks: 32

Peterisfar/YOLOV3

YOLOv3 in PyTorch

Language: Python - Size: 17.3 MB - Last synced at: 2 months ago - Pushed at: about 3 years ago - Stars: 195 - Forks: 53

liuziwei7/mobile-id

Deep Face Model Compression

Language: Matlab - Size: 3.62 MB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 195 - Forks: 102

HoyTta0/KnowledgeDistillation

Knowledge distillation in text classification with PyTorch, applied to Chinese text classification. Teacher models: BERT, XLNET; student model: biLSTM.

Language: Python - Size: 2.05 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 191 - Forks: 49

VainF/Diff-Pruning

[NeurIPS 2023] Structural Pruning for Diffusion Models

Language: Python - Size: 25.2 MB - Last synced at: 3 months ago - Pushed at: 12 months ago - Stars: 185 - Forks: 12

Efficient-ML/Awesome-Efficient-AIGC

A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and we are continuously improving the project. PRs adding works (papers, repositories) missing from the repo are welcome.

Size: 63.5 KB - Last synced at: 12 days ago - Pushed at: 5 months ago - Stars: 183 - Forks: 11

MingSun-Tse/Collaborative-Distillation

[CVPR'20] Collaborative Distillation for Ultra-Resolution Universal Style Transfer (PyTorch)

Language: Python - Size: 54.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 181 - Forks: 23

wanglouis49/pytorch-weights_pruning

PyTorch Implementation of Weights Pruning

Language: Python - Size: 2.48 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 179 - Forks: 35

mit-han-lab/amc-models

[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices

Language: Python - Size: 37.1 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 167 - Forks: 27

ChanChiChoi/awesome-model-compression

papers about model compression

Size: 504 KB - Last synced at: about 12 hours ago - Pushed at: over 2 years ago - Stars: 166 - Forks: 38

CASE-Lab-UMD/LLM-Drop

The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".

Language: Python - Size: 90.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 165 - Forks: 19

DwangoMediaVillage/keras_compressor

Model Compression CLI Tool for Keras.

Language: Python - Size: 19.5 KB - Last synced at: 3 days ago - Pushed at: about 6 years ago - Stars: 156 - Forks: 37

NVlabs/condensa

Programmable Neural Network Compression

Language: Python - Size: 16.2 MB - Last synced at: 15 days ago - Pushed at: about 3 years ago - Stars: 148 - Forks: 26

TF2-Engine/TF2

An Open Source Deep Learning Inference Engine Based on FPGA

Language: Python - Size: 110 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 146 - Forks: 61

LiyuanLucasLiu/LD-Net

Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling

Language: Python - Size: 599 KB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 146 - Forks: 13

jim-schwoebel/allie

🤖 An automated machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers). Python 3.6 required.

Language: Python - Size: 275 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 141 - Forks: 35

diaoenmao/HeteroFL-Computation-and-Communication-Efficient-Federated-Learning-for-Heterogeneous-Clients

[ICLR 2021] HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients

Language: Python - Size: 23.8 MB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 141 - Forks: 33

cuguilke/microexpnet

MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Frontal Face Images

Language: Python - Size: 2.76 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 139 - Forks: 26

AIoT-MLSys-Lab/SVD-LLM

Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

Language: Python - Size: 744 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 135 - Forks: 10

xuyang-liu16/Awesome-Token-level-Model-Compression

📚 Collection of token-level model compression resources.

Size: 1.72 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 117 - Forks: 4

deep-fry/mayo

Mayo: Auto-generation of hardware-friendly deep neural networks. Dynamic Channel Pruning: Feature Boosting and Suppression.

Language: Python - Size: 33.2 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 113 - Forks: 21

thu-nics/MoA

The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression"

Language: Python - Size: 532 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 103 - Forks: 6

ziplab/SPViT

[TPAMI 2024] This is the official repository for our paper: "Pruning Self-attentions into Convolutional Layers in Single Path".

Language: Python - Size: 198 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 99 - Forks: 14

kssteven418/LTP

[KDD'22] Learned Token Pruning for Transformers

Language: Python - Size: 40.1 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 96 - Forks: 18

VainF/Data-Free-Adversarial-Distillation

Code and pretrained models for paper: Data-Free Adversarial Distillation

Language: Python - Size: 1.53 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 96 - Forks: 18

MahmoudWahdan/dialog-nlu

TensorFlow and Keras implementation of state-of-the-art research in Dialog System NLU

Language: Jupyter Notebook - Size: 806 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 94 - Forks: 40

onnx/neural-compressor

Model compression for ONNX

Language: Python - Size: 2.35 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 92 - Forks: 9

TobiasLee/Awesome-Efficient-PLM

Must-read papers on improving efficiency for pre-trained language models.

Size: 74.2 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 91 - Forks: 12

VITA-Group/SViTE

[NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang

Language: Python - Size: 615 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 12

archsyscall/aquvitae

Knowledge Distillation Toolkit

Language: Python - Size: 170 MB - Last synced at: 28 days ago - Pushed at: almost 5 years ago - Stars: 88 - Forks: 10

wangxb96/Awesome-EdgeAI

Resources of our survey paper "Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies"

Size: 3.64 MB - Last synced at: 12 days ago - Pushed at: 6 months ago - Stars: 87 - Forks: 8

microsoft/Moonlit

This is a collection of our research on efficient AI, covering hardware-aware NAS and model compression.

Language: Python - Size: 12 MB - Last synced at: 6 days ago - Pushed at: 8 months ago - Stars: 83 - Forks: 7

iamhankai/Versatile-Filters

PyTorch code for paper: Learning Versatile Filters for Efficient Convolutional Neural Networks (NeurIPS 2018)

Language: Python - Size: 121 KB - Last synced at: 3 months ago - Pushed at: almost 6 years ago - Stars: 79 - Forks: 16

hnuzhy/CV_DL_Gather

A collection of research papers, corresponding code (where available), reading notes, and other related materials on hot 🔥🔥🔥 fields in deep-learning-based computer vision.

Size: 37.6 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 74 - Forks: 6

musco-ai/musco-pytorch

MUSCO: MUlti-Stage COmpression of neural networks

Language: Jupyter Notebook - Size: 681 KB - Last synced at: 1 day ago - Pushed at: over 4 years ago - Stars: 72 - Forks: 16

wenwei202/iss-rnns

Sparse Recurrent Neural Networks -- Pruning Connections and Hidden Sizes (TensorFlow)

Language: Python - Size: 8.79 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 72 - Forks: 22

UMBCvision/CompRess

Compressing Representations for Self-Supervised Learning

Language: Python - Size: 22 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 71 - Forks: 12

HankYe/PAGCP

PAGCP for the compression of YOLOv5

Language: Python - Size: 1.13 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 70 - Forks: 8

MingSun-Tse/Regularization-Pruning

[ICLR'21] Neural Pruning via Growing Regularization (PyTorch)

Language: Python - Size: 2.17 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 69 - Forks: 17

pvti/Awesome-Tensor-Decomposition

😎 A curated list of tensor decomposition resources for model compression.

Size: 488 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 68 - Forks: 8

CASE-Lab-UMD/Unified-MoE-Compression

The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques (TMLR)".

Language: Python - Size: 47.1 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 67 - Forks: 5

zju-vipa/CMI

[IJCAI-2021] Contrastive Model Inversion for Data-Free Knowledge Distillation

Language: Python - Size: 2.56 MB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 65 - Forks: 15

htqin/BiPointNet

This project is the official implementation of our accepted ICLR 2021 paper BiPointNet: Binary Neural Network for Point Clouds.

Language: Python - Size: 32.2 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 65 - Forks: 10

FLHonker/ZAQ-code

CVPR 2021: Zero-shot Adversarial Quantization (ZAQ)

Language: Python - Size: 188 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 64 - Forks: 16

Irtza/Keras_model_compression

Model Compression Based on Geoffrey Hinton's Logit Regression Method in Keras applied to MNIST 16x compression over 0.95 percent accuracy. An implementation of "Distilling the Knowledge in a Neural Network" by Geoffrey Hinton et al.

Language: Jupyter Notebook - Size: 4.45 MB - Last synced at: over 2 years ago - Pushed at: almost 6 years ago - Stars: 64 - Forks: 5

BaiTheBest/SparseLLM

Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)

Language: Python - Size: 145 KB - Last synced at: 13 days ago - Pushed at: 3 months ago - Stars: 61 - Forks: 9

bloomberg/minilmv2.bb

Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188)

Language: Python - Size: 30.3 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 61 - Forks: 5

horseee/LLaMA-Pruning 📦

Structural Pruning for LLaMA

Language: Python - Size: 83 KB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 54 - Forks: 4