An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: pruning

tianyic/only_train_once_personal_footprint

OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM

Language: Python - Size: 2.94 MB - Last synced at: about 4 hours ago - Pushed at: 7 months ago - Stars: 303 - Forks: 47

VainF/Torch-Pruning

[CVPR 2023] DepGraph: Towards Any Structural Pruning

Language: Python - Size: 10 MB - Last synced at: about 14 hours ago - Pushed at: 12 days ago - Stars: 2,982 - Forks: 346

intel/neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Language: Python - Size: 469 MB - Last synced at: about 15 hours ago - Pushed at: about 20 hours ago - Stars: 2,380 - Forks: 267

friendshipkim/overfill

Code for OverFill: Two-Stage Models for Efficient Language Model Decoding

Language: Python - Size: 1.87 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

datawhalechina/leedl-tutorial

"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases

Language: Jupyter Notebook - Size: 294 MB - Last synced at: 1 day ago - Pushed at: 21 days ago - Stars: 14,994 - Forks: 3,015

tensorflow/model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.

Language: Python - Size: 2.22 MB - Last synced at: about 14 hours ago - Pushed at: 2 months ago - Stars: 1,531 - Forks: 325

openvinotoolkit/nncf

Neural Network Compression Framework for enhanced OpenVINO™ inference

Language: Python - Size: 61.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1,001 - Forks: 251

horseee/LLM-Pruner

[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.

Language: Python - Size: 5.92 MB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 1,000 - Forks: 116

princeton-nlp/LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Language: Python - Size: 19 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 599 - Forks: 52

peremartra/Large-Language-Model-Notebooks-Course

Practical course about Large Language Models.

Language: Jupyter Notebook - Size: 12.9 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,572 - Forks: 399

mcthouacbb/Sirius

Chess engine

Language: C++ - Size: 37 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 34 - Forks: 1

huggingface/optimum-intel

🤗 Optimum Intel: Accelerate inference with Intel optimization tools

Language: Jupyter Notebook - Size: 17 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 459 - Forks: 128

Syencil/mobile-yolov5-pruning-distillation

mobilev2-yolov5s pruning and distillation, with support for ncnn and TensorRT deployment. Ultra-light but better performance!

Language: Jupyter Notebook - Size: 19.4 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 843 - Forks: 165

quic/aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

Language: Python - Size: 18.8 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2,274 - Forks: 400

ModelTC/llmc

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Language: Python - Size: 28.9 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 456 - Forks: 53

cedrickchee/awesome-ml-model-compression

Awesome machine learning model compression research papers, quantization, tools, and learning material.

Size: 213 KB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 510 - Forks: 61

datawhalechina/llm-deploy

Large model/LLM inference and deployment: theory and practice

Size: 100 MB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 240 - Forks: 35

Guaishou74851/AdcSR

(CVPR 2025) Adversarial Diffusion Compression for Real-World Image Super-Resolution [PyTorch]

Language: Python - Size: 36.2 MB - Last synced at: 9 days ago - Pushed at: 19 days ago - Stars: 70 - Forks: 7

albertkjoller/transformer-redundancy

Code for the paper "How Redundant Is the Transformer Stack in Speech Representation Models?" (ICASSP 2025)

Language: Python - Size: 20.7 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

ROIM1998/APT

[ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference

Language: Python - Size: 4.08 MB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 38 - Forks: 1

JeanSanchezFelix/EdgeML-Projects

This repository contains example notebooks and homeworks demonstrating various techniques in model optimization for Edge ML.

Language: Jupyter Notebook - Size: 26.4 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

Bisonai/awesome-edge-machine-learning

A curated list of awesome edge machine learning resources, including research papers, inference engines, challenges, books, meetups and others.

Language: Python - Size: 135 KB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 261 - Forks: 51

luuyin/OWL

Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"

Language: Python - Size: 594 KB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 64 - Forks: 8

CASIA-IVA-Lab/FLAP

[AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models

Language: Python - Size: 987 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 47 - Forks: 12

Mohammad-Hallaq/Wake_Vision_Challenge_Model_Centric_Track

Fork of harvard-edge/Wake_Vision_Challenge_Model_Centric_Track

Language: PureBasic - Size: 37.5 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

1duo/awesome-ai-infrastructures

Infrastructures™ for Machine Learning Training/Inference in Production.

Size: 11.8 MB - Last synced at: 12 days ago - Pushed at: almost 6 years ago - Stars: 411 - Forks: 73

huawei-noah/Efficient-Computing

Efficient computing methods developed by Huawei Noah's Ark Lab

Language: Jupyter Notebook - Size: 100 MB - Last synced at: 12 days ago - Pushed at: 6 months ago - Stars: 1,257 - Forks: 217

neuralmagic/deepsparse

Sparsity-aware deep learning inference runtime for CPUs

Language: Python - Size: 137 MB - Last synced at: 11 days ago - Pushed at: 9 months ago - Stars: 3,130 - Forks: 183

yaya-sy/lillama

[NAACL' 25 main] Lillama: Large Language Model Compression via Low-Rank Feature Distillation

Language: Python - Size: 166 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 6 - Forks: 0

VITA-Group/SMC-Bench

[ICLR 2023] "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!" Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, AJAY KUMAR JAISWAL, Zhangyang Wang

Language: Python - Size: 34.4 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 2

THU-MIG/torch-model-compression

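A toolset for automated model-structure analysis and modification of PyTorch models, including a model compression algorithm library that analyzes model structure automatically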

Language: Python - Size: 132 KB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 250 - Forks: 41

PaddlePaddle/PaddleSlim

PaddleSlim is an open-source library for deep model compression and architecture search.

Language: Python - Size: 16.3 MB - Last synced at: 13 days ago - Pushed at: 5 months ago - Stars: 1,587 - Forks: 350

he-y/Awesome-Pruning

A curated list of neural network pruning resources.

Size: 605 KB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 2,435 - Forks: 330

cupcakearmy/autorestic

Config driven, easy backup cli for restic.

Language: Go - Size: 3 MB - Last synced at: 13 days ago - Pushed at: 23 days ago - Stars: 1,511 - Forks: 81

SforAiDl/KD_Lib

A Pytorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.

Language: Python - Size: 22.2 MB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 622 - Forks: 59

open-mmlab/mmrazor

OpenMMLab Model Compression Toolbox and Benchmark.

Language: Python - Size: 11.1 MB - Last synced at: 14 days ago - Pushed at: 11 months ago - Stars: 1,576 - Forks: 235

fangvv/ILMIL

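Code for the paper "An Incremental Deep Learning Method for Object Detection Models Based on Knowledge Distillation" (基于知识蒸馏的目标检测模型增量深度学习方法)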

Language: Python - Size: 2.37 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 12 - Forks: 3

arcee-ai/PruneMe

Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models

Language: Python - Size: 166 KB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 229 - Forks: 28

Efficient-ML/Awesome-Efficient-AIGC

A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and the project is continuously being improved. PRs adding works (papers, repositories) missing from the repo are welcome.

Size: 63.5 KB - Last synced at: 12 days ago - Pushed at: 2 months ago - Stars: 176 - Forks: 11

quic/aimet-pages

AIMET GitHub pages documentation

Language: HTML - Size: 41 MB - Last synced at: about 13 hours ago - Pushed at: about 14 hours ago - Stars: 8 - Forks: 4

e-dupuis/awesome-approximate-dnn

Curated content for DNN approximation, acceleration ... with a focus on hardware accelerator and deployment

Size: 166 KB - Last synced at: 1 day ago - Pushed at: 11 months ago - Stars: 25 - Forks: 6

fastmachinelearning/hls4ml-tutorial

Tutorial notebooks for hls4ml

Language: Jupyter Notebook - Size: 19.9 MB - Last synced at: 12 days ago - Pushed at: 21 days ago - Stars: 334 - Forks: 149

666DZY666/micronet

micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), low-bit (≤2b) ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape.

Language: Python - Size: 6.58 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 2,239 - Forks: 476

Won-Seong/lightweight-resnet

Compressing ResNet50 with iterative pruning & distillation to maintain high accuracy on CIFAR-100.

Language: Python - Size: 115 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

saiakshitha33/Computer-vision-tracking-system

A tracking system designed for tracking visually identical objects. This repo contains code that tracks the trajectories of individual objects, counts the objects in each frame, and much more.

Language: Jupyter Notebook - Size: 31.9 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

SanderGi/YADES

YOLOv8 Animal Detection for Embedded Systems. 97% test accuracy in just 400kb (about the same size as the photos it classifies or 1 second of video). Various quantization, pruning, and distillation techniques for vision models are explored.

Language: Jupyter Notebook - Size: 62 MB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 2 - Forks: 0

ZhengaoLi/DISP-LLM-Dimension-Independent-Structural-Pruning

An implementation of the DISP-LLM method from the NeurIPS 2024 paper: Dimension-Independent Structural Pruning for Large Language Models.

Language: Python - Size: 83 KB - Last synced at: 20 days ago - Pushed at: about 2 months ago - Stars: 16 - Forks: 0

he-y/filter-pruning-geometric-median

Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (CVPR 2019 Oral)

Language: Python - Size: 2.17 MB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 608 - Forks: 113

he-y/soft-filter-pruning

Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks

Language: Python - Size: 59.6 KB - Last synced at: 17 days ago - Pushed at: over 5 years ago - Stars: 380 - Forks: 73

joisino/speedbook

Support site for the book "Accelerating Deep Neural Networks" (深層ニューラルネットワークの高速化).

Language: Jupyter Notebook - Size: 480 KB - Last synced at: 17 days ago - Pushed at: 8 months ago - Stars: 53 - Forks: 1

VITA-Group/Unified-LTH-GNN

[ICML 2021] "A Unified Lottery Tickets Hypothesis for Graph Neural Networks", Tianlong Chen*, Yongduo Sui*, Xuxi Chen, Aston Zhang, Zhangyang Wang

Language: Python - Size: 18.1 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 66 - Forks: 13

FasterAI-Labs/fasterai

FasterAI: Prune and Distill your models with FastAI and PyTorch

Language: Jupyter Notebook - Size: 34.9 MB - Last synced at: 11 days ago - Pushed at: 22 days ago - Stars: 247 - Forks: 19

vbdi/divprune

[CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models

Language: Python - Size: 11 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 7 - Forks: 0

apple/ml-upscale

Export utility for unconstrained channel pruned models

Language: Jupyter Notebook - Size: 3.28 MB - Last synced at: 18 days ago - Pushed at: almost 2 years ago - Stars: 72 - Forks: 12

coinslab/StatPruneNet

Development and Evaluation of Neural Net Sensitivity-Based Pruning Algorithms Using Statistical Inference

Language: Python - Size: 9.76 MB - Last synced at: 7 days ago - Pushed at: 23 days ago - Stars: 1 - Forks: 0

Nota-NetsPresso/SNP

Structured Neuron Level Pruning to compress Transformer-based models [ECCV'24]

Language: Python - Size: 99.1 MB - Last synced at: 23 days ago - Pushed at: 9 months ago - Stars: 12 - Forks: 1

airaria/TextPruner

A PyTorch-based model pruning toolkit for pre-trained language models

Language: Python - Size: 10.5 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 385 - Forks: 35

ArmenJeddi/saint

a training-free approach to accelerate ViTs and VLMs by pruning redundant tokens based on similarity

Language: Python - Size: 26.8 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 10 - Forks: 1

Mxbonn/ltmp

Code for Learned Thresholds Token Merging and Pruning for Vision Transformers (LTMP). A technique to reduce the size of Vision Transformers to any desired size with minimal loss of accuracy.

Language: Python - Size: 90.8 KB - Last synced at: 24 days ago - Pushed at: 5 months ago - Stars: 16 - Forks: 1

DrChainsaw/NaiveNASflux.jl

Your local Flux surgeon

Language: Julia - Size: 1.05 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 24 - Forks: 0

mlzxy/qsparse

Train neural networks with joint quantization and pruning on both weights and activations using any pytorch modules

Language: Python - Size: 293 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 41 - Forks: 2

schphe/tictac

tictactoe ai implementation using minimax with αβ-pruning

Language: C - Size: 17.6 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 3 - Forks: 0

binarypatrick/Prune

Prune is a simple tool that lets you remove any files not matching the specified retention options.

Language: C# - Size: 2.31 MB - Last synced at: 6 days ago - Pushed at: 28 days ago - Stars: 5 - Forks: 1

princeton-nlp/CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

Language: Python - Size: 1.79 MB - Last synced at: 19 days ago - Pushed at: almost 2 years ago - Stars: 196 - Forks: 32

ZIB-IOL/PERP

Code to reproduce the experiments of the paper: "PERP: Rethinking the Prune-Retrain Paradigm in the ERA of LLMs"

Language: Python - Size: 20.5 KB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

csarron/awesome-emdl

Embedded and mobile deep learning research resources

Size: 88.9 KB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 746 - Forks: 167

nelaturuharsha/TurboPrune

Harness for training/finding lottery tickets in PyTorch. With support for multiple pruning techniques and augmented by distributed training, FFCV and AMP.

Language: Python - Size: 925 KB - Last synced at: 12 days ago - Pushed at: 3 months ago - Stars: 17 - Forks: 1

ap-dev-github/atithidev-db-api

A fully scalable, AWS Lambda-based API. It enables seamless host and review management with serverless deployment, CI/CD automation, and robust security while being highly optimized for AWS cost efficiency. Now enhanced with TypeScript type safety, Jest testing, and package pruning.

Language: TypeScript - Size: 356 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

xuxw98/DSPDet3D

[ECCV 2024] 3D Small Object Detection with Dynamic Spatial Pruning

Language: Python - Size: 123 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 104 - Forks: 4

ZIB-IOL/SMS

Code to reproduce the experiments of the ICLR24-paper: "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging"

Language: Python - Size: 57.6 KB - Last synced at: 9 days ago - Pushed at: 6 months ago - Stars: 9 - Forks: 0

sjmikler/snip-pruning 📦

SNIP: SINGLE-SHOT NETWORK PRUNING

Language: Jupyter Notebook - Size: 48 MB - Last synced at: 24 days ago - Pushed at: about 1 month ago - Stars: 30 - Forks: 2

BenWhetton/keras-surgeon

Pruning and other network surgery for trained Keras models.

Language: Python - Size: 132 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 408 - Forks: 107

JulesBelveze/bert-squeeze

🛠️ Tools for Transformers compression using PyTorch Lightning ⚡

Language: Python - Size: 2.44 MB - Last synced at: 17 days ago - Pushed at: 5 months ago - Stars: 82 - Forks: 10

alibaba/TinyNeuralNetwork

TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.

Language: Python - Size: 25.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 809 - Forks: 122

satabios/sconce

E2E AutoML Model Compression Package

Language: Jupyter Notebook - Size: 134 MB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 47 - Forks: 4

SpursLipu/YOLOv3v4-ModelCompression-MultidatasetTraining-Multibackbone

YOLO ModelCompression MultidatasetTraining

Language: Python - Size: 45.4 MB - Last synced at: 3 days ago - Pushed at: almost 3 years ago - Stars: 444 - Forks: 136

FabrizioSandri/2SSP

2SSP: A Two-Stage Framework for Structured Pruning of LLMs

Language: Python - Size: 1.94 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 13 - Forks: 1

jacobgil/pytorch-pruning

PyTorch Implementation of [1611.06440] Pruning Convolutional Neural Networks for Resource Efficient Inference

Language: Python - Size: 11.7 KB - Last synced at: 20 days ago - Pushed at: almost 6 years ago - Stars: 878 - Forks: 202

jack-willturner/deep-compression

Learning both Weights and Connections for Efficient Neural Networks https://arxiv.org/abs/1506.02626

Language: Jupyter Notebook - Size: 2.4 MB - Last synced at: 20 days ago - Pushed at: over 2 years ago - Stars: 177 - Forks: 38

neuralmagic/sparsify

ML model optimization product to accelerate inference.

Language: Python - Size: 7.18 MB - Last synced at: 12 days ago - Pushed at: about 1 year ago - Stars: 326 - Forks: 30

sayakpaul/Adventures-in-TensorFlow-Lite

This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.

Language: Jupyter Notebook - Size: 49.1 MB - Last synced at: 17 days ago - Pushed at: about 2 years ago - Stars: 172 - Forks: 35

EIDOSLAB/simplify

Simplification of pruned models for accelerated inference | SoftwareX https://doi.org/10.1016/j.softx.2021.100907

Language: Python - Size: 2.01 MB - Last synced at: 30 days ago - Pushed at: about 2 months ago - Stars: 35 - Forks: 3

MK2112/mobileYOLOv3

YOLOv3 on a MobileNetV3_Small architecture; trained, explained, pruned and quantized for text detection.

Language: Jupyter Notebook - Size: 26.3 MB - Last synced at: 20 days ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

iurada/talos-task-arithmetic

Official repository of our work "Efficient Model Editing with Task-Localized Sparse Fine-tuning" accepted at ICLR 2025

Language: Python - Size: 84 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

gaosh/DWNP

Language: Python - Size: 38.1 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

bzantium/pytorch-admm-pruning

Prune DNN using Alternating Direction Method of Multipliers (ADMM)

Language: Python - Size: 15.6 KB - Last synced at: 25 days ago - Pushed at: over 5 years ago - Stars: 100 - Forks: 18

VITA-Group/PruneCXR

[MICCAI 2023] "How Does Pruning Impact Long-Tailed Multi-Label Medical Image Classifiers?" by Gregory Holste et al.

Language: Python - Size: 1000 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 10 - Forks: 1

delve-team/delve

PyTorch model training and layer saturation monitor

Language: Python - Size: 13.8 MB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 81 - Forks: 13

kentaroy47/Deep-Compression.Pytorch

Unofficial Pytorch implementation of Deep Compression in CIFAR10

Language: Python - Size: 430 MB - Last synced at: 21 days ago - Pushed at: over 3 years ago - Stars: 35 - Forks: 9

JarvisPei/FuseGPT

The implementation for the paper, FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers.

Language: Python - Size: 12.7 KB - Last synced at: 17 days ago - Pushed at: 3 months ago - Stars: 5 - Forks: 0

Nota-NetsPresso/nota-wav2lip

A 28× Compressed Wav2Lip for Efficient Talking Face Generation [ICCV'23 Demo] [MLSys'23 Workshop] [NVIDIA GTC'23]

Language: Python - Size: 78.1 KB - Last synced at: 23 days ago - Pushed at: about 1 year ago - Stars: 56 - Forks: 6

VITA-Group/SViTE

[NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang

Language: Python - Size: 615 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 12

hexuandeng/DRPruning

Language: Python - Size: 10.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

ragibson/ModularityPruning

Pruning tool to identify small subsets of network partitions that are significant from the perspective of stochastic block model inference. This method works for single-layer and multi-layer networks, as well as for restricting focus to a fixed number of communities when desired.

Language: Python - Size: 2.68 MB - Last synced at: 7 days ago - Pushed at: 6 months ago - Stars: 16 - Forks: 2

kriskrisliu/PAT

[AAAI 2025] PAT: Pruning-Aware Tuning for Large Language Models

Language: Python - Size: 30.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 27 - Forks: 0

mattjegan/pruner

A CLI tool for pruning your overgrown requirements file

Language: Python - Size: 22.5 KB - Last synced at: 3 days ago - Pushed at: about 4 years ago - Stars: 6 - Forks: 1

fangvv/EdgeDI

Code for paper "Joint Architecture Design and Workload Partitioning for DNN Inference on Industrial IoT Clusters"

Language: Python - Size: 18.6 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 2

3outeille/DSD-training

Implementation of DSD: Dense-Sparse-Dense Training for Deep Neural Networks in Pytorch.

Language: Jupyter Notebook - Size: 2.28 MB - Last synced at: 9 days ago - Pushed at: about 2 years ago - Stars: 9 - Forks: 1

kumasento/gconv-prune

Code repository for paper "Efficient Structured Pruning and Architecture Searching for Group Convolution" https://arxiv.org/abs/1811.09341

Language: Python - Size: 2.29 MB - Last synced at: 8 days ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 2

r-papso/torch-optimizer

PyTorch models optimization by neural network pruning

Language: Python - Size: 55.3 MB - Last synced at: 14 days ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1

Related Keywords
pruning (444), deep-learning (102), pytorch (97), quantization (75), model-compression (61), machine-learning (41), neural-network (33), tensorflow (30), python (30), sparsity (27), compression (25), knowledge-distillation (22), deep-neural-networks (22), llm (20), neural-networks (19), distillation (18), lottery-ticket-hypothesis (16), decision-trees (16), computer-vision (16), ai (15), cnn (15), convolutional-neural-networks (14), optimization (14), transformer (14), artificial-intelligence (13), keras (12), structured-pruning (11), nlp (11), quantization-aware-training (11), classification (10), transformers (10), bert (10), decision-tree (10), alpha-beta-pruning (10), logistic-regression (9), onnx (9), object-detection (9), resnet (9), neural-network-compression (9), pruning-algorithms (9), efficiency (9), sparsification (8), image-classification (8), inference (8), network-compression (7), transfer-learning (7), neural-network-pruning (7), weight-pruning (7), edge-computing (7), filter-pruning (7), llama (7), network-pruning (7), federated-learning (7), minimax (6), awesome-list (6), post-training-quantization (6), python3 (6), large-language-models (6), decision-tree-classifier (6), pytorch-implementation (6), auc-roc-curve (5), yolo (5), java (5), minimax-algorithm (5), multicollinearity (5), sparse-neural-networks (5), model-pruning (5), vit (5), compression-algorithm (5), tensorrt (5), nas (5), cpp (5), unstructured-pruning (5), backup (5), continual-learning (5), natural-language-processing (4), machinelearning (4), super-resolution (4), detection (4), backtracking (4), yolov5 (4), deeplearning (4), vgg (4), alpha (4), alpha-beta (4), beta (4), random-forest (4), game (4), regularization (4), segmentation (4), automl (4), channel-pruning (4), efficient-deep-learning (4), lightweight (4), reinforcement-learning (4), model-optimization (4), pruning-optimization (4), pruning-structures (4), evaluation (4), prune (4)