GitHub topics: pruning
tianyic/only_train_once_personal_footprint
OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM
Language: Python - Size: 2.94 MB - Last synced at: about 4 hours ago - Pushed at: 7 months ago - Stars: 303 - Forks: 47

VainF/Torch-Pruning
[CVPR 2023] DepGraph: Towards Any Structural Pruning
Language: Python - Size: 10 MB - Last synced at: about 14 hours ago - Pushed at: 12 days ago - Stars: 2,982 - Forks: 346

intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Language: Python - Size: 469 MB - Last synced at: about 15 hours ago - Pushed at: about 20 hours ago - Stars: 2,380 - Forks: 267

friendshipkim/overfill
Code for OverFill: Two-Stage Models for Efficient Language Model Decoding
Language: Python - Size: 1.87 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

datawhalechina/leedl-tutorial
"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Language: Jupyter Notebook - Size: 294 MB - Last synced at: 1 day ago - Pushed at: 21 days ago - Stars: 14,994 - Forks: 3,015

tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Language: Python - Size: 2.22 MB - Last synced at: about 14 hours ago - Pushed at: 2 months ago - Stars: 1,531 - Forks: 325

openvinotoolkit/nncf
Neural Network Compression Framework for enhanced OpenVINO™ inference
Language: Python - Size: 61.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1,001 - Forks: 251

horseee/LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
Language: Python - Size: 5.92 MB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 1,000 - Forks: 116

princeton-nlp/LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
Language: Python - Size: 19 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 599 - Forks: 52

peremartra/Large-Language-Model-Notebooks-Course
Practical course about Large Language Models.
Language: Jupyter Notebook - Size: 12.9 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,572 - Forks: 399

mcthouacbb/Sirius
Chess engine
Language: C++ - Size: 37 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 34 - Forks: 1

huggingface/optimum-intel
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
Language: Jupyter Notebook - Size: 17 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 459 - Forks: 128

Syencil/mobile-yolov5-pruning-distillation
mobilev2-yolov5s pruning and distillation, with support for ncnn and TensorRT deployment. Ultra-light but better performance!
Language: Jupyter Notebook - Size: 19.4 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 843 - Forks: 165

quic/aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Language: Python - Size: 18.8 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2,274 - Forks: 400

ModelTC/llmc
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
Language: Python - Size: 28.9 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 456 - Forks: 53

cedrickchee/awesome-ml-model-compression
Awesome machine learning model compression research papers, quantization, tools, and learning material.
Size: 213 KB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 510 - Forks: 61

datawhalechina/llm-deploy
Theory and practice of large model/LLM inference and deployment
Size: 100 MB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 240 - Forks: 35

Guaishou74851/AdcSR
(CVPR 2025) Adversarial Diffusion Compression for Real-World Image Super-Resolution [PyTorch]
Language: Python - Size: 36.2 MB - Last synced at: 9 days ago - Pushed at: 19 days ago - Stars: 70 - Forks: 7

albertkjoller/transformer-redundancy
Code for the paper "How Redundant Is the Transformer Stack in Speech Representation Models?" (ICASSP 2025)
Language: Python - Size: 20.7 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

ROIM1998/APT
[ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
Language: Python - Size: 4.08 MB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 38 - Forks: 1

JeanSanchezFelix/EdgeML-Projects
This repository contains example notebooks and homeworks demonstrating various techniques in model optimization for Edge ML.
Language: Jupyter Notebook - Size: 26.4 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

Bisonai/awesome-edge-machine-learning
A curated list of awesome edge machine learning resources, including research papers, inference engines, challenges, books, meetups and others.
Language: Python - Size: 135 KB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 261 - Forks: 51

luuyin/OWL
Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"
Language: Python - Size: 594 KB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 64 - Forks: 8

CASIA-IVA-Lab/FLAP
[AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models
Language: Python - Size: 987 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 47 - Forks: 12

Mohammad-Hallaq/Wake_Vision_Challenge_Model_Centric_Track
Fork of harvard-edge/Wake_Vision_Challenge_Model_Centric_Track
Language: PureBasic - Size: 37.5 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

1duo/awesome-ai-infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
Size: 11.8 MB - Last synced at: 12 days ago - Pushed at: almost 6 years ago - Stars: 411 - Forks: 73

huawei-noah/Efficient-Computing
Efficient computing methods developed by Huawei Noah's Ark Lab
Language: Jupyter Notebook - Size: 100 MB - Last synced at: 12 days ago - Pushed at: 6 months ago - Stars: 1,257 - Forks: 217

neuralmagic/deepsparse
Sparsity-aware deep learning inference runtime for CPUs
Language: Python - Size: 137 MB - Last synced at: 11 days ago - Pushed at: 9 months ago - Stars: 3,130 - Forks: 183

yaya-sy/lillama
[NAACL'25 main] Lillama: Large Language Model Compression via Low-Rank Feature Distillation
Language: Python - Size: 166 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 6 - Forks: 0

VITA-Group/SMC-Bench
[ICLR 2023] "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!" Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, AJAY KUMAR JAISWAL, Zhangyang Wang
Language: Python - Size: 34.4 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 2

THU-MIG/torch-model-compression
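An automated toolkit for analyzing and modifying the structure of PyTorch models, including a library of model compression algorithms that analyze model structure automatically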
Language: Python - Size: 132 KB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 250 - Forks: 41

PaddlePaddle/PaddleSlim
PaddleSlim is an open-source library for deep model compression and architecture search.
Language: Python - Size: 16.3 MB - Last synced at: 13 days ago - Pushed at: 5 months ago - Stars: 1,587 - Forks: 350

he-y/Awesome-Pruning
A curated list of neural network pruning resources.
Size: 605 KB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 2,435 - Forks: 330

cupcakearmy/autorestic
Config driven, easy backup cli for restic.
Language: Go - Size: 3 MB - Last synced at: 13 days ago - Pushed at: 23 days ago - Stars: 1,511 - Forks: 81

SforAiDl/KD_Lib
A Pytorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
Language: Python - Size: 22.2 MB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 622 - Forks: 59

open-mmlab/mmrazor
OpenMMLab Model Compression Toolbox and Benchmark.
Language: Python - Size: 11.1 MB - Last synced at: 14 days ago - Pushed at: 11 months ago - Stars: 1,576 - Forks: 235

fangvv/ILMIL
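Code for the paper "基于知识蒸馏的目标检测模型增量深度学习方法" (An Incremental Deep Learning Method for Object Detection Models Based on Knowledge Distillation)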
Language: Python - Size: 2.37 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 12 - Forks: 3

arcee-ai/PruneMe
Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models
Language: Python - Size: 166 KB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 229 - Forks: 28

Efficient-ML/Awesome-Efficient-AIGC
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and is continuously being improved. PRs adding works (papers, repositories) that the repo has missed are welcome.
Size: 63.5 KB - Last synced at: 12 days ago - Pushed at: 2 months ago - Stars: 176 - Forks: 11

quic/aimet-pages
AIMET GitHub pages documentation
Language: HTML - Size: 41 MB - Last synced at: about 13 hours ago - Pushed at: about 14 hours ago - Stars: 8 - Forks: 4

e-dupuis/awesome-approximate-dnn
Curated content for DNN approximation, acceleration ... with a focus on hardware accelerator and deployment
Size: 166 KB - Last synced at: 1 day ago - Pushed at: 11 months ago - Stars: 25 - Forks: 6

fastmachinelearning/hls4ml-tutorial
Tutorial notebooks for hls4ml
Language: Jupyter Notebook - Size: 19.9 MB - Last synced at: 12 days ago - Pushed at: 21 days ago - Stars: 334 - Forks: 149

666DZY666/micronet
micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), low-bit (≤2b) / ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape.
Language: Python - Size: 6.58 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 2,239 - Forks: 476

Won-Seong/lightweight-resnet
Compressing ResNet50 with iterative pruning & distillation to maintain high accuracy on CIFAR-100.
Language: Python - Size: 115 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

saiakshitha33/Computer-vision-tracking-system
A tracking system for visually identical objects. This repo contains code that tracks the trajectories of individual objects, counts the objects in each frame, and much more.
Language: Jupyter Notebook - Size: 31.9 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

SanderGi/YADES
YOLOv8 Animal Detection for Embedded Systems. 97% test accuracy in just 400kb (about the same size as the photos it classifies or 1 second of video). Various quantization, pruning, and distillation techniques for vision models are explored.
Language: Jupyter Notebook - Size: 62 MB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 2 - Forks: 0

ZhengaoLi/DISP-LLM-Dimension-Independent-Structural-Pruning
An implementation of the DISP-LLM method from the NeurIPS 2024 paper: Dimension-Independent Structural Pruning for Large Language Models.
Language: Python - Size: 83 KB - Last synced at: 20 days ago - Pushed at: about 2 months ago - Stars: 16 - Forks: 0

he-y/filter-pruning-geometric-median
Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (CVPR 2019 Oral)
Language: Python - Size: 2.17 MB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 608 - Forks: 113

he-y/soft-filter-pruning
Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks
Language: Python - Size: 59.6 KB - Last synced at: 17 days ago - Pushed at: over 5 years ago - Stars: 380 - Forks: 73

joisino/speedbook
Support site for the book "Accelerating Deep Neural Networks" (深層ニューラルネットワークの高速化).
Language: Jupyter Notebook - Size: 480 KB - Last synced at: 17 days ago - Pushed at: 8 months ago - Stars: 53 - Forks: 1

VITA-Group/Unified-LTH-GNN
[ICML 2021] "A Unified Lottery Tickets Hypothesis for Graph Neural Networks", Tianlong Chen*, Yongduo Sui*, Xuxi Chen, Aston Zhang, Zhangyang Wang
Language: Python - Size: 18.1 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 66 - Forks: 13

FasterAI-Labs/fasterai
FasterAI: Prune and Distill your models with FastAI and PyTorch
Language: Jupyter Notebook - Size: 34.9 MB - Last synced at: 11 days ago - Pushed at: 22 days ago - Stars: 247 - Forks: 19

vbdi/divprune
[CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models
Language: Python - Size: 11 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 7 - Forks: 0

apple/ml-upscale
Export utility for unconstrained channel pruned models
Language: Jupyter Notebook - Size: 3.28 MB - Last synced at: 18 days ago - Pushed at: almost 2 years ago - Stars: 72 - Forks: 12

coinslab/StatPruneNet
Development and Evaluation of Neural Net Sensitivity-Based Pruning Algorithms Using Statistical Inference
Language: Python - Size: 9.76 MB - Last synced at: 7 days ago - Pushed at: 23 days ago - Stars: 1 - Forks: 0

Nota-NetsPresso/SNP
Structured Neuron Level Pruning to compress Transformer-based models [ECCV'24]
Language: Python - Size: 99.1 MB - Last synced at: 23 days ago - Pushed at: 9 months ago - Stars: 12 - Forks: 1

airaria/TextPruner
A PyTorch-based model pruning toolkit for pre-trained language models
Language: Python - Size: 10.5 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 385 - Forks: 35

ArmenJeddi/saint
a training-free approach to accelerate ViTs and VLMs by pruning redundant tokens based on similarity
Language: Python - Size: 26.8 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 10 - Forks: 1

Mxbonn/ltmp
Code for Learned Thresholds Token Merging and Pruning for Vision Transformers (LTMP). A technique to reduce the size of Vision Transformers to any desired size with minimal loss of accuracy.
Language: Python - Size: 90.8 KB - Last synced at: 24 days ago - Pushed at: 5 months ago - Stars: 16 - Forks: 1

DrChainsaw/NaiveNASflux.jl
Your local Flux surgeon
Language: Julia - Size: 1.05 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 24 - Forks: 0

mlzxy/qsparse
Train neural networks with joint quantization and pruning on both weights and activations using any pytorch modules
Language: Python - Size: 293 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 41 - Forks: 2

schphe/tictac
tictactoe ai implementation using minimax with αβ-pruning
Language: C - Size: 17.6 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 3 - Forks: 0

binarypatrick/Prune
Prune is a simple tool that lets you remove any files not matching the specified retention options.
Language: C# - Size: 2.31 MB - Last synced at: 6 days ago - Pushed at: 28 days ago - Stars: 5 - Forks: 1

princeton-nlp/CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
Language: Python - Size: 1.79 MB - Last synced at: 19 days ago - Pushed at: almost 2 years ago - Stars: 196 - Forks: 32

ZIB-IOL/PERP
Code to reproduce the experiments of the paper: "PERP: Rethinking the Prune-Retrain Paradigm in the ERA of LLMs"
Language: Python - Size: 20.5 KB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

csarron/awesome-emdl
Embedded and mobile deep learning research resources
Size: 88.9 KB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 746 - Forks: 167

nelaturuharsha/TurboPrune
Harness for training/finding lottery tickets in PyTorch. With support for multiple pruning techniques and augmented by distributed training, FFCV and AMP.
Language: Python - Size: 925 KB - Last synced at: 12 days ago - Pushed at: 3 months ago - Stars: 17 - Forks: 1

ap-dev-github/atithidev-db-api
A fully scalable, AWS Lambda-based API. It enables seamless host and review management with serverless deployment, CI/CD automation, and robust security while being highly optimized for AWS cost efficiency. Now enhanced with TypeScript type safety, Jest testing, and package pruning.
Language: TypeScript - Size: 356 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

xuxw98/DSPDet3D
[ECCV 2024] 3D Small Object Detection with Dynamic Spatial Pruning
Language: Python - Size: 123 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 104 - Forks: 4

ZIB-IOL/SMS
Code to reproduce the experiments of the ICLR24-paper: "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging"
Language: Python - Size: 57.6 KB - Last synced at: 9 days ago - Pushed at: 6 months ago - Stars: 9 - Forks: 0

sjmikler/snip-pruning 📦
SNIP: SINGLE-SHOT NETWORK PRUNING
Language: Jupyter Notebook - Size: 48 MB - Last synced at: 24 days ago - Pushed at: about 1 month ago - Stars: 30 - Forks: 2

BenWhetton/keras-surgeon
Pruning and other network surgery for trained Keras models.
Language: Python - Size: 132 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 408 - Forks: 107

JulesBelveze/bert-squeeze
🛠️ Tools for Transformers compression using PyTorch Lightning ⚡
Language: Python - Size: 2.44 MB - Last synced at: 17 days ago - Pushed at: 5 months ago - Stars: 82 - Forks: 10

alibaba/TinyNeuralNetwork
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
Language: Python - Size: 25.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 809 - Forks: 122

satabios/sconce
E2E AutoML Model Compression Package
Language: Jupyter Notebook - Size: 134 MB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 47 - Forks: 4

SpursLipu/YOLOv3v4-ModelCompression-MultidatasetTraining-Multibackbone
YOLO ModelCompression MultidatasetTraining
Language: Python - Size: 45.4 MB - Last synced at: 3 days ago - Pushed at: almost 3 years ago - Stars: 444 - Forks: 136

FabrizioSandri/2SSP
2SSP: A Two-Stage Framework for Structured Pruning of LLMs
Language: Python - Size: 1.94 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 13 - Forks: 1

jacobgil/pytorch-pruning
PyTorch Implementation of [1611.06440] Pruning Convolutional Neural Networks for Resource Efficient Inference
Language: Python - Size: 11.7 KB - Last synced at: 20 days ago - Pushed at: almost 6 years ago - Stars: 878 - Forks: 202

jack-willturner/deep-compression
Learning both Weights and Connections for Efficient Neural Networks https://arxiv.org/abs/1506.02626
Language: Jupyter Notebook - Size: 2.4 MB - Last synced at: 20 days ago - Pushed at: over 2 years ago - Stars: 177 - Forks: 38

neuralmagic/sparsify
ML model optimization product to accelerate inference.
Language: Python - Size: 7.18 MB - Last synced at: 12 days ago - Pushed at: about 1 year ago - Stars: 326 - Forks: 30

sayakpaul/Adventures-in-TensorFlow-Lite
This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.
Language: Jupyter Notebook - Size: 49.1 MB - Last synced at: 17 days ago - Pushed at: about 2 years ago - Stars: 172 - Forks: 35

EIDOSLAB/simplify
Simplification of pruned models for accelerated inference | SoftwareX https://doi.org/10.1016/j.softx.2021.100907
Language: Python - Size: 2.01 MB - Last synced at: 30 days ago - Pushed at: about 2 months ago - Stars: 35 - Forks: 3

MK2112/mobileYOLOv3
YOLOv3 on a MobileNetV3_Small architecture; trained, explained, pruned and quantized for text detection.
Language: Jupyter Notebook - Size: 26.3 MB - Last synced at: 20 days ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

iurada/talos-task-arithmetic
Official repository of our work "Efficient Model Editing with Task-Localized Sparse Fine-tuning" accepted at ICLR 2025
Language: Python - Size: 84 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

gaosh/DWNP
Language: Python - Size: 38.1 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

bzantium/pytorch-admm-pruning
Prune DNN using Alternating Direction Method of Multipliers (ADMM)
Language: Python - Size: 15.6 KB - Last synced at: 25 days ago - Pushed at: over 5 years ago - Stars: 100 - Forks: 18

VITA-Group/PruneCXR
[MICCAI 2023] "How Does Pruning Impact Long-Tailed Multi-Label Medical Image Classifiers?" by Gregory Holste et al.
Language: Python - Size: 1000 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 10 - Forks: 1

delve-team/delve
PyTorch model training and layer saturation monitor
Language: Python - Size: 13.8 MB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 81 - Forks: 13

kentaroy47/Deep-Compression.Pytorch
Unofficial Pytorch implementation of Deep Compression in CIFAR10
Language: Python - Size: 430 MB - Last synced at: 21 days ago - Pushed at: over 3 years ago - Stars: 35 - Forks: 9

JarvisPei/FuseGPT
The implementation for the paper, FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers.
Language: Python - Size: 12.7 KB - Last synced at: 17 days ago - Pushed at: 3 months ago - Stars: 5 - Forks: 0

Nota-NetsPresso/nota-wav2lip
A 28× Compressed Wav2Lip for Efficient Talking Face Generation [ICCV'23 Demo] [MLSys'23 Workshop] [NVIDIA GTC'23]
Language: Python - Size: 78.1 KB - Last synced at: 23 days ago - Pushed at: about 1 year ago - Stars: 56 - Forks: 6

VITA-Group/SViTE
[NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang
Language: Python - Size: 615 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 12

hexuandeng/DRPruning
Language: Python - Size: 10.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

ragibson/ModularityPruning
Pruning tool to identify small subsets of network partitions that are significant from the perspective of stochastic block model inference. This method works for single-layer and multi-layer networks, as well as for restricting focus to a fixed number of communities when desired.
Language: Python - Size: 2.68 MB - Last synced at: 7 days ago - Pushed at: 6 months ago - Stars: 16 - Forks: 2

kriskrisliu/PAT
[AAAI 2025] PAT: Pruning-Aware Tuning for Large Language Models
Language: Python - Size: 30.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 27 - Forks: 0

mattjegan/pruner
A CLI tool for pruning your overgrown requirements file
Language: Python - Size: 22.5 KB - Last synced at: 3 days ago - Pushed at: about 4 years ago - Stars: 6 - Forks: 1

fangvv/EdgeDI
Code for paper "Joint Architecture Design and Workload Partitioning for DNN Inference on Industrial IoT Clusters"
Language: Python - Size: 18.6 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 2

3outeille/DSD-training
Implementation of DSD: Dense-Sparse-Dense Training for Deep Neural Networks in Pytorch.
Language: Jupyter Notebook - Size: 2.28 MB - Last synced at: 9 days ago - Pushed at: about 2 years ago - Stars: 9 - Forks: 1

kumasento/gconv-prune
Code repository for paper "Efficient Structured Pruning and Architecture Searching for Group Convolution" https://arxiv.org/abs/1811.09341
Language: Python - Size: 2.29 MB - Last synced at: 8 days ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 2

r-papso/torch-optimizer
PyTorch models optimization by neural network pruning
Language: Python - Size: 55.3 MB - Last synced at: 14 days ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1
