Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: quantization
datawhalechina/awesome-compression
A beginner's tutorial on model compression
Size: 102 MB - Last synced: about 11 hours ago - Pushed: about 12 hours ago - Stars: 43 - Forks: 10
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
Language: Python - Size: 14.7 MB - Last synced: about 12 hours ago - Pushed: 1 day ago - Stars: 9,147 - Forks: 754
ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Language: Python - Size: 23 MB - Last synced: about 14 hours ago - Pushed: 15 days ago - Stars: 17,526 - Forks: 1,808
Xilinx/finn
Dataflow compiler for QNN inference on FPGAs
Language: Python - Size: 77.1 MB - Last synced: about 15 hours ago - Pushed: 1 day ago - Stars: 668 - Forks: 213
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
Language: Python - Size: 19.5 MB - Last synced: about 15 hours ago - Pushed: about 15 hours ago - Stars: 1,097 - Forks: 175
OpenNMT/CTranslate2
Fast inference engine for Transformer models
Language: C++ - Size: 13.5 MB - Last synced: about 16 hours ago - Pushed: about 17 hours ago - Stars: 2,846 - Forks: 251
mobiusml/hqq
Official implementation of Half-Quadratic Quantization (HQQ)
Language: Python - Size: 379 KB - Last synced: about 15 hours ago - Pushed: 1 day ago - Stars: 492 - Forks: 44
intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Language: Python - Size: 400 MB - Last synced: about 22 hours ago - Pushed: about 23 hours ago - Stars: 1,985 - Forks: 238
Aisuko/notebooks
Implementations of various ML tasks on the Kaggle platform with GPUs.
Language: Jupyter Notebook - Size: 148 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 8 - Forks: 1
stochasticai/xTuring
Build, customize, and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our Discord community: https://discord.gg/TgHXuSJEk6
Language: Python - Size: 18.4 MB - Last synced: 1 day ago - Pushed: about 1 month ago - Stars: 2,527 - Forks: 197
IntelLabs/distiller 📦
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Language: Jupyter Notebook - Size: 40.5 MB - Last synced: 1 day ago - Pushed: about 1 year ago - Stars: 4,309 - Forks: 797
ash2703/MyLearnings
Documentation of my notes, learnings, and presentations on computer vision and some other cool stuff
Language: Jupyter Notebook - Size: 7.6 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 1 - Forks: 0
quic/aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Language: Python - Size: 12.8 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 1,928 - Forks: 351
htqin/awesome-model-quantization
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously improving. PRs adding works (papers, repositories) the repo has missed are welcome.
Size: 61.4 MB - Last synced: 1 day ago - Pushed: 16 days ago - Stars: 1,645 - Forks: 199
csarron/awesome-emdl
Embedded and mobile deep learning research resources
Size: 88.9 KB - Last synced: 1 day ago - Pushed: about 1 year ago - Stars: 720 - Forks: 166
DerryHub/BEVFormer_tensorrt
BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).
Language: Python - Size: 403 KB - Last synced: 1 day ago - Pushed: 6 months ago - Stars: 349 - Forks: 59
adithya-s-k/LLM-Alchemy-Chamber
A friendly neighborhood repository with diverse experiments and adventures in the world of LLMs
Language: Jupyter Notebook - Size: 4.55 MB - Last synced: 1 day ago - Pushed: 3 days ago - Stars: 110 - Forks: 25
openvinotoolkit/training_extensions
Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™
Language: Python - Size: 363 MB - Last synced: about 23 hours ago - Pushed: 1 day ago - Stars: 1,121 - Forks: 436
sony/model_optimization
Model Compression Toolkit (MCT) is an open source project for optimizing neural network models for efficient deployment on constrained hardware. It provides researchers, developers, and engineers with advanced quantization and compression tools for deploying state-of-the-art neural networks.
Language: Python - Size: 8.43 MB - Last synced: 1 day ago - Pushed: 3 days ago - Stars: 267 - Forks: 42
ModelTC/TFMQ-DM
[CVPR 2024 Highlight] TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
Language: Jupyter Notebook - Size: 56.9 MB - Last synced: 1 day ago - Pushed: 2 days ago - Stars: 21 - Forks: 3
RWKV/rwkv.cpp
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Language: C++ - Size: 16.5 MB - Last synced: 1 day ago - Pushed: 29 days ago - Stars: 1,112 - Forks: 75
AutoGPTQ/AutoGPTQ
An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.
Language: Python - Size: 7.8 MB - Last synced: 2 days ago - Pushed: 4 days ago - Stars: 3,847 - Forks: 388
koszeggy/KGySoft.Drawing
KGy SOFT Drawing is a library for advanced image, icon and graphics handling.
Language: C# - Size: 121 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 54 - Forks: 8
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Language: Python - Size: 2.17 MB - Last synced: 2 days ago - Pushed: 12 days ago - Stars: 1,471 - Forks: 321
ArslanKAS/Quantization-Fundamentals
Learn to compress models through methods such as quantization to make them more efficient, faster, and accessible
Language: Jupyter Notebook - Size: 2.38 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 0 - Forks: 0
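The affine (asymmetric) int8 quantization that courses like this one cover can be sketched in a few lines of plain Python. This is an illustrative example, not code from the repository:

```python
# Affine (asymmetric) int8 quantization sketch: map floats in [lo, hi]
# onto the 256 integer levels [-128, 127] via a scale and zero point.
def quantize_int8(values):
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0          # guard against a constant tensor
    zero_point = round(-lo / scale) - 128   # integer that represents 0.0
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # reconstruct approximate floats from the integer codes
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.2, 0.0, 0.7, 2.5]
q, s, z = quantize_int8(weights)
restored = dequantize(q, s, z)
```

The round trip loses at most about one quantization step of precision per value, which is the basic trade-off every int8 scheme makes.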
arjbingly/grag
GRAG is a simple Python package that provides an easy end-to-end solution for implementing Retrieval-Augmented Generation (RAG). It offers an easy way to run various LLMs locally, thanks to LlamaCpp, and also supports vector stores such as Chroma and DeepLake.
Language: Python - Size: 53.1 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 4 - Forks: 0
petroniocandido/clshq_tk
Contrastive-LSH Embedding and Tokenization Technique for Multivariate Time Series Classification
Language: Jupyter Notebook - Size: 802 KB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 0 - Forks: 0
UFund-Me/Qbot
[🔥 updating...] AI-powered quantitative trading bot and quantitative investment research platform. 📃 Online docs: https://ufund-me.github.io/Qbot ✨ qbot-mini: https://github.com/Charmve/iQuant
Language: Jupyter Notebook - Size: 205 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 5,955 - Forks: 792
cedrickchee/awesome-ml-model-compression
Awesome machine learning model compression research papers, tools, and learning material.
Size: 185 KB - Last synced: 1 day ago - Pushed: 7 days ago - Stars: 446 - Forks: 57
Bisonai/awesome-edge-machine-learning
A curated list of awesome edge machine learning resources, including research papers, inference engines, challenges, books, meetups and others.
Language: Python - Size: 135 KB - Last synced: 1 day ago - Pushed: about 1 year ago - Stars: 240 - Forks: 51
kornelski/pngquant
Lossy PNG compressor — pngquant command based on libimagequant library
Language: C - Size: 1.71 MB - Last synced: 4 days ago - Pushed: 17 days ago - Stars: 5,028 - Forks: 475
huawei-noah/Efficient-Computing
Efficient computing methods developed by Huawei Noah's Ark Lab
Language: Jupyter Notebook - Size: 98.7 MB - Last synced: 2 days ago - Pushed: about 1 month ago - Stars: 1,115 - Forks: 198
dbohdan/hicolor
🎨 Convert images to 15/16-bit RGB color with dithering
Language: C - Size: 642 KB - Last synced: 4 days ago - Pushed: 5 days ago - Stars: 193 - Forks: 5
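The 16-bit (RGB565) half of what hicolor does boils down to keeping the top 5/6/5 bits of each 8-bit channel. A minimal sketch of that bit arithmetic (illustrative, not hicolor's actual code):

```python
def to_rgb565(r, g, b):
    # keep the top 5/6/5 bits of each 8-bit channel and pack into 16 bits
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def from_rgb565(c):
    # unpack, then expand back to 8 bits by replicating the high bits
    r = (c >> 11) & 0x1F
    g = (c >> 5) & 0x3F
    b = c & 0x1F
    return (r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2)
```

The high-bit replication on the way back ensures pure black maps to 0 and pure white maps to 255 exactly; dithering (which hicolor adds) then hides the banding in the remaining levels.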
Bisonai/ncnn Fork of Tencent/ncnn
Modified inference engine for quantized convolution using product quantization
Language: C++ - Size: 7.96 MB - Last synced: 5 days ago - Pushed: almost 2 years ago - Stars: 4 - Forks: 0
autohdw/QuBLAS
Quantized BLAS
Language: C++ - Size: 102 KB - Last synced: 17 days ago - Pushed: 23 days ago - Stars: 2 - Forks: 0
neuralmagic/sparsezoo
Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
Language: Python - Size: 1.79 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 358 - Forks: 23
PINTO0309/onnx2tf
Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.
Language: Python - Size: 3.89 MB - Last synced: 13 days ago - Pushed: 15 days ago - Stars: 559 - Forks: 59
neuralmagic/deepsparse
Sparsity-aware deep learning inference runtime for CPUs
Language: Python - Size: 137 MB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 2,881 - Forks: 168
hkproj/quantization-notes
Notes on quantization in neural networks
Language: Jupyter Notebook - Size: 940 KB - Last synced: 1 day ago - Pushed: 5 months ago - Stars: 30 - Forks: 9
IntelLabs/nlp-architect 📦
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
Language: Python - Size: 531 MB - Last synced: 3 days ago - Pushed: over 1 year ago - Stars: 2,930 - Forks: 443
OmidGhadami95/EfficientNetV2_Quantization_CK
EfficientNetV2 (EfficientNetV2-b2) with INT8 and FP32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling an imbalanced dataset, etc.
Language: Jupyter Notebook - Size: 344 KB - Last synced: 11 days ago - Pushed: 11 days ago - Stars: 0 - Forks: 0
google/qkeras
QKeras: a quantization deep learning library for Tensorflow Keras
Language: Python - Size: 1.32 MB - Last synced: about 4 hours ago - Pushed: 11 days ago - Stars: 523 - Forks: 101
ellenzhuwang/rooted_loss
Rooted logistic loss to accelerate neural network training and LLM quantization
Language: Python - Size: 127 MB - Last synced: 12 days ago - Pushed: 12 days ago - Stars: 1 - Forks: 0
Harsh-Avinash/Caduceus
Infinite power but in a pendrive
Language: Jupyter Notebook - Size: 9.55 MB - Last synced: 13 days ago - Pushed: 5 months ago - Stars: 2 - Forks: 1
SqueezeAILab/SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Language: Python - Size: 1.5 MB - Last synced: 12 days ago - Pushed: 13 days ago - Stars: 569 - Forks: 35
dguo/picture-paint
Dynamic Firefox theme that uses the National Geographic Photo of the Day
Language: JavaScript - Size: 1.18 MB - Last synced: 13 days ago - Pushed: over 1 year ago - Stars: 5 - Forks: 1
lordtt13/Machine-Vision
Models made for Edge Devices and NN Optimizations
Language: Python - Size: 1.43 MB - Last synced: 13 days ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0
peterthehan/palette 📦
Median cut implementation.
Language: JavaScript - Size: 4.88 KB - Last synced: 13 days ago - Pushed: about 7 years ago - Stars: 1 - Forks: 0
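Median cut, the algorithm this repo implements, repeatedly splits the pixel set along its widest color channel until the desired palette size is reached. A compact Python sketch of the idea (illustrative; the repository itself is JavaScript):

```python
def median_cut(pixels, n_colors):
    """Reduce a list of (r, g, b) pixels to a palette of n_colors."""
    boxes = [list(pixels)]
    while len(boxes) < n_colors:
        splittable = [b for b in boxes if len(b) > 1]
        if not splittable:
            break
        # pick the box with the widest channel range
        def spread(box):
            return max(max(p[i] for p in box) - min(p[i] for p in box)
                       for i in range(3))
        box = max(splittable, key=spread)
        # split it at the median of its widest channel
        ch = max(range(3),
                 key=lambda i: max(p[i] for p in box) - min(p[i] for p in box))
        box.sort(key=lambda p: p[ch])
        mid = len(box) // 2
        boxes.remove(box)
        boxes.extend([box[:mid], box[mid:]])
    # the palette is the average color of each box
    return [tuple(sum(p[i] for p in box) // len(box) for i in range(3))
            for box in boxes]
```

Splitting at the median (rather than the midpoint of the range) keeps the boxes balanced in pixel count, which is what distinguishes median cut from simpler uniform quantizers.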
peterthehan/buckets 📦
Color quantization with React.
Language: JavaScript - Size: 3.75 MB - Last synced: 13 days ago - Pushed: about 5 years ago - Stars: 1 - Forks: 0
zlatko-minev/pyEPR
Powerful, automated analysis and design of quantum microwave chips & devices [Energy-Participation Ratio and more]
Language: Python - Size: 3.68 MB - Last synced: about 7 hours ago - Pushed: about 1 month ago - Stars: 156 - Forks: 210
lowhung/clustering-algorithms
Algorithms related to clustering such as k-Medians, DBSCAN as well as vector quantization.
Language: Python - Size: 618 KB - Last synced: 14 days ago - Pushed: almost 7 years ago - Stars: 1 - Forks: 0
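Vector quantization, as mentioned in this entry, assigns each input vector to its nearest codebook entry. A toy nearest-codeword assignment in plain Python (illustrative, not code from this repository):

```python
def vq_assign(vectors, codebook):
    """Return, for each vector, the index of its nearest codebook entry."""
    def dist2(a, b):
        # squared Euclidean distance; the square root is not needed for argmin
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda k: dist2(v, codebook[k]))
            for v in vectors]
```

Clustering algorithms like k-medians produce the codebook; this assignment step is then the "quantization" that replaces each vector with a short index.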
mit-han-lab/TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
Language: C++ - Size: 78.4 MB - Last synced: 13 days ago - Pushed: 13 days ago - Stars: 543 - Forks: 52
hiyouga/LLaMA-Factory
Unify Efficient Fine-Tuning of 100+ LLMs
Language: Python - Size: 209 MB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 20,012 - Forks: 2,411
GURPREETKAURJETHRA/LLaMA3-Quantization
LLaMA3-Quantization
Language: Python - Size: 1.73 MB - Last synced: 8 days ago - Pushed: 20 days ago - Stars: 3 - Forks: 2
open-mmlab/mmrazor
OpenMMLab Model Compression Toolbox and Benchmark.
Language: Python - Size: 11.1 MB - Last synced: 12 days ago - Pushed: about 1 month ago - Stars: 1,367 - Forks: 215
A-suozhang/awesome-quantization-and-fixed-point-training
Neural Network Quantization & Low-Bit Fixed Point Training For Hardware-Friendly Algorithm Design
Size: 81.1 KB - Last synced: 4 days ago - Pushed: over 3 years ago - Stars: 152 - Forks: 24
ShekiLyu/lixinger-openapi
Python API for the Lixinger open platform (unofficial)
Language: Python - Size: 241 KB - Last synced: 11 days ago - Pushed: almost 4 years ago - Stars: 49 - Forks: 11
AmanPriyanshu/FedPAQ-MNIST-implemenation
An implementation of FedPAQ with different experimental parameters, varying r (number of clients selected), t (local epochs), and s (quantizer levels)
Language: Python - Size: 125 MB - Last synced: 13 days ago - Pushed: about 3 years ago - Stars: 21 - Forks: 3
openppl-public/ppq
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
Language: Python - Size: 5.57 MB - Last synced: 16 days ago - Pushed: about 2 months ago - Stars: 1,366 - Forks: 217
VC86/QRE-QEP
Companion code for "The Rare Eclipse Problem on Tiles: Quantised Embeddings of Disjoint Convex Sets".
Language: Jupyter Notebook - Size: 1.37 MB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 1 - Forks: 0
vaccovecrana/rwkv.jni
JNI wrapper for rwkv.cpp
Language: Java - Size: 734 KB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 1 - Forks: 0
SforAiDl/KD_Lib
A PyTorch knowledge distillation library for benchmarking and extending work in the domains of knowledge distillation, pruning, and quantization.
Language: Python - Size: 22.2 MB - Last synced: 5 days ago - Pushed: about 1 year ago - Stars: 571 - Forks: 56
wang-h/neuralcompressor
Embedding Quantization (Compress Word Embeddings)
Language: Python - Size: 11.7 KB - Last synced: 17 days ago - Pushed: over 4 years ago - Stars: 8 - Forks: 1
natasha/navec
Compact high quality word embeddings for Russian language
Language: Python - Size: 1.86 MB - Last synced: 17 days ago - Pushed: 10 months ago - Stars: 166 - Forks: 16
fastmachinelearning/qonnx
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
Language: Python - Size: 5.17 MB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 96 - Forks: 32
ImageOptim/libimagequant
Palette quantization library that powers pngquant and other PNG optimizers
Language: Rust - Size: 1.24 MB - Last synced: about 1 month ago - Pushed: 4 months ago - Stars: 719 - Forks: 125
DeepVAC/deepvac
PyTorch Project Specification.
Language: Python - Size: 791 KB - Last synced: 17 days ago - Pushed: almost 3 years ago - Stars: 640 - Forks: 103
guan-yuan/awesome-AutoML-and-Lightweight-Models
A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.
Size: 150 KB - Last synced: 1 day ago - Pushed: almost 3 years ago - Stars: 827 - Forks: 160
666DZY666/micronet
micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b: DoReFa, "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b) ternary and binary methods (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group-convolution channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, FP32/FP16/INT8 (PTQ calibration), op adaptation (upsample), dynamic shape.
Language: Python - Size: 6.84 MB - Last synced: 16 days ago - Pushed: over 2 years ago - Stars: 2,178 - Forks: 477
Ki6an/fastT5
⚡ Boost the inference speed of T5 models by up to 5× and reduce model size by 3×.
Language: Python - Size: 277 KB - Last synced: 9 days ago - Pushed: about 1 year ago - Stars: 541 - Forks: 68
kssteven418/I-BERT
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
Language: Python - Size: 6.38 MB - Last synced: 17 days ago - Pushed: over 1 year ago - Stars: 209 - Forks: 30
inisis/brocolli
Everything in Torch Fx
Language: Python - Size: 5.9 MB - Last synced: 17 days ago - Pushed: 2 months ago - Stars: 329 - Forks: 62
rvandernoort/SmallLLMs
List of smaller LLMs that could be deployed to the edge
Language: Ruby - Size: 32.2 KB - Last synced: 19 days ago - Pushed: 20 days ago - Stars: 0 - Forks: 0
huggingface/optimum-intel
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
Language: Jupyter Notebook - Size: 3.08 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 316 - Forks: 88
cubicibo/piliq
Lightweight Python PIL-libimagequant/pngquant interface with autonomous lib look-up.
Language: Python - Size: 641 KB - Last synced: 1 day ago - Pushed: 20 days ago - Stars: 0 - Forks: 0
Astro36/ICE4009-practice-project 📦
Inha Univ. Digital Communication System Capstone Design Practice/Project
Language: MATLAB - Size: 14.8 MB - Last synced: 21 days ago - Pushed: 11 months ago - Stars: 2 - Forks: 0
openvinotoolkit/nncf
Neural Network Compression Framework for enhanced OpenVINO™ inference
Language: Python - Size: 47.1 MB - Last synced: 22 days ago - Pushed: 22 days ago - Stars: 772 - Forks: 197
ksm26/Quantization-Fundamentals-with-Hugging-Face
Learn linear quantization techniques using the Quanto library and downcasting methods with the Transformers library to compress and optimize generative AI models effectively.
Language: Jupyter Notebook - Size: 205 KB - Last synced: 21 days ago - Pushed: 21 days ago - Stars: 0 - Forks: 0
SqueezeAILab/KVQuant
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Language: Python - Size: 19.7 MB - Last synced: 21 days ago - Pushed: 25 days ago - Stars: 187 - Forks: 14
Asad-Ismail/lane_detection
Lane Detection and Classification using Front camera monocular images
Language: Python - Size: 40.9 MB - Last synced: 22 days ago - Pushed: about 1 year ago - Stars: 3 - Forks: 0
IST-DASLab/marlin
FP16×INT4 LLM inference kernel that can achieve near-ideal ~4× speedups up to medium batch sizes of 16-32 tokens.
Language: Python - Size: 758 KB - Last synced: 22 days ago - Pushed: 22 days ago - Stars: 314 - Forks: 21
countzero/windows_manage_large_language_models
PowerShell automation to download large language models (LLMs) from Git repositories and quantize them with llama.cpp into the GGUF format.
Language: PowerShell - Size: 32.2 KB - Last synced: 21 days ago - Pushed: 23 days ago - Stars: 1 - Forks: 0
A-suozhang/Awesome-Efficient-Diffusion
Curated list of methods that focuses on improving the efficiency of diffusion models
Size: 5.86 KB - Last synced: 4 days ago - Pushed: 11 months ago - Stars: 13 - Forks: 0
Abhishek2271/TransferabilityAnalysis
A Python API that facilitates training, creating, and transferring attacks with quantized DNNs
Language: Python - Size: 54.8 MB - Last synced: 25 days ago - Pushed: 8 months ago - Stars: 1 - Forks: 0
hailo-ai/hailo_model_zoo
The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment
Language: Python - Size: 4.32 MB - Last synced: 24 days ago - Pushed: about 1 month ago - Stars: 91 - Forks: 28
PaddlePaddle/PaddleSlim
PaddleSlim is an open-source library for deep model compression and architecture search.
Language: Python - Size: 16.3 MB - Last synced: 24 days ago - Pushed: about 2 months ago - Stars: 1,514 - Forks: 347
jy-yuan/KIVI
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
Language: Python - Size: 16.8 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 107 - Forks: 5
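Asymmetric 2-bit quantization of the kind KIVI applies to KV caches maps each group of values onto just four levels using a per-group scale and offset. A simplified sketch of that idea in plain Python (not the paper's exact algorithm, which also distinguishes per-channel from per-token grouping):

```python
def quantize_2bit(group):
    # asymmetric: the 4 levels (0..3) span exactly [min, max] of the group
    lo, hi = min(group), max(group)
    scale = (hi - lo) / 3 or 1.0   # guard against a constant group
    q = [round((v - lo) / scale) for v in group]
    return q, scale, lo

def dequantize_2bit(q, scale, lo):
    return [qi * scale + lo for qi in q]
```

With only four levels, grouping matters enormously: the smaller and more homogeneous each group, the smaller the scale and hence the reconstruction error, which is why KV-cache quantizers pick the grouping axis carefully.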
satabios/sconce
Model Compression Made Easy
Language: Jupyter Notebook - Size: 215 MB - Last synced: 27 days ago - Pushed: 27 days ago - Stars: 32 - Forks: 1
huggingface/quanto
A PyTorch quantization toolkit
Language: Python - Size: 1.67 MB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 514 - Forks: 28
lmEshoo/quantization
Post-training model quantization using Apache TVM
Language: Jupyter Notebook - Size: 37.9 MB - Last synced: 28 days ago - Pushed: about 4 years ago - Stars: 1 - Forks: 1
lmEshoo/pruning
model weight pruning
Language: Jupyter Notebook - Size: 72.4 MB - Last synced: 28 days ago - Pushed: about 4 years ago - Stars: 0 - Forks: 0
intel/auto-round
SOTA Weight-only Quantization Algorithm for LLMs
Language: Python - Size: 8.33 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 48 - Forks: 7
emsquared2/Telecommunications-NTUA
Project assignment for course Introduction to Telecommunications at ECE NTUA
Language: MATLAB - Size: 4.14 MB - Last synced: 29 days ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
huawei-noah/Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Language: Python - Size: 29 MB - Last synced: 29 days ago - Pushed: 4 months ago - Stars: 2,953 - Forks: 623
iAmGiG/MadeSmallML
MadeSmallML is an open-source initiative designed to explore model quantization techniques for machine learning models. Our goal is to enable efficient deployment of these models on devices with limited computational resources by reducing model size and computational demands without significantly compromising performance.
Size: 5.86 KB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 0 - Forks: 0
intel/intel-extension-for-pytorch
A Python package extending official PyTorch that makes it easy to obtain improved performance on Intel platforms
Language: Python - Size: 92.1 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1,316 - Forks: 193
neuralmagic/sparsify
ML model optimization product to accelerate inference.
Language: Python - Size: 7.18 MB - Last synced: 7 days ago - Pushed: about 1 month ago - Stars: 315 - Forks: 27
kurianbenoy/Indic-Subtitler
Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.
Language: Jupyter Notebook - Size: 36.3 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 55 - Forks: 7
OpenGVLab/OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
Language: Python - Size: 8.17 MB - Last synced: 29 days ago - Pushed: about 2 months ago - Stars: 546 - Forks: 43
huggingface/optimum
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
Language: Python - Size: 4.01 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 2,096 - Forks: 356