Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: quantization

datawhalechina/awesome-compression

A beginner's tutorial on model compression

Size: 102 MB - Last synced: about 11 hours ago - Pushed: about 12 hours ago - Stars: 43 - Forks: 10

SYSTRAN/faster-whisper

Faster Whisper transcription with CTranslate2

Language: Python - Size: 14.7 MB - Last synced: about 12 hours ago - Pushed: 1 day ago - Stars: 9,147 - Forks: 754
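
The entry above runs Whisper through CTranslate2 with quantized weights. A minimal sketch of its documented usage, assuming the faster-whisper package is installed and an audio file named audio.mp3 exists; the model size and beam width are illustrative:

```python
from faster_whisper import WhisperModel

# compute_type="int8" asks CTranslate2 to run the Whisper weights with 8-bit quantization on CPU.
model = WhisperModel("small", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5)
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```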

ymcui/Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)

Language: Python - Size: 23 MB - Last synced: about 14 hours ago - Pushed: 15 days ago - Stars: 17,526 - Forks: 1,808

Xilinx/finn

Dataflow compiler for QNN inference on FPGAs

Language: Python - Size: 77.1 MB - Last synced: about 15 hours ago - Pushed: 1 day ago - Stars: 668 - Forks: 213

Xilinx/brevitas

Brevitas: neural network quantization in PyTorch

Language: Python - Size: 19.5 MB - Last synced: about 15 hours ago - Pushed: about 15 hours ago - Stars: 1,097 - Forks: 175

OpenNMT/CTranslate2

Fast inference engine for Transformer models

Language: C++ - Size: 13.5 MB - Last synced: about 16 hours ago - Pushed: about 17 hours ago - Stars: 2,846 - Forks: 251
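
For CTranslate2 itself, quantized inference looks roughly like the sketch below, assuming a model has already been converted to the CTranslate2 format into a ct2_model/ directory and the input has been tokenized into subword tokens (both are placeholders here):

```python
import ctranslate2

# compute_type="int8" loads and runs the converted weights with 8-bit quantization.
translator = ctranslate2.Translator("ct2_model", device="cpu", compute_type="int8")

tokens = ["▁Hello", "▁world", "!"]           # hypothetical SentencePiece tokens
results = translator.translate_batch([tokens])
print(results[0].hypotheses[0])               # best hypothesis as a list of tokens
```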

mobiusml/hqq

Official implementation of Half-Quadratic Quantization (HQQ)

Language: Python - Size: 379 KB - Last synced: about 15 hours ago - Pushed: 1 day ago - Stars: 492 - Forks: 44

intel/neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Language: Python - Size: 400 MB - Last synced: about 22 hours ago - Pushed: about 23 hours ago - Stars: 1,985 - Forks: 238

Aisuko/notebooks

Implementation for the different ML tasks on Kaggle platform with GPUs.

Language: Jupyter Notebook - Size: 148 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 8 - Forks: 1

stochasticai/xTuring

Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our Discord community: https://discord.gg/TgHXuSJEk6

Language: Python - Size: 18.4 MB - Last synced: 1 day ago - Pushed: about 1 month ago - Stars: 2,527 - Forks: 197

IntelLabs/distiller 📦

Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller

Language: Jupyter Notebook - Size: 40.5 MB - Last synced: 1 day ago - Pushed: about 1 year ago - Stars: 4,309 - Forks: 797

ash2703/MyLearnings

Documentation of my notes, learnings, presentations on Computer vision and some other cool stuff

Language: Jupyter Notebook - Size: 7.6 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 1 - Forks: 0

quic/aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

Language: Python - Size: 12.8 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 1,928 - Forks: 351

htqin/awesome-model-quantization

A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously being improved. PRs adding works (papers, repositories) the repo has missed are welcome.

Size: 61.4 MB - Last synced: 1 day ago - Pushed: 16 days ago - Stars: 1,645 - Forks: 199

csarron/awesome-emdl

Embedded and mobile deep learning research resources

Size: 88.9 KB - Last synced: 1 day ago - Pushed: about 1 year ago - Stars: 720 - Forks: 166

DerryHub/BEVFormer_tensorrt

BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).

Language: Python - Size: 403 KB - Last synced: 1 day ago - Pushed: 6 months ago - Stars: 349 - Forks: 59

adithya-s-k/LLM-Alchemy-Chamber

A friendly neighborhood repository with diverse experiments and adventures in the world of LLMs

Language: Jupyter Notebook - Size: 4.55 MB - Last synced: 1 day ago - Pushed: 3 days ago - Stars: 110 - Forks: 25

openvinotoolkit/training_extensions

Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™

Language: Python - Size: 363 MB - Last synced: about 23 hours ago - Pushed: 1 day ago - Stars: 1,121 - Forks: 436

sony/model_optimization

Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. This project provides researchers, developers, and engineers advanced quantization and compression tools for deploying state-of-the-art neural networks.

Language: Python - Size: 8.43 MB - Last synced: 1 day ago - Pushed: 3 days ago - Stars: 267 - Forks: 42

ModelTC/TFMQ-DM

[CVPR 2024 Highlight] TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Language: Jupyter Notebook - Size: 56.9 MB - Last synced: 1 day ago - Pushed: 2 days ago - Stars: 21 - Forks: 3

RWKV/rwkv.cpp

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

Language: C++ - Size: 16.5 MB - Last synced: 1 day ago - Pushed: 29 days ago - Stars: 1,112 - Forks: 75

AutoGPTQ/AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python - Size: 7.8 MB - Last synced: 2 days ago - Pushed: 4 days ago - Stars: 3,847 - Forks: 388
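
A rough sketch of the GPTQ workflow AutoGPTQ exposes, assuming the package and a Hugging Face checkpoint are available; the model id, calibration sentence, and output directory are placeholders, and exact call signatures may vary between releases:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "facebook/opt-125m"               # hypothetical small model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)

# 4-bit weights with group-wise scales every 128 columns.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

# GPTQ needs a few calibration examples (tokenized text) to estimate quantization error.
examples = [tokenizer("AutoGPTQ is an easy-to-use LLM quantization package.")]
model.quantize(examples)
model.save_quantized("opt-125m-4bit-gptq")   # writes the 4-bit checkpoint
```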

koszeggy/KGySoft.Drawing

KGy SOFT Drawing is a library for advanced image, icon and graphics handling.

Language: C# - Size: 121 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 54 - Forks: 8

tensorflow/model-optimization

A toolkit to optimize Keras and TensorFlow ML models for deployment, including quantization and pruning.

Language: Python - Size: 2.17 MB - Last synced: 2 days ago - Pushed: 12 days ago - Stars: 1,471 - Forks: 321
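
A minimal sketch of quantization-aware training with the toolkit, assuming tensorflow and tensorflow-model-optimization are installed; the tiny Keras model is only for illustration:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1),
])

# Wrap the model with fake-quantization ops so it can be fine-tuned quantization-aware
# and later converted to an int8 TFLite model.
qat_model = tfmot.quantization.keras.quantize_model(model)
qat_model.compile(optimizer="adam", loss="mse")
qat_model.summary()
```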

ArslanKAS/Quantization-Fundamentals

Learn to compress models through methods such as quantization, making them more efficient, faster, and more accessible

Language: Jupyter Notebook - Size: 2.38 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 0 - Forks: 0

arjbingly/grag

GRAG is a simple Python package that provides an easy end-to-end solution for implementing Retrieval-Augmented Generation (RAG). The package offers an easy way to run various LLMs locally, thanks to LlamaCpp, and also supports vector stores such as Chroma and DeepLake.

Language: Python - Size: 53.1 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 4 - Forks: 0

petroniocandido/clshq_tk

Contrastive-LSH Embedding and Tokenization Technique for Multivariate Time Series Classification

Language: Jupyter Notebook - Size: 802 KB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 0 - Forks: 0

UFund-Me/Qbot

[🔥 updating ...] AI-powered automated quantitative trading bot and quantitative investment research platform. 📃 Online docs: https://ufund-me.github.io/Qbot ✨ qbot-mini: https://github.com/Charmve/iQuant

Language: Jupyter Notebook - Size: 205 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 5,955 - Forks: 792

cedrickchee/awesome-ml-model-compression

Awesome machine learning model compression research papers, tools, and learning material.

Size: 185 KB - Last synced: 1 day ago - Pushed: 7 days ago - Stars: 446 - Forks: 57

Bisonai/awesome-edge-machine-learning

A curated list of awesome edge machine learning resources, including research papers, inference engines, challenges, books, meetups and others.

Language: Python - Size: 135 KB - Last synced: 1 day ago - Pushed: about 1 year ago - Stars: 240 - Forks: 51

kornelski/pngquant

Lossy PNG compressor — pngquant command based on libimagequant library

Language: C - Size: 1.71 MB - Last synced: 4 days ago - Pushed: 17 days ago - Stars: 5,028 - Forks: 475
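
A small sketch of driving pngquant from Python via subprocess, assuming the pngquant binary is on PATH and an input.png exists; the quality range and output name are illustrative:

```python
import subprocess

# --quality takes a min-max range; --output names the result; --force overwrites it if present.
subprocess.run(
    ["pngquant", "--quality=65-80", "--output", "output.png", "--force", "input.png"],
    check=True,
)
```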

huawei-noah/Efficient-Computing

Efficient computing methods developed by Huawei Noah's Ark Lab

Language: Jupyter Notebook - Size: 98.7 MB - Last synced: 2 days ago - Pushed: about 1 month ago - Stars: 1,115 - Forks: 198

dbohdan/hicolor

🎨 Convert images to 15/16-bit RGB color with dithering

Language: C - Size: 642 KB - Last synced: 4 days ago - Pushed: 5 days ago - Stars: 193 - Forks: 5

Bisonai/ncnn Fork of Tencent/ncnn

Modified inference engine for quantized convolution using product quantization

Language: C++ - Size: 7.96 MB - Last synced: 5 days ago - Pushed: almost 2 years ago - Stars: 4 - Forks: 0

autohdw/QuBLAS

Quantized BLAS

Language: C++ - Size: 102 KB - Last synced: 17 days ago - Pushed: 23 days ago - Stars: 2 - Forks: 0

neuralmagic/sparsezoo

Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

Language: Python - Size: 1.79 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 358 - Forks: 23

PINTO0309/onnx2tf

Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.

Language: Python - Size: 3.89 MB - Last synced: 13 days ago - Pushed: 15 days ago - Stars: 559 - Forks: 59

neuralmagic/deepsparse

Sparsity-aware deep learning inference runtime for CPUs

Language: Python - Size: 137 MB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 2,881 - Forks: 168

hkproj/quantization-notes

Notes on quantization in neural networks

Language: Jupyter Notebook - Size: 940 KB - Last synced: 1 day ago - Pushed: 5 months ago - Stars: 30 - Forks: 9
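
As background for entries like this one, the asymmetric (affine) int8 scheme that most quantization tools build on can be worked through in a few lines of NumPy; this is a generic illustration, not code from the repo:

```python
import numpy as np

x = np.array([-1.2, -0.3, 0.0, 0.7, 2.5], dtype=np.float32)

qmin, qmax = -128, 127                            # signed int8 range
scale = (x.max() - x.min()) / (qmax - qmin)       # real-valued step size
zero_point = int(round(qmin - x.min() / scale))   # integer code that represents real 0.0

# Quantize: scale, shift, round, and clip into the int8 range.
q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
# Dequantize: undo the shift and scale to get an approximation of x.
x_hat = (q.astype(np.float32) - zero_point) * scale

print(q)      # int8 codes
print(x_hat)  # reconstruction, close to x up to one quantization step
```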

IntelLabs/nlp-architect 📦

A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks

Language: Python - Size: 531 MB - Last synced: 3 days ago - Pushed: over 1 year ago - Stars: 2,930 - Forks: 443

OmidGhadami95/EfficientNetV2_Quantization_CK

EfficientNetV2 (EfficientNetV2-B2) with INT8 and FP32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling an imbalanced dataset, etc.

Language: Jupyter Notebook - Size: 344 KB - Last synced: 11 days ago - Pushed: 11 days ago - Stars: 0 - Forks: 0

google/qkeras

QKeras: a quantization deep learning library for TensorFlow Keras

Language: Python - Size: 1.32 MB - Last synced: about 4 hours ago - Pushed: 11 days ago - Stars: 523 - Forks: 101

ellenzhuwang/rooted_loss

Rooted logistic loss to accelerate neural network training and LLM quantization

Language: Python - Size: 127 MB - Last synced: 12 days ago - Pushed: 12 days ago - Stars: 1 - Forks: 0

Harsh-Avinash/Caduceus

Infinite power but in a pendrive

Language: Jupyter Notebook - Size: 9.55 MB - Last synced: 13 days ago - Pushed: 5 months ago - Stars: 2 - Forks: 1

SqueezeAILab/SqueezeLLM

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

Language: Python - Size: 1.5 MB - Last synced: 12 days ago - Pushed: 13 days ago - Stars: 569 - Forks: 35

dguo/picture-paint

Dynamic Firefox theme that uses the National Geographic Photo of the Day

Language: JavaScript - Size: 1.18 MB - Last synced: 13 days ago - Pushed: over 1 year ago - Stars: 5 - Forks: 1

lordtt13/Machine-Vision

Models made for Edge Devices and NN Optimizations

Language: Python - Size: 1.43 MB - Last synced: 13 days ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

peterthehan/palette 📦

Median cut implementation.

Language: JavaScript - Size: 4.88 KB - Last synced: 13 days ago - Pushed: about 7 years ago - Stars: 1 - Forks: 0

peterthehan/buckets 📦

Color quantization with React.

Language: JavaScript - Size: 3.75 MB - Last synced: 13 days ago - Pushed: about 5 years ago - Stars: 1 - Forks: 0

zlatko-minev/pyEPR

Powerful, automated analysis and design of quantum microwave chips & devices [Energy-Participation Ratio and more]

Language: Python - Size: 3.68 MB - Last synced: about 7 hours ago - Pushed: about 1 month ago - Stars: 156 - Forks: 210

lowhung/clustering-algorithms

Algorithms related to clustering, such as k-medians and DBSCAN, as well as vector quantization.

Language: Python - Size: 618 KB - Last synced: 14 days ago - Pushed: almost 7 years ago - Stars: 1 - Forks: 0
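
Vector quantization of the kind mentioned above can be sketched with SciPy's k-means codebook utilities; the random data below is a stand-in for real feature vectors and is not taken from the repo:

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq, whiten

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 2))

features = whiten(data)                      # normalize per-dimension variance
codebook, distortion = kmeans(features, 4)   # learn a 4-entry codebook
codes, dists = vq(features, codebook)        # map each vector to its nearest codebook entry

print(codebook.shape)   # (4, 2): four centroids in 2-D
print(codes[:10])       # indices into the codebook for the first ten vectors
```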

mit-han-lab/TinyChatEngine

TinyChatEngine: On-Device LLM Inference Library

Language: C++ - Size: 78.4 MB - Last synced: 13 days ago - Pushed: 13 days ago - Stars: 543 - Forks: 52

hiyouga/LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs

Language: Python - Size: 209 MB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 20,012 - Forks: 2,411

GURPREETKAURJETHRA/LLaMA3-Quantization

LLaMA3-Quantization

Language: Python - Size: 1.73 MB - Last synced: 8 days ago - Pushed: 20 days ago - Stars: 3 - Forks: 2

open-mmlab/mmrazor

OpenMMLab Model Compression Toolbox and Benchmark.

Language: Python - Size: 11.1 MB - Last synced: 12 days ago - Pushed: about 1 month ago - Stars: 1,367 - Forks: 215

A-suozhang/awesome-quantization-and-fixed-point-training

Neural Network Quantization & Low-Bit Fixed Point Training For Hardware-Friendly Algorithm Design

Size: 81.1 KB - Last synced: 4 days ago - Pushed: over 3 years ago - Stars: 152 - Forks: 24

ShekiLyu/lixinger-openapi

Python API for the Lixinger (理杏仁) developer platform (unofficial)

Language: Python - Size: 241 KB - Last synced: 11 days ago - Pushed: almost 4 years ago - Stars: 49 - Forks: 11

AmanPriyanshu/FedPAQ-MNIST-implemenation

An implementation of FedPAQ with different experimental parameters, varying r (number of clients selected), t (local epochs), and s (quantizer levels).

Language: Python - Size: 125 MB - Last synced: 13 days ago - Pushed: about 3 years ago - Stars: 21 - Forks: 3

openppl-public/ppq

PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.

Language: Python - Size: 5.57 MB - Last synced: 16 days ago - Pushed: about 2 months ago - Stars: 1,366 - Forks: 217

VC86/QRE-QEP

Companion code for "The Rare Eclipse Problem on Tiles: Quantised Embeddings of Disjoint Convex Sets".

Language: Jupyter Notebook - Size: 1.37 MB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 1 - Forks: 0

vaccovecrana/rwkv.jni

JNI wrapper for rwkv.cpp

Language: Java - Size: 734 KB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 1 - Forks: 0

SforAiDl/KD_Lib

A PyTorch knowledge distillation library for benchmarking and extending works in the domains of knowledge distillation, pruning, and quantization.

Language: Python - Size: 22.2 MB - Last synced: 5 days ago - Pushed: about 1 year ago - Stars: 571 - Forks: 56

wang-h/neuralcompressor

Embedding Quantization (Compress Word Embeddings)

Language: Python - Size: 11.7 KB - Last synced: 17 days ago - Pushed: over 4 years ago - Stars: 8 - Forks: 1

natasha/navec

Compact high quality word embeddings for Russian language

Language: Python - Size: 1.86 MB - Last synced: 17 days ago - Pushed: 10 months ago - Stars: 166 - Forks: 16

fastmachinelearning/qonnx

QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX

Language: Python - Size: 5.17 MB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 96 - Forks: 32

ImageOptim/libimagequant

Palette quantization library that powers pngquant and other PNG optimizers

Language: Rust - Size: 1.24 MB - Last synced: about 1 month ago - Pushed: 4 months ago - Stars: 719 - Forks: 125

DeepVAC/deepvac

PyTorch Project Specification.

Language: Python - Size: 791 KB - Last synced: 17 days ago - Pushed: almost 3 years ago - Stars: 640 - Forks: 103

guan-yuan/awesome-AutoML-and-Lightweight-Models

A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.

Size: 150 KB - Last synced: 1 day ago - Pushed: almost 3 years ago - Stars: 827 - Forks: 160

666DZY666/micronet

micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa, "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"), low-bit (≤2b)/ternary and binary (TWN/BNN/XNOR-Net), and post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusing for quantization. Deployment: TensorRT, FP32/FP16/INT8 (PTQ calibration), op adaptation (upsample), dynamic shape.

Language: Python - Size: 6.84 MB - Last synced: 16 days ago - Pushed: over 2 years ago - Stars: 2,178 - Forks: 477

Ki6an/fastT5

⚡ Boost inference speed of T5 models by 5x and reduce the model size by 3x.

Language: Python - Size: 277 KB - Last synced: 9 days ago - Pushed: about 1 year ago - Stars: 541 - Forks: 68

kssteven418/I-BERT

[ICML'21 Oral] I-BERT: Integer-only BERT Quantization

Language: Python - Size: 6.38 MB - Last synced: 17 days ago - Pushed: over 1 year ago - Stars: 209 - Forks: 30

inisis/brocolli

Everything in Torch Fx

Language: Python - Size: 5.9 MB - Last synced: 17 days ago - Pushed: 2 months ago - Stars: 329 - Forks: 62

rvandernoort/SmallLLMs

List of smaller LLMs that could be deployed to the edge

Language: Ruby - Size: 32.2 KB - Last synced: 19 days ago - Pushed: 20 days ago - Stars: 0 - Forks: 0

huggingface/optimum-intel

🤗 Optimum Intel: Accelerate inference with Intel optimization tools

Language: Jupyter Notebook - Size: 3.08 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 316 - Forks: 88

cubicibo/piliq

Lightweight Python PIL-libimagequant/pngquant interface with autonomous lib look-up.

Language: Python - Size: 641 KB - Last synced: 1 day ago - Pushed: 20 days ago - Stars: 0 - Forks: 0

Astro36/ICE4009-practice-project 📦

Inha Univ. Digital Communication System Capstone Design Practice/Project

Language: MATLAB - Size: 14.8 MB - Last synced: 21 days ago - Pushed: 11 months ago - Stars: 2 - Forks: 0

openvinotoolkit/nncf

Neural Network Compression Framework for enhanced OpenVINO™ inference

Language: Python - Size: 47.1 MB - Last synced: 22 days ago - Pushed: 22 days ago - Stars: 772 - Forks: 197

ksm26/Quantization-Fundamentals-with-Hugging-Face

Learn linear quantization techniques using the Quanto library and downcasting methods with the Transformers library to compress and optimize generative AI models effectively.

Language: Jupyter Notebook - Size: 205 KB - Last synced: 21 days ago - Pushed: 21 days ago - Stars: 0 - Forks: 0

SqueezeAILab/KVQuant

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Language: Python - Size: 19.7 MB - Last synced: 21 days ago - Pushed: 25 days ago - Stars: 187 - Forks: 14

Asad-Ismail/lane_detection

Lane Detection and Classification using Front camera monocular images

Language: Python - Size: 40.9 MB - Last synced: 22 days ago - Pushed: about 1 year ago - Stars: 3 - Forks: 0

IST-DASLab/marlin

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Language: Python - Size: 758 KB - Last synced: 22 days ago - Pushed: 22 days ago - Stars: 314 - Forks: 21

countzero/windows_manage_large_language_models

PowerShell automation to download large language models (LLMs) from Git repositories and quantize them with llama.cpp into the GGUF format.

Language: PowerShell - Size: 32.2 KB - Last synced: 21 days ago - Pushed: 23 days ago - Stars: 1 - Forks: 0

A-suozhang/Awesome-Efficient-Diffusion

Curated list of methods that focus on improving the efficiency of diffusion models

Size: 5.86 KB - Last synced: 4 days ago - Pushed: 11 months ago - Stars: 13 - Forks: 0

Abhishek2271/TransferabilityAnalysis

A Python API that facilitates training, creating, and transferring attacks with quantized DNNs

Language: Python - Size: 54.8 MB - Last synced: 25 days ago - Pushed: 8 months ago - Stars: 1 - Forks: 0

hailo-ai/hailo_model_zoo

The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment

Language: Python - Size: 4.32 MB - Last synced: 24 days ago - Pushed: about 1 month ago - Stars: 91 - Forks: 28

PaddlePaddle/PaddleSlim

PaddleSlim is an open-source library for deep model compression and architecture search.

Language: Python - Size: 16.3 MB - Last synced: 24 days ago - Pushed: about 2 months ago - Stars: 1,514 - Forks: 347

jy-yuan/KIVI

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Language: Python - Size: 16.8 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 107 - Forks: 5

satabios/sconce

Model Compression Made Easy

Language: Jupyter Notebook - Size: 215 MB - Last synced: 27 days ago - Pushed: 27 days ago - Stars: 32 - Forks: 1

huggingface/quanto

A PyTorch quantization toolkit

Language: Python - Size: 1.67 MB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 514 - Forks: 28
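
A sketch of weight-only int8 quantization with quanto, assuming the package exposes quantize, freeze, and qint8 as in its README at the time; symbol names may differ across versions, and the tiny torch model is only for illustration:

```python
import torch
from quanto import quantize, freeze, qint8

model = torch.nn.Sequential(
    torch.nn.Linear(16, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 4),
)

quantize(model, weights=qint8)   # mark Linear weights for int8 quantization
freeze(model)                    # materialize the quantized weights

with torch.no_grad():
    out = model(torch.randn(1, 16))
print(out.shape)
```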

lmEshoo/quantization

Post-training model quantization using Apache TVM

Language: Jupyter Notebook - Size: 37.9 MB - Last synced: 28 days ago - Pushed: about 4 years ago - Stars: 1 - Forks: 1

lmEshoo/pruning

Model weight pruning

Language: Jupyter Notebook - Size: 72.4 MB - Last synced: 28 days ago - Pushed: about 4 years ago - Stars: 0 - Forks: 0

intel/auto-round

SOTA Weight-only Quantization Algorithm for LLMs

Language: Python - Size: 8.33 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 48 - Forks: 7

emsquared2/Telecommunications-NTUA

Project assignment for course Introduction to Telecommunications at ECE NTUA

Language: MATLAB - Size: 4.14 MB - Last synced: 29 days ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

huawei-noah/Pretrained-Language-Model

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Language: Python - Size: 29 MB - Last synced: 29 days ago - Pushed: 4 months ago - Stars: 2,953 - Forks: 623

iAmGiG/MadeSmallML

MadeSmallML is an open-source initiative designed to explore model quantization techniques for machine learning models. Our goal is to enable efficient deployment of these models on devices with limited computational resources by reducing model size and computational demands without significantly compromising performance.

Size: 5.86 KB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 0 - Forks: 0

intel/intel-extension-for-pytorch

A Python package that extends the official PyTorch to easily obtain performance gains on Intel platforms

Language: Python - Size: 92.1 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1,316 - Forks: 193

neuralmagic/sparsify

ML model optimization product to accelerate inference.

Language: Python - Size: 7.18 MB - Last synced: 7 days ago - Pushed: about 1 month ago - Stars: 315 - Forks: 27

kurianbenoy/Indic-Subtitler

Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.

Language: Jupyter Notebook - Size: 36.3 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 55 - Forks: 7

OpenGVLab/OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Language: Python - Size: 8.17 MB - Last synced: 29 days ago - Pushed: about 2 months ago - Stars: 546 - Forks: 43

huggingface/optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools

Language: Python - Size: 4.01 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 2,096 - Forks: 356