GitHub topics: quantization
open-mmlab/mmrazor
OpenMMLab Model Compression Toolbox and Benchmark.
Language: Python - Size: 11.1 MB - Last synced at: 26 days ago - Pushed at: about 1 year ago - Stars: 1,599 - Forks: 236

raywan-110/AdaQP
Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training
Language: Python - Size: 97.7 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 24 - Forks: 3

vbdi/casp
[CVPR 2025] CASP: Compression of Large Multimodal Models Based on Attention Sparsity
Language: Python - Size: 764 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 4 - Forks: 1

neuralmagic/sparsezoo
Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
Language: Python - Size: 1.33 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 386 - Forks: 28

lucadellalib/audiocodecs
A collection of audio codecs with a standardized API
Language: Python - Size: 851 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 20 - Forks: 3

aaron-xichen/pytorch-playground
Base pretrained models and datasets in pytorch (MNIST, SVHN, CIFAR10, CIFAR100, STL10, AlexNet, VGG16, VGG19, ResNet, Inception, SqueezeNet)
Language: Python - Size: 45.9 KB - Last synced at: 25 days ago - Pushed at: over 2 years ago - Stars: 2,669 - Forks: 618

d1pankarmedhi/nn-linear-quantization
linear quantization with W8A16 for neural networks with PyTorch
Language: Python - Size: 28.3 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0
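
For readers unfamiliar with the term in the entry above: "W8A16" usually means weights stored as 8-bit integers while activations stay in 16-bit floating point. The following is a minimal sketch of symmetric (absmax) linear quantization in plain Python, not the repository's code. All function names are hypothetical, and ordinary Python floats stand in for 16-bit activations.

```python
# Sketch of symmetric linear weight quantization (assumed absmax scheme).
# Names are hypothetical and chosen for illustration only.

def quantize_row(row, num_bits=8):
    """Map a list of floats to signed integers plus one scale per row."""
    qmax = 2 ** (num_bits - 1) - 1               # 127 for 8 bits
    absmax = max(abs(w) for w in row)
    scale = absmax / qmax if absmax > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in row]
    return q, scale

def linear_w8a16(x, q_weights, scales):
    """Compute y = W x with integer weights rescaled after accumulation.
    Activations x stay in floating point (standing in for 16-bit)."""
    out = []
    for q_row, scale in zip(q_weights, scales):
        acc = sum(xi * qi for xi, qi in zip(x, q_row))
        out.append(acc * scale)
    return out

weights = [[0.5, -1.0, 0.25], [2.0, 0.1, -0.3]]
quantized = [quantize_row(row) for row in weights]
q_weights = [q for q, _ in quantized]
scales = [s for _, s in quantized]

x = [1.0, 2.0, 3.0]
y_quant = linear_w8a16(x, q_weights, scales)
# Full-precision reference for comparison.
y_ref = [sum(xi * wi for xi, wi in zip(x, row)) for row in weights]
```

Each weight is off by at most half a scale step after rounding, so `y_quant` stays close to `y_ref` for inputs of moderate size.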

Victorletzelter/annealed_mcl
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing (NeurIPS 2024)
Language: Python - Size: 39.3 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 6 - Forks: 0

ModelTC/QLLM
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"
Language: Python - Size: 1.68 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 38 - Forks: 4

grinn-global/bionic-robot-hand-demo
A mixed‐precision YOLO-Pose hand gesture recognition system based on the Synaptics Astra SL1680 NPU
Language: Python - Size: 804 KB - Last synced at: 3 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

koszeggy/KGySoft.Drawing.Tools
Debugger visualizers and image editor apps built on KGy SOFT Drawing Libraries
Language: C# - Size: 2.72 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 22 - Forks: 4

hailo-ai/hailo_model_zoo
The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment
Language: Python - Size: 5.83 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 451 - Forks: 58

cedrickchee/awesome-ml-model-compression
Awesome machine learning model compression research papers, quantization, tools, and learning material.
Size: 213 KB - Last synced at: 29 days ago - Pushed at: 9 months ago - Stars: 523 - Forks: 60

MichezNTF/AI-Engineering
AI Engineering is a comprehensive bootcamp designed for programmers to master AI through practical projects and foundational theory. Each week, participants engage in hands-on learning, covering essential topics like Python, data manipulation, and machine learning math. 🐙💻
Language: Python - Size: 24.4 KB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 0 - Forks: 0

ikergarcia1996/Easy-Translate
Easy-Translate is a script for translating large text files with a SINGLE COMMAND. Easy-Translate is designed to be as easy as possible for beginners and as seamless and customizable as possible for advanced users.
Language: Python - Size: 656 KB - Last synced at: 23 days ago - Pushed at: 7 months ago - Stars: 217 - Forks: 338

saifhaq/alma
Language: Python - Size: 21.7 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 20 - Forks: 1

joisino/speedbook
Support site for the book 『深層ニューラルネットワークの高速化』 (Accelerating Deep Neural Networks).
Language: Jupyter Notebook - Size: 480 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 57 - Forks: 2

sinanuozdemir/quick-start-guide-to-llms
The Official Repo for "Quick Start Guide to Large Language Models"
Language: Jupyter Notebook - Size: 91.8 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 282 - Forks: 165

guan-yuan/Awesome-AutoML-and-Lightweight-Models
A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.
Size: 150 KB - Last synced at: 2 days ago - Pushed at: about 4 years ago - Stars: 854 - Forks: 160

IntelLabs/nlp-architect 📦
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
Language: Python - Size: 531 MB - Last synced at: 29 days ago - Pushed at: over 2 years ago - Stars: 2,940 - Forks: 448

neuralmagic/deepsparse
Sparsity-aware deep learning inference runtime for CPUs
Language: Python - Size: 137 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 3,147 - Forks: 186

slitiWassim/Drone-Guard
A Self-Supervised Deep Learning Framework for Spatiotemporal Anomaly Detection in UAV Surveillance Videos
Language: JavaScript - Size: 56.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

SforAiDl/KD_Lib
A Pytorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
Language: Python - Size: 22.2 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 630 - Forks: 60

multi-modal-ai/production-hub
Hands-on hub for learning techniques to optimize AI models and serve them in production as efficiently as possible.
Language: Jupyter Notebook - Size: 43.2 MB - Last synced at: 12 days ago - Pushed at: 9 months ago - Stars: 8 - Forks: 1

natasha/navec
Compact high quality word embeddings for Russian language
Language: Python - Size: 1.86 MB - Last synced at: 29 days ago - Pushed at: almost 2 years ago - Stars: 200 - Forks: 18

VThuong99/LeNet5qt.c
Language: C - Size: 3.7 MB - Last synced at: 10 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

mit-han-lab/tinyengine
[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 256KB Memory
Language: C - Size: 235 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 869 - Forks: 141

bitsandbytes-foundation/bitsandbytes-intel 📦
An extension to enable performance acceleration for bitsandbytes on Intel platforms.
Language: Python - Size: 35.2 KB - Last synced at: 1 day ago - Pushed at: 2 months ago - Stars: 3 - Forks: 1

huawei-noah/Efficient-Computing
Efficient computing methods developed by Huawei Noah's Ark Lab
Language: Jupyter Notebook - Size: 100 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 1,273 - Forks: 218

OpenNMT/CTranslate2
Fast inference engine for Transformer models
Language: C++ - Size: 14.5 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 3,810 - Forks: 358

winstxnhdw/ct2hf
A friendly CLI tool for converting and uploading transformers for CTranslate2.
Language: Python - Size: 13.6 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

kornelski/pngquant
Lossy PNG compressor — pngquant command based on libimagequant library
Language: C - Size: 1.71 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 5,381 - Forks: 492

arasgungore/PCM-and-DM-modulators
A Python/MATLAB project which implements pulse-code modulation (PCM) and delta modulation (DM).
Language: Jupyter Notebook - Size: 742 KB - Last synced at: 12 days ago - Pushed at: almost 3 years ago - Stars: 13 - Forks: 0
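
As a rough illustration of the two schemes named in the entry above, the sketch below implements a uniform PCM quantizer and a basic delta modulator in plain Python. It is not taken from the repository. The function names, the [-1, 1] input range, and the step size `delta` are assumptions made for the example.

```python
import math

def pcm_encode(samples, num_bits, lo=-1.0, hi=1.0):
    """Uniform PCM: map each sample to one of 2**num_bits integer codes."""
    levels = 2 ** num_bits
    step = (hi - lo) / levels
    codes = []
    for s in samples:
        code = int((s - lo) / step)
        codes.append(max(0, min(levels - 1, code)))  # clamp to valid codes
    return codes, step

def pcm_decode(codes, step, lo=-1.0):
    """Reconstruct each sample at the midpoint of its quantization cell."""
    return [lo + (c + 0.5) * step for c in codes]

def dm_encode(samples, delta):
    """Delta modulation: one bit per sample, 1 = step up, 0 = step down."""
    bits, estimate = [], 0.0
    for s in samples:
        bit = 1 if s >= estimate else 0
        estimate += delta if bit else -delta
        bits.append(bit)
    return bits

def dm_decode(bits, delta):
    """Rebuild the staircase approximation from the bit stream."""
    out, estimate = [], 0.0
    for bit in bits:
        estimate += delta if bit else -delta
        out.append(estimate)
    return out

# One period of a sine wave, 64 samples.
signal = [math.sin(2 * math.pi * n / 64) for n in range(64)]

codes, step = pcm_encode(signal, num_bits=4)
pcm_out = pcm_decode(codes, step)

# delta exceeds the largest per-sample change of the sine, so the
# modulator avoids slope overload on this input.
dm_bits = dm_encode(signal, delta=0.1)
dm_out = dm_decode(dm_bits, delta=0.1)
```

PCM spends `num_bits` per sample and bounds the error by half a quantization step. Delta modulation spends one bit per sample and tracks the signal only as long as it changes by less than `delta` between samples.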

zlatko-minev/pyEPR
Powerful, automated analysis and design of quantum microwave chips & devices [Energy-Participation Ratio and more]
Language: Python - Size: 2.78 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 179 - Forks: 253

UFund-Me/Qbot
[🔥updating ...] AI automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. 📃 online docs: https://ufund-me.github.io/Qbot ✨ qbot-mini: https://github.com/Charmve/iQuant
Language: Jupyter Notebook - Size: 387 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 11,442 - Forks: 1,647

DeepVAC/deepvac
PyTorch Project Specification.
Language: Python - Size: 791 KB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 680 - Forks: 104

GreenBull31/tinyllama-coreml-ios18-quantization
Quantize TinyLlama-1.1B-Chat from PyTorch to CoreML (float16, int8, int4) for efficient on-device inference on iOS 18+.
Language: Python - Size: 6.84 KB - Last synced at: 16 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

AvatariaProducciones/KVSplit
Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
Language: Python - Size: 722 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0
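
The 59% figure in the entry above can be roughly reproduced with a back-of-the-envelope calculation. The sketch below assumes an FP16 baseline and block-quantized formats that store one 16-bit scale per 32 values (as llama.cpp's q8_0 and q4_0 do); the description itself does not say which formats are used. The model shape and function names are hypothetical.

```python
def bits_per_value(payload_bits, block_size=32, scale_bits=16):
    """Effective bits per element when each block carries one shared scale."""
    return payload_bits + scale_bits / block_size

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, key_bits, value_bits):
    """Total KV cache size: one key and one value element per position."""
    elems = n_layers * n_kv_heads * head_dim * n_tokens
    return elems * (key_bits + value_bits) / 8

# Hypothetical model shape, used only to make the numbers concrete.
shape = dict(n_layers=22, n_kv_heads=4, head_dim=64, n_tokens=8192)

baseline = kv_cache_bytes(**shape, key_bits=16, value_bits=16)
split = kv_cache_bytes(
    **shape,
    key_bits=bits_per_value(8),    # 8.5 effective bits per key element
    value_bits=bits_per_value(4),  # 4.5 effective bits per value element
)

saving = 1 - split / baseline      # 1 - 13/32 = 0.59375
print(f"baseline: {baseline / 2**20:.1f} MiB, split: {split / 2**20:.1f} MiB")
print(f"memory saved: {saving:.1%}")
```

The saving depends only on the bit widths, not on the model shape, so the same ratio holds for any context length under these assumptions.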

microsoft/LQ-Nets 📦
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks
Language: Python - Size: 28.3 KB - Last synced at: 3 days ago - Pushed at: almost 3 years ago - Stars: 242 - Forks: 69

dvmazur/mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
Language: Python - Size: 261 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 2,311 - Forks: 232

Beomi/BitNet-Transformers
0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch with Llama(2) Architecture
Language: Python - Size: 588 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 302 - Forks: 32

Xiuyu-Li/q-diffusion
[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.
Language: Python - Size: 5.97 MB - Last synced at: 28 days ago - Pushed at: over 1 year ago - Stars: 347 - Forks: 24

neuralmagic/sparsify
ML model optimization product to accelerate inference.
Language: Python - Size: 6.99 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 324 - Forks: 30

submission2019/cnn-quantization
Quantization of Convolutional Neural networks.
Language: Python - Size: 2.71 MB - Last synced at: 25 days ago - Pushed at: 11 months ago - Stars: 243 - Forks: 60

Artessay/ArtQuantization
ArtQuantization is developed for quantizing Large Language Models, focusing on optimizing the memory usage and performance. This repository provides experimental results of quantizing models such as Qwen2.5 using different algorithms like AWQ and GPTQ, and demonstrates the memory requirements under various graphics card configurations.
Language: Python - Size: 26.4 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

gregabbott/swotch
Make limited palette PNGs and SVG swatches from images. ~14KB
Language: HTML - Size: 580 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

dipampaul17/KVSplit
Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
Language: Python - Size: 717 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 8 - Forks: 0

1duo/awesome-ai-infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
Size: 11.8 MB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 416 - Forks: 74

PedroFellipeAntunes/dithering-java
Java program to apply dithering (reduce color count) to an image.
Language: Java - Size: 2.74 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ImageOptim/libimagequant
Palette quantization library that powers pngquant and other PNG optimizers
Language: Rust - Size: 1.34 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 828 - Forks: 133

Nirusanan/Tabular_Data_Analysis-LLM
This project utilizes fine-tuned LLMs to generate Pandas code for performing financial data analytics tasks.
Language: Jupyter Notebook - Size: 276 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

kurianbenoy/Indic-Subtitler
Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.
Language: Jupyter Notebook - Size: 36.4 MB - Last synced at: 28 days ago - Pushed at: 2 months ago - Stars: 89 - Forks: 13

lucidrains/discrete-key-value-bottleneck-pytorch
Implementation of Discrete Key / Value Bottleneck, in Pytorch
Language: Python - Size: 196 KB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 88 - Forks: 3

Smallsan/OctQuant
Octree color quantization algorithm.
Language: Go - Size: 3.06 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

MatinHosseinianFard/Fault-Tolerant-Systems-Design-Project
A replication of "Enhancing Battery Thermal Management With Virtual Temperature Sensor Using Hybrid CNN-LSTM"
Language: Jupyter Notebook - Size: 56.7 MB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

singhdivyank/Data-Science-CheatSheets
Curated list of resources for Data Scientists, AI developers, and interview preparation
Language: Jupyter Notebook - Size: 322 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 6 - Forks: 0

ericmckevitt/MobileNetV2-Quantization-Benchmarking
Benchmark GPU inference performance of MobileNetV2: full-precision vs quantized (INT8) models using TensorRT
Language: Python - Size: 12.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

NoakLiu/LLMEasyQuant
An Easy-to-Use Toolkit for LLM Quantization that can be executed on a MacBook [Efficient ML Model]
Language: Python - Size: 2.76 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 18 - Forks: 0

kemingy/rabitq
rabitq rust implementation
Language: Rust - Size: 300 KB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 10 - Forks: 0

sinanuozdemir/oreilly-hands-on-gpt-llm
Mastering the Art of Scalable and Efficient AI Model Deployment
Language: Jupyter Notebook - Size: 33.2 MB - Last synced at: 30 days ago - Pushed at: 4 months ago - Stars: 136 - Forks: 91

A-suozhang/awesome-quantization-and-fixed-point-training
Neural Network Quantization & Low-Bit Fixed Point Training For Hardware-Friendly Algorithm Design
Size: 81.1 KB - Last synced at: 1 day ago - Pushed at: over 4 years ago - Stars: 161 - Forks: 24

tpoisonooo/llama.onnx
LLaMa/RWKV onnx models, quantization and testcase
Language: Python - Size: 1.3 MB - Last synced at: 10 days ago - Pushed at: almost 2 years ago - Stars: 363 - Forks: 31

kssteven418/I-BERT
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
Language: Python - Size: 6.38 MB - Last synced at: 28 days ago - Pushed at: over 2 years ago - Stars: 246 - Forks: 36

mit-han-lab/haq
[CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Language: Python - Size: 64.5 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 384 - Forks: 85

matlab-deep-learning/Quantized-Deep-Neural-Network-on-Jetson-AGX-Xavier
How to create, train and quantize network, then integrate it into pre/post image processing and generate CUDA C++ code for targeting Jetson AGX Xavier
Language: MATLAB - Size: 10.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 11 - Forks: 2

megvii-research/Sparsebit
A model compression and acceleration toolbox based on pytorch.
Language: Python - Size: 7.45 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 332 - Forks: 40

xvyaward/owq
Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models".
Language: Python - Size: 3.03 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 61 - Forks: 7

hkproj/quantization-notes
Notes on quantization in neural networks
Language: Jupyter Notebook - Size: 940 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 81 - Forks: 16

HosseinAtrsaei/Capacity-Bounds-for-Communication-Systems-with-Quantization-and-Spectral-Constraints
This repo analyzes capacity bounds of communication systems with quantization and spectral constraints. It includes theoretical derivations and numerical evaluations of mutual information under coarse quantization and bandwidth limitations, with applications in modern wireless systems.
Language: Jupyter Notebook - Size: 144 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

hahnyuan/PB-LLM
PB-LLM: Partially Binarized Large Language Models
Language: Python - Size: 20.7 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 152 - Forks: 10

Aisuko/notebooks
Implementation for the different ML tasks on Kaggle platform with GPUs.
Language: Jupyter Notebook - Size: 160 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 20 - Forks: 3

Aaronhuang-778/BiLLM
[ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Language: Python - Size: 1.73 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 216 - Forks: 14

onnx/neural-compressor
Model compression for ONNX
Language: Python - Size: 2.35 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 92 - Forks: 9

autohdw/QuBLAS
Quantized BLAS
Language: C++ - Size: 377 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 5 - Forks: 2

IntelLabs/distiller 📦
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Language: Jupyter Notebook - Size: 40.5 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 4,390 - Forks: 805

koulanurag/mmn
Moore Machine Networks (MMN): Learning Finite-State Representations of Recurrent Policy Networks
Language: Python - Size: 115 MB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 50 - Forks: 13

laelhalawani/glai
glai - GGUF LLAMA AI - Package for simplified model handling and text generation with Llama models quantized to GGUF format. APIs for downloading and loading models automatically, includes a db with models of various scale and quantizations. With this high level API you need one line to load the model and one to generate text completions.
Language: Python - Size: 208 KB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

soumyadip1995/BabyGPT
Somewhere between Karpathy's minGPT model and his video lectures, BabyGPT is an easy-to-use model on a much smaller scale (16 and 256 out channels, 5 heads, fine-tuned), intended to be useful on low-powered devices.
Language: Jupyter Notebook - Size: 14.9 MB - Last synced at: 10 days ago - Pushed at: almost 2 years ago - Stars: 22 - Forks: 2

satabios/sconce
E2E AutoML Model Compression Package
Language: Jupyter Notebook - Size: 134 MB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 46 - Forks: 4

GiorgosXou/NeuralNetworks
A resource-conscious neural network implementation for MCUs
Language: C++ - Size: 1.24 MB - Last synced at: 28 days ago - Pushed at: 2 months ago - Stars: 88 - Forks: 25

Joschua-Conrad/PSumSim
PSumSim: A Simulator for Partial-Sum Quantization in Analog Matrix-Vector Multipliers
Language: Python - Size: 1.21 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

dvgodoy/LLM-visuals
Over 60 figures and diagrams of LLMs, quantization, low-rank adapters (LoRA), and chat templates FREE TO USE in your blog posts, slides, presentations, or papers.
Size: 4.11 MB - Last synced at: 10 days ago - Pushed at: 4 months ago - Stars: 14 - Forks: 3

JulesBelveze/bert-squeeze
🛠️ Tools for Transformers compression using PyTorch Lightning ⚡
Language: Python - Size: 2.44 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 83 - Forks: 10

loks666/FinancialMachineLearning
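This project uses the Capital Asset Pricing Model (CAPM) and mean-variance optimization (MVO) to build an optimal portfolio via quadratic programming, minimizing risk while maximizing return.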
Language: Jupyter Notebook - Size: 6.63 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 3 - Forks: 0

hyx1999/Quad
Official Implementation of QUAD: Quantization and Parameter-Efficient Tuning of LLM with Activation Decomposition
Language: Python - Size: 1.97 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

DarthSim/quantizr
Fast library for converting RGBA images to 8-bit palette images. Written in Rust; can be used in C programs
Language: Rust - Size: 104 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 19 - Forks: 2

khaykingleb/keyword-spotting
Attention Implementation for KWS
Language: Jupyter Notebook - Size: 1000 KB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 2

yutingshih/vit-quant
Quantization for vision transformers
Language: Python - Size: 17.6 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

ArchitJ6/Llama2-FineTuning
🦙 Llama2-FineTuning: Fine-tune LLAMA 2 with Custom Datasets Using LoRA and QLoRA Techniques
Language: Jupyter Notebook - Size: 2.83 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

dnotitia/qllm-infer
A modular framework for evaluating quantization algorithms with reproducible and consistent benchmarks.
Language: Python - Size: 52.1 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 1

Ki6an/fastT5
⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
Language: Python - Size: 277 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 578 - Forks: 73

JeanSanchezFelix/EdgeML-Projects
This repository contains example notebooks and homeworks demonstrating various techniques in model optimization for Edge ML.
Language: Jupyter Notebook - Size: 26.4 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

Maknee/minigpt4.cpp
Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)
Language: C++ - Size: 2.12 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 565 - Forks: 27

Bisonai/awesome-edge-machine-learning
A curated list of awesome edge machine learning resources, including research papers, inference engines, challenges, books, meetups and others.
Language: Python - Size: 135 KB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 261 - Forks: 50

Mohammad-Hallaq/Wake_Vision_Challenge_Model_Centric_Track Fork of harvard-edge/Wake_Vision_Challenge_Model_Centric_Track
Language: PureBasic - Size: 37.5 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

umjammer/vavi-image-sandbox
🖼️ Imaging sandbox (HEIF Java ImageIO SPI, filters, swing animation component)
Language: Java - Size: 8.52 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

countzero/windows_manage_large_language_models
PowerShell automation to download large language models (LLMs) from Git repositories and quantize them with llama.cpp into the GGUF format.
Language: PowerShell - Size: 32.2 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 3 - Forks: 0

j-marple-dev/model_compression
PyTorch Model Compression
Language: Python - Size: 31 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 232 - Forks: 25

minseok0809/awesome-ai-paper
A curated list of awesome papers on NLP, Computer Vision, Model Compression, XAI, Reinforcement Learning, Security, etc.
Language: Jupyter Notebook - Size: 38.3 MB - Last synced at: 8 days ago - Pushed at: 2 months ago - Stars: 6 - Forks: 0

e-dupuis/awesome-approximate-dnn
Curated content for DNN approximation, acceleration ... with a focus on hardware accelerator and deployment
Size: 166 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 25 - Forks: 6
