Topic: "model-quantization"
Efficient-ML/Awesome-Model-Quantization
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. PRs for works (papers, repositories) that the repo has missed are welcome.
Size: 61.5 MB - Last synced at: 10 days ago - Pushed at: 3 months ago - Stars: 2,084 - Forks: 221

horseee/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
Language: Python - Size: 62.3 MB - Last synced at: 4 days ago - Pushed at: 25 days ago - Stars: 1,657 - Forks: 134

datawhalechina/awesome-compression
A beginner-friendly introductory tutorial on model compression
Size: 302 MB - Last synced at: 11 days ago - Pushed at: 6 months ago - Stars: 274 - Forks: 34

inferflow/inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
Language: C++ - Size: 1.89 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 243 - Forks: 25

Efficient-ML/Awesome-Efficient-AIGC
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and we are continuously improving the project. PRs for works (papers, repositories) that the repo has missed are welcome.
Size: 63.5 KB - Last synced at: 10 days ago - Pushed at: 3 months ago - Stars: 178 - Forks: 11

sayakpaul/Adventures-in-TensorFlow-Lite
This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.
Language: Jupyter Notebook - Size: 49.1 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 172 - Forks: 35

RodolfoFerro/psychopathology-fer-assistant
[WINNER! 🏆] Psychopathology FER Assistant. Because mental health matters. My project submission for #TFWorld TF 2.0 Challenge at Devpost.
Language: Jupyter Notebook - Size: 12 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 67 - Forks: 25

htqin/BiBench
This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binarization.
Language: Python - Size: 110 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 41 - Forks: 3

htqin/QuantSR
This project is the official implementation of our accepted NeurIPS 2023 (spotlight) paper QuantSR: Accurate Low-bit Quantization for Efficient Image Super-Resolution.
Language: Python - Size: 9.75 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 31 - Forks: 2

seonglae/llama2gptq
Chat with LLaMa 2 and get responses that cite reference documents retrieved from a vector database. Runs locally on a model with GPTQ 4-bit quantization.
Language: Python - Size: 9.48 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 29 - Forks: 0

nbasyl/OFQ
The official implementation of the ICML 2023 paper OFQ-ViT
Language: Python - Size: 640 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 0

dcarpintero/ai-engineering
AI Engineering: Annotated notebooks to dive into Self-Attention, In-Context Learning, RAG, Knowledge-Graphs, Fine-Tuning, Model Optimization, and more.
Language: Jupyter Notebook - Size: 11.6 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 6 - Forks: 0

NANEXLABS/Nanex-AI
Enterprise multi-agent framework for secure, borderless data collaboration with zero-trust and federated learning; lightweight and edge-ready.
Language: Python - Size: 119 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0

frickyinn/BiDense
PyTorch implementation of "BiDense: Binarization for Dense Prediction", a binary neural network for dense prediction tasks.
Language: Python - Size: 1.21 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

SRDdev/Model-Quantization
Quantization is a technique to reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types like 8-bit integer (int8) instead of the usual 32-bit floating point (float32).
Language: Jupyter Notebook - Size: 3.16 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0
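The int8-versus-float32 idea in the description above can be illustrated with a minimal sketch of affine (asymmetric) quantization in plain Python. This is not taken from the SRDdev/Model-Quantization repository; the function names `quantize_int8` and `dequantize_int8` are hypothetical, and real toolkits typically add calibration and per-channel scales on top of this.

```python
def quantize_int8(values):
    """Map floats to int8 codes using an affine scale and zero point.

    Hypothetical helper for illustration only.
    """
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # One int8 step covers `scale` units of the float range (256 levels).
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    # The zero point is the int8 code that represents the float value 0.0.
    zero_point = round(-128 - lo / scale)
    # Round to the nearest code and clamp to the int8 range [-128, 127].
    codes = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize_int8(codes, scale, zero_point):
    """Recover approximate floats from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


weights = [-1.0, -0.25, 0.0, 0.5, 2.0]
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(codes, scale, zero_point)
# Largest reconstruction error; expected to stay within one quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each code fits in one byte instead of the four bytes of a float32, which is where the memory saving comes from; the price is a rounding error bounded by the step size `scale`.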

dwain-barnes/LLM-GGUF-Auto-Converter
Automated Jupyter notebook solution for batch converting Large Language Models to GGUF format with multiple quantization options. Built on llama.cpp with HuggingFace integration.
Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 2

nnilayy/Spresense
Language: C++ - Size: 2.59 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

Keshavpatel2/local-llm-workbench
🧠 A comprehensive toolkit for benchmarking, optimizing, and deploying local Large Language Models. Includes performance testing tools, optimized configurations for CPU/GPU/hybrid setups, and detailed guides to maximize LLM performance on your hardware.
Language: Shell - Size: 8.79 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

xhay-p/ttPG
Torch and Transformers Playground: Learn and Code Deep Learning using PyTorch and HuggingFace Transformers.
Language: Jupyter Notebook - Size: 154 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

satyampurwar/large-language-models
Unlocking the Power of Generative AI: In-Context Learning, Instruction Fine-Tuning and Reinforcement Learning Fine-Tuning.
Language: Jupyter Notebook - Size: 170 KB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Chenguiti6444/Vehicle_Detection_and_Classification_using_Deep_Learning
Fine-tuning Pretrained Deep Learning Models to Classify Low Quality Images of Land Vehicles.
Language: Jupyter Notebook - Size: 1.35 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

dslisleedh/NCNet-flax
Unofficial implementation of NCNet using flax and jax
Language: Python - Size: 131 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0
