int8 | Topic | Ecosyste.ms: Repos

Topic: "int8"

intel/neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Language: Python - Size: 468 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2,375 - Forks: 267

intel/neural-speed 📦

An innovative library for efficient LLM inference via low-bit quantization

Language: C++ - Size: 16.2 MB - Last synced at: 22 days ago - Pushed at: 8 months ago - Stars: 351 - Forks: 38

clancylian/retinaface

A reimplementation of RetinaFace using C++ and TensorRT

Language: C++ - Size: 5.71 MB - Last synced at: 9 months ago - Pushed at: over 5 years ago - Stars: 296 - Forks: 90

Wulingtian/yolov5_tensorrt_int8_tools

TensorRT INT8 quantization of YOLOv5 ONNX models

Language: Python - Size: 7.51 MB - Last synced at: about 6 hours ago - Pushed at: almost 4 years ago - Stars: 182 - Forks: 42

Wulingtian/yolov5_tensorrt_int8

TensorRT INT8 quantized deployment of the yolov5s model; measured at 3.3 ms per frame!

Language: C++ - Size: 6.66 MB - Last synced at: about 6 hours ago - Pushed at: almost 4 years ago - Stars: 168 - Forks: 26

Wulingtian/RepVGG_TensorRT_int8

RepVGG TensorRT INT8 quantization; measured inference under 1 ms per frame!

Language: Python - Size: 469 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 57 - Forks: 14

Wulingtian/nanodet_tensorrt_int8

nanodet INT8 quantization; measured inference at 2 ms per frame!

Language: C++ - Size: 6.24 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 37 - Forks: 6

xuanandsix/Tensorrt-int8-quantization-pipline

A simple pipeline for INT8 quantization based on TensorRT.

Language: Python - Size: 836 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 27 - Forks: 2

ppogg/ncnn-yolov4-int8

NCNN+Int8+YOLOv4 quantized modeling and real-time inference

Language: C++ - Size: 14 MB - Last synced at: 17 days ago - Pushed at: over 3 years ago - Stars: 24 - Forks: 5

whitelok/tensorrt-int8-python-sample

TensorRT Int8 Python version sample.

Language: Python - Size: 1.5 MB - Last synced at: 9 months ago - Pushed at: about 6 years ago - Stars: 14 - Forks: 1

aahouzi/llama2-chatbot-cpu

A LLaMA2-7b chatbot with memory running on CPU, and optimized using smooth quantization, 4-bit quantization or Intel® Extension For PyTorch with bfloat16.

Language: Python - Size: 30.3 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 13 - Forks: 0

cbalint13/rvv-kernels

RISCV Vector Kernel C/LLVM-IR generator

Language: C - Size: 13.8 MB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 7 - Forks: 1

dasdristanta13/LLM-Lora-PEFT_accumulate

LLM-Lora-PEFT_accumulate explores optimizations for Large Language Models (LLMs) using PEFT, LORA, and QLORA. Contribute experiments and implementations to enhance LLM efficiency. Join discussions and push the boundaries of LLM optimization. Let's make LLMs more efficient together!

Language: Jupyter Notebook - Size: 138 KB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 1

JohnClaw/chatllm.vb

VB.NET API wrapper for the LLM inference library chatllm.cpp

Language: Visual Basic .NET - Size: 6.84 KB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 4 - Forks: 0

egbertYeah/mt-yolov6_tensorrt

MT-Yolov6 TensorRT Inference with Python.

Language: Python - Size: 55.6 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 0

JohnClaw/chatllm.cs

C# API wrapper for the LLM inference library chatllm.cpp

Language: C# - Size: 779 KB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

psychose-club/IBO

IBO stands for "Internal binary operations" and it is a library for Java to read, write, and handle binary files and data types that aren't available in Java.

Language: Java - Size: 546 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

stdlib-js/array-int8

Int8Array.

Language: JavaScript - Size: 671 KB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

stdlib-js/constants-int8

8-bit signed integer mathematical constants.

Language: JavaScript - Size: 490 KB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

stdlib-js/constants-int8-min

Minimum signed 8-bit integer.

Language: JavaScript - Size: 349 KB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

stdlib-js/constants-int8-max

Maximum signed 8-bit integer.

Language: JavaScript - Size: 347 KB - Last synced at: 8 days ago - Pushed at: 14 days ago - Stars: 1 - Forks: 0

stdlib-js/constants-int8-num-bytes

Size (in bytes) of an 8-bit signed integer.

Language: JavaScript - Size: 321 KB - Last synced at: 9 days ago - Pushed at: 14 days ago - Stars: 1 - Forks: 0

stdlib-js/napi-argv-int8array

Convert a Node-API value to a signed 8-bit integer array.

Language: C - Size: 200 KB - Last synced at: 8 days ago - Pushed at: 14 days ago - Stars: 1 - Forks: 0

stdlib-js/napi-argv-strided-int8array2d

Convert a Node-API value representing a two-dimensional strided array to a signed 8-bit integer array.

Language: C - Size: 59.6 KB - Last synced at: 9 days ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

stdlib-js/assert-is-int8array

Test if a value is an Int8Array.

Language: JavaScript - Size: 463 KB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

stdlib-js/napi-argv-strided-int8array

Convert a Node-API value representing a strided array to a signed 8-bit integer array.

Language: C - Size: 188 KB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

lbin/gie_int8_sample

Language: C++ - Size: 24.4 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 0

loveboyme/YOLOv5-TensorRT-Accelerator

High-performance YOLOv5 inference framework accelerated by TensorRT with dynamic optimization

Language: Python - Size: 0 Bytes - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

Egorundel/int8_calibrator_cpp

INT8 calibrator for ONNX model with dynamic batch_size at the input and NMS module at the output. C++ Implementation.

Language: C++ - Size: 81 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

yester31/Quantization_EX

Quantization example for PTQ & QAT

Language: Python - Size: 94.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

MrFMach/Practice-C-types

Practicing C data types using the sizeof operator

Language: C - Size: 2.93 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0
