GitHub topics: int8-inference

Repositories

jahongir7174/YOLOv8-qat

Quantization Aware Training

Language: Python - Size: 9.31 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 70 - Forks: 11

DerryHub/BEVFormer_tensorrt

BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).

Language: Python - Size: 403 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 466 - Forks: 76

BUG1989/caffe-int8-convert-tools

Generate a quantization parameter file for ncnn framework int8 inference

Language: Python - Size: 622 KB - Last synced at: 3 months ago - Pushed at: almost 5 years ago - Stars: 519 - Forks: 154

anilsathyan7/Portrait-Segmentation

Real-time portrait segmentation for mobile devices

Language: Jupyter Notebook - Size: 495 MB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 645 - Forks: 135

ENOT-AutoDL/gpt-j-6B-tensorrt-int8

GPT-J 6B inference on TensorRT with INT-8 precision

Language: Python - Size: 24.4 KB - Last synced at: 4 days ago - Pushed at: about 2 years ago - Stars: 11 - Forks: 0

JohnClaw/chatllm.vb

VB.NET API wrapper for the LLM inference library chatllm.cpp

Language: Visual Basic .NET - Size: 6.84 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 4 - Forks: 0

akashAD98/yolov7_vino_with_object_tracking

Supports OpenVINO-converted YOLOv7 models (yolov7-int.xml, yolov7x, ...)

Language: Python - Size: 673 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 2

JohnClaw/chatllm.cs

C# API wrapper for the LLM inference library chatllm.cpp

Language: C# - Size: 779 KB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 3 - Forks: 0

daniel-rychlewski/cnn-planesnet

Compressed CNNs for airplane classification in satellite images (APoZ-based parameter pruning, INT8 weight quantization)

Language: Python - Size: 497 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

ENOT-AutoDL/ENOT-transformers

Size: 8.79 KB - Last synced at: 18 days ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 1

yester31/TensorRT_ONNX

Generating a TensorRT model from ONNX

Language: C++ - Size: 91.6 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

whitelok/tensorrt-int8-python-sample

TensorRT Int8 Python version sample.

Language: Python - Size: 1.5 MB - Last synced at: 10 months ago - Pushed at: over 6 years ago - Stars: 14 - Forks: 1

Howell-Yang/onnx2trt

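A record and summary of common problems encountered when deploying models on edge devices, along with their solutions, in the hope that it helps others.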

Language: Python - Size: 258 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

Related Keywords

int8-inference (13), int8-quantization (6), tensorrt (5), quantization (5), python (4), inference (4), pytorch (3), int8 (3), keras (2), tensorflow (2), enot-autodl (2), transformers (2), api-wrapper (2), bindings (2), chatllm (2), cpu-inference (2), gemma (2), ggml (2), llama (2), llm-inference (2), mistral (2), qwen (2), onnx (2), deep-learning (2), openvino (1), yolov7 (1), csharp (1), llm (1), llms (1), batch-size (1), classification (1), convolutional-neural-networks (1), object-detection (1), epoch (1), image-classification (1), kaggle-dataset (1), model-compression (1), neural-network (1), neural-network-compression (1), planes (1), pruning (1), satellite-imagery (1), tensorflow-gpu (1), tflite (1), gpt2 (1), gptj (1), onnxruntime (1), post-training-quantization (1), ptq (1), tensorrt-inference (1), ai (1), machine-learning (1), nvidia (1), tensorrt-int8-python (1), calibrator (1), jetson-tx2 (1), mediapipe (1), mobilenetv2 (1), opencv-dnn (1), portrait-matting (1), portrait-segmentation (1), gpu-delegate (1), tensorflow-lite (1), tensorflowjs (1), unet-image-segmentation (1), edge-ai (1), gpt-j (1), gpt-j-6b (1), deepstream (1), deeplearning (1), deeplab (1), coral-tpu (1), color-harmonization (1), android (1), quantized-neural-networks (1), ncnn (1), deeplearning-ai (1), caffe (1), tensorrt-plugins (1), cuda (1), bevformer (1), yolov8 (1), vb-net (1), vbnet (1), deepsort (1), quantization-aware-training (1)
