Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: triton-inference-server
triton-inference-server/onnxruntime_backend
The Triton backend for the ONNX Runtime.
Language: C++ - Size: 253 KB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 114 - Forks: 53
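As a rough illustration of what deploying a model on this backend involves, a minimal `config.pbtxt` might look like the following (the model name, tensor names, and shapes here are hypothetical, not taken from the repository):

```protobuf
name: "densenet_onnx"
backend: "onnxruntime"
max_batch_size: 8
input [
  {
    name: "data_0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "fc6_1"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

Triton selects the backend from the `backend` field and loads the ONNX file from the model's version directory in the model repository.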
torchpipe/torchpipe
An alternative to Triton Inference Server. Boosts DL service throughput 1.5-4x with ensemble pipeline serving and concurrent CUDA streams, supporting PyTorch/LibTorch frontends and TensorRT/CVCUDA (among other) backends
Language: C++ - Size: 39.4 MB - Last synced: about 21 hours ago - Pushed: about 21 hours ago - Stars: 127 - Forks: 12
YeonwooSung/MLOps
Miscellaneous codes and writings for MLOps
Language: Jupyter Notebook - Size: 455 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 4 - Forks: 0
NVIDIA/GenerativeAIExamples
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
Language: Python - Size: 22.1 MB - Last synced: 7 days ago - Pushed: 12 days ago - Stars: 1,586 - Forks: 252
Zerohertz/triton-inference-server
Triton Inference Server Template
Language: Python - Size: 5.86 KB - Last synced: 7 days ago - Pushed: 8 days ago - Stars: 1 - Forks: 0
allegroai/clearml-serving
ClearML - Model-Serving Orchestration and Repository Solution
Language: Python - Size: 1.91 MB - Last synced: 6 days ago - Pushed: 20 days ago - Stars: 126 - Forks: 40
ConnorSouthEngineering/MVision
This repository contains the content for a proof of concept implementation of computer vision systems in industry. The project explores scalability and performance using the NVIDIA ecosystem, aiming to create an example scaffold for implementing a system accessible to non-technical users.
Language: TypeScript - Size: 13.6 MB - Last synced: 20 days ago - Pushed: 21 days ago - Stars: 0 - Forks: 0
fversaci/cassandra-dali-plugin
Cassandra plugin for NVIDIA DALI
Language: C++ - Size: 506 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 2
rtzr/tritony
Tiny configuration for Triton Inference Server
Language: Python - Size: 70.3 KB - Last synced: 7 days ago - Pushed: 5 months ago - Stars: 38 - Forks: 1
npuichigo/openai_trtllm
OpenAI-compatible API for the TensorRT-LLM Triton backend
Language: Rust - Size: 1.34 MB - Last synced: 26 days ago - Pushed: 27 days ago - Stars: 78 - Forks: 16
notAI-tech/fastDeploy
Deploy DL/ML inference pipelines with minimal extra code.
Language: Python - Size: 15.7 MB - Last synced: about 16 hours ago - Pushed: 30 days ago - Stars: 93 - Forks: 18
NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference
Hardware-accelerated DNN model inference ROS 2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU
Language: C++ - Size: 297 KB - Last synced: 27 days ago - Pushed: 6 months ago - Stars: 97 - Forks: 14
rungrodkspeed/resnet50_optimization
Language: Python - Size: 2.54 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0
RajeshThallam/fastertransformer-converter
A code sample for serving Large Language Models (LLMs) on a Google Kubernetes Engine (GKE) cluster with GPUs, running NVIDIA Triton Inference Server with the FasterTransformer backend.
Language: Python - Size: 139 KB - Last synced: about 1 month ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
TunggTungg/image_retrieval
An image retrieval system that utilizes deep learning ResNet for feature extraction, Local Optimized Product Quantization techniques for storage and retrieval, and efficient deployment using Nvidia technologies like TensorRT and Triton Server, all accessible through a FastAPI-powered web API.
Language: Jupyter Notebook - Size: 1.23 GB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 3 - Forks: 0
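Product quantization, which this repository uses for storage and retrieval, compresses embeddings by splitting each vector into subvectors and storing only the index of the nearest centroid in each subspace. A minimal NumPy sketch (random toy codebooks stand in for centroids that would normally come from k-means training):

```python
import numpy as np

def pq_encode(vectors, codebooks):
    """Encode vectors as the nearest-centroid index in each subspace.

    vectors:   (n, d) float array
    codebooks: (m, k, d//m) — k centroids for each of m subspaces
    returns:   (n, m) uint8 codes
    """
    m, k, sub = codebooks.shape
    n, d = vectors.shape
    parts = vectors.reshape(n, m, sub)  # split each vector into m subvectors
    codes = np.empty((n, m), dtype=np.uint8)
    for j in range(m):
        # squared distance from every subvector to every centroid in subspace j
        dist = ((parts[:, j, None, :] - codebooks[j][None]) ** 2).sum(-1)
        codes[:, j] = dist.argmin(axis=1)
    return codes

def pq_decode(codes, codebooks):
    """Reconstruct approximate vectors by concatenating chosen centroids."""
    n, m = codes.shape
    sub = codebooks.shape[2]
    out = np.empty((n, m * sub), dtype=codebooks.dtype)
    for j in range(m):
        out[:, j * sub:(j + 1) * sub] = codebooks[j][codes[:, j]]
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 16)).astype(np.float32)
# toy codebooks: 4 subspaces, 8 centroids each (in practice, trained by k-means)
books = rng.normal(size=(4, 8, 4)).astype(np.float32)
codes = pq_encode(x, books)    # 16 floats compressed to 4 bytes per vector
approx = pq_decode(codes, books)
```

At query time, libraries build per-subspace distance lookup tables instead of decoding, which is where the retrieval speedup comes from.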
ybai789/yolov8-triton-tensorrt
Provides an ensemble model to deploy a YOLOv8 TensorRT model to Triton
Language: Python - Size: 20.1 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0
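YOLO ensemble deployments like this one typically run non-maximum suppression in the post-processing model. A minimal NumPy sketch of greedy IoU-based NMS (the threshold and box layout are illustrative, not this repository's exact code):

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.45):
    """Greedy non-maximum suppression.
    boxes: (n, 4) as [x1, y1, x2, y2]; scores: (n,). Returns kept indices."""
    order = scores.argsort()[::-1]  # process highest-scoring boxes first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        if order.size == 1:
            break
        rest = order[1:]
        # intersection of box i with all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thr]  # drop boxes overlapping the kept one
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # the overlapping second box is suppressed: [0, 2]
```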
PD-Mera/triton-basic
An easy classification example to explain how Triton works
Language: Python - Size: 0 Bytes - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0
CoinCheung/BiSeNet
Add bisenetv2. My implementation of BiSeNet
Language: Python - Size: 3.59 MB - Last synced: about 2 months ago - Pushed: over 1 year ago - Stars: 1,345 - Forks: 299
TunggTungg/Celebrity-Look-Alike
An innovative project designed to provide users with an entertaining and engaging experience by comparing their facial features to those of celebrities.
Language: Python - Size: 149 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 1 - Forks: 0
howsmyanimeprofilepicture/trt-diffusion-tutorial-kr
Accelerating Stable Diffusion with TensorRT
Language: Jupyter Notebook - Size: 3.02 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0
olibartfast/computer-vision-triton-cpp-client
C++ application to perform computer vision tasks using Nvidia Triton Server for model inference
Language: C++ - Size: 1.44 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 8 - Forks: 1
trinhtuanvubk/Diff-VC
Diffusion Model for Voice Conversion
Language: Jupyter Notebook - Size: 35.4 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 18 - Forks: 5
haritsahm/cpp-ml-server
Web Services for Machine Learning in C++
Language: C++ - Size: 188 KB - Last synced: 9 days ago - Pushed: about 1 year ago - Stars: 2 - Forks: 1
chiehpower/Setup-deeplearning-tools
Set up CI with DL/CUDA/cuDNN/TensorRT/onnx2trt/onnxruntime/onnxsim/PyTorch/Triton-Inference-Server/Bazel/Tesseract/PaddleOCR/NVIDIA-docker/MinIO/Supervisord on AGX or PC from scratch.
Language: Python - Size: 4.7 MB - Last synced: about 1 month ago - Pushed: 8 months ago - Stars: 44 - Forks: 6
levipereira/deepstream-yolo-triton-server-rtsp-out
A DeepStream/Triton-Server sample application that uses the yolov7, yolov7-qat, and yolov9 models to perform inference on video files or RTSP streams.
Language: Python - Size: 39.1 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0
vadimkantorov/tritoninfererenceserverstringprocprimer
Example string processing pipeline on Triton Inference Server
Language: Python - Size: 64.5 KB - Last synced: about 1 month ago - Pushed: 5 months ago - Stars: 1 - Forks: 0
Zerohertz/YOLO-Serving-Cookbook
📸 YOLO Serving Cookbook based on Triton Inference Server 📸
Language: Python - Size: 1.49 MB - Last synced: 7 days ago - Pushed: 8 days ago - Stars: 3 - Forks: 1
yas-sim/openvino-model-server-wrapper
Python wrapper class for OpenVINO Model Server. Users can submit inference requests to OVMS with just a few lines of code.
Language: Python - Size: 24.5 MB - Last synced: about 1 month ago - Pushed: over 2 years ago - Stars: 8 - Forks: 1
isarsoft/yolov4-triton-tensorrt
This repository deploys YOLOv4 as an optimized TensorRT engine to Triton Inference Server
Language: C++ - Size: 629 KB - Last synced: 3 months ago - Pushed: almost 2 years ago - Stars: 271 - Forks: 62
vectornguyen76/search-engine-system
Search engine for Shopee with image search, full-text search, and auto-complete
Language: Jupyter Notebook - Size: 105 MB - Last synced: 24 days ago - Pushed: 4 months ago - Stars: 1 - Forks: 0
bug-developer021/YOLOV5_optimization_on_triton
Compare multiple optimization methods on Triton to improve model service performance
Language: Jupyter Notebook - Size: 2.13 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 41 - Forks: 9
Eilliw/trash-classification-public
Custom Yolov8x-cls edge model deployment and training to classify trash vs recycling.
Language: Python - Size: 234 MB - Last synced: 3 days ago - Pushed: 4 months ago - Stars: 1 - Forks: 0
duydvu/triton-inference-server-web-ui
Triton Inference Server Web UI
Language: TypeScript - Size: 210 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0
kamalkraj/stable-diffusion-tritonserver
Deploy the Stable Diffusion model with ONNX/TensorRT + Triton Server
Language: Jupyter Notebook - Size: 2.62 MB - Last synced: 7 months ago - Pushed: 9 months ago - Stars: 91 - Forks: 21
Bobo-y/triton_ensemble_model_demo
triton server ensemble model demo
Language: Python - Size: 109 KB - Last synced: 7 months ago - Pushed: about 2 years ago - Stars: 28 - Forks: 8
omarabid59/yolov8-triton
Provides an ensemble model to deploy a YoloV8 ONNX model to Triton
Language: Python - Size: 11.7 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 11 - Forks: 4
viamrobotics/viam-mlmodelservice-triton
MLModelService wrapping Nvidia's Triton Server
Language: C++ - Size: 60.5 KB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 4 - Forks: 3
dev6699/yolotriton
Go gRPC client for YOLO-NAS, YOLOv8 inference using the Triton Inference Server.
Language: Go - Size: 13.6 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 2 - Forks: 0
akiragy/recsys_pipeline
Build Recommender System with PyTorch + Redis + Elasticsearch + Feast + Triton + Flask. Vector Recall, DeepFM Ranking and Web Application.
Language: Python - Size: 26 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 16 - Forks: 3
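The vector-recall stage of a pipeline like this reduces to top-k nearest-neighbor search over item embeddings. A minimal cosine-similarity sketch in NumPy (the sizes and data are illustrative; a real system would use an ANN index such as Elasticsearch or Faiss):

```python
import numpy as np

def recall_topk(query, items, k=3):
    """Cosine-similarity top-k retrieval over item embeddings.
    query: (d,); items: (n, d). Returns indices of the k most similar items."""
    q = query / np.linalg.norm(query)
    it = items / np.linalg.norm(items, axis=1, keepdims=True)
    sims = it @ q                     # cosine similarity to every item
    return np.argsort(-sims)[:k]      # indices sorted by descending similarity

rng = np.random.default_rng(1)
items = rng.normal(size=(1000, 32))
query = items[42] + 0.01 * rng.normal(size=32)  # near-duplicate of item 42
top = recall_topk(query, items, k=5)
print(top[0])  # item 42 ranks first
```

The recalled candidates would then be re-scored by the ranking model (DeepFM in this repository's case).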
dudeperf3ct/end-to-end-images
This repo contains code for training and deploying PyTorch models for image applications in an end-to-end fashion.
Language: Jupyter Notebook - Size: 78.8 MB - Last synced: 9 months ago - Pushed: over 2 years ago - Stars: 2 - Forks: 0
Achiwilms/NVIDIA-Triton-Deployment-Quickstart
QuickStart Guide for Deploying a Basic ResNet Model on the Triton Inference Server
Language: Python - Size: 5.86 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0
swapkh91/detectron2-to-tensorrt
Notebook with commands to convert a Detectron2 MaskRCNN model to TensorRT
Language: Jupyter Notebook - Size: 3.91 KB - Last synced: 10 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
RedisVentures/redis-feast-gcp
A demo of Redis Enterprise as the Online Feature Store deployed on GCP with Feast and NVIDIA Triton Inference Server.
Language: Jupyter Notebook - Size: 5.38 MB - Last synced: 10 months ago - Pushed: about 1 year ago - Stars: 10 - Forks: 2
eitansela/sagemaker-mme-gpu-triton-java-client
Run Multiple Models on the Same GPU with Amazon SageMaker Multi-Model Endpoints Powered by NVIDIA Triton Inference Server. A Java client is also provided.
Language: Java - Size: 376 KB - Last synced: 10 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
trinhtuanvubk/Wav2Vec2-Triton-Serving
Serve Wav2Vec2 model using Triton Inference Server
Language: Python - Size: 623 KB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0
dpressel/reserve
FastAPI + WebSockets + SSE service to interface with Triton/Riva ASR
Language: Python - Size: 11.7 KB - Last synced: 9 months ago - Pushed: almost 2 years ago - Stars: 7 - Forks: 1
Team-BoonMoSa/Amazon-EC2-Inf1 📦
Serving YOLOv5 Segmentation Model with Amazon EC2 Inf1
Language: Python - Size: 46.9 KB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0
tamanna18/Triton-Inference-Server-Deployment-with-ONNX-Models
Triton Inference Server Deployment with ONNX Models
Size: 9.77 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0
cmpark0126/trytune-py
Heterogeneous System ML Pipeline Scheduling Framework with Triton Inference Server as Backend
Language: Python - Size: 909 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0
rushai-dev/triton-server-ensemble-sidecar
The Triton API can be awkward for clients to call directly, whether over REST or gRPC. For clients that need to customize the request body, this repository offers a sidecar alongside a REST API and the Triton client on Kubernetes.
Language: Python - Size: 23.4 KB - Last synced: 11 months ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0
suryanshgupta9933/Scene-Script
An image-to-text model/pipeline using ViT and Transformers, deployed with NVIDIA's PyTriton and a Streamlit app.
Language: Python - Size: 3.25 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
rushai-dev/licence-plate-triton-server-ensemble
A Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python. The repository uses a stack including YOLOv8, ONNX, EasyOCR, Triton Inference Server, CV2, MinIO, Docker, and K8s, all deployed on a K80 GPU with CUDA 11.4.
Language: Python - Size: 41 KB - Last synced: 11 months ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0
smarter-project/armnn_tflite_backend
TensorFlow Lite backend with ArmNN delegate support for Nvidia Triton
Language: C++ - Size: 15.3 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 5 - Forks: 2
octoml/TransparentAI
An example of building your own ML cloud app using OctoML.
Language: Python - Size: 9.82 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 13 - Forks: 0
tonhathuy/tensorrt-triton-magface
MagFace Triton Inference Server using TensorRT
Language: Jupyter Notebook - Size: 722 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 10 - Forks: 2
tuxedocat/triton-client-polyglot-example
Examples of generating Triton Inference Server clients for several programming languages
Language: TypeScript - Size: 471 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 2
isarsoft/csrnet-triton-tensorrt
This repository deploys CSRNet as an optimized TensorRT engine to Triton Inference Server
Language: C++ - Size: 16.6 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 5 - Forks: 0
detail-novelist/novelist-triton-server
Deploy KoGPT with Triton Inference Server
Language: Shell - Size: 7.81 KB - Last synced: 12 months ago - Pushed: over 1 year ago - Stars: 13 - Forks: 0
Lapland-UAS-Tequ/tequ-setup-triton-inference-server
Configure NVIDIA Triton Inference Server on different platforms. Deploy an object detection model in TensorFlow SavedModel format to the server, then send images to the server for inference with Node-RED; the Triton Inference Server HTTP API is used for inference.
Size: 647 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 3 - Forks: 1
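Triton's HTTP endpoint follows the KServe v2 inference protocol: a JSON body with named, shaped, typed input tensors is POSTed to `/v2/models/<model>/infer`. A small sketch of building such a request body (the input name and the helper function are placeholders, not part of this repository):

```python
import json

def build_infer_request(input_name, data, datatype="FP32"):
    """Build a KServe/Triton v2 HTTP inference request body.
    data: nested list; shape is inferred from the (rectangular) nesting."""
    shape = []
    d = data
    while isinstance(d, list):  # walk the nesting to recover the tensor shape
        shape.append(len(d))
        d = d[0]
    return json.dumps({
        "inputs": [
            {"name": input_name, "shape": shape,
             "datatype": datatype, "data": data}
        ]
    })

# this body would be POSTed to http://<host>:8000/v2/models/<model>/infer
body = build_infer_request("input_tensor", [[1.0, 2.0, 3.0]])
print(body)
```

The server replies with a matching `outputs` array; binary tensor payloads are also supported via an extension for large inputs.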
k9ele7en/Triton-TensorRT-Inference-CRAFT-pytorch
Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch), including a converter from PyTorch -> ONNX -> TensorRT and inference pipelines (TensorRT, Triton server, multi-format). Supported model formats for Triton inference: TensorRT engine, TorchScript, ONNX
Language: Python - Size: 15.5 MB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 25 - Forks: 6
Alek-dr/FastAPI-TrironServer-example
Language: Python - Size: 96.7 KB - Last synced: 19 days ago - Pushed: over 1 year ago - Stars: 2 - Forks: 0
gianpd/triton-inference
Language: Python - Size: 12.8 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
Biano-AI/serving-compare-middleware
FastAPI middleware for comparing different ML model serving approaches
Language: Python - Size: 268 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 12 - Forks: 0
niyazed/triton-mnist-example
MNIST inference example on NVIDIA Triton Inference Server
Language: PureBasic - Size: 566 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 1
oneonlee/Building-Transformer-Based-NLP-Applications
Repository for the NVIDIA DLI workshop "Building Transformer-Based NLP Applications"
Language: Jupyter Notebook - Size: 24.6 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 1
LeslieZhoa/Triton-Torch-Custom
Triton-Pytorch Custom operator tutorial
Language: Python - Size: 457 KB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 3 - Forks: 0
neuro-inc/mlops-pytorch-mlflow-triton
Example of deployment Pytorch model into the Triton inference server via MLFlow model registry
Language: Jupyter Notebook - Size: 24.4 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 2
octoml/ariel
A library for interfacing with Triton.
Language: Python - Size: 207 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0