An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: inference-api

roboflow/inference

Turn any computer or edge device into a command center for your computer vision projects.

Language: Python - Size: 130 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,762 - Forks: 187

quic/ai-hub-models

The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.

Language: Python - Size: 257 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 725 - Forks: 118

BorjaOteroFerreira/IALab-Suite

Tool for testing different large language models without code.

Language: Python - Size: 30.3 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 18 - Forks: 0

SocAIty/socaity

SDK for generative AI.

Language: Python - Size: 26.7 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

Twixie5/OpenVINO_Asynchronous_API_Performance_Demo

This project demonstrates the high performance of the OpenVINO asynchronous inference API.

Language: Python - Size: 28.5 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

basetenlabs/truss

The simplest way to serve AI/ML models in production

Language: Python - Size: 17.3 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1,015 - Forks: 88

ai4os-hub/diamorph-detection

Diamorph object detection

Language: Dockerfile - Size: 521 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

quic/ai-hub-apps

The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.

Language: Java - Size: 27.9 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 218 - Forks: 52

amotivv-inc/memory-box-inference

Enterprise AI Governance Platform: A secure, multi-tenant infrastructure for LLM deployment with comprehensive authentication, cost management, usage analytics, and semantic memory integration. Transforms stateless LLM calls into an organizational AI brain that gets smarter over time.

Language: Python - Size: 182 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

SearchSavior/OpenArc

Lightweight Inference server for OpenVINO

Language: Python - Size: 2.23 MB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 187 - Forks: 6

BMW-InnovationLab/BMW-TensorFlow-Training-GUI

This repository lets you get started with GUI-based training of a state-of-the-art deep learning model with little to no configuration needed! No-code training with TensorFlow has never been so easy.

Language: Python - Size: 260 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 979 - Forks: 164

inference-gateway/inference-gateway

An open-source, high-performance gateway unifying multiple LLM providers, from local solutions like Ollama to major cloud providers such as OpenAI, Groq, Cohere, Anthropic, Cloudflare and DeepSeek.

Language: Go - Size: 1.43 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 35 - Forks: 1

Trojan3877/Facial-Emotion-Recognition-System

The **Facial Emotion Recognition System** is a robust computer vision pipeline that detects and classifies human emotions (e.g., happy, sad, angry, surprised) from facial images and video streams. It leverages transfer learning with state-of-the-art convolutional neural networks (e.g., ResNet, EfficientNet) in PyTorch, fine-tuned on the FER2013 ben

Language: Python - Size: 21.9 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

RichNasz/vLLMSwiftApp

Contains a simple macOS/iOS chatbot connection to vLLM using llama-stack or OpenAI APIs

Language: Swift - Size: 119 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 2 - Forks: 0

decisionfacts/semantic-ai

An open source framework for a Retrieval-Augmented System (RAG) that uses semantic search to retrieve the expected results and generates human-readable conversational responses with the help of an LLM (Large Language Model).

Language: Python - Size: 4.53 MB - Last synced at: 11 days ago - Pushed at: 12 months ago - Stars: 21 - Forks: 1

almc-c/ai-project

Import of Code repositories for Ryan Day's O'Reilly Book titled Hands-on APIs for AI and Data Science.

Language: Python - Size: 182 KB - Last synced at: 13 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

pandruszkow/whisper-inference-server

A networked inference server for Whisper speech recognition

Language: Python - Size: 6.84 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 8 - Forks: 0

kyryl-opens-ml/ml-in-production-practice

Practice for Machine Learning in Production course

Language: Python - Size: 11.3 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 12 - Forks: 5

jparkerweb/bedrock-proxy-endpoint

🔀 Bedrock Proxy Endpoint ⇢ Spin up your own custom OpenAI API server endpoint for easy AWS Bedrock inference (using standard baseUrl, and apiKey params)

Language: JavaScript - Size: 509 KB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 9 - Forks: 5

PromptOn/prompton

Chat prompt template evaluation and inference monitoring

Language: Python - Size: 16.7 MB - Last synced at: 29 days ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 0

WeOnlyLiveOnce13/Text-Summarizer-Fine-tuning

End-to-end machine learning operations (MLOps) implementation of a text summarizer model, built by fine-tuning a Hugging Face model.

Language: Jupyter Notebook - Size: 161 KB - Last synced at: 23 days ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

shusingh/trip-planner

TripPlanner: AI‑powered travel planner with a Vite + React + TypeScript frontend, Go backend, and Hugging Face integration.

Language: TypeScript - Size: 24.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

pszemraj/textsum

CLI & Python API to easily summarize text-based files with transformers

Language: Python - Size: 70.3 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 129 - Forks: 8

stephanj/Llama3JavaChatCompletionService

Llama3.java inference engine with OpenAI Chat Completion REST API

Language: Java - Size: 146 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 25 - Forks: 2

hupe1980/go-huggingface

🤗 Hugging Face Inference Client written in Go

Language: Go - Size: 64.5 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 48 - Forks: 6

tmcarmichael/fabricai-inference-server

A hackable, modular, containerized inference server for deploying large language models in local or hybrid environments.

Language: Python - Size: 190 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

yas-sim/OpenVINO_Asynchronous_API_Performance_Demo

This project demonstrates the high performance of the OpenVINO asynchronous inference API.

Language: Python - Size: 28.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

BMW-InnovationLab/BMW-Classification-Training-GUI

This repository allows you to get started with training a State-of-the-art Deep Learning model with little to no configuration needed! You provide your labeled dataset and you can start the training right away. You can even test your model with our built-in Inference REST API. Training classification models with GluonCV has never been so easy.

Language: Python - Size: 1.07 GB - Last synced at: 2 days ago - Pushed at: about 3 years ago - Stars: 73 - Forks: 2

Prismadic/magnet

the small distributed language model toolkit; fine-tune state-of-the-art LLMs anywhere, rapidly

Language: Python - Size: 11.8 MB - Last synced at: 13 days ago - Pushed at: 9 months ago - Stars: 31 - Forks: 3

YAV-AI/NodeJS-Stable-Diffusion-XL-Base-1.0-Hugging-Face-Inference-API

A simple node.js example that generates an image using StableDiffusion via Hugging Face Inference API.

Language: JavaScript - Size: 297 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 1

adv-11/Llama1B

Use this app to run any Hugging Face model you want with a Streamlit UI. Adding a model requires only 3 additional lines :D

Language: Python - Size: 33.2 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

rockett-m/Funnel-Summarizer

Uses Cerebras Fast Inference to summarize input of unlimited size with a funnel-and-concat strategy

Language: Python - Size: 1.23 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

intelligencedev/eternal 📦

Eternal is an experimental platform for machine learning models and workflows.

Language: Go - Size: 73.9 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 68 - Forks: 5

idaifish/llama_rs

simple bindings to llama.cpp

Language: Rust - Size: 12.7 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Aparnap2/ai-code-mentor

Language: TypeScript - Size: 9.76 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

jithin8mathew/yolomosaic

A Python library for visualizing YOLO detections and segmented instances on large orthomosaic images, with the ability to generate shapefiles for GIS integration

Language: Python - Size: 45.4 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Kardbord/hfapigo

Unofficial (Golang) Go bindings for the Hugging Face Inference API

Language: Go - Size: 3.35 MB - Last synced at: 9 days ago - Pushed at: about 2 months ago - Stars: 62 - Forks: 5

bilguun0203/buuz-api

Inference API for an automatic buuz counter 🥟

Language: Python - Size: 21.5 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

ededddy/groq-api-rs

A Rust Client For groq Inference API

Language: Rust - Size: 77.1 KB - Last synced at: 13 days ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 3

natnicha/master-thesis-image-classification

This application performs image classification inference and serves as the target machine learning inference application for our research study. It is implemented in Python using FastAPI.

Language: Python - Size: 259 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Santhoshmani1/Inferhub

Text-to-image generation with the Stable Diffusion XL model, powered by the Hugging Face Inference API

Language: JavaScript - Size: 7.71 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

M-YasirGhaffar/flux.1-schnell-ai-image-generator

FLUX.1-schnell is a state-of-the-art image generation tool powered by an AI model from Black Forest Labs, available through Hugging Face’s inference API. Generate high-quality, unique images from text prompts with ease. This application is a MERN stack implementation of the model.

Language: JavaScript - Size: 111 KB - Last synced at: 16 days ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

extrawest/flutter-podcast-to-blog-ai-app

This app is designed to work with PodcastIndex.org. You can listen to podcasts, get a text version, a short summary, and an audio file based on the summary, and chat with an AI about the subject of the podcast.

Language: Dart - Size: 9.19 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

BMW-InnovationLab/BMW-Classification-Inference-GPU-CPU

This is a repository for an image classification inference API using the GluonCV framework. The inference REST API works on CPU and GPU and is supported on Windows and Linux operating systems. Models trained using our GluonCV classification training repository can be deployed in this API. Several models can be loaded and used at the same time.

Language: Python - Size: 3.04 MB - Last synced at: 2 days ago - Pushed at: about 3 years ago - Stars: 49 - Forks: 0

RageAgainstThePixel/com.rest.huggingface

A Non-Official HuggingFace Rest Client for Unity (UPM)

Language: C# - Size: 2.28 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 1

gmkung/Cheemera

A Node.js backend that exposes a TypeScript implementation of the deCheem inference engine.

Language: Python - Size: 7.09 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 2

geniusrise/text

Text components powering LLMs & SLMs for geniusrise framework

Language: Python - Size: 15.6 MB - Last synced at: 18 days ago - Pushed at: 9 months ago - Stars: 5 - Forks: 2

yas-sim/openvino-ep-enabled-onnxruntime

Describing How to Enable OpenVINO Execution Provider for ONNX Runtime

Language: C++ - Size: 18.6 MB - Last synced at: 2 months ago - Pushed at: about 5 years ago - Stars: 19 - Forks: 1

antoninoLorenzo/Ollama-on-Colab-with-ngrok

Notebook to run Ollama on Google Colab

Language: Jupyter Notebook - Size: 5.86 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

defenseunicorns/leapfrogai-api 📦

LeapfrogAI API

Language: Python - Size: 551 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 8 - Forks: 2

sniiz/hf-inferrer

simple node.js huggingface inference package

Language: JavaScript - Size: 4.88 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

lordofthejars-ai/event-driven-ai

Language: HTML - Size: 4.67 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

lordofthejars-ai/fraud-detection-inference

Language: Java - Size: 85.9 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

dimdasci/apartment-price-model

A pricing model that predicts an acceptable per-night price for an Airbnb apartment based on its properties and offered amenities

Language: Jupyter Notebook - Size: 38.6 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

LaurentVeyssier/Dynamic_risk_assessment_system

Project #4 from Udacity's ML DevOps engineer Nanodegree

Language: Jupyter Notebook - Size: 283 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

pcs-work/Healthcare-Demo-API

Healthcare Demo API

Language: Python - Size: 76.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

InquestGeronimo/superlaser

MLOps library for LLM deployment w/ the vLLM engine on RunPod's infra.

Language: Python - Size: 535 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

pchandrasekaran1595/Depth-Inference-API

Depth Inference API built using Sanic and the MiDaS v2.7 Small Model in ONNX format

Language: Python - Size: 57.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

pchandrasekaran1595/BG-Remove-API

Background Removal and Replacement API built using the Sanic Framework

Language: Python - Size: 3.97 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

obeidahmad/Video-Inference-API

This is a repository for a semantic segmentation inference API of a video using the BMW-IntelOpenVINO-Segmentation-Inference-API.

Language: Python - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

SaeedNajafi/infer-pytorch-pyspark

Coupling PySpark with PyTorch Models

Size: 130 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 11 - Forks: 6

pchandrasekaran1595/Computer-Vision-API-V2

Computer Vision API V2 - FastAPI & ONNX Models

Language: Python - Size: 143 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 1

flkapes/deepray

DeepRay: Elevating X-Ray Diagnosis with AI-Powered Bone Image Classification

Language: Python - Size: 99.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Michael-OvO/Yolov7-Flask

A Beautiful Flask Web API for Yolov7 (and custom) models

Language: Python - Size: 1.59 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 202 - Forks: 11

mustafamerttunali/deep-learning-training-gui

Train models and run predictions on pre-trained deep learning models through a GUI (web app). No more juggling many parameters, no more data preprocessing.

Language: Python - Size: 393 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 120 - Forks: 31

shivamMg/stable-diffusion-on-azureml

REST APIs for StableDiffusion. Inferencing support on AzureML

Language: Jupyter Notebook - Size: 1.34 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 11 - Forks: 4

TimMikeladze/huggingface 📦

TypeScript wrapper for the Hugging Face Inference API.

Language: TypeScript - Size: 1.22 MB - Last synced at: 5 days ago - Pushed at: over 2 years ago - Stars: 27 - Forks: 2

peb-peb/shravan

Unlocking Value from Customer Call Data

Language: Python - Size: 41 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

ChildishGirl/monitoring-lambda-ml-inference

Monitor Lambda ML inference with CloudWatch Dashboard using AWS CDK (Python)

Language: Python - Size: 265 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

TeimasTeimoso/ADPT-BE

Backend inference server for ADPT-AI

Language: Python - Size: 38.1 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

deryann/diffod

Diff tool to check object detection result.

Language: Python - Size: 1.44 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

jegun19/try-me

Machine learning preview platform for everyone

Language: Vue - Size: 1.46 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

pchandrasekaran1595/Yolo-Inference-API

Yolo Inference API built using FastAPI

Language: Python - Size: 211 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

scattering-ai/MLDrop

MLDrop model serving for PyTorch

Size: 33.2 KB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

helloerikaaa/InferenceAPI

Inference API for a machine learning model

Language: Python - Size: 29.3 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

pchandrasekaran1595/Computer-Vision-API 📦

Computer Vision API built using FastAPI and pretrained models converted to ONNX format

Language: Python - Size: 39.9 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

pcs-work/Computer-Vision-API-V2 (fork of pchandrasekaran1595/Computer-Vision-API-V2)

Computer Vision API V2 - FastAPI & ONNX Models

Language: Python - Size: 143 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

pchandrasekaran1595/BGRemove-API

Background Removal and Replacement API

Language: Python - Size: 16.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0