An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: model-inference-service

bentoml/BentoML

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Language: Python - Size: 95.3 MB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 7,635 - Forks: 834

bentoml/transformers-nlp-service

Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more

Language: Python - Size: 6.3 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 44 - Forks: 3

bentoml/CLIP-API-service

CLIP as a service - Embed images and sentences, object recognition, visual reasoning, image classification and reverse image search

Language: Jupyter Notebook - Size: 945 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 60 - Forks: 4

ksm26/Efficiently-Serving-LLMs

Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.

Language: Jupyter Notebook - Size: 2.34 MB - Last synced at: 24 days ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 3