GitHub topics: model-inference-service
bentoml/BentoML
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
Language: Python - Size: 95.3 MB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 7,635 - Forks: 834
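BentoML's headline feature is turning a model into an inference API. A minimal sketch of that pattern, using only the Python standard library (this is not BentoML's own API — a real BentoML service would use its decorators and load actual model weights; `predict` here is a placeholder):

```python
# Stdlib-only sketch of the model-inference-API pattern that frameworks
# like BentoML productionize: accept JSON over HTTP, run a model, return
# JSON. The "model" is a placeholder function, not a real model.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(text: str) -> dict:
    # Placeholder model: returns the input length and a trivial score.
    return {"length": len(text), "score": len(text) % 5 / 4}

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        payload = json.loads(body or b"{}")
        data = json.dumps(predict(payload.get("text", ""))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # keep the demo quiet

# To serve (blocks):
#   HTTPServer(("127.0.0.1", 8000), InferenceHandler).serve_forever()
```

A client would then POST `{"text": "..."}` and get JSON back; BentoML adds on top of this pattern batching, packaging, and deployment.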

bentoml/transformers-nlp-service
Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more
Language: Python - Size: 6.3 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 44 - Forks: 3

bentoml/CLIP-API-service
CLIP as a service - Embed images and sentences, object recognition, visual reasoning, image classification, and reverse image search
Language: Jupyter Notebook - Size: 945 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 60 - Forks: 4
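The reverse-image-search use case boils down to nearest-neighbor lookup over embeddings. A toy sketch of that retrieval step (the hard-coded 4-d vectors and image names are stand-ins for real CLIP embeddings, which are produced by the model and have hundreds of dimensions):

```python
# Toy embedding-based reverse search: rank stored "images" by cosine
# similarity to a query embedding. Vectors and names are made up; a CLIP
# service would compute real embeddings for images and text.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

index = {  # imaginary image id -> embedding
    "cat.jpg": [0.9, 0.1, 0.0, 0.1],
    "dog.jpg": [0.1, 0.9, 0.1, 0.0],
    "car.jpg": [0.0, 0.1, 0.9, 0.2],
}

def nearest(query, index, k=2):
    # Most similar stored embeddings first.
    ranked = sorted(index, key=lambda name: cosine(query, index[name]),
                    reverse=True)
    return ranked[:k]
```

Because CLIP puts images and text in the same embedding space, the same `nearest` lookup serves both image-to-image and text-to-image search.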

ksm26/Efficiently-Serving-LLMs
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low-Rank Adapters (LoRA), and gain hands-on experience with Predibase's LoRAX inference server.
Language: Jupyter Notebook - Size: 2.34 MB - Last synced at: 24 days ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 3
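The KV-caching idea the course covers can be shown in a toy form: during autoregressive decoding, key/value projections for past tokens are computed once and reused, so each step only projects the new token. This sketch is not the course's code; the "projections" are arbitrary scalings standing in for real attention math:

```python
# Toy KV cache: per-token key/value projections are stored across decode
# steps instead of being recomputed. The 0.5x/2.0x scalings are stand-ins
# for real learned K/V projection matrices.
class KVCache:
    def __init__(self):
        self.keys, self.values = [], []
        self.kv_projections = 0  # how many K/V projections were computed

    def step(self, token_embedding):
        # Project K/V for the NEW token only; earlier entries are reused.
        self.kv_projections += 1
        self.keys.append([0.5 * x for x in token_embedding])    # toy K proj
        self.values.append([2.0 * x for x in token_embedding])  # toy V proj
        return self.keys, self.values

def kv_work_without_cache(num_steps):
    # Without a cache, step i re-projects K/V for all i tokens seen so
    # far, so total work grows quadratically: 1 + 2 + ... + n.
    return sum(range(1, num_steps + 1))
```

For a 4-token generation the cache does 4 projections versus 10 without it; the gap widens quadratically with sequence length, which is why KV caching is a standard serving optimization.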
