Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / ksm26 / Efficiently-Serving-LLMs
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low-Rank Adaptation (LoRA), and gain hands-on experience with Predibase’s LoRAX inference server.
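KV caching, one of the optimization techniques the course covers, avoids recomputing attention keys and values for past tokens during autoregressive decoding: each step appends its key/value vectors to a cache and attends over everything stored so far. Below is a minimal single-head sketch of the idea in NumPy; the `KVCache` class and `attend` helper are illustrative names, not part of the course materials or the LoRAX API.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class KVCache:
    """Append-only store of key/value vectors for one attention head."""
    def __init__(self, d_model):
        self.keys = np.empty((0, d_model))
        self.values = np.empty((0, d_model))

    def append(self, k, v):
        # Cache this step's key/value so later steps never recompute them.
        self.keys = np.vstack([self.keys, k])
        self.values = np.vstack([self.values, v])

def attend(q, cache):
    # Scaled dot-product attention of the current query over all cached
    # positions; only the new token's K/V were computed this step.
    scores = cache.keys @ q / np.sqrt(q.shape[-1])
    weights = softmax(scores)
    return weights @ cache.values
```

Per decoded token this turns the O(T) key/value recomputation into a single append, at the cost of holding the cache in accelerator memory, which is why serving frameworks budget KV-cache space per request.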
JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ksm26%2FEfficiently-Serving-LLMs
Stars: 1
Forks: 0
Open Issues: 0
License: None
Language: Jupyter Notebook
Repo Size: 2.34 MB
Dependencies: 0
Created: about 2 months ago
Updated: about 2 months ago
Last pushed: about 1 month ago
Last synced: about 1 month ago
Topics: batch-processing, deep-learning-techniques, inference-optimization, large-scale-deployment, machine-learning-operations, model-acceleration, model-inference-service, model-serving, optimization-techniques, performance-enhancement, scalability-strategies, server-optimization, serving-infrastructure, text-generation