Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / ksm26 / Efficiently-Serving-LLMs
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low-Rank Adaptation (LoRA), and gain hands-on experience with Predibase’s LoRAX inference server.
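KV caching, one of the optimization techniques the course covers, avoids recomputing attention keys and values for past tokens during autoregressive decoding: each step appends its key/value vectors to a cache and attends over everything stored so far. Below is a minimal single-head sketch of the idea in NumPy; the `KVCache` class and `attend` helper are illustrative names, not part of the course materials or the LoRAX API.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class KVCache:
    """Append-only store of key/value vectors for one attention head."""
    def __init__(self, d_model):
        self.keys = np.empty((0, d_model))
        self.values = np.empty((0, d_model))

    def append(self, k, v):
        # Cache this step's key/value so later steps never recompute them.
        self.keys = np.vstack([self.keys, k])
        self.values = np.vstack([self.values, v])

def attend(q, cache):
    # Scaled dot-product attention of the current query over all cached
    # positions; only the new token's K/V were computed this step.
    scores = cache.keys @ q / np.sqrt(q.shape[-1])
    weights = softmax(scores)
    return weights @ cache.values
```

Per decoded token this turns the O(T) key/value recomputation into a single append, at the cost of holding the cache in accelerator memory, which is why serving frameworks budget KV-cache space per request.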
JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ksm26%2FEfficiently-Serving-LLMs
Stars: 1
Forks: 0
Open Issues: 0
License: None
Language: Jupyter Notebook
Repo Size: 2.34 MB
Dependencies: 0
Created: about 2 months ago
Updated: about 2 months ago
Last pushed: about 1 month ago
Last synced: about 1 month ago
Topics: batch-processing, deep-learning-techniques, inference-optimization, large-scale-deployment, machine-learning-operations, model-acceleration, model-inference-service, model-serving, optimization-techniques, performance-enhancement, scalability-strategies, server-optimization, serving-infrastructure, text-generation