GitHub topics: text-generation-inference
InftyAI/llmaz
Easy, advanced inference platform for large language models on Kubernetes. Star to support our work!
Language: Go - Size: 6.46 MB - Last synced at: about 8 hours ago - Pushed at: about 9 hours ago - Stars: 125 - Forks: 20

huggingface/optimum-benchmark
A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes.
Language: Python - Size: 8.23 MB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 295 - Forks: 57

Mikesterner87/Nano-R1
This project demonstrates fine-tuning the Qwen2.5-3B-Instruct model using GRPO (Group Relative Policy Optimization) on the GSM8K dataset.
Language: Jupyter Notebook - Size: 109 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

aisingapore/sealion-tgi
Serve the AI Singapore SEA-LION model with TGI
Language: Shell - Size: 6.84 KB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 2 - Forks: 0

Akshint0407/Nano-R1
This project demonstrates fine-tuning the Qwen2.5-3B-Instruct model using GRPO (Group Relative Policy Optimization) on the GSM8K dataset.
Language: Jupyter Notebook - Size: 769 KB - Last synced at: 12 days ago - Pushed at: 15 days ago - Stars: 1 - Forks: 0

HyperBlaze456/risu-backend-python
RisuAI backend written in Python only. TextGen works; memory-related updates are still needed.
Language: Python - Size: 95.6 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 3 - Forks: 1

aws-samples/amazon-sagemaker-llama2-response-streaming-recipes
Amazon SageMaker Llama 2 Inference via Response Streaming
Language: Jupyter Notebook - Size: 565 KB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 13 - Forks: 4

magichub-opensource/CLAM-Conversational-Language-AI-from-MagicData
This repo introduces MagicData-CLAM, a Chinese SFT dataset, and provides the community with two related models that we fine-tuned. Contact [email protected] for more information.
Language: Python - Size: 20.5 KB - Last synced at: 9 months ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 0

k8rw/openapi-wiki
Collection of Cloud Native and AI related OpenAPIs.
Language: Dockerfile - Size: 71.3 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

pyouthful/openapi-wiki
Let's collect useful APIs
Size: 45.9 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

yjg30737/pyqt-text-generation-inference-gui
GUI version of text-generation-inference
Language: Python - Size: 14.6 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0
