GitHub topics: local-inference
Raxephion/AuraGen-AuraFlow-WebUI
Lightweight 6GB VRAM Gradio web app with auto-installer for running AuraFlow locally — no cloud, no clutter.
Language: Python - Size: 6.12 MB - Last synced at: about 5 hours ago - Pushed at: 10 days ago - Stars: 3 - Forks: 0

SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving for Local Deployment
Language: C++ - Size: 11.1 MB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 8,217 - Forks: 432

aTh1ef/ai-debate-agents
Verify claims using AI agents that debate using scraped evidence and local language models.
Language: Python - Size: 52.7 KB - Last synced at: 15 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

yas-sim/openvino-llm-chatbot-rag
LLM chatbot example using OpenVINO with RAG (Retrieval Augmented Generation).
Language: Python - Size: 163 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 6

efeslab/fiddler
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
Language: Python - Size: 1.72 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 210 - Forks: 20

tinyBigGAMES/JetInfero
Local LLM Inference Library
Language: Pascal - Size: 10.2 MB - Last synced at: 9 days ago - Pushed at: 5 months ago - Stars: 12 - Forks: 3

BorjaOteroFerreira/IALab-Suite
Tool for testing different large language models without code.
Language: JavaScript - Size: 28.5 MB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 18 - Forks: 0

cuiyuheng/nexa-sdk (fork of NexaAI/nexa-sdk)
Nexa SDK is a comprehensive toolkit for ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), automatic speech recognition (ASR), and text-to-speech (TTS).
Size: 192 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

nazago/meeting-minutes-generator
Script that takes a .wav audio file, transcribes it with OpenAI/Whisper, and then uses Llama3 to generate a summary and action points from the transcript.
Language: Python - Size: 6.84 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0
