An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: local-inference

Raxephion/AuraGen-AuraFlow-WebUI

Lightweight 6GB VRAM Gradio web app with auto-installer for running AuraFlow locally — no cloud, no clutter.

Language: Python - Size: 6.12 MB - Last synced at: about 5 hours ago - Pushed at: 10 days ago - Stars: 3 - Forks: 0

SJTU-IPADS/PowerInfer

High-speed Large Language Model Serving for Local Deployment

Language: C++ - Size: 11.1 MB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 8,217 - Forks: 432

aTh1ef/ai-debate-agents

Verify claims using AI agents that debate using scraped evidence and local language models.

Language: Python - Size: 52.7 KB - Last synced at: 15 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

yas-sim/openvino-llm-chatbot-rag

LLM chatbot example using OpenVINO with RAG (Retrieval Augmented Generation).

Language: Python - Size: 163 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 6

efeslab/fiddler

[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration

Language: Python - Size: 1.72 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 210 - Forks: 20

tinyBigGAMES/JetInfero

Local LLM Inference Library

Language: Pascal - Size: 10.2 MB - Last synced at: 9 days ago - Pushed at: 5 months ago - Stars: 12 - Forks: 3

BorjaOteroFerreira/IALab-Suite

Tool for testing different large language models without writing code.

Language: JavaScript - Size: 28.5 MB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 18 - Forks: 0

cuiyuheng/nexa-sdk (fork of NexaAI/nexa-sdk)

Nexa SDK is a comprehensive toolkit for ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), automatic speech recognition (ASR), and text-to-speech (TTS).

Size: 192 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

nazago/meeting-minutes-generator

Script that takes a .wav audio file, transcribes it with OpenAI Whisper, and then uses Llama3 to generate a summary and action points from the transcript.

Language: Python - Size: 6.84 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0