GitHub topics: local-inference

Repositories

Raxephion/AuraGen-AuraFlow-WebUI

Lightweight 6GB VRAM Gradio web app with auto-installer for running AuraFlow locally — no cloud, no clutter.

Language: Python - Size: 6.12 MB - Last synced at: about 5 hours ago - Pushed at: 10 days ago - Stars: 3 - Forks: 0

SJTU-IPADS/PowerInfer

High-speed Large Language Model Serving for Local Deployment

Language: C++ - Size: 11.1 MB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 8,217 - Forks: 432

aTh1ef/ai-debate-agents

Verify claims using AI agents that debate using scraped evidence and local language models.

Language: Python - Size: 52.7 KB - Last synced at: 15 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

yas-sim/openvino-llm-chatbot-rag

LLM chatbot example using OpenVINO with RAG (Retrieval Augmented Generation).

Language: Python - Size: 163 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 6

efeslab/fiddler

[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration

Language: Python - Size: 1.72 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 210 - Forks: 20

tinyBigGAMES/JetInfero

Local LLM Inference Library

Language: Pascal - Size: 10.2 MB - Last synced at: 9 days ago - Pushed at: 5 months ago - Stars: 12 - Forks: 3

BorjaOteroFerreira/IALab-Suite

Tool for testing different large language models without code.

Language: JavaScript - Size: 28.5 MB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 18 - Forks: 0

cuiyuheng/nexa-sdk (fork of NexaAI/nexa-sdk)

Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), auto-speech-recognition (ASR), and text-to-speech (TTS) capabilities.

Size: 192 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

nazago/meeting-minutes-generator

Script that takes a .wav audio file, performs speech-to-text using OpenAI/Whisper, and then uses Llama3 to generate a summary and action points from the resulting transcript.

Language: Python - Size: 6.84 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Related Keywords

local-inference (9), llm-inference (4), llm (4), llama2 (2), python (2), mixtral-8x7b (2), large-language-models (2), procedural-api (1), pascal (1), llama-cpp (1), library (1), c-cpp (1), ai-inference (1), mixture-of-experts (1), retrieval-augmented-generation (1), rag (1), openvino (1), offline (1), neural-chat (1), natural-language-processing (1), ai-image-generator (1), win64 (1), api-rest (1), chat-application (1), flask-api (1), inference-api (1), llama-cpp-python (1), llama2-7b (1), llamacpp (1), ai (1), langchain-python (1), meeting-minutes (1), ollama (1), speech-to-text (1), summarization (1), whisper (1), auraflow (1), diffusers (1), generative-ai (1), gradio (1), image-generation (1), low-vram (1), open-source (1), stable-diffusion (1), text-to-image (1), webui (1), llama (1), agentic-ai (1), autonomous-agents (1), beautifulsoup (1), claim-verification (1), evidence-based-ai (1), langgraph (1), lm-studio (1), phi-4-mini (1), private-llm (1), qwen (1), scraping (1), chatbot (1), cloud-free (1), dolly2 (1), edge-computing (1), edge-inference (1), huggingface (1), intel (1), langchain (1)
