GitHub topics: llamacpp
vercel/modelfusion
The TypeScript library for building AI applications.
Language: TypeScript - Size: 15.6 MB - Last synced at: about 5 hours ago - Pushed at: about 1 year ago - Stars: 1,298 - Forks: 89

LostRuins/koboldcpp Fork of ggml-org/llama.cpp
Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
Language: C++ - Size: 301 MB - Last synced at: about 14 hours ago - Pushed at: about 16 hours ago - Stars: 8,132 - Forks: 525

maruel/ask
Run AI at CLI
Language: Go - Size: 222 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2 - Forks: 0

cactus-compute/cactus-react
Cactus React Native package: Run AI locally in your React Native apps
Language: TypeScript - Size: 801 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

oofdere/oke
an oke library for making llamacpp grammars
Language: TypeScript - Size: 48.8 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

cactus-compute/cactus-flutter
Cactus Flutter plugin: Run AI locally in your Flutter apps
Language: Dart - Size: 194 MB - Last synced at: about 3 hours ago - Pushed at: about 5 hours ago - Stars: 2 - Forks: 0

gptme/gptme
Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web, vision.
Language: Python - Size: 17 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 3,982 - Forks: 332

twinnydotdev/twinny
The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but 100% free.
Language: TypeScript - Size: 60.4 MB - Last synced at: 2 days ago - Pushed at: 28 days ago - Stars: 3,589 - Forks: 205

Mobile-Artificial-Intelligence/maid
Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.
Language: Dart - Size: 124 MB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 2,167 - Forks: 218

mrdbourke/mac-ml-speed-test
A few quick scripts focused on testing TensorFlow/PyTorch/Llama 2 on macOS.
Language: Jupyter Notebook - Size: 1.51 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 195 - Forks: 33

julep-ai/steadytext
Deterministic text generation and embeddings with zero configuration
Language: PLpgSQL - Size: 9.75 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 16 - Forks: 0

CommanderLake/LMStud
Chat with GGUF LLMs using llama.cpp and a classic Windows Forms interface for minimal GUI bloat.
Language: C# - Size: 1.74 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4 - Forks: 0

Phate334/llamacpp-stress-test
Wrapper script + Docker setup for llama.cpp batched-bench: run, collect, and browse historical performance results.
Language: HTML - Size: 358 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

nerve-sparks/iris_android
IRIS is an Android app for interfacing with GGUF / llama.cpp models locally.
Language: Kotlin - Size: 9.3 MB - Last synced at: 2 days ago - Pushed at: 7 months ago - Stars: 234 - Forks: 24

RunanywhereAI/runanywhere-sdks
Production ready toolkit to run AI locally
Language: Swift - Size: 5.8 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 25 - Forks: 0

gotzmann/booster
Booster - open accelerator for LLM models. Better inference and debugging for AI hackers
Language: C++ - Size: 144 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 162 - Forks: 8

getumbrel/llama-gpt
A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!
Language: TypeScript - Size: 1.71 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 11,000 - Forks: 711

reorproject/reor
Private & local AI personal knowledge management app for high entropy people.
Language: JavaScript - Size: 93.7 MB - Last synced at: 2 days ago - Pushed at: 4 months ago - Stars: 8,225 - Forks: 501

menloresearch/cortex.cpp 📦
Local AI API Platform
Language: C++ - Size: 139 MB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 2,761 - Forks: 179

innightwolfsleep/llm_telegram_bot
LLM Telegram bot
Language: Python - Size: 1.18 MB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 125 - Forks: 25

0rzech/llama-swap
Custom Llama Swap Container Image
Language: Dockerfile - Size: 6.84 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

morpheuslord/HackBot
AI-powered cybersecurity chatbot designed to provide helpful and accurate answers to your cybersecurity-related queries and also do code analysis and scan analysis.
Language: Python - Size: 56.6 KB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 314 - Forks: 54

RahulSChand/gpu_poor
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
Language: JavaScript - Size: 1.56 MB - Last synced at: 3 days ago - Pushed at: 9 months ago - Stars: 1,346 - Forks: 76

kbrisso/byte-vision
Byte-Vision is a privacy-first document intelligence platform that transforms static documents into an interactive, searchable knowledge base. Built on Elasticsearch with RAG (Retrieval-Augmented Generation) capabilities, it offers document parsing, OCR processing, and a modern UI.
Language: JavaScript - Size: 3.52 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 62 - Forks: 8

joone/loz
Loz is a command-line tool that enables your preferred LLM to execute system commands and utilize Unix pipes, integrating AI capabilities with other Unix tools.
Language: TypeScript - Size: 1.51 MB - Last synced at: 2 days ago - Pushed at: 8 months ago - Stars: 263 - Forks: 15

shubham0204/SmolChat-Android
Running any GGUF SLMs/LLMs locally, on-device in Android
Language: Kotlin - Size: 24.5 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 475 - Forks: 65

awaescher/OllamaSharp
The easiest way to use Ollama in .NET
Language: C# - Size: 26.7 MB - Last synced at: 3 days ago - Pushed at: 14 days ago - Stars: 1,112 - Forks: 157

ddh0/easy-llama
Python package wrapping llama.cpp for on-device LLM inference
Language: Python - Size: 845 KB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 86 - Forks: 5

SilasMarvin/lsp-ai
LSP-AI is an open-source language server that serves as a backend for AI-powered functionality, designed to assist and empower software engineers, not replace them.
Language: Rust - Size: 1.61 MB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 2,962 - Forks: 104

MaoJianwei/llama.cpp-arm-armv7l-Raspberry-Release Fork of ggml-org/llama.cpp
On the Releases page, you can download pre-built binaries for arm, armv7l and Raspberry Pi. LLM inference in C/C++
Language: C++ - Size: 146 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

eliranwong/toolmate
ToolMate AI, developed by Eliran Wong, is a cutting-edge AI companion that seamlessly integrates agents, tools, and plugins to excel in conversations, generative work, and task execution. Supports custom workflow and plugins to automate multi-step actions.
Language: Python - Size: 40.2 MB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 169 - Forks: 19

patw/LlamaHerder
A web UI for managing multiple models with llama-server.exe on Windows
Language: HTML - Size: 44.9 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

lone-cloud/gerbil
A desktop app for running Large Language Models locally.
Language: TypeScript - Size: 2.59 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 14 - Forks: 0

jofizcd/Soul-of-Waifu
Breathe an AI soul into your favorite characters with lifelike emotions, voice, and deep roleplay using Soul of Waifu
Language: Python - Size: 29.5 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 348 - Forks: 23

DarkSorrow/llamarn
React Native Turbo Module for llama.cpp integration, optimized for the New Architecture
Language: C++ - Size: 1.59 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 3 - Forks: 1

adrianliechti/wingman
Inference Hub for AI at Scale
Language: Go - Size: 2.05 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 63 - Forks: 11

spicyneuron/llama-config
Automatically apply optimal settings to your LLM requests
Language: Go - Size: 8.79 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

containers/ramalama
RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.
Language: Python - Size: 4.32 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2,100 - Forks: 246

Josh-XT/AGiXT
AGiXT is a dynamic AI Agent Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse AI providers. Combining adaptive memory, smart features, and a versatile plugin system, AGiXT delivers efficient and comprehensive AI solutions.
Language: Python - Size: 168 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3,072 - Forks: 426

Freed-Wu/translate-shell
Translate text with Google, Bing, youdaozhiyun, haici, stardict, OpenAI, a local large language model, etc. at the same time from the CLI, GUI (GNU/Linux, Android, macOS and Windows), REPL, Python, shell and Vim.
Language: Python - Size: 454 KB - Last synced at: 2 days ago - Pushed at: 9 days ago - Stars: 40 - Forks: 5

llmware-ai/llmware
Unified framework for building enterprise RAG pipelines with small, specialized models
Language: Python - Size: 967 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 14,284 - Forks: 2,902

serge-chat/serge
A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.
Language: Svelte - Size: 3 MB - Last synced at: 3 days ago - Pushed at: 7 days ago - Stars: 5,751 - Forks: 402

ngxson/wllama
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
Language: TypeScript - Size: 31.5 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 865 - Forks: 59

intel/neural-speed 📦
An innovative library for efficient LLM inference via low-bit quantization
Language: C++ - Size: 16.2 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 348 - Forks: 39

MiniatureEge2006/g-man
Multipurpose Discord bot made in discord.py
Language: Python - Size: 2.12 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

DevXT-LLC/ezlocalai
ezlocalai is an easy to set up local artificial intelligence server with OpenAI Style Endpoints.
Language: Jupyter Notebook - Size: 149 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 90 - Forks: 17

SciSharp/LLamaSharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
Language: C# - Size: 393 MB - Last synced at: 4 days ago - Pushed at: 18 days ago - Stars: 3,339 - Forks: 462

futursolo/pai
Collection of AI Containers - Prebuilt and Ready-to-Use
Language: Dockerfile - Size: 104 KB - Last synced at: 2 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

JohnSnowLabs/spark-nlp
State of the Art Natural Language Processing
Language: Scala - Size: 3.43 GB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4,039 - Forks: 731

BrowserOperator/browser-operator-core Fork of ChromeDevTools/devtools-frontend
Browser Operator - The AI browser with built in Multi-Agent platform! Open source alternative to Perplexity Comet, Dia and Microsoft CoPilot Edge Browser
Language: TypeScript - Size: 1010 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 287 - Forks: 43

mgonzs13/llama_ros
llama.cpp (GGUF LLMs) and llava.cpp (GGUF VLMs) for ROS 2
Language: C++ - Size: 13.4 MB - Last synced at: 5 days ago - Pushed at: 10 days ago - Stars: 223 - Forks: 40

Mobile-Artificial-Intelligence/llama_sdk
lcpp is a Dart implementation of llama.cpp used by the Mobile Artificial Intelligence Distribution (maid)
Language: C++ - Size: 1.78 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 101 - Forks: 23

Nexesenex/croco.cpp Fork of LostRuins/koboldcpp
Croco.Cpp is a fork of KoboldCPP inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compatible with most of Ikawrakow's quants except Bitnet.
Language: C++ - Size: 365 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 135 - Forks: 5

khoj-ai/khoj
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.
Language: Python - Size: 111 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 30,815 - Forks: 1,781

benman1/generative_ai_with_langchain
Build production-ready LLM applications and advanced agents using Python, LangChain, and LangGraph. This is the companion repository for the book on generative AI with LangChain.
Language: Jupyter Notebook - Size: 11.2 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,004 - Forks: 429

ahoylabs/gguf.js
A JavaScript library (with TypeScript types) to parse metadata of GGML-based GGUF files.
Language: TypeScript - Size: 979 KB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 50 - Forks: 1

vallahulmustaan/eucalypt
🌿 Build ClojureScript UIs with Eucalypt, a lightweight library offering a Reagent-like API for efficient, small JavaScript outputs.
Language: HTML - Size: 55.7 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

remixer-dec/botality-ii
Telegram bot for self-hosted local inference of Stable Diffusion, text-to-speech and large language models, such as llama3
Language: Python - Size: 377 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 40 - Forks: 10

huggingface/llm-ls
LSP server leveraging LLMs for code completion (and more?)
Language: Rust - Size: 344 KB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 809 - Forks: 65

gpustack/gpustack
Simple, scalable AI model deployment on GPU clusters
Language: Python - Size: 132 MB - Last synced at: 7 days ago - Pushed at: 9 days ago - Stars: 3,579 - Forks: 364

Dicklesworthstone/swiss_army_llama
A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.
Language: Python - Size: 7.25 MB - Last synced at: 5 days ago - Pushed at: 6 months ago - Stars: 1,023 - Forks: 60

danielsobrado/llm_notebooks
Concepts and examples on using and training LLMs
Language: Jupyter Notebook - Size: 1.64 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 45 - Forks: 6

mbwika/Janda_AI-Job-Application-Agent
Janda is a sophisticated, autonomous AI job search and application assistant — a highly practical use case that merges multi-agent orchestration, Retrieval-Augmented Generation (RAG) pipelines, LLM reasoning, resume/CV comparison, and web scraping/search APIs built using open-source and free tools.
Language: Python - Size: 622 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 0

xorbitsai/inference
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
Language: Python - Size: 47 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 8,444 - Forks: 731

menloresearch/jan
Jan is an open source alternative to ChatGPT that runs 100% offline on your computer
Language: TypeScript - Size: 1.3 GB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 37,479 - Forks: 2,205

floneum/floneum
Instant, controllable, local pre-trained AI models in Rust
Language: Rust - Size: 258 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,995 - Forks: 110

mostlygeek/llama-swap
Model swapping for llama.cpp (or any local OpenAI compatible server)
Language: Go - Size: 1.92 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,396 - Forks: 84

qarinai/qarinai
Create unlimited AI chatbot agents for your website — powered by OpenAI-compatible LLMs, RAG, and MCP.
Language: TypeScript - Size: 830 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2 - Forks: 0

nekomeowww/ollama-operator
🚢 Yet another operator for running large language models on Kubernetes with ease. Powered by Ollama! 🐫
Language: Go - Size: 1.62 MB - Last synced at: 2 days ago - Pushed at: 10 days ago - Stars: 208 - Forks: 24

pythops/tenere
🤖 TUI interface for LLMs written in Rust
Language: Rust - Size: 642 KB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 573 - Forks: 25

iohub/collama Fork of sourcegraph/cody-public-snapshot 📦
VSCode AI coding assistant powered by self-hosted llama.cpp endpoint.
Language: TypeScript - Size: 9.96 MB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 183 - Forks: 13

KolosalAI/Kolosal
Kolosal AI is an open-source and lightweight alternative to LM Studio to run LLMs 100% offline on your device.
Language: C++ - Size: 89.3 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 295 - Forks: 22

xNul/code-llama-for-vscode
Use Code Llama with Visual Studio Code and the Continue extension. A local LLM alternative to GitHub Copilot.
Language: Python - Size: 10.7 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 567 - Forks: 33

Agora-Lab-AI/Atom
a suite of finetuned LLMs for atomically precise function calling 🧪
Language: Python - Size: 2.35 MB - Last synced at: 2 days ago - Pushed at: 17 days ago - Stars: 15 - Forks: 1

ToxyBorg/llama_langchain_documents_embeddings
just testing langchain with llama cpp documents embeddings
Language: Python - Size: 60.5 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 1

tinyBigGAMES/Sophora
Sophora - AI Reasoning, Function-calling & Knowledge Retrieval
Language: Pascal - Size: 7.17 MB - Last synced at: 7 days ago - Pushed at: 6 months ago - Stars: 21 - Forks: 2

cactus-compute/cactus
Cross-platform framework for deploying LLM/VLM/TTS models locally on smartphones.
Language: C++ - Size: 1.91 GB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 2,912 - Forks: 170

CutTheWire/Dogi
AI agent for dog disease advice
Language: Python - Size: 2.92 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1 - Forks: 0

jerryshell/resumind
AI-powered resume analysis system that provides tailored feedback and ATS scores for each job position
Language: JavaScript - Size: 2.45 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1 - Forks: 1

mukel/llama3.java
Practical Llama 3 inference in Java
Language: Java - Size: 187 KB - Last synced at: 2 days ago - Pushed at: 8 months ago - Stars: 779 - Forks: 93

mybigday/llama.node
Node.js binding of llama.cpp
Language: C - Size: 29.3 MB - Last synced at: 5 days ago - Pushed at: 9 days ago - Stars: 14 - Forks: 3

juliensimon/sagemaker-inference-container-cpu
An Amazon SageMaker Container for Hugging Face Inference on Graviton and Intel CPUs
Language: Python - Size: 94.7 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 5 - Forks: 1

jeromeboivin/ollama-chat
A single file, customizable Python CLI tool for interacting with local Language Models, ensuring data privacy while providing conversation memory and extensibility through plugins and efficient Retrieval-Augmented Generation capabilities with ChromaDB integration. Also compatible with OpenAI API.
Language: Python - Size: 982 KB - Last synced at: 2 days ago - Pushed at: 9 days ago - Stars: 14 - Forks: 3

wilian-ol/gpt4all
GPT4All runs powerful local LLMs on desktops and laptops for private, offline AI — no API or GPU needed. 🐱💻
Language: C++ - Size: 40 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

andrewkchan/yalm
Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O
Language: C++ - Size: 349 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 473 - Forks: 41

alexrozanski/LlamaChat
Chat with your favourite LLaMA models in a native macOS app
Language: Swift - Size: 14.6 MB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 1,516 - Forks: 62

kelindar/search
Go library for embedded vector search and semantic embeddings using llama.cpp
Language: Go - Size: 736 KB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 483 - Forks: 18

adithya-s-k/CompanionLLM
CompanionLLM - A framework to finetune LLMs to be your own sentient conversational companion
Language: Jupyter Notebook - Size: 40.1 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 5

itcrown07/ai-gpu-playground-mac
Language: Python - Size: 6.84 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

smartloop-ai/smartloop
Smartloop is an open-source SLM platform to train and run models on an edge device
Language: Python - Size: 89.8 KB - Last synced at: 6 days ago - Pushed at: 10 days ago - Stars: 2 - Forks: 0

CentralFloridaAttorney/zmongo_retriever
Use data from MongoDB in LangChain, Llama and OpenAI
Language: Python - Size: 27.7 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 4 - Forks: 2

mixa3607/ML-gfx906
ML software (llama.cpp, ComfyUI) builds for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60
Language: C# - Size: 83 KB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 8 - Forks: 0

firatkiral/kodibot
Local and Offline AI Chatbot App for Desktop
Language: CSS - Size: 5.04 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 4

nadchif/in-browser-llm-inference
Download and run local LLMs within your browser
Language: JavaScript - Size: 252 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 7 - Forks: 2

samestrin/llm-interface
A simple NPM interface for seamlessly interacting with 36 Large Language Model (LLM) providers, including OpenAI, Anthropic, Google Gemini, Cohere, Hugging Face Inference, NVIDIA AI, Mistral AI, AI21 Studio, LLaMA.CPP, and Ollama, and hundreds of models.
Language: JavaScript - Size: 1.95 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 113 - Forks: 14

xorbitsai/xllamacpp Fork of shakfu/cyllama
xllamacpp - a Python wrapper of llama.cpp
Language: C++ - Size: 4.56 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 49 - Forks: 7

michaelsoftmd/zenbot-chrome
LLM-powered live web browser automation from the complete safety of a Podman/Docker container using Smolagents, Zendriver, chrome dev tools and VNC. Features caching as memory!
Language: Python - Size: 1.17 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

mirpo/fastapi-gen
Build LLM-enabled FastAPI applications without build configuration.
Language: Python - Size: 502 KB - Last synced at: 4 days ago - Pushed at: 12 days ago - Stars: 7 - Forks: 1

FilipFan/PolyEngineInfer
Run AI inference in an Android app with llama.cpp, ExecuTorch, LiteRT, ONNX, and more.
Language: Kotlin - Size: 42.2 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

InftyAI/llmaz
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Language: Go - Size: 12.7 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 242 - Forks: 37
