An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: llamacpp

vercel/modelfusion

The TypeScript library for building AI applications.

Language: TypeScript - Size: 15.6 MB - Last synced at: about 5 hours ago - Pushed at: about 1 year ago - Stars: 1,298 - Forks: 89

LostRuins/koboldcpp Fork of ggml-org/llama.cpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.

Language: C++ - Size: 301 MB - Last synced at: about 14 hours ago - Pushed at: about 16 hours ago - Stars: 8,132 - Forks: 525

maruel/ask

Run AI at CLI

Language: Go - Size: 222 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2 - Forks: 0

cactus-compute/cactus-react

Cactus React Native package: Run AI locally in your React Native apps

Language: TypeScript - Size: 801 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

oofdere/oke

an oke library for making llamacpp grammars

Language: TypeScript - Size: 48.8 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

cactus-compute/cactus-flutter

Cactus Flutter plugin: Run AI locally in your Flutter apps

Language: Dart - Size: 194 MB - Last synced at: about 3 hours ago - Pushed at: about 5 hours ago - Stars: 2 - Forks: 0

gptme/gptme

Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web, vision.

Language: Python - Size: 17 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 3,982 - Forks: 332

twinnydotdev/twinny

The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but 100% free.

Language: TypeScript - Size: 60.4 MB - Last synced at: 2 days ago - Pushed at: 28 days ago - Stars: 3,589 - Forks: 205

Mobile-Artificial-Intelligence/maid

Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.

Language: Dart - Size: 124 MB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 2,167 - Forks: 218

mrdbourke/mac-ml-speed-test

A few quick scripts focused on testing TensorFlow/PyTorch/Llama 2 on macOS.

Language: Jupyter Notebook - Size: 1.51 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 195 - Forks: 33

julep-ai/steadytext

Deterministic text generation and embeddings with zero configuration

Language: PLpgSQL - Size: 9.75 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 16 - Forks: 0

CommanderLake/LMStud

Chat with GGUF LLMs using llama.cpp and a classic Windows Forms interface for minimal GUI bloat.

Language: C# - Size: 1.74 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4 - Forks: 0

Phate334/llamacpp-stress-test

Wrapper script + Docker setup for llama.cpp batched-bench: run, collect, and browse historical performance results.

Language: HTML - Size: 358 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

nerve-sparks/iris_android

IRIS is an Android app for interfacing with GGUF / llama.cpp models locally.

Language: Kotlin - Size: 9.3 MB - Last synced at: 2 days ago - Pushed at: 7 months ago - Stars: 234 - Forks: 24

RunanywhereAI/runanywhere-sdks

Production ready toolkit to run AI locally

Language: Swift - Size: 5.8 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 25 - Forks: 0

gotzmann/booster

Booster - open accelerator for LLM models. Better inference and debugging for AI hackers

Language: C++ - Size: 144 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 162 - Forks: 8

getumbrel/llama-gpt

A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!

Language: TypeScript - Size: 1.71 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 11,000 - Forks: 711

reorproject/reor

Private & local AI personal knowledge management app for high entropy people.

Language: JavaScript - Size: 93.7 MB - Last synced at: 2 days ago - Pushed at: 4 months ago - Stars: 8,225 - Forks: 501

menloresearch/cortex.cpp 📦

Local AI API Platform

Language: C++ - Size: 139 MB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 2,761 - Forks: 179

innightwolfsleep/llm_telegram_bot

LLM Telegram bot

Language: Python - Size: 1.18 MB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 125 - Forks: 25

0rzech/llama-swap

Custom Llama Swap Container Image

Language: Dockerfile - Size: 6.84 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

morpheuslord/HackBot

AI-powered cybersecurity chatbot designed to provide helpful and accurate answers to your cybersecurity-related queries and also do code analysis and scan analysis.

Language: Python - Size: 56.6 KB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 314 - Forks: 54

RahulSChand/gpu_poor

Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

Language: JavaScript - Size: 1.56 MB - Last synced at: 3 days ago - Pushed at: 9 months ago - Stars: 1,346 - Forks: 76

kbrisso/byte-vision

Byte-Vision is a privacy-first document intelligence platform that transforms static documents into an interactive, searchable knowledge base. Built on Elasticsearch with RAG (Retrieval-Augmented Generation) capabilities, it offers document parsing, OCR processing, and a modern UI.

Language: JavaScript - Size: 3.52 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 62 - Forks: 8

joone/loz

Loz is a command-line tool that enables your preferred LLM to execute system commands and utilize Unix pipes, integrating AI capabilities with other Unix tools.

Language: TypeScript - Size: 1.51 MB - Last synced at: 2 days ago - Pushed at: 8 months ago - Stars: 263 - Forks: 15

shubham0204/SmolChat-Android

Running any GGUF SLMs/LLMs locally, on-device in Android

Language: Kotlin - Size: 24.5 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 475 - Forks: 65

awaescher/OllamaSharp

The easiest way to use Ollama in .NET

Language: C# - Size: 26.7 MB - Last synced at: 3 days ago - Pushed at: 14 days ago - Stars: 1,112 - Forks: 157

ddh0/easy-llama

Python package wrapping llama.cpp for on-device LLM inference

Language: Python - Size: 845 KB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 86 - Forks: 5

SilasMarvin/lsp-ai

LSP-AI is an open-source language server that serves as a backend for AI-powered functionality, designed to assist and empower software engineers, not replace them.

Language: Rust - Size: 1.61 MB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 2,962 - Forks: 104

MaoJianwei/llama.cpp-arm-armv7l-Raspberry-Release Fork of ggml-org/llama.cpp

On the Releases page, you can download pre-built binaries for arm, armv7l and Raspberry Pi. LLM inference in C/C++

Language: C++ - Size: 146 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

eliranwong/toolmate

ToolMate AI, developed by Eliran Wong, is a cutting-edge AI companion that seamlessly integrates agents, tools, and plugins to excel in conversations, generative work, and task execution. Supports custom workflow and plugins to automate multi-step actions.

Language: Python - Size: 40.2 MB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 169 - Forks: 19

patw/LlamaHerder

A web UI for managing multiple models with llama-server.exe on Windows

Language: HTML - Size: 44.9 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

lone-cloud/gerbil

A desktop app for running Large Language Models locally.

Language: TypeScript - Size: 2.59 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 14 - Forks: 0

jofizcd/Soul-of-Waifu

Breathe an AI soul into your favorite characters with lifelike emotions, voice, and deep roleplay using Soul of Waifu

Language: Python - Size: 29.5 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 348 - Forks: 23

DarkSorrow/llamarn

React Native Turbo Module for llama.cpp integration, optimized for the New Architecture

Language: C++ - Size: 1.59 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 3 - Forks: 1

adrianliechti/wingman

Inference Hub for AI at Scale

Language: Go - Size: 2.05 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 63 - Forks: 11

spicyneuron/llama-config

Automatically apply optimal settings to your LLM requests

Language: Go - Size: 8.79 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

containers/ramalama

RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.

Language: Python - Size: 4.32 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2,100 - Forks: 246

Josh-XT/AGiXT

AGiXT is a dynamic AI Agent Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse AI providers. Combining adaptive memory, smart features, and a versatile plugin system, AGiXT delivers efficient and comprehensive AI solutions.

Language: Python - Size: 168 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3,072 - Forks: 426

Freed-Wu/translate-shell

Translate text with Google, Bing, youdaozhiyun, haici, stardict, OpenAI, locally run large language models, etc. at the same time from CLI, GUI (GNU/Linux, Android, macOS and Windows), REPL, Python, shell and Vim.

Language: Python - Size: 454 KB - Last synced at: 2 days ago - Pushed at: 9 days ago - Stars: 40 - Forks: 5

llmware-ai/llmware

Unified framework for building enterprise RAG pipelines with small, specialized models

Language: Python - Size: 967 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 14,284 - Forks: 2,902

serge-chat/serge

A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.

Language: Svelte - Size: 3 MB - Last synced at: 3 days ago - Pushed at: 7 days ago - Stars: 5,751 - Forks: 402

ngxson/wllama

WebAssembly binding for llama.cpp - Enabling on-browser LLM inference

Language: TypeScript - Size: 31.5 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 865 - Forks: 59

intel/neural-speed 📦

An innovative library for efficient LLM inference via low-bit quantization

Language: C++ - Size: 16.2 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 348 - Forks: 39

MiniatureEge2006/g-man

Multipurpose Discord bot made in discord.py

Language: Python - Size: 2.12 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

DevXT-LLC/ezlocalai

ezlocalai is an easy to set up local artificial intelligence server with OpenAI Style Endpoints.

Language: Jupyter Notebook - Size: 149 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 90 - Forks: 17

SciSharp/LLamaSharp

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.

Language: C# - Size: 393 MB - Last synced at: 4 days ago - Pushed at: 18 days ago - Stars: 3,339 - Forks: 462

futursolo/pai

Collection of AI Containers - Prebuilt and Ready-to-Use

Language: Dockerfile - Size: 104 KB - Last synced at: 2 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

JohnSnowLabs/spark-nlp

State of the Art Natural Language Processing

Language: Scala - Size: 3.43 GB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4,039 - Forks: 731

BrowserOperator/browser-operator-core Fork of ChromeDevTools/devtools-frontend

Browser Operator - The AI browser with built in Multi-Agent platform! Open source alternative to Perplexity Comet, Dia and Microsoft CoPilot Edge Browser

Language: TypeScript - Size: 1010 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 287 - Forks: 43

mgonzs13/llama_ros

llama.cpp (GGUF LLMs) and llava.cpp (GGUF VLMs) for ROS 2

Language: C++ - Size: 13.4 MB - Last synced at: 5 days ago - Pushed at: 10 days ago - Stars: 223 - Forks: 40

Mobile-Artificial-Intelligence/llama_sdk

lcpp is a Dart implementation of llama.cpp used by the Mobile Artificial Intelligence Distribution (maid)

Language: C++ - Size: 1.78 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 101 - Forks: 23

Nexesenex/croco.cpp Fork of LostRuins/koboldcpp

Croco.Cpp is a fork of KoboldCPP for inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compatible with most of Ikawrakow's quants except Bitnet.

Language: C++ - Size: 365 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 135 - Forks: 5

khoj-ai/khoj

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

Language: Python - Size: 111 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 30,815 - Forks: 1,781

benman1/generative_ai_with_langchain

Build production-ready LLM applications and advanced agents using Python, LangChain, and LangGraph. This is the companion repository for the book on generative AI with LangChain.

Language: Jupyter Notebook - Size: 11.2 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,004 - Forks: 429

ahoylabs/gguf.js

A JavaScript library (with TypeScript types) to parse metadata of GGML-based GGUF files.

Language: TypeScript - Size: 979 KB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 50 - Forks: 1

vallahulmustaan/eucalypt

🌿 Build ClojureScript UIs with Eucalypt, a lightweight library offering a Reagent-like API for efficient, small JavaScript outputs.

Language: HTML - Size: 55.7 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

remixer-dec/botality-ii

Telegram bot for self-hosted local inference of Stable Diffusion, text-to-speech and large language models, such as llama3

Language: Python - Size: 377 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 40 - Forks: 10

huggingface/llm-ls

LSP server leveraging LLMs for code completion (and more?)

Language: Rust - Size: 344 KB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 809 - Forks: 65

gpustack/gpustack

Simple, scalable AI model deployment on GPU clusters

Language: Python - Size: 132 MB - Last synced at: 7 days ago - Pushed at: 9 days ago - Stars: 3,579 - Forks: 364

Dicklesworthstone/swiss_army_llama

A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

Language: Python - Size: 7.25 MB - Last synced at: 5 days ago - Pushed at: 6 months ago - Stars: 1,023 - Forks: 60

danielsobrado/llm_notebooks

Concepts and examples on using and training LLMs

Language: Jupyter Notebook - Size: 1.64 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 45 - Forks: 6

mbwika/Janda_AI-Job-Application-Agent

Janda is a sophisticated, autonomous AI job search and application assistant — a highly practical use case that merges multi-agent orchestration, Retrieval-Augmented Generation (RAG) pipelines, LLM reasoning, resume/CV comparison, and web scraping/search APIs built using open-source and free tools.

Language: Python - Size: 622 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 0

xorbitsai/inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Language: Python - Size: 47 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 8,444 - Forks: 731

menloresearch/jan

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer

Language: TypeScript - Size: 1.3 GB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 37,479 - Forks: 2,205

floneum/floneum

Instant, controllable, local pre-trained AI models in Rust

Language: Rust - Size: 258 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,995 - Forks: 110

mostlygeek/llama-swap

Model swapping for llama.cpp (or any local OpenAPI compatible server)

Language: Go - Size: 1.92 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,396 - Forks: 84

qarinai/qarinai

Create unlimited AI chatbot agents for your website — powered by OpenAI-compatible LLMs, RAG, and MCP.

Language: TypeScript - Size: 830 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2 - Forks: 0

nekomeowww/ollama-operator

🚢 Yet another operator for running large language models on Kubernetes with ease. Powered by Ollama! 🐫

Language: Go - Size: 1.62 MB - Last synced at: 2 days ago - Pushed at: 10 days ago - Stars: 208 - Forks: 24

pythops/tenere

🤖 TUI interface for LLMs written in Rust

Language: Rust - Size: 642 KB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 573 - Forks: 25

iohub/collama Fork of sourcegraph/cody-public-snapshot 📦

VSCode AI coding assistant powered by self-hosted llama.cpp endpoint.

Language: TypeScript - Size: 9.96 MB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 183 - Forks: 13

KolosalAI/Kolosal

Kolosal AI is an open-source, lightweight alternative to LM Studio for running LLMs 100% offline on your device.

Language: C++ - Size: 89.3 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 295 - Forks: 22

xNul/code-llama-for-vscode

Use Code Llama with Visual Studio Code and the Continue extension. A local LLM alternative to GitHub Copilot.

Language: Python - Size: 10.7 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 567 - Forks: 33

Agora-Lab-AI/Atom

a suite of finetuned LLMs for atomically precise function calling 🧪

Language: Python - Size: 2.35 MB - Last synced at: 2 days ago - Pushed at: 17 days ago - Stars: 15 - Forks: 1

ToxyBorg/llama_langchain_documents_embeddings

Just testing LangChain with llama.cpp document embeddings

Language: Python - Size: 60.5 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 1

tinyBigGAMES/Sophora

Sophora - AI Reasoning, Function-calling & Knowledge Retrieval

Language: Pascal - Size: 7.17 MB - Last synced at: 7 days ago - Pushed at: 6 months ago - Stars: 21 - Forks: 2

cactus-compute/cactus

Cross-platform framework for deploying LLM/VLM/TTS models locally on smartphones.

Language: C++ - Size: 1.91 GB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 2,912 - Forks: 170

CutTheWire/Dogi

AI agent for advice on dog diseases

Language: Python - Size: 2.92 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1 - Forks: 0

jerryshell/resumind

AI-powered resume analysis system that provides tailored feedback and an ATS score for each job position

Language: JavaScript - Size: 2.45 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1 - Forks: 1

mukel/llama3.java

Practical Llama 3 inference in Java

Language: Java - Size: 187 KB - Last synced at: 2 days ago - Pushed at: 8 months ago - Stars: 779 - Forks: 93

mybigday/llama.node

Node.js binding of llama.cpp

Language: C - Size: 29.3 MB - Last synced at: 5 days ago - Pushed at: 9 days ago - Stars: 14 - Forks: 3

juliensimon/sagemaker-inference-container-cpu

An Amazon SageMaker Container for Hugging Face Inference on Graviton and Intel CPUs

Language: Python - Size: 94.7 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 5 - Forks: 1

jeromeboivin/ollama-chat

A single file, customizable Python CLI tool for interacting with local Language Models, ensuring data privacy while providing conversation memory and extensibility through plugins and efficient Retrieval-Augmented Generation capabilities with ChromaDB integration. Also compatible with OpenAI API.

Language: Python - Size: 982 KB - Last synced at: 2 days ago - Pushed at: 9 days ago - Stars: 14 - Forks: 3

wilian-ol/gpt4all

GPT4All runs powerful local LLMs on desktops and laptops for private, offline AI — no API or GPU needed. 🐱💻

Language: C++ - Size: 40 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

andrewkchan/yalm

Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O

Language: C++ - Size: 349 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 473 - Forks: 41

alexrozanski/LlamaChat

Chat with your favourite LLaMA models in a native macOS app

Language: Swift - Size: 14.6 MB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 1,516 - Forks: 62

kelindar/search

Go library for embedded vector search and semantic embeddings using llama.cpp

Language: Go - Size: 736 KB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 483 - Forks: 18

adithya-s-k/CompanionLLM

CompanionLLM - A framework to finetune LLMs to be your own sentient conversational companion

Language: Jupyter Notebook - Size: 40.1 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 5

itcrown07/ai-gpu-playground-mac

Language: Python - Size: 6.84 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

smartloop-ai/smartloop

Smartloop is an open-source SLM platform to train and run models on an edge device

Language: Python - Size: 89.8 KB - Last synced at: 6 days ago - Pushed at: 10 days ago - Stars: 2 - Forks: 0

CentralFloridaAttorney/zmongo_retriever

Use data from MongoDB in LangChain, Llama and OpenAI

Language: Python - Size: 27.7 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 4 - Forks: 2

mixa3607/ML-gfx906

ML software (llama.cpp, ComfyUI) builds for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60

Language: C# - Size: 83 KB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 8 - Forks: 0

firatkiral/kodibot

Local and Offline AI Chatbot App for Desktop

Language: CSS - Size: 5.04 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 4

nadchif/in-browser-llm-inference

Download and run local LLMs within your browser

Language: JavaScript - Size: 252 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 7 - Forks: 2

samestrin/llm-interface

A simple NPM interface for seamlessly interacting with 36 Large Language Model (LLM) providers, including OpenAI, Anthropic, Google Gemini, Cohere, Hugging Face Inference, NVIDIA AI, Mistral AI, AI21 Studio, LLaMA.CPP, and Ollama, and hundreds of models.

Language: JavaScript - Size: 1.95 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 113 - Forks: 14

xorbitsai/xllamacpp Fork of shakfu/cyllama

xllamacpp - a Python wrapper of llama.cpp

Language: C++ - Size: 4.56 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 49 - Forks: 7

michaelsoftmd/zenbot-chrome

LLM-powered live web browser automation from the complete safety of a Podman/Docker container using Smolagents, Zendriver, Chrome DevTools and VNC. Features caching as memory!

Language: Python - Size: 1.17 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

mirpo/fastapi-gen

Build LLM-enabled FastAPI applications without build configuration.

Language: Python - Size: 502 KB - Last synced at: 4 days ago - Pushed at: 12 days ago - Stars: 7 - Forks: 1

FilipFan/PolyEngineInfer

Run AI inference in an Android app with llama.cpp, ExecuTorch, LiteRT, ONNX, and more.

Language: Kotlin - Size: 42.2 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

InftyAI/llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!

Language: Go - Size: 12.7 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 242 - Forks: 37