llama-cpp | Topic | Ecosyste.ms: Repos

getumbrel/llama-gpt

A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!

Language: TypeScript - Size: 1.71 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 10,984 - Forks: 712

SciSharp/LLamaSharp

A C#/.NET library to run LLMs (🦙LLaMA/LLaVA) efficiently on your local device.

Language: C# - Size: 392 MB - Last synced at: about 1 hour ago - Pushed at: 3 days ago - Stars: 3,263 - Forks: 446

Mobile-Artificial-Intelligence/maid

Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.

Language: Dart - Size: 123 MB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 2,056 - Forks: 215

withcatai/node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.

Language: TypeScript - Size: 21.9 MB - Last synced at: 4 days ago - Pushed at: 27 days ago - Stars: 1,572 - Forks: 136

gotzmann/llama.go

llama.go is like llama.cpp in pure Golang!

Language: Go - Size: 9.33 MB - Last synced at: 4 days ago - Pushed at: 10 months ago - Stars: 1,368 - Forks: 67

undreamai/LLMUnity

Create characters in Unity with LLMs!

Language: C# - Size: 19.3 MB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 1,186 - Forks: 126

Lizonghang/prima.cpp

prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters

Language: C++ - Size: 56.2 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 951 - Forks: 65

the-crypt-keeper/can-ai-code

Self-evaluating interview for AI coders

Language: Python - Size: 8.49 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 582 - Forks: 35

mybigday/llama.rn

React Native binding of llama.cpp

Language: C - Size: 10.2 MB - Last synced at: 4 days ago - Pushed at: 8 days ago - Stars: 556 - Forks: 53

withcatai/catai

Run an AI ✨ assistant locally, with a simple API for Node.js 🚀

Language: TypeScript - Size: 20.4 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 474 - Forks: 36

mdrokz/rust-llama.cpp

Rust bindings for llama.cpp

Language: Rust - Size: 62.5 KB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 390 - Forks: 48

jlonge4/local_llama

This repo showcases how to run a model locally and offline, free of OpenAI dependencies.

Language: Python - Size: 71.3 KB - Last synced at: 3 months ago - Pushed at: 12 months ago - Stars: 265 - Forks: 45

gpustack/gguf-parser-go

Review/Check GGUF files and estimate the memory usage and maximum tokens per second.

Language: Go - Size: 526 KB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 178 - Forks: 18

ptsochantaris/emeltal

Local ML voice chat using high-end models.

Language: C++ - Size: 50.6 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 173 - Forks: 11

phronmophobic/llama.clj

Run LLMs locally. A Clojure wrapper for llama.cpp.

Language: Clojure - Size: 305 KB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 166 - Forks: 9

gotzmann/booster

Booster - open accelerator for LLM models. Better inference and debugging for AI hackers

Language: C++ - Size: 144 MB - Last synced at: 4 days ago - Pushed at: 11 months ago - Stars: 158 - Forks: 8

BrutalCoding/shady.ai

Making offline AI models accessible to all types of edge devices.

Language: Dart - Size: 101 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 142 - Forks: 16

nuance1979/llama-server

LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.

Language: Python - Size: 25.4 KB - Last synced at: 4 days ago - Pushed at: about 2 years ago - Stars: 126 - Forks: 14

nrl-ai/CustomChar

Your customized AI assistant - Personal assistants on any hardware! With llama.cpp, whisper.cpp, ggml, LLaMA-v2.

Language: C++ - Size: 14.3 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 116 - Forks: 11

vtuber-plan/langport

Langport is a language model inference service

Language: Python - Size: 872 KB - Last synced at: 9 days ago - Pushed at: 10 months ago - Stars: 93 - Forks: 12

robiwan303/babyagi (fork of yoheinakajima/babyagi)

BabyAGI-🦙: Enhanced for Llama models (running 100% local) and persistent memory, with smart internet search based on BabyCatAGI and document embedding in langchain based on privateGPT

Language: Python - Size: 8.06 MB - Last synced at: 4 days ago - Pushed at: about 2 years ago - Stars: 89 - Forks: 7

R3gm/InsightSolver-Colab

InsightSolver: Colab notebooks for exploring and solving operational issues using deep learning, machine learning, and related models.

Language: Jupyter Notebook - Size: 382 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 87 - Forks: 28

Abhi5h3k/PrivateDocBot

📚 Local PDF-Integrated Chat Bot: Secure Conversations and Document Assistance with LLM-Powered Privacy

Language: Python - Size: 2.39 MB - Last synced at: 4 days ago - Pushed at: 4 months ago - Stars: 84 - Forks: 20

llm-inference is a platform for publishing and managing llm inference, providing a wide range of out-of-the-box features for model deployment, such as UI, RESTful API, auto-scaling, computing resource management, monitoring, and more.

Language: Python - Size: 602 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 80 - Forks: 16

greynewell/musegpt

Local LLMs in your DAW!

Language: C++ - Size: 1.84 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 65 - Forks: 1

rbourgeat/ImpAI

😈 ImpAI is an advanced role play app using large language and diffusion models.

Language: JavaScript - Size: 22.5 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 63 - Forks: 4

ystemsrx/Code-Atlas

A C++ implementation of Open Interpreter, based on llama.cpp.

Language: C++ - Size: 224 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 62 - Forks: 16

fboulnois/llama-cpp-docker

Run llama.cpp in a GPU accelerated Docker container

Language: Shell - Size: 32.2 KB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 42 - Forks: 13

hyparam/hyllama

llama.cpp GGUF file parser for JavaScript

Language: JavaScript - Size: 153 KB - Last synced at: 4 days ago - Pushed at: 7 months ago - Stars: 42 - Forks: 3

blueraai/universal-intelligence

◉ Universal Intelligence: AI made simple.

Language: Python - Size: 4.86 MB - Last synced at: 8 days ago - Pushed at: 27 days ago - Stars: 40 - Forks: 3

tinyBigGAMES/Lumina

Local Generative AI

Language: Pascal - Size: 16.7 MB - Last synced at: 6 days ago - Pushed at: 6 months ago - Stars: 34 - Forks: 5

ossirytk/llama-cpp-chat-memory

Local character AI chatbot with chroma vector store memory and some scripts to process documents for Chroma

Language: Python - Size: 45.6 MB - Last synced at: 4 days ago - Pushed at: 9 months ago - Stars: 34 - Forks: 5

countzero/windows_llama.cpp

PowerShell automation to rebuild llama.cpp for a Windows environment.

Language: PowerShell - Size: 3.81 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 30 - Forks: 3

rbourgeat/llm-rp 📦

✨ Your Custom Offline Role Play with LLM and Stable Diffusion on Mac and Linux (for now) 🧙‍♂️

Language: Python - Size: 6.24 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 29 - Forks: 2

JavaLLM/llama4j 📦

An easy-to-use Java SDK for running LLaMA models on edge devices, powered by LLaMA.cpp

Language: Java - Size: 3.7 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 24 - Forks: 1

BrutalCoding/llama_dart

Flutter / Dart bindings for llama.cpp

Language: Dart - Size: 335 KB - Last synced at: 4 days ago - Pushed at: almost 2 years ago - Stars: 20 - Forks: 2

Rin313/StegLLM

Offline, cross-platform LLM text steganography and encryption program.

Language: Astro - Size: 1.12 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 17 - Forks: 0

yportne13/chatbot-ui-llama.cpp

A static web UI for the llama.cpp server. The llama.cpp chat interface for everyone. Based on chatbot-ui.

Language: TypeScript - Size: 2.05 MB - Last synced at: 4 days ago - Pushed at: 8 months ago - Stars: 16 - Forks: 6

ItzDerock/llama-playground

A simple-to-use, powerful web interface for experimenting with Meta's LLaMA LLM.

Language: TypeScript - Size: 266 KB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 16 - Forks: 2

svjack/CodeActAgent-Gradio

Unofficial Gradio repo for the ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji.

Language: Jupyter Notebook - Size: 608 KB - Last synced at: 4 days ago - Pushed at: 9 months ago - Stars: 13 - Forks: 1

tinyBigGAMES/JetInfero

Local LLM Inference Library

Language: Pascal - Size: 10.2 MB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 12 - Forks: 3

mybigday/llama.node

Node.js binding of llama.cpp

Language: C - Size: 29.3 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 11 - Forks: 2

controlecidadao/samantha_ia

Experimental interface environment for open source LLM, designed to democratize the use of AI. Powered by llama-cpp, llama-cpp-python and Gradio.

Language: Python - Size: 23.2 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 11 - Forks: 1

ashioyajotham/fingpt_trader

An algorithmic trading system based on FinGPT, demonstrating new applications of large pre-trained Language Models in quantitative finance.

Language: Python - Size: 879 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 11 - Forks: 4

robjsliwa/llama-agent

Fun project to run your own LLM chat bot using llama.cpp

Language: Jupyter Notebook - Size: 27.3 KB - Last synced at: 4 days ago - Pushed at: about 2 years ago - Stars: 11 - Forks: 3

e-lab/SyntaxShaper

Powering Agent Chains by Constraining LLM Outputs

Language: Python - Size: 680 KB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 0

dipampaul17/KVSplit

Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.

Language: Python - Size: 717 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 8 - Forks: 0
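
The 59% figure in the KVSplit description is consistent with a back-of-the-envelope estimate. The sketch below is only an illustration under assumptions not stated in the listing: the baseline is an FP16 KV cache, keys and values hold the same number of elements, and the 8-bit and 4-bit formats are block formats in the style of llama.cpp's q8_0 and q4_0 (32 values plus one FP16 scale per block). The helper names are hypothetical.

```python
# Rough estimate of KV cache savings. The formats below are assumptions,
# not details taken from the KVSplit listing.

def bits_per_value(payload_bits: int, block_size: int = 32, scale_bits: int = 16) -> float:
    """Effective bits per value for a block format with one scale per block."""
    return payload_bits + scale_bits / block_size

FP16_BITS = 16.0            # assumed baseline precision for keys and values
k_bits = bits_per_value(8)  # q8_0-style keys: 8.5 bits per value
v_bits = bits_per_value(4)  # q4_0-style values: 4.5 bits per value

def kv_memory_reduction(k_bits: float, v_bits: float, baseline: float = FP16_BITS) -> float:
    """Fraction of memory saved when K and V hold the same number of elements."""
    return 1.0 - (k_bits + v_bits) / (2 * baseline)

reduction = kv_memory_reduction(k_bits, v_bits)
print(f"{reduction:.1%}")  # 59.4%
```

Under these assumptions the per-block scale overhead explains why the saving is about 59% rather than the 62.5% that bare 8-bit and 4-bit payloads would give.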

acai66/qwen2.5_numpy

Implements the inference process of DeepSeek-R1-Distill-Qwen-1.5B using numpy, making it easy to learn LLM (Large Language Model) inference and to port it to other programming languages for acceleration.

Language: Python - Size: 31.3 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 8 - Forks: 0

awinml/llama-cpp-python-bindings

Run fast LLM Inference using Llama.cpp in Python

Language: Jupyter Notebook - Size: 16.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 1

RAHB-REALTORS-Association/email-autodrafts 📦

Email Auto-ReplAI is a Python tool that uses AI to automate drafting responses to unread Gmail messages, streamlining email management tasks.

Language: Python - Size: 35.2 KB - Last synced at: 4 days ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 2

rudolfolah/metatron

Metatron is a project that brings together whisper.cpp, llama.cpp, and piper into a deployable stack with an awesome Node.js API wrapper for each of them.

Language: JavaScript - Size: 34.2 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 7 - Forks: 0

rbourgeat/llm-cmd

✨ LLM CMD is a toolbox allowing you to use LLM in daily developer commands 💻

Language: Python - Size: 196 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

mizy/local-agent-chat

A Flutter chat UI for llama.cpp

Language: Metal - Size: 2.52 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 1

kevinknights29/Llama-v2-GPU-GTX-1650

Running Llama v2 with llama.cpp on a GTX 1650 with 4 GB of VRAM.

Language: Python - Size: 51.8 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

ShahabSH94/AutoCompleter

Auto Complete anything using a gguf model

Language: Python - Size: 1.27 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

mili-tan/Onllama.GGUFLinkOut

Creates symbolic links to the GGUF models in Ollama blobs, for use in other applications such as llama.cpp, Jan, and LM Studio.

Language: C# - Size: 15.6 KB - Last synced at: 30 days ago - Pushed at: 5 months ago - Stars: 5 - Forks: 1

viniciusarruda/llama-cpp-chat-completion-wrapper

Wrapper around llama-cpp-python for chat completion with LLaMA v2 models.

Language: Jupyter Notebook - Size: 145 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 0

bdqfork/go-llama.cpp

Go binding for llama.cpp, offering low-level and high-level APIs

Language: Go - Size: 52.7 KB - Last synced at: 4 days ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 0

statikfintechllc/GodCore

All-in-one local AI stack for Mistral-13B and Llama.cpp, with one-step CUDA wheel install, OpenAI-compatible API, and modern web dashboard. Switch between local and cloud chat, run on your own GPU, and deploy instantly—no API keys or paywalls. Designed for easy install, custom builds, and fast remote access. Enjoy!

Language: Python - Size: 25.5 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 4 - Forks: 0

LISA-ITMO/LLM-resume-moderator

Automates the moderation of Russian-language résumés using LLMs. Moderation uses t-it-1.0 models.

Language: Jupyter Notebook - Size: 1.22 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 4 - Forks: 1

SullyGreene/TinyAGI

TinyAGI is a lightweight, modular, and extensible Python-based AGI framework designed to create and manage AI agents seamlessly. It supports various model backends like OpenAI, Llama.cpp, Ollama, AlpacaX, and Tabitha, along with dynamic plugin loading for enhanced flexibility.

Language: Python - Size: 271 KB - Last synced at: 17 days ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 1

countzero/windows_manage_large_language_models

PowerShell automation to download large language models (LLMs) from Git repositories and quantize them with llama.cpp into the GGUF format.

Language: PowerShell - Size: 32.2 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

ossirytk/llm_resources

Information and resources on everything related to running large language models locally and to their development

Size: 2.64 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 1

RAHB-REALTORS-Association/transcriber-describer 📦

Transcribes videos and describes them with OpenAI APIs or local models.

Language: Python - Size: 45.9 KB - Last synced at: 4 days ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

eniompw/llama-cpp-gpu

Load larger models by offloading model layers to both GPU and CPU

Language: Jupyter Notebook - Size: 109 KB - Last synced at: 4 days ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

Demyblue/llm-hacker-news

LLM plugin for pulling content from Hacker News

Language: Python - Size: 10.7 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

kaust-generative-ai/local-deployment-of-generative-ai-models

Training materials on how to deploy generative AI models locally on your laptop or workstation.

Size: 6.94 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

jWinman91/AI-OCR

An AI-powered but model-agnostic optical character recognition (OCR) tool

Language: Python - Size: 79.1 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

haschka/CLI-RAG

Command-line tool to interact with a llama.cpp server. Also implements a basic vector database with cosine similarity search.

Language: C - Size: 43.9 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

dwain-barnes/LLM-GGUF-Auto-Converter

Automated Jupyter notebook solution for batch converting Large Language Models to GGUF format with multiple quantization options. Built on llama.cpp with HuggingFace integration.

Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: 21 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 2

SwamiKannan/LlamaCpp-Install-Procedure-in-Windows-and-CUDA

Size: 2.61 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 2 - Forks: 0

shekharP1536/ollama-web

Ollama Web UI is a simple yet powerful web-based interface for interacting with large language models. It offers chat history, voice commands, voice output, model download and management, conversation saving, terminal access, multi-model chat, and more—all in one streamlined platform.

Language: JavaScript - Size: 1.17 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 2 - Forks: 1