Topic: "vllm"
meta-llama/llama-cookbook
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family and how to use the models on various provider services.
Language: Jupyter Notebook - Size: 214 MB - Last synced at: 3 days ago - Pushed at: 6 days ago - Stars: 17,273 - Forks: 2,476

xorbitsai/inference
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
Language: Python - Size: 44.9 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 7,804 - Forks: 665

OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)
Language: Python - Size: 2.54 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 6,674 - Forks: 651

katanaml/sparrow
Data processing and instruction calling with ML, LLM and Vision LLM
Language: Python - Size: 11.3 MB - Last synced at: 3 days ago - Pushed at: 7 days ago - Stars: 4,523 - Forks: 460

xlite-dev/Awesome-LLM-Inference
📚A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism etc.
Language: Python - Size: 115 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 3,982 - Forks: 277

gpustack/gpustack
Manage GPU clusters for running AI models
Language: Python - Size: 95.2 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2,680 - Forks: 266

containers/ramalama
Ramalama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.
Language: Python - Size: 2.64 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,600 - Forks: 183

bricks-cloud/BricksLLM
🔒 Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained access control and monitoring per user, application, or environment. Supports OpenAI, Azure OpenAI, Anthropic, vLLM, and open-source LLMs.
Language: Go - Size: 3.89 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 955 - Forks: 66

substratusai/kubeai
AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
Language: Go - Size: 15.9 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 924 - Forks: 84

prometheus-eval/prometheus-eval
Evaluate your LLM's response with Prometheus and GPT4 💯
Language: Python - Size: 15 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 915 - Forks: 55

harleyszhang/llm_note
LLM notes, including model inference, transformer model structure, and llm framework code analysis notes.
Language: Python - Size: 177 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 765 - Forks: 78

mostlygeek/llama-swap
Model swapping for llama.cpp (or any local OpenAPI compatible server)
Language: Go - Size: 907 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 738 - Forks: 38

vllm-project/vllm-ascend
Community maintained hardware plugin for vLLM on Ascend
Language: Python - Size: 1.28 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 603 - Forks: 130

pgalko/BambooAI
A Python library powered by Language Models (LLMs) for conversational data discovery and analysis.
Language: Python - Size: 7.65 MB - Last synced at: about 2 hours ago - Pushed at: 2 days ago - Stars: 581 - Forks: 58

apconw/sanic-web
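A lightweight, full-pipeline large-model application project that is easy to extend (Large Model Data Assistant). Supports large models such as DeepSeek and Qwen2.5. A one-stop large-model application development project built on Dify, Ollama & vLLM, Sanic, and Text2SQL 📊, with a modern UI built with Vue3, TypeScript, and Vite 5. It supports chart-based question answering over data via ECharts 📈 and table question answering over CSV files 📂. It can also connect easily to third-party open-source RAG retrieval systems 🌐 to support broad general-knowledge question answering.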
Language: JavaScript - Size: 144 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 571 - Forks: 99

jakobdylanc/llmcord
Make Discord your LLM frontend ● Supports any OpenAI compatible API (Ollama, LM Studio, vLLM, OpenRouter, xAI, Mistral, Groq and more)
Language: Python - Size: 159 KB - Last synced at: about 3 hours ago - Pushed at: 2 days ago - Stars: 553 - Forks: 117

ModelCloud/GPTQModel
Production ready LLM model compression/quantization toolkit with hw accelerated inference support for both cpu/gpu via HF, vLLM, and SGLang.
Language: Python - Size: 12 MB - Last synced at: 3 days ago - Pushed at: 7 days ago - Stars: 540 - Forks: 77

ModelTC/llmc
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
Language: Python - Size: 28.9 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 470 - Forks: 53

varunvasudeva1/llm-server-docs
Documentation on setting up an LLM server on Debian from scratch, using Ollama/vLLM, Open WebUI, OpenedAI Speech/Kokoro FastAPI, and ComfyUI.
Size: 32.2 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 435 - Forks: 37

varunshenoy/super-json-mode
Low latency JSON generation using LLMs ⚡️
Language: Jupyter Notebook - Size: 652 KB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 400 - Forks: 14

HuiResearch/FlashTTS
High-quality Chinese speech synthesis and voice cloning service based on models such as SparkTTS and OrpheusTTS.
Language: Python - Size: 31.1 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 383 - Forks: 53

microsoft/vidur
A large-scale simulation framework for LLM inference
Language: Python - Size: 156 MB - Last synced at: about 19 hours ago - Pushed at: 6 months ago - Stars: 374 - Forks: 65

runpod-workers/worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
Language: Python - Size: 26.4 MB - Last synced at: about 18 hours ago - Pushed at: about 19 hours ago - Stars: 314 - Forks: 153

chtmp223/topicGPT
TopicGPT: A Prompt-Based Framework for Topic Modeling (NAACL'24)
Language: Python - Size: 828 KB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 290 - Forks: 47

jasonacox/TinyLLM
Set up and run a local LLM and chatbot using consumer-grade hardware.
Language: JavaScript - Size: 493 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 245 - Forks: 28

FlagOpen/RoboBrain
[CVPR 2025] RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete. Official Repository.
Language: Python - Size: 13.2 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 189 - Forks: 10

lucasjinreal/Namo-R1
A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from scratch with ease.
Language: Python - Size: 1.13 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 185 - Forks: 17

shell-nlp/gpt_server
gpt_server is an open-source framework for production-grade deployment of LLMs, Embedding, Reranker, ASR, and TTS models.
Language: Python - Size: 4.67 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 173 - Forks: 16

InftyAI/llmaz
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Language: Go - Size: 9.88 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 165 - Forks: 27

NetEase-Media/grps
Deep Learning Deployment Framework: Supports tf/torch/trt/trtllm/vllm and other NN frameworks. Supports dynamic batching and streaming modes. It is dual-language compatible with Python and C++, offering scalability, extensibility, and high performance. It helps users quickly deploy models and provide services through HTTP/RPC interfaces.
Language: C++ - Size: 67.8 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 157 - Forks: 13

gotzmann/booster
Booster - open accelerator for LLM models. Better inference and debugging for AI hackers
Language: C++ - Size: 144 MB - Last synced at: 5 days ago - Pushed at: 9 months ago - Stars: 155 - Forks: 7

yoziru/nextjs-vllm-ui
Fully-featured, beautiful web interface for vLLM - built with NextJS.
Language: TypeScript - Size: 6.07 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 135 - Forks: 18

JackYFL/awesome-VLLMs
This repository collects papers on VLLM applications. We will update new papers irregularly.
Size: 939 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 119 - Forks: 12

IDEA-Research/RexSeek
Referring any person or objects given a natural language description. Code base for RexSeek and HumanRef Benchmark
Language: Python - Size: 9.55 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 112 - Forks: 8

nbasyl/DoRA 📦
Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation"
Size: 557 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 104 - Forks: 2

Trainy-ai/llm-atc 📦
Fine-tuning and serving LLMs on any cloud
Language: Python - Size: 1.71 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 2

ALucek/ppt2desc
Convert PowerPoint files into semantically rich text using vision language models
Language: Python - Size: 1.42 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 84 - Forks: 7

OpenCSGs/llm-inference
llm-inference is a platform for publishing and managing llm inference, providing a wide range of out-of-the-box features for model deployment, such as UI, RESTful API, auto-scaling, computing resource management, monitoring, and more.
Language: Python - Size: 602 KB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 80 - Forks: 16

llmariner/llmariner
Extensible generative AI platform on Kubernetes with OpenAI-compatible APIs.
Language: Go - Size: 7.87 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 67 - Forks: 5

hyperai/vllm-cn
vLLM Documentation in Simplified Chinese
Language: TypeScript - Size: 7.45 MB - Last synced at: 7 days ago - Pushed at: 22 days ago - Stars: 63 - Forks: 5

VectorInstitute/vector-inference
Efficient LLM inference on Slurm clusters using vLLM.
Language: Python - Size: 2.79 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 59 - Forks: 10

intelligentnode/IntelliChat
Modern AI chatbot supporting multiple LLMs. Switch between Gemini, Mistral, Llama, Claude and ChatGPT.
Language: TypeScript - Size: 20.8 MB - Last synced at: 5 days ago - Pushed at: 2 months ago - Stars: 55 - Forks: 17

aws-samples/easy-model-deployer
A user-friendly command-line/SDK tool that makes it quick and easy to deploy open-source LLMs on AWS
Language: Python - Size: 45 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 42 - Forks: 7

argonne-lcf/LLM-Inference-Bench
LLM-Inference-Bench
Language: Jupyter Notebook - Size: 11.2 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 39 - Forks: 4

jeremyarancio/VLM-Batch-Deployment
Batch Deployment for Document Parsing with AWS Batch & Qwen-2.5-VL
Language: Jupyter Notebook - Size: 398 KB - Last synced at: 8 days ago - Pushed at: 18 days ago - Stars: 36 - Forks: 13

gameofdimension/vllm-cn
Demonstrates the remarkable effect of vllm on Chinese large language models
Language: Jupyter Notebook - Size: 152 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 31 - Forks: 1

lework/llm-benchmark
LLM concurrency performance testing tool with support for automated stress testing and performance report generation.
Language: Python - Size: 117 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 28 - Forks: 6

LLM-inference-router/vllm-router
vLLM Router
Language: Python - Size: 45.9 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 28 - Forks: 1

phospho-app/fastassert
Dockerized LLM inference server with constrained output (JSON mode), built on top of vLLM and outlines. Faster, cheaper and without rate limits. Compare the quality and latency to your current LLM API provider.
Language: Jupyter Notebook - Size: 176 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 27 - Forks: 0

France-Travail/happy_vllm
A REST API for vLLM, production ready
Language: Python - Size: 859 KB - Last synced at: 17 days ago - Pushed at: 25 days ago - Stars: 20 - Forks: 2

zRzRzRzRzRzRzR/lm-fly
Acceleration for large-model inference frameworks, making LLMs fly
Language: Python - Size: 7.45 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 18 - Forks: 4

vam876/LocalAPI.AI
LocalAPI.AI is a local AI management tool for Ollama, offering Web UI management and compatibility with vLLM, LM Studio, llama.cpp, Mozilla-Llamafile, Jan AI, Cortex API, Local-LLM, LiteLLM, GPT4All, and more.
Language: HTML - Size: 1.25 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 17 - Forks: 0

YY0649/ICE-PIXIU
ICE-PIXIU: A Cross-Language Financial Megamodeling Framework
Language: Python - Size: 118 MB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 17 - Forks: 0

NEOS-AI/Neosearch
AI-based search engine done right
Language: TypeScript - Size: 99 MB - Last synced at: about 3 hours ago - Pushed at: about 4 hours ago - Stars: 16 - Forks: 0

scitix/arks
Arks is a cloud-native inference framework running on Kubernetes
Language: Go - Size: 1.78 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 16 - Forks: 4

sherlockchou86/PyLangPipe
a simple lightweight large language model pipeline framework.
Language: Python - Size: 790 KB - Last synced at: 27 days ago - Pushed at: 28 days ago - Stars: 16 - Forks: 2

Climatik-Project/Climatik-Project
Carbon Limiting Auto Tuning for Kubernetes
Language: Go - Size: 411 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 16 - Forks: 7

automatika-robotics/ros-agents
ROS Agents is a fully-loaded framework for creating interactive embodied agents that can understand, remember, and act upon contextual information from their environment.
Language: Python - Size: 3.64 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 15 - Forks: 0

hcd233/Aris-AI-Model-Server
An OpenAI-compatible API that integrates LLM, Embedding, and Reranker models.
Language: Python - Size: 1.05 MB - Last synced at: 8 days ago - Pushed at: 28 days ago - Stars: 14 - Forks: 1

lamalab-org/macbench
Probing the limitations of multimodal language models for chemistry and materials research
Language: Python - Size: 2.18 GB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 14 - Forks: 0

hienhayho/rag-colls
Collection of recent advanced RAG techniques.
Language: Python - Size: 10.1 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 12 - Forks: 4

itelnov/skeernir
UI to deploy agents locally and customise interaction with them
Language: Python - Size: 13.8 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 11 - Forks: 2

iNeil77/vllm-code-harness
Run code inference-only benchmarks quickly using vLLM
Language: Python - Size: 814 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 11 - Forks: 0

neuralmagic/nm-vllm-certs
General Information, model certifications, and benchmarks for nm-vllm enterprise distributions
Size: 877 KB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 11 - Forks: 2

joydeb28/llm-lab
LLM, Fine Tuning, Llama 2, Gemma, Mixtral, vLLM, LangChain, RAG, ChromaDB, FAISS
Language: Jupyter Notebook - Size: 123 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 6

blib-la/ask-poddy
Ask Poddy: Run Open Source LLMs and Embeddings as OpenAI-Compatible Serverless Endpoints (Tutorial)
Language: TypeScript - Size: 7.18 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 10 - Forks: 1

jparkerweb/down-craft
📑 npm package to Craft files into Markdown with ease
Language: JavaScript - Size: 17.4 MB - Last synced at: 12 days ago - Pushed at: 4 months ago - Stars: 9 - Forks: 1

kyegomez/SimpleUnet
A simple implementation of Unet because all the implementations I've seen are way too complicated.
Language: Python - Size: 205 KB - Last synced at: 8 days ago - Pushed at: 6 months ago - Stars: 9 - Forks: 1

wangcx18/llm-vscode-inference-server
An endpoint server for efficiently serving quantized open-source LLMs for code.
Language: Python - Size: 85 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 1

qizhou000/VisEdit
[AAAI 2025 oral] Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit
Language: Python - Size: 3.46 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 8 - Forks: 0

KevinLee1110/dynamic-batching
The official repo for the paper "Optimizing LLM Inference Throughput via Memory-aware and SLA-constrained Dynamic Batching"
Size: 11.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 8 - Forks: 1

FreeIPCC/LLM-ContactCenter-AI-CallCenter
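LLM Call Center, AI Call Center, large-model call center, large-model customer service system. It can connect to mainstream and private models: OpenAI, LLaMA, Kimi, Tongyi Qianwen, Zhipu AI, iFlytek Spark, Gemini, Xorbits Inference, Amazon Bedrock, Volcano Engine, Tencent Hunyuan, Claude, Bard, DeepSeek, Azure OpenAI, Qianfan, Ollama, qwen, vLLM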
Language: TypeScript - Size: 23.2 MB - Last synced at: about 11 hours ago - Pushed at: about 12 hours ago - Stars: 7 - Forks: 4

vectara/mirage-bench
Repository for Multilingual Generation, RAG evaluations, and surrogate judge training for Arena RAG leaderboard (NAACL'25)
Language: Python - Size: 2.8 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 7 - Forks: 0

Tangkfan/Awesome-Temporal-Video-Grounding
Paper list on Video Moment Retrieval (VMR), Temporal Video Grounding (TVG), Video Grounding (VG), or Temporal Sentence Grounding in Videos (TSGV)
Size: 59.6 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 6 - Forks: 0

bluechanel/deploy_llm
Rapid Deployment of LLM and Embedding Based on VLLM Using Docker
Language: Python - Size: 375 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 6 - Forks: 1

arc53/doc2md
Convert pdf and image files into markdown
Language: TypeScript - Size: 269 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 6 - Forks: 0

France-Travail/benchmark_llm_serving
A library to benchmark LLMs via their API exposure
Language: Python - Size: 8.04 MB - Last synced at: 17 days ago - Pushed at: 5 months ago - Stars: 6 - Forks: 0

NetEase-Media/grps_vllm
[grps integration with vllm] Implements an LLM service through the vllm LLMEngine API.
Language: Python - Size: 177 KB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 6 - Forks: 0

ivangabriele/docker-llm
Pre-loaded LLMs served as an OpenAI-Compatible API via Docker images.
Language: Dockerfile - Size: 199 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 2

itsvaibhav01/Immune
[CVPR2025] Official Repository for IMMUNE: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment
Language: Python - Size: 2.77 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 5 - Forks: 0

ivangabriele/docker-functionary
Ready-to-deploy Docker image for Functionary LLM served as an OpenAI-Compatible API.
Language: Dockerfile - Size: 27.3 KB - Last synced at: 5 days ago - Pushed at: 4 months ago - Stars: 5 - Forks: 1

asprenger/ray_vllm_inference
A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
Language: Python - Size: 81.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 1

NJUxlj/Chinese-MedQA-Qwen2
A medical question-answering system based on Qwen2 + SFT + DPO. The project uses LLaMA-Factory for training and fastllm and vllm for inference.
Language: Python - Size: 523 KB - Last synced at: about 14 hours ago - Pushed at: about 15 hours ago - Stars: 4 - Forks: 0

stackav-oss/conch
A "standard library" of Triton kernels.
Language: Python - Size: 293 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 4 - Forks: 0

aws-samples/multi-modal-examples-for-amazon-sagemaker
A workshop for collections of multi-modal LLM examples, samples, reference architecture and demos on Amazon SageMaker.
Language: Jupyter Notebook - Size: 33.7 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 4 - Forks: 2

moeru-ai/demodel
🚀🛸 Easily boost the speed of pulling your models and datasets from various inference runtimes. (e.g. 🤗 HuggingFace, 🐫 Ollama, vLLM, and more!)
Language: Rust - Size: 47.9 KB - Last synced at: 4 days ago - Pushed at: 2 months ago - Stars: 4 - Forks: 0

yas-sim/openvino_genai_sample_codes
OpenVINO.genai sample codes with a helper class that supports vLLM-like iterator-based streaming output.
Language: Python - Size: 6.84 KB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 4 - Forks: 0

esmailza/Llama2-vLLM-LangChain-knowledge-graph
Preserving entities through the integration of knowledge graphs, Llama 2, vLLM, and LangChain.
Language: Python - Size: 763 KB - Last synced at: 5 days ago - Pushed at: 12 months ago - Stars: 4 - Forks: 0

gusanmaz/echosight
EchoSight is a tool that helps visually impaired individuals by audibly describing images taken with a Raspberry Pi Camera or inputted via image path or URL across different operating systems.
Language: Python - Size: 213 KB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

cosmic-heart/AI-Learning-Platform
AI-Learning-Platform, an LLM-RAG pipeline that behaves like a guide and is able to resolve doubts. Deployed on-premise on IBM ppc64le architecture. Uses vLLM for model inference and Qdrant with Langchain for the RAG pipeline. The server is written in Django, with Postgres and Cassandra as the SQL and NoSQL databases.
Language: Python - Size: 1.71 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 3 - Forks: 0

0-mostafa-rezaee-0/Batch_LLM_Inference_with_Ray_Data_LLM
Batch LLM Inference with Ray Data LLM: From Simple to Advanced
Language: Jupyter Notebook - Size: 1.63 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 1

Getty/langertha
Perl Framework for AI - Langertha - the viking of AI
Language: Perl - Size: 326 KB - Last synced at: 23 days ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

Joshue2006/LLM-Reasoner
Make any LLM think like OpenAI o1 and DeepSeek R1
Size: 1.95 KB - Last synced at: 22 days ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

claw1200/llama-cord
Discord App for Interacting with local Ollama Models. Multiple Agents Supported!
Language: Python - Size: 60.5 KB - Last synced at: 26 days ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

lucataco/cog-Hermes-2-Pro-Llama-3-8B
Cog wrapper for NousResearch/Hermes-2-Pro-Llama-3-8B
Language: Python - Size: 3.91 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 1

TimeSurgeLabs/promptproxy
Call many AIs from a single API.
Language: Go - Size: 284 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

lklivingstone/sih_2023
A Large Language Model based tool for generating human-like responses to natural language inputs on networks not connected to the internet.
Language: Python - Size: 646 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

Liquid4All/on-prem-stack
Scripts to launch Liquid on-prem stack
Language: Shell - Size: 195 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2 - Forks: 1

umi-AIGC-saas/umi_ai_cms
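A dual-driven intelligent AI system that connects to the mainstream large AI models currently on the market and classifies them algorithmically according to their strengths and weaknesses. By combining the strengths of the various large AI models, Wuyou AI Brain (无忧AI智脑) can provide more accurate and more reliable information and answers.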
Language: Python - Size: 4.16 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 2 - Forks: 0

xxrjun/local-inference
🐑 Run LLM inference locally for various downstream applications.
Language: Shell - Size: 2.53 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0
