GitHub topics: llm-server
onnx/turnkeyml
Local LLM Server with NPU Acceleration
Language: Python - Size: 1.9 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 173 - Forks: 23

pikocloud/pikobrain
Function-calling API for LLMs from multiple providers
Language: Go - Size: 408 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 5 - Forks: 0

fcn94/llm_stream_endpoint
Simple LLM REST API using Rust, Warp, and Candle. Dedicated to quantized versions of phi-2 (default), Mistral, or Llama. Works on CPU or CUDA
Language: Rust - Size: 74.2 KB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 1
