GitHub topics: batch-inference
ARYAN555279/Batch_LLM_Inference_with_Ray_Data_LLM
Batch LLM Inference with Ray Data LLM: From Simple to Advanced
Language: Dockerfile - Size: 1.5 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 1
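The core idea behind batch LLM inference, as in this repo and Ray Data LLM generally, is to map one inference call over fixed-size batches of records instead of calling the model once per record. A minimal framework-free sketch of that pattern (the `fake_llm` function is a hypothetical stand-in for a real batched model call):

```python
from typing import Callable, Iterator

def batched(items: list, batch_size: int) -> Iterator[list]:
    """Yield successive fixed-size batches from a list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def fake_llm(batch: list) -> list:
    # Stand-in for a real batched model call (e.g. a vLLM generate step).
    return ["echo: " + p for p in batch]

def run_batch_inference(prompts: list, infer: Callable, batch_size: int = 4) -> list:
    outputs = []
    for batch in batched(prompts, batch_size):
        # One model call per batch amortizes per-call overhead across items.
        outputs.extend(infer(batch))
    return outputs

results = run_batch_inference(["prompt %d" % i for i in range(10)], fake_llm, batch_size=4)
```

Ray Data LLM wraps the same idea in a distributed dataset pipeline; this sketch only shows the batching logic itself.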
kimmmmyy223/llm-batch
Process JSON data in batches with `llm-batch`, leveraging sequential or parallel modes for efficient interaction with LLMs.
Language: Go - Size: 1.3 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0
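The sequential-versus-parallel distinction this tool describes is a common pattern for LLM API clients: requests are independent, so parallel mode overlaps network latency with worker threads, while sequential mode is simpler to debug. A hedged Python sketch of the two modes (`call_llm` is a hypothetical stand-in for the real API call):

```python
import json
from concurrent.futures import ThreadPoolExecutor

def call_llm(record: dict) -> dict:
    # Hypothetical stand-in for a per-record LLM API request.
    return {"id": record["id"], "answer": record["q"].upper()}

def process(records: list, mode: str = "sequential", workers: int = 4) -> list:
    if mode == "parallel":
        # Threads overlap I/O latency; pool.map preserves input order.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(call_llm, records))
    return [call_llm(r) for r in records]  # sequential fallback

lines = ['{"id": 1, "q": "hi"}', '{"id": 2, "q": "bye"}']
records = [json.loads(line) for line in lines]
out = process(records, mode="parallel")
```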
mili-tan/Onllama.OllamaBatch
Simple Ollama JSONL batch inference tool.
Language: C# - Size: 82 KB - Last synced at: 15 days ago - Pushed at: 18 days ago - Stars: 2 - Forks: 2
tungngreen/PipelineScheduler
PipelineScheduler optimizes workload distribution between servers and edge devices, setting optimal batch sizes to maximize throughput and minimize latency amid content dynamics and network instability. It also addresses resource contention with spatiotemporal inference scheduling to reduce co-location interference.
Language: C++ - Size: 14 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 9 - Forks: 2
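The batch-size tuning PipelineScheduler performs rests on a standard trade-off: larger batches raise throughput but also raise latency. Under a simple linear latency model (an assumption for illustration, not the repo's actual cost model), the throughput-optimal choice is the largest batch that still meets the latency SLO:

```python
def latency_ms(batch_size: int, fixed_ms: float = 20.0, per_item_ms: float = 3.0) -> float:
    # Assumed linear model: fixed launch cost plus a per-item cost.
    return fixed_ms + per_item_ms * batch_size

def best_batch_size(slo_ms: float, max_batch: int = 64) -> int:
    # Throughput = batch_size / latency grows with batch size under this
    # model, so take the largest batch whose latency still meets the SLO.
    best = 1
    for b in range(1, max_batch + 1):
        if latency_ms(b) <= slo_ms:
            best = b
    return best

# With a 50 ms SLO: 20 + 3b <= 50, so b <= 10.
chosen = best_batch_size(50.0)
```

Real schedulers like this one additionally account for network instability and co-location interference, which makes the latency model dynamic rather than fixed.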
milenkovicm/lightfusion
LightGBM inference on DataFusion
Language: Rust - Size: 9.82 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0
sutro-sh/sutro
Analyze and generate unstructured data using LLMs, from quick experiments to billion token jobs.
Language: Python - Size: 20.5 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 9 - Forks: 1
milenkovicm/torchfusion
TorchFusion is a very opinionated Torch inference engine on DataFusion.
Language: Rust - Size: 93.8 KB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 5 - Forks: 0
0-mostafa-rezaee-0/Batch_LLM_Inference_with_Ray_Data_LLM
Batch LLM Inference with Ray Data LLM: From Simple to Advanced
Language: Jupyter Notebook - Size: 1.63 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 1
ttanzhiqiang/onnx_tensorrt_project
Supports YOLOv5 (4.0/5.0), YOLOR, YOLOX, YOLOv4, YOLOv3, CenterNet, CenterFace, RetinaFace, classification, and U-Net. Converts Darknet/LibTorch/PyTorch/MXNet models to ONNX, then to TensorRT.
Language: C++ - Size: 30.7 MB - Last synced at: 10 months ago - Pushed at: over 4 years ago - Stars: 214 - Forks: 43
brnaguiar/mlops-next-watch
MLOps project that recommends movies to watch, implementing data engineering and MLOps best practices.
Language: Jupyter Notebook - Size: 3.46 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1
ray-project/ray-saturday-dec-2022
Ray Saturday Dec 2022 edition
Language: Jupyter Notebook - Size: 18 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 1
yuwenmichael/Grounding-DINO-Batch-Inference
Supports batch inference for Grounding DINO. "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Language: Jupyter Notebook - Size: 7.73 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0
jasper881108/audio-foundation-finetune
Repo for Whisper tuning in the Transcribe ecosystem, including fine-tuning, web crawling, label rendering, and batch inference.
Language: Python - Size: 8.19 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0
louisoutin/yolov5_torchserve
TorchServe server running a YOLOv5 model in Docker with GPU support and static batch inference, for production-ready, real-time inference.
Language: Python - Size: 446 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 89 - Forks: 17
SABER-labs/torch_batcher
Serve PyTorch inference requests using Redis-backed batching for faster performance.
Language: Python - Size: 13.7 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0
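The dynamic-batching pattern behind torch_batcher is: a worker blocks on a shared queue for the first pending request, then greedily drains whatever else has arrived (up to a cap) and runs the whole batch through the model in one forward pass. A minimal sketch using an in-memory `queue.Queue` in place of Redis, with string reversal as a stand-in for inference:

```python
import queue
import threading

requests = queue.Queue()   # stands in for a Redis list of pending requests
results = {}

def batch_worker(max_batch: int = 8, stop_after: int = 5) -> None:
    """Drain up to max_batch pending requests and process them together."""
    processed = 0
    while processed < stop_after:
        batch = [requests.get()]                    # block for the first item
        while len(batch) < max_batch:
            try:
                batch.append(requests.get_nowait())  # greedily fill the batch
            except queue.Empty:
                break
        for item in batch:                          # stand-in for one batched forward pass
            results[item] = item[::-1]
        processed += len(batch)

for i in range(5):
    requests.put("req%d" % i)
worker = threading.Thread(target=batch_worker)
worker.start()
worker.join()
```

With Redis instead of an in-memory queue, multiple client processes can enqueue requests and the worker survives restarts; the batching logic itself is unchanged.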
kyoro1/image_analysis_with_automl_in_azure
This repository provides sample code showing how to use AutoML image classification or object detection in an Azure ML (AML) environment.
Language: Jupyter Notebook - Size: 692 KB - Last synced at: almost 3 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0
rohanchauhan/azure-batch-inference-service
Batch inference on a lead-scoring task using PySpark.
Language: Jupyter Notebook - Size: 522 KB - Last synced at: almost 3 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0
una-ai-mlops-agency/ML-Batch-Serving
[WIP] Advanced workshop covering ML Batch serving on Azure
Size: 7.81 KB - Last synced at: 9 months ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0