GitHub topics: batch-inference
ARYAN555279/Batch_LLM_Inference_with_Ray_Data_LLM
Batch LLM Inference with Ray Data LLM: From Simple to Advanced
Language: Dockerfile - Size: 1.5 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 1
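The core idea behind batch LLM inference, as in this repo and Ray Data LLM generally, is to map one inference call over fixed-size batches of records instead of calling the model once per record. A minimal framework-free sketch of that pattern (the `fake_llm` function is a hypothetical stand-in for a real batched model call):

```python
from typing import Callable, Iterator

def batched(items: list, batch_size: int) -> Iterator[list]:
    """Yield successive fixed-size batches from a list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def fake_llm(batch: list) -> list:
    # Stand-in for a real batched model call (e.g. a vLLM generate step).
    return ["echo: " + p for p in batch]

def run_batch_inference(prompts: list, infer: Callable, batch_size: int = 4) -> list:
    outputs = []
    for batch in batched(prompts, batch_size):
        # One model call per batch amortizes per-call overhead across items.
        outputs.extend(infer(batch))
    return outputs

results = run_batch_inference(["prompt %d" % i for i in range(10)], fake_llm, batch_size=4)
```

Ray Data LLM wraps the same idea in a distributed dataset pipeline; this sketch only shows the batching logic itself.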
kimmmmyy223/llm-batch
Process JSON data in batches with `llm-batch`, leveraging sequential or parallel modes for efficient interaction with LLMs.
Language: Go - Size: 1.3 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0
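The sequential-versus-parallel distinction this tool describes is a common pattern for LLM API clients: requests are independent, so parallel mode overlaps network latency with worker threads, while sequential mode is simpler to debug. A hedged Python sketch of the two modes (`call_llm` is a hypothetical stand-in for the real API call):

```python
import json
from concurrent.futures import ThreadPoolExecutor

def call_llm(record: dict) -> dict:
    # Hypothetical stand-in for a per-record LLM API request.
    return {"id": record["id"], "answer": record["q"].upper()}

def process(records: list, mode: str = "sequential", workers: int = 4) -> list:
    if mode == "parallel":
        # Threads overlap I/O latency; pool.map preserves input order.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(call_llm, records))
    return [call_llm(r) for r in records]  # sequential fallback

lines = ['{"id": 1, "q": "hi"}', '{"id": 2, "q": "bye"}']
records = [json.loads(line) for line in lines]
out = process(records, mode="parallel")
```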
mili-tan/Onllama.OllamaBatch
Simple Ollama JSONL batch inference tool.
Language: C# - Size: 82 KB - Last synced at: 15 days ago - Pushed at: 18 days ago - Stars: 2 - Forks: 2
tungngreen/PipelineScheduler
PipelineScheduler optimizes workload distribution between servers and edge devices, setting optimal batch sizes to maximize throughput and minimize latency amid content dynamics and network instability. It also addresses resource contention with spatiotemporal inference scheduling to reduce co-location interference.
Language: C++ - Size: 14 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 9 - Forks: 2
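The batch-size tuning PipelineScheduler performs rests on a standard trade-off: larger batches raise throughput but also raise latency. Under a simple linear latency model (an assumption for illustration, not the repo's actual cost model), the throughput-optimal choice is the largest batch that still meets the latency SLO:

```python
def latency_ms(batch_size: int, fixed_ms: float = 20.0, per_item_ms: float = 3.0) -> float:
    # Assumed linear model: fixed launch cost plus a per-item cost.
    return fixed_ms + per_item_ms * batch_size

def best_batch_size(slo_ms: float, max_batch: int = 64) -> int:
    # Throughput = batch_size / latency grows with batch size under this
    # model, so take the largest batch whose latency still meets the SLO.
    best = 1
    for b in range(1, max_batch + 1):
        if latency_ms(b) <= slo_ms:
            best = b
    return best

# With a 50 ms SLO: 20 + 3b <= 50, so b <= 10.
chosen = best_batch_size(50.0)
```

Real schedulers like this one additionally account for network instability and co-location interference, which makes the latency model dynamic rather than fixed.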
milenkovicm/lightfusion
LightGBM inference on DataFusion
Language: Rust - Size: 9.82 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0
sutro-sh/sutro
Analyze and generate unstructured data using LLMs, from quick experiments to billion token jobs.
Language: Python - Size: 20.5 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 9 - Forks: 1
milenkovicm/torchfusion
TorchFusion is a very opinionated Torch inference engine on DataFusion.
Language: Rust - Size: 93.8 KB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 5 - Forks: 0
0-mostafa-rezaee-0/Batch_LLM_Inference_with_Ray_Data_LLM
Batch LLM Inference with Ray Data LLM: From Simple to Advanced
Language: Jupyter Notebook - Size: 1.63 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 1
ttanzhiqiang/onnx_tensorrt_project
Supports YOLOv5 (4.0/5.0), YOLOR, YOLOX, YOLOv4, YOLOv3, CenterNet, CenterFace, RetinaFace, classification, and U-Net. Converts Darknet/LibTorch/PyTorch/MXNet models to ONNX, then to TensorRT.
Language: C++ - Size: 30.7 MB - Last synced at: 10 months ago - Pushed at: over 4 years ago - Stars: 214 - Forks: 43
brnaguiar/mlops-next-watch
MLOps project that recommends movies to watch, implementing data engineering and MLOps best practices.
Language: Jupyter Notebook - Size: 3.46 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1
ray-project/ray-saturday-dec-2022
Ray Saturday Dec 2022 edition
Language: Jupyter Notebook - Size: 18 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 1
yuwenmichael/Grounding-DINO-Batch-Inference
Supports batch inference for Grounding DINO. "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Language: Jupyter Notebook - Size: 7.73 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0
jasper881108/audio-foundation-finetune
Repo for Whisper tuning in the Transcribe ecosystem, including fine-tuning, web crawling, label rendering, and batch inference.
Language: Python - Size: 8.19 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0
louisoutin/yolov5_torchserve
TorchServe server running a YOLOv5 model in Docker with GPU support and static batch inference, for production-ready, real-time inference.
Language: Python - Size: 446 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 89 - Forks: 17
SABER-labs/torch_batcher
Serve PyTorch inference requests using Redis-backed batching for faster performance.
Language: Python - Size: 13.7 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0
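The dynamic-batching pattern behind torch_batcher is: a worker blocks on a shared queue for the first pending request, then greedily drains whatever else has arrived (up to a cap) and runs the whole batch through the model in one forward pass. A minimal sketch using an in-memory `queue.Queue` in place of Redis, with string reversal as a stand-in for inference:

```python
import queue
import threading

requests = queue.Queue()   # stands in for a Redis list of pending requests
results = {}

def batch_worker(max_batch: int = 8, stop_after: int = 5) -> None:
    """Drain up to max_batch pending requests and process them together."""
    processed = 0
    while processed < stop_after:
        batch = [requests.get()]                    # block for the first item
        while len(batch) < max_batch:
            try:
                batch.append(requests.get_nowait())  # greedily fill the batch
            except queue.Empty:
                break
        for item in batch:                          # stand-in for one batched forward pass
            results[item] = item[::-1]
        processed += len(batch)

for i in range(5):
    requests.put("req%d" % i)
worker = threading.Thread(target=batch_worker)
worker.start()
worker.join()
```

With Redis instead of an in-memory queue, multiple client processes can enqueue requests and the worker survives restarts; the batching logic itself is unchanged.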
kyoro1/image_analysis_with_automl_in_azure
This repository provides sample code showing how to use AutoML image classification or object detection in an Azure ML (AML) environment.
Language: Jupyter Notebook - Size: 692 KB - Last synced at: almost 3 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0
rohanchauhan/azure-batch-inference-service
Batch inference on a lead-scoring task using PySpark.
Language: Jupyter Notebook - Size: 522 KB - Last synced at: almost 3 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0
una-ai-mlops-agency/ML-Batch-Serving
[WIP] Advanced workshop covering ML Batch serving on Azure
Size: 7.81 KB - Last synced at: 9 months ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0