An open API service providing repository metadata for many open source software ecosystems.

Topic: "pipeline-parallelism"

hpcaitech/ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python - Size: 62.9 MB - Last synced at: about 13 hours ago - Pushed at: about 13 hours ago - Stars: 40,819 - Forks: 4,496

deepspeedai/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python - Size: 217 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 38,040 - Forks: 4,342

bigscience-workshop/petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

Language: Python - Size: 4.06 MB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 9,591 - Forks: 553

kakaobrain/torchgpipe

A GPipe implementation in PyTorch

Language: Python - Size: 449 KB - Last synced at: 7 days ago - Pushed at: 9 months ago - Stars: 836 - Forks: 99

PaddlePaddle/PaddleFleetX

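PaddlePaddle large-model development suite, providing end-to-end development toolchains for large language models, cross-modal large models, biological computing large models, and other domains.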

Language: Python - Size: 637 MB - Last synced at: 2 days ago - Pushed at: 11 months ago - Stars: 465 - Forks: 164

Coobiw/MPP-LLaVA

Personal Project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-training-like MLLM on RTX3090/4090 24GB.

Language: Jupyter Notebook - Size: 73.1 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 420 - Forks: 23

Oneflow-Inc/libai

LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training

Language: Python - Size: 34.6 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 402 - Forks: 56

InternLM/InternEvo

InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.

Language: Python - Size: 6.77 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 382 - Forks: 64

alibaba/EasyParallelLibrary

Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.

Language: Python - Size: 771 KB - Last synced at: 22 days ago - Pushed at: about 2 years ago - Stars: 267 - Forks: 49

Shenggan/awesome-distributed-ml

A curated list of awesome projects and papers for distributed training or inference

Size: 44.9 KB - Last synced at: 4 days ago - Pushed at: 7 months ago - Stars: 231 - Forks: 27

torchpipe/torchpipe

Serving Inside Pytorch

Language: C++ - Size: 41.4 MB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 160 - Forks: 13

ai-decentralized/BloomBee

Decentralized LLM fine-tuning and inference with offloading

Language: Python - Size: 36.6 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 88 - Forks: 13

xrsrke/pipegoose

Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*

Language: Python - Size: 1.26 MB - Last synced at: about 2 hours ago - Pushed at: over 1 year ago - Stars: 82 - Forks: 18

AlibabaPAI/DAPPLE

An Efficient Pipelined Data Parallel Approach for Training Large Models

Language: Python - Size: 1.64 MB - Last synced at: 13 days ago - Pushed at: over 4 years ago - Stars: 73 - Forks: 17

ParCIS/Chimera

Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.

Language: Python - Size: 1.05 MB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 62 - Forks: 8

saareliad/FTPipe

FTPipe and related pipeline model parallelism research.

Language: Python - Size: 11.4 MB - Last synced at: 13 days ago - Pushed at: almost 2 years ago - Stars: 41 - Forks: 7

nawnoes/pytorch-gpt-x

Implementation of an autoregressive language model using an improved Transformer and DeepSpeed pipeline parallelism.

Language: Python - Size: 2.98 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 29 - Forks: 2

fanpu/DynPartition

Official implementation of DynPartition: Automatic Optimal Pipeline Parallelism of Dynamic Neural Networks over Heterogeneous GPU Systems for Inference Tasks

Language: Python - Size: 135 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 0

torchpipe/torchpipe.github.io

Docs for torchpipe: https://github.com/torchpipe/torchpipe

Language: MDX - Size: 7.86 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 4 - Forks: 1

LER0ever/HPGO

Development of Project HPGO | Hybrid Parallelism Global Orchestration

Size: 5.29 MB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 0

garg-aayush/model-parallelism

Model parallelism for NN architectures with skip connections (e.g., ResNets, UNets)

Language: Python - Size: 6.85 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

1set-t/ai-model

Industrial-grade weather visualization system that transforms AI model predictions into professional meteorological plots, emphasizing operational forecasting capabilities.

Size: 1.95 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

Related Topics
model-parallelism (11), pytorch (11), deep-learning (9), data-parallelism (8), machine-learning (6), tensor-parallelism (4), inference (4), large-scale (4), distributed-training (4), gpipe (3), transformer (3), distributed-systems (3), nlp (3), gpt (2), serving (2), tensorrt (2), gpu (2), large-language-models (2), llama (2), mixture-of-experts (2), sequence-parallelism (2), fine-tuning (2), pretraining (2), self-supervised-learning (2), deepspeed (2), transformers (2), neural-networks (2), deployment (2), transformers-models (1), distribution-strategy-planner (1), ring-attention (1), ray (1), multi-modal (1), llm-training (1), llm-framework (1), serve (1), triton-inference-server (1), llava (1), llama3 (1), internlm2 (1), internlm (1), gemma (1), flash-attention (1), deepspeed-ulysses (1), 910b (1), torch2trt (1), t5 (1), deep-neural-networks (1), hybrid-parallelism (1), big-model (1), distributed-computing (1), foundation-models (1), ai (1), zero (1), trillion-parameters (1), compression (1), billion-parameters (1), vision-transformer (1), oneflow (1), python (1), heterogeneous-training (1), hpc (1), pose-estimation (1), object-detection (1), llm-serving (1), multimodal (1), mlops (1), llms (1), image-classification (1), face-detection (1), data-science (1), zero3 (1), 3d-parallelism (1), volunteer-computing (1), pretrained-models (1), mixtral (1), language-models (1), guanaco (1), falcon (1), chatbot (1), bloom (1), parallelism (1), checkpointing (1), memory-efficient (1), treelstm (1), scheduling (1), reinforcement-learning (1), dynpartition (1), dynamic-neural-network (1), distributed-deep-learning (1), tensorflow (1), rust (1), pipedream (1), high-performance-computing (1), unsupervised-learning (1), paddlepaddle (1), paddlecloud (1), lightning (1), fleet-api (1), elastic (1)