model-parallelism | Topic | Ecosyste.ms: Repos

Topic: "model-parallelism"

hpcaitech/ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python - Size: 62.8 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 40,831 - Forks: 4,499

deepspeedai/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python - Size: 217 MB - Last synced at: about 3 hours ago - Pushed at: about 4 hours ago - Stars: 38,178 - Forks: 4,349

kakaobrain/torchgpipe

A GPipe implementation in PyTorch

Language: Python - Size: 449 KB - Last synced at: 12 days ago - Pushed at: 9 months ago - Stars: 836 - Forks: 99

PaddlePaddle/PaddleFleetX

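PaddlePaddle large-model development suite, providing a full-pipeline development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.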

Language: Python - Size: 637 MB - Last synced at: 7 days ago - Pushed at: 11 months ago - Stars: 465 - Forks: 164

Oneflow-Inc/libai

LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training

Language: Python - Size: 34.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 402 - Forks: 56

kaiyuyue/torchshard

Slicing a PyTorch Tensor Into Parallel Shards

Language: Python - Size: 4.8 MB - Last synced at: about 7 hours ago - Pushed at: almost 4 years ago - Stars: 298 - Forks: 15

alibaba/EasyParallelLibrary

Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.

Language: Python - Size: 771 KB - Last synced at: 27 days ago - Pushed at: about 2 years ago - Stars: 267 - Forks: 49

Shenggan/awesome-distributed-ml

A curated list of awesome projects and papers for distributed training or inference

Size: 44.9 KB - Last synced at: 9 days ago - Pushed at: 7 months ago - Stars: 231 - Forks: 27

xrsrke/pipegoose

Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*

Language: Python - Size: 1.26 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 82 - Forks: 18

tanyuqian/redco

NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference

Language: Python - Size: 11.5 MB - Last synced at: 1 day ago - Pushed at: 5 months ago - Stars: 65 - Forks: 7

hkproj/pytorch-transformer-distributed

Distributed training (multi-node) of a Transformer model

Language: Python - Size: 4.03 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 63 - Forks: 26

NERSC/sc23-dl-tutorial

SC23 Deep Learning at Scale Tutorial Material

Language: Python - Size: 15.7 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 41 - Forks: 9

ryantd/veloce

WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous.

Language: Python - Size: 9.13 MB - Last synced at: 21 days ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 0

AlibabaPAI/FlashModels

Fast and easy distributed model training examples.

Language: Python - Size: 42.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 9 - Forks: 4

Shenggan/atp

Adaptive Tensor Parallelism for Foundation Models

Language: Python - Size: 3.22 MB - Last synced at: 25 days ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 0

atakehiro/3D-U-Net-pytorch-model-parallel

PyTorch implementation of 3D U-Net with model parallelism across 2 GPUs for large models

Language: Python - Size: 85 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 9 - Forks: 1

dlzou/computron

Serving distributed deep learning models with model parallel swapping.

Language: Jupyter Notebook - Size: 2.1 MB - Last synced at: 18 days ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 1

fanpu/DynPartition

Official implementation of DynPartition: Automatic Optimal Pipeline Parallelism of Dynamic Neural Networks over Heterogeneous GPU Systems for Inference Tasks

Language: Python - Size: 135 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 0

sjlee25/legion-readme

Description of Framework for Efficient Fused-layer Cost Estimation, Legion (2021)

Size: 1.58 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

LER0ever/HPGO

Development of Project HPGO | Hybrid Parallelism Global Orchestration

Size: 5.29 MB - Last synced at: 3 days ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 0

dscpesu/NetTorrent

A decentralized and distributed framework for training DNNs

Language: Python - Size: 9.92 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 0

NERSC/dl-at-scale-training

Deep Learning at Scale Training Event at NERSC

Language: Python - Size: 17.3 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 2

zjc664656505/LinguaLinked

Distributed-Parallelism over Heterogeneous Devices

Language: Python - Size: 302 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

AnveshaM/Enhancing-performance-of-big-data-machine-learning-models-on-Google-Cloud-Platform

The project is focused on parallelising pre-processing, measuring and machine learning in the cloud, as well as the evaluation and analysis of the cloud performance.

Language: Jupyter Notebook - Size: 8.88 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

garg-aayush/model-parallelism

Model parallelism for NN architectures with skip connections (e.g. ResNets, UNets)

Language: Python - Size: 6.85 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

EunjuYang/distributed-tf

Distributed TensorFlow (model parallelism) example repository

Language: Python - Size: 8.79 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 2 - Forks: 0

d4l3k/axe

A simple graph partitioning algorithm written in Go. Designed for partitioning neural networks across multiple devices, where crossing device boundaries incurs an added cost.

Language: Go - Size: 9.77 KB - Last synced at: 29 days ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

zhuangsc/altsplit

An MPI-based distributed model parallelism technique for MLP

Language: C - Size: 43.9 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 2

mkrdip/alcf

Contains materials from an internship at ALCF during the summer of 2019

Language: Python - Size: 24.4 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

joelrorseth/HyperTune

A fully distributed hyperparameter optimization tool for PyTorch DNNs

Language: Python - Size: 6.48 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

ankahira/chainermnx

Extended ChainerMN

Language: Python - Size: 338 KB - Last synced at: 12 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

olk/mnist-performance

Performance test of MNIST handwritten digits using MXNet + TF

Language: Python - Size: 22.5 KB - Last synced at: about 1 month ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

mzj14/mesh (fork of tensorflow/mesh)

Mesh TensorFlow: Model Parallelism Made Easier

Language: Python - Size: 1.01 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0