Topic: "tensor-parallelism"
bigscience-workshop/petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
Language: Python - Size: 4.06 MB - Last synced at: 6 days ago - Pushed at: 8 months ago - Stars: 9,616 - Forks: 554

InternLM/InternEvo
InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
Language: Python - Size: 6.73 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 386 - Forks: 65

kaiyuyue/torchshard
Slicing a PyTorch Tensor Into Parallel Shards
Language: Python - Size: 4.8 MB - Last synced at: 18 days ago - Pushed at: almost 4 years ago - Stars: 298 - Forks: 15
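The idea behind slicing a tensor into parallel shards can be sketched with plain NumPy (this is an illustrative simulation of column-parallel sharding, not torchshard's actual API; the array names are hypothetical):

```python
# Column-parallel linear layer, simulated on one process with NumPy:
# the weight matrix is split along its output dimension, each "rank"
# computes its slice of the output, and slices are concatenated.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # activations: (batch, d_in)
W = rng.standard_normal((8, 16))   # full weight:  (d_in, d_out)

world_size = 4                            # pretend 4 ranks
shards = np.split(W, world_size, axis=1)  # column-parallel: split d_out

# Each rank multiplies by its own shard; column parallelism needs
# only a concatenation at the end, no reduction across ranks.
partials = [x @ w for w in shards]
y_parallel = np.concatenate(partials, axis=1)

assert np.allclose(y_parallel, x @ W)
```

On real hardware each shard lives on a different device and the concatenation becomes an all-gather collective.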

ai-decentralized/BloomBee
Decentralized LLM fine-tuning and inference with offloading
Language: Python - Size: 36.6 MB - Last synced at: about 4 hours ago - Pushed at: 6 days ago - Stars: 91 - Forks: 15

xrsrke/pipegoose
Large-scale 4D-parallel pre-training for 🤗 transformers with Mixture of Experts *(still a work in progress)*
Language: Python - Size: 1.26 MB - Last synced at: 23 days ago - Pushed at: over 1 year ago - Stars: 82 - Forks: 18

ShinoharaHare/LLM-Training
A distributed training framework for large language models powered by Lightning.
Language: Python - Size: 281 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 19 - Forks: 4

aniquetahir/JORA
JORA: JAX Tensor-Parallel LoRA Library
Language: Python - Size: 6.89 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 18 - Forks: 0

AlibabaPAI/FlashModels
Fast and easy distributed model training examples.
Language: Python - Size: 42.9 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 9 - Forks: 4

fattorib/transformer_shmap
Tensor Parallelism with JAX + Shard Map
Language: Python - Size: 85 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
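The shard-map style of tensor parallelism is often paired with a row-parallel layer: the input dimension is split across ranks and partial outputs are summed with an all-reduce. A minimal NumPy simulation of that math (no real JAX devices or `shard_map` calls assumed):

```python
# Row-parallel linear layer, simulated on one process with NumPy:
# activations are split along d_in, the weight along its rows, and
# each rank's full-shaped partial output is summed ("all-reduce").
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 8))    # activations: (batch, d_in)
W = rng.standard_normal((8, 16))   # full weight:  (d_in, d_out)

world_size = 4
x_shards = np.split(x, world_size, axis=1)  # split activations along d_in
w_shards = np.split(W, world_size, axis=0)  # row-parallel: split d_in

# Each partial product already has the final output shape (4, 16);
# summing over ranks stands in for the all-reduce collective.
partials = [xs @ ws for xs, ws in zip(x_shards, w_shards)]
y_parallel = np.sum(partials, axis=0)

assert np.allclose(y_parallel, x @ W)
```

Alternating column-parallel and row-parallel layers is the classic Megatron-style layout, since the intermediate activations then never need to be gathered between the two layers.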
