GitHub topics: distributed-parallel-training
300degree/MPI-Cluster
Making an MPI Cluster using two nodes (1 Master, 1 Slave) within the same LAN
Language: C - Size: 104 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

AdpartSim/AdpartSim
AdpartSim is a distributed parallel training simulation tool for data centers. It helps study and simulate parallel optimization strategies for large models (LMs), as well as the impact of network topology and collective communication on LM training efficiency.
Language: C++ - Size: 360 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1
