GitHub topics: large-model
knagrecha/saturn
Saturn accelerates the training of large-scale deep learning models with a novel joint optimization approach.
Language: Python - Size: 107 KB - Last synced at: about 14 hours ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 5

thuml/OpenLTM
Implementations, Pre-training Code and Datasets of Large Time-Series Models
Language: Jupyter Notebook - Size: 2.85 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 351 - Forks: 25

Time-MoE/Time-MoE
[ICLR 2025 Spotlight] Official implementation of "Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts"
Language: Python - Size: 700 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 656 - Forks: 65
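As a rough illustration of how such a time-series foundation model might be used (a minimal sketch, assuming the released checkpoints are published on the Hugging Face Hub under an ID like "Maple728/TimeMoE-50M" and load through transformers with trust_remote_code; the model ID, normalization step, and generate-style forecasting call are assumptions, not taken from this listing):

    import torch
    from transformers import AutoModelForCausalLM

    # Load a pretrained Time-MoE checkpoint (model ID is an assumption).
    model = AutoModelForCausalLM.from_pretrained(
        "Maple728/TimeMoE-50M",
        device_map="cpu",
        trust_remote_code=True,  # custom time-series modeling code ships with the checkpoint
    )

    # Toy input: a batch of 2 series, each with 12 context points.
    context = torch.randn(2, 12)

    # Normalize per series, forecast 6 steps autoregressively, then de-normalize.
    mean = context.mean(dim=-1, keepdim=True)
    std = context.std(dim=-1, keepdim=True)
    horizon = 6
    out = model.generate((context - mean) / std, max_new_tokens=horizon)
    forecast = out[:, -horizon:] * std + mean
    print(forecast.shape)  # torch.Size([2, 6])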

long123524/TFNet
Official code: "Integrating Segment Anything Model derived boundary prior and high-level semantics for cropland extraction from high-resolution remote sensing images".
Language: Python - Size: 829 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 13 - Forks: 2

QInzhengk/Math-Model-and-Machine-Learning
Notes and resources on mathematical modeling, machine learning, deep learning, and large models (continuously updated).
Language: Jupyter Notebook - Size: 7.12 GB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 487 - Forks: 116

OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many other LMs, such as miniGPT4, StableLM, and MOSS.
Language: Python - Size: 20.7 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 3,239 - Forks: 262

whisperpine/ollama-compose
Docker Compose setup for Ollama.
Language: YAML - Size: 1000 Bytes - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0
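Once such a stack is up, Ollama's HTTP API can be queried directly. A minimal sketch, assuming the compose file runs the official ollama/ollama image on its default port 11434 and that a model (here "llama3", an assumed name) has already been pulled into the container:

    import requests

    # Ollama's default port; assumes the compose file maps 11434:11434.
    OLLAMA_URL = "http://localhost:11434"

    # Single-shot generation against a model pulled beforehand,
    # e.g. `docker compose exec ollama ollama pull llama3` (model name is an assumption).
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])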

asigalov61/Giant-Music-Transformer
[SOTA] [92% acc] 786M-8k-44L-32H multi-instrumental music transformer with the true full MIDI instrument range, efficient encoding, octo-velocity, and outro tokens
Language: Python - Size: 13.6 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 43 - Forks: 6

Shenggan/atp
Adaptive Tensor Parallelism for Foundation Models
Language: Python - Size: 3.22 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 0

AdpartSim/AdpartSim
A distributed parallel training simulation tool (AdpartSim) for data centers. It helps study and simulate parallel optimization strategies for large models (LMs), as well as the impact of network topology and collective communication on LM training efficiency.
Language: C++ - Size: 360 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 1

windson/inferentia-deployments
Deploy Large Models on AWS Inferentia (Inf2) instances.
Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

FreeTwilight/OpenPrompt
FreeTwilight OpenPrompt is a prompt engineering framework and solution for large AI model pre-training.
Size: 3.91 KB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0
