GitHub topics: large-model
knagrecha/saturn
Saturn accelerates the training of large-scale deep learning models with a novel joint optimization approach.
Language: Python - Size: 107 KB - Last synced at: about 14 hours ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 5

thuml/OpenLTM
Implementations, Pre-training Code and Datasets of Large Time-Series Models
Language: Jupyter Notebook - Size: 2.85 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 351 - Forks: 25

Time-MoE/Time-MoE
[ICLR 2025 Spotlight] Official implementation of "Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts"
Language: Python - Size: 700 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 656 - Forks: 65
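As a rough illustration of how such a time-series foundation model might be used (a minimal sketch, assuming the released checkpoints are published on the Hugging Face Hub under an ID like "Maple728/TimeMoE-50M" and load through transformers with trust_remote_code; the model ID, normalization step, and generate-style forecasting call are assumptions, not taken from this listing):

    import torch
    from transformers import AutoModelForCausalLM

    # Load a pretrained Time-MoE checkpoint (model ID is an assumption).
    model = AutoModelForCausalLM.from_pretrained(
        "Maple728/TimeMoE-50M",
        device_map="cpu",
        trust_remote_code=True,  # custom time-series modeling code ships with the checkpoint
    )

    # Toy input: a batch of 2 series, each with 12 context points.
    context = torch.randn(2, 12)

    # Normalize per series, forecast 6 steps autoregressively, then de-normalize.
    mean = context.mean(dim=-1, keepdim=True)
    std = context.std(dim=-1, keepdim=True)
    horizon = 6
    out = model.generate((context - mean) / std, max_new_tokens=horizon)
    forecast = out[:, -horizon:] * std + mean
    print(forecast.shape)  # torch.Size([2, 6])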

long123524/TFNet
Official code: "Integrating Segment Anything Model derived boundary prior and high-level semantics for cropland extraction from high-resolution remote sensing images".
Language: Python - Size: 829 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 13 - Forks: 2

QInzhengk/Math-Model-and-Machine-Learning
Notes and resources on mathematical modeling, machine learning, deep learning, and large models (continuously updated).
Language: Jupyter Notebook - Size: 7.12 GB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 487 - Forks: 116

OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many other LMs, such as miniGPT4, StableLM, and MOSS.
Language: Python - Size: 20.7 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 3,239 - Forks: 262

whisperpine/ollama-compose
Docker Compose setup for Ollama.
Language: YAML - Size: 1000 Bytes - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0
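Once such a stack is up, Ollama's HTTP API can be queried directly. A minimal sketch, assuming the compose file runs the official ollama/ollama image on its default port 11434 and that a model (here "llama3", an assumed name) has already been pulled into the container:

    import requests

    # Ollama's default port; assumes the compose file maps 11434:11434.
    OLLAMA_URL = "http://localhost:11434"

    # Single-shot generation against a model pulled beforehand,
    # e.g. `docker compose exec ollama ollama pull llama3` (model name is an assumption).
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])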

asigalov61/Giant-Music-Transformer
[SOTA] [92% acc] 786M-8k-44L-32H multi-instrumental music transformer with the true full MIDI instrument range, efficient encoding, octo-velocity, and outro tokens
Language: Python - Size: 13.6 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 43 - Forks: 6

Shenggan/atp
Adaptive Tensor Parallelism for Foundation Models
Language: Python - Size: 3.22 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 0

AdpartSim/AdpartSim
A distributed parallel training simulation tool (AdpartSim) for data centers. It helps study and simulate parallel optimization strategies for large models (LMs), as well as the impact of network topology and collective communication on LM training efficiency.
Language: C++ - Size: 360 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 1

windson/inferentia-deployments
Deploy Large Models on AWS Inferentia (Inf2) instances.
Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

FreeTwilight/OpenPrompt
FreeTwilight OpenPrompt is a prompt engineering framework and solution for large AI model pre-training.
Size: 3.91 KB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0
