GitHub topics: large-model
QInzhengk/Math-Model-and-Machine-Learning
Notes and resources on mathematical modeling, machine learning, deep learning, and large models (continuously updated).
Language: Jupyter Notebook - Size: 7.12 GB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 485 - Forks: 116

OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Language: Python - Size: 20.7 MB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 3,233 - Forks: 262

Time-MoE/Time-MoE
[ICLR 2025 Spotlight] Official implementation of "Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts"
Language: Python - Size: 697 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 588 - Forks: 55

thuml/OpenLTM
Open Implementations of Large Time-Series Models
Language: Python - Size: 2.42 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 258 - Forks: 13

long123524/TFNet
Official code: "Integrating Segment Anything Model derived boundary prior and high-level semantics for cropland extraction from high-resolution remote sensing images"
Language: Python - Size: 819 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 6 - Forks: 2

knagrecha/saturn
Saturn accelerates the training of large-scale deep learning models with a novel joint optimization approach.
Language: Python - Size: 107 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 5

whisperpine/ollama-compose
Docker Compose setup for Ollama.
Language: YAML - Size: 1000 Bytes - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

asigalov61/Giant-Music-Transformer
[SOTA] [92% acc] 786M-8k-44L-32H multi-instrumental music transformer with true full MIDI instruments range, efficient encoding, octo-velocity and outro tokens
Language: Python - Size: 13.6 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 43 - Forks: 6

Shenggan/atp
Adaptive Tensor Parallelism for Foundation Models
Language: Python - Size: 3.22 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 0

AdpartSim/AdpartSim
AdpartSim is a distributed parallel training simulation tool for data centers. It helps study and simulate parallel optimization strategies for Large Models (LM), as well as the impact of network topology and collective communication on LM training efficiency.
Language: C++ - Size: 360 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

windson/inferentia-deployments
Deploy Large Models on AWS Inferentia (Inf2) instances.
Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

FreeTwilight/OpenPrompt
FreeTwilight OpenPrompt is a prompt engineering framework and solution for large AI model pre-training.
Size: 3.91 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0
