Topic: "mixture-of-experts"
deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Language: Python - Size: 217 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 38,300 - Forks: 4,360

dvmazur/mixtral-offloading
Run Mixtral-8x7B models in Colab or on consumer desktops
Language: Python - Size: 261 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 2,311 - Forks: 232

codelion/optillm
Optimizing inference proxy for LLMs
Language: Python - Size: 1.7 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2,226 - Forks: 174

learning-at-home/hivemind
Decentralized deep learning in PyTorch. Built to train models across thousands of volunteer devices around the world.
Language: Python - Size: 12.1 MB - Last synced at: 4 days ago - Pushed at: 11 days ago - Stars: 2,176 - Forks: 186

PKU-YuanGroup/MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
Language: Python - Size: 16.5 MB - Last synced at: 3 days ago - Pushed at: 5 months ago - Stars: 2,158 - Forks: 133

rhymes-ai/Aria
Codebase for Aria - an Open Multimodal Native MoE
Language: Jupyter Notebook - Size: 120 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 995 - Forks: 83

pjlab-sys4nlp/llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Language: Python - Size: 1.69 MB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 961 - Forks: 56

microsoft/Tutel
Tutel MoE: an optimized Mixture-of-Experts library with support for DeepSeek FP8/FP4
Language: C - Size: 1.11 MB - Last synced at: 2 days ago - Pushed at: 4 days ago - Stars: 820 - Forks: 97

davidmrau/mixture-of-experts
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
Language: Python - Size: 73.2 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 818 - Forks: 88
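
For orientation, here is a minimal PyTorch sketch of the plain top-k gating idea behind that paper (the original adds noise and load-balancing terms on top of this); class and argument names are illustrative, not this repo's API.

```python
# Minimal sketch of a sparsely-gated MoE layer in the spirit of Shazeer et al. (2017).
# Names are illustrative; the paper additionally uses noisy gating and auxiliary losses.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, dim: int, hidden: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim); route each token to its top-k experts
        logits = self.gate(x)                               # (tokens, experts)
        topv, topi = logits.topk(self.k, dim=-1)            # keep the k largest logits
        weights = F.softmax(topv, dim=-1)                   # renormalise over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topi[:, slot] == e                   # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = SparseMoE(dim=64, hidden=256)
y = moe(torch.randn(10, 64))   # (10, 64)
```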

SMTorg/smt
Surrogate Modeling Toolbox
Language: Jupyter Notebook - Size: 163 MB - Last synced at: 16 days ago - Pushed at: 19 days ago - Stars: 755 - Forks: 215

lucidrains/mixture-of-experts
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
Language: Python - Size: 136 KB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 744 - Forks: 59

AviSoori1x/makeMoE
A from-scratch implementation of a sparse mixture-of-experts language model, inspired by Andrej Karpathy's makemore :)
Language: Jupyter Notebook - Size: 6.96 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 686 - Forks: 73

drawbridge/keras-mmoe
A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)
Language: Python - Size: 9.11 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 670 - Forks: 219
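
For contrast with token-routed MoE layers, a rough PyTorch-style sketch of the multi-gate idea (the repo itself is Keras): experts are shared across tasks, but each task gets its own softmax gate and tower head. All names and sizes are illustrative.

```python
# Hedged sketch of multi-gate mixture-of-experts (MMoE) for multi-task learning.
# Not the repo's Keras API; a PyTorch illustration of the structure only.
import torch
import torch.nn as nn

class MMoE(nn.Module):
    def __init__(self, in_dim=128, expert_dim=64, num_experts=4, num_tasks=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, expert_dim), nn.ReLU())
            for _ in range(num_experts)
        )
        self.gates = nn.ModuleList(nn.Linear(in_dim, num_experts) for _ in range(num_tasks))
        self.towers = nn.ModuleList(nn.Linear(expert_dim, 1) for _ in range(num_tasks))

    def forward(self, x):
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, experts, expert_dim)
        outputs = []
        for gate, tower in zip(self.gates, self.towers):
            w = torch.softmax(gate(x), dim=-1).unsqueeze(-1)           # (batch, experts, 1)
            mixed = (w * expert_out).sum(dim=1)                        # task-specific mixture
            outputs.append(tower(mixed))                               # one prediction per task
        return outputs

preds = MMoE()(torch.randn(32, 128))   # list of two task predictions
```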

ymcui/Chinese-Mixtral
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
Language: Python - Size: 519 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 603 - Forks: 44

lucidrains/st-moe-pytorch
Implementation of ST-MoE, the latest incarnation of mixture-of-experts after years of research at Google Brain, in PyTorch
Language: Python - Size: 178 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 326 - Forks: 28

lucidrains/soft-moe-pytorch
Implementation of Soft MoE, proposed by Google Brain's vision team, in PyTorch
Language: Python - Size: 1.38 MB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 290 - Forks: 8
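
A hedged sketch of the Soft MoE mechanism as described in the paper (not this repo's API): each slot takes a soft combination of all tokens, each expert processes its slots, and each output token softly recombines all slot outputs. Names and sizes are illustrative.

```python
# Hedged sketch of Soft MoE ("From Sparse to Soft Mixtures of Experts").
import torch
import torch.nn as nn

class SoftMoE(nn.Module):
    def __init__(self, dim=64, num_experts=4, slots_per_expert=2):
        super().__init__()
        self.num_experts, self.slots = num_experts, slots_per_expert
        self.phi = nn.Parameter(torch.randn(dim, num_experts * slots_per_expert) * dim ** -0.5)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                          # x: (batch, tokens, dim)
        logits = x @ self.phi                      # (batch, tokens, experts*slots)
        dispatch = logits.softmax(dim=1)           # normalise over tokens: one mixture per slot
        combine = logits.softmax(dim=2)            # normalise over slots: one mixture per token
        slot_in = dispatch.transpose(1, 2) @ x     # (batch, experts*slots, dim)
        slot_in = slot_in.view(x.size(0), self.num_experts, self.slots, -1)
        slot_out = torch.stack([e(slot_in[:, i]) for i, e in enumerate(self.experts)], dim=1)
        slot_out = slot_out.reshape(x.size(0), self.num_experts * self.slots, -1)
        return combine @ slot_out                  # (batch, tokens, dim)

y = SoftMoE()(torch.randn(2, 16, 64))
```

Because every token contributes to every slot, the layer is fully differentiable and needs no load-balancing loss, at the cost of dense (rather than sparse) expert inputs.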

Luodian/Generalizable-Mixture-of-Experts
GMoE could be the next backbone model for many kinds of generalization tasks.
Language: Python - Size: 2.04 MB - Last synced at: 9 days ago - Pushed at: about 2 years ago - Stars: 269 - Forks: 35

inferflow/inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
Language: C++ - Size: 1.89 MB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 243 - Forks: 25

SkyworkAI/MoH
MoH: Multi-Head Attention as Mixture-of-Head Attention
Language: Python - Size: 5.26 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 233 - Forks: 9

efeslab/fiddler
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
Language: Python - Size: 1.72 MB - Last synced at: about 3 hours ago - Pushed at: 6 months ago - Stars: 210 - Forks: 20

EfficientMoE/MoE-Infinity
PyTorch library for cost-effective, fast and easy serving of MoE models.
Language: Python - Size: 457 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 179 - Forks: 13

koayon/awesome-adaptive-computation
A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).
Size: 331 KB - Last synced at: 12 days ago - Pushed at: 5 months ago - Stars: 143 - Forks: 9

eduardzamfir/seemoredetails
[ICML 2024] See More Details: Efficient Image Super-Resolution by Experts Mining
Language: Python - Size: 10.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 138 - Forks: 2

lucidrains/PEER-pytorch
PyTorch implementation of the PEER block from the paper "Mixture of A Million Experts" by Xu Owen He at DeepMind
Language: Python - Size: 271 KB - Last synced at: 7 days ago - Pushed at: 9 months ago - Stars: 123 - Forks: 3
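
A simplified, hedged sketch of the product-key retrieval idea behind PEER: a huge pool of single-neuron experts is indexed by the Cartesian product of two small key sets, so each half of the query only needs to be compared against sqrt(N) keys. Single head, illustrative names, not the repo's implementation.

```python
# Hedged, simplified PEER-style retrieval: product keys select top-k single-neuron experts.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PEERSketch(nn.Module):
    def __init__(self, dim=64, side=32, topk=8):                # side**2 = 1024 experts
        super().__init__()
        self.side, self.topk = side, topk
        self.query = nn.Linear(dim, dim)
        self.keys_a = nn.Parameter(torch.randn(side, dim // 2))
        self.keys_b = nn.Parameter(torch.randn(side, dim // 2))
        num_experts = side * side
        self.down = nn.Embedding(num_experts, dim)              # each expert: one input neuron
        self.up = nn.Embedding(num_experts, dim)                # ...and one output neuron

    def forward(self, x):                                       # x: (tokens, dim)
        q = self.query(x)
        qa, qb = q.chunk(2, dim=-1)
        sa, ia = (qa @ self.keys_a.t()).topk(self.topk, dim=-1) # best half-keys per half
        sb, ib = (qb @ self.keys_b.t()).topk(self.topk, dim=-1)
        scores = sa[:, :, None] + sb[:, None, :]                # (tokens, topk, topk) candidates
        idx = ia[:, :, None] * self.side + ib[:, None, :]       # candidate expert ids
        best, pos = scores.flatten(1).topk(self.topk, dim=-1)   # final top-k experts
        expert_ids = idx.flatten(1).gather(1, pos)              # (tokens, topk)
        g = F.softmax(best, dim=-1)                             # routing weights
        h = torch.einsum('td,tkd->tk', x, self.down(expert_ids))
        return torch.einsum('tk,tk,tkd->td', g, F.gelu(h), self.up(expert_ids))

y = PEERSketch()(torch.randn(4, 64))
```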

shufangxun/LLaVA-MoD
[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
Language: Python - Size: 3.41 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 120 - Forks: 7

lucidrains/mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention blocks, akin to mixture-of-experts
Language: Python - Size: 34.1 MB - Last synced at: 7 days ago - Pushed at: 7 months ago - Stars: 118 - Forks: 4

Adlith/MoE-Jetpack
[NeurIPS 24] MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks
Language: Python - Size: 32.3 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 115 - Forks: 1

YangLing0818/RealCompo
[NeurIPS 2024] RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
Language: Python - Size: 7.45 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 115 - Forks: 4

arpita8/Awesome-Mixture-of-Experts-Papers
Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts.
Size: 2.21 MB - Last synced at: 8 days ago - Pushed at: 9 months ago - Stars: 115 - Forks: 3

relf/egobox
Efficient global optimization toolbox in Rust: Bayesian optimization, mixtures of Gaussian processes, sampling methods
Language: Rust - Size: 11.9 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 112 - Forks: 6

liuqidong07/MOELoRA-peft
[SIGIR'24] The official implementation code of MOELoRA.
Language: Python - Size: 10.2 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 105 - Forks: 11

kyegomez/SwitchTransformers
Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"
Language: Python - Size: 2.42 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 100 - Forks: 12
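
A hedged sketch of the Switch-style routing that paper describes: each token goes to a single expert (top-1), and an auxiliary loss pushes the router toward a balanced expert load. Illustrative names only; the capacity factor and expert dropping are omitted, and this is not this repo's API.

```python
# Hedged sketch of Switch Transformer routing: top-1 expert choice plus load-balancing loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchLayer(nn.Module):
    def __init__(self, dim=64, num_experts=8):
        super().__init__()
        self.router = nn.Linear(dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                                    # x: (tokens, dim)
        probs = F.softmax(self.router(x), dim=-1)            # (tokens, experts)
        gate, idx = probs.max(dim=-1)                        # top-1 choice per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                out[mask] = gate[mask].unsqueeze(-1) * expert(x[mask])
        # load-balancing loss: fraction of tokens per expert times mean router prob per expert
        frac = F.one_hot(idx, len(self.experts)).float().mean(dim=0)
        aux_loss = len(self.experts) * (frac * probs.mean(dim=0)).sum()
        return out, aux_loss

y, aux = SwitchLayer()(torch.randn(32, 64))
```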

LINs-lab/DynMoE
[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Language: Python - Size: 57.3 MB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 89 - Forks: 11

xrsrke/pipegoose
Large-scale 4D-parallel pre-training of Mixture-of-Experts models for 🤗 transformers *(still a work in progress)*
Language: Python - Size: 1.26 MB - Last synced at: 21 days ago - Pushed at: over 1 year ago - Stars: 82 - Forks: 18

OpenSparseLLMs/LLaMA-MoE-v2
🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
Language: Python - Size: 2.21 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 78 - Forks: 11

Leeroo-AI/mergoo
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
Language: Python - Size: 1.61 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 76 - Forks: 3

HLTCHKUST/MoEL
MoEL: Mixture of Empathetic Listeners
Language: Python - Size: 8.52 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 71 - Forks: 14

fkodom/soft-mixture-of-experts
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)
Language: Python - Size: 152 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 71 - Forks: 5

CASE-Lab-UMD/Unified-MoE-Compression
The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques" (TMLR).
Language: Python - Size: 47.1 MB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 67 - Forks: 5

dmis-lab/Monet
[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
Language: Python - Size: 252 KB - Last synced at: 12 days ago - Pushed at: 4 months ago - Stars: 66 - Forks: 3

UNITES-Lab/MC-SMoE
[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
Language: Python - Size: 1.8 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 56 - Forks: 7

bwconrad/soft-moe
PyTorch implementation of "From Sparse to Soft Mixtures of Experts"
Language: Python - Size: 344 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 56 - Forks: 3

mryab/learning-at-home
"Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts" (NeurIPS 2020), original PyTorch implementation
Language: Jupyter Notebook - Size: 272 KB - Last synced at: 17 days ago - Pushed at: over 4 years ago - Stars: 54 - Forks: 1

AmazaspShumik/mtlearn
Multi-task learning package built with TensorFlow 2 (Multi-Gate Mixture of Experts, Cross-Stitch, Uncertainty Weighting)
Language: Python - Size: 10.1 MB - Last synced at: 13 days ago - Pushed at: over 5 years ago - Stars: 52 - Forks: 6

Leeroo-AI/leeroo_orchestrator
The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration"
Language: Python - Size: 857 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 46 - Forks: 4

AmazaspShumik/Mixture-Models
Hierarchical Mixture of Experts, Mixture Density Neural Network
Language: Jupyter Notebook - Size: 4.57 MB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 45 - Forks: 17
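
For context, a hedged sketch of the classic Gaussian mixture-of-experts regressor this kind of package covers: a softmax gate mixes per-expert Gaussian predictions, and the model is trained on the mixture negative log-likelihood. Names and the toy data are illustrative.

```python
# Hedged sketch of a classic (Jacobs/Jordan-style) mixture-of-experts regressor.
import torch
import torch.nn as nn

class GaussianMoE(nn.Module):
    def __init__(self, in_dim=1, num_experts=3):
        super().__init__()
        self.gate = nn.Linear(in_dim, num_experts)
        self.means = nn.Linear(in_dim, num_experts)            # one linear regressor per expert
        self.log_sigma = nn.Parameter(torch.zeros(num_experts))

    def nll(self, x, y):                                       # x: (n, in_dim), y: (n, 1)
        log_pi = torch.log_softmax(self.gate(x), dim=-1)       # (n, experts) mixing weights
        dist = torch.distributions.Normal(self.means(x), self.log_sigma.exp())
        log_prob = dist.log_prob(y)                            # (n, experts) per-expert likelihood
        return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()

# Toy piecewise-linear data: different experts should capture different regimes.
model = GaussianMoE()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x = torch.randn(256, 1)
y = torch.where(x > 0, 2 * x, -x) + 0.1 * torch.randn(256, 1)
for _ in range(200):
    opt.zero_grad()
    loss = model.nll(x, y)
    loss.backward()
    opt.step()
```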

LoserCheems/WonderfulMatrices
Wonderful Matrices to Build Small Language Models
Language: Python - Size: 8.78 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 43 - Forks: 0

924973292/DeMo
[AAAI 2025] DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
Language: Python - Size: 17 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 41 - Forks: 2

VITA-Group/Neural-Implicit-Dict
[ICML 2022] "Neural Implicit Dictionary via Mixture-of-Expert Training" by Peihao Wang, Zhiwen Fan, Tianlong Chen, Zhangyang Wang
Language: Python - Size: 958 KB - Last synced at: 28 days ago - Pushed at: over 1 year ago - Stars: 40 - Forks: 1

Spico197/MoE-SFT
🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts
Language: Python - Size: 552 KB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 38 - Forks: 0

AIDC-AI/Parrot
🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.
Language: Python - Size: 25.2 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 36 - Forks: 1

lucidrains/sinkhorn-router-pytorch
Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise
Language: Python - Size: 27.3 KB - Last synced at: 7 days ago - Pushed at: 9 months ago - Stars: 34 - Forks: 0
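
A hedged sketch of what a Sinkhorn-based router does (illustrative, not this repo's API): alternately normalise the token-expert affinities over rows and columns so the resulting soft assignment is approximately balanced across experts, then optionally take a hard top-1 choice.

```python
# Hedged sketch of Sinkhorn routing: log-domain row/column normalisation of affinities.
import torch

def sinkhorn_route(logits: torch.Tensor, iters: int = 8) -> torch.Tensor:
    """logits: (tokens, experts) -> approximately balanced soft assignment, same shape."""
    log_p = logits
    for _ in range(iters):
        log_p = log_p - log_p.logsumexp(dim=1, keepdim=True)   # normalise over experts (rows)
        log_p = log_p - log_p.logsumexp(dim=0, keepdim=True)   # normalise over tokens (columns)
    return log_p.exp()

logits = torch.randn(16, 4)
assignment = sinkhorn_route(logits)
expert_choice = assignment.argmax(dim=-1)                      # hard top-1 routing after balancing
per_expert_load = torch.bincount(expert_choice, minlength=4)
```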

eduardzamfir/MoCE-IR
[CVPR 2025] Complexity Experts are Task-Discriminative Learners for Any Image Restoration
Language: Python - Size: 821 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 33 - Forks: 0

umbertocappellazzo/PETL_AST
This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters".
Language: Python - Size: 3.09 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 32 - Forks: 1

BorealisAI/MMoEEx-MTL
PyTorch Implementation of the Multi-gate Mixture-of-Experts with Exclusivity (MMoEEx)
Language: Python - Size: 31.4 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 32 - Forks: 4

eduardzamfir/DaAIR
GitHub repository for our project "Efficient Degradation-aware Any Image Restoration"
Size: 15.6 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 30 - Forks: 0

james-oldfield/muMoE
[NeurIPS'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
Language: Python - Size: 2.95 MB - Last synced at: 28 days ago - Pushed at: 8 months ago - Stars: 30 - Forks: 1

RoyalSkye/Routing-MVMoE
[ICML 2024] "MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts"
Language: Python - Size: 379 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 30 - Forks: 3

OpenSparseLLMs/CLIP-MoE
CLIP-MoE: Mixture of Experts for CLIP
Language: Python - Size: 2.35 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 29 - Forks: 0

kyegomez/LIMoE
Implementation of the "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts"
Language: Python - Size: 2.17 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 28 - Forks: 2

zjukg/MoMoK
[Paper][ICLR 2025] Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning
Language: Python - Size: 7 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 27 - Forks: 3

SuperBruceJia/Awesome-Mixture-of-Experts
Awesome Mixture of Experts (MoE): A Curated List of Mixture of Experts (MoE) and Mixture of Multimodal Experts (MoME)
Size: 438 KB - Last synced at: 15 days ago - Pushed at: 4 months ago - Stars: 27 - Forks: 3

sammcj/moa Fork of togethercomputer/MoA
Mixture-of-Ollamas
Language: Python - Size: 1.72 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 25 - Forks: 1

jaisidhsingh/pytorch-mixtures
One-stop solutions for Mixture of Experts and Mixture of Depth modules in PyTorch.
Language: Python - Size: 366 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 22 - Forks: 1

Wuyxin/GraphMETRO
GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts (NeurIPS 2024)
Language: Python - Size: 36.1 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 21 - Forks: 1

dsy109/mixtools
Tools for Analyzing Finite Mixture Models
Language: R - Size: 499 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 20 - Forks: 4

tsc2017/MIX-GAN
Some recent state-of-the-art generative models in ONE notebook: (MIX-)?(GAN|WGAN|BigGAN|MHingeGAN|AMGAN|StyleGAN|StyleGAN2)(\+ADA|\+CR|\+EMA|\+GP|\+R1|\+SA|\+SN)*
Language: Jupyter Notebook - Size: 771 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 20 - Forks: 1

danelpeng/Awesome-Continual-Leaning-with-PTMs
This is a curated list of "Continual Learning with Pretrained Models" research.
Size: 254 KB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 17 - Forks: 0

checkstep/mole-stance
MoLE: Cross-Domain Label-Adaptive Stance Detection
Language: Python - Size: 47.9 KB - Last synced at: about 2 months ago - Pushed at: about 3 years ago - Stars: 17 - Forks: 5

AdamG012/moe-paper-models
A summary of MoE experimental setups across a number of different papers.
Size: 10.7 KB - Last synced at: 10 days ago - Pushed at: about 2 years ago - Stars: 16 - Forks: 1

dominiquegarmier/grok-pytorch
PyTorch implementation of Grok
Language: Python - Size: 44.9 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 13 - Forks: 0

gaozhitong/MoSE-AUSeg
The official code repo for the paper "Mixture of Stochastic Experts for Modeling Aleatoric Uncertainty in Segmentation". (ICLR 2023)
Language: Python - Size: 939 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 1

hanyas/mimo
A toolbox for inference of mixture models
Language: Python - Size: 713 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 4

vivamoto/classifier
Machine learning code, derivative calculations, and optimization algorithms developed during the Machine Learning course at Universidade de Sao Paulo. All code is in Python with NumPy and Matplotlib, with examples at the end of each file.
Language: Python - Size: 13.1 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 12 - Forks: 4

cmavro/PackLLM
Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization
Language: Python - Size: 169 KB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 10 - Forks: 1

UNITES-Lab/HEXA-MoE
Official code for the paper "HEXA-MoE: Efficient and Heterogeneous-Aware MoE Acceleration with Zero Computation Redundancy"
Language: Python - Size: 19.2 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 9 - Forks: 1

he-h/ST-MoE-BERT
This repository contains the code for the paper "ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction".
Language: Python - Size: 872 KB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 9 - Forks: 3

yuzhimanhua/SciMult
Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding (Findings of EMNLP'23)
Language: Python - Size: 173 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 9 - Forks: 0

ilyalasy/moe-routing
Analysis of token routing for different implementations of Mixture of Experts
Language: Jupyter Notebook - Size: 882 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 0

clint-kristopher-morris/llm-guided-evolution
LLM Guided Evolution - The Automation of Models Advancing Models
Language: Python - Size: 345 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 8 - Forks: 5

yanring/Megatron-MoE-ModelZoo
Best practices for testing advanced Mixtral, DeepSeek, and Qwen series MoE models using Megatron Core MoE.
Language: Python - Size: 26.4 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 8 - Forks: 1

EfficientMoE/MoE-Gen
High-throughput offline inference for MoE models with limited GPUs
Language: Python - Size: 552 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 8 - Forks: 0

louisbrulenaudet/mergeKit
Tools for merging pretrained large language models and creating Mixture of Experts (MoE) models from open-source models.
Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 0

RoyZry98/T-REX-Pytorch
[Arxiv 2024] Official code for T-REX: Mixture-of-Rank-One-Experts with semantic-aware Intuition for Multi-task Large Language Model Finetuning
Language: Python - Size: 19.2 MB - Last synced at: about 21 hours ago - Pushed at: about 22 hours ago - Stars: 7 - Forks: 0

UNITES-Lab/glider
Official code for the paper "Glider: Global and Local Instruction-Driven Expert Router"
Language: Python - Size: 477 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 7 - Forks: 0

Keefe-Murphy/MoEClust
Gaussian Parsimonious Clustering Models with Gating and Expert Network Covariates
Language: R - Size: 1.67 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 7 - Forks: 0

skiptoniam/ecomix
ecomix is a package for model-based grouping of community data at the species level (Species Archetype Models) or the site level (Regions of Common Profile).
Language: R - Size: 63 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 7 - Forks: 2

AhmedMagdyHendawy/MOORE
Official code of the paper "Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts" at ICLR2024
Language: Python - Size: 813 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 7 - Forks: 3

jyjohnchoi/SMoP
The repository contains the code for our EMNLP 2023 paper "SMoP: Towards Efficient and Effective Prompt Tuning with Sparse Mixture-of-Prompts", written by Joon-Young Choi, Junho Kim, Jun-Hyung Park, Mok-Wing Lam, and SangKeun Lee.
Language: Python - Size: 154 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 7 - Forks: 2

BearCleverProud/MoME
Repository for Mixture of Multimodal Experts
Language: Python - Size: 857 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 7 - Forks: 0

dannyxiaocn/awesome-moe
A repository aggregating MoE papers and systems.
Size: 474 KB - Last synced at: 9 days ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 2

ozyurtf/mixture-of-experts
Training two separate expert neural networks and one gating network that switches between them.
Language: Python - Size: 10.1 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 6 - Forks: 0

antonio-f/mixture-of-experts-from-scratch
Mixture of Experts from scratch
Language: Jupyter Notebook - Size: 234 KB - Last synced at: 12 days ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 1

clementetienam/Ultra-Fast-Mixture-of-Experts-Regression
Language: MATLAB - Size: 1.75 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 0

alexliap/greek_gpt
MoE Decoder Transformer implementation with MLX
Language: Python - Size: 107 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 5 - Forks: 1

ZhenbangDu/DSD
[IEEE TAI] Mixture-of-Experts for Open Set Domain Adaptation: A Dual-Space Detection Approach
Language: Python - Size: 643 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 5 - Forks: 0

nusnlp/moece
The official code of the "Efficient and Interpretable Grammatical Error Correction with Mixture of Experts" paper
Language: Python - Size: 4.83 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 5 - Forks: 0

Keefe-Murphy/MEDseq
Mixtures of Exponential-Distance Models for Clustering Longitudinal Life-Course Sequences with Gating Covariates and Sampling Weights
Language: R - Size: 10.1 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 5 - Forks: 0

ZhenbangDu/Seizure_MoE
The official code for the paper 'Mixture of Experts for EEG-Based Seizure Subtype Classification'.
Language: Python - Size: 150 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 5 - Forks: 0

yamsgithub/modular_deep_learning
This repository contains scripts for implementing various expert-based learning architectures, such as mixture of experts and product of experts, and for running experiments with them.
Language: Jupyter Notebook - Size: 332 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 1
