GitHub topics: mixture-of-experts
deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Language: Python - Size: 217 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 38,040 - Forks: 4,342

xmarva/transformer-based-architectures
Breakdown of SoTA transformer-based architectures
Language: Jupyter Notebook - Size: 741 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

relf/egobox
Efficient global optimization toolbox in Rust: Bayesian optimization, mixtures of Gaussian processes, and sampling methods
Language: Rust - Size: 12.1 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 100 - Forks: 6

nusnlp/moece
The official code of the "Efficient and Interpretable Grammatical Error Correction with Mixture of Experts" paper
Language: Python - Size: 4.83 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 5 - Forks: 0

RufelleEmmanuelPactol/Mixture-of-Experts-Transcript-Evaluator
A mixture-of-experts-inspired transcript evaluator built on LLM fine-tuning. Contains a routing mechanism that assigns specific questions to "experts".
Language: Jupyter Notebook - Size: 1.89 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 2
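
The routing mechanism described above — assigning each question to the most suitable fine-tuned "expert" — can be illustrated with a small keyword-based router in front of a pool of expert callables. Everything below (the EXPERTS pool, keyword sets, and scoring rule) is a hypothetical sketch, not code from this project:

```python
from typing import Callable, Dict

# Hypothetical expert pool: topic label -> callable that answers with that
# fine-tuned model (e.g. an API call). Names are illustrative only.
EXPERTS: Dict[str, Callable[[str], str]] = {
    "accounting": lambda q: f"[accounting expert] {q}",
    "programming": lambda q: f"[programming expert] {q}",
    "writing": lambda q: f"[writing expert] {q}",
}

# Assumed keyword sets standing in for a learned router.
KEYWORDS = {
    "accounting": {"ledger", "balance", "credit", "debit", "audit"},
    "programming": {"code", "function", "bug", "compile", "python"},
    "writing": {"essay", "grammar", "paragraph", "tone", "summary"},
}

def route(question: str) -> str:
    """Send the question to the expert whose keywords overlap it the most."""
    tokens = set(question.lower().split())
    best = max(KEYWORDS, key=lambda name: len(KEYWORDS[name] & tokens))
    return EXPERTS[best](question)

print(route("Please review the grammar and tone of this paragraph."))
```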

codelion/optillm
Optimizing inference proxy for LLMs
Language: Python - Size: 1.61 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2,163 - Forks: 168

learning-at-home/hivemind
Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.
Language: Python - Size: 12.1 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2,162 - Forks: 183

SuperBruceJia/Awesome-Mixture-of-Experts
Awesome Mixture of Experts (MoE): A Curated List of Mixture of Experts (MoE) and Mixture of Multimodal Experts (MoME)
Size: 438 KB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 24 - Forks: 3

microsoft/Tutel
Tutel MoE: an optimized Mixture-of-Experts library with support for DeepSeek FP8/FP4
Language: Python - Size: 1.07 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 801 - Forks: 96

pjlab-sys4nlp/llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Language: Python - Size: 1.69 MB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 956 - Forks: 56

koayon/awesome-adaptive-computation
A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).
Size: 331 KB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 143 - Forks: 9

kyegomez/SwitchTransformers
Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"
Language: Python - Size: 2.42 MB - Last synced at: 5 days ago - Pushed at: 20 days ago - Stars: 97 - Forks: 13
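
For context, the Switch layer's defining trick is top-1 routing with an auxiliary load-balancing loss. The sketch below illustrates that routing step in PyTorch under assumed shapes and names; it is not this repository's API:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchRouter(nn.Module):
    """Top-1 (switch) routing: every token is sent to exactly one expert."""
    def __init__(self, d_model: int, n_experts: int):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts, bias=False)

    def forward(self, x):                            # x: (tokens, d_model)
        probs = self.gate(x).softmax(dim=-1)         # (tokens, n_experts)
        gate_val, expert_idx = probs.max(dim=-1)     # top-1 expert per token
        # Load-balancing auxiliary loss: n_experts * sum_i(f_i * P_i), where
        # f_i is the fraction of tokens routed to expert i and P_i is the
        # mean router probability assigned to expert i.
        n_experts = probs.size(-1)
        f = F.one_hot(expert_idx, n_experts).float().mean(dim=0)
        p = probs.mean(dim=0)
        aux_loss = n_experts * (f * p).sum()
        # The chosen expert's output should be scaled by gate_val so the
        # router receives a gradient signal.
        return expert_idx, gate_val, aux_loss
```

A full layer would then gather tokens per expert (subject to a capacity limit), run the expert feed-forward networks, and scatter the results back.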

korovod/kenotron
Experimental fork of Nanotron, a minimalistic library for 4D-parallel large language model training
Language: Python - Size: 12.3 MB - Last synced at: 1 day ago - Pushed at: 8 days ago - Stars: 1 - Forks: 0

EfficientMoE/MoE-Infinity
PyTorch library for cost-effective, fast and easy serving of MoE models.
Language: Python - Size: 456 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 166 - Forks: 12

inferflow/inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
Language: C++ - Size: 1.89 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 242 - Forks: 25

dvmazur/mixtral-offloading
Run Mixtral-8x7B models in Colab or on consumer desktops
Language: Python - Size: 261 KB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 2,303 - Forks: 233

arpita8/Awesome-Mixture-of-Experts-Papers
Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts.
Size: 2.21 MB - Last synced at: 10 days ago - Pushed at: 8 months ago - Stars: 110 - Forks: 2

PKU-YuanGroup/MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
Language: Python - Size: 16.5 MB - Last synced at: 11 days ago - Pushed at: 5 months ago - Stars: 2,140 - Forks: 134

james-oldfield/muMoE
[NeurIPS'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
Language: Python - Size: 2.95 MB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 30 - Forks: 1

ymcui/Chinese-Mixtral
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
Language: Python - Size: 519 KB - Last synced at: 12 days ago - Pushed at: 12 months ago - Stars: 604 - Forks: 44

Adlith/MoE-Jetpack
[NeurIPS 24] MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks
Language: Python - Size: 32.3 MB - Last synced at: 11 days ago - Pushed at: 5 months ago - Stars: 115 - Forks: 1

jaisidhsingh/pytorch-mixtures
One-stop solutions for Mixture of Experts and Mixture of Depth modules in PyTorch.
Language: Python - Size: 366 KB - Last synced at: 9 days ago - Pushed at: 2 months ago - Stars: 22 - Forks: 1
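
The Mixture of Depth part of such a toolkit routes along the depth axis: a per-token router picks a capacity-limited subset of tokens to run through a block, while the rest skip it on the residual path. Below is a deliberately simplified sketch of that idea, not this library's API:

```python
import torch
import torch.nn as nn

class MixtureOfDepthBlock(nn.Module):
    """Route only the highest-scoring fraction of tokens through `block`;
    the remaining tokens pass through unchanged via the residual path."""
    def __init__(self, dim: int, block: nn.Module, capacity: float = 0.5):
        super().__init__()
        self.router = nn.Linear(dim, 1)   # per-token routing score
        self.block = block                # any module mapping (k, dim) -> (k, dim)
        self.capacity = capacity

    def forward(self, x):                              # x: (tokens, dim)
        scores = self.router(x).squeeze(-1)            # (tokens,)
        k = max(1, int(self.capacity * x.size(0)))
        idx = scores.topk(k).indices                   # tokens that take the block
        # Scale by the sigmoided score so the router receives gradients.
        routed = self.block(x[idx]) * scores[idx].sigmoid().unsqueeze(-1)
        return x.index_add(0, idx, routed)             # residual add for routed tokens only

# Example: wrap a feed-forward block so only half of the tokens compute it.
# ffn = nn.Sequential(nn.Linear(256, 1024), nn.GELU(), nn.Linear(1024, 256))
# layer = MixtureOfDepthBlock(256, ffn, capacity=0.5)
```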

he-h/ST-MoE-BERT
This repository contains the code for the paper "ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction".
Language: Python - Size: 872 KB - Last synced at: 7 days ago - Pushed at: 2 months ago - Stars: 7 - Forks: 2

louisbrulenaudet/mergeKit
Tools for merging pretrained large language models and creating Mixture of Experts (MoE) models from open-source models.
Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: 12 days ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 0

kyegomez/LIMoE
Implementation of the "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts"
Language: Python - Size: 2.17 MB - Last synced at: 8 days ago - Pushed at: 18 days ago - Stars: 28 - Forks: 2

OpenSparseLLMs/LLaMA-MoE-v2
🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
Language: Python - Size: 2.21 MB - Last synced at: 17 days ago - Pushed at: 5 months ago - Stars: 78 - Forks: 11

lucidrains/soft-moe-pytorch
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
Language: Python - Size: 1.38 MB - Last synced at: 19 days ago - Pushed at: 22 days ago - Stars: 280 - Forks: 8
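
Soft MoE replaces discrete routing with a differentiable dispatch/combine: each expert slot consumes a softmax-weighted mix of all tokens, and each token's output is a softmax-weighted mix over all slots. A compact sketch of that mechanism, with shapes and parameter names chosen for illustration rather than taken from this package:

```python
import torch
import torch.nn as nn

class SoftMoE(nn.Module):
    """Soft MoE layer: differentiable dispatch/combine, no discrete routing."""
    def __init__(self, dim: int, n_experts: int, slots_per_expert: int = 1):
        super().__init__()
        self.slots = n_experts * slots_per_expert
        self.phi = nn.Parameter(torch.randn(dim, self.slots) * dim ** -0.5)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
             for _ in range(n_experts)])

    def forward(self, x):                        # x: (batch, tokens, dim)
        logits = x @ self.phi                    # (batch, tokens, slots)
        dispatch = logits.softmax(dim=1)         # normalize over tokens per slot
        combine = logits.softmax(dim=2)          # normalize over slots per token
        slot_in = dispatch.transpose(1, 2) @ x   # (batch, slots, dim)
        slot_chunks = slot_in.chunk(len(self.experts), dim=1)
        slot_out = torch.cat([e(c) for e, c in zip(self.experts, slot_chunks)], dim=1)
        return combine @ slot_out                # (batch, tokens, dim)
```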

BorealisAI/MMoEEx-MTL
PyTorch Implementation of the Multi-gate Mixture-of-Experts with Exclusivity (MMoEEx)
Language: Python - Size: 31.4 MB - Last synced at: 17 days ago - Pushed at: almost 4 years ago - Stars: 32 - Forks: 4

lucidrains/st-moe-pytorch
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
Language: Python - Size: 178 KB - Last synced at: 19 days ago - Pushed at: 10 months ago - Stars: 326 - Forks: 28

lucidrains/mixture-of-experts
A PyTorch implementation of the Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
Language: Python - Size: 136 KB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 724 - Forks: 55
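
The underlying technique is noisy top-k gating: a learned gate (plus learned noise) scores all experts, only the top k experts run per token, and their outputs are combined with the renormalized gate weights. A rough sketch of the gating step, assuming token-major shapes (not this repository's exact interface):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyTopKGate(nn.Module):
    """Noisy top-k gating in the spirit of Shazeer et al. (2017)."""
    def __init__(self, d_model: int, n_experts: int, k: int = 2):
        super().__init__()
        self.w_gate = nn.Linear(d_model, n_experts, bias=False)
        self.w_noise = nn.Linear(d_model, n_experts, bias=False)
        self.k = k

    def forward(self, x):                          # x: (tokens, d_model)
        clean = self.w_gate(x)
        noise_std = F.softplus(self.w_noise(x))    # learned, input-dependent noise
        logits = clean + torch.randn_like(clean) * noise_std
        topk_val, topk_idx = logits.topk(self.k, dim=-1)
        # Softmax only over the selected experts; all others get zero weight.
        weights = torch.zeros_like(logits).scatter(
            -1, topk_idx, topk_val.softmax(dim=-1))
        return weights, topk_idx                   # combine expert outputs with `weights`
```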

AviSoori1x/makeMoE
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
Language: Jupyter Notebook - Size: 6.96 MB - Last synced at: 20 days ago - Pushed at: 6 months ago - Stars: 686 - Forks: 73

SkyworkAI/MoH
MoH: Multi-Head Attention as Mixture-of-Head Attention
Language: Python - Size: 5.26 MB - Last synced at: 19 days ago - Pushed at: 6 months ago - Stars: 233 - Forks: 9

dannyxiaocn/awesome-moe
A repository aggregating MoE papers and systems
Size: 474 KB - Last synced at: 11 days ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 2

Bhazantri/EvoLingua
EvoLingua: A Scalable Mixture-of-Experts Language Model Framework
Language: Python - Size: 36.1 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

efeslab/fiddler
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
Language: Python - Size: 1.72 MB - Last synced at: 20 days ago - Pushed at: 5 months ago - Stars: 203 - Forks: 18

zjukg/MoMoK
[Paper][ICLR 2025] Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning
Language: Python - Size: 6.99 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 24 - Forks: 2

lucidrains/PEER-pytorch
PyTorch implementation of the PEER block from the paper "Mixture of A Million Experts" by Xu Owen He at DeepMind
Language: Python - Size: 271 KB - Last synced at: 14 days ago - Pushed at: 8 months ago - Stars: 123 - Forks: 3

AIDC-AI/Parrot
🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.
Language: Python - Size: 25.2 MB - Last synced at: 10 days ago - Pushed at: 23 days ago - Stars: 36 - Forks: 1

shufangxun/LLaVA-MoD
[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
Language: Python - Size: 3.41 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 120 - Forks: 7

lucidrains/mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention modules, akin to mixture-of-experts
Language: Python - Size: 34.1 MB - Last synced at: 19 days ago - Pushed at: 6 months ago - Stars: 118 - Forks: 4

YAGI0423/gpt_modules
Covers a PyTorch-based GPT model and module library.
Language: Python - Size: 343 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

Vignesh010101/Intelligent-Health-LLM-System
An Intelligent Health LLM System for Personalized Medication Guidance and Support.
Language: Jupyter Notebook - Size: 615 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

danelpeng/Awesome-Continual-Leaning-with-PTMs
This is a curated list of "Continual Learning with Pretrained Models" research.
Size: 254 KB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 16 - Forks: 0

Nitin-Sagar-B/RaTiO-CoRE
A modular multi-model AI framework demonstrating advanced techniques in semantic knowledge transfer, context management, and collaborative intelligence across diverse language models.
Language: Python - Size: 136 KB - Last synced at: 16 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

lucidrains/sinkhorn-router-pytorch
Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise
Language: Python - Size: 27.3 KB - Last synced at: 16 days ago - Pushed at: 8 months ago - Stars: 33 - Forks: 0
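
A Sinkhorn router converts the token-expert affinity matrix into an approximately doubly-stochastic assignment by alternately normalizing the expert and token marginals, which balances load without an auxiliary loss. A minimal sketch of that normalization loop (illustrative only; the iteration count and epsilon are arbitrary choices):

```python
import torch

def sinkhorn(logits: torch.Tensor, n_iters: int = 3, eps: float = 1e-9) -> torch.Tensor:
    """Approximately doubly-stochastic assignment from token-expert logits.

    logits: (tokens, experts). After the loop, each row sums to ~1 (every
    token fully assigned) and column totals are pushed toward equal load.
    """
    scores = logits.exp()
    for _ in range(n_iters):
        scores = scores / (scores.sum(dim=0, keepdim=True) + eps)  # balance experts
        scores = scores / (scores.sum(dim=1, keepdim=True) + eps)  # normalize per token
    return scores

# Usage: route each token to the expert with the highest balanced score.
# assignment = sinkhorn(token_expert_logits).argmax(dim=-1)
```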

YangLing0818/RealCompo
[NeurIPS 2024] RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
Language: Python - Size: 7.45 MB - Last synced at: 27 days ago - Pushed at: 5 months ago - Stars: 115 - Forks: 4

alt2177/mllm-public
A framework for merging multiple LMs to improve OOD performance without additional training
Language: Jupyter Notebook - Size: 166 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ozyurtf/mixture-of-experts
Training two separate expert neural networks and a gating network that switches between them.
Language: Python - Size: 10.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 6 - Forks: 0
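
The classic arrangement here is two expert networks whose outputs are blended by a small gating network trained on the same input. A minimal PyTorch sketch of that setup (layer sizes and structure are assumptions, not taken from this repository):

```python
import torch
import torch.nn as nn

class TwoExpertMoE(nn.Module):
    """Two expert MLPs and a gate that weights their outputs per example."""
    def __init__(self, in_dim: int, hidden: int, out_dim: int):
        super().__init__()
        def make_expert():
            return nn.Sequential(
                nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))
        self.experts = nn.ModuleList([make_expert(), make_expert()])
        self.gate = nn.Linear(in_dim, 2)

    def forward(self, x):                                # x: (batch, in_dim)
        weights = self.gate(x).softmax(dim=-1)           # (batch, 2)
        outputs = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, 2, out_dim)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)
```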

bwconrad/soft-moe
PyTorch implementation of "From Sparse to Soft Mixtures of Experts"
Language: Python - Size: 344 KB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 53 - Forks: 3

OpenSparseLLMs/CLIP-MoE
CLIP-MoE: Mixture of Experts for CLIP
Language: Python - Size: 2.35 MB - Last synced at: 21 days ago - Pushed at: 7 months ago - Stars: 29 - Forks: 0

EfficientMoE/MoE-Gen
High-throughput offline inference for MoE models with limited GPUs
Language: Python - Size: 552 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 8 - Forks: 0

alexliap/greek_gpt
MoE Decoder Transformer implementation with MLX
Language: Python - Size: 107 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 4 - Forks: 1

CASE-Lab-UMD/Unified-MoE-Compression
The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques" (TMLR).
Language: Python - Size: 47.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 60 - Forks: 5

AdamG012/moe-paper-models
A summary of MoE experimental setups across a number of different papers.
Size: 10.7 KB - Last synced at: 6 days ago - Pushed at: about 2 years ago - Stars: 16 - Forks: 1

APWS25/AccelMoE
A CUDA kernel re-implementation of a CPU-based MoE model.
Language: C++ - Size: 687 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 1

antonio-f/mixture-of-experts-from-scratch
Mixture of Experts from scratch
Language: Jupyter Notebook - Size: 234 KB - Last synced at: 24 days ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 1

clint-kristopher-morris/llm-guided-evolution
LLM Guided Evolution - The Automation of Models Advancing Models
Language: Python - Size: 345 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 6 - Forks: 5

eduardzamfir/MoCE-IR
[CVPR 2025] Complexity Experts are Task-Discriminative Learners for Any Image Restoration
Language: Python - Size: 821 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 33 - Forks: 0

vishwapatel14/Skin-Cancer-Classification-Model
Skin Cancer Classification Project for Advanced Machine Learning course in Fall 2024 at The University of Texas at Austin
Language: Jupyter Notebook - Size: 15 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 2

dmis-lab/Monet
[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
Language: Python - Size: 252 KB - Last synced at: 30 days ago - Pushed at: 3 months ago - Stars: 60 - Forks: 3

Keefe-Murphy/MEDseq
Mixtures of Exponential-Distance Models for Clustering Longitudinal Life-Course Sequences with Gating Covariates and Sampling Weights
Language: R - Size: 10.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 5 - Forks: 0

checkstep/mole-stance
MoLE: Cross-Domain Label-Adaptive Stance Detection
Language: Python - Size: 47.9 KB - Last synced at: 25 days ago - Pushed at: about 3 years ago - Stars: 17 - Forks: 5

924973292/DeMo
【AAAI2025】DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
Language: Python - Size: 17 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 41 - Forks: 2

UNITES-Lab/HEXA-MoE
Official code for the paper "HEXA-MoE: Efficient and Heterogeneous-Aware MoE Acceleration with Zero Computation Redundancy"
Language: Python - Size: 19.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 9 - Forks: 1

UNITES-Lab/glider
Official code for the paper "Glider: Global and Local Instruction-Driven Expert Router"
Language: Python - Size: 477 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 7 - Forks: 0

Keefe-Murphy/MoEClust
Gaussian Parsimonious Clustering Models with Gating and Expert Network Covariates
Language: R - Size: 1.67 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 7 - Forks: 0

Luodian/Generalizable-Mixture-of-Experts
GMoE could be the next backbone model for many kinds of generalization tasks.
Language: Python - Size: 2.04 MB - Last synced at: 14 days ago - Pushed at: about 2 years ago - Stars: 269 - Forks: 35

eduardzamfir/DaAIR
GitHub repository for our project "Efficient Degradation-aware Any Image Restoration"
Size: 15.6 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 30 - Forks: 0

Wuyxin/GraphMETRO
GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts (NeurIPS 2024)
Language: Python - Size: 36.1 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 21 - Forks: 1

xpmoe/app
Mixture of Experts Framework for Enhanced Explainability of Anxiety States Pre- and Post-Intervention Across Experimental Groups
Size: 9.77 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

Spico197/MoE-SFT
🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts
Language: Python - Size: 552 KB - Last synced at: 11 days ago - Pushed at: 7 months ago - Stars: 38 - Forks: 0

discover-Austin/Neural_Network_Mixture_of_Experts
Mixture of Experts implemented as an easy-to-understand, custom neural network
Language: Python - Size: 12.7 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

rhymes-ai/Aria
Codebase for Aria - an Open Multimodal Native MoE
Language: Jupyter Notebook - Size: 120 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 995 - Forks: 83

mryab/learning-at-home
"Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts" (NeurIPS 2020), original PyTorch implementation
Language: Jupyter Notebook - Size: 272 KB - Last synced at: 24 days ago - Pushed at: over 4 years ago - Stars: 54 - Forks: 1

LoserCheems/WonderfulMatrices
Wonderful Matrices to Build Small Language Models
Language: Python - Size: 8.78 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 43 - Forks: 0

PV-Bhat/RSRC
RSRC Calculator is a practical tool for evaluating the efficiency of AI models in the post-scaling era. Based on Recursive Self-Referential Compression (RSRC), it computes training-efficiency metrics from factors such as training FLOPs, energy consumption, and model architecture details.
Language: Python - Size: 75.2 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

LINs-lab/DynMoE
[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Language: Python - Size: 57.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 67 - Forks: 9

fkodom/soft-mixture-of-experts
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)
Language: Python - Size: 152 KB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 71 - Forks: 5

clementetienam/Data_Driven_MPC_Controller_Using_CCR
Language: MATLAB - Size: 11.9 MB - Last synced at: 17 days ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 2

VITA-Group/Neural-Implicit-Dict
[ICML 2022] "Neural Implicit Dictionary via Mixture-of-Expert Training" by Peihao Wang, Zhiwen Fan, Tianlong Chen, Zhangyang Wang
Language: Python - Size: 958 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 40 - Forks: 1

nitic-nlp-team/webnavix
A continuous generalist web navigation agent that merges individually fine-tuned LLMs as domain experts.
Language: Typst - Size: 56.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

ilyalasy/moe-routing
Analysis of token routing for different implementations of Mixture of Experts
Language: Jupyter Notebook - Size: 882 KB - Last synced at: 19 days ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 0

dominiquegarmier/grok-pytorch
PyTorch implementation of Grok
Language: Python - Size: 44.9 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 13 - Forks: 0

dsy109/mixtools
Tools for Analyzing Finite Mixture Models
Language: R - Size: 499 KB - Last synced at: 9 days ago - Pushed at: 10 months ago - Stars: 20 - Forks: 4

AhmedMagdyHendawy/MOORE
Official code of the paper "Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts" at ICLR 2024
Language: Python - Size: 813 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 7 - Forks: 3

s-chh/PyTorch-Scratch-LLM
Simple and easy-to-understand PyTorch implementation of large language models (GPT and LLaMA) from scratch, with detailed steps. Implemented: Byte-Pair Tokenizer, Rotary Positional Embedding (RoPE), SwiGLU, RMSNorm, Mixture of Experts (MoE). Tested on a Taylor Swift song-lyrics dataset.
Language: Python - Size: 58.6 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

jyjohnchoi/SMoP
The repository contains the code for our EMNLP 2023 paper "SMoP: Towards Efficient and Effective Prompt Tuning with Sparse Mixture-of-Prompts", written by Joon-Young Choi, Junho Kim, Jun-Hyung Park, Mok-Wing Lam, and SangKeun Lee.
Language: Python - Size: 154 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 7 - Forks: 2

ednialzavlare/MixKABRN
The repository for the MixKABRN neural network (Mixture of Kolmogorov-Arnold Bit Retentive Networks): an attempt to first adapt it for training on text, and later adjust it for other modalities.
Language: Python - Size: 85.9 KB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 4 - Forks: 0

lolguy91/perfect-llm-imho
The idea for the best LLM currently possible came to me while watching a YouTube video on GaLore, the "sequel" to LoRA, and realizing how groundbreaking that technique is. While daydreaming about pretraining my own model, I refined the idea into this (probably impossible to implement) concept.
Size: 17.6 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

yuzhimanhua/SciMult
Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding (Findings of EMNLP'23)
Language: Python - Size: 173 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 9 - Forks: 0

ZhenbangDu/Seizure_MoE
The official code for the paper 'Mixture of Experts for EEG-Based Seizure Subtype Classification'.
Language: Python - Size: 150 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 5 - Forks: 0

sammcj/moa Fork of togethercomputer/MoA
Mixture-of-Ollamas
Language: Python - Size: 1.72 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 25 - Forks: 1

nath54/Dedale_LLM
A prototype of a Mixture-of-Experts LLM built with PyTorch. Currently in development; I am testing its learning capabilities with simple small-scale tests before training it on large language datasets.
Language: Python - Size: 75.2 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

BearCleverProud/MoME
Repository for Mixture of Multimodal Experts
Language: Python - Size: 857 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 7 - Forks: 0

umbertocappellazzo/PETL_AST
This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters".
Language: Python - Size: 3.09 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 32 - Forks: 1

liuqidong07/MOELoRA-peft
[SIGIR'24] The official implementation code of MOELoRA.
Language: Python - Size: 10.2 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 105 - Forks: 11

RoyalSkye/Routing-MVMoE
[ICML 2024] "MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts"
Language: Python - Size: 379 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 30 - Forks: 3

eduardzamfir/seemoredetails
Repository for "See More Details: Efficient Image Super-Resolution by Experts Mining", ICML 2024
Language: Python - Size: 10.2 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 74 - Forks: 1

yuhaoliu94/GP-HME
Gaussian Process-Gated Hierarchical Mixture of Experts
Language: Python - Size: 59.6 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

fchamroukhi/MEteorits
Mixtures-of-ExperTs modEling for cOmplex and non-noRmal dIsTributionS
Language: R - Size: 28.3 MB - Last synced at: about 1 month ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 2
