GitHub topics: mixture-of-experts

Repositories

EfficientMoE/MoE-Infinity

PyTorch library for cost-effective, fast and easy serving of MoE models.

Language: Python - Size: 614 KB - Last synced at: about 1 hour ago - Pushed at: 11 days ago - Stars: 200 - Forks: 17

ai-agi/TableMoE

TableMoE-Neuro-Symbolic Routing for Structured Expert Reasoning in Multimodal Table Understanding

Language: HTML - Size: 626 KB - Last synced at: about 5 hours ago - Pushed at: about 6 hours ago - Stars: 2 - Forks: 0

ymcui/Chinese-Mixtral

中文Mixtral混合专家大模型（Chinese Mixtral MoE LLMs）

Language: Python - Size: 519 KB - Last synced at: about 13 hours ago - Pushed at: about 1 year ago - Stars: 602 - Forks: 44

Kedhareswer/platform-prompt-alchemy-lab

An open-source web app that enhances and refines AI prompts using advanced prompt engineering techniques like Chain of Thought, Few-Shot Learning, Persona Injection, and more. Supports multiple LLM providers and domains with a modern, intuitive UI built with React, TypeScript, and Tailwind CSS.

Language: TypeScript - Size: 1.16 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

deepspeedai/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python - Size: 217 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 39,089 - Forks: 4,433

microsoft/Tutel

Tutel MoE: Optimized Mixture-of-Experts Library, Support DeepSeek FP8/FP4

Language: C - Size: 1.43 MB - Last synced at: 1 day ago - Pushed at: 3 days ago - Stars: 844 - Forks: 96

SMTorg/smt

Surrogate Modeling Toolbox

Language: Jupyter Notebook - Size: 166 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 775 - Forks: 217

codelion/optillm

Optimizing inference proxy for LLMs

Language: Python - Size: 1.84 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2,564 - Forks: 192

learning-at-home/hivemind

Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.

Language: Python - Size: 12.1 MB - Last synced at: 5 days ago - Pushed at: about 2 months ago - Stars: 2,211 - Forks: 191

relf/egobox

Efficient global optimization toolbox in Rust: bayesian optimization, mixture of gaussian processes, sampling methods

Language: Rust - Size: 14.4 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 121 - Forks: 8

pjlab-sys4nlp/llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)

Language: Python - Size: 1.69 MB - Last synced at: 4 days ago - Pushed at: 7 months ago - Stars: 967 - Forks: 62

ozyurtf/mixture-of-experts

Training two separate expert neural networks and one gater that can switch the expert networks.

Language: Python - Size: 10.1 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 7 - Forks: 1

bitnulleins/rivermoe

Mixture of experts architecture for online machine learning.

Language: Python - Size: 236 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

SuperBruceJia/Awesome-Mixture-of-Experts

Awesome Mixture of Experts (MoE): A Curated List of Mixture of Experts (MoE) and Mixture of Multimodal Experts (MoME)

Size: 438 KB - Last synced at: 7 days ago - Pushed at: 6 months ago - Stars: 31 - Forks: 3

SkyworkAI/MoE-plus-plus

[ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts

Language: Python - Size: 1.94 MB - Last synced at: 10 days ago - Pushed at: 8 months ago - Stars: 226 - Forks: 9

arpita8/Awesome-Mixture-of-Experts-Papers

Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts.

Size: 2.21 MB - Last synced at: 13 days ago - Pushed at: 10 months ago - Stars: 119 - Forks: 4

dsy109/mixtools

Tools for Analyzing Finite Mixture Models

Language: R - Size: 499 KB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 22 - Forks: 4

alexliap/greek_gpt

MoE Decoder Transformer implementation with MLX

Language: Python - Size: 138 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 5 - Forks: 1

Fsoft-AIC/CompeteSMoE

CompeteSMoE – Statistically Guaranteed Mixture of Experts Training via Competition

Language: Python - Size: 5.42 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 1 - Forks: 0

pranoyr/attention-models

Simplified Implementation of SOTA Deep Learning Papers in Pytorch

Language: Python - Size: 190 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 4 - Forks: 0

UNITES-Lab/Flex-MoE

[NeurIPS 2024 Spotlight] Code for the paper "Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts"

Language: Python - Size: 3.14 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 52 - Forks: 2

Spico197/MoE-SFT

🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts

Language: Python - Size: 552 KB - Last synced at: 14 days ago - Pushed at: 9 months ago - Stars: 39 - Forks: 0

kyegomez/SwitchTransformers

Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"

Language: Python - Size: 2.42 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 108 - Forks: 13

lucidrains/PEER-pytorch

Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind

Language: Python - Size: 271 KB - Last synced at: 18 days ago - Pushed at: 10 months ago - Stars: 127 - Forks: 3

Vignesh010101/Intelligent-Health-LLM-System

An Intelligent Health LLM System for Personalized Medication Guidance and Support.

Language: Jupyter Notebook - Size: 620 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 3 - Forks: 0

kyegomez/LIMoE

Implementation of the "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts"

Language: Python - Size: 2.17 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 29 - Forks: 2

koayon/awesome-adaptive-computation

A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).

Size: 331 KB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 147 - Forks: 9

lucidrains/mixture-of-attention

Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts

Language: Python - Size: 34.1 MB - Last synced at: 10 days ago - Pushed at: 8 months ago - Stars: 119 - Forks: 4

danelpeng/Awesome-Continual-Leaning-with-PTMs

This is a curated list of "Continual Learning with Pretrained Models" research.

Size: 351 KB - Last synced at: 10 days ago - Pushed at: 30 days ago - Stars: 18 - Forks: 0

RoyalSkye/Routing-MVMoE

[ICML 2024] "MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts"

Language: Python - Size: 379 MB - Last synced at: 24 days ago - Pushed at: 25 days ago - Stars: 70 - Forks: 6

AI-14/micar-vl-moe

[IJCNN 2025] [Official code] - MicarVLMoE: A modern gated cross-aligned vision-language mixture of experts model for medical image captioning and report generation

Language: Python - Size: 1.17 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 2 - Forks: 0

zjukg/MoMoK

[Paper][ICLR 2025] Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning

Language: Python - Size: 7 MB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 27 - Forks: 3

bwconrad/soft-moe

PyTorch implementation of "From Sparse to Soft Mixtures of Experts"

Language: Python - Size: 344 KB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 57 - Forks: 3

PKU-YuanGroup/MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python - Size: 16.5 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 2,170 - Forks: 135

cmu-flame/FLAME-MoE

Official repository for FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models

Language: Jupyter Notebook - Size: 4.85 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 9 - Forks: 0

jaisidhsingh/pytorch-mixtures

One-stop solutions for Mixture of Experts and Mixture of Depth modules in PyTorch.

Language: Python - Size: 371 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 22 - Forks: 1

Luodian/Generalizable-Mixture-of-Experts

GMoE could be the next backbone model for many kinds of generalization task.

Language: Python - Size: 2.04 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 270 - Forks: 35

VinsmokeSomya/Mixture-of-Idiotic-Experts

🧠✍️🎭 Mixture of Idiotic Experts: A PyTorch-based Sparse Mixture of Experts (MoE) model for generating Shakespeare-like text, character by character. Inspired by Andrej Karpathy's makemore.

Language: Python - Size: 3.01 MB - Last synced at: 22 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

SkyworkAI/MoH

MoH: Multi-Head Attention as Mixture-of-Head Attention

Language: Python - Size: 5.26 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 244 - Forks: 10

Rikka-Botan/IMoE

Pytorch official implementation of IMoE: Indefinacy Mixture of Experts. IMoE dynamically selects neurons. This implementation replicates the neurons which is dynamically changing synapses and enables efficient and diverse inference.

Language: Python - Size: 43.9 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

lucidrains/st-moe-pytorch

Implementation of ST-Moe, the latest incarnation of MoE after years of research at Brain, in Pytorch

Language: Python - Size: 178 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 332 - Forks: 29

korovod/kenotron

Experimental fork of Nanotron, a minimalistic large language model 4D-parallelism training

Language: Python - Size: 12.3 MB - Last synced at: 4 days ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

lucidrains/mixture-of-experts

A Pytorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models

Language: Python - Size: 136 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 752 - Forks: 60

clint-kristopher-morris/llm-guided-evolution

LLM Guided Evolution - The Automation of Models Advancing Models

Language: Python - Size: 345 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 8 - Forks: 7

nusnlp/moece

The official code of the "Efficient and Interpretable Grammatical Error Correction with Mixture of Experts" paper

Language: Python - Size: 5.15 MB - Last synced at: 5 days ago - Pushed at: 2 months ago - Stars: 6 - Forks: 0

OpenSparseLLMs/LLaMA-MoE-v2

🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

Language: Python - Size: 2.21 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 84 - Forks: 12

dvmazur/mixtral-offloading

Run Mixtral-8x7B models in Colab or consumer desktops

Language: Python - Size: 261 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 2,311 - Forks: 232

lucidrains/soft-moe-pytorch

Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch

Language: Python - Size: 1.38 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 289 - Forks: 8

RoyZry98/T-REX-Pytorch

[Arxiv 2024] Official code for T-REX: Mixture-of-Rank-One-Experts with semantic-aware Intuition for Multi-task Large Language Model Finetuning

Language: Python - Size: 19.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 7 - Forks: 0

efeslab/fiddler

[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration

Language: Python - Size: 1.72 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 210 - Forks: 20

inferflow/inferflow

Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).

Language: C++ - Size: 1.89 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 242 - Forks: 25

lucidrains/sinkhorn-router-pytorch

Self contained pytorch implementation of a sinkhorn based router, for mixture of experts or otherwise

Language: Python - Size: 27.3 KB - Last synced at: 16 days ago - Pushed at: 10 months ago - Stars: 35 - Forks: 0

eduardzamfir/seemoredetails

[ICML 2024] See More Details: Efficient Image Super-Resolution by Experts Mining

Language: Python - Size: 10.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 138 - Forks: 2

andersonvc/alpha-pulse

Real-Time SEC 8-K Filing Analyzer for Financial Modeling

Language: Python - Size: 683 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

he-h/ST-MoE-BERT

This repository contains the code for the paper "ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction".

Language: Python - Size: 872 KB - Last synced at: about 11 hours ago - Pushed at: 4 months ago - Stars: 9 - Forks: 3

LINs-lab/DynMoE

[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models

Language: Python - Size: 57.3 MB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 89 - Forks: 11

CASE-Lab-UMD/Unified-MoE-Compression

The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques (TMLR)".

Language: Python - Size: 47.1 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 67 - Forks: 5

AIDC-AI/Parrot

🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.

Language: Python - Size: 25.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 36 - Forks: 1

alexdrnd/micar-vl-moe

[IJCNN 2025] [Official code] - MicarVLMoE: A modern gated cross-aligned vision-language mixture of experts model for medical image captioning and report generation

Language: Python - Size: 935 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

dmis-lab/Monet

[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers

Language: Python - Size: 252 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 66 - Forks: 3

reshalfahsi/gpt2moe-instruct

Instruction Fine-tuning of the GPT2MoE Model: GPT-2 with Mixture-of-Experts

Language: Jupyter Notebook - Size: 12 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

ZhenbangDu/DSD

[IEEE TAI] Mixture-of-Experts for Open Set Domain Adaptation: A Dual-Space Detection Approach

Language: Python - Size: 643 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 5 - Forks: 0

GnanaPrakashSG2004/Concept_Distillation

Framework to distill concepts of a teacher model to a student model. Trying to understand impact on performance and interpretability of the distilled student model obtained from this training paradigm.

Language: Jupyter Notebook - Size: 71.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

yanring/Megatron-MoE-ModelZoo

Best practices for testing advanced Mixtral, DeepSeek, and Qwen series MoE models using Megatron Core MoE.

Language: Python - Size: 26.4 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 8 - Forks: 1

AmazaspShumik/mtlearn

Multi-Task Learning package built with tensorflow 2 (Multi-Gate Mixture of Experts, Cross-Stitch, Ucertainty Weighting)

Language: Python - Size: 10.1 MB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 52 - Forks: 6

xmarva/transformer-based-architectures

Breakdown of SoTA transformer-based architectures

Language: Jupyter Notebook - Size: 741 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

RufelleEmmanuelPactol/Mixture-of-Experts-Transcript-Evaluator

A mixture of experts inspired transcript evaluator using LLM fine-tuning. Contains a routing mechanism to assign specific questions to "experts".

Language: Jupyter Notebook - Size: 1.89 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 2

xrsrke/pipegoose

Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*

Language: Python - Size: 1.26 MB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 82 - Forks: 18

cmavro/PackLLM

Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization

Language: Python - Size: 169 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 10 - Forks: 1

james-oldfield/muMoE

[NeurIPS'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization

Language: Python - Size: 2.95 MB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 30 - Forks: 1

Adlith/MoE-Jetpack

[NeurIPS 24] MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks

Language: Python - Size: 32.3 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 115 - Forks: 1

antonio-f/mixture-of-experts-from-scratch

Mixture of Experts from scratch

Language: Jupyter Notebook - Size: 234 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 1

louisbrulenaudet/mergeKit

Tools for merging pretrained Large Language Models and create Mixture of Experts (MoE) from open-source models.

Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 0

BorealisAI/MMoEEx-MTL

PyTorch Implementation of the Multi-gate Mixture-of-Experts with Exclusivity (MMoEEx)

Language: Python - Size: 31.4 MB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 32 - Forks: 4

AviSoori1x/makeMoE

From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)

Language: Jupyter Notebook - Size: 6.96 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 686 - Forks: 73

dannyxiaocn/awesome-moe

a repo for moe papers and systems aggregation

Size: 474 KB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 2

Bhazantri/EvoLingua

EvoLingua: A Scalable Mixture-of-Experts Language Model Framework

Language: Python - Size: 36.1 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

shufangxun/LLaVA-MoD

[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation

Language: Python - Size: 3.41 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 120 - Forks: 7

YAGI0423/gpt_modules

「Pytorch 기반 GPT 모델 및 모듈 라이브러리」에 대한 내용을 다루고 있습니다.

Language: Python - Size: 343 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Nitin-Sagar-B/RaTiO-CoRE

A modular multi-model AI framework demonstrating advanced techniques in semantic knowledge transfer, context management, and collaborative intelligence across diverse language models.

Language: Python - Size: 136 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

YangLing0818/RealCompo

[NeurIPS 2024] RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models

Language: Python - Size: 7.45 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 115 - Forks: 4

alt2177/mllm-public

A framework for merging multiple LMs to improve OOD performance without additional training

Language: Jupyter Notebook - Size: 166 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

dominiquegarmier/grok-pytorch

pytorch implementation of grok

Language: Python - Size: 44.9 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 12 - Forks: 0

OpenSparseLLMs/CLIP-MoE

CLIP-MoE: Mixture of Experts for CLIP

Language: Python - Size: 2.35 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 29 - Forks: 0

EfficientMoE/MoE-Gen

High-throughput offline inference for MoE models with limited GPUs

Language: Python - Size: 552 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 8 - Forks: 0

AdamG012/moe-paper-models

A sumary of MoE experimental setups across a number of different papers.

Size: 10.7 KB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 16 - Forks: 1

APWS25/AccelMoE

This repository is for CUDA kernel re-implementation of CPU-based MoE model.

Language: C++ - Size: 687 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 1

eduardzamfir/MoCE-IR

[CVPR 2025] Complexity Experts are Task-Discriminative Learners for Any Image Restoration

Language: Python - Size: 821 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 33 - Forks: 0

vishwapatel14/Skin-Cancer-Classification-Model

Skin Cancer Classification Project for Advanced Machine Learning course in Fall 2024 at The University of Texas at Austin

Language: Jupyter Notebook - Size: 15 MB - Last synced at: 17 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 4

Keefe-Murphy/MEDseq

Mixtures of Exponential-Distance Models for Clustering Longitudinal Life-Course Sequences with Gating Covariates and Sampling Weights

Language: R - Size: 10.1 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 5 - Forks: 0

checkstep/mole-stance

MoLE: Cross-Domain Label-Adaptive Stance Detection

Language: Python - Size: 47.9 KB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 17 - Forks: 5

924973292/DeMo

【AAAI2025】DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification

Language: Python - Size: 17 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 41 - Forks: 2

UNITES-Lab/HEXA-MoE

Official code for the paper "HEXA-MoE: Efficient and Heterogeneous-Aware MoE Acceleration with Zero Computation Redundancy"

Language: Python - Size: 19.2 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 9 - Forks: 1

UNITES-Lab/glider

Official code for the paper "Glider: Global and Local Instruction-Driven Expert Router"

Language: Python - Size: 477 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 7 - Forks: 0

Keefe-Murphy/MoEClust

Gaussian Parsimonious Clustering Models with Gating and Expert Network Covariates

Language: R - Size: 1.67 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 7 - Forks: 0

eduardzamfir/DaAIR

GitHub repository for our project "Efficient Degradation-aware Any Image Restoration"

Size: 15.6 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 30 - Forks: 0

Wuyxin/GraphMETRO

GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts (NeurIPS 2024)

Language: Python - Size: 36.1 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 21 - Forks: 1

xpmoe/app

Mixture of Experts Framework for Enhanced Explainability of Anxiety States Pre- and Post-Intervention Across Experimental Groups

Size: 9.77 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

discover-Austin/Neural_Network_Mixture_of_Experts

Mixture_of_Experts implemented in an easy to understand and custom Neural Network

Language: Python - Size: 12.7 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

rhymes-ai/Aria

Codebase for Aria - an Open Multimodal Native MoE

Language: Jupyter Notebook - Size: 120 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 995 - Forks: 83

Related Keywords

mixture-of-experts 160 deep-learning 32 moe 28 machine-learning 23 large-language-models 23 pytorch 23 llm 22 nlp 14 artificial-intelligence 14 transformer 11 ai 10 computer-vision 9 transformers 9 python 6 language-model 6 gaussian-processes 6 neural-networks 6 efficiency 5 llms 5 multi-task-learning 5 deep-neural-networks 5 ml 5 llm-inference 5 generative-ai 5 llama 5 vision-transformer 4 multimodal-large-language-models 4 attention 4 transfer-learning 4 natural-language-processing 4 ensemble 4 huggingface 3 vision-language-models 3 low-level-vision 3 conditional-computation 3 llms-reasoning 3 foundation-models 3 mixtral-8x7b 3 pytorch-implementation 3 multi-modal 3 instruction-tuning 3 lora 3 ensemble-learning 3 neural-network 3 gpt 3 tensorflow 3 inference 3 keras 3 prompt-tuning 3 graph-neural-networks 3 unsupervised-learning 3 mixture-models 3 deepseek 3 mixture-of-models 2 diffusion-models 2 generalization 2 generative-model 2 deep-reinforcement-learning 2 alignment-strategies 2 federated-learning 2 anomaly-detection 2 llama3 2 mistral-7b 2 chest-xrays 2 ct-scans 2 contrastive-learning 2 adaptive-computation 2 radiology-report-generation 2 mri-images 2 medical-imaging 2 medical-image-captioning 2 feature-pyramid-network 2 bert 2 training 2 interpretability 2 domain-adaptation 2 megatron-lm 2 kdd2018 2 multitask-learning 2 tensorflow2 2 peft 2 kv-cache 2 deeplearning 2 supervised-learning 2 mergekit 2 model 2 small-language-models 2 all-in-one-restoration 2 image-restoration 2 model-based-clustering 2 regression-algorithms 2 machine-learning-algorithms 2 cnn 2 mistral 2 qwen 2 sft 2 uncertainty-quantification 2 fine-tuning 2 r-package 2 partial-differential-equations 2