GitHub topics: mixture-of-experts
lolguy91/perfect-llm-imho
The idea for the best LLM currently possible came to me while watching a YouTube video on GaLore, the "sequel" to LoRA, and realizing how groundbreaking that technique is. I was daydreaming about pretraining my own model; this (probably impossible to implement) concept is a refined version of that model.
Size: 17.6 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

yuzhimanhua/SciMult
Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding (Findings of EMNLP'23)
Language: Python - Size: 173 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 9 - Forks: 0

ZhenbangDu/Seizure_MoE
The official code for the paper 'Mixture of Experts for EEG-Based Seizure Subtype Classification'.
Language: Python - Size: 150 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 5 - Forks: 0

sammcj/moa Fork of togethercomputer/MoA
Mixture-of-Ollamas
Language: Python - Size: 1.72 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 25 - Forks: 1

nath54/Dedale_LLM
A prototype of a Mixture-of-Experts LLM built with PyTorch. Currently in development; I am testing its ability to learn on small toy tasks before training it on large language datasets.
Language: Python - Size: 75.2 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

BearCleverProud/MoME
Repository for Mixture of Multimodal Experts
Language: Python - Size: 857 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 7 - Forks: 0

umbertocappellazzo/PETL_AST
This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters".
Language: Python - Size: 3.09 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 32 - Forks: 1

liuqidong07/MOELoRA-peft
[SIGIR'24] The official implementation code of MOELoRA.
Language: Python - Size: 10.2 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 105 - Forks: 11

RoyalSkye/Routing-MVMoE
[ICML 2024] "MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts"
Language: Python - Size: 379 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 30 - Forks: 3

yuhaoliu94/GP-HME
Gaussian Process-Gated Hierarchical Mixture of Experts
Language: Python - Size: 59.6 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

fchamroukhi/MEteorits
Mixtures-of-ExperTs modEling for cOmplex and non-noRmal dIsTributionS
Language: R - Size: 28.3 MB - Last synced at: 14 days ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 2

ChihLi/mcGP
This R package provides emulation of partial differential equation (PDE) systems using a mesh-clustered Gaussian process (mcGP) model.
Language: R - Size: 2.58 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 1

UNITES-Lab/MC-SMoE
[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
Language: Python - Size: 1.8 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 56 - Forks: 7

ssube/packit
an LLM toolkit
Language: Python - Size: 716 KB - Last synced at: 29 days ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

DongmingShenDS/Mistral_From_Scratch
Mistral and Mixtral (MoE) from scratch
Language: Python - Size: 5.13 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0

pranoyr/attention-models
Simplified implementation of SOTA deep learning papers in PyTorch
Language: Python - Size: 190 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0

danilo-assuncao/classifiers
Several machine learning classifiers in Python
Language: Python - Size: 11.3 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

Leeroo-AI/leeroo_orchestrator
The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration"
Language: Python - Size: 857 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 46 - Forks: 4

clementetienam/CCR_piezoresponse-force-microscopy-
Using CCR to predict piezoresponse force microscopy datasets
Size: 35.3 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

clementetienam/Ultra-fast-Deep-Mixtures-of-Gaussian-Process-Experts
Language: MATLAB - Size: 15.8 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

davidmrau/mixture-of-experts
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
Language: Python - Size: 73.2 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 818 - Forks: 88
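
For orientation, a minimal PyTorch sketch of the sparsely-gated, top-k routed MoE idea from Shazeer et al. that this repository re-implements; the class and variable names below are illustrative and are not this repository's API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal sparsely-gated MoE: route each token to its top-k experts."""
    def __init__(self, dim, num_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts)  # router producing expert logits
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))
             for _ in range(num_experts)]
        )

    def forward(self, x):                          # x: (tokens, dim)
        logits = self.gate(x)                      # (tokens, num_experts)
        weights, idx = logits.topk(self.k, dim=-1) # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)       # renormalize over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):                 # dense loop; real kernels dispatch sparsely
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

# usage: y = TopKMoE(dim=64)(torch.randn(16, 64))
```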

Leeroo-AI/mergoo
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
Language: Python - Size: 1.61 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 76 - Forks: 3

HLTCHKUST/MoEL
MoEL: Mixture of Empathetic Listeners
Language: Python - Size: 8.52 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 71 - Forks: 14

drawbridge/keras-mmoe
A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)
Language: Python - Size: 9.11 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 670 - Forks: 219
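
The multi-gate idea is compact enough to sketch: experts are shared across tasks, and each task gets its own softmax gate over them. The repository itself is TensorFlow Keras; the PyTorch sketch below is an assumed, illustrative version of the technique, not the repository's code.

```python
import torch
import torch.nn as nn

class MMoE(nn.Module):
    """Multi-gate MoE: shared experts, one gate and one head per task."""
    def __init__(self, in_dim, expert_dim, num_experts, num_tasks):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim, expert_dim), nn.ReLU()) for _ in range(num_experts)]
        )
        self.gates = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim, num_experts), nn.Softmax(dim=-1)) for _ in range(num_tasks)]
        )
        self.heads = nn.ModuleList([nn.Linear(expert_dim, 1) for _ in range(num_tasks)])

    def forward(self, x):                                                # x: (batch, in_dim)
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)    # (batch, E, expert_dim)
        outputs = []
        for gate, head in zip(self.gates, self.heads):
            w = gate(x).unsqueeze(-1)                                    # task-specific mixture weights
            outputs.append(head((w * expert_out).sum(dim=1)))            # weighted sum of experts, then task head
        return outputs                                                   # one prediction per task

# usage: y1, y2 = MMoE(in_dim=32, expert_dim=16, num_experts=4, num_tasks=2)(torch.randn(8, 32))
```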

terru3/moe-kit
Language: Jupyter Notebook - Size: 16.2 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

aidanscannell/phd-thesis
Bayesian Learning for Control in Multimodal Dynamical Systems | written in Org-mode
Language: TeX - Size: 35.2 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

YuejiangLIU/csl
[Preprint] Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts
Language: Python - Size: 232 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

hviidhenrik/argue
Anomaly detection using ARGUE - an advanced mixture-of-experts autoencoder model
Language: Python - Size: 7.03 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

JonasSievers/Mixture-of-Experts-based-Federated-Learning-for-Energy-Forecasting
Source code for our preprint paper "Advancing Accuracy in Load Forecasting using Mixture-of-Experts and Federated Learning".
Size: 566 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 1

andriygav/MixtureLib
Implementations of mixture models for different tasks.
Language: Python - Size: 8.77 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

tsc2017/MIX-GAN
Some recent state-of-the-art generative models in ONE notebook: (MIX-)?(GAN|WGAN|BigGAN|MHingeGAN|AMGAN|StyleGAN|StyleGAN2)(\+ADA|\+CR|\+EMA|\+GP|\+R1|\+SA|\+SN)*
Language: Jupyter Notebook - Size: 771 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 20 - Forks: 1

lucascodinglab/uMoE-Training_Neural_Networks_with_uncertain_data
Implementation and documentation for the Uncertainty-aware Mixture of Experts (uMoE) model, designed to train neural networks effectively on uncertain data using a mixture-of-experts architecture.
Language: Python - Size: 3.77 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AmazaspShumik/Mixture-Models
Hierarchical Mixture of Experts, Mixture Density Neural Network
Language: Jupyter Notebook - Size: 4.57 MB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 45 - Forks: 17
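
Since this entry pairs gating networks with density models, a small sketch of the mixture-density idea may help: the network outputs mixture weights, means, and scales of a Gaussian mixture instead of a point estimate. The names below are illustrative assumptions, not this repository's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MDN(nn.Module):
    """Mixture density network: predict a K-component Gaussian mixture over y given x."""
    def __init__(self, in_dim, hidden, k):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh())
        self.pi = nn.Linear(hidden, k)         # mixture logits
        self.mu = nn.Linear(hidden, k)         # component means
        self.log_sigma = nn.Linear(hidden, k)  # component log-scales

    def forward(self, x):
        h = self.body(x)
        return self.pi(h), self.mu(h), self.log_sigma(h)

def mdn_nll(pi_logits, mu, log_sigma, y):
    """Negative log-likelihood of targets y under the predicted mixture."""
    dist = torch.distributions.Normal(mu, log_sigma.exp())
    log_prob = dist.log_prob(y.unsqueeze(-1))                                  # (batch, K)
    return -torch.logsumexp(F.log_softmax(pi_logits, -1) + log_prob, -1).mean()

# usage: pi, mu, ls = MDN(1, 32, 5)(x); loss = mdn_nll(pi, mu, ls, y)
```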

ChihLi/mcGP-Reproducibility
Instructions for reproducing the results in the paper "Mesh-clustered Gaussian Process emulator for partial differential equation systems" (2023).
Language: R - Size: 194 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

aniket-agarwal1999/Mixture_of_Experts
Implementation of the Mixture of Experts paper
Language: Python - Size: 7.81 KB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

gaozhitong/MoSE-AUSeg
The official code repo for the paper "Mixture of Stochastic Experts for Modeling Aleatoric Uncertainty in Segmentation". (ICLR 2023)
Language: Python - Size: 939 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 1

hanyas/mimo
A toolbox for inference of mixture models
Language: Python - Size: 713 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 4

vivamoto/classifier
Machine learning code, derivative calculations, and optimization algorithms developed during the Machine Learning course at Universidade de Sao Paulo. All code is in Python with NumPy and Matplotlib, with an example at the end of each file.
Language: Python - Size: 13.1 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 12 - Forks: 4

etesami/MOE-FL
The implementation of the "Robust Federated Learning by Mixture of Experts" study.
Language: Python - Size: 20.3 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 1

ricardokleinklein/audio_attention
Dataset, example models, and demonstration of our Interspeech 2019 paper
Language: Python - Size: 59.5 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 0

YeonwooSung/Pytorch_mixture-of-experts
PyTorch implementation of MoE (mixture of experts)
Language: Python - Size: 9.77 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 1

yamsgithub/modular_deep_learning
Scripts implementing various learning-from-experts architectures, such as mixture of experts and product of experts, along with experiments on these architectures.
Language: Jupyter Notebook - Size: 332 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 1

jackgoffinet/poe-vae
A modular implementation of product of experts VAEs for multimodal data
Language: Python - Size: 456 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 1

Fraunhofer-AISEC/ARGUE 📦
Anomaly Detection by Recombining Gated Unsupervised Experts
Language: Python - Size: 51.8 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

skiptoniam/ecomix
ecomix is a package implementing model-based grouping of community data at the species level (Species Archetype Models) or the site level (Regions of Common Profile).
Language: R - Size: 63 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 7 - Forks: 2

clementetienam/Ultra-Fast-Mixture-of-Experts-Regression
Language: MATLAB - Size: 1.75 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 0

mike-gimelfarb/contextual-policy-reuse-deep-rl
Framework for Contextually Transferring Knowledge from Multiple Source Policies in Deep Reinforcement Learning
Size: 7.81 KB - Last synced at: 4 months ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

mhnazeri/PMoE
Planning Mixture of Experts model for end-to-end autonomous driving.
Language: Python - Size: 26.7 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 1

madaan/thinkaboutit
Code, data, and pre-trained models for our EMNLP 2021 paper "Think about it! Improving defeasible reasoning by first modeling the question scenario"
Language: Python - Size: 302 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

brchung2/tech_review Fork of BillyZhaohengLi/tech_review
Review of Google's multitask ranking system, comparing it to other methods used in recommender systems
Size: 93.8 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0
