An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: mixture-of-experts

lolguy91/perfect-llm-imho

The idea of creating the most capable LLM currently possible came to me while watching a YouTube video on GaLore, the "sequel" to LoRA, and realizing how groundbreaking that technique is. I had been daydreaming about pretraining my own model; this (probably impossible to implement) concept is a refined version of that model.

Size: 17.6 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

yuzhimanhua/SciMult

Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding (Findings of EMNLP'23)

Language: Python - Size: 173 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 9 - Forks: 0

ZhenbangDu/Seizure_MoE

The official code for the paper 'Mixture of Experts for EEG-Based Seizure Subtype Classification'.

Language: Python - Size: 150 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 5 - Forks: 0

sammcj/moa Fork of togethercomputer/MoA

Mixture-of-Ollamas

Language: Python - Size: 1.72 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 25 - Forks: 1

nath54/Dedale_LLM

This is a prototype of a Mixture-of-Experts LLM made with PyTorch. Currently in development; I am testing its learning capabilities with small, simple tests before training it on large language datasets.

Language: Python - Size: 75.2 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

BearCleverProud/MoME

Repository for Mixture of Multimodal Experts

Language: Python - Size: 857 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 7 - Forks: 0

umbertocappellazzo/PETL_AST

This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters".

Language: Python - Size: 3.09 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 32 - Forks: 1

liuqidong07/MOELoRA-peft

[SIGIR'24] The official implementation code of MOELoRA.

Language: Python - Size: 10.2 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 105 - Forks: 11

RoyalSkye/Routing-MVMoE

[ICML 2024] "MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts"

Language: Python - Size: 379 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 30 - Forks: 3

yuhaoliu94/GP-HME

Gaussian Process-Gated Hierarchical Mixture of Experts

Language: Python - Size: 59.6 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

fchamroukhi/MEteorits

Mixtures-of-ExperTs modEling for cOmplex and non-noRmal dIsTributionS

Language: R - Size: 28.3 MB - Last synced at: 14 days ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 2

ChihLi/mcGP

This R package provides emulation of partial differential equation (PDE) systems using a mesh-clustered Gaussian process (mcGP) model.

Language: R - Size: 2.58 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 1

UNITES-Lab/MC-SMoE

[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"

Language: Python - Size: 1.8 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 56 - Forks: 7

ssube/packit

an LLM toolkit

Language: Python - Size: 716 KB - Last synced at: 29 days ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

DongmingShenDS/Mistral_From_Scratch

Mistral and Mixtral (MoE) from scratch

Language: Python - Size: 5.13 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0

pranoyr/attention-models

Simplified Implementation of SOTA Deep Learning Papers in PyTorch

Language: Python - Size: 190 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0

danilo-assuncao/classifiers

Several machine learning classifiers in Python

Language: Python - Size: 11.3 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

Leeroo-AI/leeroo_orchestrator

The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration"

Language: Python - Size: 857 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 46 - Forks: 4

clementetienam/CCR_piezoresponse-force-microscopy-

Using CCR to predict piezoresponse force microscopy datasets

Size: 35.3 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

clementetienam/Ultra-fast-Deep-Mixtures-of-Gaussian-Process-Experts

Language: MATLAB - Size: 15.8 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

davidmrau/mixture-of-experts

PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538

Language: Python - Size: 73.2 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 818 - Forks: 88
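For context on what such a layer does, here is a minimal PyTorch sketch of top-k (sparsely gated) routing in the spirit of Shazeer et al. (2017); it is only an illustration of the idea, not code from this repository, and the class name, sizes, and parameters are invented for the example.

```python
# Minimal sketch of a sparsely-gated MoE layer with top-k routing (illustrative only,
# not the davidmrau/mixture-of-experts implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, dim, hidden, num_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts)  # router producing one logit per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                                   # x: (batch, dim)
        logits = self.gate(x)                               # (batch, num_experts)
        topk_vals, topk_idx = logits.topk(self.k, dim=-1)   # keep only k experts per token
        weights = F.softmax(topk_vals, dim=-1)              # renormalize over selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            idx = topk_idx[:, slot]
            w = weights[:, slot].unsqueeze(-1)
            for e in idx.unique():                          # dispatch tokens to their chosen expert
                mask = idx == e
                out[mask] += w[mask] * self.experts[int(e)](x[mask])
        return out

# toy usage
moe = SparseMoE(dim=16, hidden=32, num_experts=4, k=2)
print(moe(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```

Because only k experts run per input, the per-token compute stays roughly constant as the total expert count grows, which is the central appeal of the sparsely gated design.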

Leeroo-AI/mergoo

A library for easily merging multiple LLM experts and efficiently training the merged LLM.

Language: Python - Size: 1.61 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 76 - Forks: 3

HLTCHKUST/MoEL

MoEL: Mixture of Empathetic Listeners

Language: Python - Size: 8.52 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 71 - Forks: 14

drawbridge/keras-mmoe

A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)

Language: Python - Size: 9.11 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 670 - Forks: 219
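As a rough illustration of the multi-gate idea (shared experts with one softmax gate per task), here is a small TensorFlow sketch; the expert count, unit sizes, and task heads are placeholder assumptions, not the defaults of the keras-mmoe implementation.

```python
# Sketch of Multi-gate Mixture-of-Experts (MMoE, Ma et al., KDD 2018); illustrative only.
import tensorflow as tf

class MMoE(tf.keras.Model):
    def __init__(self, num_experts=4, expert_units=16, num_tasks=2):
        super().__init__()
        self.experts = [tf.keras.layers.Dense(expert_units, activation="relu")
                        for _ in range(num_experts)]
        # one softmax gate per task, producing weights over the shared experts
        self.gates = [tf.keras.layers.Dense(num_experts, activation="softmax")
                      for _ in range(num_tasks)]
        self.towers = [tf.keras.layers.Dense(1, activation="sigmoid")
                       for _ in range(num_tasks)]

    def call(self, x):
        expert_out = tf.stack([e(x) for e in self.experts], axis=1)  # (batch, experts, units)
        outputs = []
        for gate, tower in zip(self.gates, self.towers):
            w = gate(x)[:, :, tf.newaxis]                            # (batch, experts, 1)
            mixed = tf.reduce_sum(w * expert_out, axis=1)            # task-specific mixture
            outputs.append(tower(mixed))
        return outputs

model = MMoE()
preds = model(tf.random.normal((8, 10)))
print([p.shape for p in preds])  # two task outputs of shape (8, 1)
```

The per-task gates let each task weight the shared experts differently, which is how MMoE models task relationships without duplicating the whole network per task.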

terru3/moe-kit

Language: Jupyter Notebook - Size: 16.2 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

aidanscannell/phd-thesis

Bayesian Learning for Control in Multimodal Dynamical Systems | written in Org-mode

Language: TeX - Size: 35.2 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

YuejiangLIU/csl

[Preprint] Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts

Language: Python - Size: 232 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

hviidhenrik/argue

Anomaly detection using ARGUE - an advanced mixture-of-experts autoencoder model

Language: Python - Size: 7.03 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

JonasSievers/Mixture-of-Experts-based-Federated-Learning-for-Energy-Forecasting

Source code for our preprint paper "Advancing Accuracy in Load Forecasting using Mixture-of-Experts and Federated Learning".

Size: 566 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 1

andriygav/MixtureLib

Implementations of mixture models for different tasks.

Language: Python - Size: 8.77 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

tsc2017/MIX-GAN

Some recent state-of-the-art generative models in ONE notebook: (MIX-)?(GAN|WGAN|BigGAN|MHingeGAN|AMGAN|StyleGAN|StyleGAN2)(\+ADA|\+CR|\+EMA|\+GP|\+R1|\+SA|\+SN)*

Language: Jupyter Notebook - Size: 771 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 20 - Forks: 1

lucascodinglab/uMoE-Training_Neural_Networks_with_uncertain_data

Implementation and documentation for the Uncertainty-aware Mixture of Experts (uMoE) model, designed to train neural networks effectively on uncertain data using a mixture-of-experts architecture.

Language: Python - Size: 3.77 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AmazaspShumik/Mixture-Models

Hierarchical Mixture of Experts, Mixture Density Neural Network

Language: Jupyter Notebook - Size: 4.57 MB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 45 - Forks: 17

ChihLi/mcGP-Reproducibility

These instructions reproduce the results in the paper "Mesh-clustered Gaussian Process emulator for partial differential equation systems" (2023).

Language: R - Size: 194 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

aniket-agarwal1999/Mixture_of_Experts

Implementation of the Mixture of Experts paper

Language: Python - Size: 7.81 KB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

gaozhitong/MoSE-AUSeg

The official code repo for the paper "Mixture of Stochastic Experts for Modeling Aleatoric Uncertainty in Segmentation". (ICLR 2023)

Language: Python - Size: 939 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 1

hanyas/mimo

A toolbox for inference of mixture models

Language: Python - Size: 713 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 4

vivamoto/classifier

Machine learning code, derivative calculations, and optimization algorithms developed during the Machine Learning course at Universidade de Sao Paulo. All code is written in Python with NumPy and Matplotlib, with an example at the end of each file.

Language: Python - Size: 13.1 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 12 - Forks: 4

etesami/MOE-FL

The implementation of the "Robust Federated Learning by Mixture of Experts" study.

Language: Python - Size: 20.3 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 1

ricardokleinklein/audio_attention

Dataset, example models, and demonstration of our Interspeech 2019 paper

Language: Python - Size: 59.5 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 0

YeonwooSung/Pytorch_mixture-of-experts

PyTorch implementation of MoE (mixture of experts)

Language: Python - Size: 9.77 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 1

yamsgithub/modular_deep_learning

This repository contains scripts for implementing various expert-based learning architectures, such as mixture of experts and product of experts, and for running experiments with these architectures.

Language: Jupyter Notebook - Size: 332 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 1

jackgoffinet/poe-vae

A modular implementation of product of experts VAEs for multimodal data

Language: Python - Size: 456 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 1
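For reference, the standard product-of-experts fusion used in many multimodal VAEs combines per-modality Gaussian posteriors by summing precisions; the snippet below is a generic PyTorch sketch of that formula, not this repository's API, and the function name is invented for the example.

```python
# Generic product-of-experts fusion of Gaussian posteriors (precision-weighted combination).
import torch

def poe_gaussian(mus, logvars):
    """Combine per-modality posteriors N(mu_i, var_i) into a single Gaussian.

    mus, logvars: tensors of shape (num_experts, batch, latent_dim).
    """
    precision = torch.exp(-logvars)                # 1 / var_i
    combined_var = 1.0 / precision.sum(dim=0)      # var = 1 / sum_i (1 / var_i)
    combined_mu = combined_var * (mus * precision).sum(dim=0)
    return combined_mu, torch.log(combined_var)

mu, logvar = poe_gaussian(torch.zeros(3, 8, 16), torch.zeros(3, 8, 16))
print(mu.shape, logvar.shape)  # torch.Size([8, 16]) torch.Size([8, 16])
```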

Fraunhofer-AISEC/ARGUE 📦

Anomaly Detection by Recombining Gated Unsupervised Experts

Language: Python - Size: 51.8 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

skiptoniam/ecomix

ecomix is a package for model-based grouping of community data at the species level (Species Archetype Models) or the site level (Regions of Common Profile).

Language: R - Size: 63 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 7 - Forks: 2

clementetienam/Ultra-Fast-Mixture-of-Experts-Regression

Language: MATLAB - Size: 1.75 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 0

mike-gimelfarb/contextual-policy-reuse-deep-rl

Framework for Contextually Transferring Knowledge from Multiple Source Policies in Deep Reinforcement Learning

Size: 7.81 KB - Last synced at: 4 months ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

mhnazeri/PMoE

Planning Mixture of Experts model for end-to-end autonomous driving.

Language: Python - Size: 26.7 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 1

madaan/thinkaboutit

Code, data, and pre-trained models for our EMNLP 2021 paper "Think about it! Improving defeasible reasoning by first modeling the question scenario"

Language: Python - Size: 302 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

brchung2/tech_review Fork of BillyZhaohengLi/tech_review

A review of Google's multitask ranking system, comparing it to other methods used in recommender systems

Size: 93.8 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Related Keywords
mixture-of-experts (150), deep-learning (30), moe (24), machine-learning (22), llm (22), pytorch (21), large-language-models (20), artificial-intelligence (14), nlp (13), transformer (11), transformers (9), ai (9), computer-vision (9), neural-networks (6), language-model (6), gaussian-processes (6), deep-neural-networks (5), llama (5), multi-task-learning (5), llm-inference (5), efficiency (5), ensemble (4), generative-ai (4), python (4), ml (4), attention (4), natural-language-processing (4), vision-transformer (4), llms (4), multimodal-large-language-models (4), transfer-learning (4), llms-reasoning (3), conditional-computation (3), vision-language-models (3), mixture-models (3), foundation-models (3), ensemble-learning (3), prompt-tuning (3), tensorflow (3), keras (3), low-level-vision (3), multi-modal (3), mixtral-8x7b (3), huggingface (3), instruction-tuning (3), graph-neural-networks (3), unsupervised-learning (3), pytorch-implementation (3), gpt (3), lora (3), inference (3), deepseek (3), neural-network (3), quantization (2), multitask-learning (2), genai (2), model (2), kdd2018 (2), contrastive-learning (2), megatron-lm (2), anomaly-detection (2), awesome (2), training (2), federated-learning (2), moa (2), optimization (2), bert (2), attention-mechanism (2), deep-reinforcement-learning (2), domain-adaptation (2), generative-model (2), peft (2), mistral-7b (2), distributed-systems (2), kv-cache (2), clustering (2), finite-element-methods (2), partial-differential-equations (2), r-package (2), uncertainty-quantification (2), surrogate-models (2), routing (2), cnn (2), sft (2), qwen (2), machine-learning-algorithms (2), mistral (2), regression-algorithms (2), agent (2), adaptive-computation (2), peft-fine-tuning-llm (2), mergekit (2), generalization (2), fine-tuning (2), llama3 (2), parameter-efficient-fine-tuning (2), agents (2), mixture-of-adapters (2), tensorflow2 (2), small-language-models (2)