GitHub topics: muzero

Repositories

opendilab/LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Language: Python - Size: 117 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,416 - Forks: 165

rlglab/minizero

MiniZero: An AlphaZero and MuZero Training Framework

Language: C++ - Size: 2.34 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 94 - Forks: 24

SverreNystad/MuZero

An implementation of the MuZero algorithm by Google DeepMind.

Language: Python - Size: 37.9 MB - Last synced at: 5 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

yenw/computer-go-dataset

datasets for computer go

Language: C++ - Size: 597 MB - Last synced at: 24 days ago - Pushed at: about 1 year ago - Stars: 153 - Forks: 40

huawei-noah/xingtian

xingtian is a componentized library for the development and verification of reinforcement learning algorithms

Language: Python - Size: 7.05 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 311 - Forks: 90

werner-duvaud/muzero-general

MuZero

Language: Python - Size: 7.09 MB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 2,644 - Forks: 647

rystrauss/dopamax

Reinforcement learning in pure JAX.

Language: Python - Size: 262 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 13 - Forks: 1

rlglab/optionzero

[ICLR 2025 Oral] OptionZero: A method for autonomously discovering and utilizing options in the MuZero algorithm

Language: C++ - Size: 2.67 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 14 - Forks: 0

sail-sg/rosmo

Codes for "Efficient Offline Policy Optimization with a Learned Model", ICLR2023

Language: Python - Size: 66.4 KB - Last synced at: 26 days ago - Pushed at: about 2 years ago - Stars: 29 - Forks: 0

jianzhnie/RLZero

A clean and easy implementation of MuZero, AlphaZero and Self-Play reinforcement learning algorithms for any game.

Language: Python - Size: 384 KB - Last synced at: 4 months ago - Pushed at: 10 months ago - Stars: 14 - Forks: 0

Hwhitetooth/jax_muzero

An implementation of MuZero in JAX.

Language: Python - Size: 519 KB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 56 - Forks: 8

alexZajac/muzero_experiments

A set of experiments and human-playing comparisons with the MuZero agent from Google DeepMind, made as part of a research project with l'École polytechnique.

Language: Python - Size: 131 MB - Last synced at: 4 months ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 2

trunghng/muzero

Language: Python - Size: 187 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Zeta36/muzero

A simple implementation of MuZero algorithm for connect4 game

Language: Jupyter Notebook - Size: 59.6 KB - Last synced at: 3 months ago - Pushed at: almost 5 years ago - Stars: 97 - Forks: 20

bellerb/chappie.ai

Generalized AI to perform a multitude of tasks written in python3

Language: Jupyter Notebook - Size: 406 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 21 - Forks: 6

souvikshanku/tic-tac-toe-zero

MuZero - tic-tac-toe

Language: Python - Size: 87.9 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

tuero/muzero-cpp

A C++ pytorch implementation of MuZero

Language: C++ - Size: 65.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 5

benborder/drla-sim

Trains a deep reinforcement learning agent in simulation testbed environments with the DRLA library.

Language: C++ - Size: 68.4 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

benborder/drla-atari

Trains deep reinforcement learning agents in Atari environments via the DRLA library.

Language: C++ - Size: 1.19 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

benborder/drla

C++ Deep Reinforcement Learning Agent library

Language: C++ - Size: 690 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 1

AntoniovanDijck/BlackJackRL

Deep Q Learning blackbox strategies for casino games

Language: Jupyter Notebook - Size: 95.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Itomigna2/Muesli-lunarlander

Muesli RL algorithm implementation (PyTorch) (LunarLander-v2)

Language: Jupyter Notebook - Size: 225 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 3

DHDev0/Muzero

PyTorch implementation of MuZero for gym environments. It supports any Discrete, Box, and Box2D configuration for the action space and observation space.

Language: Python - Size: 10 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 1

abrahamabel/Muzero-GDM_Pseudo_Code

A Notebook implementation of the Pseudocode from the original Muzero paper

Language: Jupyter Notebook - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

ChukwumaChukwuma/enyimba_ai

Applying AlphaZero Self-Play Tactics to LLaMA for Enhanced Chatbot Interaction

Language: Python - Size: 1.11 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Itomigna2/Muesli-cartpole

Simple Muesli RL algorithm implementation (PyTorch)

Language: Jupyter Notebook - Size: 8.35 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

PyTorch implementation of Stochastic MuZero for gym environments. This algorithm supports a wide range of action and observation spaces, both discrete and continuous.

Language: Python - Size: 12.6 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 37 - Forks: 3

abrahamabel/GenesisZero

GenesisZERO: potential applications for MCTS agents with LLMs for sequential decision-making

Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

DHDev0/Muzero-unplugged

PyTorch implementation of MuZero Unplugged for gym environments. This algorithm supports a wide range of action and observation spaces, both discrete and continuous.

Language: Python - Size: 2.27 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 2

michaelnny/muzero

A PyTorch implementation of DeepMind's MuZero agent

Language: Python - Size: 48.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 18 - Forks: 0

kaesve/muzero

A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and pit both algorithms against each other, and investigate the reliability of learned MuZero MDP models.

Language: Jupyter Notebook - Size: 115 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 138 - Forks: 24

johan-gras/MuZero

A structured implementation of MuZero

Language: Python - Size: 41 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 198 - Forks: 56

hr0nix/omega

A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environments.

Language: Python - Size: 577 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 32 - Forks: 4

liudengfeng/mrlxq

MuZero reinforcement learning algorithm for Chinese Xiangqi

Language: Python - Size: 168 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

BIGBALLON/Toward-AGZ

Materials for AlphaGo

Size: 2.93 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

hayashimasa/Robust_MuZero

A robust variant of MuZero

Language: Python - Size: 26.9 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0

mdhiebert/meta-minichess

Meta-learning experiments for the game of minichess and related rule variants.

Language: Python - Size: 67.5 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

svenssona/muzero

Learning how muzero works

Language: Jupyter Notebook - Size: 19.5 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

expz/muzero-ray

MuZero for Atari implemented using the ray library.

Language: Python - Size: 345 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

Nebraskinator/SuperMarioBrosAI (fork of johan-gras/MuZero)

MuZero for Super Mario Bros

Language: Python - Size: 14.8 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

Atze00/muzero-cartpole

Language: Python - Size: 178 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

fpga-tom/pyzero

Language: C++ - Size: 262 KB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

Related Keywords

muzero (42), reinforcement-learning (26), deep-reinforcement-learning (16), pytorch (13), alphazero (11), mcts (11), deep-learning (9), monte-carlo-tree-search (8), machine-learning (7), model-based-rl (6), atari (6), gym (5), ppo (5), libtorch (4), cpp (4), transformer (4), reinforcement-learning-algorithms (4), ai (4), python3 (4), rl (4), jax (4), arxiv-papers (3), gym-environments (3), artificial-intelligence (3), python (3), alphago (3), model-based-reinforcement-learning (3), tensorflow (3), resnetv2 (3), dreamer (3), gumbel-muzero (3), neural-network (3), lstm (3), stochastic-muzero (3), llms (2), muzero-stochastic (2), dqn (2), resnetv1 (2), muzero-unplugged (2), offline-reinforcement-learning (2), board-games (2), gomoku (2), mcts-algorithm (2), cartpole (2), self-play (2), tictactoe (2), go (2), ray (2), colab (2), jupyter-notebook (2), deepmind (2), muesli (2), arxiv (2), rlx (1), q-learning-algorithm (1), torch (1), lunarlander-v2 (1), mlx (1), muzero-pseudocode (1), chatbot (1), generative-ai (1), deep-q-network (1), qmix (1), replay (1), super-mario-bros (1), minichess (1), meta-minichess (1), meta-learning (1), gym-minichess (1), robust-control (1), deep (1), alphago-zero (1), xiqngqi (1), rllib (1), rlax (1), nethack (1), minihack (1), flax (1), world-models (1), tf2 (1), tensorflow2 (1), reinforcement-learning-agent (1), llm-agent (1), llm (1), large-language-models (1), gym-environment (1), online-reinforcement-learning (1), multilayer-perceptron (1), cartpole-v1 (1), strategy (1), rlhf (1), prompt-engineering (1), policy-evaluation (1), natural-language-processing (1), llama2 (1), impala (1), tygem (1), sgf (1), phoenixgo (1), minigo (1)