GitHub topics: self-play

Repositories

suragnair/alpha-zero-general

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Language: Jupyter Notebook - Size: 414 MB - Last synced at: 4 days ago - Pushed at: 5 months ago - Stars: 4,134 - Forks: 1,088

opendilab/LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Language: Python - Size: 115 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 1,370 - Forks: 154

opendilab/DI-engine

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

Language: Python - Size: 292 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 3,405 - Forks: 400

Merve00akckaya/tictactoe_AI

Play classic Tic-Tac-Toe against an intelligent AI opponent in your terminal.

Language: Python - Size: 19.5 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

opendilab/DI-star

An artificial intelligence platform for StarCraft II with large-scale distributed training and grandmaster-level agents.

Language: Python - Size: 19.4 MB - Last synced at: 5 days ago - Pushed at: 2 months ago - Stars: 1,273 - Forks: 120

uclaml/SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

Language: Python - Size: 3.49 MB - Last synced at: 28 days ago - Pushed at: about 1 year ago - Stars: 1,146 - Forks: 100

inspirai/TimeChamber

A Massively Parallel Large Scale Self-Play Framework

Language: Python - Size: 182 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 337 - Forks: 35

jianzhnie/RLZero

A clean and easy implementation of MuZero, AlphaZero and Self-Play reinforcement learning algorithms for any game.

Language: Python - Size: 384 KB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 14 - Forks: 0

StarLight1212/self_play

Self play strategy for all interesting games.

Language: Jupyter Notebook - Size: 206 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

uclaml/SPPO

The official implementation of Self-Play Preference Optimization (SPPO)

Language: Python - Size: 3 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 463 - Forks: 47

cestpasphoto/alpha-zero-general

A very fast implementation of AlphaZero, applied to games like Splendor, Santorini, The Little Prince, … Browser version available

Language: Python - Size: 643 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 42 - Forks: 12

seungeunrho/football-paris

The exact code used by team "liveinparis" in the Kaggle football competition, where it ranked 6th of 1,141.

Language: Python - Size: 56.4 MB - Last synced at: 19 days ago - Pushed at: over 4 years ago - Stars: 57 - Forks: 12

shuoyang2000/STLgame

Code for "STLGame: Signal Temporal Logic Games in Adversarial Multi-Agent Systems"

Language: Python - Size: 24.7 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 1

e-dong/space-war-rl

Recreating Bill Seiler's 1985 version of Space War and training RL agents with Self-Play

Language: Python - Size: 14.1 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 6 - Forks: 0

neoyung/connect-4

A reinforcement learning agent trained without prior human knowledge

Language: Jupyter Notebook - Size: 1.25 MB - Last synced at: 3 months ago - Pushed at: about 5 years ago - Stars: 8 - Forks: 6

AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" by DeepMind.

Language: Python - Size: 124 KB - Last synced at: about 1 month ago - Pushed at: about 7 years ago - Stars: 88 - Forks: 28

hishamcse/Advanced-DRL-Renegades-Game-Bots

A collection of my implemented advanced & complex RL agents for games like Soccer, Street Fighter, Mortal Kombat, Rubik's Cube, Vizdoom, Montezuma, Kungfu-master, Super-Mario-bros, HalfCheetah and more by implementing advanced DRL concepts like decision transformers, RND, MARL, A3C, ICM & sample_factory. To see my other rl agents please visit

Language: Jupyter Notebook - Size: 119 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

keeeal/alpha-ut3

AlphaZero for ultimate tic-tac-toe.

Language: Python - Size: 1000 KB - Last synced at: 4 months ago - Pushed at: almost 6 years ago - Stars: 2 - Forks: 0

lukasmyth96/Piggy

Using Value Iteration and Policy Iteration to discover the optimal solution for the strategic dice game PIG. Ultimately interested in whether the optimal solution can be reached through self-play alone.

Language: Python - Size: 23.1 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 1

rlsn/AlphaYun (fork of sinmentis/AlphaYun)

Play Bor-Bor Zan strategically!

Language: Python - Size: 1020 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Sebastian-Schuchmann/Self-Play-TicTacToe-AI-ML-Agents-

A Self Play reinforcement learning Agent learns to play TicTacToe using the ML-Agents Framework in Unity.

Language: C# - Size: 8.4 MB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 32 - Forks: 9

Galtvam/OthelloZero

A Smart Agent using reinforcement learning with CNN + MCTS to learn to play Othello/Reversi

Language: Python - Size: 101 MB - Last synced at: 6 days ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

cmubig/sorts

Code base for Social Robot Tree Search (SoRTS).

Language: Python - Size: 39.9 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 14 - Forks: 3

tobiasemrich/SchafkopfRL

AI agents for the Bavarian card game Schafkopf, trained with reinforcement learning

Language: Python - Size: 482 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 34 - Forks: 5

avoroshilov/rl-selfplay

Simple reinforcement learning framework for selfplay experiments

Language: Python - Size: 22.5 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

Naton1/osrs-pvp-reinforcement-learning

Train a neural network to PvP in Old School RuneScape using reinforcement learning.

Language: Java - Size: 195 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 3

af1tang/convogym

A gym environment to train chatbots.

Language: Python - Size: 7.18 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 20 - Forks: 3

ChuaCheowHuan/gym-continuousDoubleAuction

A custom MARL (multi-agent reinforcement learning) environment where multiple agents trade against one another (self-play) in a zero-sum continuous double auction. Ray [RLlib] is used for training.

Language: Jupyter Notebook - Size: 51.7 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 129 - Forks: 29

cedrickchee/baselines (fork of openai/baselines)

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Language: Python - Size: 4.59 MB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 3 - Forks: 1

maniator/self-playing-baseball

Self playing, talking baseball game.

Language: TypeScript - Size: 251 KB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

backpropper/s2p

Code repository for "On the interaction between supervision and self-play in emergent communication" (ICLR 2020)

Language: Python - Size: 13.7 KB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 15 - Forks: 2

TARTRL/TARTRL

A distributed reinforcement learning framework based on PyTorch

Language: Python - Size: 14.6 KB - Last synced at: 28 days ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

Jackory/RPBT

Implementation of RPPO (Risk-sensitive PPO) and RPBT (Population-based self-play with RPPO)

Language: Python - Size: 56.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

novoselov-ab/ai-zero

Implementation of the AlphaGo Zero paper in one C++ header file without any dependencies

Language: C++ - Size: 11.5 MB - Last synced at: 23 days ago - Pushed at: about 7 years ago - Stars: 5 - Forks: 5

dellalibera/gym-backgammon

Backgammon OpenAI Gym

Language: Python - Size: 5.67 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 31 - Forks: 11

dellalibera/td-gammon

TD-Gammon implementation

Language: Python - Size: 1.06 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 25 - Forks: 8

ChuaCheowHuan/PBT_MARL_watered_down

My attempt to reproduce a watered-down version of PBT (population-based training) for MARL (multi-agent reinforcement learning) using DDPPO (decentralized & distributed proximal policy optimization) from ray[rllib].

Language: Jupyter Notebook - Size: 441 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 0

fqjin/2048NN

Train a neural network to play 2048

Language: Python - Size: 21.7 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 1

peldszus/alpha-zero-general-lib

An implementation of the AlphaZero algorithm for adversarial games to be used with the machine learning framework of your choice

Language: Python - Size: 154 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 9 - Forks: 1

mbaske/ml-selfplay-fighter

Self-Play Boxing Match made with Unity Machine Learning Agents

Language: C# - Size: 389 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 17 - Forks: 4

riturajkaushik/self-learning-tic-tac-toe

Donald Michie's MENACE approach to an unbeatable self-learning Tic-Tac-Toe AI game

Language: Python - Size: 96.7 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 3

OneUpWallStreet/TD-Gammon

A Python implementation of the TD-Gammon algorithm developed by Gerald Tesauro at IBM's Thomas J. Watson Research Center.

Language: Python - Size: 1.33 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

sirmammingtonham/alphastone

Using self-play, MCTS, and a deep neural network to create a Hearthstone AI player

Language: Python - Size: 16.5 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 24 - Forks: 7

brayo303/GGPinLudii

Artificial Intelligence General Game Playing Undergraduate Thesis

Language: Java - Size: 21.2 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

ankursharma-iitd/AlphaZero-for-Go

Implementation of AlphaGo Zero - Reinforcement Learning Project, COL870 @iit-delhi

Language: Python - Size: 513 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

keeeal/temporal-ut3

Temporal difference learning for ultimate tic-tac-toe.

Language: Python - Size: 21.5 KB - Last synced at: 4 months ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 2

Kajune/KODOKU

Multi-agent Self-Play Reinforcement Learning Library

Language: Python - Size: 90.8 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

AutumnCrocus/shadow_sim

Emulator and AI of Shadowverse

Language: Python - Size: 579 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 1

navreeetkaur/AlphaGoZero

Implementation of AlphaGo Zero - Reinforcement Learning Project, COL870 @iit-delhi

Language: Python - Size: 341 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 0

noahwillcrow/alpha-noah

A simple implementation of an AI game player inspired by AlphaZero

Language: Rust - Size: 231 KB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

ShibiHe/Model-Free-Episodic-Control

An implementation of the paper "Model-Free Episodic Control"

Language: Python - Size: 1.73 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 34 - Forks: 11

XLekunberri/ZeroChess

A chess program based on DeepMind's AlphaZero.

Language: Python - Size: 17.6 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

Related Keywords

self-play (52), reinforcement-learning (36), pytorch (16), deep-learning (15), mcts (13), machine-learning (10), artificial-intelligence (10), ppo (9), alphazero (8), alphago-zero (7), monte-carlo-tree-search (7), multi-agent-reinforcement-learning (6), neural-network (6), game (6), alpha-zero (6), deep-reinforcement-learning (5), python (5), ai (4), gym (4), tensorflow (4), alphago (4), ray (3), rllib (3), atari (3), openai-gym (3), othello (3), policy-gradient (3), backgammon (3), temporal-differencing-learning (2), cpp (2), td-gammon (2), tic-tac-toe-game (2), unity (2), reinforcment-learning (2), fine-tuning (2), large-language-models (2), multi-agent (2), ml-agents (2), convolutional-neural-networks (2), population-based-training (2), resnet (2), marl (2), ultimate-tic-tac-toe (2), muzero (2), deep (2), tictactoe (2), imitation-learning (2), numpy (2), game-playing-agent (2), gomoku (2), keras (2), gobang (1), game-theory (1), market-microstructure (1), fictitious (1), n-player (1), quantitative-finance (1), quantitative-trading (1), dqn-ep (1), rust (1), convolutional-neural-network (1), zero-sum (1), zero-sum-games (1), algorithms (1), dota2-bot (1), machine-learning-engineering (1), openai-five (1), gogame (1), cardgame (1), chess (1), java (1), oldschool-runescape (1), osrs (1), rsps (1), runescape (1), active-learning (1), chatbot-platform (1), convogym (1), dialog-systems (1), natural-language-generation (1), natural-language-processing (1), nlp (1), double-auction (1), financial-engineering (1), knn (1), gym-environment (1), high-frequency-trading (1), limit-order-book (1), lstm (1), temporal-difference (1), temporal (1), backgammon-game (1), ggp (1), gym-backgammon (1), ismcts (1), gym-env (1), openai-gym-environment (1), hearthstone (1), td-learning (1), value-function (1)