An open API service providing repository metadata for many open source software ecosystems.

Topic: "temporal-differencing-learning"

mpatacchiola/dissecting-reinforcement-learning

Python code, PDFs and resources for the series of posts on Reinforcement Learning which I published on my personal blog

Language: Python - Size: 28.1 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 621 - Forks: 180

Madhu009/Deep-math-machine-learning.ai

A blog about machine learning and deep learning algorithms and the math behind them, with machine learning algorithms written from scratch.

Language: Jupyter Notebook - Size: 44.5 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 195 - Forks: 174

Scitator/rl-course-experiments

Language: Jupyter Notebook - Size: 2 MB - Last synced at: about 9 hours ago - Pushed at: about 8 years ago - Stars: 77 - Forks: 23

BardOfCodes/DRL_in_CV

A course on Deep Reinforcement Learning in Computer Vision.

Language: HTML - Size: 26.4 MB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 62 - Forks: 12

callmespring/RL-short-course

Reinforcement Learning Short Course

Language: Jupyter Notebook - Size: 95.6 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 53 - Forks: 18

BY571/Munchausen-RL

PyTorch implementation of the Munchausen Reinforcement Learning Algorithms M-DQN and M-IQN

Language: Jupyter Notebook - Size: 6.56 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 37 - Forks: 3

agrawal-rohit/tic-tac-toe-ai-bots

AI bots playing Tic Tac Toe

Language: Python - Size: 43.9 KB - Last synced at: 5 months ago - Pushed at: almost 3 years ago - Stars: 33 - Forks: 56

dellalibera/gym-backgammon

Backgammon OpenAI Gym

Language: Python - Size: 5.67 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 31 - Forks: 11

mvrahden/reinforce-js

[INACTIVE] A collection of various machine learning solvers. The library takes an object-oriented approach (built with TypeScript) and tries to deliver simplified interfaces that make the algorithms easy to use.

Language: TypeScript - Size: 169 KB - Last synced at: 8 days ago - Pushed at: about 7 years ago - Stars: 31 - Forks: 7

dellalibera/td-gammon

TD-Gammon implementation

Language: Python - Size: 1.06 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 25 - Forks: 8

tirthajyoti/RL_basics

Basic Reinforcement Learning algorithms

Language: Jupyter Notebook - Size: 2.29 MB - Last synced at: about 2 months ago - Pushed at: about 6 years ago - Stars: 18 - Forks: 13

moporgic/TDL2048-Demo

Temporal Difference Learning for the Game of 2048 (Demo)

Language: Python - Size: 160 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 12 - Forks: 6

RicardoDominguez/RL-Intro

Introduction to Reinforcement Learning in Python

Language: Python - Size: 21.5 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 12 - Forks: 4

shehio/ReinforcementLearning

Reinforcement Learning algorithms with nothing abstracted away

Language: Python - Size: 788 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 1

shuvoxcd01/GridMind

A library of reinforcement learning (RL) algorithms.

Language: Python - Size: 334 KB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 7 - Forks: 1

Elktrn/Reinforcement_Learning_solving_a_simple_4_4_Gridworld_using_SARSA-in-python

Solving a simple 4x4 Gridworld, similar to OpenAI Gym's FrozenLake, using the SARSA temporal-difference reinforcement learning method

Language: Jupyter Notebook - Size: 245 KB - Last synced at: 14 days ago - Pushed at: 4 months ago - Stars: 5 - Forks: 0

pouyan-asg/path-planning-with-RL-algorithms

Path Planning with Reinforcement Learning algorithms in an unknown environment

Language: Python - Size: 4.76 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

Vansh404/PathPlanning_withRL

Using Q-Learning control for path planning of mobile agents in an environment.

Language: Python - Size: 137 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

matakshay/DeepRL-for-Delayed-Rewards

Deep RL for Temporal Credit Assignment in decision processes with delayed rewards

Language: Jupyter Notebook - Size: 5.24 MB - Last synced at: about 14 hours ago - Pushed at: about 3 years ago - Stars: 5 - Forks: 2

JHurricane96/chessai

A self-learning chess artificial intelligence

Language: Python - Size: 12.7 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 2

Suchetaaa/CS747-Assignments

Foundations Of Intelligent Learning Agents (FILA) Assignments

Language: Python - Size: 3.04 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 5 - Forks: 0

RPegoud/Temporal-Difference-learning

Implementation of Temporal Difference Learning algorithms, experiment featured in Towards Data Science

Language: Jupyter Notebook - Size: 24.2 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 1

Elktrn/Reinforcement-Learning-solving-a-simple-4-4-Gridworld-using-TD0-evaluation-method-in-python

Solving a simple 4x4 Gridworld, similar to OpenAI Gym's FrozenLake, using a temporal-difference reinforcement learning method

Language: Jupyter Notebook - Size: 55.7 KB - Last synced at: 24 days ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

Quentin18/gymnasium-2048

Gymnasium environment for the game 2048

Language: Python - Size: 2.46 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 3 - Forks: 1

rhalbersma/doctrina

Exercises in reinforcement learning

Language: Jupyter Notebook - Size: 11.2 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

francescotorregrossa/deep-reinforcement-learning-nanodegree

Exercises and projects from Udacity's Nanodegree

Language: Jupyter Notebook - Size: 106 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

ricky-ma/DecentralizedRL

Decentralized temporal-difference reinforcement learning over randomly reshuffled topology

Language: Python - Size: 32.2 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 1

Sushant-ctrl/RL-IMPLEMENTATIONS

This repository contains the code and sources for the various RL algorithms that I have implemented.

Language: Python - Size: 3.03 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 0

CoeusMaze/Adaptive-Temporal-Difference-Learning

Implemented AdaTD and compared it with other optimization methods in temporal difference learning.

Language: Python - Size: 50.8 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 0

Prakhar-FF13/Reinforcement-Learning-With-Python

Reinforcement Learning Notebooks

Language: Python - Size: 115 KB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 1

tnmichael309/2048AI

My RL Project (2048 World Record + IEEE TCIAIG Journal Source Code)

Language: C++ - Size: 26.4 KB - Last synced at: 8 months ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 0

shaheennabi/Reinforcement-or-Deep-Reinforcement-Learning-Practices-and-Mini-Projects

Reinforcement Learning (RL) 🤖! This repository is your hands-on guide to implementing RL algorithms, from Markov Decision Processes (MDPs) to advanced methods like PPO and DDPG. 🚀 Build smart agents, learn the math behind policies, and experiment with real-world applications! 🔥💡

Size: 27.3 KB - Last synced at: 26 days ago - Pushed at: 27 days ago - Stars: 2 - Forks: 0

social-ai-uoft/ad-paper

[NeurIPS 2024] Temporal-Difference Learning Using Distributed Error Signals

Language: Python - Size: 713 KB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 2 - Forks: 0

VEXLife/Accelerated-TD

My Implementation of the Accelerated Gradient Temporal Difference Learning algorithm in Python

Language: Python - Size: 1.66 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

imimali/reinforcement-learning-specialization

Reinforcement Learning Specialization courses solutions

Language: Jupyter Notebook - Size: 74.3 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

imimali/blackjack

Well I'm gonna build my own theme park

Language: Python - Size: 9.77 KB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 1

worenga/nine-mens-morris-challenge

Submission for the it-talents.de/Adesso Code Competition, October 2017 ("Kampf gegen Mühlen"). An ES6 web application based on vue.js, fabric.js and synaptic for playing Nine Men's Morris (Mühle) in the browser. AI opponents of varying strength and with different characteristics are available. The game and the AI run entirely in the browser as WebWorkers.

Language: JavaScript - Size: 2.32 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 0

Mileristovski/AI-ReinforcementLearning

A reinforcement learning project testing various RL algorithms, including Dynamic Programming, Monte Carlo and Temporal Difference Learning, on several environments such as Grid World, Monty Hall and Rock-Paper-Scissors. 🚀

Language: Rust - Size: 1.38 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

arielfayol37/Easy21

Applying reinforcement learning methods to a simple card game.

Language: Python - Size: 2.24 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

aadimator/drl-nd

My solution notebooks for the Deep Reinforcement Learning Nanodegree by Udacity

Language: Jupyter Notebook - Size: 33.2 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 2

ahmed-k-aly/pacman-contest Fork of ngacho/pacman-contest

Pacman AI contest for COSC-241

Language: Python - Size: 7.36 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

AestheticVoyager/Temporal-Difference-Learning

TD-Gammon is a computer backgammon program developed in 1992 by Gerald Tesauro at IBM's Thomas J. Watson Research Center. Its name comes from the fact that it is an artificial neural net trained by a form of temporal-difference learning, specifically TD-lambda.

Language: Python - Size: 79.1 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

John-CYHui/Reinforcement-Learning-Cliff-Walking

This repo contains a Python implementation of the cliff walking problem (Example 6.6) from Sutton & Barto's Reinforcement Learning: An Introduction.

Language: Python - Size: 1.24 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

NagaChiang/Fib2584-AI Fork of oxguy3/2584

An AI plays Fib2584, a variation of the well-known game 2048, with temporal difference learning.

Language: JavaScript - Size: 2.78 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

tybens/rl-easy21

Reinforcement Learning as applied to a simplified blackjack game: Easy21

Language: Python - Size: 418 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

purvasingh96/Deep-Reinforcement-Learning

Various reinforcement learning algorithms implemented in Python. This repo also contains a DQN approach to a credit-card anomaly detection use case.

Language: Jupyter Notebook - Size: 18.3 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

muffintoad/game-of-ur-reinforcement-learning

Solving The Royal Game of Ur using Reinforcement Learning - Monte Carlo, TD Methods, Dynamic Programming, DQN

Language: Jupyter Notebook - Size: 9.52 MB - Last synced at: 5 months ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 1

MrGeislinger/UdacityMLND_RL-MiniProject_TemporalDifference

Temporal difference mini project from the reinforcement learning section of Udacity's Machine Learning Nanodegree (MLND). This mini project wasn't required to be turned in; used as a teaching tool.

Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: 5 days ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

chetweger/min-max-games

Watch the AI learn to play Meta-Tic-Tac-Toe.

Language: JavaScript - Size: 2.97 MB - Last synced at: about 1 year ago - Pushed at: about 11 years ago - Stars: 1 - Forks: 0

di0nion/Tic-Tac-Toe

Tic-Tac-Toe web game (HTML5, CSS3, JavaScript)

Language: JavaScript - Size: 22.5 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

soufianeayache/all-rl-algorithms

Implementation of all RL algorithms in a simpler way

Language: Jupyter Notebook - Size: 4.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

mshuqair/Reinforcement-Learning-Grid-World

Implementation of some Reinforcement Learning methods for grid world in Python

Language: Python - Size: 210 KB - Last synced at: 11 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

mhizterpaul/taxi-route-optimization

Language: Jupyter Notebook - Size: 2.9 MB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

swarup3204/RL_2024_Assignments

RL Assignments, IIT KGP Autumn 2024

Language: Python - Size: 264 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Triadziuch/2048-AI-Temporal-Difference

Temporal Difference Learning AI to play 2048

Language: C++ - Size: 7.82 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

sColin16/Tic-Tac-Toe-Learning

Play Tic-Tac-Toe against a reinforcement learning agent that just learned to play through temporal difference learning

Language: Python - Size: 21.5 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

safffrron/Analysis-of-Deep-Reinforcement-Learning

Theories and code related to Deep learning topics involved in Reinforcement learning

Language: Jupyter Notebook - Size: 20.1 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

fionalutaj/Year4_ReinforcementLearning

This repo contains the code used in the two Reinforcement Learning assignments offered by the Department of Computing at Imperial College. Each folder contains the individual spec file along with the code solution.

Language: Jupyter Notebook - Size: 1.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

arya-ebrahimi/rl-playground

Tabular and deep RL algorithms

Language: Jupyter Notebook - Size: 64.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

hritikb/Reinforcement-Learning-Algorithms

Language: Jupyter Notebook - Size: 1.02 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

JVP15/td-gammon Fork of dellalibera/td-gammon

TD-Gammon implementation

Language: Python - Size: 1.27 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Kasaderos/reinforce

RL algorithms

Language: Python - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

alliajagbe/temporaldiff-vs-monte

Visualization of some reinforcement learning algorithms (SARSA, Q-learning, Monte Carlo)

Language: Jupyter Notebook - Size: 54.7 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

saschaschramm/tiny-chatgpt

Researching the reinforcement learning algorithm of ChatGPT

Language: Jupyter Notebook - Size: 3.01 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

ArianQazvini/Ai-Reinforcement_Learning

Language: Python - Size: 344 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

AlinaBaber/OpenAIGymGames-GameAgent-TemporalDiffereceLearning--ReinforcemtLearning

Language: Python - Size: 6.84 KB - Last synced at: 19 days ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

RadiumLZhang/Reinforcement-Learning-with-Flappy-Bird

Automating flappy bird using reinforcement learning.

Language: Python - Size: 11.3 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

pilarski/FrostHollowVR

Virtual reality (VR) environment for studying human-agent decision making.

Language: C# - Size: 104 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

n1ghtf4l1/improved-lamp

Implemented an Agent using Temporal Difference Learning to play TicTacToe

Language: Jupyter Notebook - Size: 35.2 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

Kapi2910/CartPoleGym

This is my implementation of Q-Learning on a cart-pole system using OpenAI Gym

Language: Python - Size: 312 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

e-candeloro/Reinforcement-Learning-Maze-Solver

A Python script that executes a RL algorithm (Temporal Difference/Q-Learning) that trains an agent inside a labyrinth to find the exit with the least number of steps possible

Language: Python - Size: 247 KB - Last synced at: about 2 months ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

ayushnoori/rl-simulation

Simulating epsilon-greedy and temporal difference reinforcement learning algorithms.

Language: R - Size: 609 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

JeffreyTsa1/rl_task

Trained an artificial intelligence agent using reinforcement learning to play a simple version of the game "Snake". Implemented a Temporal Difference version of the Q-learning Algorithm. Completed for school.

Language: Python - Size: 26.4 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

mew-two-github/CS6700-Project

Implementation of REINFORCE for the OpenAI Acrobot environment, epsilon-greedy Q-Learning for the OpenAI Taxi environment, and TD(0) for a custom game-show environment, KBC.

Language: Python - Size: 56.6 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

ThaiDat/Temporal-Difference-Learning-to-Play-2048-Pascal-Version

A simple reinforcement learning AI to play 2048 games

Language: Pascal - Size: 19.5 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

DavideDeVita/Adaptive-PacMan

This directory contains the source code of the Adaptive PacMan I designed for my Master's degree dissertation. [Videoclip Inside]

Language: Java - Size: 19.8 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

skypitcher/hats

Hyper-accelerated tree search (HATS) algorithm for solving integer least-squares problems in large-scale systems.

Language: Python - Size: 13.4 MB - Last synced at: 4 months ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

EdanToledo/Easy21RL

Attempt at the UCL 2015 David Silver Reinforcement Learning Course Assignment

Language: HTML - Size: 1.17 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

melodiCyb/MSc

MSc Course Projects

Language: Python - Size: 23.7 MB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 1

harshsiloiya98/CS747-Assignments

Assignments for CS747 - Foundations of Intelligent and Learning Agents

Language: Python - Size: 692 KB - Last synced at: 3 months ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

0cherry/Reinforcement_Learning

Language: Python - Size: 4.42 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

lephamtuyen/RL_lecture

Lecture for AgileSoda

Language: Jupyter Notebook - Size: 81 MB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

JayLohokare/taxi-v2

Implementation for OpenAI taxi-v2 (Using temporal-difference methods)

Language: Python - Size: 9.77 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1

adajed/rl2048

Reinforcement Learning agent for 2048 game

Language: C++ - Size: 30.3 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

hurshprasad/RL-easy21

Language: Python - Size: 79.1 KB - Last synced at: over 2 years ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

Related Topics
reinforcement-learning (62), q-learning (21), monte-carlo (17), artificial-intelligence (14), reinforcement-learning-algorithms (12), deep-reinforcement-learning (11), sarsa (10), python (9), machine-learning (9), policy-gradient (9), ai (9), markov-decision-processes (8), value-iteration (8), monte-carlo-methods (7), dynamic-programming (7), openai-gym (7), policy-iteration (6), game (6), pytorch (5), deep-q-network (5), dqn (5), sarsa-learning (5), neural-network (5), epsilon-greedy (4), deep-learning (4), 2048 (4), qlearning (4), actor-critic (4), rl (3), reinforcement-learning-agent (3), 2048-game (3), tic-tac-toe (3), td-learning (3), backgammon (3), multi-armed-bandits (3), minimax-algorithm (3), temporal-difference-algorithms (2), word2vec (2), dqn-pytorch (2), simulation (2), deep-neural-networks (2), minimax (2), sarsa-lambda (2), tabular-rl (2), self-play (2), montecarlomethod (2), tic-tac-toe-game (2), gym-backgammon (2), deep-q-learning (2), tensorflow (2), monte-carlo-simulation (2), neural-networks (2), easy21 (2), agent (2), grid-world (2), optimistic-inital-values (2), pygame (2), path-planning (2), python3 (2), blackjack (2), reinforcement (2), numpy (2), function-approximation (2), tree-search (2), model-based-rl (2), frozenlake (2), udacity-nanodegree (2), tictactoe (2), games (2), ucb (2), machine-learning-algorithms (2), genetic-algorithm (2), qlearning-algorithm (2), model-free-rl (2), alpha-beta-pruning (2), n-tuple-networks (2), open-ai-gym (1), snake-game (1), demo (1), tictactoe-game (1), board-game (1), boardgame (1), nine-mens-morris (1), gridworld-environment (1), typescript (1), linear-function-approximation (1), chatgpt (1), gae (1), general-advantage-estimation (1), ppo (1), rlhf (1), futurama (1), monte-carlo-control (1), fitted-q-iteration (1), off-policy-evaluation (1), offline-rl (1), order-dispatch-recommendation (1), policy-based-method (1), ridesharing (1), analysis (1)