on-policy | Topic | Ecosyste.ms: Repos

Topic: "on-policy"

MarcoMeter/episodic-transformer-memory-ppo

Clean baseline implementation of PPO using an episodic TransformerXL memory

Language: Python - Size: 23.9 MB - Last synced at: about 2 months ago - Pushed at: 12 months ago - Stars: 172 - Forks: 22

MarcoMeter/recurrent-ppo-truncated-bptt

Baseline implementation of recurrent PPO using truncated BPTT

Language: Jupyter Notebook - Size: 17.7 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 138 - Forks: 18

wisnunugroho21/reinforcement_learning_v_mpo

Deep reinforcement learning using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO)

Language: Python - Size: 15.6 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 1

wisnunugroho21/reinforcement_learning_truly_ppo

Deep reinforcement learning using Truly Proximal Policy Optimization in TensorFlow 2 and PyTorch

Language: Python - Size: 27.3 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 11 - Forks: 1

OpenRL-Lab/RL_Tutorial

Reinforcement Learning Tutorial

Size: 7.81 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

BY571/pytorch-vmpo

PyTorch implementation of V-MPO

Language: Python - Size: 35.2 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

nima-siboni/simplest-world-Actor-Critic

Reinforcement learning, Policy Gradient, Actor-Critic, AC, Agent-based Simulation, Simple-world

Language: Python - Size: 698 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 1

TheUnsolvedDev/ReinforcementLearning

Repository containing basic algorithms implemented in Python.

Language: Jupyter Notebook - Size: 121 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

amirhosein-mesbah/Reinforcement_learning

This repository contains implementations of a wide variety of reinforcement learning projects covering bandit algorithms, MDPs, distributed RL, and deep RL. They include university projects and projects undertaken out of personal interest in reinforcement learning.

Language: Jupyter Notebook - Size: 10.4 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0