Topic: "policy-optimization"
chauncygu/Multi-Agent-Constrained-Policy-Optimisation
Multi-Agent Constrained Policy Optimisation (MACPO; MAPPO-L).
Language: Python - Size: 8.48 MB - Last synced at: 19 days ago - Pushed at: about 1 year ago - Stars: 165 - Forks: 27

cxxgtxy/POP3D
Policy Optimization with Penalized Point Probability Distance: an Alternative to Proximal Policy Optimization
Language: Python - Size: 2.36 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 42 - Forks: 2

elsheikh21/car-racing-ppo
Implementation of a deep reinforcement learning algorithm, Proximal Policy Optimization (SOTA), on a continuous-action-space OpenAI Gym environment (Box2D/CarRacing-v0)
Language: Python - Size: 21.4 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 35 - Forks: 5

MahanFathi/Model-Based-RL
Model-based Policy Gradients
Language: Python - Size: 1.89 MB - Last synced at: 9 months ago - Pushed at: about 5 years ago - Stars: 29 - Forks: 4

manantomar/Mirror-Descent-Policy-Optimization
Mirror Descent Policy Optimization
Language: Python - Size: 44.9 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 25 - Forks: 3

CLAIRE-Labo/no-representation-no-trust
Codebase to fully reproduce the results of "No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO" (Moalla et al. 2024). Uses TorchRL and provides extensive tools for studying representation dynamics in policy optimization.
Language: Jupyter Notebook - Size: 8.28 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 13 - Forks: 1

sarmueller/gibo
This repository contains the code for the paper "Local policy search with Bayesian optimization".
Language: Jupyter Notebook - Size: 87 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 6

shaheennabi/Reinforcement-or-Deep-Reinforcement-Learning-Practices-and-Mini-Projects
Reinforcement Learning (RL) 🤖! This repository is your hands-on guide to implementing RL algorithms, from Markov Decision Processes (MDPs) to advanced methods like PPO and DDPG. 🚀 Build smart agents, learn the math behind policies, and experiment with real-world applications! 🔥💡
Size: 24.4 KB - Last synced at: 29 days ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

bmaxdk/OpenAI-Gym-PongDeterministic-v4-PPO
Language: Jupyter Notebook - Size: 1.77 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

grassking100/reinforcement_learning
An implementation of reinforcement learning for CartPole-v0 using policy optimization
Language: Python - Size: 6.84 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

proceduralia/randomist
Code for Policy Optimization as Online Learning with Mediator Feedback
Language: Python - Size: 31.3 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

MehdiShahbazi/REINFORCE-Cart-Pole-Gymnasium
This repo implements the REINFORCE algorithm for solving the CartPole-v1 environment of the Gymnasium library using Python 3.8 and PyTorch 2.0.1.
Language: Python - Size: 636 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 2

ConnorWatts/kpo
Policy optimization algorithm with trust regions based on the Maximum Mean Discrepancy (MMD) metric. Investigates the efficiency and effectiveness of the approach and explores the different techniques used to approximate the policy update.
Size: 1.95 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0
