GitHub topics: trust-region-policy-optimization

Repositories

niupuhua1234/GFN-PG

Code for the ICML 2024 paper 'GFlowNet Training by Policy Gradients'

Language: Python - Size: 5.35 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

ikostrikov/pytorch-trpo

PyTorch implementation of Trust Region Policy Optimization

Language: Python - Size: 9.77 KB - Last synced at: 16 days ago - Pushed at: over 6 years ago - Stars: 440 - Forks: 90

legalaspro/rl-odyssey

RL-Odyssey is a research framework for continuous control that implements state-of-the-art RL algorithms (SAC, TD3, PPO, etc.) with clean experiment scripts and interactive notebooks.

Language: Jupyter Notebook - Size: 66 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

hcnoh/rl-collection-pytorch

A collection of Reinforcement Learning implementations with PyTorch

Language: Python - Size: 5.84 MB - Last synced at: 22 days ago - Pushed at: about 3 years ago - Stars: 20 - Forks: 1

pompetzki/nes-npg

Benchmarking the Natural Gradient in Policy Gradient Methods and Evolution Strategies

Language: Python - Size: 12.9 MB - Last synced at: 2 months ago - Pushed at: about 4 years ago - Stars: 11 - Forks: 0

TianhongDai/reinforcement-learning-algorithms

This repository contains PyTorch implementations of most classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, SAC, A2C, PPO, and TRPO. (More algorithms are in progress.)

Language: Python - Size: 3.94 MB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 666 - Forks: 110

nslyubaykin/trpo_schedule_kl

Scheduling TRPO's KL Divergence Constraint

Language: Jupyter Notebook - Size: 212 KB - Last synced at: 9 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

funnydman/BFGS-NelderMead-TrustRegion

Python implementation of some numerical (optimization) methods

Language: Python - Size: 16.6 KB - Last synced at: 18 days ago - Pushed at: about 4 years ago - Stars: 30 - Forks: 3

waynemystir/deep-RL-bootcamp

My solutions to the labs from this bootcamp:

Language: Jupyter Notebook - Size: 110 MB - Last synced at: about 1 year ago - Pushed at: about 6 years ago - Stars: 3 - Forks: 0

GioStamoulos/BTC_RL_Trading_Bot

A Bitcoin trading agent built with deep reinforcement learning.

Language: Jupyter Notebook - Size: 53.9 MB - Last synced at: 12 months ago - Pushed at: about 3 years ago - Stars: 27 - Forks: 6

LihangLiu/CS395T-Numerical-Optimization

Course projects of CS395T Numerical Optimization, UT Austin

Language: Python - Size: 21.8 MB - Last synced at: about 2 months ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 2

Akella17/Deep-Bayesian-Quadrature-Policy-Optimization

Official implementation of the AAAI 2021 paper Deep Bayesian Quadrature Policy Optimization.

Language: Python - Size: 3.44 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 13 - Forks: 7

kparnis3/Final-Year-Project

Undergraduate Dissertation (University of Malta) 2020-2023 - 'Autonomous Drone Control using Reinforcement Learning'

Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

YixiongRen/Dynamics

Work on solving nonlinear dynamic systems

Language: Matlab - Size: 308 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 4 - Forks: 2

MahanFathi/TRPO-TensorFlow

Trust Region Policy Optimization (TRPO) in pure TensorFlow

Language: Python - Size: 55.7 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 17 - Forks: 8

RLOpensource/spinning_up_kr

Language: Python - Size: 1.95 MB - Last synced at: 6 days ago - Pushed at: about 6 years ago - Stars: 6 - Forks: 3

Related Keywords

trust-region-policy-optimization (16), reinforcement-learning (9), deep-reinforcement-learning (8), trpo (8), policy-gradient (7), proximal-policy-optimization (6), deep-learning (6), continuous-control (5), pytorch (5), ppo (4), actor-critic (4), sac (3), soft-actor-critic (3), ddpg (3), reinforcement-learning-algorithms (3), mujoco (3), natural-policy-gradient (3), td3 (2), optimization (2), robotics (2), monte-carlo (1), gaussian-processes (1), probablistic-numerics (1), roboschool (1), airsim (1), deep-q-learning (1), double-deep-q-learning (1), drone (1), bayesian-quadrature (1), advantage-actor-critic (1), trading-bot (1), time-series-analysis (1), stable-baselines (1), multilayer-perceptron-network (1), lstm-neural-networks (1), gym-environment (1), cryptocurrency (1), spinningup (1), ppo2 (1), ou-noise (1), deep-deterministic-policy-gradient (1), tensorflow (1), vibration (1), turbine (1), nonlinear (1), newton-raphson (1), micro-slip (1), macro-slip (1), frequency-domain (1), dynamic (1), derivate (1), blade (1), aft (1), obstacle-detection (1), obstacle-avoidance (1), flappy-bird (1), dueling-dqn (1), dqn (1), atari2600 (1), algorithm (1), a2c (1), quanser-robots (1), natural-evolution-strategies (1), benchmarking (1), openai-gym (1), generalized-advantage-estimation (1), gae (1), gymnasium (1), dm-control (1), a3c (1), reinfrocement-learning (1), gflownet (1), actor-critic-algorithm (1), q-learning (1), trust-region-dogleg-algorithm (1), trust-region (1), python (1), numerical-optimization (1), numerical-methods (1), nelder-mead (1), mathematics (1), machine-learning-algorithms (1), machine-learning (1), dogleg-method (1), dogleg-algorithm (1), bfgs (1), ai (1), scheduling (1), kl-divergence (1)
