An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: offline-reinforcement-learning

Brezy024/Mind-the-Gap

Mind the Gap aims to enhance Chain of Thought (CoT) tuning for better AI performance. Join us in exploring innovative solutions and contributing to the project! 🙏🌟

Language: Python - Size: 11 MB - Last synced at: about 8 hours ago - Pushed at: about 9 hours ago - Stars: 1 - Forks: 0

nico-espinosadice/SORL

Scaling Offline RL via Efficient and Expressive Shortcut Models

Language: Python - Size: 22.5 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

kaushikd24/Decision-Transformer-Deployment-Stack

Reinforcement Learning is often complex, but Decision Transformers frame offline RL as a sequence modeling problem. This repo provides the complete stack -- from development to deployment of Decision Transformers.

Language: Python - Size: 1000 Bytes - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

instadeepai/og-marl

Datasets with baselines for offline multi-agent reinforcement learning.

Language: Python - Size: 54.7 MB - Last synced at: 14 days ago - Pushed at: 28 days ago - Stars: 169 - Forks: 14

ikostrikov/jaxrl

JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.

Language: Jupyter Notebook - Size: 4.17 MB - Last synced at: 14 days ago - Pushed at: over 2 years ago - Stars: 683 - Forks: 72

pi-optimal/pi-optimal

An open-source Python library for Reinforcement Learning (RL), designed to model, optimize, and control dynamic systems.

Language: Jupyter Notebook - Size: 3.62 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 38 - Forks: 1

tinkoff-ai/CORL 📦

High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC

Language: Python - Size: 3.33 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 1,196 - Forks: 147

Cryolite/kanachan

A Japanese (Riichi) Mahjong AI Framework

Language: Python - Size: 996 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 308 - Forks: 40

EmptyJackson/unifloral

Unified Implementations of Offline Reinforcement Learning Algorithms

Language: Python - Size: 47.9 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 58 - Forks: 3

ZhengYinan-AIR/FISOR

[ICLR 2024] The official implementation of "Safe Offline Reinforcement Learning with Feasibility-Guided Diffusion Model"

Language: Python - Size: 13.1 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 97 - Forks: 7

tomWitkowski/recurrence-mimicking-learning

Hyper-efficient Offline Recurrent Reinforcement Learning Algorithm. It solves decision paths of any length without sequential processing. Implemented for Sharpe Ratio optimization as a base problem.

Language: Python - Size: 28.3 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

nikhilbarhate99/min-decision-transformer

Minimal implementation of Decision Transformer: Reinforcement Learning via Sequence Modeling in PyTorch for mujoco control tasks in OpenAI gym

Language: Python - Size: 21.1 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 267 - Forks: 26

ZhengyaoJiang/latentplan

Code release for Efficient Planning in a Compact Latent Action Space (ICLR2023) https://arxiv.org/abs/2208.10291.

Language: Python - Size: 659 KB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 105 - Forks: 12

msramada/data-conforming-control

This code can be used to reproduce the results in our paper "Data-conforming data-driven control: avoiding premature generalizations beyond data".

Language: Jupyter Notebook - Size: 40.3 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

xiaobaobaochifan/NAC

The official repository for Net Actor-Critic

Language: Python - Size: 6.91 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

Manchery/iql-pytorch

Unofficial PyTorch implementation (replicating paper results) of Implicit Q-Learning (In-sample Q-Learning) for offline RL

Language: Python - Size: 261 KB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 23 - Forks: 1

data-intelligence-for-health-lab/RL4CAD

RL4CAD: Personalized Decision Making for Coronary Artery Disease Treatment using Offline Reinforcement Learning

Language: Python - Size: 113 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 1

kwanyoungpark/LEQ

Code for Tackling Long-Horizon Tasks with Model-based Offline Reinforcement Learning

Language: Python - Size: 20.1 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 8 - Forks: 0

Allenpandas/2020-Reinforcement-Learning-Conferences-Papers 📦

Proceedings of top conferences in 2020 on the topic of Reinforcement Learning (RL), including: AAAI, IJCAI, NeurIPS, ICML, ICLR, ICRA, AAMAS and more.

Size: 94.7 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

enjeeneer/zero-shot-rl

VC-FB and MC-FB algorithms from "Zero-Shot Reinforcement Learning from Low Quality Data" (NeurIPS 2024)

Language: Python - Size: 2.45 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 12 - Forks: 1

Udit176/OfflineRL

This repository contains code for my ECE595RL term project (Fall 2024) at Purdue University.

Language: Jupyter Notebook - Size: 2.41 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

navarog/cross-validated-ope

The source code to Cross-Validated Off-Policy Evaluation

Language: Python - Size: 2.23 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

polixir/NeoRL

Python interface for accessing the near real-world offline reinforcement learning (NeoRL) benchmark datasets

Language: Python - Size: 21.1 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 109 - Forks: 12

zaiyan-x/RFQI

Implementation of Robust Reinforcement Learning using Offline Data [NeurIPS'22]

Language: Python - Size: 26.6 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 22 - Forks: 3

axelbr/offline-mmd

Implementation of Offline Munchausen Mirror Descent.

Language: Python - Size: 559 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

RLE-Foundation/rllte-hub

Large-Scale and Comprehensive Data Hub for Reinforcement Learning

Language: Python - Size: 466 KB - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 2

Mamba413/ROOM

Robust Offline Reinforcement Learning with Heavy-Tailed Rewards

Language: Python - Size: 958 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 4 - Forks: 1

snu-mllab/DPPO

Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023)

Language: Python - Size: 26.5 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 35 - Forks: 1

ganjiro/OfflineMania

Official repository of "OfflineMania: A Benchmark Environment for Offline Reinforcement Learning in Racing Games"

Language: ASP.NET - Size: 140 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

LoopMind-AI/loopquest

A Production Tool for Embodied AI

Language: Python - Size: 2.89 MB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 28 - Forks: 0

AdamJelley/EfficientOfflineRL

Codebase for the paper "Efficient Offline Reinforcement Learning: The Critic is Critical"

Language: Python - Size: 4.75 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

elated-sawyer/RL-in-Federated-Setting

Summarising the research of Offline RL in Federated Setting.

Size: 16.6 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

Allenpandas/Reinforcement-Learning-Papers

📚 List of Top-tier Conference Papers on Reinforcement Learning (RL), including: NeurIPS, ICML, AAAI, IJCAI, AAMAS, ICLR, ICRA, etc.

Size: 608 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 246 - Forks: 30

ryanxhr/CQL

Implementation of CQL in "Conservative Q-Learning for Offline Reinforcement Learning" based on BRAC family.

Language: Python - Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 7 - Forks: 0

ryanxhr/Fisher_BRC

Implementation of Fisher_BRC in "Offline Reinforcement Learning with Fisher Divergence Critic Regularization" based on BRAC family.

Language: Python - Size: 4.88 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 1

yihaosun1124/OfflineRL-Kit

An elegant PyTorch offline reinforcement learning library for researchers.

Language: Python - Size: 398 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 221 - Forks: 28

ryanxhr/DeepThermal

[AAAI 2022] The official implementation of "DeepThermal: Combustion Optimization for Thermal Power Generating Units Using Offline Reinforcement Learning"

Language: Python - Size: 47.9 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 10 - Forks: 2

HYDesmondLiu/B2RL

The First Open-Sourced Building Batch Reinforcement Learning Dataset

Size: 914 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 6 - Forks: 1

ryanxhr/POR

[NeurIPS 2022 Oral] The official implementation of POR in "A Policy-Guided Imitation Approach for Offline Reinforcement Learning"

Language: Python - Size: 4.09 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 53 - Forks: 6

BY571/CQL

PyTorch implementation of the Offline Reinforcement Learning algorithm CQL. Includes the versions DQN-CQL and SAC-CQL for discrete and continuous action spaces.

Language: Python - Size: 28.6 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 99 - Forks: 20

ryanxhr/DWBC

[ICML 2022] The official implementation of DWBC in "Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations"

Language: Python - Size: 29.3 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 31 - Forks: 2

snu-mllab/EDAC

Official PyTorch implementation of "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble" (NeurIPS'21)

Language: Python - Size: 123 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 64 - Forks: 2

czp16/cde-offline-rl

Codes for "Learning from Sparse Offline Datasets via Conservative Density Estimation"

Language: Python - Size: 45.9 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ZhengYinan-AIR/OMIGA

[NeurIPS 2023] The official implementation of "Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization"

Language: Python - Size: 74.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 18 - Forks: 3

koulanurag/opcc

Benchmark for "Offline Policy Comparison with Confidence"

Language: Python - Size: 70.7 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

aryandeshwal/pytorch_coms

Pytorch based reimplementation of COMS: Conservative Objective Models for Effective Offline Model-Based Optimization.

Language: Python - Size: 8.79 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

yudasong/HyQ

Official code repo for paper: Hybrid RL: Using both offline and online data can make RL efficient.

Language: Python - Size: 26.3 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 20 - Forks: 3

ryanxhr/CPQ

[AAAI 2022] The official implementation of CPQ in "Constraints Penalized Q-learning for Safe Offline Reinforcement Learning"

Language: Python - Size: 49.8 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 2

Allenpandas/2019-Reinforcement-Learning-Conferences-Papers 📦

Proceedings of top conferences in 2019 on the topic of Reinforcement Learning (RL), including: AAAI, IJCAI, NeurIPS, ICML, ICLR, ICRA, AAMAS and more.

Size: 55.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Allenpandas/2018-Reinforcement-Learning-Conferences-Papers 📦

Proceedings of top conferences in 2018 on the topic of Reinforcement Learning (RL), including: AAAI, IJCAI, NeurIPS, ICML, ICLR, ICRA, AAMAS and more.

Size: 43.9 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

xionghuichen/MAPLE

The Official Code for Offline Model-based Adaptable Policy Learning (NeurIPS'21 & TPAMI)

Language: Python - Size: 5.31 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 5

holarissun/RewardShifting

Code for NeurIPS 2022 paper Exploiting Reward Shifting in Value-Based Deep RL

Language: Python - Size: 1.79 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 1

DHDev0/Stochastic-muzero

Pytorch Implementation of Stochastic MuZero for gym environment. This algorithm is capable of supporting a wide range of action and observation spaces, including both discrete and continuous variations.

Language: Python - Size: 12.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 37 - Forks: 3

shivakanthsujit/seq_eval

Code for Sequential Evaluation of Offline RL paper

Language: Jupyter Notebook - Size: 6.67 MB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

polixir/OfflineRL

A collection of offline reinforcement learning algorithms. This is a mirror repo from https://agit.ai/Polixir/OfflineRL

Language: Python - Size: 237 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 124 - Forks: 16

MLD3/CounterfactualAnnot-SemiOPE

[NeurIPS 2023] Counterfactual-Augmented Importance Sampling for Semi-Offline Policy Evaluation. https://arxiv.org/abs/2310.17146

Language: Jupyter Notebook - Size: 744 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

DesikRengarajan/FEDORA

[FL-ICML 2023] Code for Federated Ensemble-Directed Offline Reinforcement Learning

Language: Python - Size: 24.4 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 9 - Forks: 2

ryanxhr/BEAR

Pytorch implementation of BEAR in "Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction"

Language: Python - Size: 104 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 10 - Forks: 1

tinkoff-ai/ReBRAC 📦

Author's implementation of ReBRAC, a minimalist improvement upon TD3+BC

Language: Jupyter Notebook - Size: 1.91 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 34 - Forks: 1

LanqingLi1993/FOCAL-ICLR

Code for FOCAL Paper Published at ICLR 2021

Language: Python - Size: 2.41 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 42 - Forks: 10

ZJLAB-AMMI/HS-OMRL

Python code to implement hard sampling based task representation learning for robust offline meta RL

Language: Python - Size: 125 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

Sagarnandeshwar/Offline_Reinforcement_Learning

Reinforcement Learning (COMP 579) Project

Language: Jupyter Notebook - Size: 1.63 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

sail-sg/rosmo

Code for "Efficient Offline Policy Optimization with a Learned Model", ICLR2023

Language: Python - Size: 74.2 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 19 - Forks: 0

tinkoff-ai/sac-rnd

Official implementation for "Anti-Exploration by Random Network Distillation", ICML 2023

Language: Python - Size: 22.5 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 23 - Forks: 1

tinkoff-ai/lb-sac

Official implementation for "Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size", NeurIPS 2022, Offline RL Workshop

Language: Python - Size: 905 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 1

YangRui2015/AWGCSL

Code for ICLR 2022 paper Rethinking Goal-Conditioned Supervised Learning and Its Connection to Offline RL.

Language: Python - Size: 180 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 23 - Forks: 0

Howuhh/sac-n-jax

Single-file SAC-N implementation on jax with flax and equinox. 10x faster than pytorch

Language: Python - Size: 531 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 36 - Forks: 1

tinkoff-ai/cnf

Official implementation for "Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows", NeurIPS 2022, Offline RL Workshop

Language: Python - Size: 1.14 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 0

danilovsnnv/TLab2023-RL-task

Implementation and comparison of offline RL algorithms

Size: 1000 Bytes - Last synced at: 7 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

kschweig/OfflineRL

Experiment for Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning

Language: Jupyter Notebook - Size: 3.28 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 16 - Forks: 5

Facebear-ljx/RGM

The official implementation of "Mind the Gap: Offline Policy Optimization for Imperfect Rewards" (ICLR2023)

Language: Python - Size: 3.75 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 1

omrijsharon/fpv_data_box

The easiest way to copy your flight log files and videos from racing drones and goggles DVR.

Language: Python - Size: 52.7 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

alexchen-buaa/flexrl

Non-modular implementation of common RL algorithms

Language: Python - Size: 120 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

resuldagdanov/offline-rl-minigrid-env

Implementation of Offline Reinforcement Learning in Gym Mini-Grid Environment :key:

Language: Python - Size: 112 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

Mohan-Zhang-u/smpl

Language: Jupyter Notebook - Size: 5.71 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 2

chenci107/adaptive_bc

Fork of zhaoyi11/adaptive_bc

Size: 13.7 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

YangRui2015/RORL

Code for NeurIPS 2022 paper "Robust offline Reinforcement Learning via Conservative Smoothing"

Language: Python - Size: 144 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 2

vballoli/offlax

Offline Reinforcement Learning Framework in JAX

Language: Python - Size: 36.1 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

dsshim0125/s2p

"S2P: State-conditioned Image Synthesis for Data Augmentation in Offline Reinforcement Learning" (NeurIPS 2022)

Language: Python - Size: 33.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

ReinholdM/Papers-of-Offline-RL

Related papers for offline reinforcement learning (we mainly focus on representation and sequence modeling and conventional offline RL)

Size: 24.4 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 12 - Forks: 3

WeiChengTseng/CQL-pytorch

Size: 1.95 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 0

Related Keywords
offline-reinforcement-learning (81), reinforcement-learning (52), pytorch (15), deep-reinforcement-learning (13), machine-learning (8), deep-learning (7), multi-agent-reinforcement-learning (6), reinforcement-learning-algorithms (6), jax (6), transformer (5), imitation-learning (5), pytorch-implementation (5), d4rl (5), meta-reinforcement-learning (5), ijcai (4), gym (4), icml (4), aaai (4), neurips (4), dqn (4), model-based-reinforcement-learning (4), q-learning (4), offline-rl (4), tensorflow (4), flax (3), behavioral-cloning (3), safe-reinforcement-learning (3), inverse-reinforcement-learning (3), mujoco (3), atari (2), datasets (2), reinforcement-learning-papers (2), robotics (2), generative-model (2), off-policy-evaluation (2), rl (2), deep-q-network (2), reinforcement-learning-agent (2), muzero (2), game-ai (2), imperfect-reward-function (2), multiagent-reinforcement-learning (2), icra (2), iclr (2), sac (2), federated-learning (2), implicit-q-learning (2), rl-papers (2), aamas (2), ensemble-learning (2), reward-engineering (1), business-rl (1), reward-shaping (1), rnd (1), value-based-methods (1), arxiv-papers (1), gym-environments (1), flax-implementation (1), hvac-control (1), open-source (1), optimal-control (1), discrete-sac (1), confidence-estimation (1), offline-policy-comparison (1), policy-evaluation (1), uncertainty-estimation (1), black-box-optimization (1), offline-optimization (1), offline-optimization-problem (1), surrogate-models (1), hybrid-reinforcement-learing (1), reinforcement-learning-theory (1), constrained-reinforcement-learning (1), nips (1), reinforcement-learning-tutorials (1), paper (1), dqn-rnd (1), ensemble (1), ensemble-rl (1), exploration-exploitation (1), reward-design (1), jax-implementation (1), random-network-distillation (1), hindsight-experience-replay (1), equinox (1), normalizing-flows (1), keras (1), dataset-generation (1), betaflight (1), betaflight-blackbox (1), raspberry-pi (1), in-sample (1), iql (1), graph-algorithms (1), minigrid (1), benchmarks (1), bioinformatics (1), biology (1), chemistry (1), control-theory (1)