GitHub topics: off-policy-evaluation

Repositories

hanjuku-kaso/awesome-offline-rl

An index of algorithms for offline reinforcement learning (offline-rl)

Size: 293 KB - Last synced at: about 7 hours ago - Pushed at: 12 months ago - Stars: 977 - Forks: 89

st-tech/zr-obp

Open Bandit Pipeline: a Python library for bandit algorithms and off-policy evaluation

Language: Python - Size: 28.6 MB - Last synced at: 8 days ago - Pushed at: 11 months ago - Stars: 666 - Forks: 91

hakuhodo-technologies/scope-rl

SCOPE-RL: A Python library for offline reinforcement learning, off-policy evaluation, and selection

Language: Python - Size: 580 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 122 - Forks: 12

Mamba413/cope

Off-Policy Interval Estimation with Confounded Markov Decision Process

Language: Python - Size: 3.14 MB - Last synced at: 28 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 4

banditml/offline-policy-evaluation

Implementations and examples of common offline policy evaluation methods in Python.

Language: Python - Size: 1.17 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 222 - Forks: 25

callmespring/MediationRL (fork of linlinlin97/MediationRL)

Implementation of "A Reinforcement Learning Framework for Dynamic Mediation Analysis" (ICML 2023) in Python.

Language: Jupyter Notebook - Size: 13.1 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

callmespring/RL-short-course

Reinforcement Learning Short Course

Language: Jupyter Notebook - Size: 95.6 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 53 - Forks: 18

Mamba413/ROOM

Robust Offline Reinforcement Learning with Heavy-Tailed Rewards

Language: Python - Size: 958 KB - Last synced at: 28 days ago - Pushed at: 9 months ago - Stars: 4 - Forks: 1

airboxlab/hopes

HOPES: HVAC optimization with Off-Policy Evaluation and Selection

Language: Python - Size: 3.9 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

joshuaspear/offline_rl_ope

Stateful implementations of OPE algorithms, designed for use in the development of offline RL models

Language: Python - Size: 205 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

CausalML/bcrl

Representation Learning for OPE

Language: Python - Size: 14.6 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 7 - Forks: 0

callmespring/cope (fork of Mamba413/cope)

Implementation of "Off-Policy Interval Estimation with Confounded Markov Decision Process" (JASA, 2022+)

Language: Python - Size: 3.14 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

MLD3/CounterfactualAnnot-SemiOPE

[NeurIPS 2023] Counterfactual-Augmented Importance Sampling for Semi-Offline Policy Evaluation. https://arxiv.org/abs/2310.17146

Language: Jupyter Notebook - Size: 744 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

aiueola/wsdm2022-cascade-dr

(WSDM2022 Best Paper Award Runner-Up) "Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model"

Language: Python - Size: 1.48 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 0

callmespring/COPP (fork of yyzhangecnu/COPP)

Conformal Off-policy Prediction

Language: R - Size: 23 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

callmespring/DJL (fork of HengruiCai/DJL)

Implementation of Deep Jump Learning for Off-Policy Evaluation in Continuous Treatment Settings (NeurIPS, 2021) in Python

Language: Python - Size: 1.36 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

dtak/osiris

Omitting-States-Irrelevant-to-Return Importance Sampling estimator for off-policy evaluation

Language: Python - Size: 29.3 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

callmespring/D2OPE (fork of RunzheStat/D2OPE)

Implementation of "Deeply-Debiased Off-Policy Interval Estimation" (ICML, 2021) in Python

Language: Python - Size: 5 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 0

callmespring/Confounded-POMDP-OPE (fork of jiaweihhuang/Confounded-POMDP-Exp)

Implementation of "A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes" (ICML)

Language: Python - Size: 1.09 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

Related Keywords

off-policy-evaluation (19), reinforcement-learning (11), research (4), offline-rl (3), importance-sampling (2), mediation-analysis (2), unmeasured-confounding (2), offline-reinforcement-learning (2), confidence-intervals (2), rl (1), offlinerl (1), robust-statistics (1), heavy-tailed-distributions (1), value-iteration (1), temporal-differencing-learning (1), ridesharing (1), q-learning (1), policy-iteration (1), deep-learning (1), machine-learning (1), counterfactual-reasoning (1), neurips-2023 (1), ranking (1), recommender-system (1), conformal-prediction (1), change-point-detection (1), continuous-action-space (1), partially-observable-environment (1), awesome (1), awesome-list (1), contextual-bandits (1), datasets (1), multi-armed-bandits (1), risk-assessment (1), causal-inference (1), statistical-inference (1), counterfactual-learning (1), counterfactual-policy-evaluation (1), doubly-robust (1), offline-policy-evaluation (1), deep-q-network (1), dynamic-programming (1), fitted-q-iteration (1), markov-decision-processes (1), model-based-rl (1), monte-carlo-methods (1), order-dispatch-recommendation (1), policy-based-method (1), policy-gradient (1)
