Topic: "exploration-exploitation"

opendilab/DI-engine

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

Language: Python - Size: 292 MB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 3,416 - Forks: 401

wzhe06/Reco-papers

Classic papers and resources on recommendation

Language: Python - Size: 169 MB - Last synced at: 7 days ago - Pushed at: almost 5 years ago - Stars: 3,367 - Forks: 810

tigerneil/awesome-deep-rl

For deep RL and the future of AI.

Language: HTML - Size: 1.92 MB - Last synced at: 14 days ago - Pushed at: about 1 year ago - Stars: 1,459 - Forks: 218

imsheridan/DeepRec

A collection of classic and cutting-edge industry papers and resources on recommendation and advertising / Must-read Papers on Recommendation System and CTR Prediction

Size: 273 MB - Last synced at: 23 days ago - Pushed at: over 1 year ago - Stars: 1,007 - Forks: 219

david-cortes/contextualbandits

Python implementations of contextual bandits algorithms

Language: Python - Size: 9.88 MB - Last synced at: about 16 hours ago - Pushed at: 12 days ago - Stars: 780 - Forks: 147

opendilab/awesome-exploration-rl

A curated list of awesome exploration RL resources (continually updated)

Size: 2.46 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 474 - Forks: 14

TianhongDai/self-imitation-learning-pytorch

This is the PyTorch implementation of the ICML 2018 paper Self-Imitation Learning.

Language: Python - Size: 3.48 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 61 - Forks: 11

stratisMarkou/sample-efficient-bayesian-rl

Source for the sample efficient tabular RL submission to the 2019 NIPS workshop on Biological and Artificial RL

Language: Jupyter Notebook - Size: 44.8 MB - Last synced at: about 1 month ago - Pushed at: about 3 years ago - Stars: 25 - Forks: 15

holarissun/RewardShifting

Code for NeurIPS 2022 paper Exploiting Reward Shifting in Value-Based Deep RL

Language: Python - Size: 1.79 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 1

YaoYao1995/MEEE

Code to reproduce the experiments in Sample Efficient Reinforcement Learning via Model-Ensemble Exploration and Exploitation (MEEE).

Language: Python - Size: 155 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 4

gokceuludogan/interactive-music-recommendation

Personalized and Interactive Music Recommendation with Bandit approach

Language: Jupyter Notebook - Size: 3.09 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 9 - Forks: 2

kkm24132/ReinforcementLearning

Focuses on Reinforcement Learning related concepts, use cases, and learning approaches

Language: Jupyter Notebook - Size: 7.56 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 7 - Forks: 3

kakaobrain/leco

Official implementation of LECO (NeurIPS'22)

Language: Python - Size: 403 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 0

Amshra267/Thompson-Greedy-Comparison-for-MultiArmed-Bandits

Repository containing a comparison of two methods for dealing with the exploration-exploitation dilemma in multi-armed bandits

Language: Python - Size: 12.9 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 0

mbhenaff/neural-e3

Language: Python - Size: 5.36 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 1

haoyangzheng1996/ts_ulmc

The GitHub repository for "Accelerating Approximate Thompson Sampling with Underdamped Langevin Monte Carlo", AISTATS 2024.

Language: Python - Size: 56.6 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 4 - Forks: 0

baturaysaglam/DISCOVER

Deep Intrinsically Motivated Exploration in Continuous Control

Language: Python - Size: 52.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

SXV357/Inspirit-AI-Deep-Dive-Designing-DL-Systems-FinalProject-RL-for-Autonomous-Vehicles

This project uses reinforcement learning to teach an agent to drive by itself and learn from its observations so that it can maximize the reward (180+ lines)

Language: Jupyter Notebook - Size: 14.6 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 3 - Forks: 0

fabprezja/Deep-Learning-TPBook-Points

Some Key Points from the Deep Learning Tuning Playbook

Size: 3.91 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 1

hridayns/Research-Project-on-Reinforcement-learning

Research Thesis - Reinforcement Learning

Language: Python - Size: 81.1 KB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 3 - Forks: 1

guptav96/bandit-algorithms

A short implementation of bandit algorithms - ETC, UCB, MOSS and KL-UCB

Language: Python - Size: 5.56 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1

kochlisGit/Reinforcement-Learning-Algorithms

This project focuses on comparing different reinforcement learning algorithms, including Monte Carlo, Q-learning, lambda Q-learning epsilon-greedy variations, etc.

Language: Python - Size: 460 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

Ralami1859/Action-Elimination-for-Multi-Armed-Bandits

Action elimination for multi-armed bandits

Language: MATLAB - Size: 59.6 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

tyoon10/Exploration-and-Exploitation

Language: Jupyter Notebook - Size: 1.96 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 1

ivotints/Learn2Slither

A reinforcement learning project where a snake learns to navigate and survive in a dynamic environment through Q-learning.

Language: Python - Size: 17.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

hmishfaq/LMC-LSVI

The official code release for Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo, ICLR 2024.

Language: Python - Size: 32.9 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 2

kalexandriabond/competing-representations-shape-evidence-accumulation

Human and sim. behavioral / small-scale neural data for paper: https://www.biorxiv.org/content/10.1101/2022.10.03.510668v2

Language: Jupyter Notebook - Size: 9.55 MB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Sagarnandeshwar/Bandit_Algorithms

Reinforcement Learning (COMP 579) Project

Language: Jupyter Notebook - Size: 3.03 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

baturaysaglam/Q-Error-Exploration

An Optimistic Approach to the Q-Network Error in Actor-Critic Methods

Language: Python - Size: 23.3 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

rom1mouret/exploration

over-parameterization = exploration ?

Language: Python - Size: 81.1 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

Ali5tan/bandit

Solutions and walkthroughs for OverTheWire: Bandit - learn Linux command-line basics through real hacking challenges.

Language: Shell - Size: 6.84 KB - Last synced at: about 7 hours ago - Pushed at: about 8 hours ago - Stars: 0 - Forks: 0

tomooda/HiDeHo

the HiDeHo (HInts for Directing the Exploration from History of Operations) framework for Pharo

Language: Smalltalk - Size: 112 KB - Last synced at: 2 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

panxulab/LSVI-ASE

The official code release for "More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling", Reinforcement Learning Conference (RLC) 2024

Language: Python - Size: 16.4 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Alishafzd/NoisyQ

NoisyQ is a noise-based exploration method for DDQN

Language: Jupyter Notebook - Size: 1.95 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

avorozhtsov/shipit

Exploitation vs Exploration problem stated as A/B-testing with maximum profit per unit time.

Language: Mathematica - Size: 5.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

alxndrTL/RL-essais-cliniques

Size: 4.07 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Anjali001/Reinforcement-Learning

Language: Jupyter Notebook - Size: 1.05 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

siavashadpey/MultiArmedBandits

Language: Python - Size: 290 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

nsandholtz/hotspot_paper

A companion repository for 'Inverse Bayesian Optimization: Learning Human Acquisition Functions in an Exploration vs Exploitation Search Task'

Language: R - Size: 74.1 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

ruqoyyasadiq/deep_RL-multi-arm-bandit-exploration

This is an implementation of the Reinforcement Learning multi-arm-bandit experiment using different exploration techniques.

Language: Python - Size: 1.37 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

hashem20/Active-Passive-Gap-in-Exploration

Active versus Passive exploration

Language: MATLAB - Size: 21.4 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

zwkcoding/explore_map_standalone

Maintain an environmental exploration map and update it by Bayesian probability, for autonomous vehicles

Language: C++ - Size: 281 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 5