An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: bandit-algorithms

SMPyBandits/SMPyBandits

🔬 Research Framework for Single- and Multi-Player 🎰 Multi-Armed Bandit (MAB) Algorithms, implementing all the state-of-the-art algorithms for single-player (UCB, KL-UCB, Thompson...) and multi-player (MusicalChair, MEGA, rhoRand, MCTop/RandTopM, etc.). Available on PyPI: https://pypi.org/project/SMPyBandits/ and documentation on

Language: Jupyter Notebook - Size: 392 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 405 - Forks: 59

sshkhr/Practical_RL

My solutions to the Yandex Practical Reinforcement Learning course in PyTorch and TensorFlow

Language: Jupyter Notebook - Size: 9.91 MB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 53 - Forks: 25

Vinit-4689/Multi-Armed-Bandit

Efficient exploration and exploitation strategies using Epsilon-Greedy, UCB1, and Thompson Sampling — with code, math, and intuition.

Language: Python - Size: 14.6 KB - Last synced at: 18 days ago - Pushed at: 28 days ago - Stars: 1 - Forks: 0

WilliamLwj/PyXAB

PyXAB - A Python Library for X-Armed Bandit and Online Blackbox Optimization Algorithms

Language: Python - Size: 13.8 MB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 126 - Forks: 30

rssalessio/reading-list

This is a collection of interesting papers that I have read so far or want to read. Note that the list is not up-to-date. Topics: reinforcement learning, deep learning, mathematics, statistics, bandit algorithms, optimization.

Size: 82 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 11 - Forks: 0

c-bata/goptuna

A hyperparameter optimization framework, inspired by Optuna.

Language: Go - Size: 15.4 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 266 - Forks: 23

ZIYU-DEEP/Awesome-Papers-on-Combinatorial-Semi-Bandit-Problems

A curated list of papers on combinatorial multi-armed bandit problems.

Size: 40 KB - Last synced at: 4 days ago - Pushed at: about 4 years ago - Stars: 17 - Forks: 0

albertopirillo/ola-project-2023

Pricing and advertising strategy for the e-commerce of an airline company, based on Multi-Armed Bandit (MAB) algorithms and Gaussian Processes. Simulations include non-stationary environments.

Language: Python - Size: 20.5 MB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

doerlbh/BanditZoo

Python library of bandits and RL agents in different real-world environments

Language: Python - Size: 205 KB - Last synced at: 28 days ago - Pushed at: about 3 years ago - Stars: 6 - Forks: 4

dkimpara/Bandit_OCO

Extending Agarwal, Dekel, and Xiao (2010) to the online convex optimization setting with experiments.

Language: Jupyter Notebook - Size: 2.32 MB - Last synced at: 29 days ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

GjjvdBurg/ThompsonSampling

Source code for blog post on Thompson Sampling

Language: JavaScript - Size: 18.6 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 1

BorealisAI/raps

Code for the paper "Causal Bandits without Graph Learning"

Language: Jupyter Notebook - Size: 571 KB - Last synced at: 28 days ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

singhsidhukuldeep/contextual-bandits

A comprehensive Python library implementing a variety of contextual and non-contextual multi-armed bandit algorithms—including LinUCB, Epsilon-Greedy, Upper Confidence Bound (UCB), Thompson Sampling, KernelUCB, NeuralLinearBandit, and DecisionTreeBandit—designed for reinforcement learning applications

Language: Python - Size: 88.9 KB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 6 - Forks: 0

DURUII/Replica-AUCB

🐯REPLICA of "Auction-based combinatorial multi-armed bandit mechanisms with strategic arms"

Language: Python - Size: 3.83 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 0

Naereen/KullbackLeibler.jl

💫 Fast Julia implementation of various Kullback-Leibler divergences for 1D parametric distributions. 🏋 Also provides optimized code for kl-UCB indexes

Language: Julia - Size: 56.6 KB - Last synced at: about 1 month ago - Pushed at: almost 7 years ago - Stars: 4 - Forks: 1

ZiruiYan/awesome-causal-bandit

A list of papers on causal bandits

Size: 10.7 KB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 7 - Forks: 0

JurajZelman/multi-armed-bandits

Several multi-armed bandit strategies with an additional holding option for smoother exploration.

Language: Jupyter Notebook - Size: 297 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

park-jihoo/RL_TIL

Today I Learned - Reinforcement Learning

Language: Python - Size: 32.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

4dnaanM/bandits

Language: Jupyter Notebook - Size: 394 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

MIFA-Lab/LDPbandit2020

Implementation for NeurIPS 2020 paper "Locally Differentially Private (Contextual) Bandits Learning" (https://arxiv.org/abs/2006.00701)

Language: Python - Size: 92.8 KB - Last synced at: 5 months ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 1

HridayM25/ReinforcementLearning

Some Reinforcement Learning algorithms implemented by me, in accordance with "Introduction to Reinforcement Learning" by Richard Sutton and Andrew Barto.

Language: Jupyter Notebook - Size: 538 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 2 - Forks: 0

shashankp914/Over-the-wire-wargames-Solutions

Detailed solutions to the OverTheWire wargames, currently covering Bandit, with many more planned in the future.

Size: 39.1 KB - Last synced at: 9 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

szrlee/GPT-HyperAgent

The official code repo for HyperAgent for neural bandits and GPT-HyperAgent for content moderation.

Language: Python - Size: 34.2 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

jajajang/LowPopArt

2024 ICML Official code

Language: C - Size: 44.8 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

duoan/OpenMultiarmedBandits

An open source multi-armed bandit framework for optimizing your website quickly. You'll quickly benefit from several simple algorithms, including the epsilon-Greedy, Softmax, and Upper Confidence Bound (UCB) algorithms, by working through this framework written in Java, which you can easily adapt for deployment on your own website.

Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

EnkiDoctor/Neural_bandit

This is a repo for the research proposal of Du Junye

Language: Jupyter Notebook - Size: 129 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Rajarshi1001/CS780

This repository contains code for the course CS780: Deep Reinforcement Learning

Language: Jupyter Notebook - Size: 167 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

doerlbh/MiniVox

Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".

Language: Cuda - Size: 998 MB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 25 - Forks: 5

jayrcausal/Essential3CRL

Research on Causality-based Reinforcement Learning. This repository includes the needed fundamentals, a summary of past work, and some of the most recent developments

Language: Jupyter Notebook - Size: 63.1 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

mknbv/zamburak

Bandit algorithms in OCaml

Language: OCaml - Size: 479 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 0

aayushmanghosh/RL-Algorithms-for-iBMI-Applications

Official repository for Reinforcement Learning Decoders used for intra-cortical brain machine interfaces - IEEE TNNLS 2023

Language: MATLAB - Size: 197 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

luke-davidson/ReinforcementLearning

Programming assignments completed for my Reinforcement Learning course: Topics include Bandit Algorithms, Dynamic Programming, policy iteration, Monte-Carlo methods, SARSA, Q-Learning, Dyna-Q/Dyna-Q+, gradient control methods, state aggregation methods, and Deep Q-Learning Networks (DQNs).

Language: Jupyter Notebook - Size: 26.6 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

hvishal512/CS6700-Reinforcement-Learning

Artificial Intelligence series

Language: Jupyter Notebook - Size: 5.04 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 16 - Forks: 4

gokceuludogan/interactive-music-recommendation

Personalized and Interactive Music Recommendation with Bandit approach

Language: Jupyter Notebook - Size: 3.09 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 9 - Forks: 2

siavashadpey/MultiArmedBandits

Language: Python - Size: 290 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Naereen/Kullback-Leibler-divergences-and-kl-UCB-indexes

🐍 🔬 Fast Python implementation of various Kullback-Leibler divergences for 1D and 2D parametric distributions. Also provides optimized code for kl-UCB indexes

Language: HTML - Size: 136 KB - Last synced at: about 1 month ago - Pushed at: almost 7 years ago - Stars: 9 - Forks: 7

TheUnsolvedDev/ReinforcementLearning

Repository containing basic algorithms applied in Python.

Language: Jupyter Notebook - Size: 121 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

jia-yi-chen/Bandit-and-Reinforcement-Learning

Python implementation for Reinforcement Learning algorithms -- Bandit algorithms, MDP, Dynamic Programming (value/policy iteration), Model-free Control (off-policy Monte Carlo, Q-learning)

Language: Python - Size: 31.3 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

junjiedong/warfarin-bandit

Contextual Bandit algorithms for Warfarin Treatment

Language: Jupyter Notebook - Size: 1.67 MB - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 1

rafaol/no-regret-approximate-inference-via-bo

Code repository for the paper No-Regret Approximate Inference via Bayesian Optimisation, published at UAI 2021

Language: Jupyter Notebook - Size: 6.35 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

rasros/combo

Language: Kotlin - Size: 13.5 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

niazangels/bandits

An introduction to multi-armed bandits

Language: Jupyter Notebook - Size: 2.46 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

Sagarnandeshwar/Bandit_Algorithms

Reinforcement Learning (COMP 579) Project

Language: Jupyter Notebook - Size: 3.03 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

niffler92/Bandit

Bandit algorithms

Language: Python - Size: 300 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 29 - Forks: 6

anselmeamekoe/Graphs_in_ML_MVA

Language: Jupyter Notebook - Size: 2.99 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

babaniyi/Deep-contextual-bandits

A benchmark to test decision-making algorithms for contextual bandits. The library implements a variety of algorithms (many of them based on approximate Bayesian Neural Networks and Thompson sampling), and a number of real and synthetic data problems exhibiting a diverse set of properties.

Language: Python - Size: 58.6 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 1

amirbalef/PS_MOMAB

Multi-Objective Multi-Armed Bandit

Language: Python - Size: 608 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

ruqoyyasadiq/deep_RL-multi-arm-bandit-exploration

This is an implementation of the Reinforcement Learning multi-arm-bandit experiment using different exploration techniques.

Language: Python - Size: 1.37 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

rojagtap/value-function-methods

Implementation of greedy, ε-greedy and softmax methods for the n-armed bandit problem

Language: Jupyter Notebook - Size: 59.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

guptav96/bandit-algorithms

A short implementation of bandit algorithms - ETC, UCB, MOSS and KL-UCB

Language: Python - Size: 5.56 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1

alif898/Yelp-Recommender-System

Proof of concept for a recommender system for Yelp, using bandit algorithms.

Language: Jupyter Notebook - Size: 40.3 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

rmitsuboshi/bandit

A small collection of Bandit algorithms, written in Rust 🦀.

Language: Rust - Size: 1.72 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

kulinshah98/Multi-Armed-Bandit-Algorithms

Python implementation of UCB, EXP3 and Epsilon greedy algorithms

Language: Python - Size: 760 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 24 - Forks: 9

nicoleorzan/Multi-armed-bandit-RL

C++ implementation of Multi-Armed Bandits (Gaussian and Bernoulli)

Language: C++ - Size: 73.2 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 4 - Forks: 2

sparsh-ai/reco-bandit

Building recommender systems using contextual bandit methods to address the cold-start issue and online real-time learning

Language: Jupyter Notebook - Size: 7.32 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 1

rssalessio/DPE

DPE code - Code used in "Optimal Algorithms for Multiplayer Multi-Armed Bandits" (AISTATS 2020)

Language: Python - Size: 95.7 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 2 - Forks: 2

gdmarmerola/interactive-intro-rl Fork of bigdatabr/interactive-intro-rl

Big Data's open seminars: An Interactive Introduction to Reinforcement Learning

Language: Jupyter Notebook - Size: 7.71 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 48 - Forks: 19

gdmarmerola/advanced-bandit-problems

More about the exploration-exploitation tradeoff with harder bandits

Language: Jupyter Notebook - Size: 2.82 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 21 - Forks: 9

fouratifares/RGL

Randomized Greedy Learning Under Full-bandit Feedback

Language: Python - Size: 150 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

amirhosein-mesbah/Reinforcement_learning

This repository contains the implementation of a wide variety of Reinforcement Learning Projects in different applications of Bandit Algorithms, MDPs, Distributed RL and Deep RL. These projects include university projects and projects implemented due to interest in Reinforcement Learning.

Language: Jupyter Notebook - Size: 10.4 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

swasun/BanditProblem 📦

A collection of implementations of the bandit problem.

Language: Jupyter Notebook - Size: 580 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

mmalekzadeh/privacy-preserving-bandits

Privacy-Preserving Bandits (MLSys'20)

Language: Jupyter Notebook - Size: 35.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 21 - Forks: 6

anishacharya/Bandits-Online-Learning

Simple implementations of bandit algorithms in Python

Language: Jupyter Notebook - Size: 120 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

duongnhatthang/meta-bandit

Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms

Language: Python - Size: 12.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

EnigmaData/rlinc

Reinforcement Learning Starters Package for Multi-arm Bandits Problem

Language: Python - Size: 35.2 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

alextanhongpin/bandit-learn

A knowledge base for bandit algorithms

Size: 7.81 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 1

Alanthink/banditpylib

A lightweight Python library for bandit algorithms

Language: Python - Size: 11.2 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 25 - Forks: 4

niravnb/Multi-armed-bandit-algortihms

Implementation of famous bandit algorithms: Explore-then-Commit, UCB, and Thompson sampling in Python.

Language: Jupyter Notebook - Size: 11.5 MB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 4 - Forks: 3

showman-sharma/Semi-bandits

We show the performance of various algorithms in the semi-bandit setting and try to solve a real-world problem using the same

Language: Jupyter Notebook - Size: 2.59 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

hughrawlinson/bandit-algorithms

🎩🤠 Some Bandit Algorithms in TypeScript

Language: TypeScript - Size: 313 KB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

ngutowski/algossim

This repository aims to help you learn the most popular MAB and CMAB algorithms and watch how they run. It is useful for those wishing to start learning these topics.

Language: Python - Size: 420 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 3

gohjiayi/beer_recommender

Building a beer recommender using collaborative filtering and bandit algorithms, and evaluating the best performing technique.

Language: Jupyter Notebook - Size: 5.6 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 2

2ailesB/RLD

An implementation of the TME from the Reinforcement Learning course given at Sorbonne University.

Language: Jupyter Notebook - Size: 42.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

Nicolivain/trustful-bandits

A two-armed bandit simulation and comparison with theoretical convergence

Language: Jupyter Notebook - Size: 28.9 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

Acemad/alphaNTBEA

An Implementation of the N-Tuple Bandits Evolutionary Algorithm.

Language: Java - Size: 65.4 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Hins-Hu/Bandit-Algorithms

An illustrative project including some multi-armed bandit algorithms and contextual bandit algorithms

Language: Python - Size: 902 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

jialinyi94/matching-bandit

An implementation of the matching bandit algorithm in http://proceedings.mlr.press/v139/sentenac21a.html.

Language: Jupyter Notebook - Size: 1.64 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

adborroto/reinforcement_learning

AI Reinforcement Learning in Python

Language: Python - Size: 16.6 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

rssalessio/py-lower-bound-bai

Python utilities to compute a lower bound of the expected sample complexity to identify the best arm in a bandit model

Language: Python - Size: 8.79 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

MaxenceGiraud/MachineLearningAlgos

Personal reimplementation of some ML algorithms for learning purposes

Language: Python - Size: 297 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 2

chunjenpeng/pyBandit

Bandit and Evolutionary Algorithms using Python

Language: Python - Size: 2.07 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

brown9804/Filters_through_electrical_circuits

Creation of filters using electric passive elements

Language: MATLAB - Size: 421 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

Related Keywords
bandit-algorithms 82 reinforcement-learning 30 machine-learning 13 multi-armed-bandit 9 reinforcement-learning-algorithms 7 multi-armed-bandits 7 thompson-sampling 6 python 6 optimization 6 contextual-bandits 6 bandit 6 online-learning 5 bandits 5 epsilon-greedy 5 exploration-exploitation 5 bandit-learning 5 ucb 4 policy-gradient 4 recommender-system 4 multiarmed-bandits 4 policy-iteration 4 monte-carlo 4 deep-learning 3 multiarm-bandit 3 causality 3 linucb 3 q-learning 3 machine-learning-algorithms 3 markov-decision-processes 3 evolutionary-algorithms 3 deep-reinforcement-learning 3 qlearning 2 simulation 2 gradient-descent 2 sarsa-learning 2 divergence 2 open-source 2 kl-ucb 2 kullback-leibler-divergence 2 contextual-bandit 2 artificial-intelligence 2 differential-privacy 2 numpy 2 dynamic-programming 2 genetic-algorithm 2 algorithm 2 on-policy 2 optimization-algorithms 2 ucb-algorithm 2 off-policy 2 recommendation-system 2 data-science 2 learning 2 python3 2 blackbox-optimization 2 bayesian-optimization 2 online-learning-algorithms 1 federated-learning 1 submodular-optimization 1 differentially-private 1 criteo-dataset 1 machinelearning 1 online-learning-python 1 agent 1 kotlin-library 1 submodularity 1 deeprl 1 online-machine-learning 1 recommendation 1 privacy-preserving-machine-learning 1 distributed-reinforcement-learning 1 gym 1 mdp 1 bandit-algorithm 1 privacy-preserving-bandits 1 multi-agent-reinforcement-learning 1 stablebaselines3 1 network-routing 1 bernoulli-distribution 1 graph-neural-networks 1 graphs-theory 1 semi-supervised-learning 1 multi-objective 1 non-stationary 1 exploration-strategy 1 matplotlib 1 n-armed-bandit-problem 1 yelp-dataset 1 asymptotically-optimal-ucb-algorithm 1 etc-algorithm 1 adversarial-bandit-algorithms 1 exp3-algorithm 1 stochastic-bandit-algorithms 1 upper-confidence-bounds 1 bernoulli-bandit 1 gaussian-bandit 1 softmax 1 softmax-policy 1 aistats 1 aistats-2020 1