GitHub topics: bandit-algorithms
SMPyBandits/SMPyBandits
🔬 Research Framework for Single and Multi-Players 🎰 Multi-Arms Bandits (MAB) Algorithms, implementing all the state-of-the-art algorithms for single-player (UCB, KL-UCB, Thompson...) and multi-player (MusicalChair, MEGA, rhoRand, MCTop/RandTopM, etc.). Available on PyPI: https://pypi.org/project/SMPyBandits/ and documentation on
Language: Jupyter Notebook - Size: 392 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 405 - Forks: 59

sshkhr/Practical_RL
My solutions to Yandex Practical Reinforcement Learning course in PyTorch and Tensorflow
Language: Jupyter Notebook - Size: 9.91 MB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 53 - Forks: 25

Vinit-4689/Multi-Armed-Bandit
Efficient exploration and exploitation strategies using Epsilon-Greedy, UCB1, and Thompson Sampling — with code, math, and intuition.
Language: Python - Size: 14.6 KB - Last synced at: 18 days ago - Pushed at: 28 days ago - Stars: 1 - Forks: 0

WilliamLwj/PyXAB
PyXAB - A Python Library for X-Armed Bandit and Online Blackbox Optimization Algorithms
Language: Python - Size: 13.8 MB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 126 - Forks: 30

rssalessio/reading-list
This is a collection of interesting papers that I have read so far or want to read. Note that the list is not up-to-date. Topics: reinforcement learning, deep learning, mathematics, statistics, bandit algorithms, optimization.
Size: 82 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 11 - Forks: 0

c-bata/goptuna
A hyperparameter optimization framework, inspired by Optuna.
Language: Go - Size: 15.4 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 266 - Forks: 23

ZIYU-DEEP/Awesome-Papers-on-Combinatorial-Semi-Bandit-Problems
A curated list of papers on combinatorial multi-armed bandit problems.
Size: 40 KB - Last synced at: 4 days ago - Pushed at: about 4 years ago - Stars: 17 - Forks: 0

albertopirillo/ola-project-2023
Pricing and advertising strategy for the e-commerce of an airline company, based on Multi-Armed Bandits (MABs) algorithms and Gaussian Processes. Simulations include non-stationary environments.
Language: Python - Size: 20.5 MB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

doerlbh/BanditZoo
Python library of bandits and RL agents in different real-world environments
Language: Python - Size: 205 KB - Last synced at: 28 days ago - Pushed at: about 3 years ago - Stars: 6 - Forks: 4

dkimpara/Bandit_OCO
Extending Agarwal, Dekel, and Xiao (2010) to the online convex optimization setting with experiments.
Language: Jupyter Notebook - Size: 2.32 MB - Last synced at: 29 days ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

GjjvdBurg/ThompsonSampling
Source code for blog post on Thompson Sampling
Language: JavaScript - Size: 18.6 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 1

BorealisAI/raps
Code for the paper "Causal Bandits without Graph Learning"
Language: Jupyter Notebook - Size: 571 KB - Last synced at: 28 days ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

singhsidhukuldeep/contextual-bandits
A comprehensive Python library implementing a variety of contextual and non-contextual multi-armed bandit algorithms—including LinUCB, Epsilon-Greedy, Upper Confidence Bound (UCB), Thompson Sampling, KernelUCB, NeuralLinearBandit, and DecisionTreeBandit—designed for reinforcement learning applications
Language: Python - Size: 88.9 KB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 6 - Forks: 0

DURUII/Replica-AUCB
🐯REPLICA of "Auction-based combinatorial multi-armed bandit mechanisms with strategic arms"
Language: Python - Size: 3.83 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 0

Naereen/KullbackLeibler.jl
💫 Fast Julia implementation of various Kullback-Leibler divergences for 1D parametric distributions. 🏋 Also provides optimized code for kl-UCB indexes
Language: Julia - Size: 56.6 KB - Last synced at: about 1 month ago - Pushed at: almost 7 years ago - Stars: 4 - Forks: 1

ZiruiYan/awesome-causal-bandit
A list of papers on causal bandits
Size: 10.7 KB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 7 - Forks: 0

JurajZelman/multi-armed-bandits
Several multi-armed bandit strategies with additional holding option for smoother exploration.
Language: Jupyter Notebook - Size: 297 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

park-jihoo/RL_TIL
Today I Learned - Reinforcement Learning
Language: Python - Size: 32.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

4dnaanM/bandits
Language: Jupyter Notebook - Size: 394 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

MIFA-Lab/LDPbandit2020
Implementation for NeurIPS 2020 paper "Locally Differentially Private (Contextual) Bandits Learning" (https://arxiv.org/abs/2006.00701)
Language: Python - Size: 92.8 KB - Last synced at: 5 months ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 1

HridayM25/ReinforcementLearning
Some Reinforcement Learning algorithms implemented by me, following "Reinforcement Learning: An Introduction" by Richard Sutton and Andrew Barto.
Language: Jupyter Notebook - Size: 538 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 2 - Forks: 0

shashankp914/Over-the-wire-wargames-Solutions
Detailed solutions to the OverTheWire wargames, currently covering Bandit, with many more planned.
Size: 39.1 KB - Last synced at: 9 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

szrlee/GPT-HyperAgent
The official code repo for HyperAgent for neural bandits and GPT-HyperAgent for content moderation.
Language: Python - Size: 34.2 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

jajajang/LowPopArt
2024 ICML Official code
Language: C - Size: 44.8 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

duoan/OpenMultiarmedBandits
An open source multi-armed bandit framework for optimizing your website quickly. You’ll quickly see the benefits of several simple algorithms—including the epsilon-Greedy, Softmax, and Upper Confidence Bound (UCB) algorithms—by working through this framework written in Java, which you can easily adapt for deployment on your own website.
Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

EnkiDoctor/Neural_bandit
This is a repo for research proposal of Du Junye
Language: Jupyter Notebook - Size: 129 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Rajarshi1001/CS780
Repository containing code for the course CS780: Deep Reinforcement Learning
Language: Jupyter Notebook - Size: 167 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

doerlbh/MiniVox
Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".
Language: Cuda - Size: 998 MB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 25 - Forks: 5

jayrcausal/Essential3CRL
Research about Causality-based Reinforcement Learning. This repository includes all needed fundamentals, a summary of past work, and some of the most recent developments
Language: Jupyter Notebook - Size: 63.1 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

mknbv/zamburak
Bandit algorithms in OCaml
Language: OCaml - Size: 479 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 0

aayushmanghosh/RL-Algorithms-for-iBMI-Applications
Official repository for Reinforcement Learning Decoders used for intra-cortical brain machine interfaces - IEEE TNNLS 2023
Language: MATLAB - Size: 197 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

luke-davidson/ReinforcementLearning
Programming assignments completed for my Reinforcement Learning course: Topics include Bandit Algorithms, Dynamic Programming, policy iteration, Monte-Carlo methods, SARSA, Q-Learning, Dyna-Q/Dyna-Q+, gradient control methods, state aggregation methods, and Deep Q-Learning Networks (DQNs).
Language: Jupyter Notebook - Size: 26.6 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

hvishal512/CS6700-Reinforcement-Learning
Artificial Intelligence series
Language: Jupyter Notebook - Size: 5.04 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 16 - Forks: 4

gokceuludogan/interactive-music-recommendation
Personalized and Interactive Music Recommendation with Bandit approach
Language: Jupyter Notebook - Size: 3.09 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 9 - Forks: 2

siavashadpey/MultiArmedBandits
Language: Python - Size: 290 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Naereen/Kullback-Leibler-divergences-and-kl-UCB-indexes
🐍 🔬 Fast Python implementation of various Kullback-Leibler divergences for 1D and 2D parametric distributions. Also provides optimized code for kl-UCB indexes
Language: HTML - Size: 136 KB - Last synced at: about 1 month ago - Pushed at: almost 7 years ago - Stars: 9 - Forks: 7

TheUnsolvedDev/ReinforcementLearning
Repository containing basic algorithms implemented in Python.
Language: Jupyter Notebook - Size: 121 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

jia-yi-chen/Bandit-and-Reinforcement-Learning
Python implementation for Reinforcement Learning algorithms -- Bandit algorithms, MDP, Dynamic Programming (value/policy iteration), Model-free Control (off-policy Monte Carlo, Q-learning)
Language: Python - Size: 31.3 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

junjiedong/warfarin-bandit
Contextual Bandit algorithms for Warfarin Treatment
Language: Jupyter Notebook - Size: 1.67 MB - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 1

rafaol/no-regret-approximate-inference-via-bo
Code repository for the paper No-Regret Approximate Inference via Bayesian Optimisation, published at UAI 2021
Language: Jupyter Notebook - Size: 6.35 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

rasros/combo
Language: Kotlin - Size: 13.5 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

niazangels/bandits
An introduction to multi arm bandits
Language: Jupyter Notebook - Size: 2.46 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

Sagarnandeshwar/Bandit_Algorithms
Reinforcement Learning (COMP 579) Project
Language: Jupyter Notebook - Size: 3.03 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

niffler92/Bandit
Bandit algorithms
Language: Python - Size: 300 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 29 - Forks: 6

anselmeamekoe/Graphs_in_ML_MVA
Language: Jupyter Notebook - Size: 2.99 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

babaniyi/Deep-contextual-bandits
A benchmark to test decision-making algorithms for contextual bandits. The library implements a variety of algorithms (many of them based on approximate Bayesian Neural Networks and Thompson sampling), and a number of real and synthetic data problems exhibiting a diverse set of properties.
Language: Python - Size: 58.6 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 1

amirbalef/PS_MOMAB
Multi-Objective Multi-Armed Bandit
Language: Python - Size: 608 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

ruqoyyasadiq/deep_RL-multi-arm-bandit-exploration
This is an implementation of the Reinforcement Learning multi-arm-bandit experiment using different exploration techniques.
Language: Python - Size: 1.37 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

rojagtap/value-function-methods
Implementation of greedy, ε-greedy and softmax methods for n-armed bandit problem
Language: Jupyter Notebook - Size: 59.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

guptav96/bandit-algorithms
A short implementation of bandit algorithms - ETC, UCB, MOSS and KL-UCB
Language: Python - Size: 5.56 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1

alif898/Yelp-Recommender-System
Proof of concept for a recommender system for Yelp, using bandit algorithms.
Language: Jupyter Notebook - Size: 40.3 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

rmitsuboshi/bandit
A small collection of Bandit algorithms, written in Rust 🦀.
Language: Rust - Size: 1.72 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

kulinshah98/Multi-Armed-Bandit-Algorithms
Python implementation of UCB, EXP3 and Epsilon greedy algorithms
Language: Python - Size: 760 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 24 - Forks: 9

nicoleorzan/Multi-armed-bandit-RL
C++ implementation of Multi-Armed Bandits (Gaussian and Bernoulli)
Language: C++ - Size: 73.2 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 4 - Forks: 2

sparsh-ai/reco-bandit
Building recommender systems using contextual bandit methods to address the cold-start issue and online real-time learning
Language: Jupyter Notebook - Size: 7.32 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 1

rssalessio/DPE
DPE code - Code used in "Optimal Algorithms for Multiplayer Multi-Armed Bandits" (AISTATS 2020)
Language: Python - Size: 95.7 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 2 - Forks: 2

gdmarmerola/interactive-intro-rl Fork of bigdatabr/interactive-intro-rl
Big Data's open seminars: An Interactive Introduction to Reinforcement Learning
Language: Jupyter Notebook - Size: 7.71 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 48 - Forks: 19

gdmarmerola/advanced-bandit-problems
More about the exploration-exploitation tradeoff with harder bandits
Language: Jupyter Notebook - Size: 2.82 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 21 - Forks: 9

fouratifares/RGL
Randomized Greedy Learning Under Full-bandit Feedback
Language: Python - Size: 150 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

amirhosein-mesbah/Reinforcement_learning
This repository contains the implementation of a wide variety of Reinforcement Learning Projects in different applications of Bandit Algorithms, MDPs, Distributed RL and Deep RL. These projects include university projects and projects implemented due to interest in Reinforcement Learning.
Language: Jupyter Notebook - Size: 10.4 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

swasun/BanditProblem 📦
A collection of implementations of the bandit problem.
Language: Jupyter Notebook - Size: 580 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

mmalekzadeh/privacy-preserving-bandits
Privacy-Preserving Bandits (MLSys'20)
Language: Jupyter Notebook - Size: 35.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 21 - Forks: 6

anishacharya/Bandits-Online-Learning
Simple implementations of bandit algorithms in Python
Language: Jupyter Notebook - Size: 120 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

duongnhatthang/meta-bandit
Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms
Language: Python - Size: 12.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

EnigmaData/rlinc
Reinforcement Learning Starters Package for Multi-arm Bandits Problem
Language: Python - Size: 35.2 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

alextanhongpin/bandit-learn
A knowledge base for Bandit Algorithm
Size: 7.81 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 1

Alanthink/banditpylib
A lightweight python library for bandit algorithms
Language: Python - Size: 11.2 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 25 - Forks: 4

niravnb/Multi-armed-bandit-algortihms
Implementation of famous bandit algorithms: Explore-then-Commit, UCB, and Thompson sampling in Python.
Language: Jupyter Notebook - Size: 11.5 MB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 4 - Forks: 3

showman-sharma/Semi-bandits
We show the performance of various algorithms in the semi-bandit setting and try to solve a real-world problem using them
Language: Jupyter Notebook - Size: 2.59 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

hughrawlinson/bandit-algorithms
🎩🤠Some Bandit Algorithms in Typescript
Language: TypeScript - Size: 313 KB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

ngutowski/algossim
This repository is for learning the most popular MAB and CMAB algorithms and watching how they run. It is useful for those wishing to start learning these topics.
Language: Python - Size: 420 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 3

gohjiayi/beer_recommender
Building a beer recommender using collaborative filtering and bandit algorithms, and evaluating the best performing technique.
Language: Jupyter Notebook - Size: 5.6 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 2

2ailesB/RLD
An implementation of the TME from the Reinforcement Learning course given at Sorbonne University.
Language: Jupyter Notebook - Size: 42.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

Nicolivain/trustful-bandits
A two-armed bandit simulation and comparison with theoretical convergence
Language: Jupyter Notebook - Size: 28.9 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

Acemad/alphaNTBEA
An Implementation of the N-Tuple Bandits Evolutionary Algorithm.
Language: Java - Size: 65.4 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Hins-Hu/Bandit-Algorithms
An illustrative project including some multi-armed bandit algorithms and contextual bandit algorithms
Language: Python - Size: 902 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

jialinyi94/matching-bandit
An implementation of the matching bandit algorithm in http://proceedings.mlr.press/v139/sentenac21a.html.
Language: Jupyter Notebook - Size: 1.64 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

adborroto/reinforcement_learning
AI Reinforcement Learning in Python
Language: Python - Size: 16.6 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

rssalessio/py-lower-bound-bai
Python utilities to compute a lower bound of the expected sample complexity to identify the best arm in a bandit model
Language: Python - Size: 8.79 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

MaxenceGiraud/MachineLearningAlgos
Personal reimplementation of some ML algorithms for learning purposes
Language: Python - Size: 297 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 2

chunjenpeng/pyBandit
Bandit and Evolutionary Algorithms using Python
Language: Python - Size: 2.07 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

brown9804/Filters_through_electrical_circuits
Creation of filters using electric passive elements
Language: MATLAB - Size: 421 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0
