GitHub topics: bandit-algorithms

Repositories

babaniyi/Deep-contextual-bandits

Deep contextual bandits in PyTorch: Neural Bandits, Neural Linear, and Linear Full Posterior Sampling with comprehensive benchmarking on synthetic and real datasets

Language: Python - Size: 58.6 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 9 - Forks: 1

ZiruiYan/awesome-causal-bandit

A list of papers on causal bandits

Size: 30.3 KB - Last synced at: 3 days ago - Pushed at: 11 days ago - Stars: 7 - Forks: 0

ZIYU-DEEP/Awesome-Papers-on-Combinatorial-Semi-Bandit-Problems

A curated list of papers on combinatorial multi-armed bandit problems.

Size: 40 KB - Last synced at: 6 days ago - Pushed at: about 4 years ago - Stars: 17 - Forks: 0

WilliamLwj/PyXAB

PyXAB - A Python Library for X-Armed Bandit and Online Blackbox Optimization Algorithms

Language: Python - Size: 13.8 MB - Last synced at: 8 days ago - Pushed at: 8 months ago - Stars: 127 - Forks: 30

KKeishiro/Yahoo_recommendation

Yahoo! news article recommendation system using LinUCB

Language: Python - Size: 1.78 MB - Last synced at: 29 days ago - Pushed at: over 7 years ago - Stars: 113 - Forks: 44

🔬 Research Framework for Single and Multi-Players 🎰 Multi-Armed Bandits (MAB) Algorithms, implementing all the state-of-the-art algorithms for single-player (UCB, KL-UCB, Thompson...) and multi-player (MusicalChair, MEGA, rhoRand, MCTop/RandTopM, etc.). Available on PyPI: https://pypi.org/project/SMPyBandits/ and documentation on

Language: Jupyter Notebook - Size: 392 MB - Last synced at: 24 days ago - Pushed at: about 1 year ago - Stars: 406 - Forks: 59

sshkhr/Practical_RL

My solutions to the Yandex Practical Reinforcement Learning course in PyTorch and TensorFlow

Language: Jupyter Notebook - Size: 9.91 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 54 - Forks: 25

singhsidhukuldeep/contextual-bandits

A comprehensive Python library implementing a variety of contextual and non-contextual multi-armed bandit algorithms—including LinUCB, Epsilon-Greedy, Upper Confidence Bound (UCB), Thompson Sampling, KernelUCB, NeuralLinearBandit, and DecisionTreeBandit—designed for reinforcement learning applications

Language: Python - Size: 88.9 KB - Last synced at: 21 days ago - Pushed at: 6 months ago - Stars: 8 - Forks: 0

c-bata/goptuna

A hyperparameter optimization framework, inspired by Optuna.

Language: Go - Size: 15.4 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 267 - Forks: 23

Vinit-4689/Multi-Armed-Bandit

Efficient exploration and exploitation strategies using Epsilon-Greedy, UCB1, and Thompson Sampling — with code, math, and intuition.

Language: Python - Size: 14.6 KB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

rssalessio/reading-list

This is a collection of interesting papers that I have read so far or want to read. Note that the list is not up-to-date. Topics: reinforcement learning, deep learning, mathematics, statistics, bandit algorithms, optimization.

Size: 82 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 11 - Forks: 0

albertopirillo/ola-project-2023

Pricing and advertising strategy for the e-commerce of an airline company, based on Multi-Armed Bandits (MABs) algorithms and Gaussian Processes. Simulations include non-stationary environments.

Language: Python - Size: 20.5 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 0

doerlbh/BanditZoo

Python library of bandits and RL agents in different real-world environments

Language: Python - Size: 205 KB - Last synced at: 8 days ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 4

dkimpara/Bandit_OCO

Extending Agarwal, Dekel, and Xiao (2010) to the online convex optimization setting with experiments.

Language: Jupyter Notebook - Size: 2.32 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

GjjvdBurg/ThompsonSampling

Source code for blog post on Thompson Sampling

Language: JavaScript - Size: 18.6 KB - Last synced at: 7 days ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 1

rsoaresp/bandits_notebooks

A collection of Google Colab notebooks with educational material on bandits and their variations

Language: Jupyter Notebook - Size: 567 KB - Last synced at: 4 days ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

BorealisAI/raps

Code for the paper "Causal Bandits without Graph Learning"

Language: Jupyter Notebook - Size: 571 KB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

DURUII/Replica-AUCB

🐯REPLICA of "Auction-based combinatorial multi-armed bandit mechanisms with strategic arms"

Language: Python - Size: 3.83 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 0

Naereen/KullbackLeibler.jl

💫 Fast Julia implementation of various Kullback-Leibler divergences for 1D parametric distributions. 🏋 Also provides optimized code for kl-UCB indexes

Language: Julia - Size: 56.6 KB - Last synced at: 3 months ago - Pushed at: about 7 years ago - Stars: 4 - Forks: 1

JurajZelman/multi-armed-bandits

Several multi-armed bandit strategies with an additional holding option for smoother exploration.

Language: Jupyter Notebook - Size: 297 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

park-jihoo/RL_TIL

Today I Learned - Reinforcement Learning

Language: Python - Size: 32.9 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

4dnaanM/bandits

Language: Jupyter Notebook - Size: 394 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

MIFA-Lab/LDPbandit2020

Implementation for NeurIPS 2020 paper "Locally Differentially Private (Contextual) Bandits Learning" (https://arxiv.org/abs/2006.00701)

Language: Python - Size: 92.8 KB - Last synced at: 6 months ago - Pushed at: about 3 years ago - Stars: 4 - Forks: 1

HridayM25/ReinforcementLearning

Some Reinforcement Learning algorithms implemented by me, following "Reinforcement Learning: An Introduction" by Richard Sutton and Andrew Barto.

Language: Jupyter Notebook - Size: 538 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

shashankp914/Over-the-wire-wargames-Solutions

Detailed solutions to the OverTheWire wargames, currently covering Bandit, with more to be added in the future.

Size: 39.1 KB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

szrlee/GPT-HyperAgent

The official code repo for HyperAgent for neural bandits and GPT-HyperAgent for content moderation.

Language: Python - Size: 34.2 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

jajajang/LowPopArt

2024 ICML Official code

Language: C - Size: 44.8 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

duoan/OpenMultiarmedBandits

An open-source multi-armed bandit framework for optimizing your website quickly. You’ll benefit from several simple algorithms, including epsilon-Greedy, Softmax, and Upper Confidence Bound (UCB), by working through this framework written in Java, which you can easily adapt for deployment on your own website.

Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

EnkiDoctor/Neural_bandit

This is a repo for the research proposal of Du Junye

Language: Jupyter Notebook - Size: 129 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Rajarshi1001/CS780

Repository containing code for the course CS780: Deep Reinforcement Learning

Language: Jupyter Notebook - Size: 167 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

doerlbh/MiniVox

Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".

Language: Cuda - Size: 998 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 25 - Forks: 5

jayrcausal/Essential3CRL

Research on Causality-based Reinforcement Learning. This repository includes the necessary fundamentals, a summary of past work, and some of the most recent developments

Language: Jupyter Notebook - Size: 63.1 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 7 - Forks: 0

mknbv/zamburak

Bandit algorithms in OCaml

Language: OCaml - Size: 479 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 0

aayushmanghosh/RL-Algorithms-for-iBMI-Applications

Official repository for Reinforcement Learning Decoders used for intra-cortical brain machine interfaces - IEEE TNNLS 2023

Language: MATLAB - Size: 197 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

luke-davidson/ReinforcementLearning

Programming assignments completed for my Reinforcement Learning course: Topics include Bandit Algorithms, Dynamic Programming, policy iteration, Monte-Carlo methods, SARSA, Q-Learning, Dyna-Q/Dyna-Q+, gradient control methods, state aggregation methods, and Deep Q-Learning Networks (DQNs).

Language: Jupyter Notebook - Size: 26.6 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

hvishal512/CS6700-Reinforcement-Learning

Artificial Intelligence series

Language: Jupyter Notebook - Size: 5.04 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 16 - Forks: 4

gokceuludogan/interactive-music-recommendation

Personalized and Interactive Music Recommendation with Bandit approach

Language: Jupyter Notebook - Size: 3.09 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 9 - Forks: 2

siavashadpey/MultiArmedBandits

Language: Python - Size: 290 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Naereen/Kullback-Leibler-divergences-and-kl-UCB-indexes

🐍 🔬 Fast Python implementation of various Kullback-Leibler divergences for 1D and 2D parametric distributions. Also provides optimized code for kl-UCB indexes

Language: HTML - Size: 136 KB - Last synced at: 3 months ago - Pushed at: about 7 years ago - Stars: 9 - Forks: 7

TheUnsolvedDev/ReinforcementLearning

Repository containing basic algorithms implemented in Python.

Language: Jupyter Notebook - Size: 121 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

jia-yi-chen/Bandit-and-Reinforcement-Learning

Python implementation for Reinforcement Learning algorithms -- Bandit algorithms, MDP, Dynamic Programming (value/policy iteration), Model-free Control (off-policy Monte Carlo, Q-learning)

Language: Python - Size: 31.3 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 1

junjiedong/warfarin-bandit

Contextual Bandit algorithms for Warfarin Treatment

Language: Jupyter Notebook - Size: 1.67 MB - Last synced at: 4 months ago - Pushed at: over 6 years ago - Stars: 4 - Forks: 1

rafaol/no-regret-approximate-inference-via-bo

Code repository for the paper No-Regret Approximate Inference via Bayesian Optimisation, published at UAI 2021

Language: Jupyter Notebook - Size: 6.35 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

rasros/combo

Language: Kotlin - Size: 13.5 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

niazangels/bandits

An introduction to multi-armed bandits

Language: Jupyter Notebook - Size: 2.46 MB - Last synced at: 3 months ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

Sagarnandeshwar/Bandit_Algorithms

Reinforcement Learning (COMP 579) Project

Language: Jupyter Notebook - Size: 3.03 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

niffler92/Bandit

Bandit algorithms

Language: Python - Size: 300 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 29 - Forks: 6

anselmeamekoe/Graphs_in_ML_MVA

Language: Jupyter Notebook - Size: 2.99 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

amirbalef/PS_MOMAB

Multi-Objective Multi-Armed Bandit

Language: Python - Size: 608 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

ruqoyyasadiq/deep_RL-multi-arm-bandit-exploration

This is an implementation of the Reinforcement Learning multi-arm-bandit experiment using different exploration techniques.

Language: Python - Size: 1.37 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

rojagtap/value-function-methods

Implementation of greedy, ε-greedy and softmax methods for n-armed bandit problem

Language: Jupyter Notebook - Size: 59.6 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

guptav96/bandit-algorithms

A short implementation of bandit algorithms - ETC, UCB, MOSS and KL-UCB

Language: Python - Size: 5.56 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

alif898/Yelp-Recommender-System

Proof of concept for a recommender system for Yelp, using bandit algorithms.

Language: Jupyter Notebook - Size: 40.3 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

rmitsuboshi/bandit

A small collection of Bandit algorithms, written in Rust 🦀.

Language: Rust - Size: 1.72 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

kulinshah98/Multi-Armed-Bandit-Algorithms

Python implementation of UCB, EXP3 and Epsilon greedy algorithms

Language: Python - Size: 760 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 24 - Forks: 9

nicoleorzan/Multi-armed-bandit-RL

C++ implementation of Multi-Armed Bandits (Gaussian and Bernoulli)

Language: C++ - Size: 73.2 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 4 - Forks: 2

sparsh-ai/reco-bandit

Building recommender systems using contextual bandit methods to address the cold-start issue and online real-time learning

Language: Jupyter Notebook - Size: 7.32 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 1

rssalessio/DPE

DPE code - Code used in "Optimal Algorithms for Multiplayer Multi-Armed Bandits" (AISTATS 2020)

Language: Python - Size: 95.7 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 2 - Forks: 2

gdmarmerola/interactive-intro-rl (fork of bigdatabr/interactive-intro-rl)

Big Data's open seminars: An Interactive Introduction to Reinforcement Learning

Language: Jupyter Notebook - Size: 7.71 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 48 - Forks: 19

gdmarmerola/advanced-bandit-problems

More about the exploration-exploitation tradeoff with harder bandits

Language: Jupyter Notebook - Size: 2.82 MB - Last synced at: over 2 years ago - Pushed at: about 6 years ago - Stars: 21 - Forks: 9

fouratifares/RGL

Randomized Greedy Learning Under Full-bandit Feedback

Language: Python - Size: 150 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

amirhosein-mesbah/Reinforcement_learning

This repository contains implementations of a wide variety of Reinforcement Learning projects covering applications of Bandit Algorithms, MDPs, Distributed RL, and Deep RL. These include university projects and projects done out of personal interest in Reinforcement Learning.

Language: Jupyter Notebook - Size: 10.4 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0