GitHub topics: bandits

Repositories

tensorflow/agents

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

Language: Python - Size: 12.9 MB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 2,889 - Forks: 732

doerlbh/BanditZoo

Python library of bandits and RL agents in different real-world environments

Language: Python - Size: 205 KB - Last synced at: 8 days ago - Pushed at: about 3 years ago - Stars: 6 - Forks: 4

DURUII/Replica-AUCB

🐯REPLICA of "Auction-based combinatorial multi-armed bandit mechanisms with strategic arms"

Language: Python - Size: 3.83 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 0

yfletberliac/rlss-2019

Materials for the Practical Sessions of the Reinforcement Learning Summer School 2019: Bandits, RL & Deep RL (PyTorch).

Language: Jupyter Notebook - Size: 7.34 MB - Last synced at: 13 days ago - Pushed at: over 5 years ago - Stars: 89 - Forks: 44

iheartradio/thomas

Another A/B test library

Language: Scala - Size: 5.12 MB - Last synced at: 21 days ago - Pushed at: about 1 month ago - Stars: 24 - Forks: 8

banditml/banditml

A lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.

Language: Python - Size: 197 KB - Last synced at: 2 days ago - Pushed at: almost 4 years ago - Stars: 66 - Forks: 10

manome/python-mab

This project provides a simulation of multi-armed bandit problems. The implementation is based on the following paper: https://arxiv.org/abs/2308.14350

Language: Python - Size: 1.2 MB - Last synced at: 16 days ago - Pushed at: 5 months ago - Stars: 4 - Forks: 0

lasgroup/MaxMinLCB

Code for our paper "Bandits with Preference Feedback: A Stackelberg Game Perspective"

Language: Python - Size: 45.9 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

riccardodv/COOP-learning

Study the interplay between communication and feedback in a cooperative online learning setting.

Language: Python - Size: 70.3 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

thoughtworks/simplebandit

Lightweight contextual bandit library for TypeScript/JavaScript

Language: TypeScript - Size: 558 KB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 0

pappar-delle/AI-Labs-2022-23

TJHSST Artificial Intelligence Labs from the 2022-23 School Year with Dr. Gabor

Language: Python - Size: 116 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

TanguyUrvoy/pmlib

A Python library for (finite) Partial Monitoring algorithms

Language: Jupyter Notebook - Size: 1.75 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 1

ElianBelot/bernoulli-bandits

An exploration of multi-armed Bernoulli bandits in reinforcement learning, complete with experiments and observations.

Language: Jupyter Notebook - Size: 2.83 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

annieyan/Bandits-using-UCB-algorithm

Thompson Sampling for Bandits using UCB policy

Language: Python - Size: 3.91 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 10 - Forks: 3

Ralyhu/CMAB-CC

Code and data for the paper "A Combinatorial Multi-Armed Bandit Approach to Correlation Clustering", DAMI 2023

Language: Python - Size: 320 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

babaniyi/Deep-contextual-bandits

A benchmark to test decision-making algorithms for contextual bandits. The library implements a variety of algorithms (many of them based on approximate Bayesian neural networks and Thompson sampling) and a number of real and synthetic data problems exhibiting a diverse set of properties.

Language: Python - Size: 58.6 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 7 - Forks: 1

YRussac/WeightedLinearBandits

Code associated with the NeurIPS19 paper "Weighted Linear Bandits in Non-Stationary Environments"

Language: Jupyter Notebook - Size: 673 KB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 17 - Forks: 1

sarthakmittal92/multi-armed-bandits

Repository for the course project done as part of the CS-747 (Foundations of Intelligent & Learning Agents) course at IIT Bombay in Autumn 2022.

Language: Python - Size: 330 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

JoelJa835/MAB_Algorithms

Implementation of Multi-Armed Bandit (MAB) algorithms UCB and Epsilon-Greedy. MAB is a class of problems in reinforcement learning where an agent learns to choose actions from a set of arms, each associated with an unknown reward distribution. UCB and Epsilon-Greedy are popular algorithms for solving MAB problems.

Language: Python - Size: 17.6 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

gurbaaz27/amazon-hackathon

Language: Jupyter Notebook - Size: 48.8 MB - Last synced at: about 2 months ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 2

anishacharya/Bandits-Online-Learning

Simple implementations of bandit algorithms in Python

Language: Jupyter Notebook - Size: 120 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

doerlbh/ABaCoDE

Code for our ICDMW 2018 paper: "Contextual Bandit with Adaptive Feature Extraction".

Language: MATLAB - Size: 1.09 GB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 4 - Forks: 1

doerlbh/BerlinUCB

Code for our AJCAI 2020 paper: "Online Semi-Supervised Learning in Contextual Bandits with Episodic Reward".

Language: MATLAB - Size: 33.2 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

doerlbh/dilemmaRL

Code for our PRICAI 2022 paper: "Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior".

Language: Python - Size: 23 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 0

kfoofw/applied_learning_articles

Collaborative project for documenting ML/DS learnings.

Language: Jupyter Notebook - Size: 2.01 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 3

philinemey/BSE-T3-RL

Coursework, Stochastic Models and Optimization, BSE, Term 3, Class of 2022

Language: Jupyter Notebook - Size: 3.98 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

Nicolivain/RLD

Deep Reinforcement Learning Agents in PyTorch in a modular framework

Language: Jupyter Notebook - Size: 39.6 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 4 - Forks: 0

alxthm/rld-project

Play Rock, Paper, Scissors (Kaggle competition) with Reinforcement Learning: bandits, tabular Q-learning and PPO with LSTM.

Language: Python - Size: 416 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 1

MehranTaghian/prophet-inequlity-implementation

Implementation of the prophet inequalities

Language: Python - Size: 1.47 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

krishnaw14/CS747-assignments

Foundations of Intelligent and Learning Agents

Language: Python - Size: 992 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

Nicolivain/trustful-bandits

A two-armed bandit simulation and comparison with theoretical convergence

Language: Jupyter Notebook - Size: 28.9 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

XiaoMutt/ucbc

Stanford CS234 Course Side Project

Language: Python - Size: 7.71 MB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

Zaidtech/OverTheWire

This repo contains all the stuff I encountered while playing OverTheWire games.

Size: 1.95 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

BrianHung/random

Random Python notebooks (hopefully useful in the future)

Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 1

Ralami1859/MemoryBandits

Language: Python - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

rohilrg/Online-Learning-Bandits-Reinforcement-Learning

An assignment for the implementation of Online Learning, Bandits and Reinforcement Learning

Language: Jupyter Notebook - Size: 3.28 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1

SC5/bandits

Language: Python - Size: 95.7 KB - Last synced at: 2 days ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 5

rameshjes/RobotLearning

Language: Jupyter Notebook - Size: 6.54 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 1

zedoul/bandits

(develop) Bayesian Bandits

Language: R - Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

Related Keywords

bandits (39), reinforcement-learning (16), contextual-bandits (6), multi-armed-bandits (6), bandit-algorithms (5), bandit (4), machine-learning (4), ucb (3), reinforcement-learning-algorithms (3), online-learning (2), thompson-sampling (2), contextual-bandit (2), recommendation-system (2), online-learning-algorithms (2), dynamic-programming (2), pytorch (2), personalization (2), bayesian (2), markov-decision-processes (2), mab (2), multiagent-systems (1), human-behavior (1), game-theory (1), behavioral-cloning (1), semi-supervised-learning (1), self-supervised-learning (1), paper (1), nonstationary-environments (1), jupyter (1), random (1), representation-learning (1), nonstationary (1), non-stationary (1), online-passive-aggresive-algorithm (1), icdm2018 (1), icdm (1), feature-extraction (1), online-learning-python (1), policy-gradient (1), bandit-learning (1), hackathon (1), sarsa-lambda (1), amazon (1), online-optimization (1), asset-allocation (1), stochastic-algorithm (1), prophet-inequality (1), k-prophet (1), rps-game (1), stochastic-algorithms (1), rl (1), q-learning (1), ppo (1), gym-environment (1), deep-reinforcement-learning (1), policy-iteration (1), stochastic-optimization (1), trading-agent (1), gaussian-processes (1), bayesian-optimization (1), uplift-modelling (1), causal-inference (1), prisoner-dilemma (1), cybersecurity (1), overthewire (1), multiplayer-game (1), stochastic-bandit-algorithms (1), neural-networks (1), scala (1), public (1), functional-reactive-programming (1), functional-programming (1), bayesian-analysis (1), bandit-algorithm (1), ab-testing (1), tutorial (1), school (1), notebooks (1), materials (1), ipynb (1), google-colab (1), education (1), multi-armed-bandit (1), cmab (1), aution (1), aucb (1), simulation (1), tf-agents (1), tensorflow (1), rl-algorithms (1), dqn (1), e-greedy (1), python (1), kl-ucb (1), non-stationary-environment (1), neurips-2019 (1), multiarmed-bandits (1), correlation-clustering (1), clustering (1), rex3 (1)