An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: preference-learning

tournesol-app/tournesol

Free and open source code of the https://tournesol.app platform. Meet the community on Discord https://discord.gg/WvcSG55Bf3

Language: Python - Size: 29.5 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 348 - Forks: 50

sail-sg/dice

Official implementation of Bootstrapping Language Models via DPO Implicit Rewards

Language: Python - Size: 18.4 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 43 - Forks: 3

osorensen/BayesMallowsSMC2

Sequential Monte Carlo algorithms for the Bayesian Mallows model.

Language: C++ - Size: 501 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

PeymanMorteza/Metric-Preference-Learning-RKHS

Metric and Preference Learning in Reproducing Kernel Hilbert Spaces

Language: Python - Size: 4.45 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

SMARTlab-Purdue/SAN-NaviSTAR

This repository contains the source code for our paper: "NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning". For more details, please refer to our project website at https://sites.google.com/view/san-navistar.

Language: Python - Size: 125 MB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 54 - Forks: 5

Dev1nW/Simplified-Rating-and-Preference-RL

Simplified, modern implementation of Rating and Preference-based Reinforcement Learning.

Language: Python - Size: 21.6 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

zwhong714/weak-to-strong-preference-optimization

[ICLR 2025 Spotlight] Weak-to-strong preference optimization: stealing reward from weak aligned model

Language: Python - Size: 1.41 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

qxcv/magical

The MAGICAL benchmark suite for robust imitation learning (NeurIPS 2020)

Language: Python - Size: 52.3 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 77 - Forks: 11

SMARTlab-Purdue/SAN-FAPL

This repository contains the source code for our paper: "Feedback-efficient Active Preference Learning for Socially Aware Robot Navigation", accepted to IROS-2022. For more details, please refer to our project website at https://sites.google.com/view/san-fapl.

Language: Python - Size: 26.6 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 4

martimfasantos/CustomPOs-for-SLMs

Novel Preference Optimization Algorithms for state-of-the-art small LMs, enhancing performance in GenAI and NLP tasks

Language: Python - Size: 272 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Seninfarheen/Senior-Sage

A conversational assistant designed to support elderly individuals with reminders, health questions, and personalized preferences using advanced LLM capabilities.

Language: Python - Size: 223 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

typoverflow/WiseRL

PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms

Language: Python - Size: 6.06 MB - Last synced at: 10 days ago - Pushed at: 27 days ago - Stars: 19 - Forks: 2

gao-g/prelude

Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".

Language: Python - Size: 43 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 27 - Forks: 0

DanieleF198/ILASP-as-post-hoc-method-in-a-preference-system

Experiments on using ILASP as a post-hoc method over black-box models, in which we also study and address technical issues such as exponential execution time.

Language: Lasso - Size: 370 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

lasgroup/MaxMinLCB

Code for our paper "Bandits with Preference Feedback: A Stackelberg Game Perspective"

Language: Python - Size: 45.9 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

IAAR-Shanghai/ICSFSurvey

A comprehensive survey on Internal Consistency and Self-Feedback in Large Language Models.

Language: Jupyter Notebook - Size: 4.94 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 38 - Forks: 3

JanoschMenke/metis

Python-based GUI to collect feedback from chemists on molecules

Language: Python - Size: 100 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 34 - Forks: 10

LemurPwned/bradley-terry-ui

UI for straightforward Bradley-Terry feedback loop

Language: Python - Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 2 - Forks: 0

vicgalle/configurable-safety-tuning

Data and models for the paper "Configurable Safety Tuning of Language Models with Synthetic Preference Data"

Language: Python - Size: 2.53 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 8 - Forks: 1

ma921/CoExBO

(AISTATS 2024) "Looping in the Human: Collaborative and Explainable Bayesian Optimization"

Language: Jupyter Notebook - Size: 4.97 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

allenai/reward-bench

RewardBench: the first evaluation tool for reward models.

Language: Python - Size: 3.29 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 157 - Forks: 15

aleksa-sukovic/iclr2024-reward-design-for-justifiable-rl

Code for the paper "Reward Design for Justifiable Sequential Decision-Making"; ICLR 2024

Language: Jupyter Notebook - Size: 2.2 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

BARUDA-AI/Awesome-Preference-Optimization

Survey of preference alignment algorithms

Size: 0 Bytes - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

julilien/PLDepth

Code for "Monocular Depth Estimation via Listwise Ranking using the Plackett-Luce Model" as published at CVPR 2021.

Language: Python - Size: 503 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 12 - Forks: 2

huixin-zhan-ai/GAN-Assisted-Preference-Based-Learning

A paper under AAAI-20 review

Language: Python - Size: 206 KB - Last synced at: 18 days ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 1

Intelligent-Systems-Group/jpl-framework

Java framework for Preference Learning

Language: Java - Size: 9.6 MB - Last synced at: 12 months ago - Pushed at: about 7 years ago - Stars: 6 - Forks: 2

Rahgooy/MDFT

In this project, we design a recurrent neural network to simulate a cognitive model of decision-making called Multi Alternative Decision Field Theory (MDFT). We train this RNN to learn the parameters of MDFT.

Language: Python - Size: 3.74 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

FareedKhan-dev/APReL-Mountain-Car-Reinforcement-Learning

APReL: Active preference-based reward learning for human-robot interaction. Uses the "Mountain Car" environment to learn from human preferences how to reach the goal state. Has applications in robotics and can be adapted to other learning methods.

Size: 2.93 KB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

aaronpmishkin/gaussian_processes 📦

Preference Learning with Gaussian Processes and Bayesian Optimization

Language: Python - Size: 272 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 7 - Forks: 0

makgyver/PRL

[P]reference and [R]ule [L]earning algorithm implementation for Python 3 (https://arxiv.org/abs/1812.07895)

Language: Python - Size: 117 KB - Last synced at: 2 days ago - Pushed at: about 6 years ago - Stars: 5 - Forks: 1

TristanFauvel/Bayesian_test_for_preference

An analysis of preference comparisons based on the Bayes factor

Language: Jupyter Notebook - Size: 85 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

rowlandseymour/BSBT

Bayesian Spatial Bradley-Terry

Language: R - Size: 44.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

Bekyilma/Master_thesis

Constructive Preference Elicitation for Social Choice With Setwise max-margin Learning.

Language: Python - Size: 9.24 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

jimparr19/pypbl

Python library for preference based learning

Language: Python - Size: 1.34 MB - Last synced at: 30 days ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 3

afiliot/Preference-Learning-And-Movie-Reviews

Project on preference learning - ENSAE ParisTech

Language: Python - Size: 5.58 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

Related Keywords
preference-learning (35), machine-learning (10), reinforcement-learning (7), alignment (7), rlhf (4), llm (3), large-language-models (3), python (2), recommendation-engine (2), llms (2), bayesian-optimization (2), socially-aware-navigation (2), robot-navigation (2), bradley-terry (2), bradley-terry-model (2), human-preferences (2), bayesian-inference (2), relative-depth (1), human-in-the-loop-machine-learning (1), preference-based-reinforcement-learning (1), plackett-luce (1), reward-design (1), direct (1), preference-alignment (1), monocular-depth-estimation (1), monocular-depth (1), learning-to-rank (1), cvpr (1), cvpr2021 (1), deep-learning (1), human-computer-interaction (1), expert-advisors (1), safety (1), dpo (1), ui (1), human-in-the-loop (1), generative-ai (1), drug-discovery (1), de-novo-drug-design (1), self-refine (1), self-improvement (1), self-feedback (1), self-correction (1), self-correct (1), regret-minimization (1), preference-graph (1), movie-recommender (1), label-preference (1), label-learning (1), instance-preference (1), ensae (1), social-choice-theory (1), recommendation-system (1), gurobi-optimization (1), group-recommendations (1), comparative-judgement (1), bayesian-statistics (1), game-theory (1), algorithm (1), value-charts (1), gaussian-processes (1), openai-gym (1), mountain-car (1), recurrent-neural-networks (1), decision-making (1), cognitive-models (1), ranking (1), object-ranking (1), label-ranking (1), collaborative-filtering (1), reinforcement-learning-algorithms (1), gan (1), weakly-supervised-learning (1), personalization (1), health-wellbeing (1), function-calling (1), elderly-care (1), ai-conversational-assistant (1), preference-optimization (1), nlp (1), gen-ai (1), evaluation (1), learning-from-demonstration (1), reinforcement-learning-environments (1), imitation-learning (1), rating-learning (1), ai-alignment (1), transformer (1), metric-learning (1), kernel-methods (1), ranks (1), particle-filter (1), online-learning (1), youtube (1), social-choice (1), reactjs (1), preference-aggregation (1), golden-ratio-optimization (1), django-rest-framework (1), django (1)