GitHub topics: llmalignment

Repositories

holarissun/RewardModelingBeyondBradleyTerry

Official implementation of the ICLR 2025 paper "Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives"

Language: Python - Size: 365 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 41 - Forks: 3

Related Keywords

inverse-reinforcement-learning (1), large-language-models (1), largelanguagemodels (1), llm-aligment (1), llmalignment (1), reward (1), reward-modeling (1), reward-models (1), rlhf (1)
