Topic: "llmalignment"
holarissun/RewardModelingBeyondBradleyTerry
Official implementation of the ICLR 2025 paper "Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives"
Language: Python - Size: 365 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 41 - Forks: 3
