GitHub topics: llm-aligment
sail-sg/oat
🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.
Language: Python - Size: 2.29 MB - Last synced at: about 22 hours ago - Pushed at: about 23 hours ago - Stars: 338 - Forks: 23

ZFancy/awesome-activation-engineering
A curated list of resources for activation engineering
Size: 154 KB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 56 - Forks: 1

Dicklesworthstone/some_thoughts_on_ai_alignment
Some Thoughts on AI Alignment: Using AI to Control AI
Size: 1.08 MB - Last synced at: 4 days ago - Pushed at: 2 months ago - Stars: 7 - Forks: 0

holarissun/RewardModelingBeyondBradleyTerry
Official implementation of the ICLR 2025 paper "Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives"
Language: Python - Size: 365 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 41 - Forks: 3
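For context on the entry above: the Bradley-Terry model underlying most preference-based reward modeling scores a "chosen vs. rejected" pair by the sigmoid of the reward difference. A minimal sketch (the function name `bradley_terry_nll` is my own, not from the repo):

```python
import math

def bradley_terry_nll(r_chosen: float, r_rejected: float) -> float:
    """Negative log-likelihood of one preference pair under Bradley-Terry:
    P(chosen > rejected) = sigmoid(r_chosen - r_rejected).
    Reward models are typically trained by minimizing this over labeled pairs.
    """
    p_chosen = 1.0 / (1.0 + math.exp(-(r_chosen - r_rejected)))
    return -math.log(p_chosen)

# Equal rewards mean the model is indifferent: P = 0.5, so NLL = ln 2.
print(bradley_terry_nll(1.0, 1.0))  # → 0.693...
```

The paper listed above examines when this pairwise formulation is and is not the right foundation, and proposes alternatives.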

Zanette-Labs/SpeculativeRejection
[NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection
Language: Python - Size: 2.24 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0
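The entry above speeds up Best-of-N decoding; for reference, plain Best-of-N draws N full completions and keeps the one a reward model scores highest. A minimal sketch of that baseline only (function names are illustrative, not the repo's API; Speculative Rejection's contribution is pruning low-reward partial generations early instead of scoring N full completions):

```python
def best_of_n(prompt, generate, reward, n=4):
    """Plain Best-of-N decoding: sample n candidate completions for the
    prompt and return the one with the highest reward-model score."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=reward)

# Toy usage with stand-in generate/reward functions:
outputs = iter(["short", "a much longer answer", "mid-length one"])
best = best_of_n("q", lambda p: next(outputs), reward=len, n=3)
print(best)  # → "a much longer answer"
```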
