GitHub topics: reward-modeling

Repositories

Jialuo-Li/Science-T2I

[CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis

Language: Python - Size: 189 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 50 - Forks: 1

VectorInstitute/vector-inference

Efficient LLM inference on Slurm clusters using vLLM.

Language: Python - Size: 2.59 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 58 - Forks: 10

sileod/tasksource

Dataset collection and preprocessing framework for NLP extreme multitask learning

Language: Python - Size: 368 KB - Last synced at: 23 days ago - Pushed at: 4 months ago - Stars: 178 - Forks: 10

holarissun/RewardModelingBeyondBradleyTerry

Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives

Language: Python - Size: 365 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 41 - Forks: 3

YangLing0818/IterComp

[ICLR 2025] IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Language: Python - Size: 32.8 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 161 - Forks: 10

allenai/hybrid-preferences

Learning to route instances for Human vs AI Feedback

Language: Python - Size: 273 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 18 - Forks: 2

Related Keywords

reward-modeling (6), rlhf (3), benchmark (2), dpo (2), instruction-tuning (1), meta-learning (1), multi-task-learning (1), multi-task-learning-scaling (1), natural-language-inference (1), nlp (1), preprocessings (1), scaling (1), text-classification (1), inverse-reinforcement-learning (1), large-language-models (1), largelanguagemodels (1), llm-aligment (1), llmalignment (1), reward (1), reward-models (1), text-to-image (1), language-model (1), computer-vision (1), dataset (1), generative-model (1), post-training (1), science (1), inference (1), llm (1), llm-inference (1), text-embedding (1), vllm (1), vlm (1), bigbench (1), crossfit (1), curated-datasets (1), dataset-collection (1), discriminative (1), extreme-mtl (1), extreme-multi-task-learning (1), glue (1), huggingface (1)
