GitHub topics: imperfect-reward-function
Brezy024/Mind-the-Gap
Mind the Gap aims to enhance Chain of Thought (CoT) tuning for better AI performance. Join us in exploring innovative solutions and contributing to the project! 🐙🌟
Language: Python - Size: 11 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

Facebear-ljx/RGM
The official implementation of "Mind the Gap: Offline Policy Optimization for Imperfect Rewards" (ICLR 2023)
Language: Python - Size: 3.75 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 1
