GitHub topics: judge-model
IAAR-Shanghai/xFinder
[ICLR 2025] xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation
Language: Python - Size: 1.36 MB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 169 - Forks: 7

Abhisang3/xVerify
xVerify: Efficient Answer Verifier for Large Language Model Evaluations
Language: Python - Size: 806 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

IAAR-Shanghai/xVerify
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
Language: Python - Size: 826 KB - Last synced at: 27 days ago - Pushed at: about 1 month ago - Stars: 71 - Forks: 5
