Topic: "gpt-evaluation"
LianjiaTech/BELLE
BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational large language model)
Language: HTML - Size: 18 MB - Last synced at: 4 days ago - Pushed at: 6 months ago - Stars: 8,130 - Forks: 769

allenai/CommonGen-Eval
Evaluating LLMs with CommonGen-Lite
Language: Python - Size: 1.28 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 89 - Forks: 3

armingh2000/FactScoreLite
FactScoreLite is an implementation of the FactScore metric, designed for detailed accuracy assessment in text generation. This package builds upon the framework provided by the original FactScore repository, which is no longer maintained and contains outdated functions.
Language: Python - Size: 674 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 1
