Topic: "llm-evals"
The-Swarm-Corporation/StatisticalModelEvaluator
An implementation of Anthropic's paper and essay on "A statistical approach to model evaluations"
Language: Python - Size: 2.32 MB - Last synced at: 2 days ago - Pushed at: 27 days ago - Stars: 16 - Forks: 1

pyladiesams/eval-llm-based-apps-jan2025
Create an evaluation framework for your LLM-based app. Incorporate it into your test suite. Lay the monitoring foundation.
Language: Jupyter Notebook - Size: 11.6 MB - Last synced at: 24 days ago - Pushed at: 4 months ago - Stars: 7 - Forks: 5

kevinschaul/llm-evals
Because we should all have our own set of LLM evals.
Language: Python - Size: 11.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 1
