Topic: "llm-test"
uptrain-ai/uptrain
UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, and embedding use cases), perform root cause analysis on failure cases, and give insights on how to resolve them.
Language: Python - Size: 36.9 MB - Last synced at: 3 days ago - Pushed at: 9 months ago - Stars: 2,263 - Forks: 199

georgian-io/LLM-Finetuning-Toolkit
Toolkit for fine-tuning, ablating and unit-testing open-source LLMs.
Language: Python - Size: 32.7 MB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 838 - Forks: 99

JohnSnowLabs/langtest
Deliver safe & effective language models
Language: Python - Size: 157 MB - Last synced at: 2 days ago - Pushed at: 8 days ago - Stars: 522 - Forks: 46

athina-ai/athina-sdk
LLM Testing SDK that helps you write and run tests to monitor your LLM app in production
Language: Python - Size: 119 KB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 131 - Forks: 1

levitation-opensource/Manipulative-Expression-Recognition
MER is software that identifies and highlights manipulative communication in text from human conversations and AI-generated responses. MER benchmarks language models for manipulative expressions, fostering the development of transparency and safety in AI. It also supports manipulation victims by detecting manipulative patterns in human communication.
Language: HTML - Size: 8.54 MB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 13 - Forks: 3

pyladiesams/eval-llm-based-apps-jan2025
Create an evaluation framework for your LLM-based app. Incorporate it into your test suite. Lay the monitoring foundation.
Language: Jupyter Notebook - Size: 11.6 MB - Last synced at: 4 days ago - Pushed at: 11 days ago - Stars: 7 - Forks: 5

prompt-foundry/typescript-sdk
The prompt engineering, prompt management, and prompt evaluation tool for TypeScript, JavaScript, and NodeJS.
Language: TypeScript - Size: 20.9 MB - Last synced at: 11 days ago - Pushed at: 8 months ago - Stars: 6 - Forks: 1

prompt-foundry/go-sdk
The prompt engineering, prompt management, and prompt evaluation tool for Go.
Size: 1000 Bytes - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Coldwave96/LLM-Sec-Evaluation
Scripts for evaluating LLM security abilities.
Language: Python - Size: 393 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

awesome-software/nlptest (fork of JohnSnowLabs/langtest)
Deliver safe & effective language models
Size: 106 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

awesome-software/promptfoo (fork of promptfoo/promptfoo)
Test your prompts. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality.
Size: 961 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0
