GitHub topics: ai-evaluation-tools

Repositories

raga-ai-hub/RagaAI-Catalyst

Python SDK for Agent AI Observability, Monitoring and Evaluation Framework. Features include agent, LLM, and tool tracing; debugging of multi-agent systems; a self-hosted dashboard; and advanced analytics with timeline and execution graph views.

Language: Python - Size: 55.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 16,161 - Forks: 3,781

petmal/MindTrial

MindTrial: Evaluate and compare AI language models (LLMs) on text-based tasks with optional file/image attachments. Supports multiple providers (OpenAI, Google, Anthropic, DeepSeek), custom tasks in YAML, and HTML/CSV reports.

Language: Go - Size: 143 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

Related Keywords

ai-evaluation-tools (2), yaml-configuration (1), opensource (1), openai (1), nlp (1), mozilla-public-license (1), llm-evaluation-framework (1), llm-comparison (1), llm-benchmarking (1), language-models-ai (1), html-reports (1), google-gemini-ai (1), golang-cli (1), deepseek (1), customizable (1), csv-reports (1), anthropic (1), ai-tool (1), ai-model-comparison (1), ai-benchmark (1), llmops (1), llm-tracing (1), llm-testing (1), ai-tool-interaction-monitoring (1), ai-performance-optimization (1), ai-application-debugging (1), ai-agent-monitoring (1), agents (1), agentneo (1), agentic-ai-development (1), agentic-ai (1)
