Topic: "rag-evaluation"
Giskard-AI/giskard
🐢 Open-Source Evaluation & Testing for AI & LLM systems
Language: Python - Size: 176 MB - Last synced at: 12 days ago - Pushed at: about 1 month ago - Stars: 4,582 - Forks: 324

Marker-Inc-Korea/AutoRAG
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
Language: Python - Size: 70.7 MB - Last synced at: 22 days ago - Pushed at: about 1 month ago - Stars: 3,937 - Forks: 309

Agenta-AI/agenta
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
Language: Python - Size: 171 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2,795 - Forks: 330

frutik/Awesome-RAG
Size: 153 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 330 - Forks: 25

vectara/open-rag-eval
Open source RAG evaluation package
Language: Python - Size: 2.18 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 206 - Forks: 11

LLAMATOR-Core/llamator
Framework for testing vulnerabilities of large language models (LLM).
Language: Python - Size: 4.31 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 114 - Forks: 9

mts-ai/rurage
Language: Python - Size: 3.85 MB - Last synced at: 26 days ago - Pushed at: about 2 months ago - Stars: 27 - Forks: 0

oztrkoguz/RAG-Framework-Evaluation
This project aims to compare different Retrieval-Augmented Generation (RAG) frameworks in terms of speed and performance.
Language: Python - Size: 289 KB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 14 - Forks: 0

ioannis-papadimitriou/rag-playground
A framework for systematic evaluation of retrieval strategies and prompt engineering in RAG systems, featuring an interactive chat interface for document analysis.
Language: Python - Size: 771 KB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 9 - Forks: 2

rostyslavshovak/RAG-Retrieval-Augmented-Generation
RAG Chatbot for Financial Analysis
Language: Python - Size: 1.55 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 6 - Forks: 0

simranjeet97/Learn_RAG_from_Scratch_LLM
Learn Retrieval-Augmented Generation (RAG) from Scratch using LLMs from Hugging Face and LangChain or Python
Language: Jupyter Notebook - Size: 425 KB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 5 - Forks: 3

shaadclt/EvalRAG
A comprehensive evaluation toolkit for assessing Retrieval-Augmented Generation (RAG) outputs using linguistic, semantic, and fairness metrics
Language: Python - Size: 32.2 KB - Last synced at: 21 days ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

fkapsahili/EntRAG
EntRAG - Enterprise RAG Benchmark
Language: Python - Size: 168 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

Kaos599/BetterRAG
BetterRAG: Powerful RAG evaluation toolkit for LLMs. Measure, analyze, and optimize how your AI processes text chunks with precision metrics. Perfect for RAG systems, document processing, and embedding quality assessment.
Language: Python - Size: 104 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

AnasAber/MLflow_with_RAG
Using MLflow to deploy your RAG pipeline with LlamaIndex, LangChain, and Ollama/Hugging Face LLMs/Groq
Language: Python - Size: 60.7 MB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 1

keitabroadwater/llm-eval-lab
A web sandbox for hands-on learning of LLM and RAG Evaluation
Language: TypeScript - Size: 67.4 KB - Last synced at: 27 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Gian207/RAG-lego-like-component
Proposal for industry RAG evaluation: Generative Universal Evaluation of LLMs and Information retrieval
Language: Python - Size: 2.37 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ash-hun/RAGNAROK
RAGNAROK: E2E RAG Sub-Module Framework
Language: Python - Size: 0 Bytes - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

TajaKuzman/pandachat-rag-benchmark
PandaChat-RAG benchmark for evaluation of RAG systems on a non-synthetic Slovenian test dataset.
Language: Python - Size: 842 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

jhaayush2004/RAG-Evaluation
Different approaches to evaluating RAG
Language: Jupyter Notebook - Size: 224 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0
