GitHub topics: rag-evaluation
Giskard-AI/giskard
🐢 Open-Source Evaluation & Testing for AI & LLM systems
Language: Python - Size: 175 MB - Last synced at: about 10 hours ago - Pushed at: 1 day ago - Stars: 4,486 - Forks: 318

vectara/open-rag-eval
Open source RAG evaluation package
Language: Python - Size: 804 KB - Last synced at: about 6 hours ago - Pushed at: 1 day ago - Stars: 117 - Forks: 8

Agenta-AI/agenta
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
Language: TypeScript - Size: 163 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,606 - Forks: 305

LLAMATOR-Core/llamator
Framework for testing vulnerabilities of large language models (LLMs).
Language: Python - Size: 2.82 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 98 - Forks: 9

frutik/Awesome-RAG
Size: 153 KB - Last synced at: about 8 hours ago - Pushed at: 8 months ago - Stars: 326 - Forks: 25

Marker-Inc-Korea/AutoRAG
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
Language: Python - Size: 70 MB - Last synced at: 4 days ago - Pushed at: about 2 months ago - Stars: 3,833 - Forks: 305

simranjeet97/Learn_RAG_from_Scratch_LLM
Learn Retrieval-Augmented Generation (RAG) from Scratch using LLMs from Hugging Face and LangChain or Python
Language: Jupyter Notebook - Size: 425 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 5 - Forks: 3

mts-ai/rurage
Language: Python - Size: 3.85 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 27 - Forks: 0

Kaos599/BetterRAG
BetterRAG: Powerful RAG evaluation toolkit for LLMs. Measure, analyze, and optimize how your AI processes text chunks with precision metrics. Perfect for RAG systems, document processing, and embedding quality assessment.
Language: Python - Size: 104 KB - Last synced at: 27 days ago - Pushed at: 28 days ago - Stars: 1 - Forks: 0

oztrkoguz/RAG-Framework-Evaluation
This project aims to compare different Retrieval-Augmented Generation (RAG) frameworks in terms of speed and performance.
Language: Python - Size: 289 KB - Last synced at: 17 days ago - Pushed at: 9 months ago - Stars: 14 - Forks: 0

rostyslavshovak/RAG-Retrieval-Augmented-Generation
RAG Chatbot for Financial Analysis
Language: Python - Size: 1.55 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 6 - Forks: 0

keitabroadwater/llm-eval-lab
A web sandbox for hands-on learning of LLM and RAG Evaluation
Language: Python - Size: 5.86 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Gian207/RAG-lego-like-component
Proposal for industry RAG evaluation: Generative Universal Evaluation of LLMs and Information retrieval
Language: Python - Size: 2.37 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ioannis-papadimitriou/rag-playground
A framework for systematic evaluation of retrieval strategies and prompt engineering in RAG systems, featuring an interactive chat interface for document analysis.
Language: Python - Size: 771 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 9 - Forks: 2

AnasAber/MLflow_with_RAG
Using MLflow to deploy your RAG pipeline with LlamaIndex, LangChain, and Ollama/Hugging Face LLMs/Groq
Language: Python - Size: 60.7 MB - Last synced at: 17 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 1

ash-hun/RAGNAROK
RAGNAROK: E2E RAG Sub-Module Framework
Language: Python - Size: 0 Bytes - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

TajaKuzman/pandachat-rag-benchmark
PandaChat-RAG benchmark for evaluation of RAG systems on a non-synthetic Slovenian test dataset.
Language: Python - Size: 842 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

jhaayush2004/RAG-Evaluation
Different approaches to evaluating RAG
Language: Jupyter Notebook - Size: 224 KB - Last synced at: 11 days ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0
