An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: document-search

infinilabs/coco-app

🥥 Coco AI App - Search, Connect, Collaborate, Personal AI Search and Assistant, all in one space.

Language: TypeScript - Size: 22.4 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 596 - Forks: 63

frankwiersma/pdf-chat-gemini

Intelligent PDF document analysis using Google Gemini AI with File Search capabilities

Language: Python - Size: 525 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

Shirokent/ai_doc_search_summarizer

📄 Empower document management with this FastAPI service that uploads, searches, and summarizes text documents using advanced NLP techniques.

Language: Python - Size: 1.29 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

ZachHunt359/JumpChainSearch

ASP.NET Core Blazor application for searching and managing JumpChain documents with Google Drive integration, community tag voting, and admin authentication

Language: C# - Size: 122 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

maharishiayurveda/DocQuify

Extract insights from research papers with DocQuify. Upload PDFs and ask questions for quick, accurate answers. 🌐📄 Explore AI-powered document processing today!

Language: TypeScript - Size: 283 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

flamehaven01/Flamehaven-Filesearch

Open-source semantic document search (RAG) engine with FastAPI and instant self-hosted deployment

Language: Python - Size: 2.61 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 73 - Forks: 8

Nishit00/document-qa-rag-system

📄 Transform documents into interactive AI conversations with ease, creating a searchable knowledge base for efficient information retrieval.

Language: Jupyter Notebook - Size: 24.6 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

ananyacanakapalli/PageWhisper

PageWhisper is an AI-powered chatbot that lets users upload PDFs or DOCX files and ask questions. It uses LangChain for text chunking, OpenAI embeddings with FAISS for vector search, and GPT-4o-mini for accurate, context-aware answers from the uploaded documents.

Language: Python - Size: 20.5 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

keeleydicotyledonous209/support-copilot-public

🔍 Search Jira and Confluence effortlessly from Slack with Support Copilot. Access tickets and documents without leaving your workspace.

Size: 1.3 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

infinilabs/coco-server

🥥 Coco AI Server - Search, Connect, Collaborate, AI-powered Enterprise Search, all in one space.

Language: TypeScript - Size: 28.4 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 121 - Forks: 28

deepsense-ai/ragbits

Building blocks for rapid development of GenAI applications

Language: Python - Size: 27.5 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 1,590 - Forks: 128

Se00n00/PaperRAG

This repository implements a modular Retrieval-Augmented Generation (RAG) pipeline with intelligent query routing, step-back prompting, and context-aware query decomposition for high-quality information retrieval. It includes redundancy filtering, cross-encoder–based document reranking, and controlled generation that leverages either retrieved cont

Language: TypeScript - Size: 79.9 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

Poolchaos/local-knowledge-search

Privacy-first AI-powered document search. Upload docs, search semantically with Transformers.js - all client-side, no data leaves your browser.

Language: TypeScript - Size: 166 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

valinsogna/free-rag-document-search

💰 Zero-cost RAG system for intelligent document search using Ollama local LLMs | Privacy-first | No API keys required

Language: Python - Size: 17.6 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

wesleyscholl/VoltAI

⚡🤖 VoltAI is a lightning-fast, Rust-powered local AI agent that answers questions, summarizes documents, and reasons over large datasets — all in milliseconds. Perfect for developers, engineers, and researchers who want an offline private AI - your data never leaves your machine 🛡️📂⚡

Language: Swift - Size: 117 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 3 - Forks: 1

G4brym/agentic-ai-search

Intelligent document search powered by agentic AI

Language: HTML - Size: 233 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

neehanthreddym/doc_query_rag

A basic RAG pipeline that uses the llama-3.1-8b model to answer user queries with external knowledge stored in a vector database.

Language: Jupyter Notebook - Size: 230 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

saskinosie/weaviate-claude-skills

Claude Skills for connecting Claude.ai to local Weaviate vector databases - manage collections, ingest data, and query with RAG

Size: 43.9 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

wise-saint/smartdoc.ai

RAG-powered multimodal Q&A application.

Language: Java - Size: 73.2 KB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 0 - Forks: 0

oxdev03/node-tantivy-binding

Node.js bindings for Tantivy. Provides indexing, querying, and advanced search features with TypeScript support.

Language: Rust - Size: 3.77 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 3 - Forks: 0

duluk/gdoc-tools

CLI tools for Google Docs: AI chatbot with two-tier architecture for efficient multi-document queries and analysis

Language: Python - Size: 32.2 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

neuml/paperai

📄 🤖 AI for medical and scientific papers

Language: Python - Size: 2.32 MB - Last synced at: 26 days ago - Pushed at: 5 months ago - Stars: 1,481 - Forks: 113

Michael-A-Kuykendall/contextlite

Database Freedom Platform - Mathematical search optimization for whatever database you already have. 27,000x faster than vector databases with SMT-powered search across 8+ database types. One-time 9-2999 vs 00-500/month recurring.

Language: Go - Size: 522 MB - Last synced at: 26 days ago - Pushed at: 2 months ago - Stars: 11 - Forks: 3

jbmiller10/semantik

Semantik is a self-hosted semantic search engine for your documents.

Language: Python - Size: 11.4 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2 - Forks: 0

praarishtech/support-copilot-public

Slack-integrated assistant for Jira and Confluence. Word-based search across spaces — without leaving Slack. Available for exclusive buyout.

Size: 3.91 KB - Last synced at: 25 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

aimaster-dev/SmartRAG

SmartRAG is a terminal-based RAG system using LangGraph. It processes queries by retrieving relevant content from markdown or PDFs, then responds using OpenAI GPT. Supports webpage-to-PDF conversion, vector DB search, and modular flow control.

Language: Python - Size: 51.8 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 5 - Forks: 0

lekt9/alBERT-launcher

AI-powered file launcher and semantic search assistant. Like Spotlight/Alfred but with advanced AI capabilities for understanding context and meaning. Features local processing, privacy-first design, and seamless integration with your workflow.

Language: TypeScript - Size: 8.22 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 14 - Forks: 0

AditSaxena/doc-qna-ai

AI-powered Document QnA app built with MERN + OpenAI + MongoDB Atlas + AWS S3. Upload documents, ask questions, and get answers with citations.

Language: JavaScript - Size: 115 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

kevinandersontech/ai_doc_search_summarizer

AI-powered document search and summarisation with FastAPI and Docker

Language: Python - Size: 4.88 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

manojcode242/ai-vision-rag

AI-powered Visual RAG system using Cohere Embed-4 and Google Gemini for intelligent insights from PDFs and images.

Language: Python - Size: 2.06 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Parado-xy/seroost

A content-based document search engine in Rust.

Language: Rust - Size: 3.6 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

ZohaibCodez/document-qa-rag-system

A simple Retrieval-Augmented Generation (RAG) project built with LangChain and Streamlit. Upload documents (PDF/TXT) and interact with them using natural language questions powered by embeddings and vector search.

Language: Jupyter Notebook - Size: 23.3 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

hconyeka/gideons-ai-assistant

Local, Offline, Document-Aware AI Assistant prototype designed for my Internship at The Gideons International

Language: Python - Size: 1.78 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

capjamesg/jamesql

An in-memory NoSQL database implemented in Python.

Language: Python - Size: 849 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 83 - Forks: 1

mishraanuraagx/ChatQ

Local Retrieval-Augmented Generation (RAG) system built with FastAPI, integrating vector search, Elasticsearch, and optional web search to power LLM-based intelligent question answering using models like Mistral or GPT-4.

Language: HTML - Size: 2.19 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 1

bent10/boox

Search anything, instantly

Language: TypeScript - Size: 43.5 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 5 - Forks: 1

poloclub/mememo

A JavaScript library that brings vector search and RAG to your browser!

Language: TypeScript - Size: 66.7 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 137 - Forks: 10

redis-developer/redis-arXiv-search

Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.

Language: Python - Size: 1000 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 147 - Forks: 26

Seufrid/finance-policy-chatbot

AI-powered finance policy chatbot with English/Bahasa Malaysia support for hospital employees

Language: Python - Size: 166 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

daac-tools/find-simdoc

Finding all pairs of similar documents time- and memory-efficiently

Language: Rust - Size: 225 KB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 61 - Forks: 3

rahulmittal901/streamlit-qdrant-app

Chat with your PDF documents using Streamlit, LlamaIndex, and Qdrant. Upload, embed, and search documents with a modern UI—containerized for easy deployment.

Language: Python - Size: 25.4 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

salameaz/pdf-process-rag

A Python-based application that extracts and processes PDF content using a Retrieval-Augmented Generation (RAG) approach. Leverage vector embeddings to enable efficient querying of both text-based and scanned PDFs, and interact with your documents using a large language model.

Language: Python - Size: 264 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

teilomillet/raggo

A lightweight, production-ready RAG (Retrieval Augmented Generation) library in Go.

Language: Go - Size: 430 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 155 - Forks: 9

laxmanclo/pany

PostgreSQL-native semantic search engine with multi-modal capabilities. Add AI-powered search to your existing database without separate vector databases, vendor fees, or complex setup. Features text + image search using CLIP embeddings, native SQL joins, and 10-minute Docker deployment.

Language: Python - Size: 130 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 11 - Forks: 0

Moez-lab/parallel-keyword-scanner

High-performance keyword scanner for text and PDF files with multiprocessing and a modern React UI.

Language: TypeScript - Size: 80.1 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

AliCedillo/ragbits

Ragbits provides essential tools for building GenAI applications quickly and efficiently. With features like seamless LLM switching and type-safe interactions, developers can enhance their projects with ease. 🐱💻✨

Language: Python - Size: 3.76 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

aimaster-dev/chatbot-using-rag-and-langchain

Chat with your PDFs using AI! This Streamlit app uses RAG, LangChain, FAISS, and OpenAI to let you ask questions and get answers with page and file references.

Language: Python - Size: 15.5 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 11 - Forks: 0

GoodGuyAdy/QueryBaseAI

AI-powered hybrid search engine combining keyword, vector, and LLM-based contextual search using RAG with support for AI21, OpenAI or any other LLM.

Language: Python - Size: 39.1 KB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 2 - Forks: 0

jankovicsandras/plpgsql_bm25

BM25 search implemented in PL/pgSQL

Language: Jupyter Notebook - Size: 1.26 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 42 - Forks: 0

jankovicsandras/bm25opt

Faster BM25 search algorithms in Python

Language: Jupyter Notebook - Size: 69.3 KB - Last synced at: 8 months ago - Pushed at: about 1 year ago - Stars: 20 - Forks: 1

tomlin7/AI-research-assistant

Semantic document search system with pgvector and PGAI

Language: Python - Size: 50.8 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 1

kcubeterm/achoz

Search through all your personal data efficiently like web search.

Language: Python - Size: 1.74 MB - Last synced at: 25 days ago - Pushed at: almost 3 years ago - Stars: 80 - Forks: 6

JumjumiAsbullah-08/KMS-V01

📚 Knowledge Management System (KMS), document-management based: a web-based application for managing, storing, and searching documents efficiently using plain PHP and MySQL.

Language: PHP - Size: 53 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

gsidhu/buzee-releases

Public releases for Buzee

Size: 20.5 KB - Last synced at: 9 months ago - Pushed at: 11 months ago - Stars: 6 - Forks: 0

zayedrais/DocumentSearchEngine

Document Search Engine project with TF-IDF and the Google Universal Sentence Encoder model

Language: Jupyter Notebook - Size: 28.6 MB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 53 - Forks: 24

robindekoster/chatgpt-custom-knowledge-chatbot

This open source chatbot project lets you create a chatbot that uses your own data to answer questions, thanks to the power of the OpenAI GPT-3.5 model.

Language: Python - Size: 63.5 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 120 - Forks: 34

SamJoeSilvano/Multi-Source-Knowledge-Retrieval-System

An end-to-end multi-source knowledge retrieval system using LangChain, FAISS, and OpenAI embeddings. This Retrieval-Augmented Generation (RAG) pipeline intelligently searches across Wikipedia, arXiv, and custom websites, optimizing source selection and delivering precise, real-time results based on query relevance.

Language: Jupyter Notebook - Size: 5.03 MB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

kyr0/clientside-search

A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.

Language: TypeScript - Size: 1.58 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 1

easonlai/chatbot_with_pdf_streamlit

This code example shows how to make a chatbot for semantic search over documents using Streamlit, LangChain, and various vector databases. The chatbot lets users ask questions and get answers from a document collection. The code is in Python and can be customized for different scenarios and data.

Language: Jupyter Notebook - Size: 6.57 MB - Last synced at: 7 months ago - Pushed at: about 2 years ago - Stars: 15 - Forks: 5

Qyokizzzz/simhash

The extended version of simhash supports fingerprint extraction of documents and images.

Language: Python - Size: 551 KB - Last synced at: 8 months ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

opengento/magento2-document-product-search

This module aims to make documents searchable with product keywords in Magento 2.

Language: PHP - Size: 13.7 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

opengento/magento2-document-search

This module aims to make documents searchable for customers in Magento 2.

Language: PHP - Size: 30.3 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 1

krisluczka/OSSE

Open Source Search Engine with built-in web/document crawler and an indexing method.

Language: C++ - Size: 58.6 KB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

lethalbit/bookwurm

dead simple document index and search, nothing fancy

Language: Python - Size: 20.5 KB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

EricSchoebel/DocSpector

Keyword finder for texts in documents of a directory (for English, see README-en.md)

Language: Python - Size: 103 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

EmirhanSyl/TheBSTSearchEngine

Mini desktop search engine with Binary Search Tree

Language: Java - Size: 56.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

HarshKothari21/Natural-Language-Processing-Specialization

NLP course by DeepLearning.AI, powered by @coursera. Taught by Younes Bensouda Mourri, Instructor of AI at Stanford University, and Łukasz Kaiser, Staff Research Scientist at Google Brain.

Language: Jupyter Notebook - Size: 397 KB - Last synced at: 11 months ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 0

neuml/cord19q 📦

COVID-19 Open Research Dataset (CORD-19) Analysis

Language: Python - Size: 1.47 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 55 - Forks: 19

harishartanto/information-retrieval

Information retrieval of text document using TF-IDF weighting & Cosine Similarity Algorithm

Language: Python - Size: 44.9 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

mdietrichstein/ir-search-engine-rust

Rust-based text search engine from scratch supporting multiple document similarity metrics (TF-IDF, BM25, BM25VA)

Language: Rust - Size: 132 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0

domwal/acervo-digital-pessoal

PHP website that indexes all PDF content and provides an easy way to find any text

Language: PHP - Size: 15.1 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

liviobisogni/solr-ocr-indexing

Apache Solr Document Search and Indexing Analysis with OCR

Language: Java - Size: 2.37 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

AI-STACK-dev/Covid19-Comorbidities-NLP-WEB

COVID-19 comorbidities analysis platform based on Natural Language Processing (NLP)

Language: JavaScript - Size: 11 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 2