GitHub topics: document-search
infinilabs/coco-app
🥥 Coco AI App - Search, Connect, Collaborate, Your Personal AI Search and Assistant, all in one space.
language: TypeScript - size: 20.7 MB - last synced: 1 day ago - saved: 1 day ago - stars: 428 - forks: 47

deepsense-ai/ragbits
Building blocks for rapid development of GenAI applications
language: Python - size: 9.59 MB - last synced: 4 days ago - saved: 4 days ago - stars: 70 - forks: 8

poloclub/mememo
A JavaScript library that brings vector search and RAG to your browser!
language: TypeScript - size: 66.7 MB - last synced: 4 days ago - saved: 10 months ago - stars: 121 - forks: 10

aimaster-dev/SmartRAG
SmartRAG is a terminal-based RAG system using LangGraph. It processes queries by retrieving relevant content from markdown or PDFs, then responds using OpenAI GPT. Supports webpage-to-PDF conversion, vector DB search, and modular flow control.
language: Python - size: 51.8 MB - last synced: 8 days ago - saved: 8 days ago - stars: 0 - forks: 0

infinilabs/coco-server
🥥 Coco AI Server - Search, Connect, Collaborate, AI-powered enterprise search, all in one space.
language: TypeScript - size: 13.1 MB - last synced: 8 days ago - saved: 8 days ago - stars: 77 - forks: 19

aimaster-dev/chatbot-using-rag-and-langchain
Chat with your PDFs using AI! This Streamlit app uses RAG, LangChain, FAISS, and OpenAI to let you ask questions and get answers with page and file references.
language: Python - size: 15.5 MB - last synced: 6 days ago - saved: 9 days ago - stars: 11 - forks: 0

neuml/paperai
📄 🤖 Semantic search and workflows for medical/scientific papers
language: Python - size: 1.73 MB - last synced: 18 days ago - saved: about 2 months ago - stars: 1,398 - forks: 110

GoodGuyAdy/QueryBaseAI
AI-powered hybrid search engine combining keyword, vector, and LLM-based contextual search using RAG with support for AI21, OpenAI or any other LLM.
language: Python - size: 39.1 KB - last synced: about 9 hours ago - saved: about a month ago - stars: 2 - forks: 0

lekt9/alBERT-launcher
AI-powered file launcher and semantic search assistant. Like Spotlight/Alfred but with advanced AI capabilities for understanding context and meaning. Features local processing, privacy-first design, and seamless integration with your workflow.
language: TypeScript - size: 8 MB - last synced: 25 days ago - saved: 5 months ago - stars: 11 - forks: 0

Parado-xy/seroost
A content-based document search engine in Rust.
language: Rust - size: 3.58 MB - last synced: about 2 months ago - saved: about 2 months ago - stars: 1 - forks: 0

capjamesg/jamesql
An in-memory NoSQL database implemented in Python.
language: Python - size: 849 KB - last synced: 6 days ago - saved: 4 months ago - stars: 84 - forks: 1

redis-developer/redis-arXiv-search
Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.
language: Python - size: 1000 KB - last synced: 14 days ago - saved: about 2 months ago - stars: 144 - forks: 24

jankovicsandras/plpgsql_bm25
BM25 search implemented in PL/pgSQL
language: Jupyter Notebook - size: 1.26 MB - last synced: about 2 months ago - saved: about 2 months ago - stars: 42 - forks: 0

jankovicsandras/bm25opt
faster BM25 search algorithms in Python
language: Jupyter Notebook - size: 69.3 KB - last synced: about 2 months ago - saved: 7 months ago - stars: 20 - forks: 1

teilomillet/raggo
A lightweight, production-ready RAG (Retrieval Augmented Generation) library in Go.
language: Go - size: 567 KB - last synced: about 2 months ago - saved: 7 months ago - stars: 55 - forks: 3

mishraanuraagx/ChatQ
Local Retrieval-Augmented Generation (RAG) system built with FastAPI, integrating vector search, Elasticsearch, and optional web search to power LLM-based intelligent question answering using models like Mistral or GPT-4.
language: HTML - size: 2.19 MB - last synced: 9 days ago - saved: 2 months ago - stars: 0 - forks: 1

daac-tools/find-simdoc
Finding all pairs of similar documents time- and memory-efficiently
language: Rust - size: 225 KB - last synced: 5 days ago - saved: 3 months ago - stars: 60 - forks: 3

tomlin7/AI-research-assistant
Semantic document search system with pgvector and PGAI
language: Python - size: 50.8 KB - last synced: about a month ago - saved: 7 months ago - stars: 2 - forks: 2

kcubeterm/achoz
Search through all your personal data efficiently like web search.
language: Python - size: 1.74 MB - last synced: about a month ago - saved: over 2 years ago - stars: 80 - forks: 5

RozhakXD/DocHunter
language: Python - size: 53.7 KB - last synced: 3 months ago - saved: 5 months ago - stars: 2 - forks: 0

JumjumiAsbullah-08/KMS-V01
📚 Knowledge Management System (KMS) - Document Management Based. A web-based application for managing, storing, and searching documents efficiently using plain PHP and MySQL.
language: PHP - size: 53 MB - last synced: 3 months ago - saved: 4 months ago - stars: 1 - forks: 0

bent10/boox
Search anything, instantly
language: TypeScript - size: 43.6 MB - last synced: 4 days ago - saved: 5 days ago - stars: 5 - forks: 1

gsidhu/buzee-releases
Public releases for Buzee
size: 20.5 KB - last synced: 3 months ago - saved: 6 months ago - stars: 6 - forks: 0

zayedrais/DocumentSearchEngine
Document Search Engine project with TF-IDF and Google Universal Sentence Encoder model
language: Jupyter Notebook - size: 28.6 MB - last synced: about a month ago - saved: about 2 years ago - stars: 53 - forks: 24

robindekoster/chatgpt-custom-knowledge-chatbot
This open source chatbot project lets you create a chatbot that uses your own data to answer questions, thanks to the power of the OpenAI GPT-3.5 model.
language: Python - size: 63.5 KB - last synced: 6 months ago - saved: almost 2 years ago - stars: 120 - forks: 34

SamJoeSilvano/Multi-Source-Knowledge-Retrieval-System
An end-to-end multi-source knowledge retrieval system using LangChain, FAISS, and OpenAI embeddings. This Retrieval-Augmented Generation (RAG) pipeline intelligently searches across Wikipedia, arXiv, and custom websites, optimizing source selection and delivering precise, real-time results based on query relevance.
language: Jupyter Notebook - size: 5.03 MB - last synced: 2 days ago - saved: 7 months ago - stars: 0 - forks: 0

kyr0/clientside-search
A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.
language: TypeScript - size: 1.58 MB - last synced: 12 days ago - saved: almost 2 years ago - stars: 10 - forks: 0

easonlai/chatbot_with_pdf_streamlit
This code example shows how to make a chatbot for semantic search over documents using Streamlit, LangChain, and various vector databases. The chatbot lets users ask questions and get answers from a document collection. The code is in Python and can be customized for different scenarios and data.
language: Jupyter Notebook - size: 6.57 MB - last synced: about a month ago - saved: almost 2 years ago - stars: 15 - forks: 5

Qyokizzzz/simhash
The extended version of simhash supports fingerprint extraction of documents and images.
language: Python - size: 551 KB - last synced: 2 months ago - saved: almost 3 years ago - stars: 2 - forks: 0

opengento/magento2-document-product-search
This module aims to make documents searchable with product keywords in Magento 2.
language: PHP - size: 13.7 KB - last synced: 9 days ago - saved: 10 months ago - stars: 2 - forks: 1

opengento/magento2-document-search
This module aims to make documents searchable for customers in Magento 2.
language: PHP - size: 30.3 KB - last synced: 15 days ago - saved: 10 months ago - stars: 3 - forks: 1

krisluczka/OSSE
Open Source Search Engine with built-in web/document crawler and an indexing method.
language: C++ - size: 58.6 KB - last synced: about 2 months ago - saved: about a year ago - stars: 0 - forks: 0

lethalbit/bookwurm
dead simple document index and search, nothing fancy
language: Python - size: 20.5 KB - last synced: 2 months ago - saved: about a year ago - stars: 6 - forks: 0

EricSchoebel/DocSpector
Keyword finder for texts in documents of a directory (for English, see README-en.md)
language: Python - size: 103 KB - last synced: 11 months ago - saved: 11 months ago - stars: 0 - forks: 0

EmirhanSyl/TheBSTSearchEngine
Mini desktop search engine with Binary Search Tree
language: Java - size: 56.6 KB - last synced: about a year ago - saved: about a year ago - stars: 1 - forks: 0

HarshKothari21/Natural-Language-Processing-Specialization
NLP course by DeepLearning.AI, powered by @coursera. Taught by Younes Bensouda Mourri, Instructor of AI at Stanford University, and Łukasz Kaiser, Staff Research Scientist at Google Brain.
language: Jupyter Notebook - size: 397 KB - last synced: 5 months ago - saved: almost 5 years ago - stars: 2 - forks: 0

neuml/cord19q 📦
COVID-19 Open Research Dataset (CORD-19) Analysis
language: Python - size: 1.47 MB - last synced: over a year ago - saved: over 2 years ago - stars: 55 - forks: 19

harishartanto/information-retrieval
Information retrieval of text document using TF-IDF weighting & Cosine Similarity Algorithm
language: Python - size: 44.9 KB - last synced: about 2 years ago - saved: about 2 years ago - stars: 0 - forks: 1

mdietrichstein/ir-search-engine-rust
Rust-based text search engine from scratch supporting multiple document similarity metrics (TF-IDF, BM25, BM25VA)
language: Rust - size: 132 KB - last synced: 5 days ago - saved: about 4 years ago - stars: 5 - forks: 0

domwal/acervo-digital-pessoal
A PHP website that indexes all PDF content and makes it easy to find any text
language: PHP - size: 15.1 MB - last synced: 5 days ago - saved: about a year ago - stars: 0 - forks: 1

liviobisogni/solr-ocr-indexing
Apache Solr Document Search and Indexing Analysis with OCR
language: Java - size: 2.37 MB - last synced: about 2 years ago - saved: about 2 years ago - stars: 0 - forks: 0

AI-STACK-dev/Covid19-Comorbidities-NLP-WEB
COVID-19 comorbidities analysis platform based on Natural Language Processing (NLP)
language: JavaScript - size: 11 MB - last synced: over 2 years ago - saved: over 3 years ago - stars: 1 - forks: 2
