GitHub topics: document-search
neuml/paperai
📄 🤖 Semantic search and workflows for medical/scientific papers
Language: Python - Size: 1.73 MB - Last synced at: about 1 hour ago - Pushed at: about 2 hours ago - Stars: 1,394 - Forks: 108

Parado-xy/seroost
A content-based document search engine in Rust.
Language: Rust - Size: 3.58 MB - Last synced at: about 19 hours ago - Pushed at: about 19 hours ago - Stars: 1 - Forks: 0

infinilabs/coco-server
🥥 Coco AI Server - Search, Connect, Collaborate, AI-powered enterprise search, all in one space.
Language: TypeScript - Size: 7.56 MB - Last synced at: about 14 hours ago - Pushed at: about 15 hours ago - Stars: 54 - Forks: 14

infinilabs/coco-app
🥥 Coco AI App - Search, Connect, Collaborate, Your Personal AI Search and Assistant, all in one space.
Language: TypeScript - Size: 18.2 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 338 - Forks: 37

redis-developer/redis-arXiv-search
Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.
Language: Python - Size: 1000 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 144 - Forks: 23

jankovicsandras/plpgsql_bm25
BM25 search implemented in PL/pgSQL
Language: Jupyter Notebook - Size: 1.26 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 42 - Forks: 0

GoodGuyAdy/QueryBaseAI
AI-powered hybrid search engine combining keyword, vector, and LLM-based contextual search using RAG with support for AI21, OpenAI or any other LLM.
Language: Python - Size: 31.3 KB - Last synced at: 6 days ago - Pushed at: 8 days ago - Stars: 2 - Forks: 0

deepsense-ai/ragbits
Building blocks for rapid development of GenAI applications
Language: Python - Size: 7.23 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 65 - Forks: 8

jankovicsandras/bm25opt
Faster BM25 search algorithms in Python
Language: Jupyter Notebook - Size: 69.3 KB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 20 - Forks: 1

teilomillet/raggo
A lightweight, production-ready RAG (Retrieval Augmented Generation) library in Go.
Language: Go - Size: 567 KB - Last synced at: 10 days ago - Pushed at: 5 months ago - Stars: 55 - Forks: 3

capjamesg/jamesql
An in-memory NoSQL database implemented in Python.
Language: Python - Size: 849 KB - Last synced at: 9 days ago - Pushed at: 2 months ago - Stars: 83 - Forks: 1

poloclub/mememo
A JavaScript library that brings vector search and RAG to your browser!
Language: TypeScript - Size: 66.7 MB - Last synced at: 29 days ago - Pushed at: 8 months ago - Stars: 105 - Forks: 10

daac-tools/find-simdoc
Finding all pairs of similar documents time- and memory-efficiently
Language: Rust - Size: 225 KB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 60 - Forks: 3

tomlin7/AI-research-assistant
Semantic document search system with pgvector and PGAI
Language: Python - Size: 50.8 KB - Last synced at: 21 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 1

kcubeterm/achoz
Search through all your personal data efficiently like web search.
Language: Python - Size: 1.74 MB - Last synced at: 19 days ago - Pushed at: about 2 years ago - Stars: 80 - Forks: 5

RozhakXD/DocHunter
Language: Python - Size: 53.7 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

JumjumiAsbullah-08/KMS-V01
📚 Knowledge Management System (KMS) - Document Management Based. A web-based application for managing, storing, and searching documents efficiently using plain PHP and MySQL.
Language: PHP - Size: 53 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

bent10/boox
Search anything, instantly
Language: TypeScript - Size: 43.6 MB - Last synced at: 11 days ago - Pushed at: 13 days ago - Stars: 5 - Forks: 1

lekt9/alBERT-launcher
AI-powered file launcher and semantic search assistant. Like Spotlight/Alfred but with advanced AI capabilities for understanding context and meaning. Features local processing, privacy-first design, and seamless integration with your workflow.
Language: TypeScript - Size: 8 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 7 - Forks: 0

gsidhu/buzee-releases
Public releases for Buzee
Size: 20.5 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 6 - Forks: 0

zayedrais/DocumentSearchEngine
Document Search Engine project with TF-IDF and the Google Universal Sentence Encoder model
Language: Jupyter Notebook - Size: 28.6 MB - Last synced at: 15 days ago - Pushed at: almost 2 years ago - Stars: 53 - Forks: 24

robindekoster/chatgpt-custom-knowledge-chatbot
This open source chatbot project lets you create a chatbot that uses your own data to answer questions, thanks to the power of the OpenAI GPT-3.5 model.
Language: Python - Size: 63.5 KB - Last synced at: 5 months ago - Pushed at: almost 2 years ago - Stars: 120 - Forks: 34

SamJoeSilvano/Multi-Source-Knowledge-Retrieval-System
An end-to-end multi-source knowledge retrieval system using LangChain, FAISS, and OpenAI embeddings. This Retrieval-Augmented Generation (RAG) pipeline intelligently searches across Wikipedia, arXiv, and custom websites, optimizing source selection and delivering precise, real-time results based on query relevance.
Language: Jupyter Notebook - Size: 5.03 MB - Last synced at: 14 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

kyr0/clientside-search
A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.
Language: TypeScript - Size: 1.58 MB - Last synced at: 14 days ago - Pushed at: almost 2 years ago - Stars: 10 - Forks: 0

easonlai/chatbot_with_pdf_streamlit
This code example shows how to make a chatbot for semantic search over documents using Streamlit, LangChain, and various vector databases. The chatbot lets users ask questions and get answers from a document collection. The code is in Python and can be customized for different scenarios and data.
Language: Jupyter Notebook - Size: 6.57 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 5

Qyokizzzz/simhash
The extended version of simhash supports fingerprint extraction of documents and images.
Language: Python - Size: 551 KB - Last synced at: 20 days ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

opengento/magento2-document-product-search
This module aims to make documents searchable with product keywords in Magento 2.
Language: PHP - Size: 13.7 KB - Last synced at: 6 days ago - Pushed at: 8 months ago - Stars: 2 - Forks: 1

opengento/magento2-document-search
This module aims to make documents searchable for customers in Magento 2.
Language: PHP - Size: 30.3 KB - Last synced at: 10 days ago - Pushed at: 8 months ago - Stars: 3 - Forks: 1

krisluczka/OSSE
Open Source Search Engine with built-in web/document crawler and an indexing method.
Language: C++ - Size: 58.6 KB - Last synced at: 6 days ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

lethalbit/bookwurm
dead simple document index and search, nothing fancy
Language: Python - Size: 20.5 KB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 0

EricSchoebel/DocSpector
Keyword Finder for Texts in Documents of a Directory (for English, see README-en.md)
Language: Python - Size: 103 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

EmirhanSyl/TheBSTSearchEngine
Mini desktop search engine with Binary Search Tree
Language: Java - Size: 56.6 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

HarshKothari21/Natural-Language-Processing-Specialization
NLP course by DeepLearning.AI, powered by @coursera. Taught by Younes Bensouda Mourri, Instructor of AI at Stanford University, and Łukasz Kaiser, Staff Research Scientist at Google Brain.
Language: Jupyter Notebook - Size: 397 KB - Last synced at: 4 months ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

neuml/cord19q 📦
COVID-19 Open Research Dataset (CORD-19) Analysis
Language: Python - Size: 1.47 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 55 - Forks: 19

harishartanto/information-retrieval
Information retrieval of text documents using TF-IDF weighting and the cosine similarity algorithm
Language: Python - Size: 44.9 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1

mdietrichstein/ir-search-engine-rust
Rust-based text search engine from scratch supporting multiple document similarity metrics (TF-IDF, BM25, BM25VA)
Language: Rust - Size: 132 KB - Last synced at: 4 days ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 0

domwal/acervo-digital-pessoal
PHP website that indexes all PDF content and provides an easy way to find any text
Language: PHP - Size: 15.1 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 1

liviobisogni/solr-ocr-indexing
Apache Solr Document Search and Indexing Analysis with OCR
Language: Java - Size: 2.37 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

AI-STACK-dev/Covid19-Comorbidities-NLP-WEB
COVID-19 comorbidities analysis platform based on Natural Language Processing (NLP)
Language: JavaScript - Size: 11 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 2
