An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: document-search

neuml/paperai

📄 🤖 Semantic search and workflows for medical/scientific papers

Language: Python - Size: 1.73 MB - Last synced at: about 1 hour ago - Pushed at: about 2 hours ago - Stars: 1,394 - Forks: 108

Parado-xy/seroost

A content-based document search engine in Rust.

Language: Rust - Size: 3.58 MB - Last synced at: about 19 hours ago - Pushed at: about 19 hours ago - Stars: 1 - Forks: 0

infinilabs/coco-server

🥥 Coco AI Server - Search, Connect, Collaborate, AI-powered enterprise search, all in one space.

Language: TypeScript - Size: 7.56 MB - Last synced at: about 14 hours ago - Pushed at: about 15 hours ago - Stars: 54 - Forks: 14

infinilabs/coco-app

🥥 Coco AI App - Search, Connect, Collaborate, Your Personal AI Search and Assistant, all in one space.

Language: TypeScript - Size: 18.2 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 338 - Forks: 37

redis-developer/redis-arXiv-search

Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.

Language: Python - Size: 1000 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 144 - Forks: 23

jankovicsandras/plpgsql_bm25

BM25 search implemented in PL/pgSQL

Language: Jupyter Notebook - Size: 1.26 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 42 - Forks: 0

GoodGuyAdy/QueryBaseAI

AI-powered hybrid search engine combining keyword, vector, and LLM-based contextual search using RAG with support for AI21, OpenAI or any other LLM.

Language: Python - Size: 31.3 KB - Last synced at: 6 days ago - Pushed at: 8 days ago - Stars: 2 - Forks: 0

deepsense-ai/ragbits

Building blocks for rapid development of GenAI applications

Language: Python - Size: 7.23 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 65 - Forks: 8

jankovicsandras/bm25opt

Faster BM25 search algorithms in Python

Language: Jupyter Notebook - Size: 69.3 KB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 20 - Forks: 1

teilomillet/raggo

A lightweight, production-ready RAG (Retrieval Augmented Generation) library in Go.

Language: Go - Size: 567 KB - Last synced at: 10 days ago - Pushed at: 5 months ago - Stars: 55 - Forks: 3

capjamesg/jamesql

An in-memory NoSQL database implemented in Python.

Language: Python - Size: 849 KB - Last synced at: 9 days ago - Pushed at: 2 months ago - Stars: 83 - Forks: 1

poloclub/mememo

A JavaScript library that brings vector search and RAG to your browser!

Language: TypeScript - Size: 66.7 MB - Last synced at: 29 days ago - Pushed at: 8 months ago - Stars: 105 - Forks: 10

daac-tools/find-simdoc

Finding all pairs of similar documents time- and memory-efficiently

Language: Rust - Size: 225 KB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 60 - Forks: 3

tomlin7/AI-research-assistant

Semantic document search system with pgvector and PGAI

Language: Python - Size: 50.8 KB - Last synced at: 21 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 1

kcubeterm/achoz

Search through all your personal data efficiently like web search.

Language: Python - Size: 1.74 MB - Last synced at: 19 days ago - Pushed at: about 2 years ago - Stars: 80 - Forks: 5

RozhakXD/DocHunter

Language: Python - Size: 53.7 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

JumjumiAsbullah-08/KMS-V01

📚 Knowledge Management System (KMS), document-management based: a web application for managing, storing, and searching documents efficiently using plain PHP and MySQL.

Language: PHP - Size: 53 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

bent10/boox

Search anything, instantly

Language: TypeScript - Size: 43.6 MB - Last synced at: 11 days ago - Pushed at: 13 days ago - Stars: 5 - Forks: 1

lekt9/alBERT-launcher

AI-powered file launcher and semantic search assistant. Like Spotlight/Alfred but with advanced AI capabilities for understanding context and meaning. Features local processing, privacy-first design, and seamless integration with your workflow.

Language: TypeScript - Size: 8 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 7 - Forks: 0

gsidhu/buzee-releases

Public releases for Buzee

Size: 20.5 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 6 - Forks: 0

zayedrais/DocumentSearchEngine

Document Search Engine project with TF-IDF and the Google Universal Sentence Encoder model

Language: Jupyter Notebook - Size: 28.6 MB - Last synced at: 15 days ago - Pushed at: almost 2 years ago - Stars: 53 - Forks: 24

robindekoster/chatgpt-custom-knowledge-chatbot

This open source chatbot project lets you create a chatbot that uses your own data to answer questions, thanks to the power of the OpenAI GPT-3.5 model.

Language: Python - Size: 63.5 KB - Last synced at: 5 months ago - Pushed at: almost 2 years ago - Stars: 120 - Forks: 34

SamJoeSilvano/Multi-Source-Knowledge-Retrieval-System

An end-to-end multi-source knowledge retrieval system using LangChain, FAISS, and OpenAI embeddings. This Retrieval-Augmented Generation (RAG) pipeline intelligently searches across Wikipedia, arXiv, and custom websites, optimizing source selection and delivering precise, real-time results based on query relevance.

Language: Jupyter Notebook - Size: 5.03 MB - Last synced at: 14 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

kyr0/clientside-search

A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.

Language: TypeScript - Size: 1.58 MB - Last synced at: 14 days ago - Pushed at: almost 2 years ago - Stars: 10 - Forks: 0

easonlai/chatbot_with_pdf_streamlit

This code example shows how to make a chatbot for semantic search over documents using Streamlit, LangChain, and various vector databases. The chatbot lets users ask questions and get answers from a document collection. The code is in Python and can be customized for different scenarios and data.

Language: Jupyter Notebook - Size: 6.57 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 5

Qyokizzzz/simhash

The extended version of simhash supports fingerprint extraction of documents and images.

Language: Python - Size: 551 KB - Last synced at: 20 days ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

opengento/magento2-document-product-search

This module aims to make documents searchable with product keywords in Magento 2.

Language: PHP - Size: 13.7 KB - Last synced at: 6 days ago - Pushed at: 8 months ago - Stars: 2 - Forks: 1

opengento/magento2-document-search

This module aims to make documents searchable for customers in Magento 2.

Language: PHP - Size: 30.3 KB - Last synced at: 10 days ago - Pushed at: 8 months ago - Stars: 3 - Forks: 1

krisluczka/OSSE

Open Source Search Engine with built-in web/document crawler and an indexing method.

Language: C++ - Size: 58.6 KB - Last synced at: 6 days ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

lethalbit/bookwurm

Dead simple document index and search, nothing fancy

Language: Python - Size: 20.5 KB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 0

EricSchoebel/DocSpector

Keyword finder for text in the documents of a directory (for English, see README-en.md)

Language: Python - Size: 103 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

EmirhanSyl/TheBSTSearchEngine

Mini desktop search engine with Binary Search Tree

Language: Java - Size: 56.6 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

HarshKothari21/Natural-Language-Processing-Specialization

NLP course by DeepLearning.AI, powered by @coursera. Taught by Younes Bensouda Mourri, Instructor of AI at Stanford University, and Łukasz Kaiser, Staff Research Scientist at Google Brain.

Language: Jupyter Notebook - Size: 397 KB - Last synced at: 4 months ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

neuml/cord19q 📦

COVID-19 Open Research Dataset (CORD-19) Analysis

Language: Python - Size: 1.47 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 55 - Forks: 19

harishartanto/information-retrieval

Information retrieval of text document using TF-IDF weighting & Cosine Similarity Algorithm

Language: Python - Size: 44.9 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1

mdietrichstein/ir-search-engine-rust

Rust-based text search engine from scratch supporting multiple document similarity metrics (TF-IDF, BM25, BM25VA)

Language: Rust - Size: 132 KB - Last synced at: 4 days ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 0

domwal/acervo-digital-pessoal

PHP website that indexes all PDF content and provides an easy way to find any text

Language: PHP - Size: 15.1 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 1

liviobisogni/solr-ocr-indexing

Apache Solr Document Search and Indexing Analysis with OCR

Language: Java - Size: 2.37 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

AI-STACK-dev/Covid19-Comorbidities-NLP-WEB

COVID-19 comorbidities analysis platform based on Natural Language Processing (NLP)

Language: JavaScript - Size: 11 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 2