An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: document-search

infinilabs/coco-app

🥥 Coco AI App - Search, Connect, Collaborate, Your Personal AI Search and Assistant, all in one space.

language: TypeScript - size: 20.7 MB - last synced: 1 day ago - recorded: 1 day ago - stars: 428 - forks: 47

deepsense-ai/ragbits

Building blocks for rapid development of GenAI applications

language: Python - size: 9.59 MB - last synced: 4 days ago - recorded: 4 days ago - stars: 70 - forks: 8

poloclub/mememo

A JavaScript library that brings vector search and RAG to your browser!

language: TypeScript - size: 66.7 MB - last synced: 4 days ago - recorded: 10 months ago - stars: 121 - forks: 10

aimaster-dev/SmartRAG

SmartRAG is a terminal-based RAG system using LangGraph. It processes queries by retrieving relevant content from markdown or PDFs, then responds using OpenAI GPT. Supports webpage-to-PDF conversion, vector DB search, and modular flow control.

language: Python - size: 51.8 MB - last synced: 8 days ago - recorded: 8 days ago - stars: 0 - forks: 0

infinilabs/coco-server

🥥 Coco AI Server - Search, Connect, Collaborate, AI-powered enterprise search, all in one space.

language: TypeScript - size: 13.1 MB - last synced: 8 days ago - recorded: 8 days ago - stars: 77 - forks: 19

aimaster-dev/chatbot-using-rag-and-langchain

Chat with your PDFs using AI! This Streamlit app uses RAG, LangChain, FAISS, and OpenAI to let you ask questions and get answers with page and file references.

language: Python - size: 15.5 MB - last synced: 6 days ago - recorded: 9 days ago - stars: 11 - forks: 0

neuml/paperai

📄 🤖 Semantic search and workflows for medical/scientific papers

language: Python - size: 1.73 MB - last synced: 18 days ago - recorded: about 2 months ago - stars: 1,398 - forks: 110

GoodGuyAdy/QueryBaseAI

AI-powered hybrid search engine combining keyword, vector, and LLM-based contextual search using RAG with support for AI21, OpenAI or any other LLM.

language: Python - size: 39.1 KB - last synced: about 9 hours ago - recorded: about a month ago - stars: 2 - forks: 0

lekt9/alBERT-launcher

AI-powered file launcher and semantic search assistant. Like Spotlight/Alfred but with advanced AI capabilities for understanding context and meaning. Features local processing, privacy-first design, and seamless integration with your workflow.

language: TypeScript - size: 8 MB - last synced: 25 days ago - recorded: 5 months ago - stars: 11 - forks: 0

Parado-xy/seroost

A content-based document search engine in Rust.

language: Rust - size: 3.58 MB - last synced: about 2 months ago - recorded: about 2 months ago - stars: 1 - forks: 0

capjamesg/jamesql

An in-memory NoSQL database implemented in Python.

language: Python - size: 849 KB - last synced: 6 days ago - recorded: 4 months ago - stars: 84 - forks: 1

redis-developer/redis-arXiv-search

Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.

language: Python - size: 1000 KB - last synced: 14 days ago - recorded: about 2 months ago - stars: 144 - forks: 24

jankovicsandras/plpgsql_bm25

BM25 search implemented in PL/pgSQL

language: Jupyter Notebook - size: 1.26 MB - last synced: about 2 months ago - recorded: about 2 months ago - stars: 42 - forks: 0

jankovicsandras/bm25opt

faster BM25 search algorithms in Python

language: Jupyter Notebook - size: 69.3 KB - last synced: about 2 months ago - recorded: 7 months ago - stars: 20 - forks: 1

teilomillet/raggo

A lightweight, production-ready RAG (Retrieval Augmented Generation) library in Go.

language: Go - size: 567 KB - last synced: about 2 months ago - recorded: 7 months ago - stars: 55 - forks: 3

mishraanuraagx/ChatQ

Local Retrieval-Augmented Generation (RAG) system built with FastAPI, integrating vector search, Elasticsearch, and optional web search to power LLM-based intelligent question answering using models like Mistral or GPT-4.

language: HTML - size: 2.19 MB - last synced: 9 days ago - recorded: 2 months ago - stars: 0 - forks: 1

daac-tools/find-simdoc

Finding all pairs of similar documents time- and memory-efficiently

language: Rust - size: 225 KB - last synced: 5 days ago - recorded: 3 months ago - stars: 60 - forks: 3

tomlin7/AI-research-assistant

Semantic document search system with pgvector and PGAI

language: Python - size: 50.8 KB - last synced: about a month ago - recorded: 7 months ago - stars: 2 - forks: 2

kcubeterm/achoz

Search through all your personal data efficiently like web search.

language: Python - size: 1.74 MB - last synced: about a month ago - recorded: over 2 years ago - stars: 80 - forks: 5

RozhakXD/DocHunter

language: Python - size: 53.7 KB - last synced: 3 months ago - recorded: 5 months ago - stars: 2 - forks: 0

JumjumiAsbullah-08/KMS-V01

📚 Knowledge Management System (KMS) - Document Management Based. A web-based application for managing, storing, and searching documents efficiently using plain PHP and MySQL.

language: PHP - size: 53 MB - last synced: 3 months ago - recorded: 4 months ago - stars: 1 - forks: 0

bent10/boox

Search anything, instantly

language: TypeScript - size: 43.6 MB - last synced: 4 days ago - recorded: 5 days ago - stars: 5 - forks: 1

gsidhu/buzee-releases

Public releases for Buzee

size: 20.5 KB - last synced: 3 months ago - recorded: 6 months ago - stars: 6 - forks: 0

zayedrais/DocumentSearchEngine

Document Search Engine project with TF-IDF and the Google Universal Sentence Encoder model

language: Jupyter Notebook - size: 28.6 MB - last synced: about a month ago - recorded: about 2 years ago - stars: 53 - forks: 24

robindekoster/chatgpt-custom-knowledge-chatbot

This open source chatbot project lets you create a chatbot that uses your own data to answer questions, thanks to the power of the OpenAI GPT-3.5 model.

language: Python - size: 63.5 KB - last synced: 6 months ago - recorded: almost 2 years ago - stars: 120 - forks: 34

SamJoeSilvano/Multi-Source-Knowledge-Retrieval-System

An end-to-end multi-source knowledge retrieval system using LangChain, FAISS, and OpenAI embeddings. This Retrieval-Augmented Generation (RAG) pipeline intelligently searches across Wikipedia, arXiv, and custom websites, optimizing source selection and delivering precise, real-time results based on query relevance.

language: Jupyter Notebook - size: 5.03 MB - last synced: 2 days ago - recorded: 7 months ago - stars: 0 - forks: 0

kyr0/clientside-search

A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.

language: TypeScript - size: 1.58 MB - last synced: 12 days ago - recorded: almost 2 years ago - stars: 10 - forks: 0

easonlai/chatbot_with_pdf_streamlit

This code example shows how to make a chatbot for semantic search over documents using Streamlit, LangChain, and various vector databases. The chatbot lets users ask questions and get answers from a document collection. The code is in Python and can be customized for different scenarios and data.

language: Jupyter Notebook - size: 6.57 MB - last synced: about a month ago - recorded: almost 2 years ago - stars: 15 - forks: 5

Qyokizzzz/simhash

The extended version of simhash supports fingerprint extraction of documents and images.

language: Python - size: 551 KB - last synced: 2 months ago - recorded: almost 3 years ago - stars: 2 - forks: 0

opengento/magento2-document-product-search

This module aims to make documents searchable with product keywords in Magento 2.

language: PHP - size: 13.7 KB - last synced: 9 days ago - recorded: 10 months ago - stars: 2 - forks: 1

opengento/magento2-document-search

This module aims to make documents searchable for customers in Magento 2.

language: PHP - size: 30.3 KB - last synced: 15 days ago - recorded: 10 months ago - stars: 3 - forks: 1

krisluczka/OSSE

Open Source Search Engine with built-in web/document crawler and an indexing method.

language: C++ - size: 58.6 KB - last synced: about 2 months ago - recorded: about a year ago - stars: 0 - forks: 0

lethalbit/bookwurm

dead simple document index and search, nothing fancy

language: Python - size: 20.5 KB - last synced: 2 months ago - recorded: about a year ago - stars: 6 - forks: 0

EricSchoebel/DocSpector

Keyword Finder for Texts in Documents of a Directory (for English, see README-en.md)

language: Python - size: 103 KB - last synced: 11 months ago - recorded: 11 months ago - stars: 0 - forks: 0

EmirhanSyl/TheBSTSearchEngine

Mini desktop search engine with Binary Search Tree

language: Java - size: 56.6 KB - last synced: about a year ago - recorded: about a year ago - stars: 1 - forks: 0

HarshKothari21/Natural-Language-Processing-Specialization

NLP course by deeplearning.ai, powered by @coursera. Taught by: Younes Bensouda Mourri, Instructor of AI at Stanford University, and Łukasz Kaiser, Staff Research Scientist at Google Brain.

language: Jupyter Notebook - size: 397 KB - last synced: 5 months ago - recorded: almost 5 years ago - stars: 2 - forks: 0

neuml/cord19q 📦

COVID-19 Open Research Dataset (CORD-19) Analysis

language: Python - size: 1.47 MB - last synced: over a year ago - recorded: over 2 years ago - stars: 55 - forks: 19

harishartanto/information-retrieval

Information retrieval of text documents using TF-IDF weighting and the cosine similarity algorithm

language: Python - size: 44.9 KB - last synced: about 2 years ago - recorded: about 2 years ago - stars: 0 - forks: 1

mdietrichstein/ir-search-engine-rust

Rust-based text search engine from scratch supporting multiple document similarity metrics (TF-IDF, BM25, BM25VA)

language: Rust - size: 132 KB - last synced: 5 days ago - recorded: about 4 years ago - stars: 5 - forks: 0

domwal/acervo-digital-pessoal

PHP website that indexes all PDF content and provides an easy way to find any text

language: PHP - size: 15.1 MB - last synced: 5 days ago - recorded: about a year ago - stars: 0 - forks: 1

liviobisogni/solr-ocr-indexing

Apache Solr Document Search and Indexing Analysis with OCR

language: Java - size: 2.37 MB - last synced: about 2 years ago - recorded: about 2 years ago - stars: 0 - forks: 0

AI-STACK-dev/Covid19-Comorbidities-NLP-WEB

COVID-19 comorbidities analysis platform based on Natural Language Processing (NLP)

language: JavaScript - size: 11 MB - last synced: over 2 years ago - recorded: over 3 years ago - stars: 1 - forks: 2