GitHub topics: document-indexing
224saisrikanth/keyword-based-search
A high-performance PDF document search application that extracts text from PDF files, indexes content using Whoosh, and provides a premium user interface with modern design elements. Features include context-aware search results, content highlighting, multi-format export options, and an interactive document viewer with match navigation.
Language: Python - Size: 130 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

SubhangiSati/RAG-using-DeepSeek-R1
This repository highlights my learning journey in building Retrieval-Augmented Generation (RAG) pipelines using DeepSeek on Lightning AI, covering document ingestion, retrieval, and integration with generative AI. It showcases fine-tuning, evaluation, and optimization for accurate open-domain QA and knowledge management.
Language: Jupyter Notebook - Size: 1.01 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

kyr0/clientside-search
A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.
Language: TypeScript - Size: 1.58 MB - Last synced at: 12 days ago - Pushed at: almost 2 years ago - Stars: 10 - Forks: 0

MaximLevchenko/Boolean-Model-Implementations-Comparison
This project compares the efficiency and performance of two methods for handling search operations: the inverted index and the term-document matrix.
Language: Python - Size: 41.5 MB - Last synced at: 24 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

krisluczka/OSSE
Open Source Search Engine with built-in web/document crawler and an indexing method.
Language: C++ - Size: 58.6 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

lethalbit/bookwurm
dead simple document index and search, nothing fancy
Language: Python - Size: 20.5 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 0

Grimmer107/Search-Engine
A search engine that uses JSON files as its data corpus.
Language: Python - Size: 10.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

trkotovicz/document-indexing-algorithm-py
A program that simulates a document indexing algorithm similar to Google's. It can identify occurrences of terms in TXT files.
Language: Python - Size: 32.2 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ak811/ase
Local Search Engine Implementation with Document Indexing
Language: Java - Size: 10 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

victor-cali/DocumentBase
Educational Document Base prototype for running queries based on similarity and dissimilarity measures of documents to which stemming, lemmatization, and latent semantic indexing were applied.
Language: Jupyter Notebook - Size: 212 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0
