An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: document-indexing

224saisrikanth/keyword-based-search

A high-performance PDF document search application that extracts text from PDF files, indexes content using Whoosh, and provides a premium user interface with modern design elements. Features include context-aware search results, content highlighting, multi-format export options, and an interactive document viewer with match navigation.

Language: Python - Size: 130 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

SubhangiSati/RAG-using-DeepSeek-R1

This repository highlights my learning journey in building Retrieval-Augmented Generation (RAG) pipelines using DeepSeek on Lightning AI, covering document ingestion, retrieval, and integration with generative AI. It showcases fine-tuning, evaluation, and optimization for accurate open-domain QA and knowledge management.

Language: Jupyter Notebook - Size: 1.01 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

kyr0/clientside-search

A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.

Language: TypeScript - Size: 1.58 MB - Last synced at: 12 days ago - Pushed at: almost 2 years ago - Stars: 10 - Forks: 0

MaximLevchenko/Boolean-Model-Implementations-Comparison

This project compares the efficiency and performance of two methods for handling search operations: the inverted index and the term-document matrix.

Language: Python - Size: 41.5 MB - Last synced at: 24 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0
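
For readers unfamiliar with the two structures named in the entry above, here is a minimal sketch of Boolean AND retrieval over both. It is not taken from the listed repository; the function names, the whitespace tokenizer, and the sample documents are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy corpus, used only for this illustration.
DOCS = ["the quick brown fox", "the lazy dog", "quick brown dog"]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def build_term_document_matrix(docs):
    """Map each term to a 0/1 row with one column per document."""
    matrix = {}
    for doc_id, text in enumerate(docs):
        for term in tokenize(text):
            row = matrix.setdefault(term, [0] * len(docs))
            row[doc_id] = 1
    return matrix


def and_query_index(index, terms):
    """Boolean AND: intersect the posting sets of all query terms."""
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


def and_query_matrix(matrix, terms, n_docs):
    """Boolean AND: bitwise-AND the matrix rows of all query terms."""
    if not terms:
        return []
    result = [1] * n_docs
    for term in terms:
        row = matrix.get(term, [0] * n_docs)
        result = [a & b for a, b in zip(result, row)]
    return [doc_id for doc_id, bit in enumerate(result) if bit]


index = build_inverted_index(DOCS)
matrix = build_term_document_matrix(DOCS)
print(and_query_index(index, ["quick", "brown"]))
print(and_query_matrix(matrix, ["quick", "brown"], len(DOCS)))
```

Both queries return the same document ids. The inverted index stores only the documents in which a term occurs, while the matrix stores one cell per term-document pair, which is the usual source of the memory and speed differences that such a comparison measures.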

krisluczka/OSSE

Open Source Search Engine with built-in web/document crawler and an indexing method.

Language: C++ - Size: 58.6 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

lethalbit/bookwurm

dead simple document index and search, nothing fancy

Language: Python - Size: 20.5 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 0

Grimmer107/Search-Engine

A search engine that uses JSON files as its corpus of data.

Language: Python - Size: 10.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

trkotovicz/document-indexing-algorithm-py

A program that simulates a document indexing algorithm similar to Google's. It can identify occurrences of terms in TXT files.

Language: Python - Size: 32.2 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ak811/ase

Local Search Engine Implementation with Document Indexing

Language: Java - Size: 10 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

victor-cali/DocumentBase

Educational Document Base prototype that performs queries based on similarity and dissimilarity measures of documents to which stemming, lemmatization, and latent semantic indexing were applied.

Language: Jupyter Notebook - Size: 212 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0