An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: information-retrieval

femto-dev/femto

Sequence Indexing and Search

Language: C++ - Size: 4.79 MB - Last synced at: about 1 hour ago - Pushed at: about 2 hours ago - Stars: 106 - Forks: 25

castorini/pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Language: Python - Size: 8.19 MB - Last synced at: 36 minutes ago - Pushed at: about 4 hours ago - Stars: 1,810 - Forks: 398

embeddings-benchmark/mteb

MTEB: Massive Text Embedding Benchmark

Language: Python - Size: 34.3 MB - Last synced at: about 4 hours ago - Pushed at: about 6 hours ago - Stars: 2,433 - Forks: 380

piskvorky/gensim

Topic Modelling for Humans

Language: Python - Size: 101 MB - Last synced at: about 5 hours ago - Pushed at: 2 months ago - Stars: 15,981 - Forks: 4,391

youngfish42/Awesome-FL

Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)

Language: Python - Size: 4.48 MB - Last synced at: about 7 hours ago - Pushed at: about 8 hours ago - Stars: 1,685 - Forks: 196

marqo-ai/marqo

Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai

Language: Python - Size: 79.5 MB - Last synced at: about 4 hours ago - Pushed at: about 4 hours ago - Stars: 4,832 - Forks: 203

langroid/langroid

Harness LLMs with Multi-Agent Programming

Language: Python - Size: 104 MB - Last synced at: about 6 hours ago - Pushed at: about 9 hours ago - Stars: 3,237 - Forks: 315

NYXMatik/Web-Search-and-Information-Retrieval-in-the-Internet-Seminar

Technical seminar exploring the architecture and algorithms behind modern web search engines, including BM25, DPR, and hybrid retrieval models.

Size: 0 Bytes - Last synced at: about 11 hours ago - Pushed at: about 12 hours ago - Stars: 0 - Forks: 0

ict-bigdatalab/awesome-pretrained-models-for-information-retrieval

A curated list of awesome papers related to pre-trained models for information retrieval (a.k.a., pretraining for IR).

Size: 437 KB - Last synced at: about 5 hours ago - Pushed at: over 1 year ago - Stars: 661 - Forks: 49

lorainesouza/legal.ai

AI Project

Language: JavaScript - Size: 182 KB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 0 - Forks: 0

xhluca/bm25s

Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy

Language: Python - Size: 2.03 MB - Last synced at: about 13 hours ago - Pushed at: 6 days ago - Stars: 1,118 - Forks: 64

SylphAI-Inc/AdalFlow

AdalFlow: The library to build & auto-optimize LLM applications.

Language: Python - Size: 101 MB - Last synced at: about 4 hours ago - Pushed at: 28 days ago - Stars: 2,930 - Forks: 256

ronenh24/bible_search_engine

Bible search engine incorporating natural language processing, deep learning, and machine learning.

Language: Jupyter Notebook - Size: 91.9 MB - Last synced at: about 23 hours ago - Pushed at: about 23 hours ago - Stars: 1 - Forks: 0

YanivHaliwa/Url-To-Text

Language: Python - Size: 5.86 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

onyx-dot-app/onyx

Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.

Language: Python - Size: 42.8 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 12,698 - Forks: 1,646

ashvardanian/StringZilla

Up to 10x faster strings for C, C++, Python, Rust, Swift & Go, leveraging NEON, AVX2, AVX-512, SVE, & SWAR to accelerate search, hashing, sort, edit distances, and memory ops 🦖

Language: C - Size: 8.69 MB - Last synced at: about 20 hours ago - Pushed at: 1 day ago - Stars: 2,530 - Forks: 88

deepset-ai/haystack

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

Language: Python - Size: 48.3 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 20,392 - Forks: 2,136

JaidedAI/EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

Language: Python - Size: 154 MB - Last synced at: 1 day ago - Pushed at: 7 months ago - Stars: 26,414 - Forks: 3,309

superlinked/superlinked

Superlinked is a Python framework for AI Engineers building high-performance search & recommendation applications that combine structured and unstructured data.

Language: Jupyter Notebook - Size: 96.1 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,052 - Forks: 75

FlagOpen/FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python - Size: 38 MB - Last synced at: 1 day ago - Pushed at: 9 days ago - Stars: 9,408 - Forks: 678

neuml/txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Language: Python - Size: 52 MB - Last synced at: 1 day ago - Pushed at: 8 days ago - Stars: 10,768 - Forks: 683

Mojne/semantic-index

Lightweight, single-file vector database for experiments and small projects.

Language: C# - Size: 42 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

illuin-tech/colpali

The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

Language: Python - Size: 729 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,751 - Forks: 149

vladislavpyatnitskiy/financial.data.scraping

Access to Financial Data

Language: R - Size: 367 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3 - Forks: 0

KittyKatt/screenFetch

Fetches system/theme information in terminal for Linux desktop screenshots.

Language: Shell - Size: 4.24 MB - Last synced at: about 6 hours ago - Pushed at: 5 months ago - Stars: 3,963 - Forks: 452

weaviate/weaviate

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.

Language: Go - Size: 964 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 13,131 - Forks: 924

seinecle/Gaze

Detects structure in your network

Language: Java - Size: 19.5 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 14 - Forks: 2

nicolay-r/nicolay-r

My personal list of news updates in the Information Retrieval domain

Size: 229 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

apache/lucene

Apache Lucene open-source search software

Language: Java - Size: 493 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2,928 - Forks: 1,107

12345far/metrics-calculation-precision-recall

Laboratory 7 - Information Retrieval

Size: 1.95 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

catalyst-team/catalyst

Accelerated deep learning R&D

Language: Python - Size: 52.6 MB - Last synced at: about 8 hours ago - Pushed at: about 1 year ago - Stars: 3,347 - Forks: 393

zebbern/dezcrwl

🕷️ | dezcrwl is a website history crawler that gathers hidden information, checks extracted .js endpoints for vulnerabilities, and much more!

Language: Python - Size: 105 KB - Last synced at: 1 day ago - Pushed at: 2 months ago - Stars: 5 - Forks: 0

arc53/DocsGPT

DocsGPT is an open-source genAI tool that helps users get reliable answers from knowledge sources while avoiding hallucinations. It enables private and reliable information retrieval, with tooling and agentic system capability built in.

Language: TypeScript - Size: 81.1 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 15,566 - Forks: 1,657

xynehq/xyne

AI-first Search & Answer Engine for work. Open-source alternative to Glean.

Language: TypeScript - Size: 8.05 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 509 - Forks: 38

rapidsai/cuvs

cuVS - a library for vector search and clustering on the GPU

Language: Cuda - Size: 7.9 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 382 - Forks: 96

Vishal-Padia/ResumeScreener

A task given for the intern role at Capital Placement. UPDATE: Didn't get that internship, but the project is a great addition to my resume.

Language: Python - Size: 54.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 7 - Forks: 2

alirezatheh/perke

A keyphrase extractor for Persian

Language: Python - Size: 143 KB - Last synced at: about 19 hours ago - Pushed at: 23 days ago - Stars: 69 - Forks: 8

DataScienceUIBK/Rankify

🔥 Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation 🔥. Our toolkit integrates 40 pre-retrieved benchmark datasets and supports 7+ retrieval techniques, 24+ state-of-the-art Reranking models, and multiple RAG methods.

Language: Python - Size: 5.11 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 420 - Forks: 34

castorini/anserini

Anserini is a Lucene toolkit for reproducible information retrieval research

Language: Java - Size: 91.2 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,051 - Forks: 479

Xihtro/sinapsis-ocr

Sinapsis templates supporting different OCR techniques

Language: Python - Size: 147 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

ReeseHatfield/wiki-info

An information retrieval API for Wikipedia in Rust

Language: Rust - Size: 110 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

AmenRa/ranx

⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍

Language: Python - Size: 34.6 MB - Last synced at: 4 days ago - Pushed at: 10 months ago - Stars: 542 - Forks: 28

ashvardanian/SimSIMD

Up to 200x Faster Dot Products & Similarity Metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for AVX2, AVX-512, NEON, SVE, & SVE2 📐

Language: C - Size: 1.82 MB - Last synced at: 4 days ago - Pushed at: 23 days ago - Stars: 1,333 - Forks: 77

YunaBraska/semver-info-action

Cleans, parses, and compares semantic versions, providing essential insights into versioning, stability, and compatibility, making software release management a breeze!

Language: TypeScript - Size: 27 MB - Last synced at: 3 days ago - Pushed at: 11 days ago - Stars: 4 - Forks: 0

treygrainger/ai-powered-search

The codebase for the book "AI-Powered Search" (Manning Publications, 2024)

Language: Jupyter Notebook - Size: 65.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 254 - Forks: 63

thepushkarp/nalcos

Search Git commits in natural language

Language: Python - Size: 421 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 54 - Forks: 8

Unstructured-IO/unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Language: HTML - Size: 192 MB - Last synced at: 5 days ago - Pushed at: 16 days ago - Stars: 10,915 - Forks: 907

SkBlaz/rakun2

RaKUn 2.0 - A fast keyword detection algorithm

Language: Python - Size: 2.62 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 66 - Forks: 9

felladrin/MiniSearch

Minimalist web-searching platform with an AI assistant that runs directly from your browser. Uses WebLLM, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space

Language: TypeScript - Size: 27.4 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 391 - Forks: 44

saschaszott/suma-tech

Sample solutions and demonstrators for the Search Engine Technology (Suchmaschinentechnologie) module at TH Wildau

Language: Java - Size: 404 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 4 - Forks: 1

UtrechtUniversity/dataQuest

A configurable pipeline for extracting and filtering articles from large corpora, tailored for the Delpher Kranten corpus, with support for features like keyword filtering and tf-idf-based relevance scoring.

Language: Python - Size: 280 KB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

texttron/tevatron

Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.

Language: Python - Size: 20.4 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 582 - Forks: 105

apache/solr

Apache Solr open-source search software

Language: Java - Size: 515 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,361 - Forks: 709

xlang-ai/instructor-embedding

[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings

Language: Python - Size: 170 MB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 1,933 - Forks: 146

vivekmenonm/documents-chat

A Streamlit-based app that lets users upload and chat with documents (PDF, DOCX, CSV, Excel) using LangChain, OpenAI, and FAISS. Includes user authentication, chat history, and PostgreSQL integration.

Language: Python - Size: 15.6 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

dorianbrown/rank_bm25

A Collection of BM25 Algorithms in Python

Language: Python - Size: 43.9 KB - Last synced at: 4 days ago - Pushed at: 7 months ago - Stars: 1,143 - Forks: 95

felladrin/awesome-ai-web-search

A list of software that allows searching the web with the assistance of AI: https://hf.co/spaces/felladrin/awesome-ai-web-search

Language: HTML - Size: 65.4 KB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 809 - Forks: 50

bitcrowd/rag_time

💁 Example code for a blog post series about using a RAG system on a local codebase.

Language: Python - Size: 31.3 KB - Last synced at: 3 days ago - Pushed at: 5 months ago - Stars: 26 - Forks: 5

rapidsai/raft

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.

Language: Cuda - Size: 15.1 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 870 - Forks: 205

saschaszott/ir-hdm-2025

Information Retrieval module, summer semester 2025, at HdM Stuttgart

Language: Python - Size: 130 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 3

o19s/quepid

Improve your OpenSearch, Elasticsearch, Solr, Vectara, Algolia and Custom Search search quality.

Language: Ruby - Size: 68.4 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 303 - Forks: 106

tira-io/tirex-tracker

Automatic resource and metadata tracking for IR experiments.

Language: C++ - Size: 1.23 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 4 - Forks: 0

webis-de/archive-query-log

📜 The Archive Query Log.

Language: Jupyter Notebook - Size: 52.6 MB - Last synced at: about 9 hours ago - Pushed at: about 9 hours ago - Stars: 28 - Forks: 0

gopala-kr/summary

summaries of all the papers I read

Size: 215 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 139 - Forks: 41

iamarunbrahma/pdf-to-markdown

Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced information retrieval and processing.

Language: Python - Size: 69.3 KB - Last synced at: 2 days ago - Pushed at: 5 months ago - Stars: 73 - Forks: 7

edoardottt/csprecon

Discover new target domains using Content Security Policy

Language: Go - Size: 6.49 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 417 - Forks: 49

aryn-ai/sycamore

🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.

Language: Python - Size: 99.7 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 506 - Forks: 59

Picovoice/octopus 📦

On-device Speech-to-Index engine powered by deep learning

Language: Python - Size: 214 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 36 - Forks: 2

PreferredAI/tutorials

A tutorial series by Preferred.AI

Language: Jupyter Notebook - Size: 34.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 173 - Forks: 71

impresso/impresso-frontend

🚀 The frontend application of the Impresso WebApp http://impresso-project.ch/app

Language: Vue - Size: 27.9 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 5 - Forks: 0

apache/solr-operator

Official Kubernetes operator for Apache Solr

Language: Go - Size: 5.12 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 263 - Forks: 118

malbiruk/PubMedSummarizer

LLM-assistant that searches PubMed, retrieves abstracts or full-texts, and generates answers using OpenAI ChatGPT. Features a custom RAG pipeline, semantic search, and knowledge graph generation.

Language: Python - Size: 474 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 1

BJW333/findpeopleinfo

The findpeopleinfo OSINT Toolkit is a Python-based command-line tool designed to assist in open-source intelligence (OSINT) gathering. It offers various functions that help the user find information about a particular target or interest.

Language: Python - Size: 61.5 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2 - Forks: 0

lightonai/ducksearch

Efficient BM25 with DuckDB 🦆

Language: Python - Size: 1.42 MB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 45 - Forks: 2

Twenkid/Vsy-Jack-Of-All-Trades-AGI-Bulgarian-Internet-Archive-And-Search-Engine

Вседържец/Vsy - The AGI Infrastructure of "The Sacred Computer" AGI Institute: Custom Intelligent Selective Internet Archiving and Exploration/Crawling; Information Retrieval, Media Monitoring, Search Engine, Smart DB, Data Preservation, Knowledge Extraction, Dataset Creation, Building and Testing of Generative AI Models, Experiments, etc.

Language: Jupyter Notebook - Size: 6.82 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 4 - Forks: 0

apache/lucene-solr

Apache Lucene and Solr open-source search software

Size: 562 MB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 4,376 - Forks: 2,641

henrypp/errorlookup

Simple tool for retrieving information about Windows error codes.

Language: C - Size: 2.8 MB - Last synced at: 4 days ago - Pushed at: 8 days ago - Stars: 281 - Forks: 48

arian-askari/ChatGPT-RetrievalQA-CIKM2023

A dataset for training/evaluating Question Answering Retrieval models on ChatGPT responses, with the option of training/evaluating on real human responses.

Language: Jupyter Notebook - Size: 24.9 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 141 - Forks: 7

lgrz/polystem

Stemming algorithms in Rust

Language: Rust - Size: 123 KB - Last synced at: 6 days ago - Pushed at: almost 6 years ago - Stars: 3 - Forks: 1

TileDB-Inc/TileDB-Vector-Search

Cloud-native vector similarity search and storage with efficient, serverless scale-out

Language: Jupyter Notebook - Size: 86.5 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 57 - Forks: 9

NEOS-AI/Neosearch

AI-based search engine done right

Language: HTML - Size: 97.4 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 16 - Forks: 0

soldni/pyterrier_sentence_transformers

Create PyTerrier compatible dense indices using any sentence_transformers model

Language: Python - Size: 53.7 KB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 3

shaoxiongji/knowledge-graphs

A collection of research on knowledge graphs

Language: JavaScript - Size: 199 KB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 1,731 - Forks: 295

lightonai/pylate

Late Interaction Models Training & Retrieval

Language: Python - Size: 2.31 MB - Last synced at: 5 days ago - Pushed at: 10 days ago - Stars: 276 - Forks: 17

tomfran/search-rs

Search engine written in Rust

Language: Rust - Size: 1.03 MB - Last synced at: about 17 hours ago - Pushed at: 9 months ago - Stars: 20 - Forks: 2

krishpranav/maigret

A simple username OSINT tool built in Go

Language: Go - Size: 2.46 MB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 123 - Forks: 15

mmartyna123/WikipediaRecommendationSystem

Find your next Wikipedia read — smarter.

Language: Python - Size: 8.86 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 3 - Forks: 0

NoHaxito/deploys-top

Search & compare free and paid providers. Find the best option for your needs quickly and easily!

Language: TypeScript - Size: 1.47 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 14 - Forks: 1

StarlightSearch/EmbedAnything

Production-ready Inference, Ingestion and Indexing built in Rust 🦀

Language: Rust - Size: 36.7 MB - Last synced at: 9 days ago - Pushed at: 13 days ago - Stars: 506 - Forks: 45

truelockmc/PC-Optimus

A multi-use Python tool for Windows that offers features to clean, debloat, update, repair, or get information about your PC

Language: Python - Size: 637 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 10 - Forks: 0

tkhang1999/semantic-food-search

A semantic food search web application built with Django, Solr, SBERT, and Docker

Language: JavaScript - Size: 359 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 10 - Forks: 2

cwida/PDX

⚡ Faster vector search with PDX: A vertical data layout for vectors

Language: C++ - Size: 198 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 31 - Forks: 1

IntelLabs/fastRAG

Efficient Retrieval Augmentation and Generation Framework

Language: Python - Size: 20.4 MB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 1,513 - Forks: 139

sunnweiwei/RankGPT

Is ChatGPT Good at Search? LLMs as Re-Ranking Agent [EMNLP 2023 Outstanding Paper Award]

Language: Python - Size: 25 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 589 - Forks: 58

kampersanda/elinor

Facilitating your comprehensive and in-depth evaluation in Information Retrieval

Language: Rust - Size: 285 KB - Last synced at: 3 minutes ago - Pushed at: 6 months ago - Stars: 9 - Forks: 0

gabriben/awesome-generative-information-retrieval

Size: 322 KB - Last synced at: 9 days ago - Pushed at: 6 months ago - Stars: 667 - Forks: 49

YunaBraska/java-info-action

Fast Maven/Gradle parser. This dynamic GitHub action automatically detects and extracts crucial information such as Java version, project version, and encoding. It also provides essential build commands and properties to make your development process more independent, efficient and streamlined.

Language: TypeScript - Size: 38 MB - Last synced at: 3 days ago - Pushed at: 11 days ago - Stars: 3 - Forks: 1

YunaBraska/git-info-action

Instant insights into the latest changes and commits. Provides valuable outputs such as ticket number detection, breaking changes, latest branch & commit & tag information, variety of programming languages and conventions.

Language: TypeScript - Size: 28.9 MB - Last synced at: 3 days ago - Pushed at: 11 days ago - Stars: 5 - Forks: 2

IntelLabs/RAG-FiT

Framework for enhancing LLMs for RAG tasks using fine-tuning.

Language: Python - Size: 925 KB - Last synced at: 9 days ago - Pushed at: 2 months ago - Stars: 737 - Forks: 56

billpwchan/DeepTrust

DeepTrust: A Reliable Financial Knowledge Retrieval Framework For Explaining Extreme Pricing Anomalies

Language: Python - Size: 116 MB - Last synced at: 6 days ago - Pushed at: about 2 years ago - Stars: 20 - Forks: 3

Related Keywords
information-retrieval (2,363), python (375), search-engine (302), nlp (296), natural-language-processing (227), machine-learning (220), tf-idf (148), deep-learning (112), java (97), python3 (96), inverted-index (90), information-extraction (85), search (83), bm25 (79), question-answering (78), lucene (74), rag (72), llm (64), indexing (63), vector-space-model (62), elasticsearch (59), recommender-system (57), text-mining (57), semantic-search (55), nltk (55), data-mining (52), data-science (52), cosine-similarity (52), crawler (50), ir (49), nlp-machine-learning (48), retrieval-augmented-generation (45), bert (44), language-model (43), pytorch (41), information-gathering (40), text-classification (40), large-language-models (39), ai (39), pagerank (37), solr (35), artificial-intelligence (34), transformers (33), flask (33), ranking (32), boolean-retrieval (32), clustering (31), text-processing (31), information (30), embeddings (29), knowledge-graph (29), dataset (29), chatbot (29), neural-network (29), tfidf (27), word2vec (26), retrieval (25), django (25), learning-to-rank (24), sentiment-analysis (24), vector-search (24), search-algorithm (24), tokenization (24), osint (24), query-expansion (23), stemming (23), evaluation (22), computer-vision (22), wikipedia (21), classification (21), pandas (21), docker (21), dense-retrieval (20), trec (19), summarization (19), keyword-extraction (19), chatgpt (19), vector-database (19), recommendation-system (18), webscraping (18), javascript (18), react (18), faiss (17), jupyter-notebook (17), topic-modeling (17), web-scraping (17), data-analysis (17), generative-ai (16), nodejs (16), query (16), golang (16), streamlit (16), stemmer (16), linux (16), langchain (16), openai (16), research (15), fastapi (15), hacktoberfest (15), collaborative-filtering (15)