GitHub topics: retrieval

Repositories

lilpumpkin/rapto

Rapto is a fast, memory-efficient database designed for high-performance querying. This repository contains the server code, while client libraries are organized by language in separate repositories. 🐙✨

Language: Zig - Size: 57.6 KB - Last synced at: about 5 hours ago - Pushed at: about 6 hours ago - Stars: 2 - Forks: 0

ajksah/pdf-highlighter

This repository offers a straightforward PDF annotation tool built with React and PDF.js. Users can easily highlight text, add comments, and choose from multiple highlight colors. 🌟📄

Language: JavaScript - Size: 517 KB - Last synced at: about 11 hours ago - Pushed at: about 12 hours ago - Stars: 0 - Forks: 0

xhluca/bm25s

Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy

Language: Python - Size: 2.03 MB - Last synced at: about 9 hours ago - Pushed at: 19 days ago - Stars: 1,213 - Forks: 72

cinarcy/semantic-recommender

🔍 Semantic Article Recommender: This project offers a simple way to find articles that are similar in meaning. It uses advanced techniques like Hugging Face embeddings and FAISS for efficient searching. 🛠️

Language: Python - Size: 513 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

rgonzalezp/llarmy

LLArmy is a collection of small, specialized LlamaIndex-based agents and tools designed to create powerful agentic systems.

Language: Python - Size: 36.1 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

embeddings-benchmark/mteb

MTEB: Massive Text Embedding Benchmark

Language: Python - Size: 40.5 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2,626 - Forks: 422

maastrichtlawtech/MATCHED

Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data

Language: Jupyter Notebook - Size: 4.5 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

AmitPeleg/CLIC

Implementation of the paper "Advancing Compositional Awareness in CLIP with Efficient Fine-Tuning", arXiv, 2025

Language: Python - Size: 2.87 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 7 - Forks: 0

Ethel75/NoteMR

NoteMR enhances multimodal large language models for visual question answering by integrating structured notes. This implementation aims to reduce reasoning errors and improve visual feature perception. 🐙📚

Language: Python - Size: 8.14 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

mikkoim/dinotool

Command-line tool for extracting DINO features for images and videos

Language: Python - Size: 18.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 9 - Forks: 1

memodb-io/memobase

Profile-Based Long-Term Memory for AI Applications

Language: Python - Size: 12.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1,433 - Forks: 101

superlinked/superlinked

Superlinked is a Python framework for AI Engineers building high-performance search & recommendation applications that combine structured and unstructured data.

Language: Jupyter Notebook - Size: 141 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,164 - Forks: 83

run-pine/pineflow

Pineflow is a data framework to make AI easier to work with.

Language: Python - Size: 271 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

ARM-DOE/ACT

Atmospheric data Community Toolkit - A python based toolkit for exploring and analyzing time series atmospheric datasets

Language: Python - Size: 279 MB - Last synced at: 2 days ago - Pushed at: 4 days ago - Stars: 164 - Forks: 39

Anikethh/Methodology-Inspiration-Retrieval

How do you train retrievers to find inspirations? [ACL 2025]

Size: 17.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

lycoai/ducky-cookbook

Examples and guides for using the Ducky.ai API

Size: 968 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0

aimaster-dev/SmartRAG

SmartRAG is a terminal-based RAG system using LangGraph. It processes queries by retrieving relevant content from markdown or PDFs, then responds using OpenAI GPT. Supports webpage-to-PDF conversion, vector DB search, and modular flow control.

Language: Python - Size: 51.8 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

Anush008/fastembed-js

Library to generate vector embeddings in NodeJS

Language: TypeScript - Size: 1.07 MB - Last synced at: 5 days ago - Pushed at: 2 months ago - Stars: 128 - Forks: 11

tensorlakeai/indexify

A realtime serving engine for Data-Intensive Generative AI Applications

Language: Rust - Size: 123 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,020 - Forks: 129

DataScienceUIBK/Rankify

🔥 Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation 🔥. Our toolkit integrates 40 pre-retrieved benchmark datasets and supports 7+ retrieval techniques, 24+ state-of-the-art Reranking models, and multiple RAG methods.

Language: Python - Size: 5.5 MB - Last synced at: 5 days ago - Pushed at: 10 days ago - Stars: 471 - Forks: 35

qdrant/fastembed

Fast, Accurate, Lightweight Python library to make State of the Art Embedding

Language: Python - Size: 3.15 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 2,141 - Forks: 139

AnswerDotAI/byaldi

Use late-interaction multi-modal models such as ColPali in just a few lines of code.

Language: Python - Size: 1.94 MB - Last synced at: about 4 hours ago - Pushed at: 5 months ago - Stars: 797 - Forks: 85

vitrivr/vitrivr-engine

vitrivr's next-generation retrieval engine. It is capable of extracting and retrieving a wider range of multimedia objects such as audio, video, images or 3d models.

Language: Kotlin - Size: 5.05 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 7 - Forks: 4

devspidr/rag-from-scratch Fork of langchain-ai/rag-from-scratch

A reference project where I'm learning how Retrieval-Augmented Generation (RAG) works from scratch. This repo is for my personal understanding and experimentation with LLMs and retrieval pipelines. Not a production-ready tool — purely for learning and exploring the core ideas behind RAG.

Size: 3.17 MB - Last synced at: 4 days ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

apache/lucenenet

Apache Lucene.NET

Language: C# - Size: 171 MB - Last synced at: 5 days ago - Pushed at: 24 days ago - Stars: 2,316 - Forks: 647

palladian/palladian

Palladian is a Java-based toolkit with functionality for text processing, classification, information extraction, and data retrieval from the Web.

Language: Java - Size: 275 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 38 - Forks: 10

Anush008/fastembed-go

Go implementation of @qdrant/fastembed.

Language: Go - Size: 3.22 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 77 - Forks: 5

beir-cellar/beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

Language: Python - Size: 38.9 MB - Last synced at: 9 days ago - Pushed at: 18 days ago - Stars: 1,836 - Forks: 210

wi2trier/cbrkit

Customizable Case-Based Reasoning (CBR) toolkit for Python with a built-in API and CLI.

Language: Python - Size: 4.03 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 10 - Forks: 5

kidist-amde/ddro

We introduce the direct document relevance optimization (DDRO) for training a pairwise ranker model. DDRO encourages the model to focus on document-level relevance during generation

Language: Python - Size: 2.01 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 17 - Forks: 2

zou-group/avatar

(NeurIPS 2024) AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning

Language: Python - Size: 13.4 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 204 - Forks: 19

NoHeartPen/fast-mikann-api

[WIP] This project provides a completely free pronunciation annotation and dictionary search API for Japanese language learners.

Language: Python - Size: 90.8 KB - Last synced at: 6 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

Anush008/fastembed-rs

Rust library for generating vector embeddings, reranking.

Language: Rust - Size: 608 KB - Last synced at: 8 days ago - Pushed at: 12 days ago - Stars: 531 - Forks: 72

BIGBALLON/GME-Search

A multimodal image search engine built on the GME model, capable of handling diverse input types. Whether you're querying with text, images, or both, it provides powerful and flexible image retrieval under arbitrary inputs. Perfect for research and demos.

Language: Python - Size: 12.7 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 39 - Forks: 4

OpenBMB/VisRAG

Parsing-free RAG supported by VLMs

Language: Python - Size: 14.7 MB - Last synced at: 14 days ago - Pushed at: 4 months ago - Stars: 725 - Forks: 57

fkapsahili/EntRAG

EntRAG - Enterprise RAG Benchmark

Language: Python - Size: 168 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 2 - Forks: 0

TIGER-AI-Lab/UniIR

Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)

Language: Python - Size: 53.6 MB - Last synced at: 10 days ago - Pushed at: 9 months ago - Stars: 152 - Forks: 13

BIGBALLON/yandex-ris

A professional reverse image search and crawling tool that uses Yandex's image search engine to find and download similar images.

Language: Python - Size: 19.5 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

CoIR-team/coir

(ACL 2025 Main) A Comprehensive Benchmark for Code Information Retrieval.

Language: Python - Size: 2.48 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 95 - Forks: 9

UKPLab/PeerQA

Code and Data for PeerQA: A Scientific Question Answering Dataset from Peer Reviews, NAACL 2025

Language: Python - Size: 449 KB - Last synced at: 5 days ago - Pushed at: about 2 months ago - Stars: 5 - Forks: 0

neulab/retomaton

PyTorch code for the RetoMaton paper: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022)

Language: Python - Size: 6.62 MB - Last synced at: 10 days ago - Pushed at: almost 3 years ago - Stars: 73 - Forks: 4

jianhuiwemi/Material-Retrieval-Integration-across-Domains

[CVPR 2025] Official implementation of "MaRI: Material Retrieval Integration across Domains"

Language: Jupyter Notebook - Size: 43.9 MB - Last synced at: 18 days ago - Pushed at: 19 days ago - Stars: 3 - Forks: 0

Areebanaeemsatti/Quotes

Quotes Explorer is a semantic quote search application that uses Sentence-BERT and FAISS to find quotes based on meaning rather than keywords. Built with Gradio, it offers a fast, intuitive interface for discovering inspirational and insightful quotes.

Language: Jupyter Notebook - Size: 586 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

al-Jurjani/BibleRAG

This project explores Retrieval-Augmented Generation (RAG) for Bible question answering. It evaluates various configurations of document chunking, retrieval methods, embedding models and LLMs using the King James Version of the Bible. Performance is measured by faithfulness, relevance, and similarity to ground truth answers.

Language: Jupyter Notebook - Size: 3.69 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

naver/bergen

Benchmarking library for RAG

Language: Jupyter Notebook - Size: 139 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 206 - Forks: 20

SimonLupart/ikat-baseline

TREC iKAT (interactive Knowledge Assistant Track): Baselines Retrieval for neural Conversational Search

Language: Python - Size: 147 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 1 - Forks: 0

epsilla-cloud/vectordb

Epsilla is a high performance Vector Database Management System

Language: C++ - Size: 998 KB - Last synced at: 14 days ago - Pushed at: 17 days ago - Stars: 854 - Forks: 41

mayurbhangale/multimodal-retrieval

Code for paper "Multimodal semantic retrieval for product search"

Language: Python - Size: 9.77 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

KarelDO/xmc.dspy

In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.

Language: Python - Size: 45.4 MB - Last synced at: 21 days ago - Pushed at: over 1 year ago - Stars: 423 - Forks: 25

mbzuai-nlp/fire

A lightweight, agent-style framework for fact-checking atomic claims using iterative retrieval and verification. Reduces LLM and search cost while maintaining strong factuality performance.

Language: Python - Size: 1.29 MB - Last synced at: 21 days ago - Pushed at: 22 days ago - Stars: 8 - Forks: 0

tonywu71/colpali-cookbooks

Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳

Size: 12 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 292 - Forks: 20

intel/intel-extension-for-transformers 📦

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Language: Python - Size: 585 MB - Last synced at: 20 days ago - Pushed at: 9 months ago - Stars: 2,171 - Forks: 213

ContextualAI/gritlm

Generative Representational Instruction Tuning

Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: 24 days ago - Pushed at: 3 months ago - Stars: 640 - Forks: 45

ai4protein/VenusREM

🧬 Augmenting zero-shot mutant prediction by retrieval-based logits fusion. (ISMB/ECCB 2025)

Language: Python - Size: 359 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 69 - Forks: 8

BUAADreamer/EasyRAG

Easy-to-Use RAG Framework; CCF AIOps International Challenge 2024 Top 3 Solution (third place)

Language: Python - Size: 30.3 MB - Last synced at: 26 days ago - Pushed at: 7 months ago - Stars: 481 - Forks: 59

lancopku/IAIS

[ACL 2021] Learning Relation Alignment for Calibrated Cross-modal Retrieval

Language: Python - Size: 3.26 MB - Last synced at: 25 days ago - Pushed at: about 2 years ago - Stars: 31 - Forks: 4

mohsenfayyaz/ColDeR

Collapse of Dense Retrievers [ ACL 2025 ]

Language: Jupyter Notebook - Size: 24.3 MB - Last synced at: 27 days ago - Pushed at: 28 days ago - Stars: 2 - Forks: 0

j991222/mirb

MIRB: Mathematical Information Retrieval Benchmark

Language: Jupyter Notebook - Size: 1.8 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 1 - Forks: 0

xlang-ai/BRIGHT

BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

Language: Python - Size: 12.4 MB - Last synced at: 28 days ago - Pushed at: about 1 month ago - Stars: 121 - Forks: 12

BatsResearch/trove

A Flexible Toolkit for Dense Retrieval

Language: Python - Size: 188 KB - Last synced at: 25 days ago - Pushed at: 2 months ago - Stars: 33 - Forks: 2

FasterDecoding/REST

REST: Retrieval-Based Speculative Decoding, NAACL 2024

Language: C - Size: 1.06 MB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 202 - Forks: 14

SapienzaNLP/relik

Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024)

Language: Python - Size: 908 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 421 - Forks: 32

AstraBert/diRAGnosis

Diagnose the performance of your RAG🩺

Language: Python - Size: 214 KB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 36 - Forks: 3

mendableai/rag-arena

Open-source RAG evaluation through users' feedback

Language: TypeScript - Size: 18.7 MB - Last synced at: 30 days ago - Pushed at: about 1 year ago - Stars: 184 - Forks: 21

jxmorris12/cde

code for training & evaluating Contextual Document Embedding models

Language: Python - Size: 1.67 MB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 191 - Forks: 11

illuin-tech/vidore-benchmark

Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.

Language: Python - Size: 2.97 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 206 - Forks: 27

lucidrains/RETRO-pytorch

Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch

Language: Python - Size: 186 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 865 - Forks: 107

vitrivr/cottontaildb

Cottontail DB is a column store vector database aimed at multimedia retrieval. It allows for classical boolean as well as vector-space retrieval (nearest neighbour search) used in similarity search using a unified data and query model.

Language: Kotlin - Size: 14.3 MB - Last synced at: 11 days ago - Pushed at: 6 months ago - Stars: 41 - Forks: 20

NeumTry/NeumAI

Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.

Language: Python - Size: 3.83 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 856 - Forks: 47

Ryota-Kawamura/LangChain-Chat-with-Your-Data

Start building practical applications that allow you to interact with data using LangChain and LLMs.

Language: Jupyter Notebook - Size: 71.8 MB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 44 - Forks: 43

shervinea/mit-15-003-data-science-tools

Study guides for MIT's 15.003 Data Science Tools

Size: 8.94 MB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 1,842 - Forks: 365

Muennighoff/sgpt

SGPT: GPT Sentence Embeddings for Semantic Search

Language: Jupyter Notebook - Size: 17.4 MB - Last synced at: 30 days ago - Pushed at: over 1 year ago - Stars: 867 - Forks: 54

LongxingTan/open-retrievals

All-in-One: Text Embedding, Retrieval, Reranking and RAG in Transformers

Language: Python - Size: 1.38 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 58 - Forks: 12

arcee-ai/DALM

Domain Adapted Language Modeling Toolkit - E2E RAG

Language: Python - Size: 18.9 MB - Last synced at: 29 days ago - Pushed at: 8 months ago - Stars: 319 - Forks: 40

lucidrains/memorizing-transformers-pytorch

Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate nearest neighbors, in Pytorch

Language: Python - Size: 34.2 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 632 - Forks: 46

ntat/Lightweight_CLIP_model

A lightweight Pytorch implementation of OpenAI's CLIP model.

Language: Python - Size: 44.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

YuweiYin/ARR

ARR: QA with LLMs via Analyzing, Retrieving, and Reasoning

Language: Python - Size: 821 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 6 - Forks: 0

0x7o/RETRO-transformer

Easy-to-use Retrieval-Enhanced Transformer implementation

Language: Python - Size: 586 KB - Last synced at: 25 days ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 4

mixedbread-ai/mxbai-rerank

Crispy reranking models by Mixedbread

Language: Python - Size: 115 KB - Last synced at: 23 days ago - Pushed at: about 2 months ago - Stars: 31 - Forks: 2

lucidrains/marge-pytorch

Implementation of Marge, Pre-training via Paraphrasing, in Pytorch

Language: Python - Size: 166 KB - Last synced at: 28 minutes ago - Pushed at: over 4 years ago - Stars: 76 - Forks: 11

sofyan48/ochabot

Rag Tools for Retrieval QA

Language: Python - Size: 271 KB - Last synced at: 22 days ago - Pushed at: about 2 months ago - Stars: 4 - Forks: 3

lucidrains/tranception-pytorch

Implementation of Tranception, an attention network, paired with retrieval, that is SOTA for protein fitness prediction

Language: Python - Size: 206 KB - Last synced at: 22 days ago - Pushed at: about 3 years ago - Stars: 32 - Forks: 1

umbertogriffo/Trie

A Mixed Trie and Levenshtein distance implementation in Java for extremely fast prefix string searching and string similarity.

Language: Java - Size: 34.6 MB - Last synced at: 10 days ago - Pushed at: about 3 years ago - Stars: 44 - Forks: 12

redis-developer/ArXivChatGuru

Use ArXiv ChatGuru to talk to research papers. This app uses LangChain, OpenAI, Streamlit, and Redis as a vector database/semantic cache.

Language: Python - Size: 2.95 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 546 - Forks: 71

pratapyash/local-rag-qa-engine

A secure document question-answering system leveraging local LLMs for private, efficient, and scalable knowledge retrieval

Language: Python - Size: 229 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

teddylee777/openai-api-kr

A Korean-language tutorial based on the official OpenAI documentation, the Cookbook, and other practical examples. Through this tutorial, you can learn how to use the Python OpenAI API more easily and effectively.

Language: Jupyter Notebook - Size: 39.2 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 53 - Forks: 25

JYyangming02/Similarity

A deep learning project for similarity measurement and retrieval tasks using feature embeddings.

Language: Python - Size: 7.69 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

DRSY/MoTIS

[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)

Language: Swift - Size: 16.5 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 123 - Forks: 10

itz-me-nvs/langchain-QA-RAG

Using langchain for Question Answering on Own Data

Language: Jupyter Notebook - Size: 27.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

das-projects/RAGChatbot

Creating ChatGPT like experience on enterprise data using the Retrieval Augmented Generation pattern to

Language: Python - Size: 118 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

sogaiu/git-some-janets

Tool to retrieve various Janet repositories

Language: Janet - Size: 20.5 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 5 - Forks: 0

pineflow-ai/pineflow

Pineflow is a data framework to load any data in one line of code and connect with AI applications.

Language: Python - Size: 854 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

keivanipchihagh/multi-stage-two-tower-recommender

A Movie Recommender System using YouTube's Two-Tower Architecture built with TFRS+TFR and served with FastAPI

Language: Jupyter Notebook - Size: 19.4 MB - Last synced at: 10 days ago - Pushed at: 10 months ago - Stars: 4 - Forks: 0

luyug/COIL

NAACL2021 - COIL Contextualized Lexical Retriever

Language: Python - Size: 91.8 KB - Last synced at: about 2 months ago - Pushed at: almost 4 years ago - Stars: 153 - Forks: 28

CheckerNetwork/spark-checker

💥 Storage Provider Retrieval Checker as a Checker Subnet

Language: JavaScript - Size: 612 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 11 - Forks: 4

gabmoreira/subspaces

Code for the paper Learning Visual-Semantic Subspace Representations

Language: Python - Size: 1.6 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 1

chao1224/MoleculeSTM

Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)

Language: Python - Size: 39.9 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 224 - Forks: 21

chao1224/ChatDrug

LLM for Drug Editing, ICLR 2024

Language: Python - Size: 4.48 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 149 - Forks: 7

RonMcKay/OODRetrieval

Detection and Retrieval of Out-of-Distribution Objects in Semantic Segmentation

Language: Python - Size: 4.93 MB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 32 - Forks: 5

RhizoNymph/InformationAgency

An information indexing and retrieval system for LLMs and agents. Uses FastAPI, MinIO, OpenSearch, and Qdrant (with ColBERT embeddings via FastEmbed). Uses an LLM with structured output for document classification and type-specific metadata extraction. Exposes index_document and search routes.

Language: Python - Size: 959 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Related Keywords

retrieval (386), rag (63), nlp (48), llm (48), retrieval-augmented-generation (44), deep-learning (37), python (31), information-retrieval (30), embeddings (28), pytorch (23), search (22), machine-learning (19), ai (18), information (16), question-answering (15), vector-database (14), chatbot (14), computer-vision (14), semantic-search (13), large-language-models (13), data (13), openai (13), language-model (12), natural-language-processing (12), transformers (12), image-retrieval (12), langchain (12), tensorflow (11), image (11), benchmark (11), ranking (11), vector-search (11), python3 (10), multimedia (10), clip (10), artificial-intelligence (10), classification (10), bm25 (10), embedding (9), search-engine (9), faiss (9), generation (9), chatgpt (9), recommender-system (8), indexing (8), multimodal (8), java (8), cbir (7), reranking (7), cnn (7), dataset (7), llms (7), sentence-transformers (7), dense-retrieval (6), dpr (6), hashing (6), metric-learning (6), agents (6), video (6), generative-ai (6), attention-mechanism (6), retrieval-systems (6), augmented (6), embedding-models (6), generative (6), retrieval-model (5), recommendation-system (5), evaluation (5), php (5), matching (5), database (5), framework (5), similarity-search (5), ml (5), 3d (5), paper (5), nearest-neighbor-search (5), query (5), huggingface (5), image-search (5), neural-search (5), audio (5), clustering (5), cross-modal-retrieval (5), sentence-embeddings (4), open-domain-qa (4), keras (4), cpp (4), text (4), pinecone (4), bert (4), elasticsearch (4), ir (4), lucene (4), feature-extraction (4), multi-modal (4), index (4), multimedia-retrieval (4), sbert (4), content-based-image-retrieval (4)