An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: embeddings-similarity

harehimself/pinecone-lab

Experimenting with Pinecone as vector data continues to take center stage in AI-native systems. The purpose of this project is to explore the core capabilities, benchmark performance across different embedding models, and better understand what is possible with vector search in production environments.

Language: Python - Size: 104 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

qdrant/qdrant

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Language: Rust - Size: 32.8 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 24,208 - Forks: 1,657

dangkhoasdc/awesome-vector-database

A curated list of awesome works related to high dimensional structure/vector search & database

Size: 627 KB - Last synced at: 10 days ago - Pushed at: 22 days ago - Stars: 287 - Forks: 16

epsilla-cloud/vectordb

Epsilla is a high performance Vector Database Management System

Language: C++ - Size: 998 KB - Last synced at: 13 days ago - Pushed at: 16 days ago - Stars: 854 - Forks: 41

m1guelpf/tinyvector

A tiny embedding database in pure Rust.

Language: Rust - Size: 121 KB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 408 - Forks: 21

featureform/featureform

The Virtual Feature Store. Turn your existing data infrastructure into a feature store.

Language: Go - Size: 218 MB - Last synced at: 22 days ago - Pushed at: about 1 month ago - Stars: 1,900 - Forks: 96

jaypinho/transcript-accuracy

A Streamlit app to evaluate the accuracy of automatic speech recognition (ASR) transcription services.

Language: Python - Size: 414 KB - Last synced at: 6 days ago - Pushed at: 29 days ago - Stars: 2 - Forks: 1

proxectonos/simil-eval

Multilingual toolkit for evaluating LLMs using embeddings

Language: Python - Size: 89.8 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 1

marcominerva/ChatGptNet

A ChatGPT integration library for .NET, supporting both OpenAI and Azure OpenAI Service

Language: C# - Size: 4.29 MB - Last synced at: 30 days ago - Pushed at: 8 months ago - Stars: 316 - Forks: 38

EulerSearch/embedding_studio

Embedding Studio is a framework that lets you transform your vector database into a feature-rich search engine.

Language: Python - Size: 10.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 380 - Forks: 5

louisbrulenaudet/ragoon

High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡

Language: Jupyter Notebook - Size: 18.9 MB - Last synced at: 9 days ago - Pushed at: 8 months ago - Stars: 66 - Forks: 7

shubham0204/glove.c

Simple, cross-platform port of GloVe embeddings, written in C

Language: C - Size: 278 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

pentoai/vectory

Vectory provides a collection of tools to track and compare embedding versions.

Language: Python - Size: 1.92 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 71 - Forks: 0

fm1320/song-vibe

AI song recommendations based on the feel of a song

Language: Python - Size: 5.88 MB - Last synced at: 3 months ago - Pushed at: 12 months ago - Stars: 22 - Forks: 0

cgast/embird

An open-source project for crawling RSS feeds and websites, extracting news content, and storing it with vector embeddings for semantic search, clustering, and visualization.

Language: Python - Size: 957 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

ragu-manjegowda/RAGify

Contextual Code Exploration for Developers

Language: Python - Size: 30.6 MB - Last synced at: 20 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

sheriff1max/recs-searcher

Python library for correcting registry and spelling errors in user input when comparing with a database of texts.

Language: Python - Size: 1.97 MB - Last synced at: 27 days ago - Pushed at: 11 months ago - Stars: 3 - Forks: 0

Med-Karim-Ben-Boubaker/localume

Localume is a powerful desktop application that enables semantic search across your documents using advanced vector embeddings and retrieval technology. The application monitors specified directories in real-time, automatically indexing new and modified files to maintain an up-to-date searchable database.

Language: Python - Size: 10.4 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

comhendrik/vectorMatch

Dockerized application that embeds text in a pgvecto.rs database, retrieves data with a similarity search, and generates a response with an LLM served by Ollama.

Language: Python - Size: 31.3 KB - Last synced at: 14 days ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

MinLee0210/evento

Building an event retrieval system from visual data, as an entry in Ho Chi Minh's AI Challenge in 2024

Language: Python - Size: 21.7 MB - Last synced at: 7 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

sonisanskar/Neighborhood-Blending

Language: Python - Size: 27.3 KB - Last synced at: 28 days ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 1

Babelscape/CroCoAlign

A Cross-Lingual, Context-Aware and Fully-Neural Sentence Alignment System for Long Texts.

Language: Python - Size: 90.4 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 6 - Forks: 0

lablab-ai/Vector-Similarity-Search-with-Redis-Quickstart-Notebook

Vector similarity can be used to find similar products, articles and much more. In this tutorial, we will show you how to use Redis to index and search for similar vectors

Language: Jupyter Notebook - Size: 6.84 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 3

farouqzaib/mettis

Vector Database implemented in Golang with support for full-text and vector search as well as fault tolerance via Raft.

Language: Go - Size: 684 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 53 - Forks: 3

zenrsr/mistral-ai

A noob's guide to AI Agents and RAG implementation using mistral-ai

Language: JavaScript - Size: 37.1 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

France-Travail/embcompare

A simple python tool for embedding comparison

Language: Python - Size: 27.9 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 7 - Forks: 0

marcocolangelo/MSSGAT (fork of leaves520/MSSGAT)

Molecular substructure graph attention network for molecular property identification in drug discovery. This is the starting point for my thesis project and is the fork of a repository from the paper https://doi.org/10.1016/j.patcog.2022.108659

Language: Python - Size: 101 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

abeed04/CHAT-with-PDF-using-Gemini-1.5-Flash

Chat with the content of PDFs using an informative LLM powered by RAG.

Language: Python - Size: 36.1 KB - Last synced at: 4 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

manthan89-py/Real-Time-Social-Media-Content-Retrievel-System

The Real Time Social Media Content Retrieval System fetches real-time LinkedIn posts based on user queries, offering multiple post retrieval and customization options. Although initially focused on LinkedIn, it can be expanded to incorporate other social media platforms, facilitating cross-channel post similarity searches.

Language: Python - Size: 82 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

hubmapconsortium/asctb-ct-label-mapper 📦

asctb-ct-label-mapper: A package to recommend controlled vocabulary for annotations of scRNA-seq datasets, and thereby enable cross-dataset or cross-experiment comparison of annotations.

Language: Python - Size: 1.4 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 1

aaqilruzzan/RemEz-AI

RemEz is a descriptive-question-based learning platform built for students in highly theoretical subjects. The frontend and backend of the platform are built with the MERN stack and Tailwind. This repository contains the NLP code for PDF processing and descriptive QA generation via an LLM, along with a similarity assessment of two descriptive answers.

Language: Python - Size: 58.6 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

handrew/gpt-memory

Using embeddings to create memory.

Language: Python - Size: 14.6 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 31 - Forks: 0

MohammedAly22/Sentiment-Analysis-for-Homonyms-Problem

A comprehensive examination is conducted to assess the influence of homonyms in sentiment analysis, employing two distinct techniques: fixed embeddings (LSTM) and contextualized embeddings (DistilBERT).

Language: Jupyter Notebook - Size: 377 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

hacker-4-good/PDF-Chatbot

A Streamlit application for everyone who wants to chat with their documents.

Language: Python - Size: 101 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

petermchale/llm-powered-applications

Orchestrating the interaction between users and Large Language Models

Language: Jupyter Notebook - Size: 10.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

abhinandanshahdev/bananaGPT

BananaGPT is a computer vision application for Android that checks the ripeness of bananas using Google's Vertex AI multimodal embeddings.

Language: Java - Size: 230 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

spartypkp/legalAI

LegalAI is a passion project which explores and simplifies the complexities of obtaining legal information using LLMs.

Language: Jupyter Notebook - Size: 4.29 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

saadtariq-ds/langchain_chat_with_data

Dive into LangChain, a powerful platform that lets you interact with your data like never before. This guide offers insights on its unique capabilities, helping you tap into your data in conversational ways.

Language: Jupyter Notebook - Size: 2.26 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

farhan0167/QnAChatBot

A chatbot that parses your PDF files and answers your questions around that file using GPT

Language: Jupyter Notebook - Size: 24.1 MB - Last synced at: 12 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

viniciusarruda/word2vec

Yet Another Word2Vec Implementation

Language: Python - Size: 3.89 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

TeamEpicProjects/Practical-LLM-and-GPT-Applications

This repository provides practical examples and resources for developers and researchers interested in applying LLM and GPT models to real-world NLP problems.

Language: Jupyter Notebook - Size: 76.8 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 8 - Forks: 2

RikSchoonbeek/llm_knowledgebank

Exploring how to build an application in which an LLM is prompted with additional context from a custom-managed knowledge bank of data.

Language: Python - Size: 365 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

mortensi/face-recognition-python

Demos to test modelling and classification algorithms for face recognition

Size: 5.4 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

MahmoudAbdelRahman/build2Vec

Building representation in the vector space

Language: Python - Size: 665 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

cr21/Reverse-Search-Engine-Data-Collection

Data Collection repository for Reverse Search Engine

Language: Python - Size: 46.9 KB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

marcaureledivernois/semantic_game

Word mini-game: guess the secret word! Play here:

Language: Python - Size: 4.42 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

hadi-gharibi/pupil

Python package for labeling data more efficiently

Language: Python - Size: 485 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

AbdullahMakhdoom/Image-Search-Engine

Performed feature extraction and similarity lookup on Caltech101.

Language: Jupyter Notebook - Size: 14.7 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

Related Keywords
embeddings-similarity (48), embeddings (16), vector-database (15), python (9), machine-learning (7), vector-search (6), retrieval-augmented-generation (6), rag (6), nlp (6), python3 (5), llms (5), search-engine (5), data-science (4), vector-search-engine (4), similarity-search (4), llm (4), natural-language-processing (4), embedding-vectors (3), faiss (3), cosine-similarity (3), question-answering (3), nearest-neighbor-search (3), deep-learning (2), ai (2), chatgpt (2), agents (2), embedding-python (2), postgresql (2), gpt-3 (2), pytorch (2), openai (2), hacktoberfest (2), langchain-python (2), streamlit (2), faiss-vector-database (2), search-engines (2), vector (2), search (2), generative-ai (2), neural-search (2), neural-network (2), mlops (2), vectorization (2), hnsw (2), langchain (2), python-package (1), text-classification (1), chromadb (1), google-generative-ai (1), chatbot (1), legal (1), vertex-ai (1), android-app (1), large-language-models (1), natural-language-generation (1), vector-store (1), tree-decompositions (1), gemini-api (1), bytewax (1), database-retrieval (1), molecular-structures (1), linkedin-scraper (1), gru (1), graph-attention-networks (1), real-time-systems (1), bert-model (1), data-engineering (1), human-reference-atlas (1), gnn (1), single-cell-rna-seq (1), web-scraping (1), huggingface-transformers (1), nlp-machine-learning (1), sentiment-analysis (1), ecfp (1), bioinformatics (1), streamlit-dashboard (1), sentiment-classification (1), text-analysis (1), legal-analytics-and-data-science (1), face-recognition (1), pattern-recognition (1), bim (1), digital-twins (1), graph (1), graph-neural-networks (1), ifc (1), network-embedding (1), representation-learning (1), aws-s3 (1), cicd (1), ecr (1), fastapi (1), image-search-engine (1), mongodb (1), tensorflow (1), mini-game (1), semantics (1), active-learning (1), labeling-tool (1)