An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: vector-embeddings

cwida/PDX

⚑ Faster vector search with PDX: A vertical data layout for vectors

Language: C++ - Size: 198 MB - Last synced at: about 2 hours ago - Pushed at: about 4 hours ago - Stars: 36 - Forks: 2

kantord/SeaGOAT

local-first semantic code search engine

Language: Python - Size: 19.4 MB - Last synced at: about 8 hours ago - Pushed at: about 8 hours ago - Stars: 1,138 - Forks: 71

EmmS21/fundAI

We give students across Africa access to laptops preloaded with offline AI tutors to help them prepare for exams and build tech careers

Language: Python - Size: 8.64 MB - Last synced at: about 8 hours ago - Pushed at: about 10 hours ago - Stars: 0 - Forks: 0

manomarras/Scalable-and-Real-Time-RAG-System-with-Open-AI

A real-time data transfer and RAG-based question-answering system using OpenAI. The project integrates PostgreSQL, Elasticsearch, and OpenAI's GPT-3.5 for real-time data updates and accurate, fast user query responses

Size: 1.95 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

ZachNagengast/similarity-search-kit

🔎 SimilaritySearchKit is a Swift package providing on-device text embeddings and semantic search functionality for iOS and macOS applications.

Language: Swift - Size: 175 MB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 454 - Forks: 43

Yashwanth12321/RepoHelp

A program that analyses a GitHub repository and answers questions about it. Uses vector embeddings and RAG to fetch the most relevant information.

Language: Python - Size: 570 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

Dr-Hutchinson/nicolay

Nicolay is a digital history experiment that uses artificial intelligence to explore the speeches of Abraham Lincoln.

Language: Python - Size: 14.4 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 6 - Forks: 0

dev-diaries41/smartscan

SmartScan is an innovative app powered by a CLIP model that automatically organizes your images by content similarity and enables text-based search.

Language: Kotlin - Size: 106 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 164 - Forks: 4

dead8309/ai-rag-crawler

AI pipeline built with honc and workers-ai. Vector embeddings, web scraping, and processing with Cloudflare Workflows (beta)

Language: TypeScript - Size: 404 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 19 - Forks: 1

boredom1234/CodeCraft

A powerful CLI tool using vector embeddings and LLMs to help developers understand codebases through natural language. Ask questions in plain English, get context-aware responses, analyze GitHub repos, and generate documentation. Your AI coding companion for quick codebase exploration.

Language: Python - Size: 1.09 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 1 - Forks: 0

m-abdelwahab/build-smarter-chatbots-workshop

Build an AI-powered chatbot that is able to access external data to provide the most accurate answer

Language: TypeScript - Size: 133 KB - Last synced at: about 23 hours ago - Pushed at: 10 months ago - Stars: 9 - Forks: 1

slhodak/Canvo

Node-based procedural AI-powered text editing

Language: TypeScript - Size: 4.89 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

denizdagli/QuantumComputingChatbot

Quantum Computing Chatbot is a Streamlit app that answers questions about quantum computing using a PDF document as its knowledge base. It uses Google Gemini and LangChain for intelligent, document-aware responses.

Language: Jupyter Notebook - Size: 19.8 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

pierrebrunelle/infinite-memory-discord-bot

A context-aware Discord bot with semantic search and conversational memory. Uses Pixeltable + OpenAI for human-like responses

Language: Python - Size: 107 KB - Last synced at: 23 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

dev-diaries41/smartscan-cli

A Linux CLI tool powered by CLIP that enables comparison and automated organization of image and text files based on content similarity

Language: Python - Size: 19.5 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

e-d-i-n-i/e-doctor-public

AI-powered medical expert system using RAG for diagnosis, prescription, and recommendations, integrating role-based access, AI chat, and structured medical data management with Next.js, Flask, MySQL, and Supabase.

Language: Python - Size: 363 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

e-d-i-n-i/ai-data-extraction

AI-driven system for structured data extraction, storage, and vector search, leveraging Crawl4AI, PydanticAI, and Supabase to enable efficient retrieval and RAG-based AI applications.

Size: 2.93 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

alexandralanorias/litintuit

An intelligent book recommendation system using LLMs and Python. Vibe code-free.

Language: Jupyter Notebook - Size: 2.87 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

apsinghAnalytics/FinRAGify_App

An LLM app leveraging RAG with LangChain and GPT-4 mini to analyze earnings call transcripts, assess company performance, using natural language queries (NLP), FAISS (vector database), and Hugging Face re-ranking models.

Language: Jupyter Notebook - Size: 4.85 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

KristianMSchmidt/semantic-art-search

Semantic Art Search – Discover art through meaning, not just keywords.

Language: Python - Size: 1.21 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 4 - Forks: 0

alash3al/vecdb

a vector embedding database with multiple storage engines and AI embedding integrations

Language: Go - Size: 53.7 KB - Last synced at: 16 days ago - Pushed at: 9 months ago - Stars: 33 - Forks: 2

laavanjan/Budget2025NIMRAG-Q-A-chat

A document Q&A application powered by NVIDIA NIM and LangChain, focused on Sri Lanka's Budget Speech 2025

Language: Python - Size: 294 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

debarup24/knowledgebased-AI

Knowledge based AI chatbot

Language: TypeScript - Size: 0 Bytes - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

bartolomej/vector-embedding-explorer

Streamlit app to visualize vector embeddings

Language: Python - Size: 7.81 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

vdutts7/ai-mreflow

YouTubeGPT • AI Chat with 100+ videos ft. YouTuber Matt Wolfe (@mreflow) 🐺🟣🤖💬

Language: TypeScript - Size: 70.5 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 32 - Forks: 3

Snehil-Shah/Multimodal-Image-Search-Engine

Text to Image & Reverse Image Search Engine built upon Vector Similarity Search utilizing CLIP VL-Transformer for Semantic Embeddings & Qdrant as the Vector-Store

Language: Jupyter Notebook - Size: 10.9 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 3

piyush-eon/ai-portfolio-nextjs

Build an AI-driven portfolio app with Next.js and Tailwind CSS. Learn advanced AI technologies such as vector embeddings and vector databases, along with how to work with OpenAI's APIs. A project to impress recruiters and showcase your skill set.

Language: JavaScript - Size: 143 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 6 - Forks: 5

CodeWith-HAMZA/Management-System

A SaaS platform focused on advanced task and project management. It offers a comprehensive set of features including task tracking, real-time communication through audio and video calls, and a design collaboration tool inspired by Figma. By integrating these functionalities into a single user

Language: TypeScript - Size: 728 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Shahbaz1234567/SmartRAG-Assistant

SmartRAG-Assistant/GenAI-Assistant leverages advanced LLM models and Nvidia APIs for efficient query handling and document summarization. It integrates LlamaParse for structured data extraction, HuggingFace embeddings for vectorization, and PineconeDB for efficient retrieval, ensuring precise answers to user queries.

Language: Python - Size: 12.1 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

kivanc57/mongodb_operations

This project demonstrates MongoDB CRUD operations, data modeling, and advanced Atlas Search and Atlas Vector Search features with Hugging Face, PyMongo, and PyArrow to process and query data efficiently.

Language: Jupyter Notebook - Size: 49.8 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Isa1asN/plagiarism-detector

Plagiarism detection for Amharic language text

Language: Jupyter Notebook - Size: 635 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

fpgmaas/pypi-scout

Find Python Packages on PyPI with the help of vector embeddings

Language: Python - Size: 15 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 44 - Forks: 1

monish-prabhu/Intra-Search

A tool for performing semantic search within pdf documents leveraging sentence transformers.

Language: Python - Size: 747 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

kunalvirwal/CLIP-Vectorizer

A containerized API for generating vector embeddings for text and images using the OpenAI CLIP model with CUDA

Language: Python - Size: 12.7 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

Manish-Syal123/DevLens

https://dev-lens-m9r2cgtrv-manishsyal123s-projects.vercel.app

Language: TypeScript - Size: 5.09 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

rsharvesh16/Medalytics

MEDALYTICS is a comprehensive medical coding and insurance analytics platform built using Streamlit and AWS Bedrock. The system helps insurance companies process patient insurance payments through automated medical coding, analytics, and patient insurance analysis.

Language: Python - Size: 26.1 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

rooneyrulz/memomind-ai-chatbot

Memomind is a sleek note-taking app built with React 18, Next.js 14, and TypeScript. It features a chat-based RAG workflow, AI-powered insights with Langchain and Llama3, and secure authentication via Clerk. It uses Tailwind CSS for styling and Shadcn-UI for components.

Language: TypeScript - Size: 609 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

gobeli/deno-drizzle-embeddings

Vector embeddings with Deno, Drizzle and sqlite

Language: TypeScript - Size: 4.16 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

kaloslazo/PyFuseDB

Database system that combines structured data retrieval through inverted indexes with unstructured data (images, audio) search using multidimensional vector embeddings, all within a unified platform.

Language: Python - Size: 631 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

terilios/file-upload-embeddings

Enterprise-grade document intelligence platform leveraging vector embeddings and LLMs for advanced document processing, semantic search, and information retrieval.

Language: Python - Size: 173 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

madeyexz/markdown-file-query

Semantic QA with a markdown database: Query any markdown file using vector embedding, Pinecone vector database and GPT (langchain). A weaker version of privateGPT

Language: Python - Size: 187 KB - Last synced at: 6 months ago - Pushed at: almost 2 years ago - Stars: 29 - Forks: 2

prernarohra/Tech-Research-Agent

This Tech Research Agent helps you research any technology topic, whether the news is old or recent.

Language: Python - Size: 12.7 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

rooneyrulz/vector-search-sandbox

This repository contains my practice and experiments with vector search and semantic searching using various vector databases.

Language: JavaScript - Size: 17.6 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

vdutts7/chatBTC

AI Chat with The ₿itcoin Whitepaper

Language: TypeScript - Size: 4.24 MB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 4

worldbeater/code-vecs

Code for the methods and algorithms described in the paper "Analysis of Program Representations Based on Abstract Syntax Trees and Higher-Order Markov Chains for Source Code Classification Task"

Language: Jupyter Notebook - Size: 1.23 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 5 - Forks: 0

Giorgi/SemanticKernel.Connectors.Oracle

Semantic Kernel memory built on top of Oracle 23ai

Language: C# - Size: 71.3 KB - Last synced at: 6 days ago - Pushed at: 10 months ago - Stars: 3 - Forks: 0

AishwaryaHastak/RAG-using-T5

An End-to-End RAG Pipeline for handling Question and Answer in the Data Science domain

Language: Jupyter Notebook - Size: 2.27 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 1

deadbits/vector-embedding-api

Flask API for generating text embeddings using OpenAI or sentence_transformers

Language: Python - Size: 36.1 KB - Last synced at: 30 days ago - Pushed at: over 1 year ago - Stars: 14 - Forks: 1

bennyschmidt/next-token-prediction

Next-token prediction in JavaScript: build fast language and diffusion models.

Language: JavaScript - Size: 35.3 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 135 - Forks: 5

c-vandenberg/paper-qa-chemistry

Extends the Paper QA package for use with the author's Zotero database papers in organic chemistry, drug discovery & development, cheminformatics and the applications of machine learning to these areas

Language: Python - Size: 90.4 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

berecat/openai-pinecone-search

Semantic search with OpenAI's embeddings stored in Pinecone (vector database)

Language: TypeScript - Size: 6.44 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 14 - Forks: 2

marcoswastaken/erdos_paware

Pawsitive Retrieval RAG Project - Erdos Institute Deep Learning Boot Camp - Spring 2024

Language: Jupyter Notebook - Size: 510 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 0 - Forks: 1

nem035/tim.nem.ai

AI chat with Tim Ferriss or any of his past guests

Language: TypeScript - Size: 52.2 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 1 - Forks: 1

itsbariscan/ContextBridge-Semantic-Internal-Link-Tool

ContextBridge-Semantic-Internal-Link-Tool is an advanced Python script designed to enhance website structure and user experience by identifying and suggesting intelligent internal linking opportunities.

Language: Python - Size: 7.81 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

itsbariscan/SEO-Semantix

The SEO Content Analyzer is a sophisticated Python script designed to perform in-depth semantic analysis of content for SEO purposes.

Language: Python - Size: 13.7 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

brylie/wagtail-vector-blog

Experimenting with Wagtail vector search (and possibly chat) by creating a blog

Language: Python - Size: 135 KB - Last synced at: 7 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

oozdal/pdf-bundle

PDF Bundle extracts text from PDFs in AWS S3, splits it, stores embeddings in Pinecone, and uses query vector embeddings based on cosine similarities for efficient search and retrieval.

Language: Jupyter Notebook - Size: 1.23 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Srijan-D/youtube-ai-assistant-langchain

Language: Python - Size: 9.77 KB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

mmr116/document-search-using-vector-embeddings-openai-rag

A simple web application to generate vector embeddings for PDF documents, store them in a vector database, and enable semantic search and information retrieval using OpenAI's language models.

Language: Python - Size: 109 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

mmr116/Data-Query-with-RAG-OpenAI-Embeddings-and-Vector-Database

Generates vector embeddings for a CSV file, stores them in a vector database, and queries the CSV file using an OpenAI language model

Language: Python - Size: 104 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

azeebneuron/Know-your-docs

Know Your Docs: Upload your documents and get instant answers to any questions related to them with this document knowledge platform

Language: Python - Size: 6.16 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Icomanman/mico-ai

AI Chatbot with Knowledge Base embeddings (prototype)

Language: Python - Size: 53.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

malleswarigelli/QA_Information_Retrival_Application

Build a generative AI, custom question-answering, or information retrieval application using LlamaIndex and Google Gemini

Language: Jupyter Notebook - Size: 236 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

dawar-shafaque/people-tracking-and-counting-system

Language: Python - Size: 0 Bytes - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

serp-ai/V3CTRON-vector-database-embedding-neural-search-retrieval-chatgpt-plugin (fork of openai/chatgpt-retrieval-plugin)

V3CTRON | Vector Embeddings Data Retrieval | ChatGPT Plugin

Language: Python - Size: 6.61 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 7

manthan-modi/people-tracking-system

Language: Python - Size: 66.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

souravkumardubey/milvus-vector-db

Hands-on with Milvus vector db

Language: JavaScript - Size: 9.54 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Prayag2003/ethereum-bots-revelio-labs-nirma

Advance Resume Parser: This project was built during the Mined Hackathon organized by Nirma University.

Language: Jupyter Notebook - Size: 10.8 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 2

angus-spence/loc2vec

Learning semantic embeddings from OSM data: a PyTorch implementation of the general loc2vec method outlined at https://sentiance.com/loc2vec-learning-location-embeddings-w-triplet-loss-networks.

Language: Python - Size: 123 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

trimoyee-g/Doc-Converse

Seamlessly interact with PDF, CSV and Handwritten Notes

Language: Python - Size: 16.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

xpbowler/semanticOS

Fast semantic OS Search

Language: Rust - Size: 96.1 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

sueszli/vector-database-benchmark 📦

paper: vecdb benchmark stats for dec 2023

Size: 1.81 GB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

chetryJyoti/QA-ChatBot

Developed using custom data for answering questions from a given domain knowledge

Language: Jupyter Notebook - Size: 18.6 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

gagan257/Brain-Flows

Intelligent note taking Web App with AI Integration

Language: TypeScript - Size: 352 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Weile-Zheng/ssat-analogy

SSAT Analogy Solver

Language: Python - Size: 578 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

astronomer/use-case-llm-customer-feedback

Use Cohere and OpenSearch to analyze customer feedback in an MLOps pipeline

Language: Python - Size: 15.6 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

abhishekHegde2000/ai-note-app

Developed an AI-powered note-taking application using Next.js 14, ChatGPT API, vector embeddings, Pinecone, TailwindCSS, Shadcn UI, and TypeScript.

Language: TypeScript - Size: 1.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Govind-S-B/pdf-to-text-chroma-search

Python scripts that convert PDF files to text, split them into chunks, and store their vector representations in a Chroma DB using GPT4All embeddings. A separate script queries the Chroma DB for similarity search based on user input.

Language: Python - Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sfoteini/image-vector-search-azure-postgresql

Image Vector Similarity Search with Azure AI Vision (Florence model) and Azure Cosmos DB for PostgreSQL

Language: Jupyter Notebook - Size: 38.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

horans/json-docs-embedding

An online tool to merge text document items with their vector embeddings in JSON.

Language: HTML - Size: 30.3 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

baronet2/Bike2Vec

Vector Embedding Representations of Road Cycling Riders and Races

Language: Jupyter Notebook - Size: 33.3 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 0

Cbhihe/NLP_clip-bleu-meteor

Python Implementation of lexical vector embedding similarity scoring, zero-shot classification of images and n-gram based scoring to compare textual summaries

Language: Jupyter Notebook - Size: 4.81 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

AshishGusain17/face-recognition

Detection as well as identification of faces

Language: Python - Size: 52.9 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0