GitHub topics: vector-embeddings
cwida/PDX
Faster vector search with PDX: A vertical data layout for vectors
Language: C++ - Size: 198 MB - Last synced at: about 2 hours ago - Pushed at: about 4 hours ago - Stars: 36 - Forks: 2

kantord/SeaGOAT
local-first semantic code search engine
Language: Python - Size: 19.4 MB - Last synced at: about 8 hours ago - Pushed at: about 8 hours ago - Stars: 1,138 - Forks: 71

EmmS21/fundAI
We give students across Africa access to laptops preloaded with offline AI tutors to help them prepare for exams and build tech careers
Language: Python - Size: 8.64 MB - Last synced at: about 8 hours ago - Pushed at: about 10 hours ago - Stars: 0 - Forks: 0

manomarras/Scalable-and-Real-Time-RAG-System-with-Open-AI
A real-time data transfer and RAG-based question-answering system using OpenAI. The project integrates PostgreSQL, Elasticsearch, and OpenAI's GPT-3.5 for real-time data updates and accurate, fast user query responses
Size: 1.95 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

ZachNagengast/similarity-search-kit
SimilaritySearchKit is a Swift package providing on-device text embeddings and semantic search functionality for iOS and macOS applications.
Language: Swift - Size: 175 MB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 454 - Forks: 43

Yashwanth12321/RepoHelp
A program that analyzes a GitHub repository and answers questions about it. Uses vector embeddings and RAG to fetch the most relevant information.
Language: Python - Size: 570 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

Dr-Hutchinson/nicolay
Nicolay is a digital history experiment that uses artificial intelligence to explore the speeches of Abraham Lincoln.
Language: Python - Size: 14.4 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 6 - Forks: 0

dev-diaries41/smartscan
SmartScan is an innovative app powered by a CLIP model that automatically organizes your images by content similarity and enables text-based search.
Language: Kotlin - Size: 106 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 164 - Forks: 4

dead8309/ai-rag-crawler
AI pipeline built with honc and workers-ai. Vector embeddings, web scraping, and processing with Cloudflare Workflows (beta)
Language: TypeScript - Size: 404 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 19 - Forks: 1

boredom1234/CodeCraft
A powerful CLI tool using vector embeddings and LLMs to help developers understand codebases through natural language. Ask questions in plain English, get context-aware responses, analyze GitHub repos, and generate documentation. Your AI coding companion for quick codebase exploration.
Language: Python - Size: 1.09 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 1 - Forks: 0

m-abdelwahab/build-smarter-chatbots-workshop
Build an AI-powered chatbot that is able to access external data to provide the most accurate answer
Language: TypeScript - Size: 133 KB - Last synced at: about 23 hours ago - Pushed at: 10 months ago - Stars: 9 - Forks: 1

slhodak/Canvo
Node-based procedural AI-powered text editing
Language: TypeScript - Size: 4.89 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

denizdagli/QuantumComputingChatbot
Quantum Computing Chatbot is a Streamlit app that answers questions about quantum computing using a PDF document as its knowledge base. It uses Google Gemini and LangChain for intelligent, document-aware responses.
Language: Jupyter Notebook - Size: 19.8 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

pierrebrunelle/infinite-memory-discord-bot
A context-aware Discord bot with semantic search and conversational memory. Uses Pixeltable + OpenAI for human-like responses
Language: Python - Size: 107 KB - Last synced at: 23 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

dev-diaries41/smartscan-cli
A Linux CLI tool powered by CLIP that enables comparison and automated organization of image and text files based on content similarity
Language: Python - Size: 19.5 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

e-d-i-n-i/e-doctor-public
AI-powered medical expert system using RAG for diagnosis, prescription, and recommendations, integrating role-based access, AI chat, and structured medical data management with Next.js, Flask, MySQL, and Supabase.
Language: Python - Size: 363 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

e-d-i-n-i/ai-data-extraction
AI-driven system for structured data extraction, storage, and vector search, leveraging Crawl4AI, PydanticAI, and Supabase to enable efficient retrieval and RAG-based AI applications.
Size: 2.93 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

alexandralanorias/litintuit
An intelligent book recommendation system using LLMs and Python. Vibe code-free.
Language: Jupyter Notebook - Size: 2.87 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

apsinghAnalytics/FinRAGify_App
An LLM app leveraging RAG with LangChain and GPT-4 mini to analyze earnings call transcripts, assess company performance, using natural language queries (NLP), FAISS (vector database), and Hugging Face re-ranking models.
Language: Jupyter Notebook - Size: 4.85 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

KristianMSchmidt/semantic-art-search
Semantic Art Search: Discover art through meaning, not just keywords.
Language: Python - Size: 1.21 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 4 - Forks: 0

alash3al/vecdb
a vector embedding database with multiple storage engines and AI embedding integrations
Language: Go - Size: 53.7 KB - Last synced at: 16 days ago - Pushed at: 9 months ago - Stars: 33 - Forks: 2

laavanjan/Budget2025NIMRAG-Q-A-chat
A document Q&A application powered by NVIDIA NIM and LangChain, focused on Sri Lanka's Budget Speech 2025
Language: Python - Size: 294 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

debarup24/knowledgebased-AI
Knowledge based AI chatbot
Language: TypeScript - Size: 0 Bytes - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

bartolomej/vector-embedding-explorer
Streamlit app to visualize vector embeddings
Language: Python - Size: 7.81 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

vdutts7/ai-mreflow
YouTubeGPT • AI Chat with 100+ videos ft. YouTuber Matt Wolfe (@mreflow)
Language: TypeScript - Size: 70.5 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 32 - Forks: 3

Snehil-Shah/Multimodal-Image-Search-Engine
Text to Image & Reverse Image Search Engine built upon Vector Similarity Search utilizing CLIP VL-Transformer for Semantic Embeddings & Qdrant as the Vector-Store
Language: Jupyter Notebook - Size: 10.9 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 3

piyush-eon/ai-portfolio-nextjs
Build an AI-driven portfolio app with Next.js and Tailwind CSS. We will learn advanced AI technologies like vector embeddings and vector databases, along with how to work with OpenAI's APIs. This is a great project to impress recruiters and showcase your skill set.
Language: JavaScript - Size: 143 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 6 - Forks: 5

CodeWith-HAMZA/Management-System
A SaaS platform focused on advanced task and project management. This SaaS solution offers a comprehensive set of features including task tracking, real-time communication through audio and video calls, and a design collaboration tool inspired by Figma. By integrating these functionalities into a single user
Language: TypeScript - Size: 728 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Shahbaz1234567/SmartRAG-Assistant
SmartRAG-Assistant/GenAI-Assistant leverages advanced LLM models and Nvidia APIs for efficient query handling and document summarization. It integrates LlamaParse for structured data extraction, HuggingFace embeddings for vectorization, and PineconeDB for efficient retrieval, ensuring precise answers to user queries.
Language: Python - Size: 12.1 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

kivanc57/mongodb_operations
This project demonstrates MongoDB CRUD operations, data modeling, and advanced Atlas Search & Atlas Vector Search features with Hugging Face, PyMongo, and PyArrow to process and query data efficiently.
Language: Jupyter Notebook - Size: 49.8 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Isa1asN/plagiarism-detector
Plagiarism detection for Amharic language text
Language: Jupyter Notebook - Size: 635 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

fpgmaas/pypi-scout
Find Python Packages on PyPI with the help of vector embeddings
Language: Python - Size: 15 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 44 - Forks: 1

monish-prabhu/Intra-Search
A tool for performing semantic search within PDF documents using sentence transformers.
Language: Python - Size: 747 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

kunalvirwal/CLIP-Vectorizer
A containerized API for generating Vector embeddings for text and images using the OpenAI CLIP Model utilizing CUDA
Language: Python - Size: 12.7 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

Manish-Syal123/DevLens
https://dev-lens-m9r2cgtrv-manishsyal123s-projects.vercel.app
Language: TypeScript - Size: 5.09 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

rsharvesh16/Medalytics
MEDALYTICS is a comprehensive medical coding and insurance analytics platform built using Streamlit and AWS Bedrock. The system helps insurance companies process patient insurance payments through automated medical coding, analytics, and patient insurance analysis.
Language: Python - Size: 26.1 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

rooneyrulz/memomind-ai-chatbot
Memomind is a sleek note-taking app built with React 18, Next.js 14, and TypeScript. It features a chat-based RAG workflow, AI-powered insights with Langchain and Llama3, and secure authentication via Clerk. It uses Tailwind CSS for styling and Shadcn-UI for components.
Language: TypeScript - Size: 609 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

gobeli/deno-drizzle-embeddings
Vector embeddings with Deno, Drizzle and sqlite
Language: TypeScript - Size: 4.16 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

kaloslazo/PyFuseDB
Database system that combines structured data retrieval through inverted indexes with unstructured data (images, audio) search using multidimensional vector embeddings, all within a unified platform.
Language: Python - Size: 631 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

terilios/file-upload-embeddings
Enterprise-grade document intelligence platform leveraging vector embeddings and LLMs for advanced document processing, semantic search, and information retrieval.
Language: Python - Size: 173 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

madeyexz/markdown-file-query
Semantic QA with a markdown database: Query any markdown file using vector embedding, Pinecone vector database and GPT (langchain). A weaker version of privateGPT
Language: Python - Size: 187 KB - Last synced at: 6 months ago - Pushed at: almost 2 years ago - Stars: 29 - Forks: 2

prernarohra/Tech-Research-Agent
This Tech Research Agent helps you research any technology topic, whether it's old news or the latest developments.
Language: Python - Size: 12.7 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

rooneyrulz/vector-search-sandbox
This repository contains my practice and experiments with vector search and semantic searching using various vector databases.
Language: JavaScript - Size: 17.6 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

vdutts7/chatBTC
AI Chat with The ₿itcoin Whitepaper
Language: TypeScript - Size: 4.24 MB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 4

worldbeater/code-vecs
Code for the methods and algorithms described in the paper "Analysis of Program Representations Based on Abstract Syntax Trees and Higher-Order Markov Chains for Source Code Classification Task"
Language: Jupyter Notebook - Size: 1.23 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 5 - Forks: 0

Giorgi/SemanticKernel.Connectors.Oracle
Semantic Kernel memory built on top of Oracle 23ai
Language: C# - Size: 71.3 KB - Last synced at: 6 days ago - Pushed at: 10 months ago - Stars: 3 - Forks: 0

AishwaryaHastak/RAG-using-T5
An end-to-end RAG pipeline for question answering in the data science domain
Language: Jupyter Notebook - Size: 2.27 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 1

deadbits/vector-embedding-api
Flask API for generating text embeddings using OpenAI or sentence_transformers
Language: Python - Size: 36.1 KB - Last synced at: 30 days ago - Pushed at: over 1 year ago - Stars: 14 - Forks: 1

bennyschmidt/next-token-prediction
Next-token prediction in JavaScript: build fast language and diffusion models.
Language: JavaScript - Size: 35.3 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 135 - Forks: 5

c-vandenberg/paper-qa-chemistry
Extends the Paper QA package for use with the author's Zotero database papers in organic chemistry, drug discovery & development, cheminformatics and the applications of machine learning to these areas
Language: Python - Size: 90.4 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

berecat/openai-pinecone-search
Semantic search with OpenAI's embeddings stored in Pinecone (vector database)
Language: TypeScript - Size: 6.44 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 14 - Forks: 2

marcoswastaken/erdos_paware
Pawsitive Retrieval RAG Project - Erdos Institute Deep Learning Boot Camp - Spring 2024
Language: Jupyter Notebook - Size: 510 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 0 - Forks: 1

nem035/tim.nem.ai
AI chat with Tim Ferriss or any of his past guests
Language: TypeScript - Size: 52.2 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 1 - Forks: 1

itsbariscan/ContextBridge-Semantic-Internal-Link-Tool
ContextBridge-Semantic-Internal-Link-Tool is an advanced Python script designed to enhance website structure and user experience by identifying and suggesting intelligent internal linking opportunities.
Language: Python - Size: 7.81 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

itsbariscan/SEO-Semantix
The SEO Content Analyzer is a sophisticated Python script designed to perform in-depth semantic analysis of content for SEO purposes.
Language: Python - Size: 13.7 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

brylie/wagtail-vector-blog
Experimenting with Wagtail vector search (and possibly chat) by creating a blog
Language: Python - Size: 135 KB - Last synced at: 7 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

oozdal/pdf-bundle
PDF Bundle extracts text from PDFs in AWS S3, splits it, stores embeddings in Pinecone, and matches query embeddings by cosine similarity for efficient search and retrieval.
Language: Jupyter Notebook - Size: 1.23 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Srijan-D/youtube-ai-assistant-langchain
Language: Python - Size: 9.77 KB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

mmr116/document-search-using-vector-embeddings-openai-rag
A simple web application to generate vector embeddings for PDF documents, store them in a vector database, and enable semantic search and information retrieval using OpenAI's language models.
Language: Python - Size: 109 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

mmr116/Data-Query-with-RAG-OpenAI-Embeddings-and-Vector-Database
Generates vector embeddings for a CSV file, stores them in a vector database, and queries the CSV file using an OpenAI language model
Language: Python - Size: 104 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

azeebneuron/Know-your-docs
Know Your Docs: Upload your documents and get instant answers to any questions related to them with this document knowledge platform
Language: Python - Size: 6.16 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Icomanman/mico-ai
AI Chatbot with Knowledge Base embeddings (prototype)
Language: Python - Size: 53.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

malleswarigelli/QA_Information_Retrival_Application
Build a generative AI custom question-answering or information retrieval application using LlamaIndex and Google Gemini
Language: Jupyter Notebook - Size: 236 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

dawar-shafaque/people-tracking-and-counting-system
Language: Python - Size: 0 Bytes - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

serp-ai/V3CTRON-vector-database-embedding-neural-search-retrieval-chatgpt-plugin Fork of openai/chatgpt-retrieval-plugin
V3CTRON | Vector Embeddings Data Retrieval | ChatGPT Plugin
Language: Python - Size: 6.61 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 7

manthan-modi/people-tracking-system
Language: Python - Size: 66.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

souravkumardubey/milvus-vector-db
Hands-on with Milvus vector db
Language: JavaScript - Size: 9.54 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Prayag2003/ethereum-bots-revelio-labs-nirma
Advanced Resume Parser: This project was built during the Mined Hackathon organized by Nirma University.
Language: Jupyter Notebook - Size: 10.8 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 2

angus-spence/loc2vec
Learning semantic embeddings from OSM data: A PyTorch implementation of the loc2vec general method outlined in: https://sentiance.com/loc2vec-learning-location-embeddings-w-triplet-loss-networks.
Language: Python - Size: 123 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

trimoyee-g/Doc-Converse
Seamlessly interact with PDF, CSV and Handwritten Notes
Language: Python - Size: 16.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

xpbowler/semanticOS
Fast semantic OS Search
Language: Rust - Size: 96.1 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

sueszli/vector-database-benchmark
paper: vecdb benchmark stats for dec 2023
Size: 1.81 GB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

chetryJyoti/QA-ChatBot
A chatbot built on custom data to answer questions within a given knowledge domain
Language: Jupyter Notebook - Size: 18.6 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

gagan257/Brain-Flows
Intelligent note-taking web app with AI integration
Language: TypeScript - Size: 352 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Weile-Zheng/ssat-analogy
SSAT Analogy Solver
Language: Python - Size: 578 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

astronomer/use-case-llm-customer-feedback
Use Cohere and OpenSearch to analyze customer feedback in an MLOps pipeline
Language: Python - Size: 15.6 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

abhishekHegde2000/ai-note-app
Developed an AI-powered note-taking application using Next.js 14, ChatGPT API, vector embeddings, Pinecone, TailwindCSS, Shadcn UI, and TypeScript.
Language: TypeScript - Size: 1.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Govind-S-B/pdf-to-text-chroma-search
Python scripts that convert PDF files to text, split them into chunks, and store their vector representations using GPT4All embeddings in a Chroma DB. Also includes a script to query the Chroma DB for similarity search based on user input.
Language: Python - Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sfoteini/image-vector-search-azure-postgresql
Image Vector Similarity Search with Azure AI Vision (Florence model) and Azure Cosmos DB for PostgreSQL
Language: Jupyter Notebook - Size: 38.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

horans/json-docs-embedding
An online tool to merge text document items with their vector embeddings in JSON.
Language: HTML - Size: 30.3 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

baronet2/Bike2Vec
Vector Embedding Representations of Road Cycling Riders and Races
Language: Jupyter Notebook - Size: 33.3 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 0

Cbhihe/NLP_clip-bleu-meteor
Python Implementation of lexical vector embedding similarity scoring, zero-shot classification of images and n-gram based scoring to compare textual summaries
Language: Jupyter Notebook - Size: 4.81 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

AshishGusain17/face-recognition
Detection as well as identification of faces
Language: Python - Size: 52.9 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0
