An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: sentence-transformers

MinishLab/model2vec

Fast State-of-the-Art Static Embeddings

Language: Python - Size: 5.23 MB - Last synced at: about 11 hours ago - Pushed at: about 13 hours ago - Stars: 1,832 - Forks: 99

royxlead/docusense-python

AI-powered platform for intelligent document analysis, summarization, semantic search, and interactive Q&A using cutting-edge NLP technologies.

Language: Python - Size: 1.86 MB - Last synced at: about 13 hours ago - Pushed at: about 14 hours ago - Stars: 0 - Forks: 0

Joethomas0078/document-question-answering-system

This is a Document Question Answering (Doc-QA) system built with Python and Streamlit. Users can upload a PDF, and ask questions related to the document content. The system searches the document and provides the most relevant answers.

Language: Python - Size: 1.43 MB - Last synced at: about 14 hours ago - Pushed at: about 15 hours ago - Stars: 0 - Forks: 0

PaidXSmall/RAG-QA-demo

📄 Create a local, free Retrieval-Augmented Q&A system to easily extract answers from your personal documents in minutes.

Language: Python - Size: 15.5 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Gkk787578/veloclade

Veloclade is a research prototype of a neuro-symbolic knowledge graph system. It uses clade-inspired hierarchy + embedding clustering (sentence-transformers) to control ontology growth and mitigate subclassing explosion. Designed for experimentation in hybrid reasoning and AI knowledge representation.

Language: Python - Size: 6.84 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2 - Forks: 0

Hungry01382/Medical_Chatbot-Llama-2

Developed a chatbot for recommending homeopathic treatments based on disease inputs. Integrated medical knowledge to enhance the chatbot’s recommendation accuracy. Utilized NLP techniques to classify health issues and provide accurate responses.

Language: Python - Size: 21.3 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

Theanh130124/Medical_News

An academic project on a medical news system, developed using Spring Boot and ReactJS with TSX, implementing RAG to build a disease diagnosis chatbot based on data crawled using Selenium.

Language: Java - Size: 8.68 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

HAPPIE123/Milvus-Querying

This project uses Python, Hugging Face (sentence-transformers), Milvus + Docker (container running Vector DB) to create a vector database, populate it with details of many people (names, ages, salaries, addresses and their introductions) and enable searching and querying on the database contents using Cosine-Similarity distances on IVF Flat index.

Size: 1000 Bytes - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

wangyuxinwhy/uniem

unified embedding model

Language: Python - Size: 12.7 MB - Last synced at: about 18 hours ago - Pushed at: about 2 years ago - Stars: 868 - Forks: 71

eriksszva/resume-classifier-pipeline

Primary repository for the end-to-end resume classification system, integrating CI/CD, monitoring, and supporting submodules.

Language: HTML - Size: 85.5 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Arsalan692/Slidesense-PDF-Analyser

SlideSense PDF Analyser — A sleek, AI-powered web app for intelligent PDF analysis and querying, featuring modern dark UI, vector search, and Google Gemini integration.

Language: Python - Size: 27.9 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

julian-8897/arxiv-llm

An AI-powered semantic search tool for arXiv papers using sentence transformers and vector similarity search.

Language: Python - Size: 30.3 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

lovnishverma/Book-Recommendations

Enter a book title, author name, or description to get similar books 📚 recommendations from a curated Goodreads dataset using AI-powered embeddings and hybrid search.

Language: Python - Size: 1.01 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

huggingface/setfit

Efficient few-shot learning with Sentence Transformers

Language: Jupyter Notebook - Size: 1.79 GB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 2,561 - Forks: 250

embeddings-benchmark/mteb

MTEB: Massive Text Embedding Benchmark

Language: Python - Size: 43.7 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,810 - Forks: 462

IntelLabs/fastRAG

Efficient Retrieval Augmentation and Generation Framework

Language: Python - Size: 20.4 MB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 1,657 - Forks: 154

R0D10Nq/BydlanBot

A smart Telegram bot with artificial intelligence, long-term memory, and a unique personality. The bot remembers users, analyzes their character, supports contextual dialogues, and has a built-in message scheduler.

Language: Python - Size: 36.1 KB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

rasyosef/splade-tiny-msmarco

Python code to train SPLADE sparse retrieval models based on BERT-Tiny (4M) and BERT-Mini (11M) by distilling a Cross-Encoder on the MSMARCO dataset

Language: Jupyter Notebook - Size: 19.5 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

rag-wtf/open-text-embeddings

Open Source Text Embedding Models with OpenAI Compatible API

Language: Python - Size: 224 KB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 160 - Forks: 22

ShivaniPatil19/RAG-QA-demo

Turn your documents into instant answers with FAISS + Streamlit.

Language: Python - Size: 15.5 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

AryehRotberg/briefcase.ai

AI-powered system for extracting, categorizing, and analyzing privacy statements from online service documents. Features fine-tuned SentenceTransformer models, multi-database support, and an interactive web interface for automated privacy policy analysis.

Language: Python - Size: 1.08 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

haripatel07/ai-honeypot

Generative AI-driven Honeypot for cybersecurity. Simulates realistic server logs with AI and detects intrusions using unsupervised anomaly detection (Isolation Forest + NLP embeddings). Showcases synthetic data generation, feature engineering, and end-to-end ML workflow.

Language: Python - Size: 278 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

johndef64/nutrig-graphrag

Graph-based RAG system for biomedical nutrigenetic knowledge discovery. Enables natural language queries on gene-nutrient interactions, supports personalized nutrition counseling, and runs 100% locally with Ollama LLMs and SBERT embeddings.

Language: Jupyter Notebook - Size: 179 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 3 - Forks: 0

Abeshith/RAG-FundaMentals

🔰 A Comprehensive RAG repository covering basic vanilla RAG techniques, advanced retrieval methods, hybrid search fusion approaches, hands-on reranking techniques with code + explanation 📚✨

Language: Jupyter Notebook - Size: 3.3 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 1

theDAREK497/archive-assistant-bot

AI assistant for answering customer questions

Language: Python - Size: 64.5 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

CK-Explorer/DuoSubs

Semantic subtitle aligner and merger for bilingual subtitle syncing.

Language: Python - Size: 276 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0

ThisIs-Developer/Llama-2-GGML-Medical-Chatbot

Llama2-Medical-Chatbot is a medical chatbot that uses the Llama-2-7B-Chat-GGML model and the pdf The Gale Encyclopedia of Medicine, Volume 1, 2nd Edition. It is still under development, but it has the potential to be a valuable tool for patients, healthcare professionals, and researchers.

Language: Python - Size: 36.8 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 41 - Forks: 18

choudaryhussainali/FileIQ_Document-InteLLigence-BOT

This Streamlit-based AI assistant allows you to upload documents (PDF, DOCX, TXT) and interact with them using natural language. Powered by Llama models via Groq API and LangChain, the bot intelligently understands your documents and provides accurate answers with source references.

Language: Python - Size: 29.3 KB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

LakhindarPal/NextDrama

Content-based Asian drama recommendation system

Language: Jupyter Notebook - Size: 116 MB - Last synced at: about 18 hours ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

nabeelshan78/GroundTruth-RAG-Scratch

An end-to-end RAG system that grounds LLMs in factual reality, using semantic search on real-time news to provide verifiable, context-aware answers.

Language: Jupyter Notebook - Size: 192 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

mallickboy/Python_Search_Engine

A domain-specific Python search engine leveraging Flask, Pinecone, and Sentence Transformers for semantic search. Deployed on Azure with Gunicorn, Nginx, and SSL for secure and scalable performance.

Language: JavaScript - Size: 13.8 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 5 - Forks: 1

JoyeBright/domain-adapt-mt

A Python Tool for Selecting Domain-Specific (Contextually Similar) Data for Machine Translation

Language: Python - Size: 495 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

ddangelov/Top2Vec

Top2Vec learns jointly embedded topic, document and word vectors.

Language: Python - Size: 83.4 MB - Last synced at: 3 days ago - Pushed at: 10 months ago - Stars: 3,076 - Forks: 375

MohitGupta0123/Medical_Agentic_AI_Bot

An end-to-end Medical Assistant powered by RAG + Agentic AI. It enables medical Q&A with citations, patient registration, appointment confirmations, medicine stock tracking, and case summarization. Frontend built with Streamlit, backend with FastAPI, SQLite/Supabase, and vectorDB FAISS.

Language: Jupyter Notebook - Size: 363 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

BjornMelin/docmind-ai-llm

DocMind AI is a powerful, open-source Streamlit application leveraging LlamaIndex, LangGraph, and local Large Language Models (LLMs) via Ollama, LMStudio, or llama.cpp for advanced document analysis. Analyze, summarize, and extract insights from a wide array of file formats—securely and privately, all offline.

Language: Python - Size: 8.98 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 25 - Forks: 3

pierreolivierbonin/Canada-Labour-Research-Assistant

The Canada Labour Research Assistant (CLaRA) is a privacy-first LLM-powered RAG AI assistant proposing Easily Verifiable Direct Quotations (EVDQ) to mitigate hallucinations in answering questions about Canadian labour laws, standards, and regulations. It works entirely offline and locally, guaranteeing the confidentiality of your conversations.

Language: Python - Size: 11.3 MB - Last synced at: 2 days ago - Pushed at: 9 days ago - Stars: 7 - Forks: 3

AstraBert/SenTrEv

Simple customizable evaluation for text retrieval performance of Sentence Transformers embedders on PDFs

Language: Python - Size: 2.52 MB - Last synced at: 1 day ago - Pushed at: 8 months ago - Stars: 28 - Forks: 1

gautamgc17/Model-FineTuning

This repository provides resources for fine-tuning various types of models using different techniques and frameworks.

Language: Jupyter Notebook - Size: 91.8 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

kennethleungty/Llama-2-Open-Source-LLM-CPU-Inference

Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A

Language: Python - Size: 4.52 MB - Last synced at: about 2 hours ago - Pushed at: almost 2 years ago - Stars: 965 - Forks: 208

beir-cellar/beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

Language: Python - Size: 38.9 MB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 1,930 - Forks: 217

BaranziniLab/KG_RAG

Empower Large Language Models (LLM) using Knowledge Graph based Retrieval-Augmented Generation (KG-RAG) for knowledge intensive tasks

Language: Jupyter Notebook - Size: 10.3 MB - Last synced at: 6 days ago - Pushed at: 10 months ago - Stars: 877 - Forks: 106

rasyosef/splade-index

Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba

Language: Python - Size: 2.11 MB - Last synced at: 7 days ago - Pushed at: 11 days ago - Stars: 4 - Forks: 0

padala-jayaram/Multi-Document-RAG

Multi Document RAG Application

Language: Python - Size: 7.81 KB - Last synced at: about 12 hours ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

suhastr/SpecFusion

A machine learning project for requirement integration and semantic similarity.

Size: 7.81 KB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

IsmaelMousa/wiki-rag

Build and deploy a simple retrieval augmented generation (RAG) system that takes Wikipedia articles based on your topic, processes them, and answers your questions by integrating a strong retrieval system with a simple generative model

Language: Python - Size: 15.6 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

SatyamSaxena1/Reddit-scrape-to-zettelkasten-obsidian-workflow

From Reddit to Knowledge Graph: a Zettelkasten System from Saved Posts

Language: Python - Size: 50.2 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

scriptstar/vector-db-benchmark

A production-grade benchmarking suite that evaluates vector databases (Qdrant, Milvus, Weaviate, ChromaDB, Pinecone, SQLite, TopK) for music semantic search applications. Features automated performance testing, statistical analysis across 15-20 iterations, real-time web UI for database comparison, and comprehensive reporting with production.

Language: Python - Size: 511 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

preetam077/InteractiveDocumentOrganiser

A Flask web application that uses the Google Gemini API to intelligently scan, summarize, and organize local document folders using natural language.

Language: Python - Size: 51.8 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 1 - Forks: 0

zbilgeozkan/rot-rag-project

The RoT RAG Project implements a Python-based RAG pipeline. Documents are ingested and split into chunks, then embedded using Hugging Face models and indexed with FAISS for fast semantic search. A FastAPI backend exposes an API for querying the indexed documents using a local Hugging Face LLM for generating responses.

Language: Python - Size: 5.32 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 0

davidberenstein1957/classy-classification

This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Huggingface.

Language: Python - Size: 613 KB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 219 - Forks: 15

Samsung2025lock/Qdrant-API-project

Deploy a Flask REST API for Qdrant vector search with SentenceTransformer embeddings, Docker Compose; store and retrieve strings via cosine similarity 🐙

Language: HTML - Size: 119 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

Saivarun2611/RAG_Student

I built a RAG chatbot that helps students find the perfect Northeastern University Data Science graduate courses based on what they're interested in. The tech stack includes FastAPI for the backend, FAISS for vector search, SentenceTransformers for embeddings, and Gemini 2.0 Flash for generating responses. The frontend is a clean and responsive.

Language: Python - Size: 10 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

dell-research-harvard/linktransformer

A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

Language: Python - Size: 1.81 MB - Last synced at: 3 days ago - Pushed at: 5 months ago - Stars: 127 - Forks: 11

Alex-ML-labs/text-embedding-service-MLA-

FastAPI service for sentence embeddings & cosine similarity (MiniLM-L6-v2). Small, CPU-friendly—great for RAG prototypes.

Language: Python - Size: 24.4 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

Rsmohanraj/rerank-practical

🔄 Rerank practical examples for integrating SentenceTransformers with LlamaIndex, featuring demos, fine-tuning scripts, and evaluation tools.

Language: Python - Size: 7.81 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

Krishanu2206/RAG-PIPELINE

A RAG PIPELINE

Language: Jupyter Notebook - Size: 47.9 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

neirzhei/LinkLoom

A semantic bookmark manager CLI that stores and retrieves links using natural language search powered by embeddings and ChromaDB.

Language: Python - Size: 13.7 KB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

asavschaeffer/globule

a new way to *not* think about organization

Language: Python - Size: 1.85 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

davidberenstein1957/fast-sentence-transformers

Simply, faster, sentence-transformers

Language: Python - Size: 456 KB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 143 - Forks: 10

YomnaWaleed/job-recommendation-system-ai

AI-Powered Job Recommendation System: an intelligent job recommendation system that analyzes PDF resumes and suggests the best job opportunities using NLP, FAISS, and Sentence Transformers.

Language: Jupyter Notebook - Size: 88.7 MB - Last synced at: 17 days ago - Pushed at: 7 months ago - Stars: 2 - Forks: 0

fizzy73/uap-detect

🚀 Detect UAP anomalies with a multi-stream, YOLOv8-powered pipeline that offers robust features for efficient video analysis.

Language: Python - Size: 11.7 KB - Last synced at: 17 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

rahulthota21/Mock-n-Hire

Mock’n-Hire redefines hiring end-to-end by ranking resumes with semantic precision and delivering real-time, emotion-aware mock interview feedback - giving recruiters bias-resistant insights and candidates targeted, actionable practice.

Language: Python - Size: 26.2 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 1

hcd233/Aris-AI-Model-Server

An OpenAI Compatible API which integrates LLM, Embedding and Reranker.

Language: Python - Size: 1.11 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 16 - Forks: 1

khanalsumit/plantdeck_rag

🌿 Access your herbal PDFs offline with PlantDeck, which uses local files to provide plant information and safety details without cloud dependency.

Language: Python - Size: 28.3 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

AbhayR1104/NexusNews

An on-demand news intelligence engine. This interactive RAG pipeline queries real-time articles to provide sourced answers using 100% open-source models.

Language: Jupyter Notebook - Size: 10.7 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

Anipaleja/iLLuMinator-4.9B

A sophisticated transformer-based language model with integrated Retrieval-Augmented Generation (RAG) capabilities for intelligent question answering and conversation.

Language: Python - Size: 45.6 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 3 - Forks: 0

Graphlet-AI/eridu

Deep fuzzy matching people and company names for multilingual entity resolution using representation learning

Language: Python - Size: 863 KB - Last synced at: 12 days ago - Pushed at: about 1 month ago - Stars: 23 - Forks: 1

itsSwapnil/Milvus-vector-database-project

This project asynchronously scrapes web content, generates semantic text chunks using sentence embeddings, and stores them in a Milvus vector database for efficient similarity search. Built with Python, Langchain, SentenceTransformers, and Milvus for scalable vector-based retrieval.

Language: Python - Size: 22.5 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

45Harry/End_to_End_Medical_Chatbot_using_Llama2

Medical chatbot trained on the famous Gale Encyclopedia (vols. 1-5), using Llama2

Language: Jupyter Notebook - Size: 57.6 MB - Last synced at: 10 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

the-y9/FAQ-Bot

FAQ-Bot is a lightweight, semantic search-based chatbot designed to answer frequently asked questions using sentence embeddings. It uses Sentence-Transformers to find the most relevant answer from a set of predefined FAQs.

Language: Python - Size: 142 KB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 0 - Forks: 1

Ericles-Porty/txt-embed-search

Tool for indexing and semantic search of .txt files using OpenAI Embeddings and ChromaDB.

Language: Python - Size: 4.96 MB - Last synced at: 10 days ago - Pushed at: 24 days ago - Stars: 0 - Forks: 0

pheonix-19/OpsAI

OpsAI (Operational AI) is an intelligent IT support automation platform that uses AI to automatically categorize tickets, suggest solutions, and route requests to the right teams. Built with advanced NLP and machine learning technologies, it integrates with Jira, Slack, and Freshdesk to streamline operational workflows and improve response times.

Language: Python - Size: 2.43 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

sudhakar-r08/MeetMind

MeetMind RAG is a Streamlit-based application designed for Retrieval-Augmented Generation (RAG). It enables text processing, vector-based retrieval, similarity search, and audio transcription.

Language: Python - Size: 61.5 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

ranjeetsohanpal/AI-Resume_Analyzer_and_Job_Matcher

An intelligent web application that analyzes uploaded resumes, extracts skills and summaries, and compares them with the uploaded job description using NLP and semantic similarity models.

Language: Python - Size: 496 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

zthsk/semantic_search

An implementation of LSA, LDA and BERT for performing semantic search on MS MARCO Dataset

Language: Jupyter Notebook - Size: 236 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

The-Data-Dilemma/MediRag-Guard

A RAG Proof of Concept that delivers comprehensive, context-aware insights on healthcare data privacy through a novel knowledge tree.

Language: Python - Size: 6.05 MB - Last synced at: 18 days ago - Pushed at: 3 months ago - Stars: 13 - Forks: 0

coderwahaj/NLP

A collection of 4 NLP projects with Flask UI — Sentiment Analysis, News Classification, Fake News Detection, and Resume Screening using embeddings. All projects feature full data preprocessing, model training, evaluation, and result visualizations.

Language: Python - Size: 80.1 MB - Last synced at: 22 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

EzioDEVio/plantdeck_rag

PlantDeck is an offline herbal RAG that indexes your PDF books and monographs, extracts text/images with OCR, and answers questions with page-level citations using a local LLM via Ollama. Runs on your machine; no cloud. Field guide only; not medical advice.

Language: Python - Size: 27.3 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 1 - Forks: 0

taaylor/gg_kino

An open source movie service where you can watch films and have a fun time, getting recommendations to suit your taste and mood ;)

Language: Python - Size: 15.2 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 2 - Forks: 0

mxskeen/erblogx

ErBlogX: An AI-powered semantic search pipeline to search and chat with engineering blogs/articles/transcripts. Built with Next.js, FastAPI, Supabase/pgvector, and sentence-transformers (all-mpnet-base-v2) embeddings.

Language: JavaScript - Size: 1.56 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 1 - Forks: 0

LazarusNLP/indonesian-sentence-embeddings

Embedding Representation for Indonesian Sentences!

Language: Jupyter Notebook - Size: 1.56 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 19 - Forks: 3

fynnfluegge/codeqai

Local first semantic code search and chat | Leverage custom copilots with fine-tuning datasets from code in Alpaca, Conversational, Completion and Instruction format

Language: Python - Size: 562 KB - Last synced at: 6 days ago - Pushed at: 7 months ago - Stars: 488 - Forks: 53

K024/llm-sharp

Language models in C#

Language: C# - Size: 255 KB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 49 - Forks: 7

sidhantls/optim-sentence-transformers

Optimize SentenceTransformers models with Optimum for faster inference using model.encode

Language: Python - Size: 226 KB - Last synced at: 13 days ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 0

ankman007/job-prep-ai

An AI-powered web platform that uses RAG and LLMs to generate personalized interview prep sheets and candidate-job fit analysis. Built with FastAPI, LangChain, FAISS, and Next.js.

Language: TypeScript - Size: 314 KB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

cpcdoy/rust-sbert

Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers)

Language: Rust - Size: 165 KB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 119 - Forks: 13

avdvh/DeepPressNet

This project classifies Reddit user posts as either "Depressed" or "Not Depressed" using Sentence-BERT embeddings and a convolutional neural network. It includes an end-to-end NLP pipeline with preprocessing, model training, evaluation, and real-time text input prediction.

Language: Jupyter Notebook - Size: 6.69 MB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

mayankmittal29/MediMind-End-to-end-Medical-Chatbot-Generative-AI

MediMind: RAG-powered medical chatbot leveraging LangChain, OpenAI GPT, and Pinecone vector DB for semantic retrieval. Flask-served UI with HuggingFace embeddings (all-MiniLM-L6-v2) enables context-aware medical query responses from PDF knowledge bases.

Language: Jupyter Notebook - Size: 10.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 1

nqureshi/ev-winners

Semantic search over every Emergent Ventures winner.

Language: TypeScript - Size: 13.5 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 23 - Forks: 6

Kowd-PauUh/encoders-context-extension

Official implementation of "Zero-Training Context Extension for Transformer Encoders via Nonlinear Absolute Positional Embeddings Interpolation"

Language: Python - Size: 270 KB - Last synced at: 19 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

HorizonChaser12/RAGineer

A beginner project using RAG for retrieving data for the testing team based on multiple types of data.

Language: Python - Size: 171 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

WasifSohail5/IdeaNestTech-Chatbot

Welcome to the official AI-powered chatbot built for IdeaNestTech — a full-stack digital innovation company transforming ideas into reality through modern tech solutions.

Language: Python - Size: 77.8 MB - Last synced at: 28 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

GivenBY/askthecat

An AI-powered lazy study buddy that reads your messy notes (PDFs, PPTs, DOCX, images) and answers questions using Groq LLMs, sentence-transformers, and FAISS. Built with Streamlit for max chill. OCR included.

Language: Python - Size: 10.7 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

fireindark707/Python-Schema-Matching

A Python tool using XGBoost and sentence-transformers to perform schema matching on tables.

Language: Python - Size: 63.4 MB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 36 - Forks: 13

eglenn-dev/bible-search

Built an API that uses a vector database to store and retrieve verses based on their semantic meaning.

Language: Jupyter Notebook - Size: 104 MB - Last synced at: 25 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

NamanKr24/QuotientRAG

A semantic quote retrieval system using fine-tuned MiniLM, FAISS indexing, and RAG-style LLM synthesis, built with Streamlit and Hugging Face Spaces.

Language: Jupyter Notebook - Size: 3.71 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

shubhamjangid/fine-tune-embedding-model

Language: Jupyter Notebook - Size: 255 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

balahariharasudhan/RAG-Based-Chatbot-for-Smart-Customer-Support-Documents

🤖 RAG-based chatbot for answering queries from 📄 customer support PDFs using 🧠 LLMs, 🔍 OCR, and 📚 FAISS vector search.

Language: Python - Size: 17.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

kwame-mintah/hugging-face-smolagents-playground

🛝 \ ˈplā-ˌgrau̇nd \ an area known or suited for activity of a specified sort.

Language: Python - Size: 377 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

istat-methodology/semantic-search

Toolkit for building semantic search applications in Python.

Language: Python - Size: 9.16 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0
