An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: sentence-embeddings

jina-ai/vectordb

A Python vector database you just need - no more, no less.

Language: Python - Size: 1.22 MB - Last synced at: about 1 hour ago - Pushed at: about 1 year ago - Stars: 605 - Forks: 47

fuzzy-memory/caffeine-print

A current affairs and politics news mailer

Language: Python - Size: 1.43 MB - Last synced at: about 8 hours ago - Pushed at: about 9 hours ago - Stars: 0 - Forks: 0

ritesh-modi/embedding-hallucinations

This repo shows how foundation models hallucinate and how fine-tuning can fix such hallucinations.

Language: Python - Size: 474 KB - Last synced at: about 21 hours ago - Pushed at: about 24 hours ago - Stars: 1 - Forks: 0

FlagOpen/FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python - Size: 38 MB - Last synced at: about 18 hours ago - Pushed at: 5 days ago - Stars: 9,381 - Forks: 675

MaartenGr/BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

Language: Python - Size: 25.1 MB - Last synced at: about 18 hours ago - Pushed at: 5 days ago - Stars: 6,678 - Forks: 803

shubham0204/Sentence-Embeddings-Android

Embeddings from sentence-transformers in Android! Supports all-MiniLM-L6-V2, bge-small-en, snowflake-arctic, model2vec models and more

Language: Kotlin - Size: 42 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 43 - Forks: 5

neuml/txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Language: Python - Size: 52 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 10,705 - Forks: 679

shibing624/text2vec

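text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.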

Language: Python - Size: 15.4 MB - Last synced at: 8 days ago - Pushed at: 11 days ago - Stars: 4,673 - Forks: 409

Separius/awesome-sentence-embedding 📦

A curated list of pretrained sentence and word embedding models

Language: Python - Size: 282 KB - Last synced at: 8 days ago - Pushed at: almost 4 years ago - Stars: 2,258 - Forks: 262

DanRo3/tesis-multiagente

AI-based multi-agent system for extracting and visualizing information from vector databases using natural language.

Language: Python - Size: 2.71 MB - Last synced at: about 15 hours ago - Pushed at: about 16 hours ago - Stars: 1 - Forks: 0

LazarusNLP/indonesian-sentence-embeddings

Embedding Representation for Indonesian Sentences!

Language: Jupyter Notebook - Size: 1.56 MB - Last synced at: 9 days ago - Pushed at: 8 months ago - Stars: 17 - Forks: 2

nikolamilosevic86/local-genAI-search

Local-GenAI-Search is a generative search engine based on Llama 3, langchain and qdrant that answers questions based on your local files

Language: Python - Size: 2.27 MB - Last synced at: 1 day ago - Pushed at: 8 months ago - Stars: 93 - Forks: 34

SeanLee97/AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Language: Python - Size: 889 KB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 531 - Forks: 36

JohnSnowLabs/nlu

1 line for thousands of state-of-the-art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.

Language: Python - Size: 474 MB - Last synced at: 10 days ago - Pushed at: 3 months ago - Stars: 909 - Forks: 138

princeton-nlp/SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Language: Python - Size: 40.4 MB - Last synced at: 12 days ago - Pushed at: 6 months ago - Stars: 3,530 - Forks: 522

Muennighoff/sgpt

SGPT: GPT Sentence Embeddings for Semantic Search

Language: Jupyter Notebook - Size: 17.4 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 863 - Forks: 54

Agrover112/awesome-semantic-search

A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.

Size: 371 KB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 353 - Forks: 29

SAP-samples/acl2022-self-contrastive-decorrelation

Source code for ACL 2022 paper "Self-contrastive Decorrelation for Sentence Embeddings".

Language: Python - Size: 278 KB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 25 - Forks: 7

MoleculeTransformers/smiles-featurizers

Extract molecular SMILES embeddings from language models pre-trained with various objectives and architectures.

Language: Python - Size: 39.1 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 1

geeks-of-data/knowledge-gpt

Extract knowledge from all information sources using gpt and other language models. Index and make Q&A session with information sources.

Language: Python - Size: 3.36 MB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 284 - Forks: 54

thiswillbeyourgithub/AnnA_Anki_neuronal_Appendix

Using machine learning on your anki collection to enhance the scheduling via semantic clustering and semantic similarity

Language: Python - Size: 3.89 MB - Last synced at: 10 days ago - Pushed at: 7 months ago - Stars: 64 - Forks: 1

JohnGiorgi/DeCLUTR

The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to open an issue if you run into any trouble!

Language: Python - Size: 702 KB - Last synced at: 21 days ago - Pushed at: almost 2 years ago - Stars: 379 - Forks: 33

worldbank/GISTEmbed

GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings

Language: Python - Size: 1.3 MB - Last synced at: 18 days ago - Pushed at: about 1 year ago - Stars: 39 - Forks: 2

SeanLee97/xmnlp

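xmnlp: provides Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, text correction, text-to-pinyin conversion, text summarization, radical extraction, sentence representation, and text similarity computation.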

Language: Python - Size: 114 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 1,277 - Forks: 188

dborrelli/chat-intents

Clustering sentence embeddings to extract message intent

Language: Jupyter Notebook - Size: 6.38 MB - Last synced at: 10 days ago - Pushed at: over 3 years ago - Stars: 173 - Forks: 24

dayyass/muse-as-service

REST API for sentence tokenization and embedding using Multilingual Universal Sentence Encoder.

Language: Python - Size: 339 KB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 51 - Forks: 5

HITsz-TMG/KaLM-Embedding

Code for KaLM-Embedding models

Language: Python - Size: 319 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 74 - Forks: 6

YomnaWaleed/job-recommendation-system-ai

AI-Powered Job Recommendation System: an intelligent job recommendation system that analyzes PDF resumes and suggests the best job opportunities using NLP, FAISS, and Sentence Transformers.

Language: Jupyter Notebook - Size: 88.7 MB - Last synced at: 23 days ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

YomnaWaleed/medical-chatbot-using-Llama2

A medical chatbot built with Meta's Llama2, LangChain, and FAISS to provide accurate, context-aware responses to medical queries. The system uses a Flask-based web interface for user interaction and leverages Hugging Face embeddings for efficient document retrieval. Ideal for exploring domain-specific AI applications in healthcare.

Language: Jupyter Notebook - Size: 19.4 MB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

goru001/inltk

Natural Language Toolkit for Indic Languages aims to provide out of the box support for various NLP tasks that an application developer might need

Language: Python - Size: 812 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 829 - Forks: 161

kamalkraj/e5-mistral-7b-instruct

Finetune mistral-7b-instruct for sentence embeddings

Language: Python - Size: 34.2 KB - Last synced at: 9 days ago - Pushed at: 12 months ago - Stars: 81 - Forks: 18

tomlin7/AI-research-assistant

Semantic document search system with pgvector and PGAI

Language: Python - Size: 50.8 KB - Last synced at: 20 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 1

Doragd/Awesome-Sentence-Embedding

A curated list of research papers in Sentence Representation Learning and an STS leaderboard of sentence embeddings.

Size: 174 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 315 - Forks: 20

wangyuxinwhy/uniem

unified embedding model

Language: Python - Size: 12.7 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 851 - Forks: 69

TianduoWang/DiffAug

[EMNLP 2022] Differentiable Data Augmentation for Contrastive Sentence Representation Learning. https://arxiv.org/abs/2210.16536

Language: Python - Size: 551 KB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 39 - Forks: 2

4AI/BeLLM

Code for BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings (NAACL2024)

Language: Python - Size: 247 KB - Last synced at: 18 days ago - Pushed at: 10 months ago - Stars: 7 - Forks: 0

ahr9n/quranic-search-v2

Quranic Lexical/Semantic Search

Language: Jupyter Notebook - Size: 5.91 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 35 - Forks: 7

SkywardAI/kirin

Self-hosted and local-first application for inference and RAG on consumer grade hardware.

Language: Python - Size: 918 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 7 - Forks: 8

YJiangcm/PromCSE

[EMNLP 2022] Improved Universal Sentence Embeddings with Prompt-based Contrastive Learning and Energy-based Learning

Language: Python - Size: 737 KB - Last synced at: 28 days ago - Pushed at: over 1 year ago - Stars: 134 - Forks: 16

DeepK/hoDMD-experiments

EigenSent: Spectral sentence embeddings using higher-order Dynamic Mode Decomposition

Language: Python - Size: 47.8 MB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 13 - Forks: 4

amazon-science/text_generation_diffusion_llm_topic

Topic Embedding, Text Generation and Modeling using diffusion

Language: Python - Size: 152 KB - Last synced at: 13 days ago - Pushed at: 8 months ago - Stars: 12 - Forks: 3

hppRC/simple-simcse-ja

Exploring Japanese SimCSE

Language: Python - Size: 1.29 MB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 69 - Forks: 4

BounharAbdelaziz/MorDern-Bert

Sentence Transformer model finetuned from ModernBERT-base for Moroccan Darija.

Language: Jupyter Notebook - Size: 46.9 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

jeongukjae/question-similarity

Find similar questions via contrastive learning

Language: Python - Size: 93.8 KB - Last synced at: 27 days ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

yuanzhoulvpi2017/Rust4SenVec

Convert sentences to vectors with NLP transformer models in Rust

Language: Jupyter Notebook - Size: 21.5 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 1

cpcdoy/rust-sbert

Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers)

Language: Rust - Size: 165 KB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 113 - Forks: 12

pranshurastogi29/Amazon_ml_challenge-solution

26th-place solution out of 3,290 teams, in a challenge held on HackerEarth

Language: Jupyter Notebook - Size: 238 KB - Last synced at: 1 day ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 0

goldpulpy/pysentence-similarity

PySentence-Similarity is a tool designed to identify and find similarities between sentences and a base sentence, expressed as a percentage 📊.

Language: Python - Size: 60.5 KB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

Abhigyan126/FEEDBACK

A Flask-based web application that analyzes user comments using sentiment analysis, similarity detection, and AI-powered insights.

Language: Python - Size: 9.77 KB - Last synced at: 21 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

kampersanda/sif-embedding

Rust implementation of SIF and uSIF: Simple and fast sentence embedding

Language: Rust - Size: 1.22 MB - Last synced at: 21 days ago - Pushed at: 3 months ago - Stars: 19 - Forks: 0

robrua/easy-bert

A Dead Simple BERT API for Python and Java (https://github.com/google-research/bert)

Language: Java - Size: 44.9 KB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 171 - Forks: 44

TharinduDR/Simple-Sentence-Similarity

Exploring simple sentence similarity measures using word embeddings

Language: Python - Size: 60.4 MB - Last synced at: 11 days ago - Pushed at: 8 months ago - Stars: 101 - Forks: 37

hellojwilde/energetic-ai

EnergeticAI is TensorFlow.js, optimized for serverless environments, with fast cold-start, small module size, and pre-trained models.

Language: TypeScript - Size: 35.8 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 36 - Forks: 0

goamegah/pytorch-stc

PyTorch implementation of a self-training approach for short text clustering

Language: Python - Size: 16.9 MB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 8 - Forks: 0

chaosgen/awesome-sentence-embedding

A curated list of pretrained sentence and word embedding models

Language: Python - Size: 213 KB - Last synced at: 7 days ago - Pushed at: 6 months ago - Stars: 4 - Forks: 0

paraglondhe098/movie-recommendation-llm-embeddings

Movie recommender system using LLM and Vector database

Language: Jupyter Notebook - Size: 189 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

ash-sha/Semantic-Textual-Similarity-NLP

Measuring similarity of a sentence

Language: Jupyter Notebook - Size: 4.14 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

DolbyUUU/Reinforcement-Calibration-SimCSE

Reinforcement Calibration SimCSE, combining contrastive learning, artificial potential fields, perceptual loss, and RLHF to achieve improved Semantic Textual Similarity (STS) embeddings. PyTorch-based implementations of PerceptualBERT and ForceBasedInfoNCE, along with fine-tuning capabilities via RLHF and evaluation using SentEval.

Language: Python - Size: 371 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ojrlopez27/nl-service-composition

NLSC: Unrestricted Natural Language-based Service Composition middleware that uses sentence embeddings, named-entity recognition, and other NLP models.

Language: Java - Size: 450 MB - Last synced at: 24 days ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 1

hppRC/simple-simcse

A simple implementation of SimCSE

Language: Python - Size: 157 KB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 76 - Forks: 10

Nikoletos-K/QA-with-SBERT-for-CORD19

⚕️🦠 Developed a document retrieval system to return titles of scientific papers containing the answer to a given user question based on the first version of the COVID-19 Open Research Dataset (CORD-19) ☣️🧬

Language: Jupyter Notebook - Size: 1.53 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 1

Lizhecheng02/UCSD-CSE256-PA3

CSE 256 LIGN 256 - Statistical Natural Lang Proc - Nakashole [FA24] PA3

Language: Jupyter Notebook - Size: 5.07 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

shasss447/QuestionAnwering-with-RAG

This project implements a Retrieval-Augmented Generation (RAG) pipeline for answering user queries by combining information retrieval with text generation.

Language: Jupyter Notebook - Size: 2.93 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

kaushalshetty/Structured-Self-Attention

A Structured Self-attentive Sentence Embedding

Language: Python - Size: 492 KB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 495 - Forks: 106

atinyshrimp/TripAdvisor-Recommendation-ML-NLP

Machine Learning and NLP models for improving text-based recommendations on TripAdvisor, using BM25, TF-IDF, embeddings, and a Hybrid approach.

Language: Jupyter Notebook - Size: 489 KB - Last synced at: 24 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

voidism/DiffCSE

Code for the NAACL 2022 long paper "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings"

Language: Python - Size: 6.3 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 291 - Forks: 27

ksm26/Embedding-Models-From-Architecture-to-Implementation

Understand and build embedding models, focusing on word and sentence embeddings, dual encoder architectures. Learn to train embedding models using contrastive loss, implement them in semantic search and RAG systems.

Language: Jupyter Notebook - Size: 2 MB - Last synced at: 23 days ago - Pushed at: 8 months ago - Stars: 3 - Forks: 0

cui-shaobo/causal-strength

evaluating the causal strength between cause and effect

Language: Python - Size: 107 KB - Last synced at: 17 days ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

ai-lluminator/backend

The backend for the Ailluminator project, which sends updates when relevant papers are published, based on a prompt from the user.

Language: Python - Size: 9.21 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

francescobaio/Sentence_Reordering

This project was undertaken as part of the Deep Learning course final exam. The primary objective of this project is to develop and implement a deep learning model for sentence reordering. Sentence reordering is a challenging Natural Language Processing (NLP) task that involves rearranging the words in an ordered sentence.

Language: Jupyter Notebook - Size: 71.3 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

sdadas/polish-sentence-evaluation

Evaluation of Sentence Representations in Polish

Language: Python - Size: 4.96 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 22 - Forks: 3

flipz357/S3BERT

Semantically Structured Sentence Embeddings

Language: Python - Size: 72.3 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 65 - Forks: 5

Synapxe-DNA/healthhub-content-optimization

Content Optimization code for Health Hub Articles

Language: Jupyter Notebook - Size: 114 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

oborchers/Fast_Sentence_Embeddings

Compute Sentence Embeddings Fast!

Language: Jupyter Notebook - Size: 2.86 MB - Last synced at: 5 months ago - Pushed at: about 2 years ago - Stars: 618 - Forks: 83

louisbrulenaudet/tax-retrieval-benchmark

An implementation of the TaxRetrievalBenchmark task for the 🤗 Massive Text Embedding Benchmark (MTEB) framework.

Language: Jupyter Notebook - Size: 85 KB - Last synced at: 8 days ago - Pushed at: 2 months ago - Stars: 1 - Forks: 1

jeongukjae/smaller-labse

Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE

Language: Python - Size: 9.47 MB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 18 - Forks: 0

rbitr/ferrite

Simple, lightweight transformers in Fortran

Language: Fortran - Size: 28.3 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 1

Galal-pic/Talented-recruitment-and-skills-analysis-system

The project's goal is to help job seekers understand the basic qualifications for specific jobs and evaluate the suitability of their skills for those positions. Additionally, the program aims to assist recruiters in enhancing their resume selection processes by analyzing and understanding job advertisements ....

Language: HTML - Size: 12.3 MB - Last synced at: 11 days ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

Babelscape/CroCoAlign

A Cross-Lingual, Context-Aware and Fully-Neural Sentence Alignment System for Long Texts.

Language: Python - Size: 90.4 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 6 - Forks: 0

wuji3/nlpdk

Natural Language Processing(NLP) Toolbox

Language: Python - Size: 324 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 3 - Forks: 1

shangan23/similar-sentences

Similar-sentence prediction with more accurate results on your dataset, built on top of a pretrained model. #BERT

Language: Python - Size: 86.9 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 2

ksm26/Understanding-and-Applying-Text-Embeddings

Dive into the world of text embeddings. This course will guide you through leveraging text embeddings to enhance various natural language processing (NLP) tasks.

Language: Jupyter Notebook - Size: 4.58 MB - Last synced at: 23 days ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 6

izhx/uni-rep

Code for embedding and retrieval research.

Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 0

Salman-Khan-Mohammed/Q-A-System

The "Codebasics Q&A" project is an end-to-end Question and Answer (Q&A) system developed for Codebasics, an e-learning company specializing in data-related courses and bootcamps. The system is designed to assist students who typically ask questions via Discord or email by providing instant, automated responses.

Language: Jupyter Notebook - Size: 270 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

eren23/semantic-code-searcher

Basic Python example for searching code semantically in GitHub profiles.

Language: Python - Size: 44 MB - Last synced at: 25 days ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

retkowsky/azure_visual_search_toolkit

Azure AI Visual Search toolkit

Language: Jupyter Notebook - Size: 169 MB - Last synced at: 22 days ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 3

arasgungore/job-posting-duplicate-detection

A project aiming to leverage text embeddings and Milvus, a high-performance vector search engine, to detect duplicate job postings.

Language: Python - Size: 289 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

KwokHing/AI-Planet-LLM-Bootcamp-Challenge

An LLM challenge to (i) fine-tune pre-trained HuggingFace transformer model to build a Code Generation language model, and (ii) build a retrieval-augmented generation (RAG) application using LangChain

Language: Jupyter Notebook - Size: 874 KB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

dongkyunk/Semantic-Sentence-Similarity

Semantic Sentence Similarity using Word2Vec, Fasttext embedding and Cosine Similarity, Word Mover Distance

Language: Python - Size: 10.7 KB - Last synced at: 8 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

dayyass/muse_tf2pt

Convert MUSE from TensorFlow to PyTorch and ONNX

Language: Jupyter Notebook - Size: 1.74 MB - Last synced at: 7 days ago - Pushed at: 11 months ago - Stars: 11 - Forks: 0

bilalhameed248/FAQ-Finder-Using-RAG

A RAG (Retrieval augmented generation)-based FAQ Chat-Bot, designed to operate within an organization's internal domain. - Jul 2023 - Oct 2023

Language: Jupyter Notebook - Size: 6.84 KB - Last synced at: 12 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

simonsanvil/EarlyDepression-MentalRiskEs

Code for "A Framework for detecting Depression on Social Media: MentalRiskES@IberLEF 2023"

Language: Jupyter Notebook - Size: 1.79 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

ncbi-nlp/BioSentVec

BioWordVec & BioSentVec: pre-trained embeddings for biomedical words and sentences

Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: 9 months ago - Pushed at: over 1 year ago - Stars: 567 - Forks: 97

atAlexFM/HotOffThePressResearchNLP

Hot Off The Press - A curated list of the most obscure, interesting, and innovative Natural Language Processing research papers. Topics include personality, archetypes, big data sentiment mining, and narrative arcs.

Size: 8.79 KB - Last synced at: 8 days ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 1

bilalhameed248/Patient-Most-Recent-Treatments-Similarity-Measuring

BioBert Enhanced Patient Treatment Similarity Analysis with Sentence Transformers

Language: Jupyter Notebook - Size: 19.5 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

bilalhameed248/Delay-Reason-Extraction-Model

Efficient Delay Reason Extraction in Patient Appointments/Treatments Using BERT and Tensorflow. - Feb 2022 - Jun 2023

Language: Python - Size: 2.93 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

somjit101/BERT-Question-Answering

A study on encoding English sentences to TensorFlow vectors or tensors using a pre-trained BERT model from the Hugging Face library.

Language: Jupyter Notebook - Size: 828 KB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

UKPLab/useb

Heterogeneous, Task- and Domain-Specific Benchmark for Unsupervised Sentence Embeddings used in the TSDAE paper: https://arxiv.org/abs/2104.06979.

Language: Python - Size: 36.1 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 32 - Forks: 2

nagababumo/Open-Source-Models-with-Hugging-Face

Language: Jupyter Notebook - Size: 17 MB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

dennisvdang/QA-Retrieval-System

Project repository for the development of a Question-Answering (QA) information retrieval system fine-tuned on customer queries.

Language: Jupyter Notebook - Size: 19.6 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

Related Keywords
sentence-embeddings 274 nlp 104 sentence-transformers 61 natural-language-processing 43 bert 41 sentence-similarity 41 embeddings 34 python 30 machine-learning 30 pytorch 28 word-embeddings 24 transformers 23 semantic-search 20 deep-learning 20 sbert 19 tensorflow 18 huggingface 17 huggingface-transformers 15 text-classification 14 bert-embeddings 13 transformer 13 word2vec 12 llm 12 information-retrieval 12 language-model 12 nlp-machine-learning 12 ai 11 unsupervised-learning 10 langchain 10 clustering 10 embedding-models 9 question-answering 9 sentiment-analysis 9 contrastive-learning 8 rag 8 text-similarity 8 vector-database 7 flask 7 topic-modeling 7 text-embedding 7 sentence-bert 7 semantic-similarity 7 fine-tuning 7 simcse 7 bert-model 6 search-engine 6 data-science 6 streamlit 6 faiss 6 representation-learning 6 large-language-models 6 openai 6 retrieval-augmented-generation 6 python3 6 fasttext 6 vector-search 6 sentence-classification 5 text-generation 5 bert-fine-tuning 5 self-attention 5 infersent 5 natural-language-understanding 5 gpt 5 self-supervised-learning 5 embedding 5 autoencoder 4 text-summarization 4 docker 4 transfer-learning 4 glove-embeddings 4 mteb 4 universal-sentence-encoder 4 transformer-models 4 pandas 4 roberta 4 cosine-similarity 4 retrieval 4 artificial-intelligence 4 awesome-list 4 search 4 sent2vec 4 fastapi 4 pca 4 sentiment-classification 4 cross-lingual 4 sentence-representations 4 text-mining 4 similarity-search 4 sentence-encoding 3 natural-language-inference 3 document-retrieval 3 umap 3 lstm-neural-networks 3 api 3 nltk-python 3 doc2vec 3 rnn-tensorflow 3 attention 3 attention-mechanism 3 text 3