GitHub topics: sentence-embeddings

Repositories

neuml/txtai

💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows

Language: Python - Size: 53 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 11,121 - Forks: 702

rutujakokate430/Multi-AI-Agent-team-of-Researchers-Software-developers-and-QA

Crew of AI Agents working towards developing the reference software solution end-to-end autonomously

Size: 0 Bytes - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

david-xander/visual-analytics-tool-sentence-embeddings

A visual analytics tool and framework for exploring compositionality in sentence embeddings. Gain interactive insights into how embedding models, composition functions, and similarity metrics influence textual representations, focusing on error gap analysis for enhanced model interpretability.

Language: Python - Size: 13.7 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

rafay123321/embedding-hallucinations

This repo shows how foundation models hallucinate and how such hallucinations can be fixed by fine-tuning them

Language: Python - Size: 476 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

SeanLee97/AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Language: Python - Size: 889 KB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 547 - Forks: 38

jina-ai/vectordb

A Python vector database you just need - no more, no less.

Language: Python - Size: 1.22 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 619 - Forks: 47

MaartenGr/BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

Language: Python - Size: 23.7 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 6,834 - Forks: 829

gyunggyung/AGI-Papers

Papers and Book to look at when starting AGI 📚

Language: Python - Size: 35.7 MB - Last synced at: 4 days ago - Pushed at: 9 months ago - Stars: 281 - Forks: 45

BBC-Esq/KeyBERT_GUI

GUI for the great keybert repository.

Language: Python - Size: 73.2 KB - Last synced at: 1 day ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

shubham0204/Sentence-Embeddings-Android

Embeddings from sentence-transformers in Android! Supports all-MiniLM-L6-V2, bge-small-en, snowflake-arctic, model2vec models and more

Language: Kotlin - Size: 42 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 47 - Forks: 6

Separius/awesome-sentence-embedding 📦

A curated list of pretrained sentence and word embedding models

Language: Python - Size: 282 KB - Last synced at: 11 days ago - Pushed at: about 4 years ago - Stars: 2,258 - Forks: 262

robrua/easy-bert

A Dead Simple BERT API for Python and Java (https://github.com/google-research/bert)

Language: Java - Size: 44.9 KB - Last synced at: about 22 hours ago - Pushed at: over 2 years ago - Stars: 173 - Forks: 44

shibing624/text2vec

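text2vec, text to vector. A text vector representation tool that converts text into vector matrices; it implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.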

Language: Python - Size: 15.4 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 4,754 - Forks: 413

cui-shaobo/conditional-dichotomy-quantification

A lightweight toolkit for measuring how “opposite” two texts are when they share the same context.

Language: Python - Size: 6.73 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

Agrover112/awesome-semantic-search

A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.

Size: 371 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 354 - Forks: 29

fuzzy-memory/caffeine-print

A current affairs and politics news mailer

Language: Python - Size: 1.46 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

nikolamilosevic86/local-genAI-search

Local-GenAI-Search is a generative search engine based on Llama 3, langchain and qdrant that answers questions based on your local files

Language: Python - Size: 2.27 MB - Last synced at: 7 days ago - Pushed at: 10 months ago - Stars: 94 - Forks: 36

oborchers/Fast_Sentence_Embeddings

Compute Sentence Embeddings Fast!

Language: Jupyter Notebook - Size: 2.86 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 623 - Forks: 84

shangan23/similar-sentences

Similar sentence prediction with more accurate results on your dataset, built on top of a pretrained model. #BERT

Language: Python - Size: 86.9 KB - Last synced at: 20 days ago - Pushed at: about 5 years ago - Stars: 8 - Forks: 2

wangyuxinwhy/uniem

unified embedding model

Language: Python - Size: 12.7 MB - Last synced at: 14 days ago - Pushed at: almost 2 years ago - Stars: 863 - Forks: 70

FlagOpen/FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python - Size: 49.5 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 9,763 - Forks: 715

Sakibalam03/resume-scanner

🔍 AI-powered resume scanner that ranks candidates by semantic similarity to job descriptions. Supports PDF/DOCX/images with OCR fallback and sentence transformer embeddings for intelligent matching beyond keywords.

Language: Python - Size: 388 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

SeanLee97/xmnlp

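xmnlp: provides Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, text correction, text-to-pinyin conversion, text summarization, character radical extraction, sentence representation, and text similarity computation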

Language: Python - Size: 114 MB - Last synced at: 26 days ago - Pushed at: over 2 years ago - Stars: 1,285 - Forks: 188

goru001/inltk

Natural Language Toolkit for Indic Languages aims to provide out of the box support for various NLP tasks that an application developer might need

Language: Python - Size: 812 KB - Last synced at: 29 days ago - Pushed at: over 1 year ago - Stars: 830 - Forks: 160

geeks-of-data/knowledge-gpt

Extract knowledge from all information sources using GPT and other language models. Index information sources and run Q&A sessions over them.

Language: Python - Size: 3.36 MB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 281 - Forks: 54

princeton-nlp/SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Language: Python - Size: 40.4 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 3,558 - Forks: 526

Doragd/Awesome-Sentence-Embedding

A curated list of research papers in Sentence Representation Learning and an STS leaderboard of sentence embeddings.

Size: 174 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 316 - Forks: 20

Muennighoff/sgpt

SGPT: GPT Sentence Embeddings for Semantic Search

Language: Jupyter Notebook - Size: 17.4 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 867 - Forks: 54

JohnSnowLabs/nlu

1 line for thousands of State of The Art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.

Language: Python - Size: 474 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 915 - Forks: 138

kamalkraj/e5-mistral-7b-instruct

Finetune mistral-7b-instruct for sentence embeddings

Language: Python - Size: 34.2 KB - Last synced at: 26 days ago - Pushed at: about 1 year ago - Stars: 80 - Forks: 18

dborrelli/chat-intents

Clustering sentence embeddings to extract message intent

Language: Jupyter Notebook - Size: 6.38 MB - Last synced at: 12 days ago - Pushed at: over 3 years ago - Stars: 174 - Forks: 24

MoleculeTransformers/smiles-featurizers

Extract Molecular SMILES embeddings from language models pre-trained with various objectives and architectures.

Language: Python - Size: 39.1 KB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 18 - Forks: 1

DanRo3/tesis-multiagente

AI-based multi-agent system for extracting and visualizing information from vector databases using natural language.

Language: Python - Size: 2.72 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

toninf/dense_retrieval

Word2vec, sentenceBert, BM25 and IVFFlat Index quality and speed comparison

Language: Jupyter Notebook - Size: 129 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

TharinduDR/Simple-Sentence-Similarity

Exploring the simple sentence similarity measurements using word embeddings

Language: Python - Size: 60.4 MB - Last synced at: 12 days ago - Pushed at: 10 months ago - Stars: 100 - Forks: 37

cpcdoy/rust-sbert

Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers)

Language: Rust - Size: 165 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 114 - Forks: 12

MatthewCYM/GenSE

Official implementation of EMNLP 2022 paper "Generate, Discriminate, and Contrast: A Semi-Supervised Sentence Representation Learning Framework"

Language: Python - Size: 975 KB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 23 - Forks: 1

hppRC/simple-simcse-ja

Exploring Japanese SimCSE

Language: Python - Size: 1.29 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 68 - Forks: 4

worldbank/GISTEmbed

GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings

Language: Python - Size: 1.3 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 3

ritesh-modi/embedding-hallucinations

This repo shows how foundation models hallucinate and how such hallucinations can be fixed by fine-tuning them

Language: Python - Size: 474 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

LazarusNLP/indonesian-sentence-embeddings

Embedding Representation for Indonesian Sentences!

Language: Jupyter Notebook - Size: 1.56 MB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 17 - Forks: 2

amazon-science/text_generation_diffusion_llm_topic

Topic Embedding, Text Generation and Modeling using diffusion

Language: Python - Size: 154 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 12 - Forks: 3

SAP-samples/acl2022-self-contrastive-decorrelation

Source code for ACL 2022 paper "Self-contrastive Decorrelation for Sentence Embeddings".

Language: Python - Size: 278 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 25 - Forks: 7

thiswillbeyourgithub/AnnA_Anki_neuronal_Appendix

Using machine learning on your anki collection to enhance the scheduling via semantic clustering and semantic similarity

Language: Python - Size: 3.89 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 64 - Forks: 1

JohnGiorgi/DeCLUTR

The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to open an issue if you run into any trouble!

Language: Python - Size: 702 KB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 379 - Forks: 33

dayyass/muse-as-service

REST API for sentence tokenization and embedding using Multilingual Universal Sentence Encoder.

Language: Python - Size: 339 KB - Last synced at: 3 days ago - Pushed at: almost 4 years ago - Stars: 51 - Forks: 5

HITsz-TMG/KaLM-Embedding

Code for KaLM-Embedding models

Language: Python - Size: 319 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 74 - Forks: 6

YomnaWaleed/job-recommendation-system-ai

AI-Powered Job Recommendation System: an intelligent job recommendation system that analyzes PDF resumes and suggests the best job opportunities using NLP, FAISS, and Sentence Transformers.

Language: Jupyter Notebook - Size: 88.7 MB - Last synced at: 24 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

YomnaWaleed/medical-chatbot-using-Llama2

A medical chatbot built with Meta's Llama2, LangChain, and FAISS to provide accurate, context-aware responses to medical queries. The system uses a Flask-based web interface for user interaction and leverages Hugging Face embeddings for efficient document retrieval. Ideal for exploring domain-specific AI applications in healthcare.

Language: Jupyter Notebook - Size: 19.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

tomlin7/AI-research-assistant

Semantic document search system with pgvector and PGAI

Language: Python - Size: 50.8 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 2 - Forks: 2

TianduoWang/DiffAug

[EMNLP 2022] Differentiable Data Augmentation for Contrastive Sentence Representation Learning. https://arxiv.org/abs/2210.16536

Language: Python - Size: 551 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 39 - Forks: 2

4AI/BeLLM

Code for BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings (NAACL2024)

Language: Python - Size: 247 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 7 - Forks: 0

ahr9n/quranic-search-v2

Quranic Lexical/Semantic Search

Language: Jupyter Notebook - Size: 5.91 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 35 - Forks: 7

SkywardAI/kirin

Self-hosted and local-first application for inference and RAG on consumer grade hardware.

Language: Python - Size: 918 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 7 - Forks: 8

YJiangcm/PromCSE

[EMNLP 2022] Improved Universal Sentence Embeddings with Prompt-based Contrastive Learning and Energy-based Learning

Language: Python - Size: 737 KB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 134 - Forks: 16

DeepK/hoDMD-experiments

EigenSent: Spectral sentence embeddings using higher-order Dynamic Mode Decomposition

Language: Python - Size: 47.8 MB - Last synced at: 4 months ago - Pushed at: almost 6 years ago - Stars: 13 - Forks: 4

BounharAbdelaziz/MorDern-Bert

Sentence Transformer model finetuned from ModernBERT-base for Moroccan Darija.

Language: Jupyter Notebook - Size: 46.9 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

jeongukjae/question-similarity

Find similar questions via contrastive learning

Language: Python - Size: 93.8 KB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

yuanzhoulvpi2017/Rust4SenVec

convert sentence to vector by nlp transformers model in Rust

Language: Jupyter Notebook - Size: 21.5 KB - Last synced at: 5 days ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 1

pranshurastogi29/Amazon_ml_challenge-solution

26th-place solution out of 3,290 teams in a challenge held on HackerEarth

Language: Jupyter Notebook - Size: 238 KB - Last synced at: 2 months ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 0

goldpulpy/pysentence-similarity

PySentence-Similarity is a tool designed to identify and find similarities between sentences and a base sentence, expressed as a percentage 📊.

Language: Python - Size: 60.5 KB - Last synced at: 15 days ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

Abhigyan126/FEEDBACK

A Flask-based web application that analyzes user comments using sentiment analysis, similarity detection, and AI-powered insights.

Language: Python - Size: 9.77 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

kampersanda/sif-embedding

Rust implementation of SIF and uSIF: Simple and fast sentence embedding

Language: Rust - Size: 1.22 MB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 19 - Forks: 0

hellojwilde/energetic-ai

EnergeticAI is TensorFlow.js, optimized for serverless environments, with fast cold-start, small module size, and pre-trained models.

Language: TypeScript - Size: 35.8 MB - Last synced at: about 5 hours ago - Pushed at: over 1 year ago - Stars: 36 - Forks: 0

goamegah/pytorch-stc

PyTorch implementation of Self-training approach for short text clustering

Language: Python - Size: 16.9 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 0

chaosgen/awesome-sentence-embedding

A curated list of pretrained sentence and word embedding models

Language: Python - Size: 213 KB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 4 - Forks: 0

paraglondhe098/movie-recommendation-llm-embeddings

Movie recommender system using LLM and Vector database

Language: Jupyter Notebook - Size: 189 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

ash-sha/Semantic-Textual-Similarity-NLP

Measuring similarity of a sentence

Language: Jupyter Notebook - Size: 4.14 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

DolbyUUU/Reinforcement-Calibration-SimCSE

Reinforcement Calibration SimCSE, combining contrastive learning, artificial potential fields, perceptual loss, and RLHF to achieve improved Semantic Textual Similarity (STS) embeddings. PyTorch-based implementations of PerceptualBERT and ForceBasedInfoNCE, along with fine-tuning capabilities via RLHF and evaluation using SentEval.

Language: Python - Size: 371 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

voidism/DiffCSE

Code for the NAACL 2022 long paper "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings"

Language: Python - Size: 6.3 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 293 - Forks: 26

ojrlopez27/nl-service-composition

NLSC: Unrestricted Natural Language-based Service Composition Middleware that uses Sentence Embeddings, Named-Entity Recognition, and other NLP models.

Language: Java - Size: 450 MB - Last synced at: 3 months ago - Pushed at: about 5 years ago - Stars: 8 - Forks: 1

hppRC/simple-simcse

A simple implementation of SimCSE

Language: Python - Size: 157 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 76 - Forks: 10

Nikoletos-K/QA-with-SBERT-for-CORD19

⚕️🦠 Developed a document retrieval system to return titles of scientific papers containing the answer to a given user question based on the first version of the COVID-19 Open Research Dataset (CORD-19) ☣️🧬

Language: Jupyter Notebook - Size: 1.53 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 4 - Forks: 1

Lizhecheng02/UCSD-CSE256-PA3

CSE 256 LIGN 256 - Statistical Natural Lang Proc - Nakashole [FA24] PA3

Language: Jupyter Notebook - Size: 5.07 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

shasss447/QuestionAnwering-with-RAG

This project implements a Retrieval-Augmented Generation (RAG) pipeline for answering user queries by combining information retrieval with text generation.

Language: Jupyter Notebook - Size: 2.93 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

kaushalshetty/Structured-Self-Attention

A Structured Self-attentive Sentence Embedding

Language: Python - Size: 492 KB - Last synced at: 7 months ago - Pushed at: almost 6 years ago - Stars: 495 - Forks: 106

atinyshrimp/TripAdvisor-Recommendation-ML-NLP

Machine Learning and NLP models for improving text-based recommendations on TripAdvisor, using BM25, TF-IDF, embeddings, and a Hybrid approach.

Language: Jupyter Notebook - Size: 489 KB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

ksm26/Embedding-Models-From-Architecture-to-Implementation

Understand and build embedding models, focusing on word and sentence embeddings, dual encoder architectures. Learn to train embedding models using contrastive loss, implement them in semantic search and RAG systems.

Language: Jupyter Notebook - Size: 2 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 3 - Forks: 0

spiritokko/sentence_similarity

Language: Python - Size: 646 KB - Last synced at: 7 days ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

cui-shaobo/causal-strength

evaluating the causal strength between cause and effect

Language: Python - Size: 107 KB - Last synced at: 10 days ago - Pushed at: 8 months ago - Stars: 2 - Forks: 0

ai-lluminator/backend

The backend for the Ailluminator project, which sends updates when relevant papers are published, based on a prompt from the user.

Language: Python - Size: 9.21 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

francescobaio/Sentence_Reordering

This project was undertaken as part of the Deep Learning course final exam. The primary objective of this project is to develop and implement a deep learning model for sentence reordering. Sentence reordering is a challenging Natural Language Processing (NLP) task that involves rearranging the words in an ordered sentence.

Language: Jupyter Notebook - Size: 71.3 KB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

sdadas/polish-sentence-evaluation

Evaluation of Sentence Representations in Polish

Language: Python - Size: 4.96 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 22 - Forks: 3

flipz357/S3BERT

Semantically Structured Sentence Embeddings

Language: Python - Size: 72.3 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 65 - Forks: 5

Synapxe-DNA/healthhub-content-optimization

Content Optimization code for Health Hub Articles

Language: Jupyter Notebook - Size: 114 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

louisbrulenaudet/tax-retrieval-benchmark

An implementation of the TaxRetrievalBenchmark task for the 🤗 Massive Text Embedding Benchmark (MTEB) framework.

Language: Jupyter Notebook - Size: 85 KB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 1

jeongukjae/smaller-labse

Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE

Language: Python - Size: 9.47 MB - Last synced at: 2 months ago - Pushed at: almost 4 years ago - Stars: 18 - Forks: 0

rbitr/ferrite

Simple, lightweight transformers in Fortran

Language: Fortran - Size: 28.3 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 1

Galal-pic/Talented-recruitment-and-skills-analysis-system

The project's goal is to help job seekers understand the basic qualifications for specific jobs and evaluate the suitability of their skills for those positions. Additionally, the program aims to assist recruiters in enhancing their resume selection processes by analyzing and understanding job advertisements ....

Language: HTML - Size: 12.3 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

Babelscape/CroCoAlign

A Cross-Lingual, Context-Aware and Fully-Neural Sentence Alignment System for Long Texts.

Language: Python - Size: 90.4 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 6 - Forks: 0

wuji3/nlpdk

Natural Language Processing(NLP) Toolbox

Language: Python - Size: 324 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 3 - Forks: 1

ksm26/Understanding-and-Applying-Text-Embeddings

Dive into the world of text embeddings. This course will guide you through leveraging text embeddings to enhance various natural language processing (NLP) tasks.

Language: Jupyter Notebook - Size: 4.58 MB - Last synced at: 21 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 6

izhx/uni-rep

Code for embedding and retrieval research.

Size: 5.86 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 0

Salman-Khan-Mohammed/Q-A-System

The "Codebasics Q&A" project is an end-to-end Question and Answer (Q&A) system developed for Codebasics, an e-learning company specializing in data-related courses and bootcamps. The system is designed to assist students who typically ask questions via Discord or email by providing instant, automated responses.

Language: Jupyter Notebook - Size: 270 MB - Last synced at: 4 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Related Keywords

sentence-embeddings (283), nlp (107), sentence-transformers (64), natural-language-processing (43), bert (41), sentence-similarity (41), embeddings (35), python (33), machine-learning (30), pytorch (28), word-embeddings (24), transformers (23), deep-learning (21), semantic-search (20), sbert (19), tensorflow (18), huggingface (17), huggingface-transformers (15), text-classification (15), nlp-machine-learning (14), bert-embeddings (13), transformer (13), word2vec (12), llm (12), language-model (12), information-retrieval (12), langchain (11), ai (11), unsupervised-learning (10), clustering (10), embedding-models (10), sentiment-analysis (10), rag (9), question-answering (9), topic-modeling (8), text-similarity (8), semantic-similarity (8), contrastive-learning (8), python3 (8), text-embedding (7), vector-database (7), openai (7), retrieval-augmented-generation (7), simcse (7), sentence-bert (7), fine-tuning (7), flask (7), streamlit (6), search-engine (6), faiss (6), fasttext (6), representation-learning (6), vector-search (6), large-language-models (6), data-science (6), bert-model (6), infersent (5), bert-fine-tuning (5), artificial-intelligence (5), natural-language-understanding (5), sentence-classification (5), universal-sentence-encoder (5), embedding (5), text-generation (5), self-attention (5), self-supervised-learning (5), gpt (5), pandas (5), sentence (4), sif (4), retrieval (4), cosine-similarity (4), roberta (4), awesome-list (4), fastapi (4), cross-lingual (4), transformer-models (4), sentiment-classification (4), similarity-search (4), sentence-representations (4), glove-embeddings (4), docker (4), mteb (4), search (4), pca (4), sent2vec (4), msmarco (4), transfer-learning (4), text-mining (4), text-summarization (4), keras (4), semantic-textual-similarity (4), embedding-vectors (4), keyword-extraction (4), natural-language-inference (4), autoencoder (4), llama3 (3), deep-neural-networks (3), sagemaker (3), javascript (3)