An open API service providing repository metadata for many open source software ecosystems.

Topic: "sentence-embeddings"

neuml/txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Language: Python - Size: 52 MB - Last synced at: 4 days ago - Pushed at: 11 days ago - Stars: 10,768 - Forks: 683

FlagOpen/FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python - Size: 38 MB - Last synced at: 4 days ago - Pushed at: 11 days ago - Stars: 9,408 - Forks: 678

MaartenGr/BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

Language: Python - Size: 25.1 MB - Last synced at: 4 days ago - Pushed at: 11 days ago - Stars: 6,684 - Forks: 803

shibing624/text2vec

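text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.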

Language: Python - Size: 15.4 MB - Last synced at: 3 days ago - Pushed at: 16 days ago - Stars: 4,687 - Forks: 409

princeton-nlp/SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Language: Python - Size: 40.4 MB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 3,545 - Forks: 524

Separius/awesome-sentence-embedding 📦

A curated list of pretrained sentence and word embedding models

Language: Python - Size: 282 KB - Last synced at: 1 day ago - Pushed at: about 4 years ago - Stars: 2,255 - Forks: 262

SeanLee97/xmnlp

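xmnlp: provides Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, text correction, text-to-pinyin conversion, text summarization, radical lookup, sentence representation, and text similarity computation.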

Language: Python - Size: 114 MB - Last synced at: 12 days ago - Pushed at: over 2 years ago - Stars: 1,277 - Forks: 188

JohnSnowLabs/nlu

1 line for thousands of State of The Art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.

Language: Python - Size: 474 MB - Last synced at: 16 days ago - Pushed at: 3 months ago - Stars: 909 - Forks: 138

Muennighoff/sgpt

SGPT: GPT Sentence Embeddings for Semantic Search

Language: Jupyter Notebook - Size: 17.4 MB - Last synced at: 14 days ago - Pushed at: about 1 year ago - Stars: 863 - Forks: 54

wangyuxinwhy/uniem

unified embedding model

Language: Python - Size: 12.7 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 851 - Forks: 69

goru001/inltk

Natural Language Toolkit for Indic Languages aims to provide out of the box support for various NLP tasks that an application developer might need

Language: Python - Size: 812 KB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 830 - Forks: 161

oborchers/Fast_Sentence_Embeddings

Compute Sentence Embeddings Fast!

Language: Jupyter Notebook - Size: 2.86 MB - Last synced at: 5 months ago - Pushed at: about 2 years ago - Stars: 618 - Forks: 83

jina-ai/vectordb

A Python vector database you just need - no more, no less.

Language: Python - Size: 1.22 MB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 605 - Forks: 47

ncbi-nlp/BioSentVec

BioWordVec & BioSentVec: pre-trained embeddings for biomedical words and sentences

Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: 9 months ago - Pushed at: over 1 year ago - Stars: 567 - Forks: 97

SeanLee97/AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Language: Python - Size: 889 KB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 531 - Forks: 36

kaushalshetty/Structured-Self-Attention

A Structured Self-attentive Sentence Embedding

Language: Python - Size: 492 KB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 495 - Forks: 106

sunyilgdx/SIFRank_zh

Keyphrase or keyword extraction: a Chinese keyword extraction method based on pre-trained models (Chinese-language version of the code for the paper "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model")

Language: Python - Size: 2.38 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 404 - Forks: 78

JohnGiorgi/DeCLUTR

The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to open an issue if you run into any trouble!

Language: Python - Size: 702 KB - Last synced at: 27 days ago - Pushed at: about 2 years ago - Stars: 379 - Forks: 33

Agrover112/awesome-semantic-search

A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.

Size: 371 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 353 - Forks: 29

Doragd/Awesome-Sentence-Embedding

A curated list of research papers in Sentence Representation Learning and an STS leaderboard of sentence embeddings.

Size: 174 KB - Last synced at: about 3 hours ago - Pushed at: over 1 year ago - Stars: 315 - Forks: 20

voidism/DiffCSE

Code for the NAACL 2022 long paper "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings"

Language: Python - Size: 6.3 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 291 - Forks: 27

geeks-of-data/knowledge-gpt

Extract knowledge from all information sources using GPT and other language models. Index information sources and run Q&A sessions over them.

Language: Python - Size: 3.36 MB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 284 - Forks: 54

gyunggyung/AGI-Papers

Papers and Book to look at when starting AGI 📚

Size: 35.6 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 243 - Forks: 32

yumeng5/Spherical-Text-Embedding

[NeurIPS 2019] Spherical Text Embedding

Language: C - Size: 10.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 176 - Forks: 29

dborrelli/chat-intents

Clustering sentence embeddings to extract message intent

Language: Jupyter Notebook - Size: 6.38 MB - Last synced at: 16 days ago - Pushed at: over 3 years ago - Stars: 173 - Forks: 24

robrua/easy-bert

A Dead Simple BERT API for Python and Java (https://github.com/google-research/bert)

Language: Java - Size: 44.9 KB - Last synced at: 21 days ago - Pushed at: over 2 years ago - Stars: 171 - Forks: 44

YJiangcm/PromCSE

[EMNLP 2022] Improved Universal Sentence Embeddings with Prompt-based Contrastive Learning and Energy-based Learning

Language: Python - Size: 737 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 134 - Forks: 16

pdrm83/sent2vec

How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.

Language: Python - Size: 57.6 KB - Last synced at: 10 months ago - Pushed at: almost 3 years ago - Stars: 132 - Forks: 12

KwangKa/SIMCSE_unsup

PyTorch implementation of unsupervised SimCSE for Chinese

Language: Python - Size: 3.19 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 123 - Forks: 30

cpcdoy/rust-sbert

Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers)

Language: Rust - Size: 165 KB - Last synced at: 11 days ago - Pushed at: 7 months ago - Stars: 113 - Forks: 12

TharinduDR/Simple-Sentence-Similarity

Exploring the simple sentence similarity measurements using word embeddings

Language: Python - Size: 60.4 MB - Last synced at: 16 days ago - Pushed at: 8 months ago - Stars: 101 - Forks: 37

roomylee/self-attentive-emb-tf

Simple Tensorflow Implementation of "A Structured Self-attentive Sentence Embedding" (ICLR 2017)

Language: Python - Size: 11 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 94 - Forks: 34

nikolamilosevic86/local-genAI-search

Local-GenAI-Search is a generative search engine based on Llama 3, langchain and qdrant that answers questions based on your local files

Language: Python - Size: 2.27 MB - Last synced at: 7 days ago - Pushed at: 8 months ago - Stars: 93 - Forks: 34

kamalkraj/e5-mistral-7b-instruct

Finetune mistral-7b-instruct for sentence embeddings

Language: Python - Size: 34.2 KB - Last synced at: 14 days ago - Pushed at: 12 months ago - Stars: 81 - Forks: 18

hppRC/simple-simcse

A simple implementation of SimCSE

Language: Python - Size: 157 KB - Last synced at: 24 days ago - Pushed at: over 2 years ago - Stars: 76 - Forks: 10

HITsz-TMG/KaLM-Embedding

Code for KaLM-Embedding models

Language: Python - Size: 319 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 74 - Forks: 6

hppRC/simple-simcse-ja

Exploring Japanese SimCSE

Language: Python - Size: 1.29 MB - Last synced at: 24 days ago - Pushed at: over 1 year ago - Stars: 69 - Forks: 4

flipz357/S3BERT

Semantically Structured Sentence Embeddings

Language: Python - Size: 72.3 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 65 - Forks: 5

thiswillbeyourgithub/AnnA_Anki_neuronal_Appendix

Using machine learning on your anki collection to enhance the scheduling via semantic clustering and semantic similarity

Language: Python - Size: 3.89 MB - Last synced at: 16 days ago - Pushed at: 7 months ago - Stars: 64 - Forks: 1

Moradnejad/ColBERT-Using-BERT-Sentence-Embedding-for-Humor-Detection

ColBERT humor dataset for the task of humor detection, containing 200,000 jokes/news

Language: Jupyter Notebook - Size: 6.16 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 63 - Forks: 25

hhzrd/BERT-Embedding-Frequently-Asked-Question

FAQ-based Question Answering System using BERT

Language: Python - Size: 795 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 58 - Forks: 33

algoprog/Quin

An easy to use framework for large-scale fact-checking and question answering

Language: Python - Size: 51.8 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 57 - Forks: 7

sileod/Discovery

Mining Discourse Markers for Unsupervised Sentence Representation Learning

Language: Jupyter Notebook - Size: 541 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 57 - Forks: 2

yanzhangnlp/IS-BERT

An Unsupervised Sentence Embedding Method by Mutual Information Maximization (EMNLP2020)

Language: Python - Size: 27.9 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 53 - Forks: 10

dayyass/muse-as-service

REST API for sentence tokenization and embedding using Multilingual Universal Sentence Encoder.

Language: Python - Size: 339 KB - Last synced at: 13 days ago - Pushed at: over 3 years ago - Stars: 51 - Forks: 5

shubham0204/Sentence-Embeddings-Android

Embeddings from sentence-transformers in Android! Supports all-MiniLM-L6-V2, bge-small-en, snowflake-arctic, model2vec models and more

Language: Kotlin - Size: 42 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 43 - Forks: 5

worldbank/GISTEmbed

GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings

Language: Python - Size: 1.3 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 42 - Forks: 3

BM-K/KoSimCSE-SKT

Simple Contrastive Learning of Korean Sentence Embeddings

Language: Python - Size: 703 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 39 - Forks: 4

TianduoWang/DiffAug

[EMNLP 2022] Differentiable Data Augmentation for Contrastive Sentence Representation Learning. https://arxiv.org/abs/2210.16536

Language: Python - Size: 551 KB - Last synced at: 13 days ago - Pushed at: over 2 years ago - Stars: 39 - Forks: 2

hellonlp/sentence-similarity

Text similarity, semantic vectors, text vectors, text-similarity, similarity, sentence-similarity, BERT, SimCSE, BERT-Whitening, Sentence-BERT, PromCSE, SBERT

Language: Python - Size: 221 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 37 - Forks: 11

hellojwilde/energetic-ai

EnergeticAI is TensorFlow.js, optimized for serverless environments, with fast cold-start, small module size, and pre-trained models.

Language: TypeScript - Size: 35.8 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 36 - Forks: 0

ahr9n/quranic-search-v2

Quranic Lexical/Semantic Search

Language: Jupyter Notebook - Size: 5.91 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 35 - Forks: 7

UKPLab/useb

Heterogeneous, Task- and Domain-Specific Benchmark for Unsupervised Sentence Embeddings used in the TSDAE paper: https://arxiv.org/abs/2104.06979

Language: Python - Size: 36.1 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 32 - Forks: 2

yanzhangnlp/BSL

Bootstrapped Unsupervised Sentence Representation Learning (ACL 2021)

Language: Python - Size: 62.2 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 26 - Forks: 0

SAP-samples/acl2022-self-contrastive-decorrelation

Source code for ACL 2022 paper "Self-contrastive Decorrelation for Sentence Embeddings".

Language: Python - Size: 278 KB - Last synced at: 13 days ago - Pushed at: about 2 months ago - Stars: 25 - Forks: 7

EQTPartners/pause

🍊 PAUSE (Positive and Annealed Unlabeled Sentence Embedding), accepted by EMNLP'2021 🌴

Language: Python - Size: 83 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 25 - Forks: 1

svjack/Sbert-ChineseExample

Sentence-Transformers Information Retrieval example on Chinese

Language: Python - Size: 53.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 25 - Forks: 6

perceptiveshawty/RankCSE

Implementation of "RankCSE: Unsupervised Sentence Representation Learning via Learning to Rank" (ACL 2023)

Language: Python - Size: 392 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 24 - Forks: 4

sdadas/polish-sentence-evaluation

Evaluation of Sentence Representations in Polish

Language: Python - Size: 4.96 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 22 - Forks: 3

MatthewCYM/GenSE

Official implementation of EMNLP 2022 paper "Generate, Discriminate, and Contrast: A Semi-Supervised Sentence Representation Learning Framework"

Language: Python - Size: 975 KB - Last synced at: 21 days ago - Pushed at: over 2 years ago - Stars: 22 - Forks: 1

tony-hong/event-embedding-multitask

*SEM 2018: Learning Distributed Event Representations with a Multi-Task Approach

Language: Python - Size: 673 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 22 - Forks: 13

yiren-jian/NonLing-CSE

[NeurIPS 2022] Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings

Language: Python - Size: 1.44 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 20 - Forks: 2

kampersanda/sif-embedding

Rust implementation of SIF and uSIF: Simple and fast sentence embedding

Language: Rust - Size: 1.22 MB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 19 - Forks: 0

ogencoglu/fair_cyberbullying_detection

Source code and models for the paper "Cyberbullying Detection with Fairness Constraints". IEEE Internet Computing, 2020

Language: Jupyter Notebook - Size: 598 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 18 - Forks: 3

arsena-k/discourse_atoms

How are topics encoded in semantic space? Repository to accompany PNAS article: https://www.pnas.org/doi/10.1073/pnas.2108801119

Language: Jupyter Notebook - Size: 26.4 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 5

jeongukjae/smaller-labse

Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE

Language: Python - Size: 9.47 MB - Last synced at: 13 days ago - Pushed at: over 3 years ago - Stars: 18 - Forks: 0

shijx12/AR-Tree

Pytorch implementation of the paper "Learning to Embed Sentences Using Attentive Recursive Trees".

Language: Python - Size: 5 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 18 - Forks: 3

LazarusNLP/indonesian-sentence-embeddings

Embedding Representation for Indonesian Sentences!

Language: Jupyter Notebook - Size: 1.56 MB - Last synced at: 14 days ago - Pushed at: 9 months ago - Stars: 17 - Forks: 2

vTuanpham/Vietnamese_QA_System

Vietnamese long form question answering system with documents retrieval.

Language: Python - Size: 444 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 17 - Forks: 6

MoleculeTransformers/smiles-featurizers

Extract molecular SMILES embeddings from language models pre-trained with various objectives and architectures.

Language: Python - Size: 39.1 KB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 1

jifei/simcse-tf2

A TensorFlow 2 Keras implementation of SimCSE, covering both the unsupervised and supervised variants.

Language: Python - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 17 - Forks: 2

rbitr/ferrite

Simple, lightweight transformers in Fortran

Language: Fortran - Size: 28.3 KB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 1

izhx/uni-rep

Code for embedding and retrieval research.

Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 0

retkowsky/azure_visual_search_toolkit

Azure AI Visual Search toolkit

Language: Jupyter Notebook - Size: 169 MB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 3

alibaba/SimCSE-with-CARDS

Source code for SIGIR 2022 paper.

Language: Python - Size: 115 KB - Last synced at: 11 months ago - Pushed at: about 3 years ago - Stars: 15 - Forks: 1

TheShadow29/VC-with-GAN

Voice Conversion with GANs

Language: Python - Size: 8.38 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 15 - Forks: 3

yaushian/mSimCSE

mSimCSE: Multilingual SimCSE

Language: Python - Size: 2.62 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 1

mladvladimir/rust-sentence-transformers

Rust port of https://github.com/UKPLab/sentence-transformers

Language: Rust - Size: 17.6 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 14 - Forks: 1

DeepK/hoDMD-experiments

EigenSent: Spectral sentence embeddings using higher-order Dynamic Mode Decomposition

Language: Python - Size: 47.8 MB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 13 - Forks: 4

HsiaoYetGun/InferSent

TensorFlow implementation of FAIR's InferSent (Supervised Learning of Universal Sentence Representations from Natural Language Inference Data)

Language: Python - Size: 14.6 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 13 - Forks: 8

amazon-science/text_generation_diffusion_llm_topic

Topic Embedding, Text Generation and Modeling using diffusion

Language: Python - Size: 152 KB - Last synced at: 19 days ago - Pushed at: 8 months ago - Stars: 12 - Forks: 3

shuxiaobo/text-representation

Text representation works, such as : paper, code, review, datasets, blogs, thesis and so on.

Size: 27.3 KB - Last synced at: about 2 months ago - Pushed at: about 6 years ago - Stars: 12 - Forks: 0

HsiaoYetGun/Toy-Model-for-NLI

My toy model for natural language inference task.

Language: Python - Size: 93.8 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 12 - Forks: 2

ChristophAlt/embedding_vectorizer

Scikit-learn vectorizer implementing "A simple but tough-to-beat baseline for sentence embeddings." by Arora, Sanjeev, Yingyu Liang, and Tengyu Ma. (2016)

Language: Python - Size: 1.34 MB - Last synced at: 11 months ago - Pushed at: about 7 years ago - Stars: 12 - Forks: 4

dayyass/muse_tf2pt

Convert MUSE from TensorFlow to PyTorch and ONNX

Language: Jupyter Notebook - Size: 1.74 MB - Last synced at: 13 days ago - Pushed at: 11 months ago - Stars: 11 - Forks: 0

Susheel-1999/Sentence_Similarity

Package to calculate the similarity score between two sentences

Language: Python - Size: 10.7 KB - Last synced at: 10 days ago - Pushed at: almost 2 years ago - Stars: 11 - Forks: 1

danielwatson6/skip-thoughts

Simple TensorFlow implementation of skip-thought vectors

Language: Python - Size: 63.5 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 4

VictorProkhorov/KL_Text_VAE

[WNGT(2019)] On the Importance of the Kullback-Leibler Divergence Term in Variational Autoencoders for Text Generation

Language: Python - Size: 48.8 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 11 - Forks: 2

sunyilgdx/CwVW-SIF

SIF sentence embedding model with word selection based on variance weighting factors: experiment source code

Language: Python - Size: 1 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 11 - Forks: 1

qiyuw/PeerCL

EMNLP 2022 "PCL: Peer-Contrastive Learning with Diverse Augmentations for Unsupervised Sentence Embeddings"

Language: Python - Size: 111 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 0

tic-top/LoraCSE

😜 Contrastive Learning of Sentence Embedding using LoRA (EECS487 final project)

Language: Jupyter Notebook - Size: 17.5 MB - Last synced at: 12 months ago - Pushed at: about 2 years ago - Stars: 10 - Forks: 0

chengzhipanpan/PaSeR

Code for EMNLP paper `Sentence Representation Learning with Generative Objective rather than Contrastive Objective`

Language: Python - Size: 7.26 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 0

jonaylor89/WineInAMillion

Wine Recommender created with sentence-BERT and NearestNeighbor on AWS SageMaker

Language: Jupyter Notebook - Size: 1.53 MB - Last synced at: 19 days ago - Pushed at: about 2 years ago - Stars: 9 - Forks: 1

iarroyof/sentence_embedding

A sentence embedding method based on weighted series

Language: Python - Size: 109 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 1

yuanzhoulvpi2017/Rust4SenVec

convert sentence to vector by nlp transformers model in Rust

Language: Jupyter Notebook - Size: 21.5 KB - Last synced at: 23 days ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 1

vineetm/tf-similar-sentences

Find similar sentences using Tensorflow Hub for English Wikipedia

Language: Python - Size: 82 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 9 - Forks: 7

goamegah/pytorch-stc

PyTorch implementation of Self-training approach for short text clustering

Language: Python - Size: 16.9 MB - Last synced at: 11 days ago - Pushed at: 11 months ago - Stars: 8 - Forks: 0

luozhouyang/DeepSE

Sentence Embeddings using Deep Neural Networks in PRODUCTION!

Language: Python - Size: 54.7 KB - Last synced at: 13 days ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 1

LorenzoMinto/ex-GPT-Summarizer

An Extractive-Abstractive Summarization Framework with a Sentence Embeddings Twist. Based on GPT-2 transformer fine-tuned on CNN/DailyMail dataset

Language: Python - Size: 410 MB - Last synced at: 25 days ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 2

bhattbhavesh91/sentence-transformers-example

HuggingFace's Transformer models for sentence / text embedding generation.

Language: Jupyter Notebook - Size: 29.3 KB - Last synced at: 8 days ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 4

Related Topics
nlp (104), sentence-transformers (61), natural-language-processing (43), sentence-similarity (41), bert (41), embeddings (34), machine-learning (30), python (30), pytorch (28), word-embeddings (24), transformers (23), semantic-search (20), deep-learning (20), sbert (19), tensorflow (18), huggingface (17), huggingface-transformers (15), text-classification (14), transformer (13), bert-embeddings (13), word2vec (12), llm (12), nlp-machine-learning (12), language-model (12), information-retrieval (12), langchain (11), ai (11), unsupervised-learning (10), clustering (10), embedding-models (10), sentiment-analysis (9), rag (9), question-answering (9), text-similarity (8), contrastive-learning (8), text-embedding (7), openai (7), flask (7), retrieval-augmented-generation (7), simcse (7), topic-modeling (7), semantic-similarity (7), sentence-bert (7), vector-database (7), fine-tuning (7), data-science (6), vector-search (6), bert-model (6), fasttext (6), search-engine (6), streamlit (6), large-language-models (6), faiss (6), python3 (6), representation-learning (6), embedding (5), infersent (5), sentence-classification (5), bert-fine-tuning (5), natural-language-understanding (5), text-generation (5), self-supervised-learning (5), gpt (5), self-attention (5), text-summarization (4), msmarco (4), similarity-search (4), artificial-intelligence (4), transfer-learning (4), awesome-list (4), search (4), fastapi (4), universal-sentence-encoder (4), roberta (4), sentiment-classification (4), pandas (4), cross-lingual (4), glove-embeddings (4), docker (4), text-mining (4), mteb (4), autoencoder (4), pca (4), retrieval (4), cosine-similarity (4), transformer-models (4), sentence-representations (4), embedding-vectors (4), sent2vec (4), bm25 (3), summarization (3), biobert (3), sentence-representation (3), awesome (3), nlu (3), llama3 (3), lstm-neural-networks (3), nltk-python (3), text (3), sagemaker (3)