An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: sentence-embedding

SeanLee97/AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Language: Python - Size: 889 KB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 545 - Forks: 38

Huffon/sentence-similarity 📦

This repository contains various ways to calculate sentence vector similarity using NLP models

Language: Python - Size: 215 KB - Last synced at: 4 days ago - Pushed at: about 5 years ago - Stars: 198 - Forks: 34

mrpeerat/Thai-Sentence-Vector-Benchmark

Benchmark for Thai sentence representation

Language: Jupyter Notebook - Size: 19.5 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 117 - Forks: 7

LongxingTan/open-retrievals

All-in-One: Text Embedding, Retrieval, Reranking and RAG in Transformers

Language: Python - Size: 1.38 MB - Last synced at: 23 days ago - Pushed at: 24 days ago - Stars: 58 - Forks: 12

BM-K/Sentence-Embedding-Is-All-You-Need

Korean Sentence Embedding Repository

Language: Python - Size: 879 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 201 - Forks: 17

arasgungore/job-posting-duplicate-detection

A project aiming to leverage text embeddings and Milvus, a high-performance vector search engine, to detect duplicate job postings.

Language: Python - Size: 289 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

shaygeko/EASE-ReD

🚀 EASE-ReD: Ethnicity Analysis and Sentence Embedding from Restaurant Distribution. Predicts the ethnicity distribution of an area from its restaurant data, using sentence embeddings to clean the data.

Language: Python - Size: 375 MB - Last synced at: 1 day ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

luozhouyang/embedrank

EmbedRank implemented in Python.

Language: Python - Size: 13.7 KB - Last synced at: about 2 months ago - Pushed at: 12 months ago - Stars: 15 - Forks: 2

dev-chauhan/PQG-pytorch

Paraphrase Generation model using pair-wise discriminator loss

Language: Python - Size: 76.7 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 44 - Forks: 11

Lollipop/CRLT

CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning

Language: Python - Size: 737 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 0

SCAlabUnical/HASHET

HASHET (HAshtag recommendation using Sentence-to-Hashtag Embedding Translation) is a model aimed at suggesting a relevant set of hashtags for a given post.

Language: Python - Size: 41.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 7 - Forks: 6

botisan-ai/sentence-transformers.js

Run sentence-transformers (SBERT) compatible models in Node.js or browser.

Language: TypeScript - Size: 18.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

gulabpatel/Transformers

Language: Jupyter Notebook - Size: 931 KB - Last synced at: 10 days ago - Pushed at: about 2 years ago - Stars: 6 - Forks: 2

elifftosunn/textDataClean

An application that takes scraped dirty data and makes it ready for model training without requiring separate preprocessing steps.

Size: 2.69 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

BM-K/KoDiffCSE

Difference-based Contrastive Learning for Korean Sentence Embeddings

Language: Python - Size: 939 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 20 - Forks: 1

joh-ga/Through-time-with-BERT

Scripts, data, and results from the "Through time with BERT" project, which evaluated and examined the extent to which English tenses are represented in BERT's raw sentence embeddings.

Language: Jupyter Notebook - Size: 13.4 MB - Last synced at: 8 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

VictorProkhorov/Text2Path

[NAACL(2019)] Generating Knowledge Graph Paths from Textual Definitions using Sequence-to-Sequence Models

Language: Shell - Size: 217 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 11 - Forks: 1

hwaves/cd_algorithm

Official implementation for paper "Learning Discrete Sentence Representations via Construction & Decomposition".

Language: Python - Size: 21.5 KB - Last synced at: 7 months ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 1

kdrl/SCNE

C++ implementation of the paper "Segmentation-free compositional n-gram embedding". NAACL-HLT2019.

Language: C++ - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

Related Keywords
sentence-embedding (19), sentence-similarity (5), nlp (4), natural-language-processing (4), embeddings (3), sentence-embeddings (3), sentence-transformers (2), contrastive-learning (2), gpt-3 (2), deep-learning (2), pytorch (2), text-embedding (2), machine-learning (2), korean-diffcse (2), word-embedding (2), rag (2), self-supervised-learning (2), llm (2), information-retrieval (2), mbart (1), mbart50 (1), m2m100 (1), qa (1), huggingface-transformers (1), semantic-search (1), sentence-transformer (1), speech-transcripts (1), flant5 (1), english-hindi-translation (1), classification (1), bloom (1), typescript (1), transformers (1), sbert (1), javascript (1), social-media (1), hashtag-recommendation (1), deep-neural-networks (1), unsupervised-learning (1), simcse (1), representation-learning (1), iconip2020 (1), binarization (1), sequence-to-sequence (1), knowledge-graph (1), english-tenses (1), bertology (1), bert (1), word-tokenizer (1), turkish-sentence-tokenizer (1), turkish (1), string (1), stopwords (1), stemmer (1), sentence-tokenizer (1), pandas (1), numpy (1), nltk (1), ngram (1), morphological-analysis (1), deasciifier (1), corpus (1), weighandbiases (1), wandb (1), text-clustering (1), tableqa (1), summarization (1), data-science (1), korean-simcse (1), korean-sentence-bert (1), triplet-loss (1), retrieval (1), rag-retrieval (1), rag-rerank (1), llm-rerankers (1), llm-embeddings (1), finetuning (1), advanced-rag (1), vector-similarity (1), text2vec (1), text-vector (1), text-similarity (1), stsbenchmark (1), sts (1), sentence-vector (1), semantic-textual-similarity (1), semantic-similarity (1), retrieval-augmented-generation (1), mteb (1), llama2 (1), llama (1), dense-retrieval (1), data-augmentation (1), quora-question-pairs (1), quora (1), pytorch-implementation (1), paraphrase-generation (1), natural-language-generation (1), anaconda (1), term-weighting (1)