An open API service providing repository metadata for many open source software ecosystems.

Topic: "sentence-embedding"

SeanLee97/AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Language: Python - Size: 889 KB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 535 - Forks: 36

BM-K/Sentence-Embedding-Is-All-You-Need

Korean Sentence Embedding Repository

Language: Python - Size: 879 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 201 - Forks: 17

Huffon/sentence-similarity 📦

This repository contains various ways to calculate sentence vector similarity using NLP models

Language: Python - Size: 215 KB - Last synced at: 6 days ago - Pushed at: about 5 years ago - Stars: 199 - Forks: 34

mrpeerat/Thai-Sentence-Vector-Benchmark

Benchmark for Thai sentence representation

Language: Jupyter Notebook - Size: 19.5 MB - Last synced at: 18 days ago - Pushed at: 9 months ago - Stars: 114 - Forks: 7

LongxingTan/open-retrievals

All-in-One: Text Embedding, Retrieval, Reranking and RAG in Transformers

Language: Python - Size: 1.36 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 56 - Forks: 11

dev-chauhan/PQG-pytorch

Paraphrase Generation model using pair-wise discriminator loss

Language: Python - Size: 76.7 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 44 - Forks: 11

BM-K/KoDiffCSE

Difference-based Contrastive Learning for Korean Sentence Embeddings

Language: Python - Size: 939 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 20 - Forks: 1

luozhouyang/embedrank

EmbedRank implemented in Python.

Language: Python - Size: 13.7 KB - Last synced at: 7 days ago - Pushed at: 11 months ago - Stars: 15 - Forks: 2

VictorProkhorov/Text2Path

[NAACL(2019)] Generating Knowledge Graph Paths from Textual Definitions using Sequence-to-Sequence Models

Language: Shell - Size: 217 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 11 - Forks: 1

Lollipop/CRLT

CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning

Language: Python - Size: 737 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 8 - Forks: 0

SCAlabUnical/HASHET

HASHET (HAshtag recommendation using Sentence-to-Hashtag Embedding Translation) is a model aimed at suggesting a relevant set of hashtags for a given post.

Language: Python - Size: 41.8 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 7 - Forks: 6

gulabpatel/Transformers

Language: Jupyter Notebook - Size: 931 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 6 - Forks: 2

arasgungore/job-posting-duplicate-detection

A project aiming to leverage text embeddings and Milvus, a high-performance vector search engine, to detect duplicate job postings.

Language: Python - Size: 289 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

joh-ga/Through-time-with-BERT

Scripts, data, and results from the "Through time with BERT" project, which evaluated and examined the extent to which English tenses are represented in BERT's raw sentence embeddings.

Language: Jupyter Notebook - Size: 13.4 MB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

hwaves/cd_algorithm

Official implementation for paper "Learning Discrete Sentence Representations via Construction & Decomposition".

Language: Python - Size: 21.5 KB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 1

ShayGeko/EASE-ReD

🚀 EASE-ReD: Ethnicity Analysis and Sentence Embedding from Restaurant Distribution. Predicting ethnicity distribution in an area based on its restaurant data. Cleaning the data using sentence embeddings!

Language: Python - Size: 375 MB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

kdrl/SCNE

C++ implementation of the paper "Segmentation-free compositional n-gram embedding". NAACL-HLT2019.

Language: C++ - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

botisan-ai/sentence-transformers.js

Run sentence-transformers (SBERT) compatible models in Node.js or browser.

Language: TypeScript - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

elifftosunn/textDataClean

An application that takes raw, dirty data and makes it ready for model training without requiring separate preprocessing steps.

Size: 2.69 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Related Topics
sentence-similarity 5, natural-language-processing 4, nlp 4, embeddings 3, sentence-embeddings 3, llm 2, rag 2, information-retrieval 2, machine-learning 2, text-embedding 2, pytorch 2, contrastive-learning 2, sentence-transformers 2, deep-learning 2, word-embedding 2, gpt-3 2, korean-diffcse 2, self-supervised-learning 2, turkish-sentence-tokenizer 1, duplicate-detection 1, finetuning 1, llm-embeddings 1, dockerfile 1, docker-compose 1, llm-rerankers 1, data-science 1, rag-rerank 1, vector-similarity 1, text2vec 1, text-vector 1, text-similarity 1, stsbenchmark 1, sts 1, sentence-vector 1, semantic-textual-similarity 1, semantic-similarity 1, retrieval-augmented-generation 1, rag-retrieval 1, retrieval 1, mteb 1, triplet-loss 1, turkish 1, string 1, stopwords 1, stemmer 1, sentence-tokenizer 1, pandas 1, numpy 1, nltk 1, ngram 1, morphological-analysis 1, deasciifier 1, corpus 1, korean-simcse 1, korean-sentence-bert 1, vector-search-engine 1, word-tokenizer 1, sentence-encoding 1, sentence-encoder 1, advanced-rag 1, milvus 1, job-postings 1, job-posting 1, exploratory-data-analysis 1, embedding 1, duplicates 1, knowledge-graph 1, term-weighting 1, phrase-extraction 1, keyword-extraction 1, embedrank 1, weighandbiases 1, wandb 1, text-clustering 1, tableqa 1, summarization 1, speech-transcripts 1, sentence-transformer 1, semantic-search 1, qa 1, mbart50 1, mbart 1, m2m100 1, huggingface-transformers 1, flant5 1, english-hindi-translation 1, classification 1, bloom 1, social-media 1, hashtag-recommendation 1, deep-neural-networks 1, llama2 1, llama 1, dense-retrieval 1, data-cleaning 1, ai 1, iconip2020 1, binarization 1, representation-learning 1, english-tenses 1