word-embeddings | Topic | Ecosyste.ms: Repos

Topic: "word-embeddings"

piskvorky/gensim

Topic Modelling for Humans

Language: Python - Size: 101 MB - Last synced at: about 18 hours ago - Pushed at: 3 months ago - Stars: 16,023 - Forks: 4,396

flairNLP/flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Language: Python - Size: 351 MB - Last synced at: 1 day ago - Pushed at: 6 days ago - Stars: 14,169 - Forks: 2,115

Embedding/Chinese-Word-Vectors

100+ Chinese Word Vectors: over a hundred kinds of pre-trained Chinese word vectors

Language: Python - Size: 1.42 MB - Last synced at: about 9 hours ago - Pushed at: over 1 year ago - Stars: 12,010 - Forks: 2,329

srbhr/Resume-Matcher

Resume Matcher is an open source, free tool to improve your resume. It works by using AI, Reader LLMs, to compare and rank resumes with job descriptions.

Language: Python - Size: 100 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 8,763 - Forks: 3,317

bentrevett/pytorch-sentiment-analysis

Tutorials on getting started with PyTorch and TorchText for sentiment analysis.

Language: Jupyter Notebook - Size: 1.64 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 4,507 - Forks: 1,181

ddangelov/Top2Vec

Top2Vec learns jointly embedded topic, document and word vectors.

Language: Python - Size: 83.4 MB - Last synced at: about 17 hours ago - Pushed at: 6 months ago - Stars: 3,045 - Forks: 373

jbesomi/texthero

Text preprocessing, representation and visualization from zero to hero.

Language: Python - Size: 22.1 MB - Last synced at: 15 minutes ago - Pushed at: over 1 year ago - Stars: 2,904 - Forks: 240

JasonKessler/scattertext

Beautiful visualizations of how language differs among document types.

Language: Python - Size: 39.4 MB - Last synced at: about 3 hours ago - Pushed at: 22 days ago - Stars: 2,302 - Forks: 292

Separius/awesome-sentence-embedding 📦

A curated list of pretrained sentence and word embedding models

Language: Python - Size: 282 KB - Last synced at: 14 days ago - Pushed at: about 4 years ago - Stars: 2,257 - Forks: 262

MinishLab/model2vec

Fast State-of-the-Art Static Embeddings

Language: Python - Size: 3.62 MB - Last synced at: about 10 hours ago - Pushed at: 1 day ago - Stars: 1,670 - Forks: 83

plasticityai/magnitude

A fast, efficient universal vector embedding utility package.

Language: Python - Size: 70.7 MB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 1,647 - Forks: 119

omarsar/nlp_overview

Overview of Modern Deep Learning Techniques Applied to Natural Language Processing

Language: CSS - Size: 6.82 MB - Last synced at: 5 days ago - Pushed at: about 5 years ago - Stars: 1,332 - Forks: 198

nlptown/nlp-notebooks

A collection of notebooks for Natural Language Processing from NLP Town

Language: Jupyter Notebook - Size: 94.8 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 884 - Forks: 358

dselivanov/text2vec

Fast vectorization, topic modeling, distances and GloVe word embeddings in R.

Language: R - Size: 46.2 MB - Last synced at: 5 days ago - Pushed at: 9 months ago - Stars: 863 - Forks: 133

goru001/inltk

Natural Language Toolkit for Indic Languages aims to provide out of the box support for various NLP tasks that an application developer might need

Language: Python - Size: 812 KB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 830 - Forks: 161

meta-toolkit/meta

A Modern C++ Data Sciences Toolkit

Language: C++ - Size: 30.4 MB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 689 - Forks: 233

ncbi-nlp/BioSentVec

BioWordVec & BioSentVec: pre-trained embeddings for biomedical words and sentences

Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: 10 months ago - Pushed at: almost 2 years ago - Stars: 567 - Forks: 97

ynqa/wego

Word Embeddings in Go!

Language: Go - Size: 6.98 MB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 490 - Forks: 41

KristiyanVachev/Question-Generation

Generating multiple choice questions from text using Machine Learning.

Language: Jupyter Notebook - Size: 19.2 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 489 - Forks: 116

imgarylai/bert-embedding 📦

🔡 Token level embeddings from BERT model on mxnet and gluonnlp

Language: Python - Size: 120 KB - Last synced at: 7 days ago - Pushed at: over 5 years ago - Stars: 452 - Forks: 67

Tixierae/deep_learning_NLP

Keras, PyTorch, and NumPy Implementations of Deep Learning Architectures for NLP

Language: Jupyter Notebook - Size: 105 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 435 - Forks: 106

sunyilgdx/SIFRank_zh

Keyphrase or Keyword Extraction: a Chinese keyword extraction method based on pre-trained models (the Chinese-language version of the code for the paper "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model")

Language: Python - Size: 2.38 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 404 - Forks: 78

kamalkraj/Named-Entity-Recognition-with-Bidirectional-LSTM-CNNs

Named-Entity-Recognition-with-Bidirectional-LSTM-CNNs

Language: Python - Size: 1.09 MB - Last synced at: 2 days ago - Pushed at: about 5 years ago - Stars: 365 - Forks: 141

dccuchile/spanish-word-embeddings

Spanish word embeddings computed with different methods and from different corpora

Size: 41 KB - Last synced at: 6 months ago - Pushed at: over 5 years ago - Stars: 356 - Forks: 82

amanchadha/coursera-natural-language-processing-specialization

Programming assignments from all courses in the Coursera Natural Language Processing Specialization offered by deeplearning.ai.

Language: Jupyter Notebook - Size: 178 MB - Last synced at: 4 days ago - Pushed at: almost 2 years ago - Stars: 348 - Forks: 334

sudharsan13296/Hands-On-Deep-Learning-Algorithms-with-Python

Master Deep Learning Algorithms with Extensive Math by Implementing them using TensorFlow

Language: Jupyter Notebook - Size: 206 MB - Last synced at: 2 days ago - Pushed at: over 4 years ago - Stars: 344 - Forks: 186

chakki-works/chakin

Simple downloader for pre-trained word vectors

Language: Python - Size: 172 KB - Last synced at: 4 days ago - Pushed at: almost 3 years ago - Stars: 334 - Forks: 48

explosion/floret (fork of facebookresearch/fastText)

🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy

Language: C++ - Size: 4.4 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 310 - Forks: 12

malllabiisc/WordGCN

ACL 2019: Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks

Language: Python - Size: 5.07 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 291 - Forks: 64

gabrielspmoreira/chameleon_recsys

Source code of CHAMELEON - A Deep Learning Meta-Architecture for News Recommender Systems

Language: Python - Size: 715 KB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 276 - Forks: 81

bloomberg/koan

A word2vec negative sampling implementation with correct CBOW update.

Language: C++ - Size: 378 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 260 - Forks: 18

vngrs-ai/vnlp

State-of-the-art, lightweight NLP tools for Turkish language. Developed by VNGRS.

Language: Python - Size: 392 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 259 - Forks: 17

tolga-b/debiaswe

Remove problematic gender bias from word embeddings.

Language: Jupyter Notebook - Size: 58.6 KB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 247 - Forks: 90

devmount/GermanWordEmbeddings

Toolkit to obtain and preprocess German text corpora, train models and evaluate them with generated testsets. Built with Gensim and Tensorflow.

Language: Jupyter Notebook - Size: 911 KB - Last synced at: 2 days ago - Pushed at: 9 months ago - Stars: 238 - Forks: 51

vinhkhuc/JFastText

Java interface for fastText

Language: Java - Size: 57.6 KB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 236 - Forks: 98

lgalke/vec4ir

Word Embeddings for Information Retrieval

Language: Python - Size: 965 KB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 225 - Forks: 42

alexandrainst/danlp 📦

DaNLP is a repository for Natural Language Processing resources for the Danish Language.

Language: Python - Size: 49.4 MB - Last synced at: 25 days ago - Pushed at: 3 months ago - Stars: 205 - Forks: 34

An experiment about re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText) with some additional exclusive features and nice API. Written in Python and fully compatible with Scikit-learn.

Language: Python - Size: 537 KB - Last synced at: 14 days ago - Pushed at: almost 8 years ago - Stars: 198 - Forks: 29

cbaziotis/datastories-semeval2017-task4

Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis".

Language: Python - Size: 9.14 MB - Last synced at: 6 months ago - Pushed at: almost 7 years ago - Stars: 197 - Forks: 63

loretoparisi/fasttext.js

FastText for Node.js

Language: JavaScript - Size: 3.31 MB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 195 - Forks: 29

YannDubs/Hash-Embeddings

PyTorch implementation of Hash Embeddings (NIPS 2017). Submission to the NIPS Implementation Challenge.

Language: Python - Size: 1.12 MB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 195 - Forks: 29

somosnlp/nlp-de-cero-a-cien

Practical course: NLP from zero to a hundred 🤗

Language: Jupyter Notebook - Size: 3.86 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 188 - Forks: 90

sebischair/Lbl2Vec

Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.

Language: Python - Size: 13.7 MB - Last synced at: 28 days ago - Pushed at: over 1 year ago - Stars: 185 - Forks: 27

avidale/compress-fasttext

Tools for shrinking fastText models (in gensim format)

Language: Jupyter Notebook - Size: 30.9 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 178 - Forks: 13

dccuchile/wefe

WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!

Language: Python - Size: 41.6 MB - Last synced at: 20 days ago - Pushed at: 11 months ago - Stars: 177 - Forks: 14

datquocnguyen/LFTM

Improving topic models LDA and DMM (one-topic-per-document model for short texts) with word embeddings (TACL 2015)

Language: Java - Size: 9.02 MB - Last synced at: 18 days ago - Pushed at: about 8 years ago - Stars: 177 - Forks: 59

yumeng5/Spherical-Text-Embedding

[NeurIPS 2019] Spherical Text Embedding

Language: C - Size: 10.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 176 - Forks: 29

robrua/easy-bert

A Dead Simple BERT API for Python and Java (https://github.com/google-research/bert)

Language: Java - Size: 44.9 KB - Last synced at: 1 day ago - Pushed at: over 2 years ago - Stars: 171 - Forks: 44

zhongpeixiang/AI-NLP-Paper-Readings

This is my reading list for my PhD in AI, NLP, Deep Learning and more.

Size: 797 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 165 - Forks: 25

pnpnpn/dna2vec

dna2vec: Consistent vector representations of variable-length k-mers

Language: Python - Size: 32 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 164 - Forks: 59

PrashantRanjan09/Elmo-Tutorial

A short tutorial on Elmo training (Pre trained, Training on new data, Incremental training)

Language: Jupyter Notebook - Size: 396 KB - Last synced at: 4 months ago - Pushed at: almost 5 years ago - Stars: 155 - Forks: 38

yuvalpinter/Mimick

Code for Mimicking Word Embeddings using Subword RNNs (EMNLP 2017)

Language: Python - Size: 19.8 MB - Last synced at: 11 days ago - Pushed at: over 5 years ago - Stars: 153 - Forks: 34

augustwester/searchthearxiv

The code powering searchthearxiv.com, a simple semantic search engine for more than 300,000 ML papers on arXiv.

Language: Python - Size: 126 KB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 147 - Forks: 14

guenthermi/postgres-word2vec

utils to use word embedding models like word2vec vectors in a PostgreSQL database

Language: C - Size: 917 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 143 - Forks: 19

chatopera/wikidata-corpus

Train Wikidata with word2vec for word embedding tasks

Language: Python - Size: 74.6 MB - Last synced at: about 2 months ago - Pushed at: almost 7 years ago - Stars: 122 - Forks: 29

sunyilgdx/SIFRank

The code of our paper "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model"

Language: Python - Size: 5.81 MB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 120 - Forks: 20

tca19/dict2vec

Dict2vec is a framework to learn word embeddings using lexical dictionaries.

Language: Python - Size: 208 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 115 - Forks: 30

gaetangate/text-summarizer

Python Framework for Extractive Text Summarization

Language: Python - Size: 50.8 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 111 - Forks: 32

DmitryRyumin/EMNLP-2023-Papers

EMNLP 2023 Papers: Explore cutting-edge research from EMNLP 2023, the premier conference for advancing empirical methods in natural language processing. Stay updated on the latest in machine learning, deep learning, and natural language processing with code included. ⭐ support NLP!

Language: Python - Size: 6.43 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 107 - Forks: 7

loristns/Kadot 📦

Natural language processing using unsupervised vectors representation.

Language: Jupyter Notebook - Size: 942 KB - Last synced at: 7 days ago - Pushed at: over 5 years ago - Stars: 106 - Forks: 9

xiamx/fastText (fork of facebookresearch/fastText)

Windows Build of fastText, library for text representation and classification.

Language: HTML - Size: 4.2 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 106 - Forks: 25

pommedeterresautee/fastrtext

R wrapper for fastText

Language: C++ - Size: 5.89 MB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 101 - Forks: 15

TharinduDR/Simple-Sentence-Similarity

Exploring the simple sentence similarity measurements using word embeddings

Language: Python - Size: 60.4 MB - Last synced at: 10 days ago - Pushed at: 9 months ago - Stars: 100 - Forks: 37

BobXWu/FASTopic

A Fast, Adaptive, Stable, and Transferable Topic Model (NeurIPS 2024)

Language: Python - Size: 1.68 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 97 - Forks: 6

Hellisotherpeople/Language-games

Dead simple games made with word vectors.

Language: Python - Size: 1.88 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 97 - Forks: 6

joisino/wordtour

Code for "Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem" (NAACL 2022)

Language: Python - Size: 629 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 94 - Forks: 4

gaohuang/S-WMD

Code for Supervised Word Mover's Distance (SWMD)

Language: Matlab - Size: 92.8 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 93 - Forks: 21

guillaume-chevalier/GloVe-as-a-TensorFlow-Embedding-Layer

Takes a pretrained GloVe model and uses it as a TensorFlow embedding weight layer inside the GPU, so only the word indices need to be sent over the GPU data transfer bus, reducing data transfer overhead.

Language: Jupyter Notebook - Size: 52.7 KB - Last synced at: 21 days ago - Pushed at: over 6 years ago - Stars: 90 - Forks: 19