GitHub topics: word-embedding

Repositories

atoffano/entity-norm

Comparative analysis of NLP neural methods for the entity normalization task in the biological field.

Language: Python - Size: 19.7 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Pzoom522/xANLG

Data and code for "Understanding Linearity of Cross-Lingual Word Embedding Mappings" (TMLR 2022)

Language: Python - Size: 31.3 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 11 - Forks: 0

marcopoli/Identification-of-Twitter-bots-using-CNN

Python project to create a classifier to guess if a Twitter account is a man, a woman or a bot.

Language: Python - Size: 158 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 15 - Forks: 8

sid321axn/bank_fin_embedding

This repository contains custom word embeddings focused on banking and finance terms, useful for analyzing and classifying financial sentiment, including stock price sentiment analysis.

Language: Jupyter Notebook - Size: 51.5 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 2

yxtay/glove-tensorflow

Implementation of GloVe using TensorFlow estimator API

Language: Python - Size: 368 KB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 9 - Forks: 5

Entreprecariat/Entreprecariat

This project aims to analyse, with automatic tools, the new reality of 'entreprecariat' through a corpus of books related to the current labour market.

Language: Python - Size: 8.94 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

sagorbrur/GloVe-Bengali (fork of stanfordnlp/GloVe)

Bengali GloVe Pretrained Word Vector

Language: C - Size: 206 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 7 - Forks: 1

raja-kumar/CSE-244-ML-for-NLP

NLP Assignments

Language: Python - Size: 14.1 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

eftekhar-hossain/Bengali-Document-Categorization

Bangla News Article Categorization Using Conv-LSTM Net. It is a multi-class classification problem.

Language: Jupyter Notebook - Size: 23.6 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 3

Question answering system for Android devices. The backend implements four approaches: a naive approach, word embedding techniques (Word2Vec, GloVe), a simple transformer, and BERT. The frontend is an Android app.

Language: Jupyter Notebook - Size: 1.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 1

Variable-Embedding/nlp-421 📦

Exploring GloVe Embeddings

Language: Python - Size: 178 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

tainvecs/MachineLearning-2017 📦

Machine Learning Course 2017 Fall @ National Taiwan University

Language: Jupyter Notebook - Size: 156 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

miras-tech/MirasText

MirasText

Language: Python - Size: 9.15 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 52 - Forks: 7

keivanipchihagh/simple-word-embedding

A simple and custom word embedding algorithm

Language: Jupyter Notebook - Size: 117 KB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

mustafahakkoz/Text_Classification_ML-DL (fork of Aysenuryilmazz/Text_Classification_ML-DL)

This is an end-to-end NLP project based on text classification. We have created a real-time web application that takes input text from the user and predicts its diagnosis out of 10 predefined labels.

Language: Jupyter Notebook - Size: 7.94 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

hoseinlook/bioinformatic-course

simple virus DNA classification

Language: Jupyter Notebook - Size: 10 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 6 - Forks: 0

pesoto/Text-Analysis

Explaining textual analysis tools in Python. Including Preprocessing, Skip Gram (word2vec), and Topic Modelling.

Language: Jupyter Notebook - Size: 461 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 54 - Forks: 29

mpinta/tracevec

🔌 Learning word embedding models based on the electrical consumption of various home appliances.

Language: Jupyter Notebook - Size: 840 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

callforpapers-source/inter-word-embedding

a non-neural network approach for word embedding

Language: Jupyter Notebook - Size: 34.8 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

AAnirudh07/One-Shot-Learning-for-Indian-News-Classification

Siamese Networks for training BERT embeddings on low-resource languages for NLP tasks.

Language: Jupyter Notebook - Size: 1.77 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

rahatkader/FAQ-Albert-Einstein

Language: Jupyter Notebook - Size: 15.6 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ttavni/SemanticWordClouds

Making word clouds more interesting

Language: Python - Size: 8.21 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 5 - Forks: 4

anishacharya/Online-Embedding-Compression-AAAI-2019

Deep learning models have become state of the art for natural language processing (NLP) tasks; however, deploying these models in production systems poses significant memory constraints. Existing compression methods are either lossy or introduce significant latency. We propose a compression method that leverages low-rank matrix factorization during training to compress the word embedding layer, which represents the size bottleneck for most NLP models. Our models are trained, compressed and then further re-trained on the downstream task to recover accuracy while maintaining the reduced size. Empirically, we show that the proposed method can achieve 90% compression with minimal impact on accuracy for sentence classification tasks, and outperforms alternative methods like fixed-point quantization or offline word embedding compression. We also analyze the inference time and storage space for our method through FLOP calculations, showing that we can compress DNN models by a configurable ratio and regain accuracy loss without introducing additional latency compared to fixed-point quantization. Finally, we introduce a novel learning rate schedule, the Cyclically Annealed Learning Rate (CALR), which we empirically demonstrate to outperform other popular adaptive learning rate algorithms on a sentence classification benchmark.

Language: Python - Size: 36 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 2

KTong06/SentimentAnalysis

A two-layer LSTM model for a binary classification problem (positive vs. negative movie reviews)

Language: Python - Size: 11 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

KTong06/-RNN-Article_Catagorizer

An LSTM model to label articles into 5 categories.

Language: Python - Size: 38.8 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

dongjun-Lee/kor2vec

Library for Korean morpheme and word vector representation

Language: Python - Size: 16.6 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 69 - Forks: 15

atoffano/basenorm

Implementation of a BERT model to normalize biological entities from the Bacteria Biotope 4 corpus.

Language: Python - Size: 461 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Aysenuryilmazz/Text_Classification_ML-DL

This is an end-to-end ML project based on text classification. We have created a real-time web application that takes input text from the user and predicts its diagnosis out of 10 predefined labels. To see the project, visit: https://conditionpredictor.herokuapp.com/

Language: Jupyter Notebook - Size: 7.66 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 1

Animesh-Chourey/Pre-trained_Transformers-Information_Extraction-and-Dialogue_System

Part of the assignment from the Neural Network and NLP module

Language: Jupyter Notebook - Size: 1.42 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ebrahimpichka/semantic-textual-similarity

Categorizing products of an online retailer based on products’ titles using word2vec word-embedding and DBSCAN (density-based spatial clustering of applications with noise) clustering.

Language: Jupyter Notebook - Size: 39.1 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

oneonlee/Building-Transformer-Based-NLP-Applications

Repository for the NVIDIA DLI workshop "Building Transformer-Based Natural Language Processing Applications"

Language: Jupyter Notebook - Size: 24.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

QiutingWang/Twitter-Sentiment-Analysis_Kaggle

#NLP #TextPreprocessing #EDA&Viz #FeatureEngineering #Modeling

Language: Jupyter Notebook - Size: 3.09 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

tate8/translator

Transformer translator website with multithreaded web server in Rust

Language: Rust - Size: 19.5 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

giturra/my-glove-pytorch

Language: Python - Size: 7.81 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

mfakca/turkish-word2vec

Turkish word2vec trained with Wikipedia dataset

Language: Python - Size: 57.6 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

caps6/word-embedding

An implementation of word2vec skip-gram algorithm

Language: Python - Size: 20.5 KB - Last synced at: 10 months ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 3

paul-pias/Text-Preprocessing-in-Bangla-and-English

Language: Python - Size: 29.3 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 5 - Forks: 4

lovit/text_embedding

Inferring vector of unseen words

Language: Python - Size: 1.64 MB - Last synced at: 6 months ago - Pushed at: about 6 years ago - Stars: 7 - Forks: 1

sirius-mhlee/word-embedding-using-keras-skip-gram-word2vec

Keras implementation of Skip-gram Word2Vec

Language: Python - Size: 89.8 KB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 1

sirius-mhlee/word-embedding-using-keras-cbow-word2vec

Keras implementation of Continuous Bag-of-Words Word2Vec

Language: Python - Size: 89.8 KB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

iesl/Distributional-Inclusion-Vector-Embedding

Language: Jupyter Notebook - Size: 113 MB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 15 - Forks: 0

mohamad-dehghani/Semi-automatic-Detection-of-Persian-Stopwords-using-FastText-Library-

Size: 378 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

aguschin/the-hat-game

Learn word embedding and service deployment by playing the Hat Game

Language: Jupyter Notebook - Size: 134 KB - Last synced at: about 21 hours ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 2

guyelov/IR-Wikipedia-Search-Engine (fork of OmerIdgar/IR-Wikipedia-Search-Engine)

Search Engine for the Wikipedia corpus

Size: 1000 Bytes - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

manishemirani/Word2Vec_Persian

Skip-gram algorithm on a Persian dataset

Language: Python - Size: 6.2 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

AvivYaish/X2VEC

We have implemented, expanded and reviewed the paper "Sense2Vec - A Fast and Accurate Method For Word Sense Disambiguation In Neural Word Embeddings" by Andrew Trask, Phil Michalak and John Liu.

Language: Python - Size: 793 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

ownzonefeng/Graph-based-text-representations

The official implementation for Graph-based text representations (EPFL Master Thesis)

Language: Python - Size: 16.4 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

papapana/tv-script-generation-with-recurrent-neural-networks

Generating TV scripts based on `Seinfeld` using recurrent neural networks

Language: HTML - Size: 2.33 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

dv66/word2vec-from-scratch

Word2Vec model implementation from scratch.

Language: Python - Size: 71 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

EasyL0ver/headlines-categorization

Neural networks school project on headlines categorization using deep learning and word embedding.

Language: MATLAB - Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

hwaves/rtst

Official implementation for paper "Embedding Compression with Right Triangle Similarity Transformations".

Language: Python - Size: 9.77 KB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

chrisPiemonte/TripWalk

Random walk generation on RDF graph - DeepWalk inspired

Language: Scala - Size: 391 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 3 - Forks: 1

rayheberer/nytimes-word-embedding

iXperience Applied AI NLP exercise

Language: Jupyter Notebook - Size: 287 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 0

oToToT/Doc2VecC

GPU accelerated implementation for Doc2VecC

Language: Cuda - Size: 329 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

lemay-ai/sidecar

Sidecar: Augmenting Word Embedding Models With Expert Knowledge

Language: Jupyter Notebook - Size: 3.24 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

fratambot/Word_Clouds

playing with word clouds

Language: Jupyter Notebook - Size: 990 KB - Last synced at: 7 days ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

tiddler/SIF_reproduce

This repo contains the code and results for reproducing the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings".

Language: Python - Size: 1.42 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 12 - Forks: 3

thunlp/CWE (fork of Leonard-Xu/CWE)

Character-enhanced Word Embedding (CWE) Model

Language: C - Size: 118 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 1

eftekhar-hossain/Word-Embedding-on-Bangla-Text

A Word Embedding Model for Bangla Text Corpus.

Language: Jupyter Notebook - Size: 4.23 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

ricky9123/synonym-recognition-by-finetune-embedding

🐦 Fine-tune Pre-trained Word Embedding for Synonym Recognition

Language: Python - Size: 5.19 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

NLPIR-team/Word2vec-CPP

Language: C++ - Size: 6.3 MB - Last synced at: 2 months ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

askintution/wordvector_be (fork of huichen/wordvector_be)

Web service: finds similar keywords using Tencent's 8-million-word embedding model and the Spotify Annoy engine

Size: 27.3 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

mklimasz/wedt-elka

Information extraction using word embedding

Language: Java - Size: 89.8 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 0

eriknovak/python-text-embedding-microservice

Service for producing text representations via word embeddings

Language: Python - Size: 248 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 2

albert-espin/imdb-clustering

Topic Clustering from Word Embeddings for IMDB Movie Reviews

Language: Python - Size: 1.78 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

rounayak/Text-categorization-using-neural-word-embeddings

A practical implementation of neural networks on top of fastText and word2vec word embeddings.

Language: Python - Size: 1.15 MB - Last synced at: 4 months ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 0

pemagrg1/word-embeddings

word embeddings. Created Date: 12 Feb 2019

Language: Python - Size: 1.81 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

kdrl/SCNE

C++ implementation of the paper "Segmentation-free compositional n-gram embedding" (NAACL-HLT 2019).

Language: C++ - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

loretoparisi/fastTextServe

FastText Server for Node.js based on fasttext.js

Size: 1000 Bytes - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 0

stephen-zhao/CS-456_ANN-MP2

CS-456 Artificial Neural Networks - Miniproject 2 - Chatbot

Language: Jupyter Notebook - Size: 21.7 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

kafku/mm_word2vec

Unofficial implementation of multimodal skip-gram model [Lazaridou+ 2015]

Language: Jupyter Notebook - Size: 565 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

raokrutarth/meetover_public

Mobile app and backend that use NLP and word embeddings to connect similar users and set up informal meetings

Language: JavaScript - Size: 908 KB - Last synced at: about 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

KEINOS/Fork_Word2vec-Golang (fork of ynqa/wego)

✅ Forked repo of a GoLang implementation of Word2Vec and GloVe!

Language: Go - Size: 6.78 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

novara754/roses-and-violets

Language: JavaScript - Size: 12.7 KB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

damaohongtu/CNN-for-Sentence-Classification-in-Tensorflow

An improved CNN for Sentence Classification, drawing on @yoonkim's and other repositories

Language: Python - Size: 603 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 1

ysenarath/wang2vec (fork of wlin12/wang2vec)

Extension of the original word2vec using different architectures

Language: C - Size: 59.6 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

cschen1205/java-deep-learning-nlp

Deep Learning for Natural Language Processing in Java

Language: Java - Size: 1.62 MB - Last synced at: about 2 months ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 1

hailiang-wang/fastText (fork of facebookresearch/fastText)

Library for fast text representation and classification.

Language: C++ - Size: 306 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

zhuangh/kcws (fork of koth/kcws)

Deep Learning Chinese Word Segment

Language: C++ - Size: 13.4 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

Related Keywords

word-embedding (179), nlp (55), word2vec (46), machine-learning (34), deep-learning (27), natural-language-processing (25), word-embeddings (23), python (20), text-classification (19), tensorflow (17), lstm (16), glove (14), bert (11), neural-network (11), keras (10), nlp-machine-learning (10), fasttext (10), word (9), gensim (9), sentiment-analysis (8), cnn (8), ai (8), python3 (7), topic-modeling (7), pytorch (7), dictionary (7), turkish-syllibification (7), rnn (6), tokenization (6), lda (5), embedding (5), embeddings (5), skip-gram (5), transformer (5), tf-idf (5), skipgram (5), classification (5), deep-neural-networks (4), cbow (4), clustering (4), doc2vec (4), glove-embeddings (4), flask (4), representation-learning (4), text-mining (4), bow (3), regex (3), text-generation (3), torch (3), text-similarity (3), docker (3), heroku (3), data-visualization (3), data-science (3), rest-api (3), w2v (3), document-embedding (3), gensim-word2vec (3), word-segmentation (3), pca (3), text-analysis (3), neural-networks (3), twitter (3), bidirectional-lstm (3), wordcloud (3), logistic-regression (3), bert-model (3), word-vectors (3), recurrent-neural-networks (3), text-embedding (3), language-model (3), artificial-intelligence (3), word2vec-embeddinngs (2), nltk-python (2), word-analogy (2), gutenberg (2), confusion-matrix (2), word-clouds (2), unsupervised-learning (2), matrix-factorization (2), semantic-web (2), semantic-segmentation (2), semantic (2), node2vec (2), graph-embedding (2), deepwalk (2), chatbot (2), interpretability (2), language (2), information-retrieval (2), inverted-index (2), bioinformatics (2), ner (2), gradient-descent (2), keras-tensorflow (2), windows (2), sequence-models (2), social-media (2), sentence-embedding (2), language-modeling (2)