GitHub topics: word-embedding
atoffano/entity-norm
Comparative analysis of NLP neural methods for the entity normalization task in the biological field.
Language: Python - Size: 19.7 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Pzoom522/xANLG
Data and code for "Understanding Linearity of Cross-Lingual Word Embedding Mappings" (TMLR 2022)
Language: Python - Size: 31.3 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 11 - Forks: 0

marcopoli/Identification-of-Twitter-bots-using-CNN
Python project to create a classifier to guess if a Twitter account is a man, a woman or a bot.
Language: Python - Size: 158 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 15 - Forks: 8

sid321axn/bank_fin_embedding
This repository contains customized word embeddings focused on banking and finance terms, which are helpful for analyzing and classifying financial sentiment or stock price sentiment.
Language: Jupyter Notebook - Size: 51.5 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 2

yxtay/glove-tensorflow
Implementation of GloVe using TensorFlow estimator API
Language: Python - Size: 368 KB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 9 - Forks: 5

Entreprecariat/Entreprecariat
This project uses automatic tools to analyse the new reality of 'entreprecariat' through a corpus of books related to the current labour market.
Language: Python - Size: 8.94 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

sagorbrur/GloVe-Bengali Fork of stanfordnlp/GloVe
Bengali GloVe Pretrained Word Vector
Language: C - Size: 206 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 7 - Forks: 1

raja-kumar/CSE-244-ML-for-NLP
NLP Assignments
Language: Python - Size: 14.1 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

eftekhar-hossain/Bengali-Document-Categorization
Bangla News Article Categorization Using Conv-LSTM Net. It is a multi-class classification problem.
Language: Jupyter Notebook - Size: 23.6 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 3

SatyamSoni23/DocyQA
Question answering system for Android devices. The backend implements four approaches: a naive approach, word embeddings (Word2Vec, GloVe), a simple transformer, and BERT. The frontend is an Android app.
Language: Jupyter Notebook - Size: 1.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 1

Variable-Embedding/nlp-421 📦
Exploring GloVe Embeddings
Language: Python - Size: 178 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

tainvecs/MachineLearning-2017 📦
Machine Learning Course 2017 Fall @ National Taiwan University
Language: Jupyter Notebook - Size: 156 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

miras-tech/MirasText
MirasText
Language: Python - Size: 9.15 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 52 - Forks: 7

keivanipchihagh/simple-word-embedding
A simple and custom word embedding algorithm
Language: Jupyter Notebook - Size: 117 KB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

mustafahakkoz/Text_Classification_ML-DL Fork of Aysenuryilmazz/Text_Classification_ML-DL
This is an end-to-end NLP project based on text classification. We have created a real-time web application that takes input text from the user and predicts its diagnosis out of 10 predefined labels.
Language: Jupyter Notebook - Size: 7.94 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

hoseinlook/bioinformatic-course
simple virus DNA classification
Language: Jupyter Notebook - Size: 10 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 6 - Forks: 0

pesoto/Text-Analysis
Explaining textual analysis tools in Python. Including Preprocessing, Skip Gram (word2vec), and Topic Modelling.
Language: Jupyter Notebook - Size: 461 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 54 - Forks: 29

mpinta/tracevec
🔌 Learning word embedding models based on the electrical consumption of various home appliances.
Language: Jupyter Notebook - Size: 840 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

callforpapers-source/inter-word-embedding
a non-neural network approach for word embedding
Language: Jupyter Notebook - Size: 34.8 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

AAnirudh07/One-Shot-Learning-for-Indian-News-Classification
Siamese Networks for training BERT embeddings on low-resource languages for NLP tasks.
Language: Jupyter Notebook - Size: 1.77 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

rahatkader/FAQ-Albert-Einstein
Language: Jupyter Notebook - Size: 15.6 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ttavni/SemanticWordClouds
Making word clouds more interesting
Language: Python - Size: 8.21 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 5 - Forks: 4

anishacharya/Online-Embedding-Compression-AAAI-2019
Deep learning models have become state of the art for natural language processing (NLP) tasks; however, deploying these models in production systems poses significant memory constraints. Existing compression methods are either lossy or introduce significant latency. We propose a compression method that leverages low rank matrix factorization during training to compress the word embedding layer, which represents the size bottleneck for most NLP models. Our models are trained, compressed and then further re-trained on the downstream task to recover accuracy while maintaining the reduced size. Empirically, we show that the proposed method can achieve 90% compression with minimal impact on accuracy for sentence classification tasks, and outperforms alternative methods like fixed-point quantization or offline word embedding compression. We also analyze the inference time and storage space for our method through FLOP calculations, showing that we can compress DNN models by a configurable ratio and regain accuracy loss without introducing additional latency compared to fixed-point quantization. Finally, we introduce a novel learning rate schedule, the Cyclically Annealed Learning Rate (CALR), which we empirically demonstrate to outperform other popular adaptive learning rate algorithms on a sentence classification benchmark.
Language: Python - Size: 36 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 2

KTong06/SentimentAnalysis
A two-layer LSTM model for binary classification of movie reviews (positive or negative)
Language: Python - Size: 11 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

KTong06/-RNN-Article_Catagorizer
An LSTM model to label articles into 5 categories.
Language: Python - Size: 38.8 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

dongjun-Lee/kor2vec
Library for Korean morpheme and word vector representation
Language: Python - Size: 16.6 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 69 - Forks: 15

atoffano/basenorm
Implementation of a BERT model to normalize biological entities from the Bacteria Biotope 4 corpus.
Language: Python - Size: 461 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Aysenuryilmazz/Text_Classification_ML-DL
This is an end-to-end ML project based on text classification. We have created a real-time web application that takes input text from the user and predicts its diagnosis out of 10 predefined labels. To see project, please visit the link: https://conditionpredictor.herokuapp.com/
Language: Jupyter Notebook - Size: 7.66 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 1

Animesh-Chourey/Pre-trained_Transformers-Information_Extraction-and-Dialogue_System
Part of the assignment from the Neural Network and NLP module
Language: Jupyter Notebook - Size: 1.42 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ebrahimpichka/semantic-textual-similarity
Categorizing products of an online retailer based on products’ titles using word2vec word-embedding and DBSCAN (density-based spatial clustering of applications with noise) clustering.
Language: Jupyter Notebook - Size: 39.1 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

oneonlee/Building-Transformer-Based-NLP-Applications
Repository for the NVIDIA DLI workshop "Building Transformer-Based Natural Language Processing Applications"
Language: Jupyter Notebook - Size: 24.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

QiutingWang/Twitter-Sentiment-Analysis_Kaggle
#NLP #TextPreprocessing #EDA&Viz #FeatureEngineering #Modeling
Language: Jupyter Notebook - Size: 3.09 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

tate8/translator
Transformer translator website with multithreaded web server in Rust
Language: Rust - Size: 19.5 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

giturra/my-glove-pytorch
Language: Python - Size: 7.81 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

mfakca/turkish-word2vec
Turkish word2vec trained with Wikipedia dataset
Language: Python - Size: 57.6 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

caps6/word-embedding
An implementation of word2vec skip-gram algorithm
Language: Python - Size: 20.5 KB - Last synced at: 10 months ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 3

paul-pias/Text-Preprocessing-in-Bangla-and-English
Language: Python - Size: 29.3 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 5 - Forks: 4

lovit/text_embedding
Inferring vector of unseen words
Language: Python - Size: 1.64 MB - Last synced at: 6 months ago - Pushed at: about 6 years ago - Stars: 7 - Forks: 1

sirius-mhlee/word-embedding-using-keras-skip-gram-word2vec
Keras implementation of Skip-gram Word2Vec
Language: Python - Size: 89.8 KB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 1

sirius-mhlee/word-embedding-using-keras-cbow-word2vec
Keras implementation of Continuous Bag-of-Words Word2Vec
Language: Python - Size: 89.8 KB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

iesl/Distributional-Inclusion-Vector-Embedding
Language: Jupyter Notebook - Size: 113 MB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 15 - Forks: 0

mohamad-dehghani/Semi-automatic-Detection-of-Persian-Stopwords-using-FastText-Library-
Size: 378 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

aguschin/the-hat-game
Learn word embedding and service deployment by playing the Hat Game
Language: Jupyter Notebook - Size: 134 KB - Last synced at: about 21 hours ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 2

guyelov/IR-Wikipedia-Search-Engine Fork of OmerIdgar/IR-Wikipedia-Search-Engine
Search Engine for the Wikipedia corpus
Size: 1000 Bytes - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

manishemirani/Word2Vec_Persian
Skip-gram algorithm on a Persian dataset
Language: Python - Size: 6.2 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

AvivYaish/X2VEC
We have implemented, expanded and reviewed the paper "Sense2Vec - A Fast and Accurate Method For Word Sense Disambiguation In Neural Word Embeddings" by Andrew Trask, Phil Michalak and John Liu.
Language: Python - Size: 793 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

ownzonefeng/Graph-based-text-representations
The official implementation for Graph-based text representations (EPFL Master Thesis)
Language: Python - Size: 16.4 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

papapana/tv-script-generation-with-recurrent-neural-networks
Generating TV scripts based on `Seinfeld` using recurrent neural networks
Language: HTML - Size: 2.33 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

dv66/word2vec-from-scratch
Word2Vec model implementation from scratch.
Language: Python - Size: 71 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

EasyL0ver/headlines-categorization
Neural networks school project on headlines categorization using deep learning and word embedding.
Language: MATLAB - Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

hwaves/rtst
Official implementation for paper "Embedding Compression with Right Triangle Similarity Transformations".
Language: Python - Size: 9.77 KB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

chrisPiemonte/TripWalk
Random walk generation on RDF graph - DeepWalk inspired
Language: Scala - Size: 391 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 3 - Forks: 1

rayheberer/nytimes-word-embedding
iXperience Applied AI NLP exercise
Language: Jupyter Notebook - Size: 287 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 0

oToToT/Doc2VecC
GPU accelerated implementation for Doc2VecC
Language: Cuda - Size: 329 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

lemay-ai/sidecar
Sidecar: Augmenting Word Embedding Models With Expert Knowledge
Language: Jupyter Notebook - Size: 3.24 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

fratambot/Word_Clouds
playing with word clouds
Language: Jupyter Notebook - Size: 990 KB - Last synced at: 7 days ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

tiddler/SIF_reproduce
This repo contains the code and results for reproducing the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings".
Language: Python - Size: 1.42 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 12 - Forks: 3

thunlp/CWE Fork of Leonard-Xu/CWE
Character-enhanced Word Embedding (CWE) Model
Language: C - Size: 118 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 1

eftekhar-hossain/Word-Embedding-on-Bangla-Text
A Word Embedding Model for Bangla Text Corpus.
Language: Jupyter Notebook - Size: 4.23 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

ricky9123/synonym-recognition-by-finetune-embedding
🐦 Fine-tune Pre-trained Word Embedding for Synonym Recognition
Language: Python - Size: 5.19 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

NLPIR-team/Word2vec-CPP
Language: C++ - Size: 6.3 MB - Last synced at: 2 months ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

askintution/wordvector_be Fork of huichen/wordvector_be
Web service: finds similar keywords using Tencent's 8-million-word vector model and the Spotify Annoy engine
Size: 27.3 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

mklimasz/wedt-elka
Information extraction using word embedding
Language: Java - Size: 89.8 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 0

eriknovak/python-text-embedding-microservice
Service for producing text representations via word embeddings
Language: Python - Size: 248 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 2

albert-espin/imdb-clustering
Topic Clustering from Word Embeddings for IMDB Movie Reviews
Language: Python - Size: 1.78 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

rounayak/Text-categorization-using-neural-word-embeddings
A practical implementation of neural networks on top of fastText and word2vec word embeddings.
Language: Python - Size: 1.15 MB - Last synced at: 4 months ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 0

pemagrg1/word-embeddings
word embeddings. Created Date: 12 Feb 2019
Language: Python - Size: 1.81 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

kdrl/SCNE
C++ implementation of the paper "Segmentation-free compositional n-gram embedding". NAACL-HLT2019.
Language: C++ - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

loretoparisi/fastTextServe
FastText Server for Node.js based on fasttext.js
Size: 1000 Bytes - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 0

stephen-zhao/CS-456_ANN-MP2
CS-456 Artificial Neural Networks - Miniproject 2 - Chatbot
Language: Jupyter Notebook - Size: 21.7 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

kafku/mm_word2vec
Unofficial implementation of multimodal skip-gram model [Lazaridou+ 2015]
Language: Jupyter Notebook - Size: 565 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

raokrutarth/meetover_public
Mobile app and backend that uses NLP and word embeddings to connect similar users and setup informal meetings
Language: JavaScript - Size: 908 KB - Last synced at: about 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

KEINOS/Fork_Word2vec-Golang Fork of ynqa/wego
✅ Forked repo of a GoLang implementation of Word2Vec and GloVe!
Language: Go - Size: 6.78 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

novara754/roses-and-violets
Language: JavaScript - Size: 12.7 KB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

damaohongtu/CNN-for-Sentence-Classification-in-Tensorflow
An improved CNN for Sentence Classification, based on @yoonkim's and other repositories
Language: Python - Size: 603 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 1

ysenarath/wang2vec Fork of wlin12/wang2vec
Extension of the original word2vec using different architectures
Language: C - Size: 59.6 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

cschen1205/java-deep-learning-nlp
Deep Learning for Natural Language Processing in Java
Language: Java - Size: 1.62 MB - Last synced at: about 2 months ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 1

hailiang-wang/fastText Fork of facebookresearch/fastText
Library for fast text representation and classification.
Language: C++ - Size: 306 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

zhuangh/kcws Fork of koth/kcws
Deep Learning Chinese Word Segment
Language: C++ - Size: 13.4 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0
