Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / nikhiljsk / preprocess_nlp
A fast framework for text pre-processing (cleaning text, reduction of vocabulary, feature extraction, and vectorization). Implemented with parallel processing using a custom number of processes.
JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikhiljsk%2Fpreprocess_nlp
Stars: 8
Forks: 2
Open Issues: 2
License: None
Language: Python
Repo Size: 58.6 KB
Dependencies: 10
Created: over 4 years ago
Updated: 3 months ago
Last pushed: almost 2 years ago
Last synced: about 12 hours ago
Topics: cleaning-data, feature-extraction, glove, natural-language-processing, nlp, parallel-processing, preprocess, python3, reduction, spacy, stages, tfidf, vectorization, word2vec
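The JSON API address listed above URL-encodes the repository's full name, so the "/" between owner and repository appears as %2F. As a minimal sketch, the same address could be rebuilt in Python as shown here. The helper names `repository_api_url` and `fetch_repository` are hypothetical and not part of any official client, and the shape of the returned JSON is not assumed.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the JSON API URL shown in the listing.
API_BASE = "https://repos.ecosyste.ms/api/v1"


def repository_api_url(host: str, full_name: str) -> str:
    """Build the JSON API URL for a repository (hypothetical helper)."""
    # safe="" makes quote() encode "/" as %2F, matching the listed URL.
    encoded_host = quote(host, safe="")
    encoded_name = quote(full_name, safe="")
    return f"{API_BASE}/hosts/{encoded_host}/repositories/{encoded_name}"


def fetch_repository(host: str, full_name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the repository record (hypothetical helper).

    Needs network access; the fields of the response are not assumed here.
    """
    with urlopen(repository_api_url(host, full_name), timeout=timeout) as resp:
        return json.load(resp)


url = repository_api_url("GitHub", "nikhiljsk/preprocess_nlp")
print(url)
```

Calling `fetch_repository("GitHub", "nikhiljsk/preprocess_nlp")` would then return the decoded record as a dictionary, assuming the service is reachable.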
Dependencies
- beautifulsoup4 ==4.8.2
- contractions ==0.0.24
- gensim ==3.8.1
- ipython ==7.13.0
- matplotlib ==3.1.3
- nltk ==3.4.5
- numpy ==1.18.1
- scikit_learn ==0.22.2.post1
- spacy ==2.2.3
- yake ==0.3.7
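The pins above are shown with a space before the "==" operator. As a rough sketch, they could be normalized into pip-style requirement strings as follows. The names `LISTED_PINS` and `to_requirement` are illustrative only, and the output is not claimed to match the repository's own requirements file.

```python
# Dependency pins copied from the list above.
LISTED_PINS = """\
- beautifulsoup4 ==4.8.2
- contractions ==0.0.24
- gensim ==3.8.1
- ipython ==7.13.0
- matplotlib ==3.1.3
- nltk ==3.4.5
- numpy ==1.18.1
- scikit_learn ==0.22.2.post1
- spacy ==2.2.3
- yake ==0.3.7
"""


def to_requirement(line: str) -> str:
    """Turn '- name ==version' into 'name==version' (illustrative helper)."""
    # Drop the leading list marker, then split on the first '=='.
    name, version = line.lstrip("- ").split("==", 1)
    return f"{name.strip()}=={version.strip()}"


requirements = [
    to_requirement(line) for line in LISTED_PINS.splitlines() if line.strip()
]
print("\n".join(requirements))
```

pip treats underscores and hyphens in project names as equivalent, so `scikit_learn` resolves to the same project as `scikit-learn`.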