Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / shaharpit809 / Latent-Dirichlet-allocation-LDA-on-YELP-dataset-using-Apache-Spark
This repository compares two LDA algorithms (EM and Online) from the Apache Spark 'mllib' library and searches for the best hyperparameters on the YELP dataset.
Stars: 3
Forks: 1
Open Issues: 3
License: None
Language: Java
Repo Size: 6.43 MB
Dependencies: 21
Created: almost 5 years ago
Updated: over 1 year ago
Last pushed: about 1 year ago
Last synced: 7 months ago
Topics: data-partitions, lda-algorithms, mllib, perplexity, pos-tagging, spark, yelp-dataset
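The description and the "perplexity" topic suggest that the EM and Online runs are ranked by perplexity. The repository's actual evaluation code is not shown on this page, so the following is only a minimal sketch of how such a comparison could look, assuming the usual definition perplexity = exp(-logLikelihood / tokenCount). The class name, method names, run labels, and all numbers are hypothetical and are not taken from the repository.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the repository.
final class PerplexitySketch {

    // Perplexity from a corpus log-likelihood (natural log) and a token count.
    // Lower values indicate a better fit on held-out text.
    static double perplexity(double logLikelihood, long tokenCount) {
        if (tokenCount <= 0) {
            throw new IllegalArgumentException("tokenCount must be positive");
        }
        return Math.exp(-logLikelihood / tokenCount);
    }

    // Returns the label of the run with the lowest perplexity,
    // or null if the map is empty.
    static String bestByPerplexity(Map<String, Double> logLikelihoods, long tokenCount) {
        String best = null;
        double bestScore = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> run : logLikelihoods.entrySet()) {
            double score = perplexity(run.getValue(), tokenCount);
            if (score < bestScore) {
                bestScore = score;
                best = run.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up log-likelihoods for two hypothetical runs on 1000 held-out tokens.
        Map<String, Double> runs = new LinkedHashMap<>();
        runs.put("em, k=10", -7200.0);
        runs.put("online, k=10", -6900.0);

        for (Map.Entry<String, Double> run : runs.entrySet()) {
            System.out.println(run.getKey() + " perplexity: " + perplexity(run.getValue(), 1000));
        }
        System.out.println("best run: " + bestByPerplexity(runs, 1000));
    }
}
```

In a real Spark job the log-likelihood would come from the fitted LDA model rather than from hard-coded values; the sketch only shows the ranking step.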
Dependencies
code/source-code/pom.xml
maven
- org.apache.spark:spark-sql_2.11 2.4.0 provided
- com.opencsv:opencsv 4.5
- databricks:spark-corenlp 0.4.0-spark2.4-scala2.11
- edu.stanford.nlp:stanford-corenlp 3.9.1
- log4j:log4j 1.2.17
- org.apache.spark:spark-core_2.11 2.4.0
- org.apache.spark:spark-mllib-local_2.11 2.4.0
- org.apache.spark:spark-mllib_2.11 2.4.0
- org.json:json 20180813
- org.scala-lang:scala-library 2.11.1
- org.apache.spark:spark-sql_2.11 2.4.0 provided
- com.opencsv:opencsv 4.5
- databricks:spark-corenlp 0.4.0-spark2.4-scala2.11
- edu.stanford.nlp:stanford-corenlp 3.9.1
- log4j:log4j 1.2.17
- org.apache.spark:spark-core_2.11 2.4.0
- org.apache.spark:spark-mllib-local_2.11 2.4.0
- org.apache.spark:spark-mllib_2.11 2.4.0
- org.apache.spark:spark-network-shuffle_2.11 2.4.0
- org.json:json 20180813
- org.scala-lang:scala-library 2.11.1