GitHub / TsungTseTu122 / CloudComputing--MovieLens-Big-Data-Analytics-on-the-Cloud
This project analyzes the MovieLens dataset with PySpark, Hadoop HDFS, and Docker, applying clustering, classification, and association rule mining to user-movie interactions. It runs on a Spark cluster in a containerized cloud environment, so the processing scales to large datasets.
Stars: 0
Forks: 0
Open issues: 0
License: apache-2.0
Size: 10.7 KB
Created at: 3 months ago
Updated at: 3 months ago
Pushed at: 3 months ago
Last synced at: 3 months ago
Topics: docker-compose, hdfs, jupyter-notebook, pyspark, python, spark-mllib
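
The description mentions association rule mining on user-movie interactions but does not say which algorithm or thresholds the project uses. As a rough illustration of what such mining computes, here is a minimal plain-Python sketch (not the project's PySpark code) that derives pairwise rules with support and confidence. The toy baskets, the `pair_rules` helper, and both thresholds are assumptions made up for this example. On a real cluster, Spark MLlib's `pyspark.ml.fpm.FPGrowth` would be a natural fit for the same task.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set holds the movies one user rated highly.
# These baskets are illustrative only and are not taken from the project.
baskets = [
    {"Toy Story", "Jumanji", "Heat"},
    {"Toy Story", "Jumanji"},
    {"Toy Story", "Heat"},
    {"Jumanji", "Heat"},
    {"Toy Story", "Jumanji", "Casino"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for movie pairs.

    support    = fraction of users whose basket contains both movies
    confidence = P(consequent in basket | antecedent in basket)
    """
    n = len(baskets)
    # How many baskets contain each single movie.
    item_counts = Counter(item for b in baskets for item in b)
    # How many baskets contain each unordered pair (sorted for a stable key).
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to be considered frequent
        # Each frequent pair can yield a rule in either direction.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

Restricting the search to pairs keeps the sketch short. A full frequent-itemset miner such as FP-Growth also handles larger itemsets and avoids enumerating every pair in every basket.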