GitHub / abroniewski / IdleCompute-Data-Management-Architecture
Implementation of a backbone architecture for big data management and analysis, using PySpark for distributed, scalable data ingestion and MLlib for machine learning analysis. Part of the Big Data Management and Analytics (BDMA) program.
Stars: 1
Forks: 1
Open issues: 0
License: MIT
Language: Jupyter Notebook
Size: 34.8 MB
Created at: about 3 years ago
Updated at: over 1 year ago
Pushed at: over 1 year ago
Last synced at: 3 months ago
Topics: bdma, big-data, big-data-analytics, bigdata, dataops, hadoop-hdfs, machine-learning, parquet, pipeline, pyspark-mllib