GitHub topics: apache-spark-cluster
PiercingDan/spark-Jupyter-AWS
A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support
Language: Jupyter Notebook - Size: 220 KB - Last synced at: 28 days ago - Pushed at: over 7 years ago - Stars: 261 - Forks: 18

nchammas/flintrock
A command-line tool for launching Apache Spark clusters.
Language: Python - Size: 785 KB - Last synced at: 19 days ago - Pushed at: 6 months ago - Stars: 642 - Forks: 117

aamargajbhiye/big-data-projects
This project contains customizations, such as custom data sources and plugins, written for distributed systems like Apache Spark and Apache Ignite.
Language: Java - Size: 107 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 33 - Forks: 23

erjan/data_engineering_japan_visas_pyspark
Data engineering project: visualize the number of visas issued by Japan, by country and time of issue
Language: HTML - Size: 3.7 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

josemarialuna/ExternalValidity
This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.
Language: Scala - Size: 146 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 1

savvydatainsights/spark
Apache Spark cluster lab.
Language: Java - Size: 7.1 MB - Last synced at: 10 days ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

arturobp3/Steam_Analysis_For_Gamers (fork of UCloudM/Steam_Analysis_For_Gamers)
Analysis performed on data from the Steam platform using Apache Spark and Cloud services such as Amazon Web Services.
Size: 10.7 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

akaltsikis/Markov_Cluster_Algorithm
Implementations of Markov Cluster Algorithm (MCL) and Regularized Markov Cluster Algorithm (R-MCL) in Apache Spark
Language: Scala - Size: 264 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 1
