GitHub topics: apache-spark-cluster

A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support

Language: Jupyter Notebook - Size: 220 KB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 261 - Forks: 18

A command-line tool for launching Apache Spark clusters.

Language: Python - Size: 785 KB - Last synced at: 27 days ago - Pushed at: 6 months ago - Stars: 642 - Forks: 117

This project contains customizations, such as custom data sources and plugins, written for distributed systems like Apache Spark and Apache Ignite.

Language: Java - Size: 107 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 33 - Forks: 23

Data engineering project: visualize visa numbers issued from Japan, by country and time

Language: HTML - Size: 3.7 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.

Language: Scala - Size: 146 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 1

Apache Spark cluster lab.

Language: Java - Size: 7.1 MB - Last synced at: 18 days ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Analysis performed on data from the Steam platform using Apache Spark and cloud services such as Amazon Web Services.

Size: 10.7 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

Implementations of the Markov Cluster Algorithm (MCL) and the Regularized Markov Cluster Algorithm (R-MCL) in Apache Spark

Language: Scala - Size: 264 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 1
