GitHub topics: spark-clusters
airscholar/Japan-visa-data-engineering
This project provides end-to-end data processing and visualization of visa numbers in Japan using PySpark and Plotly. The Spark clusters are set up within a Docker container on Azure.
Language: HTML - Size: 1.46 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 12

PiercingDan/spark-Jupyter-AWS
A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support
Language: Jupyter Notebook - Size: 220 KB - Last synced at: 2 days ago - Pushed at: over 7 years ago - Stars: 261 - Forks: 18

kthakore/spark-notebook-dsp-template
Template for Spark Data Science Projects
Language: Makefile - Size: 16.6 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 1

radanalyticsio/oshinko-cli 📦
Command-line interface for a Spark cluster management app
Language: Go - Size: 85.8 MB - Last synced at: 11 months ago - Pushed at: almost 6 years ago - Stars: 11 - Forks: 16

bruler/kub-setup
Local Kubernetes-based ML setup
Language: Shell - Size: 11.7 KB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 4 - Forks: 0

conema/spark-terraform
This project creates a Hadoop and Spark cluster on AWS with Terraform
Language: Shell - Size: 30.3 KB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 4

kumarvna/terraform-azurerm-hdinsight
Terraform module to create Azure HDInsight, a managed, full-spectrum, open-source analytics service. This module creates Apache Hadoop, Apache Spark, Apache HBase, Interactive Query (Apache Hive LLAP), and Apache Kafka clusters.
Language: HCL - Size: 365 KB - Last synced at: 30 days ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 5

nikhilsu/Product-review-analysis-Spark-MongoDB
Performs various product review analyses on an Amazon dataset using Apache Spark and MongoDB
Language: Java - Size: 56.6 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 1

monyedavid/spark-cluster
Spark cluster management with Docker
Language: Dockerfile - Size: 80.1 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

s8sg/spark-py-submit
A Python library to submit Spark jobs to a YARN cluster on different distributions (currently CDH and HDP)
Language: Python - Size: 31.3 KB - Last synced at: 2 days ago - Pushed at: over 8 years ago - Stars: 3 - Forks: 0

cameres/emr-spark-jupyter
📓 Repository/tutorial for initializing a Jupyter Notebook and Spark cluster on Amazon EMR
Language: Python - Size: 17.6 KB - Last synced at: almost 2 years ago - Pushed at: over 8 years ago - Stars: 4 - Forks: 1

hypnosapos/sparknetes
Spark on Kubernetes PoCs
Language: Makefile - Size: 1.12 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

gusdoe/sparkhistory
Language: Dockerfile - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

gioenn/sparkutils
A collection of scripts to easily start HDFS and Spark clusters
Language: Shell - Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 2 - Forks: 1

116davinder/spark-cluster-ansible
Size: 2.93 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

reddy-s/spark-container
Docker image to deploy a Spark cluster in containers
Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0
