An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: spark-clusters

airscholar/Japan-visa-data-engineering

This project provides end-to-end data processing and visualization of visa numbers in Japan using PySpark and Plotly. The Spark clusters are set up within a Docker container on Azure.

Language: HTML - Size: 1.46 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 12

PiercingDan/spark-Jupyter-AWS

A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support

Language: Jupyter Notebook - Size: 220 KB - Last synced at: 2 days ago - Pushed at: over 7 years ago - Stars: 261 - Forks: 18

kthakore/spark-notebook-dsp-template

Template for Spark Data Science Projects

Language: Makefile - Size: 16.6 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 1

radanalyticsio/oshinko-cli 📦

Command line interface for a Spark cluster management app

Language: Go - Size: 85.8 MB - Last synced at: 11 months ago - Pushed at: almost 6 years ago - Stars: 11 - Forks: 16

bruler/kub-setup

Local Kubernetes-based ML setup

Language: Shell - Size: 11.7 KB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 4 - Forks: 0

conema/spark-terraform

This project creates a Hadoop and Spark cluster on AWS with Terraform

Language: Shell - Size: 30.3 KB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 4

kumarvna/terraform-azurerm-hdinsight

Terraform module to create Azure HDInsight, a managed, full-spectrum, open-source analytics service. This module creates Apache Hadoop, Apache Spark, Apache HBase, Interactive Query (Apache Hive LLAP) and Apache Kafka clusters.

Language: HCL - Size: 365 KB - Last synced at: 30 days ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 5

nikhilsu/Product-review-analysis-Spark-MongoDB

Performs various product review analyses on an Amazon dataset using Apache Spark and MongoDB

Language: Java - Size: 56.6 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 1

monyedavid/spark-cluster

Spark cluster management with Docker

Language: Dockerfile - Size: 80.1 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

s8sg/spark-py-submit

A Python library to submit Spark jobs to a YARN cluster on different distributions (currently CDH and HDP)

Language: Python - Size: 31.3 KB - Last synced at: 2 days ago - Pushed at: over 8 years ago - Stars: 3 - Forks: 0

cameres/emr-spark-jupyter

Repository/tutorial for initializing Jupyter Notebook and a Spark cluster on Amazon EMR

Language: Python - Size: 17.6 KB - Last synced at: almost 2 years ago - Pushed at: over 8 years ago - Stars: 4 - Forks: 1

hypnosapos/sparknetes

Spark on Kubernetes PoCs

Language: Makefile - Size: 1.12 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

gusdoe/sparkhistory

Language: Dockerfile - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

gioenn/sparkutils

A collection of scripts to easily start HDFS and Spark clusters

Language: Shell - Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 2 - Forks: 1

116davinder/spark-cluster-ansible

Size: 2.93 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

reddy-s/spark-container

Docker image to deploy a Spark cluster in containers

Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0