Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: dataproc-cluster

bche3/Big-Data-Project-Voter-Turnout-Prediction

Data Science Project: Predicting voter turnout in swing states in the United States based on 2020 General Election data through big data analytics

Language: Jupyter Notebook - Size: 4.88 MB - Last synced: 13 days ago - Pushed: 13 days ago - Stars: 0 - Forks: 0

Wittline/pyDag

Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag

Language: Python - Size: 146 KB - Last synced: 23 days ago - Pushed: over 1 year ago - Stars: 24 - Forks: 3

mihir-robotics/pyspark-gcp-project

PySpark job that runs on a Dataproc cluster and loads data from Cloud Storage into a BigQuery table.

Language: Python - Size: 8.25 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

brauseo/desafio-dataproc

Creating a fully managed Hadoop ecosystem with Google Cloud Platform: the challenge is to process data using GCP's Dataproc product. The job counts the words in a book and reports how many times each word appears in it.

Size: 64.5 KB - Last synced: 5 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

tirthmehta/Google-Cloud-Platform-based-Hadoop-Map-Reduce

Determines which words occur in a dataset of textbooks, along with each word's occurrence count, using a Dataproc cluster on Google Cloud Platform.

Language: Java - Size: 1010 KB - Last synced: 7 months ago - Pushed: almost 7 years ago - Stars: 0 - Forks: 0

cloudgear-io/gke-terraform

GKE with Terraform, Dataproc with Terraform

Language: HCL - Size: 94.7 KB - Last synced: 8 months ago - Pushed: about 4 years ago - Stars: 3 - Forks: 4

spotify/limbo 📦

Language: Scala - Size: 136 KB - Last synced: about 1 month ago - Pushed: over 7 years ago - Stars: 6 - Forks: 2

dwaiba/dataproc-terraform

Customisable HA Dataproc cluster (debian-9) with ZooKeeper, Kafka, BigQuery and other tools/jobs, provisioned with Terraform

Language: HCL - Size: 28.3 KB - Last synced: about 1 month ago - Pushed: about 4 years ago - Stars: 3 - Forks: 7

anjijava16/GCP_Data_Enginner_Utils

GCP_Data_Enginner

Language: Shell - Size: 1.21 MB - Last synced: 10 months ago - Pushed: over 2 years ago - Stars: 7 - Forks: 0

MarieeCzy/METAR-Data-Engineering-and-Machine-Learning-Project

An educational project to build an end-to-end pipeline for near real-time and batch processing of data, which is then used for visualisation and a machine learning model.

Language: Python - Size: 3.93 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 2 - Forks: 0

mr-ubik/google-nembo 📦

Collection of personal resources on Google Cloud

Size: 6.84 KB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 1 - Forks: 0

akaliutau/gcp-prod-spark-cluster

Deploying a production-ready environment for a Spark cluster

Language: HCL - Size: 15.6 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

naranjja/gcp-jupyter-sql

Run Jupyter Notebooks (and store data) on Google Cloud Platform.

Language: Python - Size: 321 KB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 9 - Forks: 4

natmurad/cloudbigdata

Content about how to create big data ecosystems on the Cloud

Language: HTML - Size: 872 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

jonathanAmancioSales/Hadoop_Dataproc_Google_Cloud_Platform_DIO

Project for the course "Creating a Fully Managed Hadoop Ecosystem with Google Cloud Dataproc" from the Digital Innovation One Data Engineer Bootcamp

Language: Shell - Size: 317 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

vishnudxb/gcloud-dataproc-creation

Create a gcloud Dataproc cluster with this GitHub Action

Language: Shell - Size: 8.79 KB - Last synced: 2 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

pietrocarbo/scala-ble

A Scala and Spark based project for experimenting with map-reduce algorithms on graph-shaped big data

Language: Scala - Size: 66.4 KB - Last synced: about 1 year ago - Pushed: almost 6 years ago - Stars: 1 - Forks: 1