GitHub topics: dataproc-clusters
AnveshaM/Enhancing-performance-of-big-data-machine-learning-models-on-Google-Cloud-Platform
The project focuses on parallelising pre-processing, measurement and machine learning in the cloud, and on evaluating and analysing cloud performance.
Language: Jupyter Notebook - Size: 8.88 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

Tinmarian/Airflow2.0-De-0-a-Heroe
Repository for following the Udemy course "Airflow2.0 De 0 a Héroe", from the "Datapath" academy.
Language: Python - Size: 43.9 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

dwaiba/dataproc-terraform
Customisable HA Dataproc cluster (debian-9) with ZooKeeper, Kafka, BigQuery and other tools/jobs, using Terraform
Language: HCL - Size: 28.3 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 7

lucianocoelho-28/dio-desafio-dataproc-gcp
Digital Innovation One - GCP Dataproc Challenge. The challenge consists of processing data using GCP's Dataproc product. The job counts the words in a book and reports how many times each word appears in it.
Language: Python - Size: 317 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

redvg/dataproc-pyspark-monte-carlo
Monte Carlo simulations with PySpark on GCP Cloud Dataproc clusters
Language: Python - Size: 144 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

redvg/dataproc-pyspark-mapreduce
GCP Dataproc MapReduce sample with PySpark
Language: Shell - Size: 179 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0
