An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: dataproc-clusters

AnveshaM/Enhancing-performance-of-big-data-machine-learning-models-on-Google-Cloud-Platform

This project focuses on parallelising pre-processing, measurement, and machine learning in the cloud, and on evaluating and analysing cloud performance.

Language: Jupyter Notebook - Size: 8.88 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

Tinmarian/Airflow2.0-De-0-a-Heroe

Repository for following the Udemy course "Airflow2.0 De 0 a Héroe", from the "Datapath" academy.

Language: Python - Size: 43.9 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

dwaiba/dataproc-terraform

Customisable Dataproc HA cluster on debian-9 with ZooKeeper, Kafka, BigQuery, and other tools/jobs, using Terraform

Language: HCL - Size: 28.3 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 7

lucianocoelho-28/dio-desafio-dataproc-gcp

Digital Innovation One - GCP Dataproc Challenge. The challenge consists of processing data with GCP's Dataproc product. The job counts the words in a book and reports how many times each word appears in it.

Language: Python - Size: 317 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

redvg/dataproc-pyspark-monte-carlo

Monte Carlo simulations with PySpark on GCP Cloud Dataproc clusters

Language: Python - Size: 144 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

redvg/dataproc-pyspark-mapreduce

GCP Dataproc MapReduce sample with PySpark

Language: Shell - Size: 179 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0