An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: gcp-dataflow

kameshpoc/rag-data-pipeline

This repo demonstrates a RAG data processing pipeline using Dataflow Flex Templates

Language: Python - Size: 22.5 KB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

cristian-rincon/action-dataflow-template

GitHub Action to create Dataflow templates

Language: Shell - Size: 9.77 KB - Last synced at: 5 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

EmediongFrancis/Scalable-Data-Processing-and-Query-Optimization-GCP-Kafka-Snowflake-Airflow

This project focuses on scalable data processing and query performance optimisation. It uses Snowflake for data warehousing, GCP Cloud Functions for serverless compute, and Apache Kafka for real-time data streaming. It leverages the serverless capabilities of the systems for scalability and performance.

Language: HCL - Size: 479 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

mkuthan/example-beam

Playground for Apache Beam and Scio experiments, driven by real-world use cases.

Language: Scala - Size: 190 KB - Last synced at: 8 days ago - Pushed at: over 4 years ago - Stars: 9 - Forks: 3

ray-bytes/Walmart-back-friday-sales-analysis

Black Friday, the biggest shopping day of the year, presents a unique opportunity for retailers like Walmart to boost sales, attract new customers, and clear inventory. Managing the surge in transaction volumes, understanding customer preferences, and optimizing inventory in real time are critical challenges that require sophisticated data solution

Language: Python - Size: 120 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

UkrSoftTech/gcp-space-shepherd

GCP Space Shepherd - a service for monitoring Google Dataflow executions

Language: Java - Size: 31.3 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

AbhishekSingh1180/snowflake-bigquery-data-migration

Leveraged GitHub Actions to automate the deployment of a GCP pipeline for Snowflake to BigQuery data migration. Utilized 'sensex-data-analysis' as the data source and the Snowflake storage integration feature to load data to GCS. Implemented workflow management and transformation using Composer (Airflow) and Dataflow.

Language: Shell - Size: 587 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

prakashdontaraju/google-cloud-ecommerce

ecommerce GCP Streaming pipeline ― Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery and Tableau; GCP Batch pipeline ― Cloud Storage, Dataproc, PySpark, Cloud Spanner and Tableau

Language: Python - Size: 4.38 MB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 3

cakirmuha/log-collector

Language: Go - Size: 85 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

svan0/anime_recommendation_system

An end-to-end anime recommendation system based on data scraped from myanimelist.net

Language: Python - Size: 107 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 11 - Forks: 0

crosslibs/export-dialogflow-logs-to-bigquery

Export Dialogflow conversation logs to BigQuery, masking PII using the DLP API

Language: JavaScript - Size: 271 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 5 - Forks: 2

redvg/streaming-pipeline-IoT-simulator

ETL pipeline on GCP

Language: Jupyter Notebook - Size: 1.14 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 7 - Forks: 3

ml-processing-backbone/batch-processing-boilerplate

Boilerplate for orchestrating batch-processing scenarios. Apache Airflow with a realistic product analytics use case

Language: Python - Size: 1.91 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 1

RealKinetic/gcp-dataflow-gcf-trigger

Trigger a Dataflow job when a file is uploaded to Cloud Storage using a Cloud Function

Language: Python - Size: 11.7 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 1

sacontreras/fids-capstone-asl-translation

Big Data ETL Pipeline for ASL-to-Text (Computer Vision), using Apache Beam on GCP Dataflow

Language: Jupyter Notebook - Size: 2.01 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

zaheershk/gcp-labs

Sample projects to explore various Google Cloud service offerings and architecture approaches

Size: 0 Bytes - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

tansudasli/beam-sandbox

Apache Beam sandbox with Dataflow for 10+ use cases

Language: Python - Size: 4.28 MB - Last synced at: 3 months ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

tckz/scio-example

Language: Scala - Size: 3.91 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1

redvg/dataflow-cpb101-pipeline-mapreduce-py

GCP Dataflow pipeline with MapReduce in Python

Language: Python - Size: 10.7 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

redvg/dataflow-bigquery-source-and-side-input-py

GCP Dataflow pipeline with BigQuery as source and side input

Language: Python - Size: 271 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

redvg/dataflow_pipeline_py

GCP Dataflow pipeline in Python

Language: Java - Size: 11.7 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 1