An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: cloud-dataflow

KayvanShah1/gws-token-activity-analyzer

A robust data pipeline to fetch, process, and analyze token activity events from the Google Workspace Admin Reports API. This project ensures no data loss across multiple runs and provides insight into API usage patterns.

Language: Python - Size: 134 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

mercari/pipeline

Tool to define Apache Beam pipelines in YAML or JSON

Language: Java - Size: 3.48 MB - Last synced at: 22 days ago - Pushed at: 2 months ago - Stars: 77 - Forks: 22

sanketrs/implementation-of-modern-data-engineering-architecture-with-fabric_analytics

Building a next-generation hybrid data pipeline architecture that combines the power of Microsoft Fabric, Azure Cloud, and Power BI. This pipeline is engineered to tackle the challenges of real-time data ingestion, multi-layered processing, and analytics, delivering business-critical insights.

Language: Python - Size: 32.2 KB - Last synced at: 15 days ago - Pushed at: 11 months ago - Stars: 5 - Forks: 0

Niangmohamed/GCP-Professional-Data-Engineer-Learning-Path

I got Google Cloud Certified. I have what it takes to leverage Google Cloud technology. Here is my certification: https://www.credential.net/ee1bd2d6-fdb0-4037-8a8d-9afae3d79c86.

Language: Jupyter Notebook - Size: 16.8 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 6

GoogleCloudPlatform/dataflow-opinion-analysis

Opinion Analysis of News, Threaded Conversations, and User Generated Content

Language: Java - Size: 100 MB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 102 - Forks: 25

GoogleCloudPlatform/dataflow-pubsub-dedup

Language: Java - Size: 67.4 KB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 10

viveknaskar/google-dataflow-redis-example (fork of arun-james/dataflow-example)

Cloud Dataflow pipeline code that processes data from a Cloud Storage bucket, transforms it, and stores it in Memorystore, Google's highly scalable, low-latency in-memory database, which is an implementation of Redis.

Language: Java - Size: 80.1 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

datastacktv/apache-beam-batch-processing

Public source code for the Batch Processing with Apache Beam (Python) online course

Language: Python - Size: 81.1 KB - Last synced at: 3 months ago - Pushed at: about 5 years ago - Stars: 18 - Forks: 9

tosun-si/asgarde

Asgarde simplifies error handling with Apache Beam Java, with less code that is more concise and expressive.

Language: Java - Size: 159 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 74 - Forks: 5

tosun-si/pasgarde

Asgarde simplifies error handling with Apache Beam Python, with less code that is more concise and expressive.

Language: Python - Size: 61.5 KB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 31 - Forks: 1

tosun-si/midgard

Midgard is a wrapper on Beam Kotlin that allows more concise and expressive code. It removes Beam boilerplate code and offers a more functional programming style.

Language: Kotlin - Size: 97.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 1

pompierninja/hashtagsbattle 📦

IIM-DEVOPS - (demo) Real-time Twitter hashtag analytics... fully managed by Google Cloud

Language: Python - Size: 827 KB - Last synced at: 30 days ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 0

viveknaskar/cloud-dataflow-with-memorystore

Cloud Dataflow pipeline that reads a file from Cloud Storage, processes it, and writes the output to Memorystore.

Language: Java - Size: 26.4 KB - Last synced at: 7 months ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 1

ostelco/ostelco-core

Cloud-native Telco BSS hosted in GCP K8s with standalone Diameter to gRPC gateway. Rule Engine using Neo4j graphs. Analytics Events sent to GCP BigData (Dataflow+BigQuery) via PubSub. It's awesome!

Language: Kotlin - Size: 25.8 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 36 - Forks: 13

tosun-si/dataflow-java-ci-cd

Project showing a CI/CD pipeline for Dataflow Java with Flex Template and Cloud Build

Language: Java - Size: 3.31 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

esakik/data-engineering-essentials

Samples related to data engineering, e.g. Spark, Embulk, Airflow, etc.

Language: Python - Size: 413 KB - Last synced at: 7 months ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 1

hayatoy/dataflow-tutorial

Cloud Dataflow Tutorial for Beginners

Language: Python - Size: 14.6 KB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 25 - Forks: 9

seahrh/fraud-detection-dataflow

Working example of a real-time inference pipeline on GCP Cloud Dataflow

Language: Python - Size: 111 KB - Last synced at: 8 months ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 1

maximoleinyk/pubsub-filter

GKE Replacement for PubSub-to-PubSub Cloud Dataflows in GCP

Language: TypeScript - Size: 409 KB - Last synced at: 9 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

pompierninja/meetuplytics 📦

Real-time Meetup "Répondez, s'il vous plaît" (RSVPs) analytics built upon Apache Beam - Streaming Processing

Language: Python - Size: 16.8 MB - Last synced at: 30 days ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

zdenulo/upload-data-datastore-dataflow

Language: Python - Size: 3.91 KB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 4 - Forks: 8

dongma/springcloud-dataflow

Build streaming apps on the Spring Cloud Data Flow platform

Language: Java - Size: 3.03 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 6 - Forks: 2

viveknaskar/cloud-dataflow-template-poc

Creating a Cloud Dataflow template in Java that counts the number of words in a document.

Language: Java - Size: 74.2 KB - Last synced at: 9 months ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 1

saiteja09/CloudDataFlow-ETL-OnPremise-JDBC

ETL of on-premises data to Google BigQuery using Google Cloud Dataflow via a JDBC driver

Language: Java - Size: 21.5 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 2

imrenagi/beam-bank-trx-analytics

Experiment in using windowing and triggers for more accurate streaming analytics with Apache Beam

Language: Java - Size: 14.6 KB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0