GitHub topics: apache-beam
mmistroni/GCP_Experiments
Various utilities to be reused on GCP
Language: Python - Size: 2.57 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 5 - Forks: 1

GoogleCloudPlatform/DataflowTemplates
Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
Language: Java - Size: 25.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1,235 - Forks: 1,040

CamilaJaviera91/mini-gcp
This project simulates a modern data pipeline architecture, entirely locally. It follows a modular design to extract, transform, load, validate, and analyze synthetic sales data using Python, Apache Beam, DuckDB, and PostgreSQL.
Language: Python - Size: 422 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2 - Forks: 0

nielsbasjes/yauaa
Yet Another UserAgent Analyzer
Language: Java - Size: 86 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 850 - Forks: 140

HarishHary/blink
Detection engine at scale using Apache Beam, Apache Flink, Kubernetes
Language: Go - Size: 8.13 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

tensorflow/tfx
TFX is an end-to-end platform for deploying production ML pipelines
Language: Python - Size: 234 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 2,161 - Forks: 723

ivanildobarauna-dev/data-pipeline-async-ingest
Pipeline for processing and consuming streaming data from Pub/Sub, integrating with Dataflow for real-time data processing
Language: Python - Size: 738 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

spotify/flink-on-k8s-operator
Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.
Language: Go - Size: 3.07 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 216 - Forks: 74

ncwhh/dataflow-jdbc-connection-pool
Example: Limit JDBC connections in Dataflow DoFns with a singleton pool
Language: Java - Size: 9.77 KB - Last synced at: 5 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

Ganeshsivakumar/langchain-beam
Integrates LLMs as PTransform in Apache Beam pipelines using LangChain
Language: Java - Size: 1.07 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 21 - Forks: 2

blockchain-etl/bitcoin-etl
ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
Language: Python - Size: 354 KB - Last synced at: 10 days ago - Pushed at: 4 months ago - Stars: 433 - Forks: 131

google/weather-tools
Tools to make weather data accessible and useful.
Language: Python - Size: 196 MB - Last synced at: 4 days ago - Pushed at: 24 days ago - Stars: 233 - Forks: 44

google/fhir-data-pipes
A collection of tools for extracting FHIR resources and analytics services on top of that data.
Language: Jupyter Notebook - Size: 389 MB - Last synced at: 7 days ago - Pushed at: 12 days ago - Stars: 187 - Forks: 111

mbari-org/aipipeline
Library for running detection, clustering, or classification AI pipelines using Apache Beam
Language: Jupyter Notebook - Size: 78.2 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 3 - Forks: 0

mercari/pipeline
Tool to define Apache Beam pipeline in YAML or JSON
Language: Java - Size: 3.52 MB - Last synced at: 8 days ago - Pushed at: 16 days ago - Stars: 77 - Forks: 22

ngrunwald/datasplash
Clojure API for a more dynamic Google Dataflow
Language: Clojure - Size: 846 KB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 131 - Forks: 32

O2-Czech-Republic/proxima-platform
The Proxima platform.
Language: Java - Size: 9.42 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 21 - Forks: 7

google/consent-based-conversion-adjustments
Code to statistically up-weight conversion values of consenting customers to feed up to 100% of the factual conversion values back into Google Ads.
Language: Python - Size: 63.5 KB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 24 - Forks: 8

diabahmed/london-bicycle-analysis
Apache Beam pipeline for analyzing London bicycle sharing data using Google Cloud Dataflow and BigQuery.
Language: Python - Size: 31.3 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

google/tensorflow-recorder 📦
TFRecorder makes it easy to create TensorFlow records (TFRecords) from Pandas DataFrames and CSV files containing images or structured data.
Language: Python - Size: 6.54 MB - Last synced at: 3 days ago - Pushed at: over 3 years ago - Stars: 180 - Forks: 32

SolaceProducts/solace-apache-beam
Solace connector for Apache Beam / Google Cloud Dataflow
Language: Java - Size: 3.3 MB - Last synced at: 19 days ago - Pushed at: 3 months ago - Stars: 4 - Forks: 14

akvelon/beam (fork of apache/beam)
Apache Beam is a unified programming model for Batch and Streaming
Language: Java - Size: 509 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 12

mkuthan/stream-processing
Learn how to develop and test stateful streaming and batch data pipelines
Language: Scala - Size: 24.9 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 8 - Forks: 2

blockchain-etl/blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Language: Python - Size: 64.5 KB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 81 - Forks: 22

google-parfait/dataset_grouper
Libraries for efficient and scalable group-structured dataset pipelines.
Language: Python - Size: 60.5 KB - Last synced at: 23 days ago - Pushed at: 3 months ago - Stars: 26 - Forks: 4

sayakpaul/count-tokens-hf-datasets
This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Dataflow.
Language: Python - Size: 19.5 KB - Last synced at: 1 day ago - Pushed at: almost 3 years ago - Stars: 27 - Forks: 1

miozilla/dataflowbeam
dataflowbeam
Size: 1.16 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

HimanshuMohanty-Git24/StreamLineIRCTC
A real-time data pipeline simulating IRCTC bookings using GCP. It streams mock data via Pub/Sub, transforms it with Dataflow (Python UDF), stores results in BigQuery, and powers live dashboards. Includes error handling and schema validation.
Language: Python - Size: 10.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

blockchain-etl/hedera-etl
ETL scripts for Hedera Hashgraph
Language: Java - Size: 511 KB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 12 - Forks: 3

medzin/beam-postgres
Light IO transforms for Postgres read/write in Apache Beam pipelines.
Language: Python - Size: 46.9 KB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 13 - Forks: 4

blockchain-etl/eos-etl
ETL scripts for EOS.
Language: Python - Size: 226 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 7

eduardogr/playing-apache-beam-tour
Playing with Apache Beam Tour: https://tour.beam.apache.org
Language: Go - Size: 34.2 KB - Last synced at: 25 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

eliias/gleam
Fun DSL for Apache Beam and Kotlin.
Language: Kotlin - Size: 2.72 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

thiagoneye/course-apache_beam
Apache Beam studies.
Language: Jupyter Notebook - Size: 2.93 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

rrmerugu-archive/apache-beam-io-extras
The missing I/O PTransforms of Apache Beam in Python: transforms that already exist in the Java SDK but are not yet supported in the official apache-beam module.
Language: Python - Size: 12.7 KB - Last synced at: 1 day ago - Pushed at: about 7 years ago - Stars: 7 - Forks: 6

xlisp/apache-beam-csv-process
apache beam csv process
Language: Java - Size: 28.3 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

iht/bigquery-dataflow-cdc-example
A Dataflow streaming pipeline written in Java, reading data from Pubsub and recovering the sessions from potentially unordered data, and upserting the session data into BigQuery with no duplicates
Language: Java - Size: 184 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

CamilaJaviera91/apache-beam-pipeline-first-approach
This code demonstrates how to integrate Apache Beam with scikit-learn datasets and perform simple data transformations. It loads the Linnerud dataset from scikit-learn, converts it into a Pandas DataFrame for easier manipulation.
Language: Python - Size: 608 KB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

GoogleCloudPlatform/dataflow-pubsub-dedup
Language: Java - Size: 67.4 KB - Last synced at: 5 months ago - Pushed at: 12 months ago - Stars: 16 - Forks: 10

avcaliani/hello-airflow
🌬 PoC using Apache Airflow
Language: Python - Size: 499 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

yeopster/churn-prediction-GCP
Churn Prediction Machine Learning Using Google Cloud Platform
Language: Jupyter Notebook - Size: 126 KB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

armahdavi/MLOps
Productionizing ML Models using a variety of tools including FastAPI, Flask, Docker, AWS, GCP, TensorFlow Extended (TFX), and TF.js.
Language: Jupyter Notebook - Size: 5.97 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

vladimirrotariu/parallel-monte-carlo-simulations
A package to orchestrate parallel (Monte Carlo) simulations via Apache Beam for an arbitrary number of models, with low-level parameter granularity, and flexible random number generator choice.
Language: Python - Size: 883 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

GoogleCloudPlatform/flink-on-k8s-operator 📦
[DEPRECATED] Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.
Language: Go - Size: 1.67 MB - Last synced at: 5 months ago - Pushed at: about 3 years ago - Stars: 658 - Forks: 266

esakik/beam-mysql-connector
An Apache Beam I/O connector for seamless integration with MySQL database 🔗 https://beam.apache.org/documentation/io/connectors/#other-io-connectors-for-apache-beam
Language: Python - Size: 188 KB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 21 - Forks: 19

datastacktv/apache-beam-explained
Source code for the YouTube video, Apache Beam Explained in 12 Minutes
Language: Python - Size: 2.93 KB - Last synced at: 4 months ago - Pushed at: almost 5 years ago - Stars: 21 - Forks: 14

marceloneppel/apache-beam-golang-udf
Run UDFs (User Defined Functions) on Apache Beam Golang SDK.
Language: Go - Size: 14.6 KB - Last synced at: 5 days ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 1

ArthurCoutinho15/rain_data_pipeline
Language: Python - Size: 1.19 MB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

iht/beam-cloud-build-terraform
The scripts in this repo will build the Apache Beam Java SDK packages, using Cloud Build and Artifact Registry, for a personal Beam fork.
Language: HCL - Size: 71.3 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 1

davidkhala/ETL
Collection of data Extract, Transform, Load
Language: Batchfile - Size: 105 KB - Last synced at: 5 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

xmlking/micro-apps
Microservices in Post-Kubernetes Era. A polyglot monorepo
Language: Kotlin - Size: 5.77 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 53 - Forks: 10

archie-cm/Mobile_Game_Analysis_Real-Time_Pipeline_with_PubSub_and_Dataflow
This project demonstrates how to build a real-time analytics pipeline for mobile game data using Google Cloud Pub/Sub and Apache Beam (Dataflow).
Language: Python - Size: 13.7 KB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

archie-cm/identify_bank_defaulter_customer_with_beam
This project builds a data pipeline to identify bank defaulter customers based on credit card and loan payment data using Google Dataflow
Language: Python - Size: 99.6 KB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

mohaseeb/beam-nuggets
Collection of transforms for the Apache Beam Python SDK.
Language: Python - Size: 6.19 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 39

bipinct/beam-kafka-kotlin-boilerplate
This project is a boilerplate setup for Kafka and Apache Beam integration, built using Gradle and Kotlin. It leverages Apache Beam for data processing and Kafka for messaging.
Language: Kotlin - Size: 60.5 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

rm3l/apache-beam-java-firestore-batch-dataflow 📦
Companion Repo for blog post : https://rm3l.org/batch-writes-to-google-cloud-firestore-using-the-apache-beam-java-sdk-on-google-cloud-dataflow/
Language: Java - Size: 101 KB - Last synced at: 5 months ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

data-mission/dota2-cast-assist
Real-time Dota2 broadcaster’s assistant integrates the live Steam API with Dota GSI to provide game metrics like GPM, XPM, kills, deaths, damage, buybacks, and more, enhancing commentary with insights on player performance and the in-game economy
Language: Python - Size: 8.93 MB - Last synced at: 6 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

uche-madu/twitter-pipeline
Twitter data streaming and analysis
Language: Python - Size: 24.4 KB - Last synced at: 8 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

janaom/gcp-de-project-streaming-pubsub-beam-dataflow
This project demonstrates an end-to-end solution for processing and analyzing real-time conversations data from a JSON file using GCP services and infrastructure automation, showcasing data storage, streaming, processing, and analysis at scale.
Language: Python - Size: 172 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 1

datastacktv/apache-beam-batch-processing
Public source code for the Batch Processing with Apache Beam (Python) online course
Language: Python - Size: 81.1 KB - Last synced at: 19 days ago - Pushed at: almost 5 years ago - Stars: 18 - Forks: 9

GoogleCloudPlatform/dataflow-metrics-exporter
CLI tool to collect Dataflow resource and execution metrics and export them to either BigQuery or Google Cloud Storage. Useful for comparing and visualizing metrics while benchmarking Dataflow pipelines across data formats, resource configurations, etc.
Language: Java - Size: 67.4 KB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 3

janaom/gcp-data-engineering-etl-with-composer-dataflow
This project leverages GCS, Composer, Dataflow, BigQuery, and Looker on Google Cloud Platform (GCP) to build a robust data engineering solution for processing, storing, and reporting daily transaction data in the online food delivery industry.
Language: Python - Size: 290 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 22 - Forks: 13

blockchain-etl/blockchain-etl-architecture
Blockchain ETL Architecture
Size: 101 KB - Last synced at: 3 months ago - Pushed at: almost 3 years ago - Stars: 47 - Forks: 14

tosun-si/asgarde
Asgarde simplifies error handling with Apache Beam Java, with less code that is more concise and expressive.
Language: Java - Size: 159 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 74 - Forks: 5

twosom/beam-mail-io
Mail Connector for Apache Beam / Google Cloud Dataflow
Language: Java - Size: 61.5 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Juwono136/disaster-tweet-detection-mlops
End-to-end machine learning project to detect whether a tweet is about a disaster or not, plus monitoring of model serving using Prometheus and Grafana
Language: Jupyter Notebook - Size: 1.07 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

mkuthan/gcp-dataflow-tampermonkey
Tampermonkey script for GCP Dataflow console with enhanced view for finding job bottlenecks
Language: JavaScript - Size: 1.17 MB - Last synced at: 5 months ago - Pushed at: almost 5 years ago - Stars: 6 - Forks: 0

pompierninja/IP-Cameras-Monitoring
distributed computer vision
Language: Jupyter Notebook - Size: 203 MB - Last synced at: 4 days ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

johannaojeling/go-beam-pipeline
Data pipeline built with the Apache Beam Go SDK
Language: Go - Size: 887 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 1

beam-pyio/firehose_pyio
Apache Beam Python I/O connector for Amazon Data Firehose
Language: Python - Size: 2.83 MB - Last synced at: 6 days ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0

beam-pyio/sqs_pyio
Apache Beam Python I/O connector for Amazon SQS
Language: Python - Size: 2.83 MB - Last synced at: 2 days ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

beam-pyio/dynamodb_pyio
Apache Beam Python I/O connector for Amazon DynamoDB
Language: Python - Size: 2.8 MB - Last synced at: 22 days ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

beam-pyio/pyio-cookiecutter
Cookiecutter template for creating a package for the Apache Beam Python I/O Connectors project
Language: Python - Size: 229 KB - Last synced at: 6 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

BeamStackProj/beamstack-cli
Language: Go - Size: 95.3 MB - Last synced at: 8 months ago - Pushed at: 12 months ago - Stars: 3 - Forks: 0

tosun-si/pasgarde
Asgarde simplifies error handling with Apache Beam Python, with less code that is more concise and expressive.
Language: Python - Size: 61.5 KB - Last synced at: 21 days ago - Pushed at: about 3 years ago - Stars: 31 - Forks: 1

thecodemancer/study-with-me
Lots of code, resources, examples, some graphs and so much fun ahead!
Language: Jupyter Notebook - Size: 10.3 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

PEDROPERDO/Out 📦
Out : IMDB Review Classification on Apache Beam
Language: Jupyter Notebook - Size: 73.2 KB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

chermenin/kio
Kotlin extensions for Apache Beam
Language: Kotlin - Size: 1.28 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 12 - Forks: 1

emsalcengiz/Apache-Beam-examples
liked Apache Beam for streaming data transformations
Language: Python - Size: 7.81 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

felipenufisnor/apachebeam_pipeline_python
Apache Beam: Data Pipeline with Python
Language: Jupyter Notebook - Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

thecodemancer/e-commerce
e-commerce
Language: Python - Size: 3.15 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

tosun-si/world-cup-qatar-team-stats-kotlin-midgard
This application shows a full Apache Beam pipeline with Kotlin and the Midgard library. The use case works on data from the last FIFA World Cup in Qatar and calculates player statistics per team. This application will be presented at Beam Summit 2023 in New York
Language: Kotlin - Size: 851 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 1

tosun-si/midgard
Midgard is a wrapper on Beam Kotlin, allowing more concise and expressive code. It removes Beam boilerplate code and offers a more functional programming style
Language: Kotlin - Size: 97.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 1

PATRICIAJUNQUEIRA/ETL_Apache_Beam
Project using Apache Beam to integrate, process, and analyze rainfall and dengue case data, building an ETL that makes it possible to analyze dengue cases and identify the cities with the highest incidence.
Language: Jupyter Notebook - Size: 1.21 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

luillyfe/data-pipelines
Language: Go - Size: 52.7 KB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

NucleusEngineering/hack-your-pipe
Efficient streaming data ingestion, transformation & activation
Language: Python - Size: 3.01 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 28 - Forks: 3

ashleycheng/house-price-etl-pipeline
Building house price data pipelines with Apache Beam and Spark on GCP
Language: Python - Size: 14.6 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 1

bsrikanth24/gcp-data-engineering-etl-with-composer-dataflow
This project leverages GCS, Composer, Dataflow, BigQuery, and Looker on Google Cloud Platform (GCP) to build a robust data engineering solution for processing, storing, and reporting daily transaction data in the online food delivery industry.
Language: Python - Size: 39.1 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

neo4j-field/dataflow-flex-pyarrow-to-gds
Google Dataflow Flex Templates (in Python) for large scale Graph Loading with GDS and Apache Arrow
Language: Python - Size: 216 KB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 2

SamuelMarks/workflow-schemata
An exploration of various popular workflow tools from a schema level (in TOML & serde)
Language: Rust - Size: 24.4 KB - Last synced at: 18 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

alxmrs/beam-cli-example
How to structure Apache Beam pipelines as pip-installable CLIs.
Language: Python - Size: 12.7 KB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

ka-zo/booking-data-analysis
Booking data analysis
Language: Python - Size: 2.21 MB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

mkuthan/example-beam
Playground for Apache Beam and Scio experiments, driven by real-world use cases.
Language: Scala - Size: 190 KB - Last synced at: 3 months ago - Pushed at: almost 5 years ago - Stars: 9 - Forks: 3

pompierninja/hashtagsbattle 📦
IIM-DEVOPS - (demo) Real-time Twitter's hashtags analytics... fully managed by Google Cloud
Language: Python - Size: 827 KB - Last synced at: 4 days ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

pompierninja/beam-amazon-batch-example
A practical example of batch processing on Google Cloud Dataflow using the Go SDK for Apache Beam 🔥
Language: Go - Size: 455 KB - Last synced at: 4 days ago - Pushed at: almost 6 years ago - Stars: 3 - Forks: 0

mercari/DataflowTemplates
Convenient Dataflow pipelines for transforming data between cloud data sources
Language: Java - Size: 97.7 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 24 - Forks: 12

Expan75/slacknight
Proof of concept of real-time sentiment analysis of Slack conversations to catch propagation of harassment.
Language: JavaScript - Size: 40 KB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

regadas/scio-cats
leverage cats type classes and data types in scio pipelines
Language: Scala - Size: 1.09 MB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 5 - Forks: 2

phamphihungbk/beam-starter
🚢 Example of Apache Beam data pipeline
Language: Python - Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

akj009/bigquery-to-hdfs
Export data from BigQuery to HDFS
Language: Java - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 1
