An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: apache-beam

mmistroni/GCP_Experiments

Various utilities to be reused on GCP

Language: Python - Size: 2.57 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 5 - Forks: 1

GoogleCloudPlatform/DataflowTemplates

Cloud Dataflow Google-provided templates for solving in-Cloud data tasks

Language: Java - Size: 25.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1,235 - Forks: 1,040

CamilaJaviera91/mini-gcp

This project simulates a modern data pipeline architecture, entirely locally. It follows a modular design to extract, transform, load, validate, and analyze synthetic sales data using Python, Apache Beam, DuckDB, and PostgreSQL.

Language: Python - Size: 422 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2 - Forks: 0

nielsbasjes/yauaa

Yet Another UserAgent Analyzer

Language: Java - Size: 86 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 850 - Forks: 140

HarishHary/blink

Detection engine at scale using Apache Beam, Apache Flink, Kubernetes

Language: Go - Size: 8.13 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

tensorflow/tfx

TFX is an end-to-end platform for deploying production ML pipelines

Language: Python - Size: 234 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 2,161 - Forks: 723

ivanildobarauna-dev/data-pipeline-async-ingest

Pipeline for processing and consuming streaming data from Pub/Sub, integrating with Dataflow for real-time data processing

Language: Python - Size: 738 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

spotify/flink-on-k8s-operator

Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.

Language: Go - Size: 3.07 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 216 - Forks: 74

ncwhh/dataflow-jdbc-connection-pool

Example: Limit JDBC connections in Dataflow DoFns with a singleton pool

Language: Java - Size: 9.77 KB - Last synced at: 5 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

Ganeshsivakumar/langchain-beam

Integrates LLMs as PTransform in Apache Beam pipelines using LangChain

Language: Java - Size: 1.07 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 21 - Forks: 2

blockchain-etl/bitcoin-etl

ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ

Language: Python - Size: 354 KB - Last synced at: 10 days ago - Pushed at: 4 months ago - Stars: 433 - Forks: 131

google/weather-tools

Tools to make weather data accessible and useful.

Language: Python - Size: 196 MB - Last synced at: 4 days ago - Pushed at: 24 days ago - Stars: 233 - Forks: 44

google/fhir-data-pipes

A collection of tools for extracting FHIR resources and analytics services on top of that data.

Language: Jupyter Notebook - Size: 389 MB - Last synced at: 7 days ago - Pushed at: 12 days ago - Stars: 187 - Forks: 111

mbari-org/aipipeline

Library for running detection, clustering, or classification AI pipelines using Apache Beam

Language: Jupyter Notebook - Size: 78.2 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 3 - Forks: 0

mercari/pipeline

Tool to define Apache Beam pipeline in YAML or JSON

Language: Java - Size: 3.52 MB - Last synced at: 8 days ago - Pushed at: 16 days ago - Stars: 77 - Forks: 22

ngrunwald/datasplash

Clojure API for a more dynamic Google Dataflow

Language: Clojure - Size: 846 KB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 131 - Forks: 32

O2-Czech-Republic/proxima-platform

The Proxima platform.

Language: Java - Size: 9.42 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 21 - Forks: 7

google/consent-based-conversion-adjustments

Code to statistically up-weight conversion values of consenting customers to feed up to 100% of the factual conversion values back into Google Ads.

Language: Python - Size: 63.5 KB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 24 - Forks: 8

diabahmed/london-bicycle-analysis

Apache Beam pipeline for analyzing London bicycle sharing data using Google Cloud Dataflow and BigQuery.

Language: Python - Size: 31.3 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

google/tensorflow-recorder 📦

TFRecorder makes it easy to create TensorFlow records (TFRecords) from Pandas DataFrames and CSV files containing images or structured data.

Language: Python - Size: 6.54 MB - Last synced at: 3 days ago - Pushed at: over 3 years ago - Stars: 180 - Forks: 32

SolaceProducts/solace-apache-beam

Solace connector for Apache Beam / Google Cloud Dataflow

Language: Java - Size: 3.3 MB - Last synced at: 19 days ago - Pushed at: 3 months ago - Stars: 4 - Forks: 14

akvelon/beam Fork of apache/beam

Apache Beam is a unified programming model for Batch and Streaming

Language: Java - Size: 509 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 12

mkuthan/stream-processing

Learn how to develop and test stateful streaming and batch data pipelines

Language: Scala - Size: 24.9 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 8 - Forks: 2

blockchain-etl/blockchain-etl-streaming

Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes

Language: Python - Size: 64.5 KB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 81 - Forks: 22

google-parfait/dataset_grouper

Libraries for efficient and scalable group-structured dataset pipelines.

Language: Python - Size: 60.5 KB - Last synced at: 23 days ago - Pushed at: 3 months ago - Stars: 26 - Forks: 4

sayakpaul/count-tokens-hf-datasets

This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Dataflow.

Language: Python - Size: 19.5 KB - Last synced at: 1 day ago - Pushed at: almost 3 years ago - Stars: 27 - Forks: 1

miozilla/dataflowbeam

dataflowbeam

Size: 1.16 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

HimanshuMohanty-Git24/StreamLineIRCTC

A real-time data pipeline simulating IRCTC bookings using GCP. It streams mock data via Pub/Sub, transforms it with Dataflow (Python UDF), stores results in BigQuery, and powers live dashboards. Includes error handling and schema validation.

Language: Python - Size: 10.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

blockchain-etl/hedera-etl

ETL scripts for Hedera Hashgraph

Language: Java - Size: 511 KB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 12 - Forks: 3

medzin/beam-postgres

Light IO transforms for Postgres read/write in Apache Beam pipelines.

Language: Python - Size: 46.9 KB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 13 - Forks: 4

blockchain-etl/eos-etl

ETL scripts for EOS.

Language: Python - Size: 226 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 7

eduardogr/playing-apache-beam-tour

Playing with Apache Beam Tour: https://tour.beam.apache.org

Language: Go - Size: 34.2 KB - Last synced at: 25 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

eliias/gleam

Fun DSL for Apache Beam and Kotlin.

Language: Kotlin - Size: 2.72 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

thiagoneye/course-apache_beam

Apache Beam studies.

Language: Jupyter Notebook - Size: 2.93 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

rrmerugu-archive/apache-beam-io-extras

The missing I/O PTransforms of Apache Beam in Python, which already exist in the Java SDK but are not yet supported in the official apache-beam module.

Language: Python - Size: 12.7 KB - Last synced at: 1 day ago - Pushed at: about 7 years ago - Stars: 7 - Forks: 6

xlisp/apache-beam-csv-process

apache beam csv process

Language: Java - Size: 28.3 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

iht/bigquery-dataflow-cdc-example

A Dataflow streaming pipeline written in Java, reading data from Pubsub and recovering the sessions from potentially unordered data, and upserting the session data into BigQuery with no duplicates

Language: Java - Size: 184 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

CamilaJaviera91/apache-beam-pipeline-first-approach

This code demonstrates how to integrate Apache Beam with scikit-learn datasets and perform simple data transformations. It loads the Linnerud dataset from scikit-learn and converts it into a Pandas DataFrame for easier manipulation.

Language: Python - Size: 608 KB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

GoogleCloudPlatform/dataflow-pubsub-dedup

Language: Java - Size: 67.4 KB - Last synced at: 5 months ago - Pushed at: 12 months ago - Stars: 16 - Forks: 10

avcaliani/hello-airflow

🌬 PoC using Apache Airflow

Language: Python - Size: 499 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

yeopster/churn-prediction-GCP

Churn Prediction Machine Learning Using Google Cloud Platform

Language: Jupyter Notebook - Size: 126 KB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

armahdavi/MLOps

Productionizing ML Models using a variety of tools including FastAPI, Flask, Docker, AWS, GCP, TensorFlow Extended (TFX), and TF.js.

Language: Jupyter Notebook - Size: 5.97 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

vladimirrotariu/parallel-monte-carlo-simulations

A package to orchestrate parallel (Monte Carlo) simulations via Apache Beam for an arbitrary number of models, with low-level parameter granularity, and flexible random number generator choice.

Language: Python - Size: 883 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

GoogleCloudPlatform/flink-on-k8s-operator 📦

[DEPRECATED] Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.

Language: Go - Size: 1.67 MB - Last synced at: 5 months ago - Pushed at: about 3 years ago - Stars: 658 - Forks: 266

esakik/beam-mysql-connector

An Apache Beam I/O connector for seamless integration with MySQL database 🔗 https://beam.apache.org/documentation/io/connectors/#other-io-connectors-for-apache-beam

Language: Python - Size: 188 KB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 21 - Forks: 19

datastacktv/apache-beam-explained

Source code for the YouTube video, Apache Beam Explained in 12 Minutes

Language: Python - Size: 2.93 KB - Last synced at: 4 months ago - Pushed at: almost 5 years ago - Stars: 21 - Forks: 14

marceloneppel/apache-beam-golang-udf

Run UDFs (User Defined Functions) on Apache Beam Golang SDK.

Language: Go - Size: 14.6 KB - Last synced at: 5 days ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 1

ArthurCoutinho15/rain_data_pipeline

Language: Python - Size: 1.19 MB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

iht/beam-cloud-build-terraform

The scripts in this repo will build the Apache Beam Java SDK packages, using Cloud Build and Artifact Registry, for a personal Beam fork.

Language: HCL - Size: 71.3 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 1

davidkhala/ETL

Collection of data Extract, Transform, Load

Language: Batchfile - Size: 105 KB - Last synced at: 5 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

xmlking/micro-apps

Microservices in Post-Kubernetes Era. A polyglot monorepo

Language: Kotlin - Size: 5.77 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 53 - Forks: 10

archie-cm/Mobile_Game_Analysis_Real-Time_Pipeline_with_PubSub_and_Dataflow

This project demonstrates how to build a real-time analytics pipeline for mobile game data using Google Cloud Pub/Sub and Apache Beam (Dataflow).

Language: Python - Size: 13.7 KB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

archie-cm/identify_bank_defaulter_customer_with_beam

This project builds a data pipeline to identify bank defaulter customers based on credit card and loan payment data using Google Dataflow

Language: Python - Size: 99.6 KB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

mohaseeb/beam-nuggets

Collection of transforms for the Apache Beam Python SDK.

Language: Python - Size: 6.19 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 39

bipinct/beam-kafka-kotlin-boilerplate

This project is a boilerplate setup for Kafka and Apache Beam integration, built using Gradle and Kotlin. It leverages Apache Beam for data processing and Kafka for messaging.

Language: Kotlin - Size: 60.5 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

rm3l/apache-beam-java-firestore-batch-dataflow 📦

Companion Repo for blog post : https://rm3l.org/batch-writes-to-google-cloud-firestore-using-the-apache-beam-java-sdk-on-google-cloud-dataflow/

Language: Java - Size: 101 KB - Last synced at: 5 months ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

data-mission/dota2-cast-assist

Real-time Dota2 broadcaster’s assistant integrates the live Steam API with Dota GSI to provide game metrics like GPM, XPM, kills, deaths, damage, buybacks, and more, enhancing commentary with insights on player performance and the in-game economy

Language: Python - Size: 8.93 MB - Last synced at: 6 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

uche-madu/twitter-pipeline

Twitter data streaming and analysis

Language: Python - Size: 24.4 KB - Last synced at: 8 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

janaom/gcp-de-project-streaming-pubsub-beam-dataflow

This project demonstrates an end-to-end solution for processing and analyzing real-time conversations data from a JSON file using GCP services and infrastructure automation, showcasing data storage, streaming, processing, and analysis at scale.

Language: Python - Size: 172 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 1

datastacktv/apache-beam-batch-processing

Public source code for the Batch Processing with Apache Beam (Python) online course

Language: Python - Size: 81.1 KB - Last synced at: 19 days ago - Pushed at: almost 5 years ago - Stars: 18 - Forks: 9

GoogleCloudPlatform/dataflow-metrics-exporter

CLI tool to collect Dataflow resource and execution metrics and export them to either BigQuery or Google Cloud Storage. The tool is useful for comparing and visualizing metrics while benchmarking Dataflow pipelines across various data formats, resource configurations, etc.

Language: Java - Size: 67.4 KB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 3

janaom/gcp-data-engineering-etl-with-composer-dataflow

This project leverages GCS, Composer, Dataflow, BigQuery, and Looker on Google Cloud Platform (GCP) to build a robust data engineering solution for processing, storing, and reporting daily transaction data in the online food delivery industry.

Language: Python - Size: 290 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 22 - Forks: 13

blockchain-etl/blockchain-etl-architecture

Blockchain ETL Architecture

Size: 101 KB - Last synced at: 3 months ago - Pushed at: almost 3 years ago - Stars: 47 - Forks: 14

tosun-si/asgarde

Asgarde simplifies error handling with Apache Beam Java, using less code that is more concise and expressive.

Language: Java - Size: 159 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 74 - Forks: 5

twosom/beam-mail-io

Mail Connector for Apache Beam / Google Cloud Dataflow

Language: Java - Size: 61.5 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Juwono136/disaster-tweet-detection-mlops

End-to-end machine learning project to detect whether a tweet is about a disaster or not, plus model-serving monitoring using Prometheus and Grafana

Language: Jupyter Notebook - Size: 1.07 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

mkuthan/gcp-dataflow-tampermonkey

Tampermonkey script for GCP Dataflow console with enhanced view for finding job bottlenecks

Language: JavaScript - Size: 1.17 MB - Last synced at: 5 months ago - Pushed at: almost 5 years ago - Stars: 6 - Forks: 0

pompierninja/IP-Cameras-Monitoring

distributed computer vision

Language: Jupyter Notebook - Size: 203 MB - Last synced at: 4 days ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

johannaojeling/go-beam-pipeline

Data pipeline built with the Apache Beam Go SDK

Language: Go - Size: 887 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 1

beam-pyio/firehose_pyio

Apache Beam Python I/O connector for Amazon Data Firehose

Language: Python - Size: 2.83 MB - Last synced at: 6 days ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0

beam-pyio/sqs_pyio

Apache Beam Python I/O connector for Amazon SQS

Language: Python - Size: 2.83 MB - Last synced at: 2 days ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

beam-pyio/dynamodb_pyio

Apache Beam Python I/O connector for Amazon DynamoDB

Language: Python - Size: 2.8 MB - Last synced at: 22 days ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

beam-pyio/pyio-cookiecutter

Cookiecutter template for creating a package for the Apache Beam Python I/O Connectors project

Language: Python - Size: 229 KB - Last synced at: 6 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

BeamStackProj/beamstack-cli

Language: Go - Size: 95.3 MB - Last synced at: 8 months ago - Pushed at: 12 months ago - Stars: 3 - Forks: 0

tosun-si/pasgarde

Asgarde simplifies error handling with Apache Beam Python, using less code that is more concise and expressive.

Language: Python - Size: 61.5 KB - Last synced at: 21 days ago - Pushed at: about 3 years ago - Stars: 31 - Forks: 1

thecodemancer/study-with-me

Lots of code, resources, examples, some graphs and so much fun ahead!

Language: Jupyter Notebook - Size: 10.3 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

PEDROPERDO/Out 📦

Out : IMDB Review Classification on Apache Beam

Language: Jupyter Notebook - Size: 73.2 KB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

chermenin/kio

Kotlin extensions for Apache Beam

Language: Kotlin - Size: 1.28 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 12 - Forks: 1

emsalcengiz/Apache-Beam-examples

liked Apache Beam for streaming data transformations

Language: Python - Size: 7.81 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

felipenufisnor/apachebeam_pipeline_python

Apache Beam: Data Pipeline with Python

Language: Jupyter Notebook - Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

thecodemancer/e-commerce

e-commerce

Language: Python - Size: 3.15 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

tosun-si/world-cup-qatar-team-stats-kotlin-midgard

This application shows a full Apache Beam pipeline with Kotlin and the Midgard library. The use case works on data from the last FIFA World Cup in Qatar and calculates player statistics per team. This application will be presented at Beam Summit 2023 in New York

Language: Kotlin - Size: 851 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 1

tosun-si/midgard

Midgard is a wrapper on Beam Kotlin, allowing more concise and expressive code. It removes Beam boilerplate code and proposes a more functional programming style

Language: Kotlin - Size: 97.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 1

PATRICIAJUNQUEIRA/ETL_Apache_Beam

Project using Apache Beam to integrate, process, and analyze rainfall and dengue case data, creating an ETL that makes it possible to analyze dengue cases and identify the cities with the highest incidence.

Language: Jupyter Notebook - Size: 1.21 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

luillyfe/data-pipelines

Language: Go - Size: 52.7 KB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

NucleusEngineering/hack-your-pipe

Efficient streaming data ingestion, transformation & activation

Language: Python - Size: 3.01 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 28 - Forks: 3

ashleycheng/house-price-etl-pipeline

Building house price data pipelines with Apache Beam and Spark on GCP

Language: Python - Size: 14.6 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 1

bsrikanth24/gcp-data-engineering-etl-with-composer-dataflow

This project leverages GCS, Composer, Dataflow, BigQuery, and Looker on Google Cloud Platform (GCP) to build a robust data engineering solution for processing, storing, and reporting daily transaction data in the online food delivery industry.

Language: Python - Size: 39.1 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

neo4j-field/dataflow-flex-pyarrow-to-gds

Google Dataflow Flex Templates (in Python) for large scale Graph Loading with GDS and Apache Arrow

Language: Python - Size: 216 KB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 2

SamuelMarks/workflow-schemata

An exploration of various popular workflow tools from a schema level (in TOML & serde)

Language: Rust - Size: 24.4 KB - Last synced at: 18 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

alxmrs/beam-cli-example

How to structure Apache Beam pipelines as pip-installable CLIs.

Language: Python - Size: 12.7 KB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

ka-zo/booking-data-analysis

Booking data analysis

Language: Python - Size: 2.21 MB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

mkuthan/example-beam

Playground for Apache Beam and Scio experiments, driven by real-world use cases.

Language: Scala - Size: 190 KB - Last synced at: 3 months ago - Pushed at: almost 5 years ago - Stars: 9 - Forks: 3

pompierninja/hashtagsbattle 📦

IIM-DEVOPS - (demo) Real-time Twitter's hashtags analytics... fully managed by Google Cloud

Language: Python - Size: 827 KB - Last synced at: 4 days ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

pompierninja/beam-amazon-batch-example

A practical example of batch processing on Google Cloud Dataflow using the Go SDK for Apache Beam 🔥

Language: Go - Size: 455 KB - Last synced at: 4 days ago - Pushed at: almost 6 years ago - Stars: 3 - Forks: 0

mercari/DataflowTemplates

Convenient Dataflow pipelines for transforming data between cloud data sources

Language: Java - Size: 97.7 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 24 - Forks: 12

Expan75/slacknight

Proof of concept of real-time sentiment analysis of Slack conversations to catch propagation of harassment.

Language: JavaScript - Size: 40 KB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

regadas/scio-cats

leverage cats type classes and data types in scio pipelines

Language: Scala - Size: 1.09 MB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 5 - Forks: 2

phamphihungbk/beam-starter

🚢 Example of Apache Beam data pipeline

Language: Python - Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

akj009/bigquery-to-hdfs

Export data from BigQuery to HDFS

Language: Java - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 1