Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub / sdabhi23 / kafka-data-pipeline

A streaming data pipeline that uses Kafka as its backbone and Flink for data processing and transformations. Kafka Connect writes the streams to S3-compatible blob stores and to Redis (a low-latency KV store for real-time ML inference). Spark runs the batch job that backfills the ML feature data.
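
To illustrate the Kafka Connect leg of the description above, here is a minimal sketch of a payload that could be posted to the Kafka Connect REST API (`POST /connectors`) to register an S3 sink. It assumes the Confluent S3 sink connector; the repository may use a different connector. The connector name, topic, bucket, and endpoint below are hypothetical placeholders, not values taken from the repository.

```python
import json


def s3_sink_connector(name: str, topic: str, bucket: str, store_url: str) -> dict:
    """Build a Kafka Connect REST payload for an S3 sink connector.

    Assumes the Confluent S3 sink connector; all argument values passed
    in the example below are hypothetical.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.s3.S3SinkConnector",
            "tasks.max": "1",
            "topics": topic,
            "s3.bucket.name": bucket,
            "s3.region": "us-east-1",
            # store.url points the connector at an S3-compatible store
            # instead of AWS S3 itself.
            "store.url": store_url,
            "storage.class": "io.confluent.connect.s3.storage.S3Storage",
            "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
            # Number of records written per object before it is committed.
            "flush.size": "1000",
        },
    }


# Hypothetical names for illustration only.
payload = s3_sink_connector(
    name="features-s3-sink",
    topic="ml-features",
    bucket="feature-archive",
    store_url="http://minio:9000",
)
print(json.dumps(payload, indent=2))
```

The payload is kept as a plain dictionary so it can be serialized and sent with any HTTP client. A Redis sink would be registered the same way with a different connector class.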

JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sdabhi23%2Fkafka-data-pipeline
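
The JSON API URL above percent-encodes the slash in the repository's full name (`sdabhi23%2Fkafka-data-pipeline`). As a small sketch of how such a URL could be built and fetched with the Python standard library: the function names `repository_url` and `fetch_repository` are hypothetical helpers, and no particular response fields are assumed.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_ROOT = "https://repos.ecosyste.ms/api/v1"


def repository_url(host: str, full_name: str) -> str:
    """Build the repository endpoint URL, percent-encoding the '/' in full_name."""
    # safe="" forces "/" to be encoded as %2F so the owner/name pair
    # stays a single path segment.
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def fetch_repository(host: str, full_name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the repository metadata (requires network access)."""
    with urlopen(repository_url(host, full_name), timeout=timeout) as response:
        return json.load(response)


url = repository_url("GitHub", "sdabhi23/kafka-data-pipeline")
print(url)
```

Calling `fetch_repository("GitHub", "sdabhi23/kafka-data-pipeline")` would perform the actual request; it is left out of the example so the snippet runs offline.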

Stars: 0
Forks: 0
Open Issues: 0

License: None
Language: Python
Repo Size: 159 KB
Dependencies: 65

Created: 3 months ago
Updated: 2 months ago
Last pushed: 2 months ago
Last synced: 2 months ago

Topics: flink-stream-processing, flink-tab, kafka, kafka-connect, redis, s3