Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / sdabhi23 / kafka-data-pipeline
A streaming data pipeline that uses Kafka as the backbone and Flink for data processing and transformations. Kafka Connect writes the streams to S3-compatible blob stores and to Redis (a low-latency KV store for real-time ML inference). Spark runs the batch job that backfills the ML feature data.
JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sdabhi23%2Fkafka-data-pipeline
Stars: 0
Forks: 0
Open Issues: 0
License: None
Language: Python
Repo Size: 159 KB
Dependencies: 65
Created: 3 months ago
Updated: 2 months ago
Last pushed: 2 months ago
Last synced: 2 months ago
Topics: flink-stream-processing, flink-tab, kafka, kafka-connect, redis, s3
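The JSON API URL listed above percent-encodes the slash in the repository name (`sdabhi23%2Fkafka-data-pipeline`). A minimal sketch of building that URL and fetching the record with the Python standard library is shown here. The helper names `repository_url` and `fetch_repository` are hypothetical, and the fields of the returned JSON are not listed on this page, so the sketch makes no assumptions about them.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the JSON API URL shown on this page.
API_BASE = "https://repos.ecosyste.ms/api/v1"


def repository_url(host: str, full_name: str) -> str:
    """Build the JSON API URL for a repository (hypothetical helper).

    safe="" makes quote() encode "/" as %2F, matching the URL format above.
    """
    host_part = urllib.parse.quote(host, safe="")
    name_part = urllib.parse.quote(full_name, safe="")
    return f"{API_BASE}/hosts/{host_part}/repositories/{name_part}"


def fetch_repository(host: str, full_name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the repository record (requires network access)."""
    url = repository_url(host, full_name)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (performs a network request, so it is left commented out):
# record = fetch_repository("GitHub", "sdabhi23/kafka-data-pipeline")
# print(sorted(record.keys()))
```

Keeping URL construction separate from the network call makes the encoding step easy to check without contacting the service.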
Dependencies
- flink 1.17.2 build
- docker.redpanda.com/redpandadata/connectors v1.0.21 build
- apache/spark-py v3.4.0 build
- pyflink 1.17.2
- redpanda-kafka-connectors v1.0.21
- pyspark v3.4.0
- docker.redpanda.com/redpandadata/console v2.4.5
- docker.redpanda.com/redpandadata/redpanda v23.3.9
- quay.io/minio/minio RELEASE.2024-03-26T22-10-45Z
- redis/redis-stack 7.2.0-v9
- apache-flink 1.17.2
- confluent-kafka *
- faker *
- fastavro *
- minio *
- pyspark ==3.4.0
- requests *
- apache-beam ==2.48.0
- apache-flink ==1.17.2
- apache-flink-libraries ==1.17.2
- argon2-cffi ==23.1.0
- argon2-cffi-bindings ==21.2.0
- avro-python3 ==1.10.2
- certifi ==2024.2.2
- cffi ==1.16.0
- charset-normalizer ==3.3.2
- cloudpickle ==2.2.1
- confluent-kafka ==2.3.0
- crcmod ==1.7
- dill ==0.3.1.1
- dnspython ==2.6.1
- docopt ==0.6.2
- faker ==24.4.0
- fastavro ==1.9.4
- fasteners ==0.19
- find-libpython ==0.4.0
- grpcio ==1.62.1
- hdfs ==2.7.3
- httplib2 ==0.22.0
- idna ==3.6
- minio ==7.2.5
- numpy ==1.24.4
- objsize ==0.6.1
- orjson ==3.10.0
- pandas ==2.2.1
- pemja ==0.3.0
- proto-plus ==1.23.0
- protobuf ==4.23.4
- py4j ==0.10.9.7
- pyarrow ==11.0.0
- pycparser ==2.22
- pycryptodome ==3.20.0
- pydot ==1.4.2
- pymongo ==4.6.3
- pyparsing ==3.1.2
- pyspark ==3.4.0
- python-dateutil ==2.9.0.post0
- pytz ==2024.1
- regex ==2023.12.25
- requests ==2.31.0
- six ==1.16.0
- typing-extensions ==4.10.0
- tzdata ==2024.1
- urllib3 ==2.2.1
- zstandard ==0.22.0