Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub / airscholar / RealtimeStreamingEngineering

This project is an end-to-end guide to building a data engineering pipeline with a TCP/IP socket, Apache Spark, an OpenAI LLM, Kafka, and Elasticsearch. It covers each stage: data acquisition, processing, sentiment analysis with ChatGPT, producing to a Kafka topic, and connecting to Elasticsearch.
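
To make the data acquisition stage concrete, here is a minimal sketch of streaming records over a TCP socket as newline-delimited JSON, using only the Python standard library. It is not taken from the repository: the function names, the sample records, and the choice of newline-delimited JSON as the wire format are all assumptions made for illustration.

```python
import json
import socket
import threading


def serve_records(server_sock, records):
    """Accept one client and send each record as one JSON line (assumed format)."""
    conn, _ = server_sock.accept()
    with conn:
        for record in records:
            conn.sendall((json.dumps(record) + "\n").encode("utf-8"))


def read_records(host, port):
    """Connect and parse newline-delimited JSON until the sender closes."""
    with socket.create_connection((host, port), timeout=5) as client:
        with client.makefile("r", encoding="utf-8") as stream:
            return [json.loads(line) for line in stream if line.strip()]


def demo():
    # Hypothetical sample records; the real pipeline's data is not shown here.
    sample = [{"id": 1, "text": "great food"}, {"id": 2, "text": "slow service"}]
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    worker = threading.Thread(
        target=serve_records, args=(server_sock, sample), daemon=True
    )
    worker.start()
    try:
        return read_records("127.0.0.1", port)
    finally:
        worker.join(timeout=5)
        server_sock.close()
```

Calling `demo()` starts a local sender, reads everything it emits, and returns the parsed records. In a pipeline like the one described, a stream processor would take the place of `read_records`.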

JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/airscholar%2FRealtimeStreamingEngineering
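
The JSON API URL above embeds the repository's full name with its slash percent-encoded as `%2F`. The following sketch shows one way to build that URL and fetch the record with the Python standard library. The helper names `repository_url` and `fetch_repository` are hypothetical, and the shape of the returned JSON is not documented on this page, so the code makes no assumptions about its fields.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the JSON API URL listed on this page.
API_ROOT = "https://repos.ecosyste.ms/api/v1"


def repository_url(host, full_name):
    """Build the repository endpoint; safe="" also encodes "/" as %2F."""
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def fetch_repository(host, full_name, timeout=10):
    """Fetch and decode the repository record (requires network access)."""
    with urlopen(repository_url(host, full_name), timeout=timeout) as response:
        return json.load(response)
```

For example, `repository_url("GitHub", "airscholar/RealtimeStreamingEngineering")` reproduces the URL listed above, and `fetch_repository` with the same arguments would return the decoded JSON record.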

Stars: 15
Forks: 9
Open Issues: 0

License: None
Language: Python
Repo Size: 726 KB
Dependencies: pending

Created: 7 months ago
Updated: about 1 month ago
Last pushed: 5 months ago
Last synced: about 1 month ago

Topics: apache-spark, chatgpt, dataengineering, elasticsearch, kafka, openai-api, tcp-socket
