Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / airscholar / RealtimeStreamingEngineering
This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP sockets, Apache Spark, an OpenAI LLM, Kafka, and Elasticsearch. It covers each stage: data acquisition, processing, sentiment analysis with ChatGPT, production to a Kafka topic, and connection to Elasticsearch.
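As a rough illustration of the acquisition stage described above, the following is a minimal sketch, not the repository's code, of streaming newline-delimited JSON records over a TCP socket using only the Python standard library. The function names (`serve_records`, `read_records`, `tag_sentiment`) and the record fields (`review_id`, `text`) are hypothetical. The keyword-based `tag_sentiment` is only a stand-in for the LLM sentiment step, and the Spark, Kafka, and Elasticsearch stages are omitted.

```python
import json
import socket
import threading


def serve_records(records, host="127.0.0.1", port=0):
    """Start a one-shot TCP server that sends each record as one JSON line.

    port=0 lets the OS pick a free port; the chosen port is returned.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    bound_port = server.getsockname()[1]

    def handle():
        # Serve exactly one client, then shut the listener down.
        conn, _ = server.accept()
        with conn:
            for record in records:
                conn.sendall((json.dumps(record) + "\n").encode("utf-8"))
        server.close()

    thread = threading.Thread(target=handle, daemon=True)
    thread.start()
    return bound_port, thread


def read_records(host, port):
    """Connect to the server and parse JSON lines until it closes the stream."""
    with socket.create_connection((host, port)) as client:
        with client.makefile("r", encoding="utf-8") as stream:
            return [json.loads(line) for line in stream if line.strip()]


def tag_sentiment(record):
    """Hypothetical stand-in for the LLM sentiment step (keyword lookup only)."""
    text = record["text"].lower()
    if any(word in text for word in ("great", "good", "love")):
        label = "POSITIVE"
    elif any(word in text for word in ("terrible", "bad", "awful")):
        label = "NEGATIVE"
    else:
        label = "NEUTRAL"
    return {**record, "sentiment": label}


if __name__ == "__main__":
    sample = [
        {"review_id": 1, "text": "Great food"},
        {"review_id": 2, "text": "Terrible service"},
    ]
    port, worker = serve_records(sample)
    for rec in read_records("127.0.0.1", port):
        print(tag_sentiment(rec))
    worker.join(timeout=5)
```

In a fuller pipeline, a streaming engine would typically take the place of `read_records`, and the enriched records would be published to a message broker rather than printed.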
JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/airscholar%2FRealtimeStreamingEngineering
Stars: 15
Forks: 9
Open Issues: 0
License: None
Language: Python
Repo Size: 726 KB
Dependencies: pending
Created: 7 months ago
Updated: about 1 month ago
Last pushed: 5 months ago
Last synced: about 1 month ago
Topics: apache-spark, chatgpt, dataengineering, elasticsearch, kafka, openai-api, tcp-socket