GitHub topics: apache-sparksql
anqorithm/RealTime-StockStream
RealTime StockStream is a streamlined simulation system for processing live stock market data. It uses Apache Kafka for data input, Apache Spark for data handling, and Apache Cassandra for data storage, making it a powerful yet easy-to-use tool for financial data analysis.
Language: Python - Size: 5.36 MB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 26 - Forks: 3

treeverse/lakeFS
lakeFS - Data version control for your data lake | Git for data
Language: Go - Size: 149 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 4,660 - Forks: 373

umbertogriffo/apache-spark-best-practices-and-tuning
https://umbertogriffo.gitbook.io/apache-spark-best-practices-and-tuning/
Size: 1.78 MB - Last synced at: about 22 hours ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 2

JKA098/Pokemon-Feistiness-Apache-Spark-Job
This README assumes that, before running the Spark analytics job, you have already installed the correct versions of **Java**, **Hadoop**, and **Spark**, and that you are working inside **Ubuntu**.
Language: Python - Size: 184 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

ajaymahadeven/Apache-Spark-Programs
This repository contains Apache Spark programs implemented in Python. These programs are part of my learning process for Apache Spark and are intended to serve as examples for anyone who is also learning or working with Apache Spark.
Language: Python - Size: 3.94 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

tspannhw/table-ddl
DDL for Kudu, Impala, Phoenix, HBase, Hive, MySQL, PostgreSQL, Calcite, ... Tables. SQL.
Language: TSQL - Size: 34.2 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 1
