GitHub topics: apache-sparksql
treeverse/lakeFS
lakeFS - Data version control for your data lake | Git for data
Language: Go - Size: 150 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 4,723 - Forks: 380

anqorithm/RealTime-StockStream
RealTime StockStream is a streamlined simulation system for processing live stock market data. It uses Apache Kafka for data input, Apache Spark for data processing, and Apache Cassandra for data storage, making it a powerful yet easy-to-use tool for financial data analysis.
Language: Python - Size: 5.36 MB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 26 - Forks: 3

umbertogriffo/apache-spark-best-practices-and-tuning
https://umbertogriffo.gitbook.io/apache-spark-best-practices-and-tuning/
Size: 1.78 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 2

JKA098/Pokemon-Feistiness-Apache-Spark-Job
This README assumes that, before running the Spark analytic job, you have already installed the correct versions of Java, Hadoop, and Spark, and that you are working inside Ubuntu.
Language: Python - Size: 184 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ajaymahadeven/Apache-Spark-Programs
This repository contains Apache Spark programs implemented in Python. These programs are part of my learning process for Apache Spark and are intended to serve as examples for anyone who is also learning or working with Apache Spark.
Language: Python - Size: 3.94 MB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

tspannhw/table-ddl
DDL for Kudu, Impala, Phoenix, HBase, Hive, MySQL, PostgreSQL, Calcite, ... Tables. SQL.
Language: TSQL - Size: 34.2 KB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 1
