Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: data-lakehouse
huwngnosleep/complete_lakehouse_techstack
This project implements an end-to-end tech stack for a data platform that can be used in production.
Language: Python - Size: 39.3 MB - Last synced: about 5 hours ago - Pushed: about 6 hours ago - Stars: 0 - Forks: 0
Qbeast-io/qbeast-spark
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
Language: Scala - Size: 36.6 MB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 199 - Forks: 17
aabouzaid/modern-data-platform-poc
My M.Sc. dissertation: Modern Data Platform using DataOps, Kubernetes, and the Cloud-Native ecosystem to build a resilient Big Data platform based on the Data Lakehouse architecture, which is the base for Machine Learning (MLOps) and Artificial Intelligence (AIOps).
Language: Jupyter Notebook - Size: 5.52 MB - Last synced: 20 days ago - Pushed: 20 days ago - Stars: 4 - Forks: 1
Data-Kube/tst-datalakehouse-hudi
Test - Create a Data Lakehouse in Kubernetes
Size: 85.9 KB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 0 - Forks: 0
mahmoudparsian/data-warehousing
This repository is a place for the Data Warehousing course at the Information Systems & Analytics department, Santa Clara University.
Language: HTML - Size: 167 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 5 - Forks: 1
pracdata/awesome-open-source-data-engineering
A curated list of open source tools used in analytical stacks and the data engineering ecosystem
Size: 43.9 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 19 - Forks: 1
gupta-aayushkr/F1-Racing
The project aims to process Formula 1 racing data, create an automated data pipeline, and make the data available for presentation and analysis purposes.
Language: Python - Size: 5.04 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 1 - Forks: 0
dominikhei/Local-Data-LakeHouse
Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, MinIO, Trino, and a Hive Metastore. Can be used for local testing.
Language: Dockerfile - Size: 127 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 24 - Forks: 6
prneidhardt/AWS-Data-Lakehouse
STEDI project
Language: Python - Size: 950 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0
sudohainguyen/mini-lakehouse
Data lakehouse at home with k8s and helm
Language: Jupyter Notebook - Size: 530 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
eavilaes/qbeast-spark (fork of Qbeast-io/qbeast-spark)
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
Language: Scala - Size: 16.9 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0