An open API service providing repository metadata for many open source software ecosystems.

Topic: "data-engineering-project"

k0rsakov/dag_factory

DAG factory

Language: Python - Size: 17.6 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 2

sanketrs/implementation-of-modern-data-engineering-architecture-with-fabric_analytics

A hybrid data pipeline architecture that combines Microsoft Fabric, Azure Cloud, and Power BI. The pipeline handles real-time data ingestion, multi-layered processing, and analytics to deliver business-critical insights.

Language: Python - Size: 32.2 KB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

janaom/gcp-de-project-data-pipeline-with-cloud-run-functions-airflow-biggueryml

Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.

Language: Python - Size: 163 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 2

k0rsakov/infrastructure_for_data_engineer_kafka

infrastructure_for_data_engineer_kafka

Language: Python - Size: 17.6 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

k0rsakov/scd_dag_factory

DAG factory driven by an SCD table of configurations

Language: Python - Size: 25.4 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 1

k0rsakov/infrastructure_for_data_engineer_S3

Infrastructure for data engineers: S3

Language: Python - Size: 11.7 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

k0rsakov/all_about_DuckDB

Everything you need to know about DuckDB

Language: Jupyter Notebook - Size: 33.2 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

waqarg2001/Youtube-Data-Pipeline-AWS

An ETL pipeline built on AWS cloud services that transforms YouTube video statistics data. Data is downloaded from Kaggle, uploaded to an S3 bucket, and cataloged using AWS Glue for querying with Athena. AWS Lambda and Glue convert the data to Parquet format and store it in a cleansed S3 bucket. AWS QuickSight then visualizes the materialised data.

Language: Python - Size: 2.89 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0