An open API service providing repository metadata for many open source software ecosystems.

Topic: "data-engineering-workflows"

socialpoint-labs/sqlbucket 📦

Lightweight library to write, orchestrate, and test your SQL ETL, with data integrity in mind.

Language: Python - Size: 463 KB - Last synced at: 23 days ago - Pushed at: over 1 year ago - Stars: 74 - Forks: 8

junipertcy/networkie

A first course for data engineering workflows

Language: Python - Size: 2.35 MB - Last synced at: about 12 hours ago - Pushed at: almost 7 years ago - Stars: 3 - Forks: 52

marianajo/beam-examples

Examples that I use to learn and show Apache Beam

Language: Python - Size: 14.6 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

cybergeekgyan/Data-Engineering-Portfolio

Data Engineering portfolio projects, resources used to study data tools...

Language: Jupyter Notebook - Size: 2.92 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

waqarg2001/Youtube-Data-Pipeline-AWS

An ETL pipeline built on AWS cloud services that transforms YouTube video statistics data. Data is downloaded from Kaggle, uploaded to an S3 bucket, and cataloged using AWS Glue for querying with Athena. AWS Lambda and Glue convert the data to Parquet format and store it in a cleansed S3 bucket. AWS QuickSight then visualizes the materialized data.

Language: Python - Size: 2.89 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

borbert/Data_Engineering_Nanodegree

This repository collects all of the projects completed during the Udacity Data Engineering Nanodegree program.

Language: Jupyter Notebook - Size: 44.8 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 1