GitHub topics: aws-glue-crawler
harika-majji/aws-stock-market-analysis
Language: Jupyter Notebook - Size: 2.38 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

jibbs1703/Tickit-Data-Pipeline
This repository demonstrates the creation of a robust data pipeline using an orchestrator with on-premises and cloud resources. It collects data from on-premises SQL and NoSQL databases and loads it into a SQL database in the cloud.
Language: Python - Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ShubhamMohanty680/Spotify_end_to_end_data_engineering
A project built as an ETL (Extract, Transform, Load) pipeline using the Spotify API on AWS.
Language: Jupyter Notebook - Size: 1.44 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

shahidmalik4/aws-glue-stepfunctions-etl
This project automates an ETL pipeline using AWS Glue, S3, Athena, and Step Functions to transform raw Airbnb data. It cleanses, enriches, and organizes the data into separate raw and transformed databases, enabling efficient querying and analysis via Athena, with automated notifications through SNS.
Language: Python - Size: 3.47 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

VvEK-Hiremath/Airlines-Data-Pipeline-Project-AWS
Implementing a data pipeline for airline data using AWS services
Language: Python - Size: 195 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

SadafAsad/LinkedIn-Jobs-Analysis
Unveiling job market trends with Scrapy and AWS
Language: Python - Size: 562 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

desininja/Quality-Movie-Data-Pipeline
ETL pipeline using AWS services
Language: Python - Size: 727 KB - Last synced at: 19 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Saurabhkhandebharad/BigData-SK
Analyzed a multicategory e-commerce store using big data techniques on a Kaggle dataset with the help of AWS EC2, AWS S3, PySpark, AWS Glue ETL, AWS Athena, AWS CloudFormation, AWS Lambda and Power BI!
Language: Python - Size: 8.79 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

Tyriek-cloud/NYC-Mobility-Survey-Analysis
An end-to-end data engineering project in which five NYC DOT datasets were modified in an ETL process and analyzed for insights.
Language: Python - Size: 2.75 MB - Last synced at: 15 days ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

TravelXML/KAFKA-PYTHON-AWS-CRAWLER-AMAZON-ATHENA
A comprehensive set of tutorials, steps, and scripts for setting up Apache Kafka on an Amazon EC2 instance, streaming logs to S3, and querying data with AWS Glue and Amazon Athena. Includes ZooKeeper configuration, producer and consumer setup, and automated data catalog creation.
Language: Jupyter Notebook - Size: 2.46 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

sumanthmalipeddi/spotify_trending_telugu
Collects song, album, and artist details from the Spotify music application at specific intervals using the spotipy API and performs ETL operations using Amazon cloud services.
Language: Jupyter Notebook - Size: 630 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

h-fuzzy-logic/data-analytics-spring
Open data and cloud computing to answer the question: Are we losing our spring days?
Language: Jupyter Notebook - Size: 390 KB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mihirkudale/Stock-Market-Real-Time-Data-Engineering-Project
In this project, you will execute an End-To-End Data Engineering Project on Real-Time Stock Market Data using Kafka. We are going to use different technologies such as Python, Amazon Web Services (AWS), Apache Kafka, Glue, Athena, and SQL.
Language: Jupyter Notebook - Size: 2.46 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

fermat01/ETL-Data-Pipeline-using-AWS-EMR-Spark-Glue-Athena
ETL data pipeline using AWS services
Language: Python - Size: 4.07 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

sarah-zhan/data_pipeline_amazon_products
An end-to-end data pipeline built with AWS S3, Glue, Crawler, Athena, and Tableau visualization
Language: Jupyter Notebook - Size: 1.74 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Kartik-Banga/Automated-ETL-Pipeline-for-Playstore-Data
Implemented an ETL pipeline on AWS for Playstore data using Lambda, Glue Crawlers, and Glue ETL Jobs. Orchestrated the workflow with Step Functions and achieved seamless integration, optimal data merging, and enhanced data quality and accessibility.
Size: 2.97 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

imverma/DataEngineering-YouTube-Analysis-Project
An end-to-end solution for managing and analyzing YouTube video data from Kaggle, leveraging AWS services and visualized through Quicksight and Tableau
Language: Python - Size: 61.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

GabrielDan92/AWS_Terraform_PySpark-ETL_Job
Terraform configuration that creates several AWS services, uploads data in S3 and starts the Glue Crawler and Glue Job.
Language: Python - Size: 22.5 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 0

dhvani-k/YouTrend_Insights_Analyzing_YouTube_Video_Landscape
An end-to-end solution for managing and analyzing YouTube video data from Kaggle, leveraging AWS services and visualized through Quicksight and Tableau
Language: Python - Size: 59.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AirtonLira/aws-bigdata-glue-athena
This project aims to perform data collection, cataloging, governance, processing, and visualization.
Size: 3.76 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

thedatanerdz/DEP-7
AWS Covid data engineering project
Size: 8.79 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rahulrajan15/Stock_Market_Kafka
Real-Time Stock Market Data Science Project using Apache Kafka: Analyzing and predicting stock market trends in real-time for informed decision-making. Scalable and low-latency data processing.
Language: Jupyter Notebook - Size: 2.48 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

aws-samples/amazon-rds-export-to-s3-automation
This repository contains source code for the AWS Database Blog post "Reduce data archiving costs for compliance by automating RDS snapshot exports to Amazon S3".
Size: 235 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 2

aws-samples/aws-glue-crawler-utilities
This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS CDK applications.
Language: Python - Size: 107 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 10

masood2iq/AWS-Athena-Glue-S3-Bucket-Deployment-Through-AWSConsole
AWS Athena, Glue Database, Glue Crawler, and S3 bucket deployment through the AWS console GUI.
Size: 3.18 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

productiveAnalytics/aws-cdk-constructs-sandbox
AWS Cloud Development Kit (AWS CDK) using TypeScript, Python, and Java
Language: Java - Size: 5.49 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0
