GitHub topics: aws-athena
omurkoc/enhanced-sentiment-analysis
A unique sentiment analysis model on IMDB reviews with custom negation handling. Instead of generic preprocessing, it smartly tags words after negators like "not" (e.g., "not good" → "not_good"), preserving sentiment context. Comparison of models with and without this logic shows improved accuracy and real-world reliability.
Language: Python - Size: 9.77 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0
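The negation tagging described above ("not good" → "not_good") can be illustrated with a minimal sketch. This is not the repository's code: the function name `tag_negations`, the `NEGATORS` set, and the choice to merge a negator with only the single following word are all assumptions made for illustration.

```python
import re

# Hypothetical negator list; the repository may use a different one.
NEGATORS = {"not", "no", "never"}


def tag_negations(text: str) -> str:
    """Join each negator to the word that follows it with an underscore."""
    # Lowercase and keep only word-like tokens (letters and apostrophes).
    tokens = re.findall(r"[a-z']+", text.lower())
    tagged = []
    i = 0
    while i < len(tokens):
        # Merge a negator with the next token, if there is one.
        if tokens[i] in NEGATORS and i + 1 < len(tokens):
            tagged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            tagged.append(tokens[i])
            i += 1
    return " ".join(tagged)


print(tag_negations("The movie was not good"))
```

A bag-of-words model then sees `not_good` as a single feature instead of the separate tokens `not` and `good`, which is the context-preserving effect the description refers to.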

gps31320779/insightflow-retail-economic-pipeline
A data engineering portfolio project using AWS cloud services to analyze correlations between Malaysian retail performance and fuel prices. Features Terraform IaC, ETL/ELT with AWS S3, Glue, SQL analytics via Athena coupled with data transformation via dbt, and workflow orchestration with Kestra.
Size: 8.79 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

saifuzzuhdi123/apache_kafka_stock_market_data_streaming
This repository provides a clear guide on using Apache Kafka for real-time stock market data streaming. 📈 Explore how to set up producers and consumers, and see practical applications in financial data processing. 🛠️
Language: Jupyter Notebook - Size: 2.46 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

pizofreude/insightflow-retail-economic-pipeline
A data engineering portfolio project using AWS cloud services to analyze correlations between Malaysian retail performance and fuel prices. Features Terraform IaC, ETL/ELT with AWS S3, Glue, SQL analytics via Athena coupled with data transformation via dbt, and workflow orchestration with Kestra.
Language: HCL - Size: 2.38 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

ccao-data/data-architecture
Codebase for CCAO data infrastructure construction and management
Language: R - Size: 31 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 6 - Forks: 4

Omio-saha/Spotify_Data_Pipe_Snowflake
A project built around an ETL (Extract, Transform, Load) pipeline that moves Spotify API data on AWS into a Snowflake data warehouse. It uses AWS services such as Lambda, S3, and CloudWatch to orchestrate the process. The transformed data is then loaded into Snowflake using Snowpipe and finally visualized in Power BI.
Size: 1000 Bytes - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 0

ghfjd/youtube-veri-analizi-sunum
A presentation I prepared about data analysis
Size: 1000 Bytes - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

The-AI-Alliance/analytics
Repository for the AI Alliance Analytics Stack
Language: Python - Size: 310 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 2 - Forks: 1

commoncrawl/cc-notebooks
Various Jupyter notebooks about Common Crawl data
Language: Jupyter Notebook - Size: 3.01 MB - Last synced at: 12 days ago - Pushed at: 3 months ago - Stars: 54 - Forks: 11

commoncrawl/cc-index-table
Index Common Crawl archives in tabular format
Language: Java - Size: 205 KB - Last synced at: 12 days ago - Pushed at: about 1 month ago - Stars: 122 - Forks: 11

ShreyasShende3/reddit-data-engineering
Built an ETL pipeline using Airflow, then used AWS tools such as S3, Glue, Athena, and Redshift for further processing, storage, and visualization
Language: Python - Size: 119 KB - Last synced at: 26 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

aws-samples/streamlit-application-deployment-on-aws
Streamlit EDA Dashboard Powered by AWS Cloud
Language: Python - Size: 3.99 MB - Last synced at: 19 days ago - Pushed at: about 1 month ago - Stars: 82 - Forks: 33

dbcli/athenacli
AthenaCLI is a CLI tool for AWS Athena service that can do auto-completion and syntax highlighting.
Language: Python - Size: 995 KB - Last synced at: 30 days ago - Pushed at: about 3 years ago - Stars: 214 - Forks: 32

tokern/piicatcher
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub
Language: Python - Size: 1.38 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 311 - Forks: 99

frankndungu/f1-streamlit-data-pipline
A serverless data project showing how to ingest, query, and visualize F1 data using AWS Glue, Athena, and Streamlit.
Language: Python - Size: 217 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

HariSekhon/SQL-scripts
100+ SQL Scripts - PostgreSQL, MySQL, Oracle, Google BigQuery, MariaDB, AWS Athena. DBA, Analytics, DevOps, performance engineering. Google BigQuery ML machine learning classification.
Language: Shell - Size: 620 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 444 - Forks: 124

aws-samples/transactional-datalake-using-amazon-datafirehose-iceberg
Stream CDC into an Amazon S3 data lake in Apache Iceberg table format with Amazon Data Firehose and DMS
Language: Python - Size: 546 KB - Last synced at: 19 days ago - Pushed at: 4 months ago - Stars: 11 - Forks: 1

glassechidna/config2jsonlines
Transform AWS Config snapshots to a more AWS Athena-friendly format.
Language: Go - Size: 276 KB - Last synced at: 25 days ago - Pushed at: almost 5 years ago - Stars: 11 - Forks: 3

vsingh55/NBA-Analytics-Data-Lake
A sports analytics data lake leveraging AWS S3 for storage, AWS Glue for data cataloging, and AWS Athena for querying. Python scripts handle data ingestion and manage the infrastructure.
Language: Python - Size: 1.32 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

aws-samples/aws-glue-streaming-etl-with-delta-lake
Streaming ETL job cases in AWS Glue that integrate Delta Lake and create an in-place updatable data lake on Amazon S3
Language: Python - Size: 314 KB - Last synced at: 19 days ago - Pushed at: 10 months ago - Stars: 9 - Forks: 0

aws-samples/transactional-datalake-using-apache-iceberg-on-aws-glue
Stream CDC into an Amazon S3 data lake in Apache Iceberg table format with AWS Glue Streaming and DMS
Language: Python - Size: 727 KB - Last synced at: 19 days ago - Pushed at: 4 months ago - Stars: 32 - Forks: 2

Danitilahun/Reddit-Data-Engineering
This project automates the extraction, transformation, and loading (ETL) of Reddit data into a Redshift data warehouse using Airflow. Key technologies include Celery, PostgreSQL, S3, Glue, Athena, and Redshift, providing a complete data pipeline solution.
Size: 119 KB - Last synced at: 11 days ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

classmethod/athena-query
Athena-Query provides a simple interface for getting Athena query results.
Language: TypeScript - Size: 433 KB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 10 - Forks: 6

AlexisRodriguezCS/serverless-data-platform
Serverless data platform using AWS Lambda, S3, DynamoDB, Athena & CDK. Upload files, run SQL, and deploy with CI/CD, fully serverless and production-ready
Size: 2.93 KB - Last synced at: 11 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

aws-samples/saas-metering-system-on-aws
This project shows how to implement a simple SaaS metering system on AWS
Language: Python - Size: 971 KB - Last synced at: 19 days ago - Pushed at: 2 months ago - Stars: 11 - Forks: 2

aws-samples/aws-analytics-immersion-day
Describes the concepts of lambda architecture and the actual deployment process, using the example of building a serverless business intelligence system with Amazon Kinesis, S3, Athena, OpenSearch Service, and QuickSight.
Language: Python - Size: 12.9 MB - Last synced at: 19 days ago - Pushed at: 4 months ago - Stars: 14 - Forks: 8

ghdna/athena-express
Athena-Express can simplify executing SQL queries in Amazon Athena AND fetching cleaned-up JSON results in the same synchronous or asynchronous request - well suited for web applications.
Language: JavaScript - Size: 214 KB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 182 - Forks: 70

JaewonSon37/Mining_Big_Data2
Topic: Exploring the Relationship Between Weather and Taxi Demand in Chicago
Language: Jupyter Notebook - Size: 181 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

VandanaBhumireddygari/Data-Engineering-YouTube-Analysis-Project
This project focuses on securely managing, streamlining, and analyzing structured and semi-structured data from YouTube videos based on categories and trending metrics. The goal is to build a comprehensive ETL system to process and transform raw data into a usable format, store it in a centralized data lake, and scale the solutions.
Language: Python - Size: 59.6 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

tedilabs/terraform-aws-data
🌳 A sustainable Terraform Package which creates resources for Data Services on AWS
Language: HCL - Size: 169 KB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 14 - Forks: 4

aws-samples/aws-glue-streaming-etl-with-apache-iceberg
Streaming ETL job cases in AWS Glue that integrate Iceberg and create an in-place updatable data lake on Amazon S3
Language: Python - Size: 465 KB - Last synced at: 19 days ago - Pushed at: 10 months ago - Stars: 23 - Forks: 2

dacort/metabase-athena-driver
An Amazon Athena driver for Metabase 0.32 and later
Language: Clojure - Size: 143 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 223 - Forks: 32

segmentio/go-athena
Golang database/sql driver for AWS Athena
Language: Go - Size: 26.4 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 130 - Forks: 66

enchant3dmango/esdiel
Esdiel (SDL) stands for serverless data lake. In this project, I'm learning to deploy a simple serverless data lake on AWS using Terraform.
Language: HCL - Size: 544 KB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Tejesvani/IoT-Data-Streaming-and-Analytics
The Smart City Data Streaming Pipeline processes real-time data from IoT devices using Apache Kafka for ingestion and Apache Spark for processing. Data is stored in AWS S3 and analyzed with Glue, Athena, and Redshift. It enhances traffic management, predictive analytics, and urban planning, making cities smarter and more efficient.
Language: Python - Size: 18.6 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

dacort/demo-code
Bits of code I use during live demos
Language: Jupyter Notebook - Size: 774 KB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 31 - Forks: 24

avegao/aws-athena-node-client
NodeJS AWS Athena client
Language: TypeScript - Size: 589 KB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 6 - Forks: 1

aws-samples/transactional-datalake-using-amazon-msk-and-apache-iceberg-on-aws-glue
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming using Amazon MSK and MSK Connect (Debezium)
Language: Python - Size: 701 KB - Last synced at: 19 days ago - Pushed at: 4 months ago - Stars: 5 - Forks: 0

ShubhamMohanty680/Spotify_Snowflake
A project built around an ETL (Extract, Transform, Load) pipeline that moves Spotify API data on AWS into a Snowflake data warehouse. It uses AWS services such as Lambda, S3, and CloudWatch to orchestrate the process. The transformed data is then loaded into Snowflake using Snowpipe and finally visualized in Power BI.
Language: Python - Size: 1.79 MB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 1

BrianWangila/Sports-Data-Lake-AWS
Automates the building of an NBA sports data lake with AWS S3, AWS Glue, and AWS Athena, and sets up infrastructure to store and query NBA-related data.
Language: Python - Size: 470 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

ShubhamMohanty680/Spotify_end_to_end_data_engineering
A project built around an ETL (Extract, Transform, Load) pipeline that uses the Spotify API on AWS.
Language: Jupyter Notebook - Size: 1.44 MB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

AntoineGagne/parthenon
A library to parse Athena structures into Erlang terms
Language: Erlang - Size: 52.7 KB - Last synced at: 13 days ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

zablon-oigo/nba-data-lake
This project automates the creation of a data lake for NBA analytics using AWS services
Language: Python - Size: 12.7 KB - Last synced at: 20 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

SWO-GS/athena-cloudtrail-partitioner 📦
Automate the daily partitioning of your CloudTrail bucket in Athena
Language: JavaScript - Size: 671 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 7

reyhanhosavci/youtube-veri-analizi-sunum
A presentation I prepared about data analysis
Size: 0 Bytes - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

WinterYukky/athena-view
Language: TypeScript - Size: 256 KB - Last synced at: 15 days ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

ndomah/AWS-YouTube-Data-Analysis
Analyzed YouTube trending video data using AWS services to build a scalable pipeline for data ingestion, ETL, and storage in a centralized data lake. Created QuickSight dashboards highlighting video views by country, category, and region. Workflow included ingestion, preprocessing, cataloging, and analysis.
Language: Python - Size: 968 KB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

gautamgc17/YouTube-Data-Analytics-AWS-Pipeline
This project aims to build a data engineering pipeline on AWS for analyzing YouTube data based on video categories and trending metrics.
Language: Python - Size: 54.7 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

zapr-oss/zapr-athena-client
The ZAPR AWS Athena client is a Python library for running Presto queries on AWS Athena.
Language: Python - Size: 18.6 KB - Last synced at: 14 days ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 2

shahidmalik4/aws-glue-stepfunctions-etl
This project automates an ETL pipeline using AWS Glue, S3, Athena, and Step Functions to transform raw Airbnb data. It cleanses, enriches, and organizes the data into separate raw and transformed databases, enabling efficient querying and analysis via Athena, with automated notifications through SNS.
Language: Python - Size: 3.47 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

aryan4codes/StockIO
StockIO is a real-time data streaming solution designed to process and analyze stock market data using Apache Kafka and AWS services.
Language: Jupyter Notebook - Size: 2.62 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

OElesin/querypal
Web UI for Amazon Athena
Language: Vue - Size: 22.6 MB - Last synced at: 7 months ago - Pushed at: almost 3 years ago - Stars: 55 - Forks: 26

SadafAsad/LinkedIn-Jobs-Analysis
Unveiling job market trends with Scrapy and AWS
Language: Python - Size: 562 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

AWS-Big-Data-Projects/front-line-concussion-monitoring-system-using-AWS-IoT-and-serverless-data-lakes
A simple, practical, and affordable system for measuring head trauma in sports environments where trained medical personnel are absent, built with Amazon Kinesis Data Streams, Kinesis Data Analytics, Kinesis Data Firehose, and AWS Lambda
Language: Shell - Size: 30.3 KB - Last synced at: 5 days ago - Pushed at: almost 5 years ago - Stars: 12 - Forks: 0

Saurabhkhandebharad/BigData-SK
Analyzed a multicategory e-commerce store using big data techniques on a Kaggle dataset with the help of AWS EC2, AWS S3, PySpark, AWS Glue ETL, AWS Athena, AWS CloudFormation, AWS Lambda and Power BI!
Language: Python - Size: 8.79 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

Tyriek-cloud/NYC-Mobility-Survey-Analysis
An end-to-end data engineering project in which five NYC DOT datasets were modified in an ETL process and analyzed for insights.
Language: Python - Size: 2.75 MB - Last synced at: 17 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

TimKong21/AWS-Batch-Processing
Big data analysis with AWS services, filtering the Wikiticker dataset with Apache Spark on Amazon EMR, storing data in S3, cataloging with AWS Glue, and querying with Amazon Athena. This end-to-end pipeline exemplifies handling and analyzing big data in the cloud.
Language: Python - Size: 8.01 MB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

alash3al/xyr
Query any data source using SQL, works with the local filesystem, s3, and more. It should be a very tiny and lightweight alternative to AWS Athena, Presto ... etc.
Language: Go - Size: 85.9 KB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 65 - Forks: 3

jxareas/Athena-SpringKlient
POC app to show how to query Athena and integrate the AWS SDK in Spring Boot.
Language: Kotlin - Size: 78.1 KB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

DenysGonzaga/glue-athena-cdk-example
A small walkthrough of how to create an AWS Glue Job Pipeline with AWS CDK
Language: Python - Size: 10.7 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

pgrarchives/AWS_DATA_PIPELINE
End to End Data Engineering Pipeline using AWS Cloud Services
Language: Jupyter Notebook - Size: 2.03 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

flemm0/capitol-trades
politician stock market activity web scraping project
Language: Python - Size: 2.26 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

taupirho/read-big-file-aws-athena-glue
Continuing my case study on reading a big data file, this is the fifth part of my trilogy :-) on how I got on reading a big-ish file with C, Python, spark-python and spark-scala, AWS Elastic MapReduce, and AWS Athena.
Language: Python - Size: 45.9 KB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 1

sumanthmalipeddi/spotify_trending_telugu
Collects song, album, and artist details from the Spotify music application at specific intervals using the spotipy API and performs ETL operations using Amazon cloud services
Language: Jupyter Notebook - Size: 630 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

h-fuzzy-logic/data-analytics-spring
Open data and cloud computing to answer the question: Are we losing our spring days?
Language: Jupyter Notebook - Size: 390 KB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

stamixthereal/forecast-athena-query-cost
This Python project offers a business-focused solution for analyzing SQL query logs and predicting memory usage, primarily for AWS Athena. It enhances database performance monitoring and optimization, crucial for data-driven enterprises.
Language: Python - Size: 313 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

UdbhavSrivastava/Youtube-Analysis-Piepline
This AWS-based data pipeline manages data from storage in S3 data lakes, through transformation with AWS Glue and Lambda, to refined storage in separate S3 repositories. Using Athena for SQL querying and QuickSight for interactive dashboards, this solution optimizes data processing and visualization, facilitating informed decision-making and insigh
Language: Python - Size: 494 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

tewfik-ghariani/cloud-storage-analyzer
Analyzing and detecting anomalies in S3 Data using Athena JDBC Driver
Language: Python - Size: 2.58 MB - Last synced at: 4 days ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

mihirkudale/Stock-Market-Real-Time-Data-Engineering-Project
In this project, you will execute an End-To-End Data Engineering Project on Real-Time Stock Market Data using Kafka. We are going to use different technologies such as Python, Amazon Web Services (AWS), Apache Kafka, Glue, Athena, and SQL.
Language: Jupyter Notebook - Size: 2.46 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

mihirkudale/youtube-analysis-data-engineering-project
This project aims to securely manage, streamline, and analyze structured and semi-structured YouTube video data based on video categories and trending metrics.
Language: Python - Size: 114 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

shubhamjais40/AWS-Data-Pipeline-Project-Implementing-Data-Validation-Using-Lambda-based-Gluecrawler-v1.0
This project demonstrates a technology shift at an automobile firm to resolve the data engineering challenge of manual data ops. The AWS services used are an S3 bucket as lake storage for incoming batches, a Lambda Python script that automates the validation function call, and a Glue crawler that generates a relational table after successful testing.
Language: Python - Size: 347 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Gabyzera/covid_data_lake_analysis
☁️ Data analysis of the AWS COVID-19 data lake
Language: Jupyter Notebook - Size: 7.33 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

fermat01/ETL-Data-Pipeline-using-AWS-EMR-Spark-Glue-Athena
ETL data pipeline using AWS services
Language: Python - Size: 4.07 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

DimaKuriptya/RedditETL
This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and services including Apache Airflow, Celery, PostgreSQL, Amazon S3, AWS Glue, Amazon Athena, and Amazon Redshift.
Language: Python - Size: 14.6 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

VivekRajyaguru/aws-athena
Language: JavaScript - Size: 9.77 KB - Last synced at: about 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

QuiNovas/lambda-pyathena Fork of laughingman7743/PyAthena
PyAthena is a Python DB API 2.0 (PEP 249) compliant client for Amazon Athena.
Language: Python - Size: 318 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

blieusong/aws-cookbook
A set of commands that can help when working with AWS
Size: 3.91 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

sarah-zhan/data_pipeline_amazon_products
An end-to-end data pipeline built with AWS S3, Glue, Crawler, Athena, and Tableau visualization
Language: Jupyter Notebook - Size: 1.74 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

gakas14/Kafka_streaming_project
This project simulates real-time streaming of movie details using Kafka. It uses technologies such as Python, Amazon EC2, Apache Kafka, Glue, Athena, and SQL.
Language: Jupyter Notebook - Size: 1.51 MB - Last synced at: 23 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

san99tiago/aws-cdk-athena-s3-workflow
AWS CDK-TypeScript project to showcase an Athena-based solution for S3 data analysis.
Language: TypeScript - Size: 3.85 MB - Last synced at: about 21 hours ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 2

shiv-rna/Youtube-Data-Engineering-Pipeline
This project repo 📺 offers a robust solution meticulously crafted to efficiently manage, process, and analyze YouTube video data leveraging the power of AWS services. Whether you're diving into structured statistics or exploring the nuances of trending key metrics, this pipeline is engineered to handle it all with finesse.
Language: Python - Size: 179 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

kingyiusuen/udacity-data-engineering-nanodegree
Projects for Udacity's Data Engineering Nanodegree
Language: Jupyter Notebook - Size: 1.17 MB - Last synced at: 16 days ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

prathyyyyy/Youtube-ETL-Pipeline-For-Data-Analysis
Youtube ETL pipeline Project Using Pyspark and AWS
Language: Python - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

eljandoubi/aws-human-balance-analytics
Using AWS Glue, AWS S3, Python, and Spark, create or generate Python scripts to build a lakehouse solution in AWS
Language: Python - Size: 1.54 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

tracebit-com/cloudtrail-latency-investigation
Jupyter notebook for investigating CloudTrail latency using Athena and matplotlib.
Language: Jupyter Notebook - Size: 5.86 KB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

nnthanh101/sentiment-analysis
Voice of the Customer (VoC) to enhance customer experience with serverless architecture and sentiment analysis, using Amazon Kinesis, Amazon Athena, Amazon QuickSight, Amazon Comprehend, and ChatGPT-LLMs for sentiment analysis.
Language: JavaScript - Size: 7.78 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 24 - Forks: 5

exasol/athena-virtual-schema
Virtual Schema for connecting Athena as a data source to Exasol
Language: Java - Size: 66.4 KB - Last synced at: 30 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 1

omkarfadtare/Practical_data_science
These are the handwritten notes on Coursera's Practical data science specialization course.
Size: 82 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sarutlaa/Spotify-End-to-End-Data-Pipeline
ETL Data Pipeline built using AWS Offerings
Language: Jupyter Notebook - Size: 104 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

RuFerdZ/Medical-X
US Insurance cost predicting linear regression model. Mainly used to learn about Machine Learning tools in Amazon Web Services (AWS)
Language: Jupyter Notebook - Size: 25.1 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

tmheo/spark-athena
AWS Athena data source for Apache Spark
Language: Scala - Size: 8.88 MB - Last synced at: over 1 year ago - Pushed at: almost 8 years ago - Stars: 24 - Forks: 7

daniel-cortez-stevenson/aws-athena-udfs-h3
This connector extends Amazon Athena's capability by adding UDFs (via Lambda) for selected [h3-java](https://github.com/uber/h3-java) Java functions to support geospatial indexing and queries with Uber's [H3](https://h3geo.org/)
Language: Java - Size: 1.11 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 16 - Forks: 1

epomatti/aws-elb-access-logs
Access logs for ELB
Language: HCL - Size: 154 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

imverma/DataEngineering-YouTube-Analysis-Project
An end-to-end solution for managing and analyzing YouTube video data from Kaggle, leveraging AWS services and visualized through Quicksight and Tableau
Language: Python - Size: 61.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

markoshlima/positional-file-process
This project is aimed at legacy applications that use positional files to process data. The objective is to read these positional files when they arrive in AWS S3, send them to a data warehouse such as AWS Redshift, and finally read the results with a business intelligence tool such as AWS QuickSight.
Size: 873 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

tylerdsilva/Brackets-Analytics-Dashboard
User, Event, and Predictive Metric Dashboard on 2GB/month of log files from Brackets IDE
Language: JavaScript - Size: 545 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

bhavya1917/Layoffs_Decoded
Demystifying ~400K layoffs to analyze underlying causes and predict future trends of layoffs by different companies.
Language: Jupyter Notebook - Size: 38.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

tylerdsilva/Layoffs-Decoded
Demystifying ~400K layoffs to analyze underlying causes and predict future trends
Language: Jupyter Notebook - Size: 38.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

bhavya1917/Brackets_Analytics_Dashboard
User, event, and predictive metric dashboard on log files from Brackets IDE.
Language: JavaScript - Size: 545 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

burtcorp/athena-runner
Runs Athena queries with AWS Lambda and Step Functions
Language: Makefile - Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 18 - Forks: 9
