GitHub topics: etl-process
crate/cratedb-fivetran-destination
CrateDB Fivetran Destination connector, for loading data into CrateDB.
Language: Python - Size: 120 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

motetpaper/datawork-mkhrjson
hanzi-radical index data file generator
Language: JavaScript - Size: 1000 Bytes - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

olawale-effect/Activity-Log-Pipeline
A production-ready data table for basic analysis (an SHS Company)
Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

thompson0012/PyEmits
Sugar candy for data scientists. Easy manipulation for time-series data analytics work.
Language: Python - Size: 4.07 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 1

yash-h10/Healthcare-Data-Analysis
This project analyzes Inpatient and Outpatient Waiting Lists from 2018 to 2021, highlighting trends in patient wait times across various medical specialties. Using Power BI, the data was cleaned, modeled, and visualized to provide insights into waiting time distribution, specialty-wise backlogs, and yearly trends.
Size: 5.25 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

TheCocoTeam/source-watcher-core
This is a PHP project which combines ETL with different strategies to extract data from multiple databases, files, and services, transform it and load it into multiple destinations.
Language: PHP - Size: 1.29 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 9 - Forks: 0

Anurag-kumar-Molankala/Sales-Performance-Dashboard
A Power BI dashboard that analyzes sales trends, product performance, customer segmentation, and payment distribution. It uses DAX, time intelligence, and interactive visuals for data-driven insights. The model includes Sales, Product, and Customer tables for in-depth analysis.
Size: 301 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

imsanjoykb/Data-Science-Regular-Bootcamp
Regular practice in Data Science, Machine Learning, and Deep Learning, solving ML project problems and analytical issues. A regular boost to my knowledge. The goal is to help learners with learning resources in the Data Science field.
Language: Jupyter Notebook - Size: 200 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 112 - Forks: 43

taogeYT/pyetl
python ETL framework
Language: Python - Size: 131 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 104 - Forks: 36

GauravTheBeginner/Budget-tracker
Budget-Tracker
Language: Python - Size: 33.2 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Manduco/TruVistaXPraxedo_Integration
Automated data integration between TruVista’s ERP/CRM and Praxedo WorkOrder system using .NET ETL processes.
Language: C# - Size: 376 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

pzaino/microETL
A simple, reusable, template-based ETL (Extract, Transform and Load) library and framework written in Python
Language: Python - Size: 383 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 1

OtemaY/Twitter-Scraping-ETL-Process
Exploring the ETL Process through Twitter Scraping. A script that downloads tweets data on a specific search topic using the standard search API.
Language: Jupyter Notebook - Size: 29.3 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

LIoccoUMD/ETL-Analysis
This project automates ETL for gym exercise data, predicting safety scores using KNN and optimizing with GridSearchCV. It generates recommendations, statistical summaries, and visualizations to improve gym safety and client retention. Logging ensures transparency.
Language: Python - Size: 1.72 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

Tarun-Gemini/Netflix_Shows_Pbix
Netflix users insights through data visualization - Power BI
Size: 3.07 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

shogunbanik18/budgetify
Your one-stop destination for managing budgets and gaining financial insights
Language: Python - Size: 51.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 1

mbc07/TCC-BigData-Spotify
ETL process and data visualization using data from the Spotify Web API
Language: Python - Size: 130 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

dzaky-pr/ets-datalakehouse-b
Language: Jupyter Notebook - Size: 642 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

davideaimar/eth2dgraph
Extractor of Ethereum data to Dgraph format, utilities to analyse the indexed data.
Language: Rust - Size: 520 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 5 - Forks: 3

shellynagar27/Sales-Profit-Insights-Tableau-Project
The purpose is to unlock sales insights that were not previously visible to the sales team for decision support, and to automate them to reduce the manual time spent on data gathering. Used MySQL for the data connection, and performed ETL and data analysis using Tableau.
Size: 2.87 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Steve0verton/google-maps-geocode-enrichment
This project repository provides a headless module to enrich location data in a database table using the Google Maps Geocode API.
Language: Python - Size: 2 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 0

Wazzabeee/pyspark-etl-twitter
Implementation of an ETL process for real-time sentiment analysis of tweets with Docker, Apache Kafka, Spark Streaming, MongoDB and Delta Lake
Language: Python - Size: 3.37 MB - Last synced at: 10 days ago - Pushed at: about 2 years ago - Stars: 16 - Forks: 4

danilosoftwares/BikeServerProcessador
Data Processor
Language: Python - Size: 12.1 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

yekhanfir/Satisfaction-Analysis-Solution-For-Phone-Service-Providers
This is a sentiment analysis project that aims to provide better insight into customers' satisfaction based on comments gathered (scraped) from social media, using Google's BERT classification model.
Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 5 - Forks: 2

KatGilliland/Emoji-Usage-Power-BI-Report
Analyzing and visualizing an emoji usage data set using PowerPoint and Power BI
Size: 222 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

RuthLD/Movie_ETL
Using the ETL process to clean and merge data.
Language: Jupyter Notebook - Size: 14 MB - Last synced at: 9 months ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

emsalcengiz/data-normalize-with-etl-procesess
I performed various data normalization operations with Python scripts. The target data is in CSV format.
Language: Python - Size: 5.66 MB - Last synced at: 9 months ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 1

aymane-maghouti/HR-Data-Pipeline-Azure
This project is a comprehensive data engineering solution that extracts HR data from a GitHub repository, performs data transformations using Azure services, and creates an interactive HR dashboard using Power BI. The goal is to enable HR professionals and decision-makers to gain insights from the HR data for better workforce management.
Language: Jupyter Notebook - Size: 3 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

paschalugwu/Developing-ML-Workflows-On-AWS
Developed an image classification model for Scones Unlimited to identify delivery vehicles (bicycles vs. motorcycles) to enhance routing and loading bay assignments, thereby optimizing operational efficiency.
Language: Jupyter Notebook - Size: 784 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Ashleypat/ETL_PYTHON
CIP_ETL_PROCESS
Language: Jupyter Notebook - Size: 18 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

jacksonpf1/spotify-user-analysis
ETL process and EDA of user top artists & tracks data in Spotify using Spotipy, Pandas, Airflow and Seaborn
Language: Jupyter Notebook - Size: 466 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

SAZZAD-AMT/Informatica-Data-Integration-and-Transformation-Project
This process illustrates how to structure and manipulate relational databases effectively, demonstrating key SQL operations and transformations within an Informatica environment. The provided images and detailed SQL commands serve as a comprehensive guide for implementing and understanding these database management tasks.
Size: 3.71 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

bhammy27/Fantasy_Football_database_SQL
A desire to win my Fantasy Football leagues led to a realization that I have a passion for Data Analytics. I will create my own database using PostgreSQL and pgAdmin.
Size: 81.1 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

pacicap/Data-Warehousing
Extraction of data from different database sources, transformation (unification and cleaning) of the extracted data, and loading into the data warehouse
Size: 23.4 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

TeniOT/HR-Employee-Report
Dataset cleaned, queried, and visualised for an HR employee data report. Skills: Power BI, MySQL, EDA, ETL
Language: Python - Size: 1.28 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

polarbeargo/udacity-nd027-Data-Modeling-with-Postgres
Udacity nd027 Data Modeling with Postgres
Language: Jupyter Notebook - Size: 664 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 4 - Forks: 2

benkeyben/candidalysis
Candidalysis is a project aimed at analyzing student performance for academic year 2022 to 2023 using Power BI. The primary goal of this project is to extract, visualize, and interpret various key performance indicators (KPIs) related to exams conducted during this period.
Size: 1.48 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

inregards2pluto/amazon-vine-analysis
Use PySpark to perform the ETL process on a dataset retrieved from an AWS RDS instance.
Language: Jupyter Notebook - Size: 12.7 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

inregards2pluto/movies-etl
Collect, combine, and clean data from Wikipedia and Kaggle for export into an SQL database.
Language: Jupyter Notebook - Size: 16.2 MB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

keity-p/Processo-de-ETL---Projeto-Pix
ELT process for an exploratory data analysis of Pix.
Language: Jupyter Notebook - Size: 288 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

polakowo/yelp-3nf
3NF-normalize Yelp data on S3 with Spark and load it into Redshift - automate the whole thing with Apache Airflow
Language: Jupyter Notebook - Size: 1.82 MB - Last synced at: 9 days ago - Pushed at: over 5 years ago - Stars: 12 - Forks: 3

MrSeemsGood/PostgreSQL-Python-ETL-json-app 📦
PyQt5 app for JSON parsing and ETL processing
Language: Python - Size: 157 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

rahulrajan15/SQL_Analytics
This project revolves around tapping into real-time data from Otodom, a prominent Polish online real estate platform. Leveraging Bright Data for scraping and Snowflake for ETL in the cloud, we ensure smooth and efficient data processing. Our aim is to provide a seamless analysis of real estate trends, enhancing insights and decision-making.
Language: Python - Size: 37.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

nickjlupu/Movies-ETL
An ETL process for a fictitious streaming service, Amazing Prime, was developed in Jupyter Notebook. The code was then refactored into a Python script to automate the ETL process.
Language: Jupyter Notebook - Size: 13.4 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 2 - Forks: 0

abdennourzo/AirQuality-ETL
Air Quality ETL is a Python repository facilitating the extraction, transformation, and loading of air quality data from RapidAPI to a Pandas DataFrame for easy analysis and customization.
Language: Python - Size: 27.3 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

V-MalM/ETL Fork of JosiahR89/Covid19_Analysis
A Case Study of Extract, Transform, Load. Documentation includes sources of data, types of data wrangling performed (data cleaning, joining, filtering, and aggregating) and the schemata used in the final production database. Technologies used include Pandas, PostgreSQL, Jupyter Notebook.
Size: 85.3 MB - Last synced at: about 2 months ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 1

seyedmahdiamin1998/ETL_catawiki
ETL : Extract --> transform --> load
Language: Python - Size: 260 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

AleksaMCode/university-notices-email-notifier
Dynamic website scraper and email notifier.
Language: Python - Size: 80.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

rickluizms/dio-desafio-limpeza-de-dados
Project Challenge - CryptoETL
Language: Jupyter Notebook - Size: 164 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

buicongdanh/BI_DATH 📦
Practical project for the Information Systems for Business Intelligence course, HCMUS K19
Language: Jupyter Notebook - Size: 122 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

odiegolopestech/architecture-to-extract-covid-data-from-api-for-analysis
This project has the mission of collecting public data of net movements and monthly transfers of expenses from the Federal Executive by programmatic functional classification related to COVID-19.
Language: Python - Size: 34.2 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

MattithyahuData/ETL-Python
🗂️ ETL process completed in Python 3 using SQL Server, MySQL, and PostgreSQL.
Language: Jupyter Notebook - Size: 112 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

simonediluna/Laboratory-of-Data-Science
This repository showcases my university "Laboratory of Data Science" project. It encompasses the implementation of a data warehouse, ETL process, Data Cube, MDX queries, and an interactive dashboard.
Language: Python - Size: 6.59 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

NEXTSLIM/The-Music-has-Changed-WEBSIDE
We are going to examine two data sets related to the music industry. We want to extract, transform, and load them in order to identify insights and trends in the music industry.
Language: CSS - Size: 822 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

NEXTSLIM/The-Music-has-Changed-Extract-transform-load-
We examine two data sets related to the music industry. We extract, transform, and load the data sets in order to create a database and identify insights and trends in the music industry.
Language: Jupyter Notebook - Size: 47 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

Alvin1359/etl-esports-earnings
An ETL group project investigating eSports earnings
Language: Jupyter Notebook - Size: 571 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 1

sidgolangade/Python-Scripts-For-ETL-Jobs
This repository hosts a collection of Python scripts designed to work with ETL jobs.
Language: Python - Size: 3.91 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

jxqbbb/Top-Skills-for-Data-Science---Extract-Transform-Load-Visualize
Finding the skills that are most in demand for a data scientist position.
Language: Jupyter Notebook - Size: 60.5 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

joshuapowell/etl-of-semi-structured-datasets-using-notebooks
ETL of Semi-Structured Datasets Using Notebooks
Language: Jupyter Notebook - Size: 22.5 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

AndrejaCH/Movies-ETL
For this project I am creating an ETL (Extract, Transform, and Load) pipeline using Python, RegEx, and SQL Database. The goal is to retrieve data from different sources, clean and transform it into a useful format and finally load the data into an SQL database where the data is ready for further analysis. The result is an established automated pipeline and a clean data set stored in an SQL database.
Language: Jupyter Notebook - Size: 1.88 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 12 - Forks: 4

hmignon/P2_mignon_helene
Scraping BooksToScrape (P2 OC D-A Python): Using Python basics for market analysis
Language: Python - Size: 70.3 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

GhazaleZe/CourseShop_DataWarehouse
a data warehouse for an online course shop
Language: TSQL - Size: 4.03 MB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 1

caesarmario/data-warehouse-credit-card-applicant-using-pentaho
This repository contains the OLTP, ETL process (using Pentaho Data Integration), and OLAP of a credit card dataset. The dataset is taken from Kaggle (https://www.kaggle.com/rikdifos/credit-card-approval-prediction) and is part of the author's Capstone Project.
Size: 1010 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

mamurciac/Advanced-Databases-Course
Language: Shell - Size: 310 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

mamurciac/Udemy-s-Power-BI-Course
Size: 17.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

manjamcmills/Movies-ETL
Use of ETL to Collect, Import, and Process Movie Data
Language: Jupyter Notebook - Size: 417 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

DCF0708/Amazon_Vine_Analysis
ETL and analysis of trends in product review data from Amazon Vine.
Language: Jupyter Notebook - Size: 147 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

IzmaAziz/Analysing-Data-Of-Unicorn-Companies
Practicing data analysis using Power BI
Size: 1.04 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

MariloyH/Amazon_Vine_Analysis
High Volume Data Analysis with Big Data, AWS, PySpark and pgAdmin
Language: Jupyter Notebook - Size: 1.82 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

cairebarletta/desafio_analise_macro_DS
Data Science challenge for Análise Macro, using R, Markdown, and LaTeX.
Language: R - Size: 618 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

ScuderiRosario/CryptoMundo
CryptoMundo is a simple and easy tool to analyze cryptocurrency data in real time which provides a simple and informative dashboard.
Language: Jupyter Notebook - Size: 648 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 1

matthallman/movies_etl
Week 8 Challenge Module
Language: Jupyter Notebook - Size: 2.15 MB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

ganeshkavhar/Group-By-SQL-Clause
Learn Group By Clause
Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

SherryKennedy/ETL-CO2_vs_Gas_Prices
Quick Data Analysis of CO2 Emissions vs Gas Prices. Explanation of ETL process. Contributors listed below.
Language: Jupyter Notebook - Size: 3.84 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

pasmi369/Movies-ETL
Amazing Prime loves the dataset and wants to keep it updated on a daily basis. The purpose of the analysis is to clean and merge data using the ETL process.
Language: Jupyter Notebook - Size: 3.57 MB - Last synced at: about 2 months ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

carolinacraus/Amazon_Vine_Analysis
Analyze Amazon Vine reviews with PySpark and AWS.
Language: Jupyter Notebook - Size: 64.5 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

cdubiel08/ETL-Project-Group-9
Project exploring the extract, transform, load process using Python, MongoDB, and Flask. Data sets included cryptocurrency pricing and COVID case counts.
Language: Jupyter Notebook - Size: 81.1 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

LeeProut/ETL-project
Team project performing ETL on 2020 U.S. Election data, using Jupyter Notebook, PostgreSQL, and QuickDBD.
Language: Jupyter Notebook - Size: 975 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

Arceoavs/videogame-recommender
A recommender system for video games! The Video Game Recommender (VGR) project was created as a university project at the Westfälische Wilhelms-Universität Münster, as part of the Data Integration Module in the Information Systems master programme.
Language: Python - Size: 61.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

idelfonsog2/cassandra-etl-pipeline
Losing customers is not an option. Today the world has a ton of devices that gather and send data. The benefit of using a document store database (#NoSQL) is that developers don't need to maintain and/or adjust entities, migrations, and changes on existing products. Companies and products move in an agile environment where requirements are constantly changing; NoSQL allows us to turn these requirements around quickly.
Language: Jupyter Notebook - Size: 532 KB - Last synced at: 2 days ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0
