GitHub topics: extract-transform-load
chayansraj/Data-Pipeline-with-dbt-using-Airflow-on-GCP
This project demonstrates how to build and automate an ETL pipeline using DAGs in Airflow and load the transformed data into BigQuery. The tools used in this project include Astro, dbt, GCP, Airflow, and Metabase.
Language: Python - Size: 15 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 23 - Forks: 5

Shanmukhi1920/ETL-MiniProject
Implemented an ETL pipeline for weather data using the OpenWeather API, orchestrated with Apache Airflow
Language: Python - Size: 1.16 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

Nachoxt17/Data-Warehouse-SQL-End-to-End-E.T.L.
Designed and implemented a complete Data Warehouse solution, defining data architecture, designing multi-layer (Bronze, Silver, Gold) E.T.L. pipelines, and building star schema models in SQL scripts to transform and load data from multiple CRM and ERP systems, with final data visualized using Tableau.
Size: 5.86 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

docwire/docwire
DocWire SDK: Award-winning modern data processing in C++20. SourceForge Community Choice & Microsoft support. AI-driven processing. Supports nearly 100 data formats, including email boxes and OCR. Boost efficiency in text extraction, web data extraction, data mining, document analysis. Offline processing is possible for security and confidentiality
Language: C++ - Size: 35.8 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 83 - Forks: 18

python-bonobo/bonobo
Extract Transform Load for Python 3.5+
Language: Python - Size: 1.46 MB - Last synced at: about 9 hours ago - Pushed at: about 2 years ago - Stars: 1,589 - Forks: 145

chofste/ETL
Language: Python - Size: 2.71 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 4 - Forks: 0

networktocode/diffsync
A utility library for comparing and synchronizing different datasets.
Language: Python - Size: 1.09 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 167 - Forks: 32

vaxdata22/Water-Quality-DW-on-SQL-Server
This is an MSSQL Data Warehouse and manual ETL demo on a specially formatted Water Quality dataset from DEFRA, UK. It is a personal academic-grade exercise to explore the basic concepts of data warehousing and the manual ETL process.
Language: Jupyter Notebook - Size: 394 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 1

vaxdata22/Water-Quality-DW-on-Oracle-Database
This is an Oracle DB Data Warehouse and manual ETL demo on a specially formatted Water Quality dataset from DEFRA, UK. It is a personal academic-grade exercise to explore the basic concepts of data warehousing and the manual ETL process.
Language: Jupyter Notebook - Size: 380 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

jakuzzibubbles/getting-started-with-python
scripts to make life easier and organized
Language: Python - Size: 12.7 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Shahrzad-kaveh/Sales-orders
This project was created using Power BI. It covers dashboards and data visualization, Extract, Transform, and Load (ETL), measures, data modeling, and data cleaning.
Size: 657 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Mandar-1007/Python-and-Tableau
Gathered, cleansed, manipulated and analyzed data effectively using Python to build interactive Tableau dashboards and present data in meaningful ways
Language: Python - Size: 21.5 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

MaxineXiong/Cloud-Data-Warehousing-with-AWS-Redshift
This project builds a cloud-based ETL pipeline for Sparkify to move data to a cloud data warehouse. It extracts song and user activity data from AWS S3, stages it in Redshift, and transforms it into a star-schema data model with fact and dimension tables, enabling efficient querying to answer business questions.
Language: Jupyter Notebook - Size: 20.5 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

abrahamkoloboe27/Random-User-Streaming-Pipeline
Data Engineering Project - Data Pipeline
Language: Jupyter Notebook - Size: 128 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

R-Mahesh45/HR---Resume-Text-Classification
Text Classification for Resumes: Conducted Exploratory Data Analysis (EDA) on a vast collection of resumes. Organized the data using Bag of Words (BoW) and TF-IDF techniques. Built and evaluated multiple models, with Logistic Regression delivering standout performance. Created Word Clouds and Histograms.
Language: Jupyter Notebook - Size: 11.2 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

fab2s/YaEtl
Yet Another ETL in PHP
Language: PHP - Size: 348 KB - Last synced at: 4 days ago - Pushed at: 10 months ago - Stars: 64 - Forks: 16

drisskhattabi6/Data-Space-for-Electronic-Medical-Records Fork of HAFDAOUIH/Medical-DS
This repo contains the "Data Space for Electronic Medical Records" project, built using Python, Angular, and MySQL
Language: Python - Size: 2.86 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

jennynzhuang/CA_State_Water_Climate_Impact
California's Water Resources & Impact of Climate Variability
Language: Jupyter Notebook - Size: 53.5 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

JV456/Network-Security-System
This project is about creating a powerful network security system using machine learning and cloud technologies.
Language: Python - Size: 176 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

SayamAlt/TMDB-Movies-End-to-End-ETL-and-ML-Pipeline
This project encompasses end-to-end ETL and ML pipeline development. Data ingestion from the TMDB API covered top-rated, current, upcoming, and popular movies with genres. Performed EDA to derive several insights and observations. Developed a regression model with an R² score of 97% to predict average movie ratings.
Language: Python - Size: 15.6 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/Association-Rule-Mining-Using-Apriori-Algorithm
This project applies the Apriori algorithm to generate association rules from transaction datasets. It explores the impact of varying support, confidence, and minimum length parameters on rule generation. Results are visualized using scatterplots, heatmaps, and bar charts for better insights.
Language: Jupyter Notebook - Size: 61.5 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/Text-Mining-Assignment
This project performs sentiment analysis on Elon Musk's tweets and emotion mining on product reviews from an e-commerce website. It involves data preprocessing techniques such as stemming, lemmatization, and removing stop words. The goal is to extract meaningful insights and classify text based on sentiment and emotion.
Language: Jupyter Notebook - Size: 810 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/Salary-Prediction-using-Naive-Bayes
This project uses the Naive Bayes classification algorithm to predict an individual's salary based on features like age, education, occupation, and more. It evaluates model accuracy on training and test datasets. The model achieved 77% accuracy on both sets.
Language: Jupyter Notebook - Size: 2.91 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/Zoo-and-Glass-Classification-Using-KNN
This project uses a K-Nearest Neighbors (KNN) classifier to categorize animals and classify glass types based on various features, with data preprocessing, model training, and accuracy evaluation through cross-validation.
Language: Jupyter Notebook - Size: 1.98 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/Fraud-Detection-and-Sales-Analysis-using-Random-Forest
This project uses Random Forest to classify fraud risk based on taxable income and analyze key factors driving high sales for a cloth manufacturing company.
Language: Jupyter Notebook - Size: 301 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/SVM-Classification-Models-for-Salary-Data-and-Forest-Fire-Size
This project uses SVM to classify salary categories and forest fire sizes. GridSearchCV is applied for hyperparameter tuning, achieving high accuracy on both datasets.
Language: Jupyter Notebook - Size: 2.32 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

SaifurRR/Extract-Transform-Load-Data-Engineering
Language: Jupyter Notebook - Size: 91.8 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

ayush9892/Supply-Chain-ETL
Data Engineering Project on Supply Chain ETL. Creating a dynamic ADF pipeline to ingest both Full Load and Incremental Load data from SQL Server and then transform these datasets based on medallion architecture using Databricks.
Language: Jupyter Notebook - Size: 1.57 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

abrahamkoloboe27/Airflow-Pipeline-Dashboard-Compagnie-Aerienne
Link to the application
Language: Python - Size: 555 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 5 - Forks: 0

GreenInfo-Network/nyc-crash-mapper-etl-script
Extract, Transform, and Load script for fetching new data from the NYC Open Data Portal's vehicle collision data and loading into the NYC Crash Mapper table on CARTO.
Language: Python - Size: 4.34 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

SayamAlt/Amazon-Products-API-ETL-and-ML-pipeline
In this project, I've created an end-to-end ETL pipeline and subsequently developed a machine learning model to predict the price of Amazon products based on several product-related features.
Language: Python - Size: 2.95 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

thaychansy/medicaid_drug_ultilization_analysis
This comprehensive dataset allows for Exploratory Data Analysis (EDA) that provides insights into drug utilization trends, cost distribution, and the comparison of Medicaid versus non-Medicaid reimbursement.
Language: Jupyter Notebook - Size: 5.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

TenjinLabs/TenjinLabs
Language: TypeScript - Size: 215 KB - Last synced at: 9 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

filip-kustura/data-warehouse-olympics
This project, part of the elective Advanced Database Systems course, involved building a data warehouse based on the already existing database in PostgreSQL. It focuses on analyzing Olympic Games data across time, covering athletes' performance by discipline, location, and other dimensions. Implemented in Spring 2022.
Size: 3.79 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

kpratikin/Business-Intelligence-and-Data-Warehousing
Business Intelligence and Data Warehousing Project
Language: TSQL - Size: 5.41 MB - Last synced at: 8 months ago - Pushed at: over 5 years ago - Stars: 12 - Forks: 7

iTrauco/data
Language: Python - Size: 10.7 KB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Abhi0323/Full-Cycle-ETL-Analytics-with-Google-Analytics-and-Snowflake
Explore the transformative power of data analytics in my portfolio, where Google Analytics and Snowflake converge to provide comprehensive insights. This project leverages advanced ETL techniques and real-time data integration to enhance user engagement and optimize content delivery effectively.
Language: Jupyter Notebook - Size: 1.48 MB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 4

grgadekar/Excel-Sales-Analysis-And-Finance-Analysis
This project involves creating comprehensive Sales and Finance Reports using data provided by AtliQ Hardware. The objective is to empower businesses to monitor and evaluate their sales activities and financial performance, supporting informed decision-making and stakeholder communication.
Size: 2.79 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Konrad-Olszewski/Multidimensional-Data-Analysis-OLAP-with-ETL-PROJECT
OLAP multidimensional data analysis project - SSIS, PowerBI, ETL
Size: 14.4 MB - Last synced at: 4 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 1

LeftCoastNerdGirl/Extract_Transform_Load
This mini project introduces data cleaning through ETL
Language: Jupyter Notebook - Size: 620 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

gopiashokan/Airbnb-Analysis-with-Tableau
Built an interactive Tableau dashboard to analyze the Airbnb data extracted from MongoDB Atlas. Developed a Streamlit application for trend analysis, pattern recognition, and data insights using EDA. Explored variations in price, location, property type, and seasons through dynamic plots and charts.
Language: Jupyter Notebook - Size: 1.68 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 1

Kmohamedalie/Python-Project-for-Data-Engineering
Python Project for Data Engineering
Language: Python - Size: 953 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

StationA/xgeo 📦
Scriptable geospatial data processing engine
Language: Go - Size: 411 KB - Last synced at: 11 months ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 0

marda-alliance/metadata_extractors
A Working Group on connecting and advancing interoperability of efforts on automated extraction of metadata from materials and chemical file formats
Size: 630 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 14 - Forks: 3

ashershaw/Crowdfunding_ETL
Extract, Transform, and Load (ETL) Project
Language: Jupyter Notebook - Size: 615 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

j-b-ferguson/relational-database-design-and-test
Designing and testing a relational database for The Happy Phone Company.
Language: SQL - Size: 545 KB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 0

FistGang/ETL_vs_ELT
Comparison between ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform)
Size: 2.93 KB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

marda-alliance/metadata_extractors_api 📦
Archive of MaRDA Metadata Extractors Schema. See Datatractor Beam, below, for the current repository.
Language: Python - Size: 2.83 MB - Last synced at: 11 months ago - Pushed at: 12 months ago - Stars: 2 - Forks: 1

marda-alliance/metadata_extractors_registry 📦
Archive. See Datatractor Yard, below:
Language: Python - Size: 205 KB - Last synced at: 11 months ago - Pushed at: 12 months ago - Stars: 6 - Forks: 6

Pawsanie/Steam_statistics_ETL
This pipeline can be used to collect statistical information about all games distributed through the Steam platform.
Language: Python - Size: 2.35 MB - Last synced at: 3 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

success4lyf/ETL-Pipline
Building a fully scalable ETL (Extract, Transform, Load) pipeline to handle large volumes of transaction data for a café business.
Language: Jupyter Notebook - Size: 70.3 KB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

mathewsrc/machine-learning-monitoring-with-evidently
ML Monitoring with EvidentlyAI
Language: Jupyter Notebook - Size: 23.1 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

immanuvelprathap/ETL-Sales_Analysis_Report---MySQL-PowerBI
This repo explains how ETL can be done in MySQL and Power BI to generate insights!
Size: 6.75 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Aishwarya-TheAnalyst/AtliQ-Grands-Hospitality-Insights-using-Power-BI
AtliQ Grands hotel Data Analysis using Power BI
Size: 4.25 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

mathewsrc/ETL-Chicago-Cafe-Permits
This ETL (Extract, Transform, Load) project employs several Python libraries, including Airflow, Soda, Polars, YData Profiling, DuckDB, Requests, Loguru, and Google Cloud to streamline the extraction, transformation, and loading of CSV datasets from the U.S. government's data repository at https://catalog.data.gov.
Language: HTML - Size: 42.3 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

RYANFRANKLIN237/Data-cleansing
A group of Python scripts that clean large data sets by removing duplicate data, putting data in correct formats, and removing redundant cells
Language: Python - Size: 7.81 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

udaisharma99/Human-Activity-Prediction
This project focuses on using sensor data to predict human activity and is based on the ExtraSensory dataset, created by Ph.D. students and staff at the Department of Electrical and Computer Engineering, University of California, San Diego.
Language: Jupyter Notebook - Size: 755 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ramkumarpj/project-three
SEC Finance Data Engineering - ETL process for SEC Finance data of S&P 500 companies. Jupyter Notebooks to run ETL work flows. The final dataset is hosted in MongoDB Atlas(cloud). The API is written using Python with PyMongo and Flask libraries. The dashboards with charts are hosted in MongoDB Atlas.
Language: Jupyter Notebook - Size: 3.01 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 3

damaniayesh/Inventory_Management_Dashboard
This project provides Inventory Management using Power BI, useful for warehouse and in-plant inventory managers who need to control inventory levels while maintaining service levels.
Size: 4.85 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ramkumarpj/Crowdfunding_ETL
This project takes crowdfunding data provided in Excel files through an Extract, Transform, and Load (ETL) process and makes it available in a relational database for further use.
Language: Jupyter Notebook - Size: 768 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

codecadre/melhordazona-web
Web app using babashka/apache + ETL pipeline
Language: Clojure - Size: 14.6 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

benispresence/hexbase
open-source ETL pipeline for HEX cryptocurrency data
Language: Python - Size: 525 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 1

lfhohmann/wordle-ETL
ETL for Wordle game
Language: Jupyter Notebook - Size: 157 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

rtimbro185/syr_mads_ist722_data_warehouse
Syracuse University, Masters of Applied Data Science - IST 722 Data Warehouse
Language: TSQL - Size: 50.6 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 4

praveendecode/Data_Pipeline
Implemented ETL projects with interactive Streamlit UI for user-friendly data extraction, transformation, and loading tasks
Size: 1000 Bytes - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

KDerec/bookscrap
Student project #1 - Web scraping: use Python basics to create a program that automates the process of extracting, transforming, and loading data from the online library "Books to Scrape".
Language: Python - Size: 11.2 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

phelps-sg/zipline-tardis-bundle
A bundle for zipline-reloaded to allow data for crypto assets to be ingested from Tardis
Language: Python - Size: 93.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

klipperdev/klipper
A Web and API Development Platform built on Symfony
Language: PHP - Size: 3.1 MB - Last synced at: 12 months ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

PredictGroup/1C-ERP-OLAP
OLAP ITL utilities for 1C:ERP Enterprise Management.
Language: C# - Size: 1.13 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 6 - Forks: 5

MadAboutImport/DIFS
Data Importer For SharePoint & Office 365
Size: 88.2 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 23 - Forks: 15

praveen-kumar-maurya/Superstore-Sales-Dashboard
The superstore sales dashboard developed in Power BI aims to increase sales and profitability by providing data-driven insights. It offers a comprehensive view of sales and profit trends to identify growth opportunities and inform marketing strategies. The goal is to achieve sustainable growth and profitability by utilizing the insights provided.
Size: 3.78 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

nachiketdixit/Google_BI_Professional
This certification focuses on in-demand skills like data modeling, data visualization, and dashboarding and reporting.
Size: 500 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ats-tandjoeng7/Mission-to-Mars
Application of Python web scraping methodologies for performing data analytics and visualization as part of the Extract, Transform, and Load (ETL) process.
Language: Jupyter Notebook - Size: 719 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

ats-tandjoeng7/Crowdfunding-ETL
Application of Python libraries, like Pandas, and their useful functions for performing an efficient Extract, Transform, and Load (ETL) process.
Language: Jupyter Notebook - Size: 1.1 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

gopiashokan/Phonepe-Pulse-Data-Visualization-and-Exploration
Visualize insights from PhonePe Pulse data using Python, Streamlit, and Plotly. Explore interactive charts and uncover trends in digital transactions.
Language: Python - Size: 105 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

seonguook88/Seong_Portfolio
Data Analytics Portfolio
Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

NEXTSLIM/The-Music-has-Changed-WEBSIDE
We are going to examine two data sets related to the music industry. We want to extract, transform, and load them in order to identify insights and trends about the music industry.
Language: CSS - Size: 822 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

NEXTSLIM/The-Music-has-Changed-Extract-transform-load-
We examine two data sets related to the music industry. We extract, transform, and load the data sets in order to create a database and identify insights and trends about the music industry.
Language: Jupyter Notebook - Size: 47 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

IamJafar/Youtube_Data_Harvesting_and_Warehousing
Domain: Social Media | Extracting data using the YouTube API, storing it in MongoDB, then transforming it into a relational database like MySQL, in order to retrieve various information about YouTube channels.
Language: Python - Size: 20.5 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

IsaacMwendwa/Twitter-ETL-of-Elections-PoliceBrutality-HateSpeech-Data
This Twitter ETL project is aimed at providing data to support UN SDG number 16. The project provides data to generate actionable insights for stakeholders regarding the 2022 Presidential Elections, Police Brutality, and Propagation of Hate Speech on Twitter
Language: Python - Size: 593 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

deepankarvarma/Extract-Transform-Load-Process-Techniques
This repository contains code for comparing the performance of three different ELT (Extract, Load, Transform) methods on CSV files of different sizes. The three methods are implemented in Python using different approaches and libraries, and their execution times are compared and plotted for analysis.
Language: Python - Size: 31.8 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

tejal04/NYC-TLC-Data-Engineering
NYC TLC Data Analysis using Python, GCP Storage, Compute Engine, Mage Data Pipeline Tool, BigQuery, and Looker Studio. Aims to extract insights from the dataset for informed decisions and deeper operational understanding.
Language: Python - Size: 1.07 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

python-bonobo/bonobo-sqlalchemy
PREVIEW - SQL databases in Bonobo, using sqlalchemy
Language: Python - Size: 97.7 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 25 - Forks: 14

ats-tandjoeng7/surfs_up
Application of Python database toolkits, such as SQLAlchemy and Flask, for performing data analytics and visualization as part of the Extract, Transform, and Load (ETL) process.
Language: Jupyter Notebook - Size: 617 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

dimgold/ETL_with_Python
ETL with Python - Taught at DWH course 2017 (TAU)
Language: Jupyter Notebook - Size: 115 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 93 - Forks: 51

StationA/landgrab 📦
Geospatial data hoarding system
Language: Python - Size: 72.3 KB - Last synced at: about 1 year ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

tek-cub/nlp_job-postings
Natural language processing of job postings in order to gain insight into the data science job market.
Language: Jupyter Notebook - Size: 3.06 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

sam-marhaendra/etl-project-anteraja-reviews
This repository was created for the final group project of a Data Engineering course.
Language: Jupyter Notebook - Size: 1.87 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

python-bonobo/bonobo-docker
PREVIEW - Run Bonobo data processing graphs in docker containers.
Language: Python - Size: 73.2 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 6

JaviSandoval94/ETL-Project Fork of ArceSaenzLuisAlejandro/ETL-Project
This project aims to create an ETL pipeline from energy consumption data.
Language: Jupyter Notebook - Size: 8.95 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

andreasscherbaum/faa
FAA Airline On-Time Performance Data
Language: Shell - Size: 125 KB - Last synced at: 2 months ago - Pushed at: about 12 years ago - Stars: 1 - Forks: 1

taiwofawumi/DE_ETL_HTML_CSV_JSON
This notebook scrapes information about the largest banks by market capitalization from a wiki page, and stores the information both as a CSV and as a JSON file.
Language: Jupyter Notebook - Size: 8.79 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

dfornika/ncov-db
Store SARS-CoV-2 genomic analysis results from ncov2019-artic-nf and ncov-tools in a SQLite DB
Language: Python - Size: 60.5 KB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

OrionExplorer/dcpam
Data Construct-Populate-Access-Manage - Open source data warehouse solution.
Language: C - Size: 57.3 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

python-bonobo/bonobo-selenium
PRE-ALPHA - Write web crawlers using Bonobo
Language: Python - Size: 19.5 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 2

ZGrinacoff/ETL-Project
E (Extract), T (Transform), L (Load) project that showcases both SQL and NoSQL databases.
Language: Jupyter Notebook - Size: 4.6 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 1

PetraLee2019/Crime-Anyltics
Approximately 10 people are shot on an average day in Chicago. This project focuses on Poverty and Crime in Chicago Neighborhoods. Full-Stack Project.
Language: Jupyter Notebook - Size: 3.33 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

mediaintegration/twiddlepy
Python module for extracting, transforming and loading data
Language: Python - Size: 82 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

ZGrinacoff/Citi-Bike-Analytics
An analysis of Citi Bike with Tableau from January 2018 - September 2019
Size: 12.7 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

samsk/etlrun
Extract-Transform-Load tool based on Message passing, self reprocessing XML pipeline
Language: Perl - Size: 869 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0
