An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: extract-transform-load

chayansraj/Data-Pipeline-with-dbt-using-Airflow-on-GCP

This project demonstrates how to build and automate an ETL pipeline using DAGs in Airflow and load the transformed data into BigQuery. The tools used in this project include Astro, dbt, GCP, Airflow, and Metabase.

Language: Python - Size: 15 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 23 - Forks: 5

Shanmukhi1920/ETL-MiniProject

Implemented an ETL pipeline for weather data using the OpenWeather API, orchestrated with Apache Airflow

Language: Python - Size: 1.16 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

Nachoxt17/Data-Warehouse-SQL-End-to-End-E.T.L.

Designed and implemented a complete Data Warehouse solution, defining data architecture, designing multi-layer (Bronze, Silver, Gold) E.T.L. pipelines, and building star schema models in SQL scripts to transform and load data from multiple CRM and ERP systems, with final data visualized using Tableau.

Size: 5.86 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

docwire/docwire

DocWire SDK: Award-winning modern data processing in C++20. SourceForge Community Choice & Microsoft support. AI-driven processing. Supports nearly 100 data formats, including email boxes and OCR. Boost efficiency in text extraction, web data extraction, data mining, document analysis. Offline processing is possible for security and confidentiality

Language: C++ - Size: 35.8 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 83 - Forks: 18

python-bonobo/bonobo

Extract Transform Load for Python 3.5+

Language: Python - Size: 1.46 MB - Last synced at: about 9 hours ago - Pushed at: about 2 years ago - Stars: 1,589 - Forks: 145

chofste/ETL

Language: Python - Size: 2.71 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 4 - Forks: 0

networktocode/diffsync

A utility library for comparing and synchronizing different datasets.

Language: Python - Size: 1.09 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 167 - Forks: 32

vaxdata22/Water-Quality-DW-on-SQL-Server

This is an MSSQL Data Warehouse and manual ETL demo on a specially formatted Water Quality dataset from DEFRA, UK. It is a personal academic-grade exercise to explore the basic concepts of data warehousing and manual ETL process from an academic perspective.

Language: Jupyter Notebook - Size: 394 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 1

vaxdata22/Water-Quality-DW-on-Oracle-Database

This is an Oracle DB Data Warehouse and manual ETL demo on a specially formatted Water Quality dataset from DEFRA, UK. It is a personal academic-grade exercise to explore the basic concepts of data warehousing and manual ETL process from an academic perspective.

Language: Jupyter Notebook - Size: 380 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

jakuzzibubbles/getting-started-with-python

scripts to make life easier and organized

Language: Python - Size: 12.7 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Shahrzad-kaveh/Sales-orders

This project was created using Power BI. In this project, I used various tools such as Dashboards - Data visualization, Extract, Transform, and Load (ETL), Measures, Data modeling, Data cleaning.

Size: 657 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Mandar-1007/Python-and-Tableau

Gathered, cleansed, manipulated and analyzed data effectively using Python to build interactive Tableau dashboards and present data in meaningful ways

Language: Python - Size: 21.5 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

MaxineXiong/Cloud-Data-Warehousing-with-AWS-Redshift

This project builds a cloud-based ETL pipeline for Sparkify to move data to a cloud data warehouse. It extracts song and user activity data from AWS S3, stages it in Redshift, and transforms it into a star-schema data model with fact and dimension tables, enabling efficient querying to answer business questions.

Language: Jupyter Notebook - Size: 20.5 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

abrahamkoloboe27/Random-User-Streaming-Pipeline

Data Engineering Project - Data Pipeline

Language: Jupyter Notebook - Size: 128 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

R-Mahesh45/HR---Resume-Text-Classification

Text Classification for Resumes: Conducted Exploratory Data Analysis (EDA) on a vast collection of resumes. Organized the data using Bag of Words (BoW) and TF-IDF techniques. Built and evaluated multiple models, with Logistic Regression delivering standout performance. Created Word Clouds and Histograms.

Language: Jupyter Notebook - Size: 11.2 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

fab2s/YaEtl

Yet Another ETL in PHP

Language: PHP - Size: 348 KB - Last synced at: 4 days ago - Pushed at: 10 months ago - Stars: 64 - Forks: 16

drisskhattabi6/Data-Space-for-Electronic-Medical-Records Fork of HAFDAOUIH/Medical-DS

This repo contains the "Data Space for Electronic Medical Records" project, built using Python, Angular, and MySQL

Language: Python - Size: 2.86 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

jennynzhuang/CA_State_Water_Climate_Impact

California's Water Resources & Impact of Climate Variability

Language: Jupyter Notebook - Size: 53.5 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

JV456/Network-Security-System

This project is about creating a powerful network security system using machine learning and cloud technologies.

Language: Python - Size: 176 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

SayamAlt/TMDB-Movies-End-to-End-ETL-and-ML-Pipeline

This project encompasses end-to-end ETL and ML pipeline development. Data ingestion from the TMDB API covered top-rated, current, upcoming, and popular movies with genres. Performed EDA to derive several valuable insights and observations. Developed a regression model with a 97% R² score to predict average movie ratings.

Language: Python - Size: 15.6 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/Association-Rule-Mining-Using-Apriori-Algorithm

This project applies the Apriori algorithm to generate association rules from transaction datasets. It explores the impact of varying support, confidence, and minimum length parameters on rule generation. Results are visualized using scatterplots, heatmaps, and bar charts for better insights.

Language: Jupyter Notebook - Size: 61.5 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/Text-Mining-Assignment

This project performs sentiment analysis on Elon Musk's tweets and emotion mining on product reviews from an e-commerce website. It involves data preprocessing techniques such as stemming, lemmatization, and removing stop words. The goal is to extract meaningful insights and classify text based on sentiment and emotion.

Language: Jupyter Notebook - Size: 810 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/Salary-Prediction-using-Naive-Bayes

This project uses the Naive Bayes classification algorithm to predict an individual's salary based on features like age, education, occupation, and more. It evaluates model accuracy on training and test datasets. The model achieved a 77% accuracy on both sets.

Language: Jupyter Notebook - Size: 2.91 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/Zoo-and-Glass-Classification-Using-KNN

This project uses a K-Nearest Neighbors (KNN) classifier to categorize animals and classify glass types based on various features, with data preprocessing, model training, and accuracy evaluation through cross-validation.

Language: Jupyter Notebook - Size: 1.98 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/Fraud-Detection-and-Sales-Analysis-using-Random-Forest

This project uses Random Forest to classify fraud risk based on taxable income and analyze key factors driving high sales for a cloth manufacturing company.

Language: Jupyter Notebook - Size: 301 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

R-Mahesh45/SVM-Classification-Models-for-Salary-Data-and-Forest-Fire-Size

This project uses SVM to classify salary categories and forest fire sizes. GridSearchCV is applied for hyperparameter tuning, achieving high accuracy on both datasets.

Language: Jupyter Notebook - Size: 2.32 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

SaifurRR/Extract-Transform-Load-Data-Engineering

Language: Jupyter Notebook - Size: 91.8 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

ayush9892/Supply-Chain-ETL

Data Engineering Project on Supply Chain ETL. Creating a dynamic ADF pipeline to ingest both Full Load and Incremental Load data from SQL Server and then transform these datasets based on medallion architecture using Databricks.

Language: Jupyter Notebook - Size: 1.57 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

abrahamkoloboe27/Airflow-Pipeline-Dashboard-Compagnie-Aerienne

Link to the application

Language: Python - Size: 555 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 5 - Forks: 0

GreenInfo-Network/nyc-crash-mapper-etl-script

Extract, Transform, and Load script for fetching new data from the NYC Open Data Portal's vehicle collision data and loading into the NYC Crash Mapper table on CARTO.

Language: Python - Size: 4.34 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

SayamAlt/Amazon-Products-API-ETL-and-ML-pipeline

In this project, I've created an end-to-end ETL pipeline and subsequently developed a machine learning model to predict the price of Amazon products based on several product-related features.

Language: Python - Size: 2.95 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

thaychansy/medicaid_drug_ultilization_analysis

This comprehensive dataset allows for Exploratory Data Analysis (EDA) that provides insights into drug utilization trends, cost distribution, and the comparison of Medicaid versus non-Medicaid reimbursement.

Language: Jupyter Notebook - Size: 5.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

TenjinLabs/TenjinLabs

Language: TypeScript - Size: 215 KB - Last synced at: 9 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

filip-kustura/data-warehouse-olympics

This project, part of the elective Advanced Database Systems course, involved building a data warehouse based on the already existing database in PostgreSQL. It focuses on analyzing Olympic Games data across time, covering athletes' performance by discipline, location, and other dimensions. Implemented in Spring 2022.

Size: 3.79 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

kpratikin/Business-Intelligence-and-Data-Warehousing

Business Intelligence and Data Warehousing Project

Language: TSQL - Size: 5.41 MB - Last synced at: 8 months ago - Pushed at: over 5 years ago - Stars: 12 - Forks: 7

iTrauco/data

Language: Python - Size: 10.7 KB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Abhi0323/Full-Cycle-ETL-Analytics-with-Google-Analytics-and-Snowflake

Explore the transformative power of data analytics in my portfolio, where Google Analytics and Snowflake converge to provide comprehensive insights. This project leverages advanced ETL techniques and real-time data integration to enhance user engagement and optimize content delivery effectively.

Language: Jupyter Notebook - Size: 1.48 MB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 4

grgadekar/Excel-Sales-Analysis-And-Finance-Analysis

This project involves creating comprehensive Sales and Finance Reports using data provided by AtliQ Hardware. The objective is to empower businesses to monitor and evaluate their sales activities and financial performance, supporting informed decision-making and stakeholder communication.

Size: 2.79 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Konrad-Olszewski/Multidimensional-Data-Analysis-OLAP-with-ETL-PROJECT

OLAP multidimensional data analysis project - SSIS, PowerBI, ETL

Size: 14.4 MB - Last synced at: 4 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 1

LeftCoastNerdGirl/Extract_Transform_Load

This mini project introduces data cleaning through ETL

Language: Jupyter Notebook - Size: 620 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

gopiashokan/Airbnb-Analysis-with-Tableau

Built an interactive Tableau dashboard to analyze the Airbnb data extracted from MongoDB Atlas. Developed a Streamlit application for trend analysis, pattern recognition, and data insights using EDA. Explored variations in price, location, property type, and seasons through dynamic plots and charts.

Language: Jupyter Notebook - Size: 1.68 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 1

Kmohamedalie/Python-Project-for-Data-Engineering

Python Project for Data Engineering

Language: Python - Size: 953 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

StationA/xgeo 📦

Scriptable geospatial data processing engine

Language: Go - Size: 411 KB - Last synced at: 11 months ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 0

marda-alliance/metadata_extractors

A Working Group on connecting and advancing interoperability of efforts on automated extraction of metadata from materials and chemical file formats

Size: 630 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 14 - Forks: 3

ashershaw/Crowdfunding_ETL

Extract, Transform, and Load (ETL) Project

Language: Jupyter Notebook - Size: 615 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

j-b-ferguson/relational-database-design-and-test

Designing and testing a relational database for The Happy Phone Company.

Language: SQL - Size: 545 KB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 0

FistGang/ETL_vs_ELT

Comparison between ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform)

Size: 2.93 KB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

marda-alliance/metadata_extractors_api 📦

Archive of MaRDA Metadata Extractors Schema. See Datatractor Beam, below, for the current repository.

Language: Python - Size: 2.83 MB - Last synced at: 11 months ago - Pushed at: 12 months ago - Stars: 2 - Forks: 1

marda-alliance/metadata_extractors_registry 📦

Archive. See Datatractor Yard, below:

Language: Python - Size: 205 KB - Last synced at: 11 months ago - Pushed at: 12 months ago - Stars: 6 - Forks: 6

Pawsanie/Steam_statistics_ETL

This pipeline can be used to collect statistical information about all games distributed through the Steam platform.

Language: Python - Size: 2.35 MB - Last synced at: 3 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

success4lyf/ETL-Pipline

Building a fully scalable ETL (Extract, Transform, Load) pipeline to handle large volumes of transaction data for a café business.

Language: Jupyter Notebook - Size: 70.3 KB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

mathewsrc/machine-learning-monitoring-with-evidently

ML Monitoring with EvidentlyAI

Language: Jupyter Notebook - Size: 23.1 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

immanuvelprathap/ETL-Sales_Analysis_Report---MySQL-PowerBI

This repo explains how ETL can be done in MySQL and Power BI to generate insights!

Size: 6.75 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Aishwarya-TheAnalyst/AtliQ-Grands-Hospitality-Insights-using-Power-BI

AtliQ Grands hotel Data Analysis using Power BI

Size: 4.25 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

mathewsrc/ETL-Chicago-Cafe-Permits

This ETL (Extract, Transform, Load) project employs several Python libraries, including Airflow, Soda, Polars, YData Profiling, DuckDB, Requests, Loguru, and Google Cloud to streamline the extraction, transformation, and loading of CSV datasets from the U.S. government's data repository at https://catalog.data.gov.

Language: HTML - Size: 42.3 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

RYANFRANKLIN237/Data-cleansing

A group of python scripts that clean large data sets by removing duplicate data, putting data in correct formats, and removing redundant cells

Language: Python - Size: 7.81 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

udaisharma99/Human-Activity-Prediction

This project focuses on using sensor data to predict human activity and is based on the ExtraSensory dataset, created by Ph.D. students and staff at the Department of Electrical and Computer Engineering, University of California, San Diego.

Language: Jupyter Notebook - Size: 755 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ramkumarpj/project-three

SEC Finance Data Engineering - ETL process for SEC Finance data of S&P 500 companies. Jupyter Notebooks run the ETL workflows. The final dataset is hosted in MongoDB Atlas (cloud). The API is written in Python with the PyMongo and Flask libraries. The dashboards with charts are hosted in MongoDB Atlas.

Language: Jupyter Notebook - Size: 3.01 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 3

damaniayesh/Inventory_Management_Dashboard

This project provides Inventory Management using Power BI, useful for warehouse and in-plant inventory managers who need to control inventory levels effectively while maintaining service levels.

Size: 4.85 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ramkumarpj/Crowdfunding_ETL

This project takes crowdfunding data provided in Excel files through an Extract, Transform, and Load (ETL) process and makes it available in a relational database for further use.

Language: Jupyter Notebook - Size: 768 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

codecadre/melhordazona-web

Web app using babashka/apache + ETL pipeline

Language: Clojure - Size: 14.6 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

benispresence/hexbase

open-source ETL pipeline for HEX cryptocurrency data

Language: Python - Size: 525 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 1

lfhohmann/wordle-ETL

ETL for Wordle game

Language: Jupyter Notebook - Size: 157 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

rtimbro185/syr_mads_ist722_data_warehouse

Syracuse University, Masters of Applied Data Science - IST 722 Data Warehouse

Language: TSQL - Size: 50.6 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 4

praveendecode/Data_Pipeline

Implemented ETL projects with interactive Streamlit UI for user-friendly data extraction, transformation, and loading tasks

Size: 1000 Bytes - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

KDerec/bookscrap

Student project #1 - Web scraping: use Python basics to create a program that automates the process of extracting, transforming, and loading data from the online library "Books to Scrape".

Language: Python - Size: 11.2 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

phelps-sg/zipline-tardis-bundle

A bundle for zipline-reloaded to allow data for crypto assets to be ingested from Tardis

Language: Python - Size: 93.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

klipperdev/klipper

A Web and API Development Platform built on Symfony

Language: PHP - Size: 3.1 MB - Last synced at: 12 months ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

PredictGroup/1C-ERP-OLAP

OLAP ITL utilities for 1C:ERP Enterprise Management.

Language: C# - Size: 1.13 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 6 - Forks: 5

MadAboutImport/DIFS

Data Importer For SharePoint & Office 365

Size: 88.2 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 23 - Forks: 15

praveen-kumar-maurya/Superstore-Sales-Dashboard

The superstore sales dashboard developed in Power BI aims to increase sales and profitability by providing data-driven insights. It offers a comprehensive view of sales and profit trends to identify growth opportunities and inform marketing strategies. The goal is to achieve sustainable growth and profitability by utilizing the insights provided.

Size: 3.78 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

nachiketdixit/Google_BI_Professional

This certification focuses on in-demand skills like data modeling, data visualization, and dashboarding and reporting.

Size: 500 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ats-tandjoeng7/Mission-to-Mars

Application of Python web scraping methodologies for performing data analytics and visualization as part of the Extract, Transform, and Load (ETL) process.

Language: Jupyter Notebook - Size: 719 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

ats-tandjoeng7/Crowdfunding-ETL

Application of Python libraries, like Pandas, and their useful functions for performing an efficient Extract, Transform, and Load (ETL) process.

Language: Jupyter Notebook - Size: 1.1 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

gopiashokan/Phonepe-Pulse-Data-Visualization-and-Exploration

Visualize insights from PhonePe Pulse data using Python, Streamlit, and Plotly. Explore interactive charts and uncover trends in digital transactions.

Language: Python - Size: 105 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

seonguook88/Seong_Portfolio

Data Analytics Portfolio

Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

NEXTSLIM/The-Music-has-Changed-WEBSIDE

We are going to examine two data sets related to the music industry. We want to extract, transform, and load them in order to identify insights and trends about the music industry.

Language: CSS - Size: 822 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

NEXTSLIM/The-Music-has-Changed-Extract-transform-load-

We examine two data sets related to the music industry. We extract, transform, and load the data sets in order to create a database and identify insights and trends about the music industry.

Language: Jupyter Notebook - Size: 47 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

IamJafar/Youtube_Data_Harvesting_and_Warehousing

Domain: Social Media | Extracting data using the YouTube API and storing it in MongoDB, then transforming it into a relational database like MySQL, to get various information about YouTube channels.

Language: Python - Size: 20.5 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

IsaacMwendwa/Twitter-ETL-of-Elections-PoliceBrutality-HateSpeech-Data

This Twitter ETL project aims to provide data in support of UN SDG number 16. The data is intended to give stakeholders actionable insights regarding the 2022 Presidential Elections, Police Brutality, and Propagation of Hate Speech on Twitter.

Language: Python - Size: 593 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

deepankarvarma/Extract-Transform-Load-Process-Techniques

This repository contains code for comparing the performance of three different ELT (Extract, Load, Transform) methods on CSV files of different sizes. The three methods are implemented in Python using different approaches and libraries, and their execution times are compared and plotted for analysis.

Language: Python - Size: 31.8 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

tejal04/NYC-TLC-Data-Engineering

NYC TLC Data Analysis using Python, GCP Storage, Compute Engine, Mage Data Pipeline Tool, BigQuery, and Looker Studio. Aims to extract insights from the dataset for informed decisions and deeper operational understanding.

Language: Python - Size: 1.07 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

python-bonobo/bonobo-sqlalchemy

PREVIEW - SQL databases in Bonobo, using sqlalchemy

Language: Python - Size: 97.7 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 25 - Forks: 14

ats-tandjoeng7/surfs_up

Application of Python database toolkits, such as SQLAlchemy and Flask, for performing data analytics and visualization as part of the Extract, Transform, and Load (ETL) process.

Language: Jupyter Notebook - Size: 617 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

dimgold/ETL_with_Python

ETL with Python - Taught at DWH course 2017 (TAU)

Language: Jupyter Notebook - Size: 115 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 93 - Forks: 51

StationA/landgrab 📦

Geospatial data hoarding system

Language: Python - Size: 72.3 KB - Last synced at: about 1 year ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

tek-cub/nlp_job-postings

Natural language processing of job postings in order to gain insight into the data science job market.

Language: Jupyter Notebook - Size: 3.06 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

sam-marhaendra/etl-project-anteraja-reviews

This repository was created for the final group project of a Data Engineering course.

Language: Jupyter Notebook - Size: 1.87 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

python-bonobo/bonobo-docker

PREVIEW - Run Bonobo data processing graphs in docker containers.

Language: Python - Size: 73.2 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 6

JaviSandoval94/ETL-Project Fork of ArceSaenzLuisAlejandro/ETL-Project

This project aims to create an ETL pipeline from energy consumption data.

Language: Jupyter Notebook - Size: 8.95 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

andreasscherbaum/faa

FAA Airline On-Time Performance Data

Language: Shell - Size: 125 KB - Last synced at: 2 months ago - Pushed at: about 12 years ago - Stars: 1 - Forks: 1

taiwofawumi/DE_ETL_HTML_CSV_JSON

This notebook scrapes information about the largest banks by market capitalization from a wiki page, and stores the information both as a CSV and as a JSON file.

Language: Jupyter Notebook - Size: 8.79 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

dfornika/ncov-db

Store SARS-CoV-2 genomic analysis results from ncov2019-artic-nf and ncov-tools to a sqlite DB

Language: Python - Size: 60.5 KB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

OrionExplorer/dcpam

Data Construct-Populate-Access-Manage - Open source data warehouse solution.

Language: C - Size: 57.3 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

python-bonobo/bonobo-selenium

PRE-ALPHA - Write web crawlers using Bonobo

Language: Python - Size: 19.5 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 2

ZGrinacoff/ETL-Project

E (Extract), T (Transform), L (Load) project that showcases both SQL and NoSQL databases.

Language: Jupyter Notebook - Size: 4.6 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 1

PetraLee2019/Crime-Anyltics

Approximately 10 people are shot on an average day in Chicago. This project focuses on Poverty and Crime in Chicago Neighborhoods. Full-Stack Project.

Language: Jupyter Notebook - Size: 3.33 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

mediaintegration/twiddlepy

Python module for extracting, transforming and loading data

Language: Python - Size: 82 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

ZGrinacoff/Citi-Bike-Analytics

An analysis of Citi Bike with Tableau from January 2018 - September 2019

Size: 12.7 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

samsk/etlrun

Extract-Transform-Load tool based on Message passing, self reprocessing XML pipeline

Language: Perl - Size: 869 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0
