Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: etl-automation

data-solution-automation-engine/DIRECT

DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics control framework that can be used to monitor, log, audit and control data integration / ETL processes.

Language: TSQL - Size: 12.9 MB - Last synced: about 22 hours ago - Pushed: 2 days ago - Stars: 25 - Forks: 8

data-solution-automation-engine/data-warehouse-automation-metadata-schema

Generic interface exchange format for Data Warehouse Automation and ETL generation.

Language: C# - Size: 23 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 37 - Forks: 11

data-solution-automation-engine/virtual-data-warehouse

The Virtual Data Warehouse is a code generation and template management tool. It is part of the data solution automation ecosystem - the 'engine' for data solution automation.

Language: Handlebars - Size: 4.42 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 45 - Forks: 13

cybergeekgyan/Data-Engineering-Portfolio

Data Engineering portfolio projects, resources used to study data tools...

Language: Jupyter Notebook - Size: 2.92 MB - Last synced: 8 days ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

OndraZizka/csv-cruncher

Treats CSV and JSON files as SQL tables, and exports SQL SELECTs back to CSV or JSON.

Language: Kotlin - Size: 13 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 43 - Forks: 12

TheNJineer/NJRealtor-Scrapper

Full-scale portfolio project that scrapes the NJ Realtor website for its monthly median sales data PDFs. The PDF contents are extracted, cleaned, and transformed, then stored in a SQL database for future use in a machine learning project.

Language: Python - Size: 3.37 MB - Last synced: 9 days ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

kyaiooiayk/ETL-and-ML-Pipelines-Notes

Notes, tutorials, code snippets and templates focused on ETL pipelines for Machine Learning

Language: Jupyter Notebook - Size: 3.71 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 0 - Forks: 0

heliomarpm/SQLDataTransfer

SQL Server data copy tool, developed to help generate files and efficiently copy data between SQL Server databases.

Language: C# - Size: 4.81 MB - Last synced: 8 days ago - Pushed: 7 months ago - Stars: 1 - Forks: 2

omega1x/stmik

Real-time connector to BTSK-telemetry service

Language: Java - Size: 49.8 KB - Last synced: 16 days ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

EvilLord666/ReportGenerator

A small cross-database tool for building Excel documents (reports) from database data extracted via views or stored procedures, with parameters, ordering, etc.

Language: C# - Size: 500 KB - Last synced: 11 days ago - Pushed: almost 2 years ago - Stars: 9 - Forks: 6

redis-field-engineering/redis-connect-dist

Real-Time Event Streaming & Change Data Capture

Language: Shell - Size: 37.2 MB - Last synced: 7 days ago - Pushed: 7 days ago - Stars: 40 - Forks: 11

mikeAdamss/tidychef

Python framework for transforming tabulated data with visual relationships into tidy data

Language: Jupyter Notebook - Size: 16.3 MB - Last synced: 10 days ago - Pushed: 10 months ago - Stars: 1 - Forks: 0

rohitkulkarni08/Azure-ETL-AmazonSalesAnalysis

A comprehensive ETL pipeline and sales analysis project leveraging Microsoft Azure and PySpark, designed to optimize e-commerce sales by providing actionable insights through detailed data analysis.

Language: Jupyter Notebook - Size: 8.04 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

rohitkulkarni08/Azure-ETL-Pipeline-MovieAnalytics

This project demonstrates an ETL pipeline using Microsoft Azure for IMDb Movie Rating Dataset analysis. It covers data extraction from Azure Blob Storage, transformation with Azure Databricks, and loading into Azure SQL using Azure Data Factory. The pipeline automates insights generation and is a practical example of cloud-based data engineering.

Language: Jupyter Notebook - Size: 15.9 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

stevehoober254/ETL_Data_Pipeline_For_Retail_Store

ETL (Extract, Transform, Load) pipeline to integrate sales data from various sources into a central data warehouse

Language: Python - Size: 5.86 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

skyffel/airbyte-connector-generator-poc

Proof of concept to generate Airbyte low-code YAML connectors from API documentation

Language: Python - Size: 184 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 11 - Forks: 2

Dev-analysis/Four-year-sales-report

Power BI sales project. Creating an automated Power BI dashboard.

Size: 12.2 MB - Last synced: about 2 months ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

rizkyirw/Pipeline-Project

Resource for ETL & Data Ingestion program using Apache Airflow

Language: Python - Size: 207 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

OTRABAZOS/RealTimeNews_GoogleWorkspace

Optimize your marketing agency's data workflow with this repository, focusing solely on news integration. Leverage Google Workspace and TrawlingWeb for efficient news acquisition and analysis, transforming complex information into actionable insights with fast, autonomous processes.

Language: JavaScript - Size: 70.3 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

herianc/dados-ar-rj

Information system for Rio's pollution data

Language: Python - Size: 11.7 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

wonderakwei/Automated-Traffic-Data-ETL-Project

Automated Traffic Data ETL: Python scripts convert, reformat, and upload traffic data to BigQuery via GCS. Terraform ensures efficient resource provisioning, and APScheduler automates daily cron jobs.

Language: Python - Size: 616 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

rickluizms/price-miner

Tool dedicated to discovering daily promotions offered by online vendors.

Size: 58.6 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

jmoc3/ETL_with_Airflow

Automation of the extraction, transformation, and loading of data from an API into databases such as MySQL, PostgreSQL, and Redis.

Language: Python - Size: 734 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 3 - Forks: 0

TheNJineer/GSMLS-Analysis

Full-scale portfolio project used to aggregate sales data from the GSMLS (quarterly), clean and transform the data, and store it in a SQL database for future machine learning analysis

Language: Python - Size: 71.3 KB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 0 - Forks: 0

wlopezm-unal/Machine-learning

Machine learning models. Contains notebooks covering machine learning models, data exploration, and data cleaning.

Language: Jupyter Notebook - Size: 1.36 MB - Last synced: 28 days ago - Pushed: 28 days ago - Stars: 0 - Forks: 0

pavelmaksimov/FlowMaster

ETL flow framework based on YAML configs in Python

Language: Python - Size: 606 KB - Last synced: 14 days ago - Pushed: 7 months ago - Stars: 21 - Forks: 3

codeexpress/webpluck

Extract specific information from a webpage. Run in standalone mode or as an API.

Language: Go - Size: 21.5 KB - Last synced: 4 months ago - Pushed: over 2 years ago - Stars: 5 - Forks: 0

nl2go/hetzner-invoice

Automatically download and transform Hetzner invoices.

Language: Python - Size: 59.6 KB - Last synced: about 2 months ago - Pushed: almost 4 years ago - Stars: 10 - Forks: 1

JonFillip/transloc_api_gcp_pipeline

Stream data directly from an API using Apache Beam to BigQuery.

Language: Python - Size: 40.9 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

restarone/violet_rails

An app engine for your business. Seamlessly implement business logic with a powerful API. Out-of-the-box CMS, blog, forum and email functionality. Developer-friendly and easily extendable for your next SaaS/XaaS project. Built with Rails 6, Devise, Sidekiq & PostgreSQL

Language: Ruby - Size: 32.7 MB - Last synced: 3 months ago - Pushed: 6 months ago - Stars: 95 - Forks: 42

iambhat/python-script-to-fetch-html-data

A Python script used in a project to access data from an HTML page using Beautiful Soup.

Language: Python - Size: 8.79 KB - Last synced: 5 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

Olamylo/CI-CD-with-Azure-Pipelines

This repo shows how to build a CI/CD process with Azure Pipelines. The process uses a Python script, placed in an Azure Repo, that extracts data from an API and exports it to a blob on Azure cloud.

Language: Python - Size: 7.81 KB - Last synced: 5 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

DanieleDepiro/ETL_pipeline

This project is an example of how to create a basic ETL pipeline, including web scraping, data transformation using pandas, and data loading into a PostgreSQL database managed through pgAdmin 4.

Language: Python - Size: 8.79 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

DATASCIENTISTSHENRY/PF_DataScience_Migraciones

Repository for the Data Science final project at the Henry bootcamp, analyzing migration data at the global and national levels. It applies a technology stack including Google Cloud Platform, with machine learning, KPI presentation, and visualizations in Power BI

Language: Jupyter Notebook - Size: 124 MB - Last synced: 3 months ago - Pushed: 5 months ago - Stars: 1 - Forks: 2

seyedmahdiamin1998/ETL_catawiki

ETL : Extract --> transform --> load

Language: Python - Size: 260 KB - Last synced: 6 months ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

MarcoZazzini1989/ETL_import_financial_data_to_postgresql

Simple ETL script with Apache Airflow that downloads financial data from Alpha Vantage through its API, transforms it into a pandas DataFrame, and uploads it to PostgreSQL

Language: Jupyter Notebook - Size: 2.93 KB - Last synced: 3 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 1

matewz/simple_etl_example

Small ETL example: extracting, storing, and visualizing data

Language: Python - Size: 26.4 KB - Last synced: 7 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

SanjinKurelic/AntennaDistribution

Antenna Distribution is a project that shows how to run business analysis tools on a set of data.

Language: TSQL - Size: 23.5 MB - Last synced: 7 months ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0

AnveshJarabani/END-END-ETL-PIPELINES

ETL pipelines and dashboard visualizations

Language: Python - Size: 51.3 MB - Last synced: 21 days ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

smmiri/etl-visuals

Code for data flow between models, data post-processing, and visualization

Language: Jupyter Notebook - Size: 3.81 MB - Last synced: 8 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

lwdovico/LDS-Project

Repository of a Data Science Project

Language: Jupyter Notebook - Size: 14.7 MB - Last synced: 9 months ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

Usaid-Bin-Rehan/FAST_Resources_Reverse_Indexing

Search-Engine for FAST-Resources

Language: Jupyter Notebook - Size: 1.66 MB - Last synced: 8 months ago - Pushed: 9 months ago - Stars: 1 - Forks: 2

TheCocoTeam/source-watcher-core

This is a PHP project which combines ETL with different strategies to extract data from multiple databases, files, and services, transform it and load it into multiple destinations.

Language: PHP - Size: 1.27 MB - Last synced: 16 days ago - Pushed: about 1 year ago - Stars: 8 - Forks: 0

AMPATH/etl-flat-table-sync

Sync service for seamless synchronization and transformation of data from AMRS to ETL flat tables

Language: JavaScript - Size: 148 KB - Last synced: 22 days ago - Pushed: 8 months ago - Stars: 0 - Forks: 4

MikeBidinger/Python_ETL

ETL processes using a Tkinter GUI with Python

Language: Python - Size: 101 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

aws-samples/amazon-redshift-serverless-rsql-etl-framework

Amazon Redshift Serverless RSQL ETL Framework

Language: TypeScript - Size: 1010 KB - Last synced: about 1 month ago - Pushed: 7 months ago - Stars: 4 - Forks: 1

pyprogrammerblog/tiny-blocks

Tiny Blocks to build large and complex data pipelines!

Language: Python - Size: 70.8 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 3 - Forks: 0

timothypesi/ETL-Extract-Transform-Load---using-Pygrametl-

Language: Jupyter Notebook - Size: 2.93 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

moelhaj/elt-pipeline

Extracting data from csv, transforming it, and loading into a Data Warehouse.

Language: Jupyter Notebook - Size: 112 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 2 - Forks: 0

Pawsanie/Luigi_ETL

Universal Luigi ETL pipeline. Validates data received from external sources, then extracts, transforms, and lands it.

Language: Python - Size: 95.7 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

mariajosemv/ETL-for-news-websites

♻️ Pipeline to extract, transform, and load articles from news websites into an SQLite database.

Language: Python - Size: 7.81 KB - Last synced: 12 months ago - Pushed: over 3 years ago - Stars: 3 - Forks: 1

SohaT7/Movies-ETL

Creates an automated ETL (Extract, Transform, Load) pipeline that extracts (from three data files), transforms, and loads data into a movies database. Uses Python (Pandas), PostgreSQL, and SQL.

Language: Jupyter Notebook - Size: 15.9 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

Wissance/ReportGeneratorWebGui

An ASP.NET MVC 6 web GUI (.NET Core) for easy report generation using ReportGenerator

Language: C# - Size: 2.01 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 5 - Forks: 2

imsanjoykb/ETL-Project

The goal of this project is to illustrate Extract Transform Load (ETL) using Python and SQL. ETL is a process commonly done in computing, which takes raw data, cleans it and stores it for later use. The extraction phase targets and retrieves the data. Transform manipulates and cleans the data. Then load stores the data, typically in a data warehouse.

Language: Jupyter Notebook - Size: 285 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 9 - Forks: 3

OscarTHZhang/docker-airflow Fork of wmorin/docker-airflow-1 📦

Demo for AgDH data pipeline

Language: Python - Size: 31.8 MB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

visiologyofficial/vixtract

Language: HTML - Size: 845 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 40 - Forks: 8

manoj9788/spark-etl-tests

A sample repository showcasing an implementation of testing for an ETL pipeline developed with Apache Spark

Size: 1000 Bytes - Last synced: 8 days ago - Pushed: about 4 years ago - Stars: 1 - Forks: 0

44falls/modelt

Modelt(mow·delt) is a modern data integration solution that connects data to data for advanced analytics.

Language: Shell - Size: 463 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 2 - Forks: 1

cloudquery/deploy-cq-aws 📦

CloudFormation template that deploys CloudQuery in an AWS account

Language: Makefile - Size: 24.4 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 2

DeleLinus/HFR-Data-Warehousing

End-to-end data engineering processes for the NIGERIA Health Facility Registry (HFR). The project leveraged Selenium, Pandas, PySpark, PostgreSQL and Airflow

Language: Python - Size: 1.05 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 3 - Forks: 0

DGloi/SapR3AutomationTool

Automation method for any SAP R3 TCODE, plus a specific example of data treatment of the (anonymised) extracts

Language: Python - Size: 319 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

juniors90/PymaciesArg

An extension that registers all pharmacies in Argentina.

Language: Python - Size: 26.9 MB - Last synced: 12 days ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

dorisep/meta_gui

A GUI that calls a script to scrape Metacritic, create a playlist, and store metadata.

Language: Python - Size: 4.84 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

kazizi-swe/system-monitoring

An application that is designed for monitoring and alerting.

Language: Python - Size: 960 KB - Last synced: 6 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

Bag0niku/Movies_ETL

Cleaning and storing movie data for future analysis

Language: Jupyter Notebook - Size: 15.3 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

Sandrolaxx/handsOnETL

Hands-on ETL with Pentaho PDI, based on FIAP's BI course.

Language: PLSQL - Size: 12.6 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

san089/airflow-training Fork of mdivk/airflow-training

Introduction to data pipeline management with Airflow. Airflow schedules and maintains numerous ETL processes running on a large-scale enterprise data warehouse.

Language: Python - Size: 5.25 MB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 5 - Forks: 4

moore3229/Erin-Moore

Data Engineer Portfolio

Size: 202 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0

basvdberg/BETL-old

BETL. Metadata-driven ETL generation using T-SQL

Size: 2.49 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 17 - Forks: 2

AnchorBase/AnchorBase

AnchorBase is a data warehouse designer

Language: Python - Size: 308 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0

morikaglobal/covid19japan_tracker

Python web application that automates ETL of the latest COVID-19 data on Japan, as well as data visualisation with Tableau

Language: Python - Size: 546 KB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 2 - Forks: 0

Phelipe-Sempreboni/case-clients-marketing-campaingn

Repository for marketing case developed for a selection process.

Language: TSQL - Size: 5.94 MB - Last synced: 12 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

lucianocoelho-28/dio-curso-etl

Course offered for DIO on ETL using the Python language and the pandas and pandera libraries.

Language: Jupyter Notebook - Size: 208 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 2 - Forks: 0

mikeroyal/Apache-Arrow-Guide

Apache Arrow Guide

Size: 160 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0

AfonsoFeliciano/Python-Dowload-Arquivos

Download and unzip files dynamically

Language: Python - Size: 156 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

aurcode/ETL-for-news-websites

♻️ Pipeline to extract, transform, and load articles from news websites into an SQLite database.

Language: Python - Size: 8.79 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

Meluiscruz/Automated_Email_Reminder

This project is an automated e-mail sender for an insurance company. The script reads some Excel files and prepares attachments to send to the clients via e-mail.

Language: Python - Size: 3.37 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

AlkaSaliss/aws_airflow

Project about automating ETL on AWS Redshift using Apache Airflow. Part of the Udacity Data Engineering Nanodegree

Language: Python - Size: 854 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

Mew-www/lol-data-collection-system

Academic thesis work's backend / data pipeline. 5/5. Details given privately.

Language: Python - Size: 742 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

danielczz/ETL_Project-Data_Panas

Extract, Transform & Load analytical workflow for INEGI data on deaths (defunciones), year 2012.

Language: Jupyter Notebook - Size: 17.8 MB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 1 - Forks: 0