Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: etl-automation
data-solution-automation-engine/DIRECT
DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics control framework that can be used to monitor, log, audit and control data integration / ETL processes.
Language: TSQL - Size: 12.9 MB - Last synced: about 22 hours ago - Pushed: 2 days ago - Stars: 25 - Forks: 8
data-solution-automation-engine/data-warehouse-automation-metadata-schema
Generic interface exchange format for Data Warehouse Automation and ETL generation.
Language: C# - Size: 23 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 37 - Forks: 11
data-solution-automation-engine/virtual-data-warehouse
The Virtual Data Warehouse is a code generation and template management tool. It is part of the data solution automation ecosystem - the 'engine' for data solution automation.
Language: Handlebars - Size: 4.42 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 45 - Forks: 13
cybergeekgyan/Data-Engineering-Portfolio
Data Engineering portfolio projects, resources used to study data tools...
Language: Jupyter Notebook - Size: 2.92 MB - Last synced: 8 days ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0
OndraZizka/csv-cruncher
Treats CSV and JSON files as SQL tables, and exports SQL SELECTs back to CSV or JSON.
Language: Kotlin - Size: 13 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 43 - Forks: 12
TheNJineer/NJRealtor-Scrapper
Full-scale portfolio project that scrapes the NJ Realtor website for its monthly median sales data PDFs. The PDF contents are extracted, cleaned, and transformed, then stored in a SQL database for future use in a machine learning project.
Language: Python - Size: 3.37 MB - Last synced: 9 days ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
kyaiooiayk/ETL-and-ML-Pipelines-Notes
Notes, tutorials, code snippets and templates focused on ETL pipelines for Machine Learning
Language: Jupyter Notebook - Size: 3.71 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 0 - Forks: 0
heliomarpm/SQLDataTransfer
SQL Server data copy tool, developed to help generate files and efficiently copy data between SQL Server databases.
Language: C# - Size: 4.81 MB - Last synced: 8 days ago - Pushed: 7 months ago - Stars: 1 - Forks: 2
omega1x/stmik
Real-time connector to BTSK-telemetry service
Language: Java - Size: 49.8 KB - Last synced: 16 days ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0
EvilLord666/ReportGenerator
A small cross-database tool for building Excel documents (reports) from database data extracted via views or stored procedures, with parameters, ordering, etc.
Language: C# - Size: 500 KB - Last synced: 11 days ago - Pushed: almost 2 years ago - Stars: 9 - Forks: 6
redis-field-engineering/redis-connect-dist
Real-Time Event Streaming & Change Data Capture
Language: Shell - Size: 37.2 MB - Last synced: 7 days ago - Pushed: 7 days ago - Stars: 40 - Forks: 11
mikeAdamss/tidychef
Python framework for transforming tabulated data with visual relationships into tidy data
Language: Jupyter Notebook - Size: 16.3 MB - Last synced: 10 days ago - Pushed: 10 months ago - Stars: 1 - Forks: 0
rohitkulkarni08/Azure-ETL-AmazonSalesAnalysis
A comprehensive ETL pipeline and sales analysis project leveraging Microsoft Azure and PySpark, designed to optimize e-commerce sales by providing actionable insights through detailed data analysis.
Language: Jupyter Notebook - Size: 8.04 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0
rohitkulkarni08/Azure-ETL-Pipeline-MovieAnalytics
This project demonstrates an ETL pipeline using Microsoft Azure for IMDb Movie Rating Dataset analysis. It covers data extraction from Azure Blob Storage, transformation with Azure Databricks, and loading into Azure SQL using Azure Data Factory. The pipeline automates insights generation and is a practical example of cloud-based data engineering.
Language: Jupyter Notebook - Size: 15.9 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0
stevehoober254/ETL_Data_Pipeline_For_Retail_Store
ETL (Extract, Transform, Load) pipeline to integrate sales data from various sources into a central data warehouse
Language: Python - Size: 5.86 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0
skyffel/airbyte-connector-generator-poc
proof of concept to generate Airbyte low-code YAML connectors from API documentation
Language: Python - Size: 184 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 11 - Forks: 2
Dev-analysis/Four-year-sales-report
Power BI sales project. Creating an automated Power BI dashboard.
Size: 12.2 MB - Last synced: about 2 months ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0
rizkyirw/Pipeline-Project
Resource for ETL & Data Ingestion program using Apache Airflow
Language: Python - Size: 207 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0
OTRABAZOS/RealTimeNews_GoogleWorkspace
Optimize your marketing agency's data workflow with this repository, focusing solely on news integration. Leverage Google Workspace and TrawlingWeb for efficient news acquisition and analysis, transforming complex information into actionable insights with fast, autonomous processes.
Language: JavaScript - Size: 70.3 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0
herianc/dados-ar-rj
Information system for Rio pollution data
Language: Python - Size: 11.7 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0
wonderakwei/Automated-Traffic-Data-ETL-Project
Automated Traffic Data ETL: Python scripts convert, reformat, and upload traffic data to BigQuery via GCS. Terraform ensures efficient resource provisioning, and APScheduler automates daily cron jobs.
Language: Python - Size: 616 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
rickluizms/price-miner
Tool dedicated to discovering daily promotions offered by online vendors.
Size: 58.6 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
jmoc3/ETL_with_Airflow
Automation of data extraction, transformation, and loading from an API into databases such as MySQL, PostgreSQL, and Redis.
Language: Python - Size: 734 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 3 - Forks: 0
TheNJineer/GSMLS-Analysis
Full-scale portfolio project that aggregates quarterly sales data from the GSMLS, cleans and transforms it, and stores it in a SQL database for future machine learning analysis
Language: Python - Size: 71.3 KB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 0 - Forks: 0
wlopezm-unal/Machine-learning
Machine learning models. A set of notebooks covering work with machine learning models, data exploration, and data cleaning.
Language: Jupyter Notebook - Size: 1.36 MB - Last synced: 28 days ago - Pushed: 28 days ago - Stars: 0 - Forks: 0
pavelmaksimov/FlowMaster
ETL flow framework based on Yaml configs in Python
Language: Python - Size: 606 KB - Last synced: 14 days ago - Pushed: 7 months ago - Stars: 21 - Forks: 3
codeexpress/webpluck
Extract specific information from a webpage. Run in standalone mode or as an API.
Language: Go - Size: 21.5 KB - Last synced: 4 months ago - Pushed: over 2 years ago - Stars: 5 - Forks: 0
nl2go/hetzner-invoice
Automatically download and transform Hetzner invoices.
Language: Python - Size: 59.6 KB - Last synced: about 2 months ago - Pushed: almost 4 years ago - Stars: 10 - Forks: 1
JonFillip/transloc_api_gcp_pipeline
Stream data directly from an API using Apache Beam to BigQuery.
Language: Python - Size: 40.9 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0
restarone/violet_rails
an app engine for your business. Seamlessly implement business logic with a powerful API. Out of the box CMS, blog, forum and email functionality. Developer friendly & easily extendable for your next SaaS/XaaS project. Built with Rails 6, Devise, Sidekiq & PostgreSQL
Language: Ruby - Size: 32.7 MB - Last synced: 3 months ago - Pushed: 6 months ago - Stars: 95 - Forks: 42
iambhat/python-script-to-fetch-html-data
A Python script, used in a project, that accesses data from an HTML page using Beautiful Soup.
Language: Python - Size: 8.79 KB - Last synced: 5 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0
Olamylo/CI-CD-with-Azure-Pipelines
This repo shows how to build a CI/CD process with Azure Pipelines. The process uses a Python script, placed in an Azure Repo, that extracts data from an API and exports it to a blob on Azure cloud.
Language: Python - Size: 7.81 KB - Last synced: 5 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
DanieleDepiro/ETL_pipeline
This project is an example of how to create a basic ETL pipeline, including web scraping, data transformation using pandas, and data loading into a pgAdmin4 database.
Language: Python - Size: 8.79 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0
DATASCIENTISTSHENRY/PF_DataScience_Migraciones
Repository for the Data Science final project at the Henry bootcamp, analyzing migration data at the global and national level. It applies a technology stack that includes Google Cloud Platform, machine learning, KPI presentation, and visualizations in Power BI
Language: Jupyter Notebook - Size: 124 MB - Last synced: 3 months ago - Pushed: 5 months ago - Stars: 1 - Forks: 2
seyedmahdiamin1998/ETL_catawiki
ETL : Extract --> transform --> load
Language: Python - Size: 260 KB - Last synced: 6 months ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0
MarcoZazzini1989/ETL_import_financial_data_to_postgresql
Simple ETL script with Apache Airflow: downloads financial data from Alpha Vantage through its API, transforms it into a pandas DataFrame, and uploads it to PostgreSQL
Language: Jupyter Notebook - Size: 2.93 KB - Last synced: 3 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 1
matewz/simple_etl_example
Small ETL example: data extraction, storage, and visualization
Language: Python - Size: 26.4 KB - Last synced: 7 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0
SanjinKurelic/AntennaDistribution
Antenna Distribution is a project that shows how to run business analysis tools on a set of data.
Language: TSQL - Size: 23.5 MB - Last synced: 7 months ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0
AnveshJarabani/END-END-ETL-PIPELINES
ETL Pipelines and Dashboard visualizations
Language: Python - Size: 51.3 MB - Last synced: 21 days ago - Pushed: 8 months ago - Stars: 0 - Forks: 0
smmiri/etl-visuals
Code for data flow between models, data post-processing, and visualization
Language: Jupyter Notebook - Size: 3.81 MB - Last synced: 8 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0
lwdovico/LDS-Project
Repository of a Data Science Project
Language: Jupyter Notebook - Size: 14.7 MB - Last synced: 9 months ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
Usaid-Bin-Rehan/FAST_Resources_Reverse_Indexing
Search-Engine for FAST-Resources
Language: Jupyter Notebook - Size: 1.66 MB - Last synced: 8 months ago - Pushed: 9 months ago - Stars: 1 - Forks: 2
TheCocoTeam/source-watcher-core
This is a PHP project which combines ETL with different strategies to extract data from multiple databases, files, and services, transform it and load it into multiple destinations.
Language: PHP - Size: 1.27 MB - Last synced: 16 days ago - Pushed: about 1 year ago - Stars: 8 - Forks: 0
AMPATH/etl-flat-table-sync
Sync service for seamless synchronization and transformation of data from AMRS to ETL flat tables
Language: JavaScript - Size: 148 KB - Last synced: 22 days ago - Pushed: 8 months ago - Stars: 0 - Forks: 4
MikeBidinger/Python_ETL
ETL processes using a Tkinter GUI with Python
Language: Python - Size: 101 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0
aws-samples/amazon-redshift-serverless-rsql-etl-framework
Amazon Redshift Serverless RSQL ETL Framework
Language: TypeScript - Size: 1010 KB - Last synced: about 1 month ago - Pushed: 7 months ago - Stars: 4 - Forks: 1
pyprogrammerblog/tiny-blocks
Tiny Blocks to build large and complex data pipelines!
Language: Python - Size: 70.8 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 3 - Forks: 0
timothypesi/ETL-Extract-Transform-Load---using-Pygrametl-
Language: Jupyter Notebook - Size: 2.93 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
moelhaj/elt-pipeline
Extracting data from csv, transforming it, and loading into a Data Warehouse.
Language: Jupyter Notebook - Size: 112 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 2 - Forks: 0
Pawsanie/Luigi_ETL
Universal Luigi ETL pipeline. Validates data received from external sources, then extracts, transforms, and lands it.
Language: Python - Size: 95.7 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
mariajosemv/ETL-for-news-websites
♻️ Pipeline to extract, transform, and load articles from news websites into an SQLite database.
Language: Python - Size: 7.81 KB - Last synced: 12 months ago - Pushed: over 3 years ago - Stars: 3 - Forks: 1
SohaT7/Movies-ETL
Creates an automated ETL (Extract, Transform, Load) pipeline that extracts (from three data files), transforms, and loads data into a movies database. Uses Python (Pandas), PostgreSQL, and SQL.
Language: Jupyter Notebook - Size: 15.9 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0
Wissance/ReportGeneratorWebGui
An ASP.NET MVC 6 web GUI (.NET Core) for easy report generation using ReportGenerator
Language: C# - Size: 2.01 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 5 - Forks: 2
imsanjoykb/ETL-Project
The goal of this project is to illustrate Extract Transform Load (ETL) using Python and SQL. ETL is a process commonly done in computing, which takes raw data, cleans it and stores it for later use. The extraction phase targets and retrieves the data. Transform manipulates and cleans the data. Then load stores the data, typically in a data warehouse.
Language: Jupyter Notebook - Size: 285 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 9 - Forks: 3
OscarTHZhang/docker-airflow Fork of wmorin/docker-airflow-1 📦
Demo for AgDH data pipeline
Language: Python - Size: 31.8 MB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0
visiologyofficial/vixtract
Language: HTML - Size: 845 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 40 - Forks: 8
manoj9788/spark-etl-tests
A sample repository showcasing the implementation of tests for an ETL pipeline developed with Apache Spark
Size: 1000 Bytes - Last synced: 8 days ago - Pushed: about 4 years ago - Stars: 1 - Forks: 0
44falls/modelt
Modelt(mow·delt) is a modern data integration solution that connects data to data for advanced analytics.
Language: Shell - Size: 463 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 2 - Forks: 1
cloudquery/deploy-cq-aws 📦
CloudFormation template that deploys CloudQuery in an AWS account
Language: Makefile - Size: 24.4 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 2
DeleLinus/HFR-Data-Warehousing
End-to-end data engineering processes for the NIGERIA Health Facility Registry (HFR). The project leveraged Selenium, Pandas, PySpark, PostgreSQL and Airflow
Language: Python - Size: 1.05 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 3 - Forks: 0
DGloi/SapR3AutomationTool
Automation method for any SAP R3 TCODE, plus a specific example of data processing on the (anonymised) extracts
Language: Python - Size: 319 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
juniors90/PymaciesArg
An extension that registers all pharmacies in Argentina.
Language: Python - Size: 26.9 MB - Last synced: 12 days ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
dorisep/meta_gui
A GUI that calls a script to scrape Metacritic, create a playlist, and store metadata.
Language: Python - Size: 4.84 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
kazizi-swe/system-monitoring
An application that is designed for monitoring and alerting.
Language: Python - Size: 960 KB - Last synced: 6 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
Bag0niku/Movies_ETL
Cleaning and storing movie data for future analysis
Language: Jupyter Notebook - Size: 15.3 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
Sandrolaxx/handsOnETL
Hands-on ETL with Pentaho PDI, based on FIAP's BI course.
Language: PLSQL - Size: 12.6 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0
san089/airflow-training Fork of mdivk/airflow-training
Introduction to data pipeline management with Airflow. Airflow schedules and maintains numerous ETL processes running on a large-scale Enterprise Data Warehouse.
Language: Python - Size: 5.25 MB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 5 - Forks: 4
moore3229/Erin-Moore
Data Engineer Portfolio
Size: 202 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0
basvdberg/BETL-old
BETL. Metadata-driven ETL generation using T-SQL
Size: 2.49 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 17 - Forks: 2
AnchorBase/AnchorBase
AnchorBase is a data warehouse designer
Language: Python - Size: 308 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0
morikaglobal/covid19japan_tracker
Python web application that automates ETL of the latest COVID-19 data on Japan, as well as data visualisation with Tableau
Language: Python - Size: 546 KB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 2 - Forks: 0
Phelipe-Sempreboni/case-clients-marketing-campaingn
Repository for marketing case developed for a selection process.
Language: TSQL - Size: 5.94 MB - Last synced: 12 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0
lucianocoelho-28/dio-curso-etl
Course offered for DIO on ETL using the Python language and the pandas and pandera libraries.
Language: Jupyter Notebook - Size: 208 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 2 - Forks: 0
mikeroyal/Apache-Arrow-Guide
Apache Arrow Guide
Size: 160 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0
AfonsoFeliciano/Python-Dowload-Arquivos
Download and unzip files dynamically
Language: Python - Size: 156 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0
aurcode/ETL-for-news-websites
♻️ Pipeline to extract, transform, and load articles from news websites into an SQLite database.
Language: Python - Size: 8.79 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0
Meluiscruz/Automated_Email_Reminder
This project is an automated e-mail sender for an Insurance company. The script reads some Excel files and prepares attachments to send to the clients via e-mail.
Language: Python - Size: 3.37 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0
AlkaSaliss/aws_airflow
Project about automating ETL on AWS Redshift using Apache Airflow. Part of the Udacity data engineering nanodegree
Language: Python - Size: 854 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0
Mew-www/lol-data-collection-system
Academic thesis work's backend / data pipeline. 5/5. Details given privately.
Language: Python - Size: 742 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0
danielczz/ETL_Project-Data_Panas
Extract, Transform & Load analytical workflow for INEGI data on deaths (defunciones), year 2012.
Language: Jupyter Notebook - Size: 17.8 MB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 1 - Forks: 0