GitHub topics: data-transformation
neurons-me/all.this
All.This is a modular framework for managing and standardizing data structures, enabling seamless interaction across the neurons.me ecosystem. It transforms objects like images, text, and audio into structured formats optimized for machine learning and deep learning applications.
Language: JavaScript - Size: 740 KB - Last synced at: about 6 hours ago - Pushed at: about 6 hours ago - Stars: 64 - Forks: 0

lykmapipo/US-Gas-Prices
Python scripts that scrape US gas prices
Language: Python - Size: 2.12 MB - Last synced at: about 7 hours ago - Pushed at: about 7 hours ago - Stars: 4 - Forks: 1

bruin-data/bruin
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
Language: Go - Size: 147 MB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 949 - Forks: 41

caesarmario/weather-data-engineering-pipeline
This repository showcases a complete Python-based ETL (Extract, Transform, Load) data pipeline designed to process, validate, and analyze weather data for multiple cities. The project demonstrates a structured approach to handling weather data, focusing on data accuracy, transformation, and insights generation.
Language: Python - Size: 1.74 MB - Last synced at: about 23 hours ago - Pushed at: about 24 hours ago - Stars: 1 - Forks: 0

chriskenndy/AI-Job-Risk-Pipeline-Project
An end-to-end data pipeline and dashboard assessing the potential risk of AI replacement in various job roles and industries.
Language: Jupyter Notebook - Size: 3.27 MB - Last synced at: about 24 hours ago - Pushed at: about 24 hours ago - Stars: 0 - Forks: 0

deltafi/deltafi
DeltaFi is a flexible, code-light data transformation and normalization platform.
Language: Java - Size: 15 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

gloryodeyemi/SQL-Data-Warehouse
A comprehensive SQL Data Warehouse built from scratch using Azure Data Studio and SQL Server Express. It simulates an enterprise data pipeline using the Medallion Architecture and reflects industry best practices in Data Engineering, ETL design, and SQL-based data modeling.
Language: TSQL - Size: 12.4 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

goto/optimus Fork of raystack/optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
Language: Go - Size: 13 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 7 - Forks: 4

SUGHA22/Data_analysis
Actively upskilling in data science with hands-on learning during a Green Internship focused on environmental sustainability. Used Pandas and NumPy for data preprocessing and cleaning, and created visual dashboards in Excel and Tableau. Gained experience in interpreting sustainability metrics and communicating insights through data storytelling and
Language: Jupyter Notebook - Size: 1.07 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

lyrasis/kiba-extend
Extensions to Kiba ETL
Language: Ruby - Size: 13.2 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 6 - Forks: 0

panodata/tikray
A compact data transformation engine.
Language: Python - Size: 280 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

answerdigital/oxford-omop-data-mapper
A documentation-centric ETL tool, implementing transformations of a number of common clinical datasets into the OMOP CDM.
Language: C# - Size: 24.4 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 8 - Forks: 0

douglas-data-analyst/etl-power-bi-dashboard
ETL pipeline with Python and an interactive Power BI dashboard for sales analysis
Language: Jupyter Notebook - Size: 15.6 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

weAIDB/awesome-data-llm
Official Repository of "LLM × DATA" Survey Paper
Size: 48.8 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 320 - Forks: 26

apexDev37/excel-reader 📦
Utility module for the research team, RefreshMe, to translate Excel files to structured JSON.
Language: TypeScript - Size: 11.7 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

OsarohEkhoragbon/Office-Supplies-Sales-Analytics-and-Profitability-Insights-in-Power-BI
An end-to-end Power BI case study for Deskify Office Supply Co., analyzing sales, expenses, and profitability to drive growth through business intelligence.
Size: 3.18 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

data-integrations/wrangler
Wrangler Transform: A DMD system for transforming Big Data
Language: Java - Size: 6.15 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 105 - Forks: 1,244

markus-wa/cq
Clojure Query: A Command-line Data Processor for JSON, YAML, EDN, XML and more
Language: Clojure - Size: 202 KB - Last synced at: about 6 hours ago - Pushed at: 10 months ago - Stars: 178 - Forks: 11

mahmoud/glom
☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
Language: Python - Size: 1.27 MB - Last synced at: 10 days ago - Pushed at: 5 months ago - Stars: 2,016 - Forks: 68

ddeutils/ddeutil-extensions
🏗️ Dynamic data processing & transformation plugins
Language: Python - Size: 604 KB - Last synced at: 7 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

c-tech-studio/yamahiro-euc
Full-stack serverless POS data transformation system built with Vue.js, Node.js, and AWS CDK
Language: Vue - Size: 737 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

ToucanToco/weaverbird
A visual data pipeline builder with various backends
Language: TypeScript - Size: 45.3 MB - Last synced at: 8 days ago - Pushed at: 11 days ago - Stars: 103 - Forks: 16

crate/commons-codec
Data decoding, encoding, conversion, and translation utilities.
Language: Python - Size: 203 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 2

hi-primus/optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Language: Python - Size: 110 MB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 1,512 - Forks: 232

rafsanahmed28/Data-Cleaning-MySQL
This project solely focuses on Data Cleaning using only MySQL
Size: 4.88 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

NoxHarmonium/nanoweave
A data transformation tool akin to DataWeave or jq
Language: Clojure - Size: 1.46 MB - Last synced at: about 18 hours ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

microsoft/prose
Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Microsoft Program Synthesis using Example SDK.
Language: C# - Size: 81.6 MB - Last synced at: about 4 hours ago - Pushed at: 16 days ago - Stars: 643 - Forks: 99

dbohdan/sqawk
Like awk, but with SQL and table joins
Language: Tcl - Size: 574 KB - Last synced at: 11 days ago - Pushed at: 7 months ago - Stars: 315 - Forks: 14

Shreyansh9805/PwC-Switzerland-Job-Simulation
This collection of interactive Power BI dashboards for PwC Switzerland’s Job Simulation offers insights into HR, operational performance, and customer engagement. Users can analyze key KPIs and identify gender disparities in executive management.
Size: 4.74 MB - Last synced at: 17 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

comyata/comyata
Computable Data Templates, for browser or server. Use JSONata or any other engine. Store in YAML, JSON files or DBs.
Language: TypeScript - Size: 1.31 MB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

roboto-ai/robologs-ros-actions
A collection of actions for working with ROS data
Language: Shell - Size: 8.44 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 11 - Forks: 2

globaldothealth/adtl
Another data transformation language
Language: Python - Size: 1.34 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 2 - Forks: 0

1biot/FiQueLa
FiQueLa is a lightweight PHP library for querying structured files (JSON, XML, CSV, YAML, NEON) using SQL-like syntax with support for stream processing and advanced data functions.
Language: PHP - Size: 5.5 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 2 - Forks: 0

NellyCN/alura-store
Store sales analysis project for Alura Latam using Python and data visualization.
Language: Jupyter Notebook - Size: 7.32 MB - Last synced at: 24 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

NicolasKiryczun/sp500-project
Using MySQL and PowerBI, I created a dashboard to analyze the yearly returns of the S&P 500 Index
Size: 433 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 1 - Forks: 0

SebKrantz/collapse
Advanced and Fast Data Transformation in R
Language: C - Size: 110 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 681 - Forks: 34

simongray/clojure-dsl-resources
A curated list of Clojure resources for dealing with domain-specific languages.
Size: 109 KB - Last synced at: 8 days ago - Pushed at: 11 months ago - Stars: 182 - Forks: 5

wayofdev/laravel-symfony-serializer
🔧 Laravel + Symfony Serializer. This package provides a bridge between Laravel and Symfony Serializer.
Language: PHP - Size: 1.29 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 21 - Forks: 3

elsysigey/ALX-DATA-ANALYTICS-PROJECTS
Size: 75.2 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

2ndQuadrant/pglogical
Logical Replication extension for PostgreSQL 17, 16, 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
Language: C - Size: 1.81 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 1,109 - Forks: 163

ScriptFUSION/Porter
💄 Durable and asynchronous data imports for consuming data at scale and publishing testable SDKs.
Language: PHP - Size: 2.9 MB - Last synced at: 23 days ago - Pushed at: 4 months ago - Stars: 612 - Forks: 24

ludreinsalvador/global-covid-19-data-analysis
Contains Power BI dashboards that visualize and analyze global COVID-19 cases, deaths, and vaccination trends using data from the World Health Organization (WHO). The project aims to provide insights into the pandemic’s impact and vaccination progress worldwide through dynamic reports and advanced analytics.
Size: 11.1 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

nicosuave/awesome-sqlmesh
A curated list of awesome SQLMesh resources
Size: 25.4 KB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 34 - Forks: 1

lykmapipo/Python-Spark-Log-Analysis
Python scripts to process and analyze log files using PySpark.
Language: Python - Size: 131 KB - Last synced at: 18 days ago - Pushed at: 11 months ago - Stars: 6 - Forks: 0

lykmapipo/NYC-TLC-Trip-Data
Python scripts to download, process, and analyze the New York City Taxi and Limousine Commission (TLC) Trip Record Data dataset
Language: Jupyter Notebook - Size: 100 MB - Last synced at: 18 days ago - Pushed at: 10 months ago - Stars: 5 - Forks: 1

tawfikhammad/Data-cleaning-tutorial
In this repo you will learn how to deal with outliers, skewness, inconsistency, and date parsing, handle missing values with different techniques, and work with different data types (e.g. numerical and categorical data).
Language: Jupyter Notebook - Size: 2.39 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

jagzmz/json-to-cypher
Effortlessly map JSON data to Neo4j graphs. Define a schema once, and JSON2Cypher generates the Cypher CREATE and MERGE queries for nodes and relationships.
Language: TypeScript - Size: 2.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

Solrikk/MagicData
🧙‍♂️ MagicXML is a FastAPI-based service designed to fetch, process, and convert XML data into structured CSV files. It is optimized for handling large XML files by processing them in chunks asynchronously, making it suitable for heavy data processing tasks.
Language: Python - Size: 6.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0

bhrnjica/daany
Daany - .NET DAta ANalYtics library with implementations of DataFrame, time series decompositions, and the linear algebra routines BLAS and LAPACK.
Language: C# - Size: 31.5 MB - Last synced at: 25 days ago - Pushed at: about 1 month ago - Stars: 57 - Forks: 5

Estif-X/Complete-Data-Engineering-and-Analysis-Project
A team project by data engineers and data analysts, covering data extraction, ingestion, cleaning, transformation, and loading through to data analysis and visualization. We will use a variety of on-premise and cloud platforms to make this happen.
Size: 17.6 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

DaviMacielCavalcante/desafio1-prof-artemisia
🚀 ETL Challenge for Beginners: A project that explores ETL concepts with data manipulation in CSV and a complete CRUD using FastAPI and PostgreSQL. Learn to extract, transform, and load data into a relational database and create an API for managing information!
Language: Python - Size: 119 KB - Last synced at: 19 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

jim-schwoebel/allie
🤖 An automated machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers). Python 3.6 required.
Language: Python - Size: 275 MB - Last synced at: 26 days ago - Pushed at: 3 months ago - Stars: 141 - Forks: 35

jhd3197/Tukuy
Tukuy is a robust, extensible data transformation library that leverages a flexible plugin system. It simplifies the manipulation, validation, and extraction of data across multiple formats (text, HTML, JSON, dates, numbers, and more), making it an ideal tool for building data pipelines and cleaning workflows.
Language: Python - Size: 52.7 KB - Last synced at: 22 days ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

fastverse/fastverse
An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R
Language: R - Size: 10.2 MB - Last synced at: 26 days ago - Pushed at: about 1 month ago - Stars: 272 - Forks: 15

mahmoudparsian/big-data-mapreduce-course
Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University
Language: HTML - Size: 601 MB - Last synced at: 20 days ago - Pushed at: 7 months ago - Stars: 158 - Forks: 143

Yan-ni/job-market-analysis
Custom ELT pipeline for scraping job listings from 'Welcome to the Jungle' (France), transforming and cleaning the data, and visualizing it for job market analysis.
Language: Python - Size: 973 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

sumit-sinha9/Sales-Insight-Using-PowerBI-and-SQL
This repository contains a Power BI solution for AtliQ hardware, enabling comprehensive analysis of sales trends, informed decision-making, and revenue growth in the brick-and-mortar business.
Size: 0 Bytes - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

patricksferraz/cep2address
A high-performance Python tool for batch processing Brazilian postal codes (CEP) into complete addresses. Features parallel processing, multiple API sources, and flexible I/O formats. Perfect for data enrichment and address validation.
Language: Python - Size: 9.77 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 1

okomestudio/orgutils
The PKM helper for Emacs Org mode written in Python.
Language: Python - Size: 61.5 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

cjdoris/Chevrons.jl
Your friendly >> chevron >> based syntax for piping data through multiple transformations.
Language: Julia - Size: 56.6 KB - Last synced at: 28 days ago - Pushed at: 2 months ago - Stars: 41 - Forks: 0

Tynoee/Loan-Approval-Prediction
This project is a web application for predicting loan approval status based on various financial and personal attributes. It uses a machine learning model that I trained on historical loan data to make predictions. I built the web application using Flask for the web framework, SQLite for the database, and the pre-trained model saved with joblib.
Language: Jupyter Notebook - Size: 653 KB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

WillianMonteiro23/projetos-sql
This repository contains projects that use SQL for data analysis. Each project explores datasets, running queries to extract insights, transform data, and visualize results, demonstrating skills in database management and query optimization.
Language: TSQL - Size: 60.8 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ramonvermeulen/dbt-toolkit
The dbt-toolkit is an early-stage plugin designed to enhance your experience working with dbt-core projects in JetBrains IDEs.
Language: Kotlin - Size: 6.82 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 27 - Forks: 0

strengejacke/sjmisc
Data transformation and utility functions for R
Language: R - Size: 6.83 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 160 - Forks: 24

mahmoudparsian/data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Language: Python - Size: 44.9 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 215 - Forks: 93

PastorGL/datacooker-etl
ETL processing toolset with SQL-like language and GIS capabilities, built on core Spark. Extensible and modular. REPL included
Language: Java - Size: 11.1 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 16 - Forks: 0

VickyShapira/drug-discovery-project
Predicting RNA-binding activity of small molecules using machine learning. Includes data processing, feature analysis, and regression modeling.
Size: 4.08 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

ezvezdov/Dataset-Wrapper
NuScenes, Lyft, Waymo and a2d2 datasets parser.
Language: Python - Size: 476 KB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 0

raystack/optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
Language: Go - Size: 12.2 MB - Last synced at: 28 days ago - Pushed at: about 1 year ago - Stars: 748 - Forks: 154

AishwaryaGade02/Predicting-90thpercentile-of-annual-maximum-streamflow-using-Linear-Regression
Robust regression model that predicts the 90th percentile of annual max streamflow using Regression Analysis in R
Language: R - Size: 7.81 KB - Last synced at: 13 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

therionakkad/House-Price-Prediction
A Machine Learning project that predicts California house prices using Linear Regression and Random Forest. It includes data preprocessing, feature engineering, visualizations, and model evaluation with hyperparameter tuning using GridSearchCV.
Language: Python - Size: 397 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

dry-rb/dry-transformer
Data transformation toolkit
Language: Ruby - Size: 680 KB - Last synced at: 26 days ago - Pushed at: over 1 year ago - Stars: 74 - Forks: 9

matzalazar/tmdb-data-pipeline
ETL pipeline for extracting, transforming, and storing movie data from the TMDB API, using Delta Lake as the persistence layer.
Language: Jupyter Notebook - Size: 39.1 KB - Last synced at: about 16 hours ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Samuelson777/COVID-19-Data-Analysis-Project
COVID-19 Data Analysis Project: Interactive visualizations and insights into the pandemic's progression using Power BI. Explore trends, geographical comparisons, and the impact of public health interventions.
Size: 4.12 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Shuyib/chronic-kidney-disease-kaggle
Using machine learning models to predict if patients have chronic kidney disease based on a few features. The results of the models are also interpreted to make it more understandable to health practitioners.
Language: Jupyter Notebook - Size: 3.78 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 8 - Forks: 1

PragMath-Analytics/E2E-DataPipeline-SelfHosted-OpenSource
An end-to-end ELT project for self-hosted environments using open-source tools: PostgreSQL (database), Sling (ingestion), dbt (transformations), and Metabase (visualizations)
Language: Python - Size: 352 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

VikkiezDev/Walmart-Sales-Data-Analytics
This project analyzes Walmart sales data to uncover trends in customer behavior, product performance, and revenue generation. Using Python (pandas, matplotlib, seaborn), it explores key metrics like best-selling product lines, peak sales times, and customer demographics through interactive visualizations.
Language: Jupyter Notebook - Size: 547 KB - Last synced at: 19 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

chofste/ETL
Language: Python - Size: 3.07 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 4 - Forks: 0

ronpinkas/dbBridge
dbBridge is an 'SQL Migration Tool' - enabling import of SQL Databases from any supported Dialect (MsSql, MySql, Oracle, PostgreSQL, Sqlite) to any of these supported dialects with just three lines of PHP code.
Language: PHP - Size: 81.1 KB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 1

varunpravesh/Power-Bi-Dashboards
Includes the .pbix file for all the dashboards on my website
Size: 7.95 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

WillianMonteiro23/projetos-excel
Repository dedicated to data analysis using Excel. Covers everything from basic and advanced formulas to pivot tables, charts, filters, and data manipulation. Focused on solving real business problems with Excel.
Size: 2.3 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

swapnadb/Nyc-Taxi-Data-Engineering-Project
Language: Jupyter Notebook - Size: 2.94 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

BjornMelin/coinvert
🪙 Coinvert – Effortlessly transform your crypto transaction CSVs into TurboTax-compatible formats using Python and Polars. Simplify your crypto tax reporting today!
Size: 18.6 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

adexoxo13/Knime-Sweden-Energy-Weather
A KNIME workflow analyzing the relationship between Sweden's energy production and weather patterns using data from SCB and SMHI APIs.
Size: 4.78 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

nilportugues/php-serializer
Serialize PHP variables, including objects, in any format. Supports unserializing them too.
Language: PHP - Size: 142 KB - Last synced at: 11 days ago - Pushed at: almost 4 years ago - Stars: 51 - Forks: 19

shrynx/struct_morph
Macro for morphing one struct into another.
Language: Rust - Size: 6.84 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 7 - Forks: 1

LegallyNotBlonde/employee_department_analysis_using_postgresql
Executed structured SQL queries on a PostgreSQL database to analyze employee trends, departmental structure, and workforce KPIs across six related tables. Includes headcount forecasting based on recent data.
Language: Python - Size: 15.3 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

vasturiano/index-array-by
A utility function to index arrays by any criteria
Language: JavaScript - Size: 179 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 10 - Forks: 4

sushant1827/RAG-AI-Agent-2.0
AI-powered assistant that indexes Google Drive files to a vector store on upload and answers user queries based on the content.
Size: 6.84 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

blitz003/tamu_selfless_service_project
Data transformation project for The Bridge Ministries Food Pantry
Language: Python - Size: 9.77 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

kmatarese/glide
Easy ETL
Language: Python - Size: 576 KB - Last synced at: 26 days ago - Pushed at: almost 3 years ago - Stars: 18 - Forks: 2

PrathameshLakawade/Pipeline-Genie
Pipeline-Genie is an intelligent data pipeline that processes CSV datasets, identifies their schema, and leverages LLaMA 2.0 to extract business insights. Users can select relevant business needs, triggering automated ETL transformations using Apache Spark. The final transformed dataset is stored in AWS S3 and made available for download.
Language: Python - Size: 850 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

dataform-co/dataform-example-project
Example project on Dataform
Language: JavaScript - Size: 55.7 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 10 - Forks: 4

SatyaCoder29/CRM-Analytics-Power-BI
CRM Analytics Dashboard – An interactive dashboard using Tableau, SQL, and Salesforce CRM Analytics (CRMA) to analyze sales performance, customer segmentation, and churn prediction. Features automated ETL pipelines, predictive analytics, and real-time insights for data-driven decision-making. 🚀📊
Size: 17.8 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

muditbhargava66/macrodata-refinement
A robust Python toolkit for data refinement, validation, and transformation with strict type safety for numerical operations. Clean, validate, and transform your macrodata with confidence.
Language: Python - Size: 1 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

kgelli/PySpark-Fundamentals
A comprehensive collection of PySpark fundamentals with practical examples using retail and Formula 1 datasets.
Language: Jupyter Notebook - Size: 277 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

JoshuaMichaelHall-Tech/ruby-data-pipeline
Your ETL pipeline project
Size: 2.93 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

HirudikaAnupama/Predicting-Term-Deposit-Subscriptions
The purpose of this project is to help banks and financial institutions identify potential customers for term deposit subscriptions, optimize marketing strategies, and improve conversion rates using data-driven insights.
Language: Jupyter Notebook - Size: 5.05 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Ajeeb-Alameen/data-engineering
A Data Engineering project developed as part of my Postgraduate Diploma in Data Science at the German University in Cairo (GUC). This project focuses on cleaning, transforming, and automating data processing for a fintech loan dataset. Read more: https://github.com/Ajeeb-Alameen/data-engineering#readme
Language: Jupyter Notebook - Size: 100 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

pzaino/microETL
A simple, reusable, template-based ETL (Extract, Transform, and Load) library and framework written in Python
Language: Python - Size: 383 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 1
