Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-pipelines

dagster-io/dagster

An orchestration platform for the development, production, and observation of data assets.

Language: Python - Size: 952 MB - Last synced: about 3 hours ago - Pushed: about 4 hours ago - Stars: 10,320 - Forks: 1,283

unicef/magasin

Cloud native open-source end-to-end data / AI / ML platform

Language: Mustache - Size: 18.6 MB - Last synced: about 4 hours ago - Pushed: about 1 month ago - Stars: 4 - Forks: 2

dataform-co/dataform

Dataform is a framework for managing SQL based data operations in BigQuery

Language: TypeScript - Size: 15.7 MB - Last synced: about 3 hours ago - Pushed: about 4 hours ago - Stars: 793 - Forks: 146

mycelial/mycelial

Move your data with ease.

Language: Rust - Size: 1.51 MB - Last synced: about 1 hour ago - Pushed: about 5 hours ago - Stars: 70 - Forks: 9

brunocampos01/data-engineering

Language: Python - Size: 165 MB - Last synced: about 15 hours ago - Pushed: about 15 hours ago - Stars: 11 - Forks: 2

artie-labs/transfer

Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift) in real-time.

Language: Go - Size: 11.3 MB - Last synced: about 21 hours ago - Pushed: about 21 hours ago - Stars: 536 - Forks: 24

elementary-data/elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

Language: HTML - Size: 192 MB - Last synced: 26 days ago - Pushed: 26 days ago - Stars: 1,725 - Forks: 144

tuva-health/tuva

Main repo including core data model, data marts, reference data, terminology, and the clinical concept library

Size: 23.2 MB - Last synced: about 21 hours ago - Pushed: about 22 hours ago - Stars: 153 - Forks: 30

terrytangyuan/awesome-kubeflow

A curated list of awesome projects and resources related to Kubeflow (a CNCF incubating project)

Size: 234 KB - Last synced: 3 days ago - Pushed: 13 days ago - Stars: 181 - Forks: 15

infinyon/fluvio

Lean and mean distributed stream processing system written in Rust and WebAssembly.

Language: Rust - Size: 22.9 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 2,284 - Forks: 166

AnthonyByansi/Airflow-Data-Pipeline-Automation

Automate your data pipelines using Apache Airflow with this ready-to-use DAG for data integration, ETL and workflow automation.

Language: Python - Size: 15.6 KB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 7 - Forks: 0

Galileo-Galilei/kedro-pandera

A kedro plugin to use pandera in your kedro projects

Language: Python - Size: 213 KB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 30 - Forks: 2

apicrafter/datacrafter

NoSQL extract, transform, load (ETL) toolkit with Python

Language: Python - Size: 453 KB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 11 - Forks: 3

bruin-data/bruin

Bruin is a data pipeline tool that is designed to be easy-to-use. It allows building data pipelines using SQL and Python, and has built-in data quality checks.

Language: Go - Size: 22.8 MB - Last synced: 27 days ago - Pushed: 28 days ago - Stars: 46 - Forks: 1

meltano/meltano

Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.

Language: Python - Size: 135 MB - Last synced: 9 days ago - Pushed: 10 days ago - Stars: 1,598 - Forks: 143

conductor-sdk/conductor-python

Conductor OSS SDK for Python programming language

Language: Python - Size: 1.28 MB - Last synced: 7 days ago - Pushed: 8 days ago - Stars: 50 - Forks: 25

goto/optimus Fork of raystack/optimus

Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.

Language: Go - Size: 26.3 MB - Last synced: about 10 hours ago - Pushed: 1 day ago - Stars: 3 - Forks: 1

rafaelvargas/bytebridge

A data tool designed to move data seamlessly between various sources and destinations.

Language: Python - Size: 46.9 KB - Last synced: 8 days ago - Pushed: 9 days ago - Stars: 0 - Forks: 1

aquemy/DOLAP_2019_supplementary_material

Supplementary material for DOLAP 2019 submission

Size: 5.04 MB - Last synced: 9 days ago - Pushed: over 5 years ago - Stars: 1 - Forks: 0

cybergeekgyan/Data-Engineering-Portfolio

Data Engineering portfolio projects, resources used to study data tools...

Language: Jupyter Notebook - Size: 2.92 MB - Last synced: 9 days ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

dataflint/spark

Performance Observability for Apache Spark

Language: TypeScript - Size: 18.6 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 125 - Forks: 9

data-engineering-community/data-engineering-wiki

The best place to learn data engineering. Built and maintained by the data engineering community.

Language: CSS - Size: 7.59 MB - Last synced: 9 days ago - Pushed: about 1 month ago - Stars: 1,032 - Forks: 103

kevin-hanselman/dud

A lightweight CLI tool for versioning data alongside source code and building data pipelines.

Language: Go - Size: 3.31 MB - Last synced: 10 days ago - Pushed: 11 days ago - Stars: 166 - Forks: 6

recap-build/recap

Work with your web service, database, and streaming schemas in a single format.

Language: Python - Size: 1.41 MB - Last synced: 7 days ago - Pushed: about 1 month ago - Stars: 306 - Forks: 24

infiniflow/ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python - Size: 19.4 MB - Last synced: 13 days ago - Pushed: 14 days ago - Stars: 5,952 - Forks: 499

KayvanShah1/usc-dsci560-dspp-sp24

USC DSCI 560 - Data Science Professional Practicum - Spring 2024 - Prof. Young Cho

Language: Python - Size: 50.1 MB - Last synced: 12 days ago - Pushed: 16 days ago - Stars: 0 - Forks: 0

mage-ai/mage-ai

🧙 Build, run, and manage data pipelines for integrating and transforming data.

Language: Python - Size: 170 MB - Last synced: 27 days ago - Pushed: 27 days ago - Stars: 6,940 - Forks: 616

dataplane-app/dataplane

Dataplane is an Airflow inspired unified data platform with additional data mesh and RPA capability to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a React front end.

Language: JavaScript - Size: 274 MB - Last synced: 10 days ago - Pushed: 4 months ago - Stars: 184 - Forks: 30

kiwicom/terraform-provider-montecarlo

This open-source Terraform provider enables users to seamlessly integrate the Monte Carlo data reliability platform into their infrastructure as code (IaC) workflows.

Language: Go - Size: 230 KB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 8 - Forks: 0

apache/dolphinscheduler

Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code

Language: Java - Size: 199 MB - Last synced: 18 days ago - Pushed: 18 days ago - Stars: 11,997 - Forks: 4,411

tuva-health/tuva_demo

A starter dbt project and synthetic claims dataset for trying out the Tuva Project.

Size: 1.98 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 12 - Forks: 6

AiDAPT-A/VisArchPy

pipelines for the extraction and processing of visuals from PDFs

Language: Python - Size: 3.78 MB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 3 - Forks: 1

kestra-io/examples

Best practices for data workflows, integrations with the Modern Data Stack (MDS), Infrastructure as Code (IaC), Cloud Provider Services

Language: HCL - Size: 1.92 MB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 9 - Forks: 3

BogdanFloris/detecting-and-addressing-change

Code for my Master Thesis: How to detect and address changes in machine learning based data pipelines

Language: Python - Size: 151 KB - Last synced: 17 days ago - Pushed: 10 months ago - Stars: 3 - Forks: 0

giacbrd/SmartPipeline

A framework for rapid development of robust data pipelines following a simple design pattern

Language: Python - Size: 393 KB - Last synced: 14 days ago - Pushed: 2 months ago - Stars: 22 - Forks: 2

mpolinowski/apache-airflow-intro

Introduction to Apache Airflow

Language: Python - Size: 9.77 KB - Last synced: 20 days ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

CofluxLabs/coflux

Open-source workflow engine. Orchestrate and observe computational workflows defined in plain Python. Suitable for data pipelines, background tasks, chat bots.

Language: Elixir - Size: 3.61 MB - Last synced: 17 days ago - Pushed: 20 days ago - Stars: 4 - Forks: 0

DidactHQ/didact

The open source, standalone, fullstack .NET job orchestrator that we've been missing.

Size: 14.6 KB - Last synced: 22 days ago - Pushed: 6 months ago - Stars: 30 - Forks: 0

elementary-data/dbt-data-reliability

dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

Language: Python - Size: 7.47 MB - Last synced: 26 days ago - Pushed: 26 days ago - Stars: 338 - Forks: 76

Unstructured-IO/unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Language: HTML - Size: 124 MB - Last synced: 24 days ago - Pushed: 25 days ago - Stars: 5,819 - Forks: 424

opendatadiscovery/odd-platform

First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.

Language: Java - Size: 28.1 MB - Last synced: 28 days ago - Pushed: 28 days ago - Stars: 1,104 - Forks: 91

tsdat/tsdat

Time series data utilities for declaratively applying standardization, Q/C, and transformations to datastreams.

Language: Python - Size: 144 MB - Last synced: 28 days ago - Pushed: 28 days ago - Stars: 11 - Forks: 7

CogStack/CogStack-NiFi

Building data processing pipelines for document processing with NLP using Apache NiFi and related services

Language: Python - Size: 74.9 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 31 - Forks: 16

vmware/versatile-data-kit

One framework to develop, deploy and operate data workflows with Python and SQL.

Language: Python - Size: 109 MB - Last synced: 27 days ago - Pushed: 28 days ago - Stars: 409 - Forks: 54

apache/airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Language: Python - Size: 264 MB - Last synced: 27 days ago - Pushed: 27 days ago - Stars: 34,343 - Forks: 13,504

SciPhi-AI/R2R

The framework for fast development and deployment of RAG backends.

Language: Python - Size: 19.5 MB - Last synced: 30 days ago - Pushed: 30 days ago - Stars: 1,103 - Forks: 92

smart-data-lake/smart-data-lake

Smart Automation Tool for building modern Data Lakes and Data Pipelines

Language: Scala - Size: 36.2 MB - Last synced: 28 days ago - Pushed: 28 days ago - Stars: 92 - Forks: 21

orchest/orchest

Build data pipelines, the easy way 🛠️

Language: TypeScript - Size: 27.2 MB - Last synced: 26 days ago - Pushed: 11 months ago - Stars: 4,019 - Forks: 251

bakdata/streams-explorer

Explore Apache Kafka data pipelines in Kubernetes.

Language: Python - Size: 3.87 MB - Last synced: 2 days ago - Pushed: about 1 month ago - Stars: 44 - Forks: 4

linkedin/Hoptimator

Multi-hop declarative data pipelines

Language: Java - Size: 332 KB - Last synced: 25 days ago - Pushed: about 1 month ago - Stars: 74 - Forks: 12

combust/mleap

MLeap: Deploy ML Pipelines to Production

Language: Scala - Size: 3.32 MB - Last synced: 9 days ago - Pushed: 6 months ago - Stars: 1,494 - Forks: 313

beneath-hq/beneath

Beneath is a serverless real-time data platform ⚡️

Language: Go - Size: 11 MB - Last synced: 18 days ago - Pushed: about 2 years ago - Stars: 81 - Forks: 9

DidactHQ/didact-engine

The REST API and execution engine for the Didact Platform.

Language: C# - Size: 238 KB - Last synced: 22 days ago - Pushed: about 2 months ago - Stars: 44 - Forks: 0

glassflow/cli

GlassFlow CLI to create and manage data pipelines

Language: Shell - Size: 20.5 KB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 6 - Forks: 0

DidactHQ/didact-ui

The VueJS single-page app dashboard for the Didact Platform.

Language: Vue - Size: 764 KB - Last synced: 22 days ago - Pushed: 25 days ago - Stars: 11 - Forks: 0

GoogleCloudPlatform/public-datasets-pipelines

Cloud-native, data onboarding architecture for Google Cloud Datasets

Language: Python - Size: 7.12 MB - Last synced: 23 days ago - Pushed: 24 days ago - Stars: 136 - Forks: 61

iesahin/xvc

A robust (🐢) and fast (🐇) MLOps tool for managing data and pipelines in Rust (🦀)

Language: Rust - Size: 5.12 MB - Last synced: about 23 hours ago - Pushed: 1 day ago - Stars: 22 - Forks: 0

fmind/mlops-python-package

Kickstart your MLOps initiative with a flexible, robust, and productive Python package.

Language: Jupyter Notebook - Size: 1.26 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 206 - Forks: 24

srenegado/paintings-data

A Python ETL pipeline with a Postgres data warehouse for modeling art inventory.

Language: Python - Size: 528 KB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 0 - Forks: 0

DataCater/datacater 📦

The developer-friendly ETL platform for transforming data in real-time. Based on Apache Kafka® and Kubernetes®.

Language: JavaScript - Size: 4.08 MB - Last synced: 17 days ago - Pushed: 9 months ago - Stars: 81 - Forks: 3

datajoint/datajoint-python

Relational data pipelines for the science lab

Language: Python - Size: 16.1 MB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 161 - Forks: 82

marcio-azevedo/fsharp-data-processing-pipeline

Provides an extensible solution for creating Data Processing Pipelines in F#.

Language: F# - Size: 352 KB - Last synced: 12 days ago - Pushed: about 6 years ago - Stars: 15 - Forks: 1

AnanthaRajuC/DataPractitioner

Data Practitioner

Language: Python - Size: 1010 KB - Last synced: 24 days ago - Pushed: 2 months ago - Stars: 3 - Forks: 0

koolreport/core

An Open Source PHP Reporting Framework that helps you to write perfect data reports or to construct awesome dashboards in PHP. Working great with all PHP versions from 5.6 to latest 8.0. Fully compatible with all kinds of MVC frameworks like Laravel, CodeIgniter, Symfony.

Language: PHP - Size: 2.56 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 151 - Forks: 34

Multiwoven/multiwoven-server

The backend control plane for Multiwoven, built using Ruby on Rails & Temporal.

Language: Ruby - Size: 8.2 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 11 - Forks: 4

todofixthis/filters

🤔 What if we took the UNIX philosophy and applied it to input validation?

Language: Python - Size: 553 KB - Last synced: 1 day ago - Pushed: 7 months ago - Stars: 1 - Forks: 3

rcgsheffield/airbods

AIRBODS data pipelines and storage

Language: Python - Size: 262 KB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

tuva-health/FHIR_inferno

Connector that loads FHIR r4 USCDIv3 JSON data from local file storage into the Tuva common data model in Snowflake.

Language: Python - Size: 82 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 13 - Forks: 7

raystack/optimus

Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.

Language: Go - Size: 13.2 MB - Last synced: 2 months ago - Pushed: 6 months ago - Stars: 735 - Forks: 153

KyleZrey/data-pipeline

Creation of data pipeline using Jupyter Notebook, PostgreSQL, and Apache Airflow.

Language: Jupyter Notebook - Size: 9.74 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

Snehil-Shah/Seismic-Alerts-Streamer

A Realtime Seismic Logging & Alerts Service with Live Monitoring & Email Alerts made using Kafka Data Pipelines, all Dockerized & Deployment Ready!

Language: Java - Size: 11.1 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 2 - Forks: 0

arakat-community/arakat 📦

ARAKAT - Big Data Analysis and Business Intelligence Application Development Platform

Language: Python - Size: 31.6 MB - Last synced: 2 months ago - Pushed: almost 3 years ago - Stars: 26 - Forks: 21

mackelab/epiphyte

Python toolkit for working with high-dimensional neural data recorded during naturalistic, continuous stimuli @a-darcher @rachrapp

Language: Jupyter Notebook - Size: 191 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 3 - Forks: 1

leotech-dev/leoflow

A set of plugins (mappers, sinks, etc.) for Numaflow pipelines

Language: Go - Size: 11.7 KB - Last synced: 3 months ago - Pushed: 5 months ago - Stars: 2 - Forks: 0

allamiro/Data-Pipelines

Everything about designing, installing, and implementing data pipelines, including Kafka, ZooKeeper, and Hadoop. If you enjoy my content, please consider supporting what I do. Thank you.

Language: Jinja - Size: 4.45 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 1 - Forks: 0

itsame-mcl/data-pypeline

Pure Python 3 data wrangling tools with support for pipelines

Language: Python - Size: 24.9 MB - Last synced: 3 months ago - Pushed: about 1 year ago - Stars: 2 - Forks: 1

DataDrivenGit/Music-Streaming-App-using-AWS-ETL

Implemented a data warehouse and data lake on AWS, with data modeling in Postgres and Apache Cassandra. Also used Apache Airflow to create a data pipeline.

Language: Jupyter Notebook - Size: 725 KB - Last synced: about 1 month ago - Pushed: almost 4 years ago - Stars: 4 - Forks: 3

jmoussa/go-sentitweet

CLI Application holding a sentiment analysis data (Twitter tweets) pipeline with its own Web API to query results in the database. Written entirely in Go.

Language: Go - Size: 13.4 MB - Last synced: 4 months ago - Pushed: about 2 years ago - Stars: 1 - Forks: 1

zkan/introduction-to-data-pipelines-and-apache-airflow

Introduction to Data Pipelines and Apache Airflow

Language: Python - Size: 134 KB - Last synced: 26 days ago - Pushed: about 2 months ago - Stars: 3 - Forks: 9

Sibusiso-Gumede/supermarket-scraper

A data extraction program that is a component of an ETL data pipeline. The program scrapes product promotion data from supermarket websites.

Language: Python - Size: 465 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

patterns-app/patterns-devkit

Data pipelines from re-usable components

Language: Python - Size: 1.75 MB - Last synced: 28 days ago - Pushed: about 1 year ago - Stars: 106 - Forks: 5

mxagar/data_engineering_guide

Personal notes on the IBM Data Engineering Certificate as well as other sources focusing on AWS.

Size: 2.93 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

mdh266/AirflowDataPipeline

Example of an ETL Pipeline using Airflow

Language: Python - Size: 14.6 KB - Last synced: 24 days ago - Pushed: over 6 years ago - Stars: 31 - Forks: 19

MattTriano/analytics_data_where_house

An analytics engineering sandbox focusing on real estate prices in Cook County, IL

Language: Python - Size: 15.7 MB - Last synced: 4 months ago - Pushed: 7 months ago - Stars: 7 - Forks: 0

thecodemancer/Apache-Beam

🔥👨‍💻 Build Big data pipelines with Apache Beam in any language and run it via Spark, Flink, GCP (Google Cloud Dataflow).

Language: Jupyter Notebook - Size: 321 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

tara-nguyen/modern-data-architecture

Follow along with materials in the book "Modern Data Architectures with Python: A practical guide to building and deploying data pipelines, data warehouses and data lakes" (Lipp, 2023)

Language: Jupyter Notebook - Size: 33.2 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

AnthonyByansi/Rust-Exploratorium

🚀 Master Rust programming with this comprehensive roadmap! Explore fundamental and advanced concepts, code examples, and resources.

Language: Rust - Size: 38.1 KB - Last synced: 3 months ago - Pushed: 7 months ago - Stars: 8 - Forks: 0

larribas/dagger

Define sophisticated data pipelines with Python and run them on different distributed systems (such as Argo Workflows).

Language: Python - Size: 9.99 MB - Last synced: 5 days ago - Pushed: about 2 months ago - Stars: 13 - Forks: 5

tuva-health/medicare_cclf_connector

This connector is a dbt project that maps Medicare CCLF claims data to the Tuva Input Layer.

Size: 1010 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 12 - Forks: 12

vanderschaarlab/temporai-mivdp

TemporAI-MIVDP: Adaptation of MIMIC-IV-Data-Pipeline for TemporAI

Language: Python - Size: 1.85 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 1 - Forks: 0

anna-geller/kestra-ci-cd

CI/CD repository template to automate deployments of your production flows

Language: HCL - Size: 96.7 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 4 - Forks: 2

minyansh7/DisasterResponseProject

Build a web application that classifies a large set of messages into 36 categories to be sent to the relevant disaster relief agencies, and helps disaster workers classify new messages.

Language: Jupyter Notebook - Size: 37.8 MB - Last synced: 6 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

opendatadiscovery/odd-collector-gcp 📦

Open-source GCP metadata collector based on ODD Specification

Language: Python - Size: 188 KB - Last synced: 4 months ago - Pushed: 8 months ago - Stars: 4 - Forks: 0

rcorrero/light-pipe

A high-level syntax for data pipelines, designed to make pipeline development quick and painless.

Language: Python - Size: 1.5 MB - Last synced: 13 days ago - Pushed: 11 months ago - Stars: 3 - Forks: 1

StrictlySkyler/harbormaster Fork of luzlab/harbormaster-apache 📦

A framework for microservices

Language: JavaScript - Size: 1.82 MB - Last synced: 26 days ago - Pushed: 6 months ago - Stars: 3 - Forks: 5

electronick1/stepist

Framework for data processing

Language: Python - Size: 865 KB - Last synced: 6 days ago - Pushed: over 4 years ago - Stars: 27 - Forks: 5

Elkinmt19/airflow-master

This is a repo that was created to learn more about Airflow and develop awesome data engineering projects. 🚀🚀

Language: Python - Size: 3.33 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 4 - Forks: 3

tuva-health/medicare_lds_connector

Maps Medicare LDS claims data to the Tuva Input Layer so you can easily run the Tuva Project.

Size: 664 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 7 - Forks: 4

shravan-kuchkula/udacity-data-eng-proj-1

Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation, validation and loading of data from S3 -> Redshift -> S3

Language: Python - Size: 3.47 MB - Last synced: 7 months ago - Pushed: over 2 years ago - Stars: 88 - Forks: 58

projectmesadata/cropyield

Creates a data pipeline from the Famine Land Data Assimilation DataSet (FLDAS) to seed model terrain and assess the potential crop yield for a variety of crops.

Language: Jupyter Notebook - Size: 134 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 3 - Forks: 3