An open API service providing repository metadata for many open source software ecosystems.

Topic: "great-expectations"

iusztinpaul/energy-forecasting

🌀 The Full Stack 7-Steps MLOps Framework | Learn MLE & MLOps for free by designing, building and deploying an end-to-end ML batch system ~ source code + 2.5 hours of reading & video materials

Language: Python - Size: 4.1 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 913 - Forks: 207

adidas/lakehouse-engine

The Lakehouse Engine is a configuration-driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.

Language: Python - Size: 8.79 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 256 - Forks: 45

josephmachado/data_engineering_best_practices

Sample project to demonstrate data engineering best practices

Language: Python - Size: 644 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 185 - Forks: 32

trannhatnguyen2/NYC_Taxi_Data_Pipeline

Nyc_Taxi_Data_Pipeline - DE Project

Language: Python - Size: 6.58 MB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 106 - Forks: 21

GokuMohandas/testing-ml

Learn how to create reliable ML systems by testing code, data and models.

Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 87 - Forks: 13

provectus/data-quality-gate

Data Quality Gate based on AWS

Language: Python - Size: 24 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 57 - Forks: 5

hoangsonww/End-to-End-Data-Pipeline

📈 A scalable, production-ready data pipeline for real-time streaming & batch processing, integrating Kafka, Spark, Airflow, AWS, Kubernetes, and MLflow. Supports end-to-end data ingestion, transformation, storage, monitoring, and AI/ML serving with CI/CD automation using Terraform & GitHub Actions.

Language: Python - Size: 31.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 41 - Forks: 24

PrefectHQ/prefect-great-expectations

Prefect integrations for interacting with Great Expectations

Language: Python - Size: 2.18 MB - Last synced at: 3 days ago - Pushed at: 11 months ago - Stars: 28 - Forks: 3

NatanMish/data_validation

Tutorial for implementing data validation in data science pipelines

Language: Jupyter Notebook - Size: 16.3 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 26 - Forks: 10

moritzkoerber/covid-19-data-engineering-pipeline

A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.

Language: Python - Size: 1.31 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 5

dain55788/ELT-Data-Pipeline

ELT Data Pipeline implementation in Data Warehousing environment

Language: Jupyter Notebook - Size: 1.62 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 20 - Forks: 8

BirdiD/BirdiDQ

BirdiDQ leverages the power of the Python Great Expectations open-source library and combines it with the simplicity of natural language queries to effortlessly identify and report data quality issues, all at the tip of your fingers.

Language: Jupyter Notebook - Size: 539 MB - Last synced at: 10 days ago - Pushed at: about 2 years ago - Stars: 20 - Forks: 2

ismaildawoodjee/GreatEx

A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in Airflow.

Language: Python - Size: 1.73 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 20 - Forks: 6

josephmachado/data_engineering_best_practices_log

Code to demonstrate data engineering metadata & logging best practices

Language: Python - Size: 962 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 4

MDS-BD/hands-on-great-expectations-with-spark

How to evaluate the Quality of your Data with Great Expectations and Spark.

Language: Jupyter Notebook - Size: 54.7 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 16 - Forks: 2

grillazz/fastapi-greatexpectations

Run greatexpectations.io on any SQL engine through a REST API. Built on FastAPI, Pydantic and SQLAlchemy as a data quality tool.

Language: Python - Size: 7.51 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 13 - Forks: 0

serialbandicoot/great-assertions

This library is inspired by the Great Expectations library. The library has made the various expectations found in Great Expectations available when using the inbuilt python unittest assertions.

Language: Python - Size: 940 KB - Last synced at: 28 days ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 1

PbVrCt/nft-arbitrage

An ML pipeline to flip NFTs that makes use of the cloud and containers.

Language: Python - Size: 1.22 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 0

datarootsio/notion-dbs-data-quality

Using Great Expectations and Notion's API, this repo aims to provide data quality for our databases in Notion.

Language: Python - Size: 56.3 MB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 0

great-expectations/cloud

Source code for the gx cloud agent

Language: Python - Size: 5.81 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 8 - Forks: 3

adidas/lakehouse-engine-docs

The goal of this project is to provide documentation for the Lakehouse Engine framework.

Language: HTML - Size: 14.1 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 7 - Forks: 5

phatnguyen080401/Real-Estate-Sale-Analytics

Create data pipeline using Lambda architecture with Spark, Kafka, Airflow and Snowflake

Language: Python - Size: 194 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 0

k3ai/plugins 📦

A lightweight tool to get an AI infrastructure stack up in minutes, not days. K3ai will take care of setting up K8s for you, deploy the AI tool of your choice and even run your code on it.

Size: 208 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 2

xxl4tomxu98/data-engineering-python-great-expectations

Demo on Data Engineering using Great Expectations API

Language: Jupyter Notebook - Size: 6 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 4 - Forks: 0

anacision/kedro-expectations

Our fork of https://github.com/joao-pampanin/kedro-expectations "Tool to better integrate Kedro and Great Expectations" which supports newer versions of Kedro and Great Expectations, and has integrated some cool new features like Email alerts, delayed failure raising and performance gains.

Last synced at: 3 months ago - Stars: 4 - Forks: 1

JuanCampbsi/analytics_engineering_airbnb

In this project, dbt, Great Expectations, Python and Pandas were used to transform and validate the "Inside Airbnb" dataset. The tools ensure quality data, ready for analysis.

Language: HTML - Size: 81.5 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 3 - Forks: 2

KoenvdBerg/csv-validator

Validates tabular CSV data using predefined validations, inspired by its Python counterpart "Great Expectations".

Language: Common Lisp - Size: 5.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 1

MagisterUnivers/undefined-Team-Project

A great project with top teammates. [...undefined] will break through the roof!!!

Language: SCSS - Size: 9.57 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

luchonaveiro/open-source-data-stack

Integrating Apache Airflow, dbt, Great Expectations and Apache Superset to develop a modern open source data stack.

Language: HTML - Size: 13.5 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 1

anilkulkarni87/databricks_notebooks

A collection of Databricks notebooks for testing and learning

Language: HTML - Size: 4.22 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

MagisterUnivers/Undefined-project-2

A second project with top teammates. [...undefined] will keep the quality, as always.

Language: JavaScript - Size: 64.4 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 1

luchonaveiro/great-expectations-postgres-tutorial

Tutorial using Great Expectations library, validating and profiling data on a local PostgreSQL database.

Language: HTML - Size: 3.68 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

aravinthsci/great-expectations-site

Dockerizing Data Docs autogenerated by Great Expectations using FastAPI Jinja templates.

Language: HTML - Size: 3.5 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

PbVrCt/time-series-pipeline

A pipeline to forecast the direction of stock prices using data from eodhistoricaldata.com

Language: Python - Size: 909 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 1

ismaildawoodjee/Great-Expectations-for-CSV

Ensuring data quality in an e-commerce data set using Great Expectations.

Language: Jupyter Notebook - Size: 14.7 MB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

ota2000/dataform-expectations

Porting "Great Expectations" to the Dataform package

Language: JavaScript - Size: 23.4 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

mjkanji/dagster-ge-demo

Demo showcasing how to validate your Dagster data pipelines using Great Expectations.

Language: Python - Size: 6.84 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 1

J6Software/Jan6Coin

The $JAN6 Commemorative Coin. The Only MEME that shouts FREEDOM!

Size: 704 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

sheoran19/yahoo-airflow-data-engineering-project

Yahoo Data Pipeline using Airflow

Language: Python - Size: 1.92 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

paulf-999/data_profiling_w_great_expectations

Bulk Data Profiling Solution using Great Expectations

Language: HTML - Size: 37.1 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

luatnc87/modern-data-warehouse-modeling-and-data-quality-with-dbt-openmetada

This repository serves as a comprehensive guide to effective data modeling and robust data quality assurance using popular open-source tools

Language: Python - Size: 3.28 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

project-amenhotep/PROJECT-AMENHOTEP

JOIN US!!! BE A PART OF THE GREATEST REVOLUTION.....

Language: HTML - Size: 1.13 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

brendajanuario/pipeline-bigdata-pyspark

Personal data engineering project whose objective is to create the Data Lakehouse for a B2B e-commerce business that must store its transactional and analytical data. The final system delivers structured and clean data for generating reports and finding opportunities.

Language: Python - Size: 5.38 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

gamberooni/imdb-etl

Self learning project using IMDb datasets

Language: Jupyter Notebook - Size: 2.82 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 1

aravinthsci/great_expectations

Examples

Language: Jupyter Notebook - Size: 973 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

JhuniorZenx/energy

Explore the Energy Research Framework, uniting advanced energy technologies and theoretical physics. Discover warp drive, fusion, and quantum gravity. 🌌🔋

Language: HTML - Size: 155 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

AbhijeetDasBakshi/ecommerce-insights

A Dockerized end-to-end project that combines unsupervised machine learning for customer segmentation with scalable data pipelines. It uses MongoDB for data ingestion, Scikit-learn for clustering, Airflow for orchestration, and Streamlit for interactive visualization, enabling actionable insights into e-commerce

Language: Jupyter Notebook - Size: 657 KB - Last synced at: 22 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

PerseusRealDeal/TheTechnologicalTree

There is a tree greening by the ocean...

Size: 348 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

yahiazakaria445/End-to-End-Sales-Data-Pipeline-Modeling-and-Analytics

End-to-end data engineering project using Python, SQL, Snowflake, Power BI, and Great Expectations.

Language: Jupyter Notebook - Size: 1.65 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

firoz-ahmad-likhon/airflow-data-pipeline

An Airflow data pipeline that acquires data through an API, validates it using Great Expectations and stores it in Postgres.

Language: Python - Size: 70.3 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

firoz-ahmad-likhon/airflow-example

Sample project of dockerized Airflow to acquire data from API, validate using Great Expectations and sync to Postgres

Language: Python - Size: 142 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ArtemAntonov/Gearbox-Speed-Estimation-via-Vibration-Analysis

Gearbox speed estimation using vibration data transformed via FFT and a lightweight PyTorch CNN

Language: Jupyter Notebook - Size: 16.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

TheDataArtisanDev/Great-Expectations-Order-Data-Analysis

PySpark repository for order data analysis with Great Expectations.

Language: Jupyter Notebook - Size: 3.91 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

esteban-mendoza/dbt-project

Proof of concept project for dbt with Snowflake

Size: 3.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

slitayem/dbt-practice

Resources and scripts to get started with dbt

Language: Shell - Size: 74.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

jaredfiacco2/GreatExpectations

Example of Great Expectations - Manually Generate & Auto Generate Expectations

Language: Jupyter Notebook - Size: 104 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

makism/spark-gx-poc

Language: Python - Size: 19.5 KB - Last synced at: 11 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

paulfry-payroc/great_expectations

(Automated) bulk data profiling solution using Great Expectations (GX)

Language: HTML - Size: 3.99 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

makism/etl-playground

Size: 286 KB - Last synced at: 11 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

piyush-an/NYC-Restaurant-Inspection

A data warehousing application on NYC Food Inspection

Language: HTML - Size: 6.38 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

CassiaAlthman/Atividade-02

Exercise solutions for studying the module "Validating data with Great Expectations", using the pandas and great_expectations libraries.

Language: Jupyter Notebook - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

krishna-aditi/devops-tools-tutorial-labs

Tutorials for DevOps tools such as Google Codelabs, Apache Airflow, Streamlit, FastAPI, Great Expectations, etc.

Language: Python - Size: 4.88 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

krishna-aditi/data-validation-using-great-expectations-and-xsv-querying

XSV queries on Amazon Musical Instruments Review and Great-Expectations library for Data Validation

Language: HTML - Size: 5.86 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

vmtl-adsk/spark-learning

Kafka-Spark jobs orchestrated with Airflow

Language: Python - Size: 3.39 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

RominaShin/Data_Pipeline_Notes

R&D around Data Engineering Tools

Language: Jupyter Notebook - Size: 91.8 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

FilipVan/great_expectations

Example projects using great expectations to validate data.

Language: HTML - Size: 6.22 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

etothexipi/sdsnv2020

State of Data Science Nevada Conference: Multi-track tutorial to create, provision, and version control AWS infrastructure to manage data pipelines effectively

Language: Python - Size: 106 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 2