Topic: "great-expectations"
iusztinpaul/energy-forecasting
The Full Stack 7-Steps MLOps Framework | Learn MLE & MLOps for free by designing, building and deploying an end-to-end ML batch system ~ source code + 2.5 hours of reading & video materials
Language: Python - Size: 4.1 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 913 - Forks: 207

adidas/lakehouse-engine
The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.
Language: Python - Size: 8.79 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 256 - Forks: 45

josephmachado/data_engineering_best_practices
Sample project to demonstrate data engineering best practices
Language: Python - Size: 644 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 185 - Forks: 32

trannhatnguyen2/NYC_Taxi_Data_Pipeline
Nyc_Taxi_Data_Pipeline - DE Project
Language: Python - Size: 6.58 MB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 106 - Forks: 21

GokuMohandas/testing-ml
Learn how to create reliable ML systems by testing code, data and models.
Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 87 - Forks: 13

provectus/data-quality-gate
Data Quality Gate based on AWS
Language: Python - Size: 24 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 57 - Forks: 5

hoangsonww/End-to-End-Data-Pipeline
A scalable, production-ready data pipeline for real-time streaming & batch processing, integrating Kafka, Spark, Airflow, AWS, Kubernetes, and MLflow. Supports end-to-end data ingestion, transformation, storage, monitoring, and AI/ML serving with CI/CD automation using Terraform & GitHub Actions.
Language: Python - Size: 31.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 41 - Forks: 24

PrefectHQ/prefect-great-expectations
Prefect integrations for interacting with Great Expectations
Language: Python - Size: 2.18 MB - Last synced at: 3 days ago - Pushed at: 11 months ago - Stars: 28 - Forks: 3

NatanMish/data_validation
Tutorial for implementing data validation in data science pipelines
Language: Jupyter Notebook - Size: 16.3 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 26 - Forks: 10

moritzkoerber/covid-19-data-engineering-pipeline
A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via Github Actions.
Language: Python - Size: 1.31 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 5

dain55788/ELT-Data-Pipeline
ELT Data Pipeline implementation in Data Warehousing environment
Language: Jupyter Notebook - Size: 1.62 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 20 - Forks: 8

BirdiD/BirdiDQ
BirdiDQ leverages the power of the Python Great Expectations open-source library and combines it with the simplicity of natural language queries to effortlessly identify and report data quality issues, all at the tip of your fingers.
Language: Jupyter Notebook - Size: 539 MB - Last synced at: 10 days ago - Pushed at: about 2 years ago - Stars: 20 - Forks: 2

ismaildawoodjee/GreatEx
A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in Airflow.
Language: Python - Size: 1.73 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 20 - Forks: 6

josephmachado/data_engineering_best_practices_log
Code to demonstrate data engineering metadata & logging best practices
Language: Python - Size: 962 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 4

MDS-BD/hands-on-great-expectations-with-spark
How to evaluate the Quality of your Data with Great Expectations and Spark.
Language: Jupyter Notebook - Size: 54.7 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 16 - Forks: 2

grillazz/fastapi-greatexpectations
Run greatexpectations.io on any SQL engine through a REST API. Built on FastAPI, Pydantic and SQLAlchemy as a data quality tool.
Language: Python - Size: 7.51 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 13 - Forks: 0

serialbandicoot/great-assertions
This library is inspired by the Great Expectations library. The library has made the various expectations found in Great Expectations available when using the inbuilt python unittest assertions.
Language: Python - Size: 940 KB - Last synced at: 28 days ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 1

PbVrCt/nft-arbitrage
An ML pipeline to flip NFTs that makes use of the cloud and containers.
Language: Python - Size: 1.22 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 0

datarootsio/notion-dbs-data-quality
Using Great Expectations and Notion's API, this repo aims to provide data quality for our databases in Notion.
Language: Python - Size: 56.3 MB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 0

great-expectations/cloud
Source code for the gx cloud agent
Language: Python - Size: 5.81 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 8 - Forks: 3

adidas/lakehouse-engine-docs
The Goal of this project is to provide documentation for the Lakehouse Engine framework.
Language: HTML - Size: 14.1 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 7 - Forks: 5

phatnguyen080401/Real-Estate-Sale-Analytics
Create data pipeline using Lambda architecture with Spark, Kafka, Airflow and Snowflake
Language: Python - Size: 194 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 0

k3ai/plugins
A lightweight tool to get an AI infrastructure stack up in minutes, not days. K3ai will take care of setting up K8s for you, deploy the AI tool of your choice and even run your code on it.
Size: 208 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 2

xxl4tomxu98/data-engineering-python-great-expectations
Demo on Data Engineering using Great Expectations API
Language: Jupyter Notebook - Size: 6 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 4 - Forks: 0

anacision/kedro-expectations
Our fork of https://github.com/joao-pampanin/kedro-expectations "Tool to better integrate Kedro and Great Expectations" which supports newer versions of Kedro and Great Expectations, and has integrated some cool new features like Email alerts, delayed failure raising and performance gains.
Last synced at: 3 months ago - Stars: 4 - Forks: 1

JuanCampbsi/analytics_engineering_airbnb
In this project, dbt, Great Expectations, Python and Pandas were used to transform and validate the "Inside Airbnb" dataset. The tools ensure quality data, ready for analysis.
Language: HTML - Size: 81.5 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 3 - Forks: 2

KoenvdBerg/csv-validator
Validates tabular CSV data using predefined validations, inspired by its Python homologue "Great Expectations".
Language: Common Lisp - Size: 5.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 1

MagisterUnivers/undefined-Team-Project
A great project with top teammates. [...undefined] will break through the roof!!!
Language: SCSS - Size: 9.57 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

luchonaveiro/open-source-data-stack
Integrating Apache Airflow, dbt, Great Expectations and Apache Superset to develop a modern open source data stack.
Language: HTML - Size: 13.5 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 1

anilkulkarni87/databricks_notebooks
A collection of Databricks notebooks for testing and learning
Language: HTML - Size: 4.22 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

MagisterUnivers/Undefined-project-2
A second project with top teammates. [...undefined] will keep the quality, as always.
Language: JavaScript - Size: 64.4 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 1

luchonaveiro/great-expectations-postgres-tutorial
Tutorial using Great Expectations library, validating and profiling data on a local PostgreSQL database.
Language: HTML - Size: 3.68 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

aravinthsci/great-expectations-site
Dockerizing Data Docs autogenerated by Great Expectations using FastAPI Jinja templates.
Language: HTML - Size: 3.5 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

PbVrCt/time-series-pipeline
A pipeline to forecast the direction of stock prices using data from eodhistoricaldata.com
Language: Python - Size: 909 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 1

ismaildawoodjee/Great-Expectations-for-CSV
Ensuring data quality in an e-commerce data set using Great Expectations.
Language: Jupyter Notebook - Size: 14.7 MB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

ota2000/dataform-expectations
Porting "Great Expectations" to the Dataform package
Language: JavaScript - Size: 23.4 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

mjkanji/dagster-ge-demo
Demo showcasing how to validate your Dagster data pipelines using Great Expectations.
Language: Python - Size: 6.84 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 1

J6Software/Jan6Coin
The $JAN6 Commemorative Coin. The Only MEME that shouts FREEDOM!
Size: 704 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

sheoran19/yahoo-airflow-data-engineering-project
Yahoo Data Pipeline using Airflow
Language: Python - Size: 1.92 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

paulf-999/data_profiling_w_great_expectations
Bulk Data Profiling Solution using Great Expectations
Language: HTML - Size: 37.1 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

luatnc87/modern-data-warehouse-modeling-and-data-quality-with-dbt-openmetada
This repository serves as a comprehensive guide to effective data modeling and robust data quality assurance using popular open-source tools
Language: Python - Size: 3.28 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

project-amenhotep/PROJECT-AMENHOTEP
JOIN US!!! BE A PART OF THE GREATEST REVOLUTION.....
Language: HTML - Size: 1.13 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

brendajanuario/pipeline-bigdata-pyspark
Personal data engineering project whose objective is to create a Data Lakehouse for a B2B e-commerce business that must store its transactional and analytical data. The final system delivers structured, clean data for generating reports and finding opportunities.
Language: Python - Size: 5.38 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

gamberooni/imdb-etl
Self learning project using IMDb datasets
Language: Jupyter Notebook - Size: 2.82 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 1

aravinthsci/great_expectations
Examples
Language: Jupyter Notebook - Size: 973 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

JhuniorZenx/energy
Explore the Energy Research Framework, uniting advanced energy technologies and theoretical physics. Discover warp drive, fusion, and quantum gravity.
Language: HTML - Size: 155 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

AbhijeetDasBakshi/ecommerce-insights
A Dockerized end-to-end project that combines unsupervised machine learning for customer segmentation with scalable data pipelines. It uses MongoDB for data ingestion, Scikit-learn for clustering, Airflow for orchestration, and Streamlit for interactive visualization, enabling actionable insights into e-commerce
Language: Jupyter Notebook - Size: 657 KB - Last synced at: 22 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

PerseusRealDeal/TheTechnologicalTree
There is a tree greening by the ocean...
Size: 348 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

yahiazakaria445/End-to-End-Sales-Data-Pipeline-Modeling-and-Analytics
End-to-end data engineering project using Python, SQL, Snowflake, Power BI, and Great Expectations.
Language: Jupyter Notebook - Size: 1.65 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

firoz-ahmad-likhon/airflow-data-pipeline
An Airflow data pipeline that acquires data through an API, validates it using Great Expectations and stores it in Postgres.
Language: Python - Size: 70.3 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

firoz-ahmad-likhon/airflow-example
Sample project of dockerized Airflow to acquire data from API, validate using Great Expectations and sync to Postgres
Language: Python - Size: 142 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ArtemAntonov/Gearbox-Speed-Estimation-via-Vibration-Analysis
Gearbox speed estimation using vibration data transformed via FFT and a lightweight PyTorch CNN
Language: Jupyter Notebook - Size: 16.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

TheDataArtisanDev/Great-Expectations-Order-Data-Analysis
PySpark repository for order data analysis with Great Expectations.
Language: Jupyter Notebook - Size: 3.91 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

esteban-mendoza/dbt-project
Proof of concept project for dbt with Snowflake
Size: 3.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

slitayem/dbt-practice
Resources and scripts to get started with dbt
Language: Shell - Size: 74.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

jaredfiacco2/GreatExpectations
Example of Great Expectations - Manually Generate & Auto Generate Expectations
Language: Jupyter Notebook - Size: 104 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

makism/spark-gx-poc
Language: Python - Size: 19.5 KB - Last synced at: 11 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

paulfry-payroc/great_expectations
(Automated) bulk data profiling solution using Great Expectations (GX)
Language: HTML - Size: 3.99 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

makism/etl-playground
Size: 286 KB - Last synced at: 11 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

piyush-an/NYC-Restaurant-Inspection
A data warehousing application on NYC Food Inspection
Language: HTML - Size: 6.38 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

CassiaAlthman/Atividade-02
Exercise solutions for studying the module on validating data with Great Expectations, using the pandas and great_expectations libraries.
Language: Jupyter Notebook - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

krishna-aditi/devops-tools-tutorial-labs
Tutorials for DevOps tools such as Google Codelabs, Apache Airflow, Streamlit, FastAPI, Great Expectations, etc.
Language: Python - Size: 4.88 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

krishna-aditi/data-validation-using-great-expectations-and-xsv-querying
XSV queries on Amazon Musical Instruments Review and Great-Expectations library for Data Validation
Language: HTML - Size: 5.86 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

vmtl-adsk/spark-learning
Kafka-Spark jobs orchestrated with Airflow
Language: Python - Size: 3.39 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

RominaShin/Data_Pipeline_Notes
R&D around Data Engineering Tools
Language: Jupyter Notebook - Size: 91.8 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

FilipVan/great_expectations
Example projects using great expectations to validate data.
Language: HTML - Size: 6.22 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

etothexipi/sdsnv2020
State of Data Science Nevada Conference: Multi-track tutorial to create, provision, and version control AWS infrastructure to manage data pipelines effectively
Language: Python - Size: 106 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 2
