An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-observability

sodadata/soda-core

:zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io

Language: Python - Size: 3.91 MB - Last synced at: about 2 hours ago - Pushed at: about 7 hours ago - Stars: 2,085 - Forks: 234

elementary-data/elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

Language: HTML - Size: 205 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2,063 - Forks: 184

dqops/dqo

Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observability. Configure data quality checks from the UI or in YAML files, and let DQOps run them daily to detect data quality issues.

Language: Java - Size: 91 MB - Last synced at: 1 day ago - Pushed at: 3 days ago - Stars: 147 - Forks: 28

open-metadata/OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

Language: TypeScript - Size: 1.8 GB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 6,619 - Forks: 1,216

re-data/re-data

re_data - fix data issues before your users & CEO would discover them 😊

Language: HTML - Size: 76.5 MB - Last synced at: about 16 hours ago - Pushed at: about 1 year ago - Stars: 1,562 - Forks: 124

kiwicom/terraform-provider-montecarlo

This open-source Terraform provider enables users to seamlessly integrate the Monte Carlo data reliability platform into their infrastructure as code (IaC) workflows.

Language: Go - Size: 249 KB - Last synced at: about 17 hours ago - Pushed at: about 18 hours ago - Stars: 10 - Forks: 3

DataKitchen/data-observability-installer

Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.

Language: Python - Size: 358 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 117 - Forks: 10

open-metadata/openmetadata-site

Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.

Language: TypeScript - Size: 54.6 MB - Last synced at: 6 days ago - Pushed at: 8 days ago - Stars: 14 - Forks: 11

elementary-data/dbt-data-reliability

dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

Language: Python - Size: 7.73 MB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 439 - Forks: 103

DataKitchen/dataops-testgen

DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution through data profiling, new dataset hygiene review, AI generation of data quality validation tests, ongoing testing of data refreshes, and continuous anomaly monitoring.

Language: Python - Size: 5.23 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 55 - Forks: 3

DataKitchen/dataops-observability-agents

DataOps Observability Integration Agents are part of DataKitchen's Open Source Data Observability. They connect to various ETL, ELT, BI, data science, data visualization, data governance, and data analytic tools. They provide logs, messages, metrics, overall run-time start/stop, subtask status, and scheduling information to DataOps Observability.

Language: Python - Size: 249 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 29 - Forks: 1

siffletdata/terraform-provider-sifflet

Terraform provider for Sifflet, the data observability platform.

Language: Go - Size: 683 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 7 - Forks: 1

montara-io/dbt-command-center

Never sift through endless dbt™ logs again. dbt Command Center is a free, open-source, local web application that provides a user-friendly interface to monitor and manage dbt runs.

Language: TypeScript - Size: 3.55 MB - Last synced at: 9 days ago - Pushed at: 23 days ago - Stars: 28 - Forks: 0

datachecks/dcs-core

Open Source Data Quality Monitoring.

Language: Python - Size: 4.33 MB - Last synced at: 6 days ago - Pushed at: 28 days ago - Stars: 154 - Forks: 22

opendatadiscovery/odd-platform

First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.

Language: Java - Size: 27.9 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 1,306 - Forks: 122

oslabs-beta/DataDoc

Endpoint downtime detection, monitoring, and traffic simulation developer tool

Language: JavaScript - Size: 3.44 MB - Last synced at: 12 days ago - Pushed at: over 2 years ago - Stars: 64 - Forks: 1

data-drift/data-drift

Metrics Observability & Troubleshooting

Language: HTML - Size: 11.7 MB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 323 - Forks: 12

InfuseAI/piperider

Code review for data in dbt

Language: Python - Size: 32.6 MB - Last synced at: 26 days ago - Pushed at: 4 months ago - Stars: 487 - Forks: 23

opendatadiscovery/odd-collectors

Language: Python - Size: 2.02 MB - Last synced at: 25 days ago - Pushed at: 3 months ago - Stars: 9 - Forks: 11

DataKitchen/dataops-observability

DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from data source to customer value, from any team development environment into production, across every tool, team, environment, and customer so that problems are detected, localized, and understood immediately.

Language: Python - Size: 3.91 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 43 - Forks: 3

Swiple/swiple

Swiple enables you to easily observe, understand, validate and improve the quality of your data

Language: Python - Size: 180 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 82 - Forks: 11

re-data/dbt-re-data

re_data - fix data issues before your users & CEO would discover them 😊

Language: Python - Size: 4.12 MB - Last synced at: about 16 hours ago - Pushed at: about 1 year ago - Stars: 98 - Forks: 41

sodadata/soda-github-action

:zap: Prevent downstream data quality issues by integrating the Soda Library into your CI/CD pipeline.

Language: Python - Size: 47.9 KB - Last synced at: 11 days ago - Pushed at: 7 months ago - Stars: 14 - Forks: 0

opendatadiscovery/odd-collector 📦

Open-source metadata collector based on ODD Specification

Language: Python - Size: 1.96 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 43 - Forks: 13

JBris/marquez-test

Testing a Docker deployment of Marquez and OpenLineage

Language: Shell - Size: 19.5 KB - Last synced at: 14 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

JBris/openmetadata-test

Testing a Docker deployment of OpenMetadata for S3 data ingestion

Size: 12.7 KB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

annamatias/dataengineer

Trending code, platforms, tools, and processes.

Language: Python - Size: 307 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

datasphere-oss/datasphere

DataSphere is the first open-source cloud-native data observability platform that helps you trace the whole data infrastructure in your warehouses, lakes and databases.

Language: Java - Size: 119 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 4

sodadata/soda-spark

Soda Spark is a PySpark library that helps you test your data in Spark DataFrames

Language: Python - Size: 118 KB - Last synced at: 20 days ago - Pushed at: almost 3 years ago - Stars: 63 - Forks: 8

korawica/armored

Armored Models for Data Pipeline & Data Observability

Language: Python - Size: 85 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

cgnorthcutt/reliablity_framework_for_rag

Demo showing how the Trustworthy Language Model adds reliability to LLM outputs and improves RAG, agents, and data enrichment workflows. It can be used to improve fine-tuning of LLMs, accuracy of LLM outputs, and smart routing for RAG and agents.

Language: Jupyter Notebook - Size: 18.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 2

opendatadiscovery/odd-collector-sdk 📦

Language: Python - Size: 413 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

opendatadiscovery/odd-collector-gcp 📦

Open-source GCP metadata collector based on ODD Specification

Language: Python - Size: 188 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

Jaimeloeuf/Jevents

A simple to use EventEmitter and Data-Observer python package.

Language: Python - Size: 27.3 KB - Last synced at: 9 days ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

Swiple/swiple-action

Automatically validate datasets, poll task status, and display validation results in a GitHub pull request using Swiple.

Language: Python - Size: 28.3 KB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

GuinsooLab/stealthward

dbt-native framework built to observe the modern data stack

Language: HTML - Size: 1.11 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 0

SachinVarghese/pgamber

Data observability for PostgreSQL using alibi-detect

Language: Go - Size: 895 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

rishuatgithub/data-lin-observability

Data Lineage Observability Project

Language: Shell - Size: 17.6 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

mark-antal-csizmadia/em-simple

Expectation Maximization (EM) algorithm for estimating maximum likelihood (ML) parameters of partially observed data on a three-node Bayesian Network Probabilistic Graphical Model.

Language: Jupyter Notebook - Size: 9.77 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Related Keywords
data-observability (39), data-quality (16), python (11), dbt (11), data-engineering (10), data-governance (10), data-profiling (10), data-reliability (10), data-lineage (8), dataquality (8), data-science (8), snowflake (7), data-catalog (7), data-testing (7), data-quality-monitoring (7), data-quality-checks (7), data-monitoring (7), dataops (6), dbt-packages (6), data-pipelines (6), data-discovery (6), redshift (5), observability (5), hacktoberfest (5), data (5), data-validation (5), postgresql (4), lineage (4), data-pipeline (4), datacatalog (3), bigquery (3), data-analysis (3), monitoring (3), datatesting (3), data-platform (3), metadata-management (3), metadata (3), metrics (2), datachecker (2), dbt-metrics (2), datavalidation (2), mssql (2), pyspark (2), self-hosted (2), bigdata (2), etl (2), python3 (2), analytics (2), analytics-engineering (2), sql (2), data-exploration (2), data-contracts (2), data-quality-testing (2), data-unit-tests (2), pipeline-testing (2), data-warehouse (2), dbt-artifacts (2), data-ops (2), data-analytics (2), datadiscovery (2), swiple (2), automation (2), dataengineering (2), data-quality-framework (2), marquez (1), data-piplines (1), gcp (1), databricks (1), validation (1), airflow (1), elt (1), marquez-docker (1), fastapi (1), s3 (1), openmetadata-docker (1), reporting (1), pipleine-monitoring (1), openlineage (1), openmetadata (1), minio-docker (1), minio (1), openlineage-docker (1), sufficient-statistics (1), probabilistic-graphical-models (1), missing-at-random (1), expectation-maximization (1), hacktoberfest2021 (1), outlier-detection (1), data-modeling (1), data-alerting (1), events (1), eventemitter (1), event-driven-programming (1), open-data-discovery (1), rag (1), llms (1), data-curation (1), data-cleaning (1), chatgpt (1), models (1)