Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: data-lineage
tuva-health/tuva
Main repo including core data model, data marts, reference data, terminology, and the clinical concept library
Size: 22.7 MB - Last synced: about 7 hours ago - Pushed: about 8 hours ago - Stars: 154 - Forks: 30
sergiomoraes/sergiomoraesblog
On this site I share personal thoughts about data, data governance, data quality, metadata, and side projects.
Language: Jupyter Notebook - Size: 87.1 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0
reata/sqllineage
SQL Lineage Analysis Tool powered by Python
Language: Python - Size: 9.11 MB - Last synced: 2 days ago - Pushed: 13 days ago - Stars: 1,145 - Forks: 209
elementary-data/elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Language: HTML - Size: 192 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1,725 - Forks: 144
brunocampos01/pyssas 📦
Automated build and deployment to SQL Server Analysis Services (SSAS) with Python.
Language: Python - Size: 315 KB - Last synced: 11 days ago - Pushed: over 2 years ago - Stars: 9 - Forks: 2
open-metadata/OpenMetadata
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
Language: TypeScript - Size: 1.3 GB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 4,168 - Forks: 837
grai-io/grai-core
Language: Python - Size: 119 MB - Last synced: about 11 hours ago - Pushed: about 12 hours ago - Stars: 270 - Forks: 20
maropu/spark-sql-flow-plugin
Visualize column-level data lineage in Spark SQL
Language: Scala - Size: 705 MB - Last synced: 11 days ago - Pushed: about 2 years ago - Stars: 80 - Forks: 15
data-drift/data-drift
Metrics Observability & Troubleshooting
Language: HTML - Size: 11.7 MB - Last synced: 19 days ago - Pushed: 3 months ago - Stars: 299 - Forks: 11
tuva-health/tuva_demo
A starter dbt project and synthetic claims dataset for trying out the Tuva Project.
Size: 1.98 MB - Last synced: 17 days ago - Pushed: 18 days ago - Stars: 12 - Forks: 6
MarquezProject/marquez
Collect, aggregate, and visualize a data ecosystem's metadata
Language: Java - Size: 44.8 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1,613 - Forks: 287
badoo/exasol-data-lineage
Exasol data lineage scripts
Language: Python - Size: 22.5 KB - Last synced: 28 days ago - Pushed: almost 3 years ago - Stars: 6 - Forks: 3
elementary-data/dbt-data-reliability
dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Language: Python - Size: 7.47 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 338 - Forks: 76
GitDataAI/jiaozifs
A Git-like version control file system for data lineage & data collaboration.
Language: Go - Size: 1.66 MB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 41 - Forks: 2
opendatadiscovery/odd-platform
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Language: Java - Size: 28.1 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1,104 - Forks: 91
vmware/versatile-data-kit
One framework to develop, deploy and operate data workflows with Python and SQL.
Language: Python - Size: 109 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 409 - Forks: 54
finos/waltz
Enterprise Information Service
Language: Java - Size: 55.8 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 163 - Forks: 126
google/grizzly
End-to-end DataOps platform deployed by Terraform.
Language: Python - Size: 112 MB - Last synced: 9 days ago - Pushed: 16 days ago - Stars: 56 - Forks: 10
slidoapp/dbt-superset-lineage
Make dbt docs and Apache Superset talk to one another
Language: Python - Size: 1.84 MB - Last synced: 4 days ago - Pushed: 30 days ago - Stars: 128 - Forks: 14
tokern/data-lineage
Generate and Visualize Data Lineage from query history
Language: Python - Size: 2.46 MB - Last synced: about 1 month ago - Pushed: 10 months ago - Stars: 295 - Forks: 41
Tinkoff/data-detective 📦
Data catalog for everything in your company
Language: Python - Size: 8.99 MB - Last synced: 3 months ago - Pushed: 12 months ago - Stars: 45 - Forks: 13
IBM/multi-data-lineage-capture-py
IBM Multi-Lineage Data System
Language: Python - Size: 237 KB - Last synced: 30 days ago - Pushed: about 1 year ago - Stars: 6 - Forks: 7
GoogleCloudPlatform/bigquery-data-lineage
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
Language: Java - Size: 405 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 132 - Forks: 37
tosh2230/stairlight
A data lineage tool that detects table dependencies from rendered SQL statements.
Language: Python - Size: 2.37 MB - Last synced: 2 days ago - Pushed: about 1 month ago - Stars: 26 - Forks: 1
tuva-health/medicare_cclf_connector
This connector is a dbt project that maps Medicare CCLF claims data to the Tuva Input Layer.
Size: 1010 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 12 - Forks: 12
pi2schema/pi2schema
Describe your Data Protection rules and Personal Identifying Information as part of your schema
Language: Java - Size: 528 KB - Last synced: 11 days ago - Pushed: 17 days ago - Stars: 9 - Forks: 2
tuva-health/medicare_lds_connector
Maps Medicare LDS claims data to the Tuva Input Layer so you can easily run the Tuva Project.
Size: 664 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 7 - Forks: 4
ahussein/ckanext-datalineage
A CKAN extension for providing and visualizing data lineage
Language: JavaScript - Size: 6.08 MB - Last synced: 10 months ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0
tuva-health/provider
A dbt project that transforms messy public provider datasets into usable data for the Tuva Project.
Size: 15.6 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 1 - Forks: 1
aws-samples/document-processing-pipeline-for-regulated-industries
A boilerplate solution for processing image and PDF documents for regulated industries, with lineage and pipeline operations metadata services.
Language: Python - Size: 11.4 MB - Last synced: 12 months ago - Pushed: over 2 years ago - Stars: 50 - Forks: 12
thestyleofme/data-lineage-parent
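Data lineage for Hive, Sqoop, HBase, Spark, and others: lineage events are sent to Kafka, then parsed and processed, with Neo4j used to generate the lineage.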
Language: Java - Size: 277 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 61 - Forks: 36
datascalehq/datascale
We help data teams ensure the quality of their SQL code and establish the traceability of their data.
Size: 1.95 KB - Last synced: 4 months ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0
GuinsooLab/darkseal
A Single place to Discover, Collaborate, and Get your data right
Language: TypeScript - Size: 272 MB - Last synced: 4 months ago - Pushed: about 1 year ago - Stars: 14 - Forks: 6
tosh2230/stairlight-app
A web application rendering table dependency graph with tosh2230/stairlight, using Graphviz, Streamlit and Google Cloud Run.
Language: Python - Size: 800 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 5 - Forks: 0
StatCan/pachyderm 📦
Data Lineage with End-to-End Pipelines on Kubernetes
Size: 3.91 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0
miotech/kun-scheduler
A workflow scheduler that understands both your data and metadata.
Language: Java - Size: 63.8 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 25 - Forks: 5
tomaztk/SQLServer-Data-Lineage
Data Lineage for Microsoft SQL Server, Azure SQL Server and Azure Synapse
Language: TSQL - Size: 86.9 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 11 - Forks: 6
metastore-developers/metastore
Metastore Python SDK. Feature store and data catalog for machine learning.
Language: Python - Size: 302 KB - Last synced: 17 days ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
michaelosthege/gittrail
Context manager for enforcing links between data pipeline outputs and git history.
Language: Python - Size: 41 KB - Last synced: 6 days ago - Pushed: about 2 years ago - Stars: 1 - Forks: 0
AbdullahMu/Data-Pipelines-with-Airflow
Schedule, automate, and monitor data pipelines using Apache Airflow. Run data quality checks, track data lineage, and work with data pipelines in production.
Language: Python - Size: 52.7 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 1
bballamudi/multi-data-lineage-capture-py (fork of IBM/multi-data-lineage-capture-py)
IBM Multi-Lineage Data System
Size: 83 KB - Last synced: 11 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0
dmartinpro/spuristo
Data Lineage
Language: Java - Size: 130 KB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 0 - Forks: 0