An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-quality-assessment

MigoXLab/dingo

Dingo: A Comprehensive AI Data Quality Evaluation Tool

Language: JavaScript - Size: 21.9 MB - Last synced at: about 12 hours ago - Pushed at: about 14 hours ago - Stars: 438 - Forks: 43

keshavksingh/dqu

Build intelligent, efficient, and trustable Data & ML Engineering Pipelines with this enterprise-ready Data Quality framework

Language: HTML - Size: 170 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

Bilpapster/stream-DaQ

🦆 Stream-first data quality monitoring in Python! Learn more: https://arxiv.org/abs/2506.06147

Language: Python - Size: 56.8 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 12 - Forks: 1

quantum-label/quantum_labelling_tool

Data quality, maturity and utility labelling tool for the EHDS (HealthData@EU)

Language: Python - Size: 1.86 MB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 8 - Forks: 2

qalita-io/packs

Qalita Public Packs

Language: Python - Size: 1.92 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

cbslneu/heartview

A signal quality assessment pipeline and dashboard for ambulatory cardiovascular data

Language: Python - Size: 70.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 10 - Forks: 0

isislab-unisa/KGHeartBeat

KGHeartBeat is a community-shared open-source knowledge graph quality assessment tool to perform quality analysis on a wide range of freely available knowledge graphs registered on the LOD cloud and DataHub. Web-App: http://www.isislab.it:12280/kgheartbeat/

Language: Python - Size: 173 MB - Last synced at: 21 days ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0

maltzsama/sumeh

Sumeh: Unified Data Quality Framework. Sumeh is a unified data quality validation framework supporting multiple backends (PySpark, Dask, Polars, DuckDB, Pandas) with centralized rule configuration.

Language: Python - Size: 1.69 MB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 3 - Forks: 0

KaisTahar/cvdDqChecker

CvdDqChecker: A Software Solution for Explainable and Traceable Assessments of Cardiovascular Disease Data Quality

Language: R - Size: 512 KB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

DesiLuis51/Customer-Analytics

Efficiently prepare customer data for modeling with a transformed DataFrame. Optimize data types for better performance. 📊💻

Language: Jupyter Notebook - Size: 2.23 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

YavinOwens/Data-Quality-Review-System

A GitHub repo of Python scripts for data engineering, featuring pip wheel management, high-speed fuzzy matching (RapidFuzz), data profiling (pandera, y_dataquality), and seamless Oracle DB connectivity with cx_Oracle and SQLAlchemy. Ideal for building robust, efficient, and modern data workflows.

Language: PLSQL - Size: 349 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

KaisTahar/dqLib (fork of medizininformatik-initiative/dqLib)

Data Quality Library (dqLib): An R Package for Traceable and Explainable Assessments of Clinical Data Quality

Language: R - Size: 427 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

predict-idlab/data-quality-challenges-wearables

Addressing Data Quality Challenges in Ambulatory Wrist-worn Wearable Monitoring Through Analytical and Practical Approaches

Language: Jupyter Notebook - Size: 15.1 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 2

Victorwz/MLM_Filter

Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".

Language: Python - Size: 30.7 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 51 - Forks: 1

joseph-ishola/KPMG-Data-Analysis-Virtual-Internship

This project involves analyzing customer data for Sprocket Central Pty Ltd. The goal is to optimize the company's marketing strategy. We will assess data quality, target high-value customers, and develop a data-driven marketing plan. By leveraging customer data, we aim to provide valuable insights and recommendations to drive business growth.

Size: 6.58 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 2 - Forks: 1

KaisTahar/cordDqChecker (fork of medizininformatik-initiative/cord-dq-checker)

A Set of Metrics and Tools for Data Quality Assessment and Reporting on Rare Diseases Data

Language: R - Size: 3.11 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

IMAbril/RENIS

Language: Jupyter Notebook - Size: 4.18 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

T0217/sdqcpy

SDQCPy is a comprehensive Python package designed for synthetic data management, quality control, and validation.

Language: Python - Size: 25.7 MB - Last synced at: 9 days ago - Pushed at: 11 months ago - Stars: 6 - Forks: 0

anitagraser/EDA-protocol-movement-data

Step-by-step exploratory movement data analysis protocol in a Jupyter notebook

Language: HTML - Size: 3.31 MB - Last synced at: 6 months ago - Pushed at: almost 4 years ago - Stars: 26 - Forks: 4

PeerNova-Solutions/cuneiformsf-reports-datahealth

... Cuneiform for Salesforce reporting library focusing on CRM Data Health. 110+ Data Health reports spanning 17 categories. 100% free to Cuneiform for Salesforce customers.

Size: 13.6 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

giorgosfatouros/IIoT-Data-Quality-Assessment

A service designed to efficiently analyze and assess the quality of high-frequency data collected from Industrial Internet of Things (IIoT) sensors. Dependencies: this app reads multiple sensor readings that monitor a machine from a LeanXcale database supporting energy-efficient and incremental analytics.

Language: Python - Size: 12.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Sarah-2510/KPMG-Virtual-Internship

This repository contains solutions to the 3 different tasks that must be performed during the data analytics virtual internship provided by KPMG via Forage.

Language: Jupyter Notebook - Size: 3.27 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 30 - Forks: 25

Abhishake-Patel/Process-Data-Analytics

PySpark and Python ML and Data Science Projects on a variety of Topics

Language: Jupyter Notebook - Size: 27.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

JoanyMarino/RPackages4DQA

Collection of R scripts for testing packages used to conduct data quality assessments

Language: HTML - Size: 62.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 2

Digital-Dermatology/SelfClean-Revised-Benchmarks

🧼🔎 SelfClean revised versions of benchmark datasets for more reliable performance estimation.

Size: 180 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

medizininformatik-initiative/kerndatensatzmodul-metadaten-datenqualitaet

This repository specifies methods and procedures for data quality questions.

Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ShubhamGTiwari/KPMG_Virtual_Internship

This repository contains the files for the KPMG Data Analytics Consulting Virtual Internship.

Language: Jupyter Notebook - Size: 12.5 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

VishalKrish70/Data-Analytics-Customer-Segmentation

In this project, an RFM model is implemented to relate to customers in each segment. Data quality was assessed, EDA was performed using Python, and a dashboard was created using Tableau.

Language: Jupyter Notebook - Size: 4.47 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ADVAIT135/Forage-KPMG-Data-Analytics-

This repository consists of all the Jupyter notebooks, images, and .CSV files for the tasks that were assigned during the KPMG Data Analytics Course hosted on Forage

Language: Jupyter Notebook - Size: 7.61 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

AbhishekGit-hash/Data-Analytics-Customer-Segmentation

In this project, an RFM model is implemented to relate to customers in each segment. Data quality was assessed, EDA was performed using Python, and a dashboard was created using Tableau.

Language: Jupyter Notebook - Size: 42.3 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 26 - Forks: 11

qalita-io/data-quality-platform

Data quality made simple

Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

KosiMbachuIgwe/KPMG-Data-Analysis-Internship

This project involves analyzing Sprocket Central Pty Ltd data to help the marketing department uncover useful insights that could help them optimize resource allocation for targeted marketing

Size: 7.12 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

aristwn97/KPMG-Data-Analysis-Transaction

This repository contains solutions for the data analytics virtual internship provided by KPMG via Forge Academy

Size: 3.91 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

TheVishwakarma/KPMG-AU-Data-Analytics-Job-Simulation-on-Forage

KPMG-Virtual-Internship: This repo contains all the solutions and resources for the data analytics virtual internship provided by KPMG via Forage

Language: Jupyter Notebook - Size: 3.52 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Adepeju-Oladapo/Data-Quality-Assessment-with-Python-and-SQL

This repository contains my solution to the first task assigned to me during my virtual internship with KPMG through Forage.

Language: Jupyter Notebook - Size: 335 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

dbis-trier-university/TunA

Tunable Query Optimizer for Web APIs and User Preferences

Language: Java - Size: 240 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Garett601/data-quality-reports

A function that automatically generates a Data Quality Report for your data

Language: Python - Size: 57.6 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 1

GuilhermeGors/Exploratory_Data_Analysis_for_Machine_Learning

This repository provides a practical guide to understanding your data, enabling you to make data-driven decisions and build accurate machine learning models

Language: Python - Size: 4.57 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Abhilashayagyaseni/KPMG-Data-Analysis-Project

Generate valuable insights from customer and transactions data.

Size: 2.98 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

dikshashub/Customer-Segmentation-analysis

In this project, an RFM model is implemented to relate to customers in each segment. Data quality was assessed and EDA was performed using Python.

Language: Jupyter Notebook - Size: 4.32 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

sohilamohey/KPMG-AU-Data-Analytics-virtual-internship

KPMG AU Data Analytics virtual internship

Language: Jupyter Notebook - Size: 202 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

baligoyem/dataqtor

🔍Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡📊🛠💎

Language: Python - Size: 9.43 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 14 - Forks: 6

seedatnabeel/Data-IQ

Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)

Language: Jupyter Notebook - Size: 14.1 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 2

manishkr1754/KPMG_AU_Data_Analytics_Consulting

Data Quality Assessment, Data Insights and Presentation

Language: Jupyter Notebook - Size: 10.8 MB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

curie-data-factory/health-data-metrics

Health Data Metrics (HDM), a data quality assessment application.

Language: PHP - Size: 4.71 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 1

interzoid/companynamesimkey-go

Generates a similarity key for a company/organization name, for matching inconsistent names within one or more datasets

Language: Go - Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

adiag321/Customer-Segmentation-for-Automobile-Company

The purpose of this project is to conduct a Customer Segmentation Analysis for an Automobile bike Company. Customer segmentation is performed by developing an RFM Model.

Language: Jupyter Notebook - Size: 49.3 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

DanielBayo/AdventureWorks_PowerBI_Dashboard

Provides sales trend visibility on a monthly, quarterly, and yearly basis.

Size: 9.44 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

DanielBayo/KPMG-Data-Analysis-Intenship

This project involves analyzing Sprocket Central Pty Ltd data to help the marketing department uncover useful insights that could help them optimize resource allocation for targeted marketing

Size: 9.37 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Related Keywords
data-quality-assessment (49), data-quality (16), data-visualization (15), data-quality-checks (9), data-analysis (8), data-quality-report (8), python (7), data-quality-monitoring (7), data-quality-measurement (7), exploratory-data-analysis (6), data-cleaning (6), tableau (6), powerbi (5), data-science (5), pandas (5), rfm-analysis (5), machine-learning (4), kpmg-virtual-internship (4), data-quality-analysis (4), dashboard (4), segmenting-customers (4), data-insights (4), analytics (3), data-quality-framework (3), forage (3), data-analytics (3), data-validation (3), python3 (3), data (3), customer-segmentation-analysis (3), data-exploration (2), business-analytics (2), data-centric-ai (2), business-intelligence (2), data-cleaning-and-preprocessing (2), model-development (2), quality (2), excel (2), dataquality (2), spark (2), customer-segmentation (2), data-profiling (2), data-modeling (2), kpmg (2), virtual-internship (2), power-point (2), actiwave (1), synthetic-data (1), data-quality-monitor (1), movement-data (1), plotly (1), apex (1), crm (1), data-cloud (1), data-insights-and-presentation (1), data-health (1), profiling-data (1), reports (1), salesforce (1), soql (1), power-bi (1), kpmg-au (1), data-visualisation (1), data-insi (1), iot-data-analytics (1), sensor-data-annotation (1), benchmarks (1), initial-data-analysis (1), pyspark-mllib (1), visualization (1), streamlit (1), data-centric (1), data-centric-machine-learning (1), deep-learning (1), responsible-ai (1), trustworthy-ai (1), data-preparation-and-analysis (1), matplotlib-pyplot (1), powerbi-report (1), seaborn (1), database (1), hdm (1), ai (1), api (1), companydata (1), companynames (1), go (1), go-package (1), golang (1), standardization (1), data-reporting (1), tableau-dashboards (1), data-visualization-project (1), powerquery (1), data-analyst (1), dataanalytics (1), powerpoint (1), python-library (1), report (1), sql (1)