GitHub topics: data-quality-assessment
Victorwz/MLM_Filter
Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".
Language: Python - Size: 30.7 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 51 - Forks: 1

quantum-label/quantum_labelling_tool
Data quality, maturity and utility labelling tool for the EHDS (HealthData@EU)
Language: Python - Size: 1.83 MB - Last synced at: 14 days ago - Pushed at: 16 days ago - Stars: 4 - Forks: 0

Bilpapster/stream-DaQ
A highly configurable, real-time data quality monitoring tool designed for streaming data
Language: Python - Size: 32.7 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 8 - Forks: 0

isislab-unisa/KGHeartBeat
KGHeartBeat is a community-shared open-source knowledge graph quality assessment tool to perform quality analysis on a wide range of freely available knowledge graphs registered on the LOD cloud and DataHub. Web-App: http://www.isislab.it:12280/kgheartbeat/
Language: Python - Size: 173 MB - Last synced at: 13 days ago - Pushed at: 15 days ago - Stars: 3 - Forks: 0

KaisTahar/cvdDqChecker
CvdDqChecker: A Software Solution for Traceable and Explainable Assessments of Cardiovascular Disease Data Quality
Language: R - Size: 279 KB - Last synced at: 13 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

joseph-ishola/KPMG-Data-Analysis-Virtual-Internship
This project involves analyzing customer data for Sprocket Central Pty Ltd. The goal is to optimize the company's marketing strategy. We will assess data quality, target high-value customers, and develop a data-driven marketing plan. By leveraging customer data, we aim to provide valuable insights and recommendations to drive business growth.
Size: 6.58 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 1

KaisTahar/dqLib (fork of medizininformatik-initiative/dqLib)
Data Quality Library (dqLib): An R Package for Traceable and Explainable Assessments of Clinical Data Quality
Language: R - Size: 408 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

qalita-io/packs
Qalita Public Packs
Language: Python - Size: 1.58 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

predict-idlab/data-quality-challenges-wearables
Addressing Data Quality Challenges in Ambulatory Wrist-worn Wearable Monitoring Through Analytical and Practical Approaches
Language: Jupyter Notebook - Size: 15.1 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 4 - Forks: 1

IMAbril/RENIS
Language: Jupyter Notebook - Size: 4.18 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

cbslneu/heartview
A signal quality assessment pipeline and dashboard for ambulatory cardiovascular data
Language: Python - Size: 61.8 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

T0217/sdqcpy
SDQCPy is a comprehensive Python package designed for synthetic data management, quality control, and validation.
Language: Python - Size: 25.7 MB - Last synced at: 16 days ago - Pushed at: 7 months ago - Stars: 6 - Forks: 0

anitagraser/EDA-protocol-movement-data
Step-by-step exploratory movement data analysis protocol in a Jupyter notebook
Language: HTML - Size: 3.31 MB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 26 - Forks: 4

PeerNova-Solutions/cuneiformsf-reports-datahealth
... Cuneiform for Salesforce reporting library focusing on CRM Data Health. 110+ Data Health reports spanning 17 categories. 100% free to Cuneiform for Salesforce customers.
Size: 13.6 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

giorgosfatouros/IIoT-Data-Quality-Assessment
A service designed to efficiently analyze and assess the quality of high-frequency data collected from Industrial Internet of Things (IIoT) sensors. Dependencies: this app reads multiple sensor readings that monitor a machine from a LeanXcale database supporting energy-efficient and incremental analytics.
Language: Python - Size: 12.5 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Sarah-2510/KPMG-Virtual-Internship
This repository contains solutions to the 3 different tasks that must be performed during the data analytics virtual internship provided by KPMG via Forage.
Language: Jupyter Notebook - Size: 3.27 MB - Last synced at: 11 months ago - Pushed at: about 2 years ago - Stars: 30 - Forks: 25

Abhishake-Patel/Process-Data-Analytics
PySpark and Python ML and data science projects on a variety of topics
Language: Jupyter Notebook - Size: 27.3 MB - Last synced at: 11 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

JoanyMarino/RPackages4DQA
Collection of R scripts to test packages in conducting data quality assessments
Language: HTML - Size: 62.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 2

Digital-Dermatology/SelfClean-Revised-Benchmarks
🧼🔎 SelfClean revised versions of benchmark datasets for more reliable performance estimation.
Size: 180 KB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

medizininformatik-initiative/kerndatensatzmodul-metadaten-datenqualitaet
This repository specifies methods and procedures for data quality questions.
Size: 1000 Bytes - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ShubhamGTiwari/KPMG_Virtual_Internship
This repository contains the files for the KPMG Data Analytics Consulting Virtual Internship.
Language: Jupyter Notebook - Size: 12.5 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

VishalKrish70/Data-Analytics-Customer-Segmentation
In this project, an RFM model is implemented to relate to customers in each segment. The project assesses data quality, performs EDA using Python, and creates a dashboard using Tableau.
Language: Jupyter Notebook - Size: 4.47 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ADVAIT135/Forage-KPMG-Data-Analytics-
This repository consists of all the Jupyter notebooks, images, and .CSV files for the tasks assigned during the KPMG Data Analytics course hosted on Forage.
Language: Jupyter Notebook - Size: 7.61 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

AbhishekGit-hash/Data-Analytics-Customer-Segmentation
In this project, an RFM model is implemented to relate to customers in each segment. The project assesses data quality, performs EDA using Python, and creates a dashboard using Tableau.
Language: Jupyter Notebook - Size: 42.3 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 26 - Forks: 11

qalita-io/data-quality-platform
Data quality made simple
Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

KosiMbachuIgwe/KPMG-Data-Analysis-Internship
This project involves analyzing Sprocket Central Pty Ltd data to help the marketing department uncover useful insights for optimizing resource allocation in targeted marketing.
Size: 7.12 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

aristwn97/KPMG-Data-Analysis-Transaction
This repository contains a solution to the data analytics virtual internship provided by KPMG via Forge Academy.
Size: 3.91 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

TheVishwakarma/KPMG-AU-Data-Analytics-Job-Simulation-on-Forage
KPMG-Virtual-Internship: This repo contains all the solutions and resources for the data analytics virtual internship provided by KPMG via Forage
Language: Jupyter Notebook - Size: 3.52 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Adepeju-Oladapo/Data-Quality-Assessment-with-Python-and-SQL
This repository contains my solution to the first task assigned to me during my virtual internship with KPMG through Forage.
Language: Jupyter Notebook - Size: 335 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

dbis-trier-university/TunA
Tunable Query Optimizer for Web APIs and User Preferences
Language: Java - Size: 240 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Garett601/data-quality-reports
A function that automatically generates a Data Quality Report for your data
Language: Python - Size: 57.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 1

GuilhermeGors/Exploratory_Data_Analysis_for_Machine_Learning
This repository provides a practical guide to understanding your data, enabling you to make data-driven decisions and build accurate machine learning models
Language: Python - Size: 4.57 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Abhilashayagyaseni/KPMG-Data-Analysis-Project
Generate valuable insights from customer and transactions data.
Size: 2.98 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

dikshashub/Customer-Segmentation-analysis
In this project, an RFM model is implemented to relate to customers in each segment. The project assesses data quality and performs EDA using Python.
Language: Jupyter Notebook - Size: 4.32 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

sohilamohey/KPMG-AU-Data-Analytics-virtual-internship
KPMG AU Data Analytics virtual internship
Language: Jupyter Notebook - Size: 202 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

baligoyem/dataqtor
🔍Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡📊🛠💎
Language: Python - Size: 9.43 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 6

seedatnabeel/Data-IQ
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)
Language: Jupyter Notebook - Size: 14.1 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 2

manishkr1754/KPMG_AU_Data_Analytics_Consulting
Data quality assessment, data insights, and presentation
Language: Jupyter Notebook - Size: 10.8 MB - Last synced at: 27 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

curie-data-factory/health-data-metrics
Health Data Metrics (HDM), a data quality assessment application.
Language: PHP - Size: 4.71 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 1

interzoid/companynamesimkey-go
Generates a similarity key for a company or organization name, for matching inconsistent names within one or more datasets
Language: Go - Size: 5.86 KB - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

adiag321/Customer-Segmentation-for-Automobile-Company
The purpose of this project is to conduct a customer segmentation analysis for an automobile bike company. Customer segmentation is performed by developing an RFM model.
Language: Jupyter Notebook - Size: 49.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

DanielBayo/AdventureWorks_PowerBI_Dashboard
Provides sales trend visibility on a monthly, quarterly, and yearly basis.
Size: 9.44 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

DanielBayo/KPMG-Data-Analysis-Intenship
This project involves analyzing Sprocket Central Pty Ltd data to help the marketing department uncover useful insights for optimizing resource allocation in targeted marketing.
Size: 9.37 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0
