An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: missing-value-handling

awslabs/datawig

Imputation of missing values in tables.

Language: JavaScript - Size: 6.51 MB - Last synced at: 26 days ago - Pushed at: 12 months ago - Stars: 488 - Forks: 70

katerinaharana/Team-2-Project

Predicting the City Cycle Fuel Consumption in MPG of a Car. A Classification Problem

Language: Jupyter Notebook - Size: 26.6 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 2

oomeryk/Patient_Clustering_Project

Patient clustering with KMeans, Hierarchical and DBSCAN Clustering algorithms

Language: Jupyter Notebook - Size: 1.33 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

rahulvictor12/German-Bank-Loan-Defaulter-Prediction

A machine learning project to predict loan defaults in a German bank's customer base. Using the German Credit Risk dataset, it explores key factors contributing to defaults and trains models like Random Forest, GBM, and XGBoost. Includes EDA, data processing, hyperparameter tuning, and model evaluation.

Language: Jupyter Notebook - Size: 1.02 MB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

sajjad425/missingValue

This repository provides a guide on handling missing values in Python, covering identification methods, imputation techniques (mean, median, mode, fill, interpolation), advanced methods (KNN, multiple imputation), and best practices. It includes practical examples for both numerical and categorical data.

Language: Jupyter Notebook - Size: 22.5 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Mehnaz2004/Data-Cleaning-CaseStudy

This repository demonstrates data cleaning with a layoffs dataset. It covers handling missing values, detecting outliers, and encoding categorical data, using visualizations like boxplots and distplots to enhance data quality. Check out the code to see these techniques in action.

Language: Python - Size: 149 KB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

souravsuvarna/MissNoMore

MissNoMore is a Python-based missing value imputation tool designed to handle CSV datasets with missing data.

Language: Python - Size: 33.2 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

hrishabht5/Top-Movies-analysis-

This project utilizes Python for data preprocessing and analysis, along with Power BI for creating an interactive dashboard, to analyze trends and insights within the movie industry. The project encompasses data collection, cleaning, exploration, visualization, and interpretation to provide valuable insights into various aspects of the industry.

Language: Jupyter Notebook - Size: 1.73 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

AMRHiwa/Hotel_booking_Data_Exploration

In this repository, we intend to extract data from the mentioned dataset and display everything that seems interesting.

Language: Jupyter Notebook - Size: 4.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

jodiambra/ICE-Retail-EDA

Exploratory data analysis on ICE retail gaming store.

Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

abibatoki/Classification-Model

A model that predicts startup success from data on early-stage investments in the Crunchbase database.

Size: 82 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

DivyaKrishnani/Data-Preprocessing-with-Python

Implementation of Data Preprocessing techniques such as handling missing values, noise smoothing, PCA, etc.

Language: Jupyter Notebook - Size: 1.64 MB - Last synced at: 6 months ago - Pushed at: over 6 years ago - Stars: 8 - Forks: 11

prasadposture/Data-Preparation

There are a lot of things that need to be done on a given dataset before we feed it to the machine; these things come under data preprocessing. In this repository I have tried to explain them with some examples.

Language: Jupyter Notebook - Size: 420 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1

grahman20/kDMI

kDMI employs two levels of horizontal partitioning of a data set (based on a decision tree and the k-NN algorithm) to find the records that are most similar to a record with one or more missing values. Additionally, it uses a novel approach to automatically find the value of k for each record.

Language: Java - Size: 267 KB - Last synced at: 19 days ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

grahman20/FIMUS

FIMUS imputes numerical and categorical missing values by using a data set’s existing patterns including co-appearances of attribute values, correlations among the attributes and similarity of values belonging to an attribute.

Language: HTML - Size: 162 KB - Last synced at: 19 days ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

grahman20/SiMI

SiMI imputes numerical and categorical missing values by making an educated guess based on records that are similar to the record with a missing value. Missing values are then imputed using these similarities and correlations. To achieve a higher quality of imputation, some segments are merged together using a novel approach.

Language: Java - Size: 265 KB - Last synced at: 19 days ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

grahman20/DMI

The DMI class implements the DMI algorithm for imputing missing values in a dataset, from Rahman, M. G., and Islam, M. Z. (2013): "Missing Value Imputation Using Decision Trees and Decision Forests by Splitting and Merging Records: Two Novel Techniques".

Language: Java - Size: 21.5 KB - Last synced at: 19 days ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

ANikhilAgarwal/Analysis-Of-Google-Play-Store-Data

Language: Jupyter Notebook - Size: 694 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

gjorshoskaivana/MIDA-in-FCDBs

Repository containing the implementation of the models and experiments in the paper "Missing value imputation in Food Composition Data with Denoising Autoencoders"

Language: Jupyter Notebook - Size: 17.1 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

anikch/Telecom-churn-analysis-and-prediction

Analyze customer-level data of a leading telecom firm, build predictive models to identify customers at high risk of churn (usage-based churn) and identify the main indicators of churn.

Language: Jupyter Notebook - Size: 1.69 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

kajakgupta/Missing-Value-Treatment

Prevention and handling of missing data

Language: Jupyter Notebook - Size: 655 KB - Last synced at: 22 days ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 1

Related Keywords
missing-value-handling (21), data-science (8), missing-value-imputation (7), missing-values (6), data-cleaning (5), missing-data (5), preprocessing (5), data-mining (5), machine-learning (4), data-visualization (3), random-forest (3), exploratory-data-analysis (3), decision-tree (3), linear-regression (3), missing-data-imputation (3), missing-value-treatment (3), data-processing (2), xgboost (2), data (2), python (2), data-analysis (2), logistic-regression (2), data-cleansing (2), classification (2), eda (2), pca (2), imputation (2), telecom-churn-prediction (1), missing-data-treatment (1), encoding (1), ensemble-classifier (1), ensemble-model (1), data-analytics (1), scaling (1), outliers (1), label-encoding (1), groupby (1), duplicate-rows (1), dummy-variables (1), data-binning (1), smoothing (1), normalization (1), dispersion (1), data-preprocessing (1), binning (1), training-data (1), test-data (1), onehot-encoding (1), telecom-churn-analysis (1), rfe (1), food-composition (1), deep-learning (1), visualization (1), statistical-analysis (1), analysis-of-google-play-store (1), weka (1), missing (1), java (1), imputation-algorithm (1), expectation-maximization-algorithm (1), analysis (1), numerical-missing-value (1), decision-tree-classifier (1), decision-forest-algorithm (1), decision-forest (1), dataset (1), categorical-missing-value (1), similarity-measures (1), data-quality (1), correlation (1), co-appearance (1), outlier-detection-and-removal (1), clustering (1), data-integrity (1), categorical-data-encoding (1), dbscan (1), health (1), data-analysis-python (1), hierarchical (1), kmeans (1), recall (1), randomsearch-cv (1), patient (1), precision (1), modelevaluation (1), hyperparameter-tuning (1), gridsearchcv (1), gbm (1), f1-score (1), unsupervised-learning (1), accuracy (1), categorical-encoding (1), bagging (1), ada-boost-classifier (1), model-creation (1), isolation-forest (1), heatmap (1), t-test (1), scipy (1), profitability (1)