GitHub topics: data-cleansing
oreseckoa/Analysis-of-CRM-system-data
This project is dedicated to analyzing CRM system data for an online programming school.
Size: 3.27 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

trumhacker-cyber/Data-Analytics-Certificate
This repository showcases my journey in the fascinating world of Data Analytics.
Size: 19 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

data-forge/data-forge-ts
The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Language: TypeScript - Size: 3.68 MB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 1,359 - Forks: 78

hi-primus/optimus
Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Language: Python - Size: 110 MB - Last synced at: 7 days ago - Pushed at: 6 months ago - Stars: 1,512 - Forks: 233

Desbordante/desbordante-core
Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It can also run data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.
Language: C++ - Size: 143 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 403 - Forks: 76

probcomp/PClean
A domain-specific probabilistic programming language for scalable Bayesian data cleaning
Language: Julia - Size: 1.36 MB - Last synced at: 15 days ago - Pushed at: 10 months ago - Stars: 223 - Forks: 33

DataPreprocessing/DataCleaning
Data Cleaning is a Python package for data preprocessing. It cleans a CSV file and returns the cleaned data frame, handling imputation, duplicate removal, special-character replacement, and more.
Language: Python - Size: 117 KB - Last synced at: 23 days ago - Pushed at: about 4 years ago - Stars: 8 - Forks: 3

marvrch/Titanic-ExploratoryDataAnalysis
This project focuses on cleaning and analyzing the Titanic dataset using Python. It explores patterns in the data through exploratory data analysis (EDA) and highlights the importance of data cleaning in preparing datasets for further analysis or machine learning.
Language: Jupyter Notebook - Size: 16.3 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

vaxdata22/Water-Quality-DW-on-Oracle-Database
This is an Oracle DB data warehouse and manual ETL demo on a specially formatted water quality dataset from DEFRA, UK. It is a personal exercise exploring the basic concepts of data warehousing and the manual ETL process from an academic perspective.
Language: Jupyter Notebook - Size: 380 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

bluestero/urlgenie
Python package to make URL extraction, generalization, validation, and filtration easy.
Language: Python - Size: 204 KB - Last synced at: 25 days ago - Pushed at: 12 months ago - Stars: 4 - Forks: 1

Rudra-G-23/SQL-Data-Warehouse-Project
This repo provides a step-by-step approach to building a modern data warehouse using PostgreSQL. It covers the ETL (Extract, Transform, Load) process, data modeling, exploratory data analysis (EDA), and advanced data analysis techniques.
Language: PLpgSQL - Size: 9.32 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

saya304/Data-Cleaning-and-Exploratory-Data-Analysis
This project focuses on data cleaning and exploratory data analysis (EDA) in Snowflake, transforming raw data into meaningful insights using SQL
Size: 89.8 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

jasonh119/TransactionTracker
A small application for transaction aggregation, cleaning, and categorisation, built for learning DS and LLMs
Language: Python - Size: 297 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

SamHollings/nhs_data_cleansing
A repo of reusable functions for cleansing data
Language: Python - Size: 52.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

data-integrations/wrangler
Wrangler Transform: A DMD system for transforming Big Data
Language: Java - Size: 5.75 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 91 - Forks: 56

TimKong21/PwC-Switzerland-Power-BI-in-Data-Analytics-Virtual-Case-Experience
Comprehensive Power BI dashboards showcasing insights on Call Centre Trends, Customer Retention, and Diversity & Inclusion to drive business impact.
Size: 4.43 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 14 - Forks: 5

bakdata/dedupe
Java DSL for (online) deduplication
Language: Java - Size: 1.01 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 20 - Forks: 2

HypertextAssassin0273/Excel_Data_Organizer_and_Cleaner-DS_Project
Data Structures project in C++11 that uses custom Vector and String structures with move semantics (Rule of Five)
Language: C++ - Size: 1.39 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 2

Ashbyt/Python
Ashley Bythell - Python
Language: Jupyter Notebook - Size: 5.61 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

Skunkworks-Labs/data-management
The website is now described as an educational resource for data management, with the objective of educating, engaging, guiding, and providing resources.
Language: HTML - Size: 199 KB - Last synced at: 9 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

BDFD-LearningGround/Cousera_Google-Data-Analytics-Professional-Certificate
Quizzes & Assignment Solutions for Google Data Analytics Professional Certificate on Coursera. Also includes a few resources on the side that I found helpful.
Size: 38.2 MB - Last synced at: 10 months ago - Pushed at: about 3 years ago - Stars: 197 - Forks: 55

AhmdLx/PPP_loans_Analysis
Data Cleaning, Exploration, and Insights
Language: TSQL - Size: 534 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

brunocampos01/porto-seguro-safe-driver-prediction
Predict if a driver will file an insurance claim next year. (Kaggle Competition)
Language: Python - Size: 93.8 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 11 - Forks: 5

vbhvsingh0/CDC_immunization
This project explores the relationships between different vaccines and sex, age, and other basic features in the data.
Language: Python - Size: 2.72 MB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

ClimerLab/mrclean
Two Mixed Integer Programs for cleaning a data file.
Language: C++ - Size: 43.9 KB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

decimus01/Spotify_songs_data_analysis
Analysis of songs from the period 18 October 2024 to 1 May 2024 from Spotify data.
Language: Jupyter Notebook - Size: 861 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

KevinVChin/Google-Data-Analytics-Professional-Certificate
The Google Data Analytics Professional Certificate program teaches how to clean and organize data for analysis, and how to complete analysis and calculations using spreadsheets, SQL, Tableau, and R programming.
Language: HTML - Size: 1.97 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

LieseB-1746743/data-cleaning
Data cleaning tool.
Language: JavaScript - Size: 1.81 MB - Last synced at: 12 months ago - Pushed at: about 4 years ago - Stars: 9 - Forks: 5

fpjnijweide/autoencoder-pdb-cleaning
This is the source code for the paper "A probabilistic database approach to autoencoder-based data cleaning".
Language: Jupyter Notebook - Size: 242 MB - Last synced at: 17 days ago - Pushed at: almost 4 years ago - Stars: 4 - Forks: 0

Rahma-Farag/Rahma-Farag
Main Repository
Language: Jupyter Notebook - Size: 71.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

aminkhod/TA--Course-ofData-mining--Fall-2018 📦
Implementations and usage of methods covered in the Topics on Data Mining course
Language: Python - Size: 32.1 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 1

Abhigyan76/Pizza-Sales-Insight
Used SQL and Power BI to build an insightful dashboard
Size: 2.24 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

JoeRegnier/horkos
Data quality analysis and scoring system.
Language: Python - Size: 3.39 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 2

tneriaransom/data-analysis-portfolio
This repository houses a curated collection of projects designed to highlight my expertise in data analytics.
Size: 39.1 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

RizqiSeijuuro/final-project-kelompok-03-aditya-bariq
Weekly sales prediction on the Walmart dataset. For submission as the Final Project of Studi Independen Batch 3
Language: Jupyter Notebook - Size: 21.9 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 1

Irrelev4nt13/Customer-Personality-Analysis
📊Customer Personality Analysis, using various Data Mining techniques and Machine Learning algorithms.
Language: Jupyter Notebook - Size: 1.6 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

fitria-dwi/Business-Decision-Research
Language: Jupyter Notebook - Size: 218 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Somu-cSs/Water-Quality-Analysis-and-Prediction.
Interactive Dashboard Web-app :
Language: Jupyter Notebook - Size: 3.09 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

weichi21/KNN-Model-Car-Price-Prediction
Predictive modeling project by implementing KNN regression model.
Language: Jupyter Notebook - Size: 438 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

kshaikh23/NBA-Playoffs-Project
Statistical analysis comparing team play in the NBA regular season and playoffs. A linear regression model predicts players' playoff points per game based on their regular-season stats. Collaborated with Stephan MacDougall.
Language: Jupyter Notebook - Size: 664 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

BDFD-LearningGround/Cousera_Applied-Data-Science-with-Python-Specialization-OP
Quizzes & Assignment Solutions for Applied Data Science with Python Specialization on Coursera. Also includes a few resources on the side that I found helpful.
Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Yaresh01/Finance-and-Risk-Analytics-project
Language: Jupyter Notebook - Size: 9.42 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rahulodedra30/House-Recommendation-Based-on-Neighbourhood
Recommends houses based on neighbourhood using K-Means clustering on data scraped from Wikipedia
Language: Jupyter Notebook - Size: 572 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ITISALLDATA/DATA-CLEANING-PROJECT-WITH-SQL
In this project, I cleaned up a large FIFA 2021 dataset with 18,000+ player records. The data was messy, with inconsistencies in 77 columns. I focused on making the data consistent and usable for analysis. This repository documents my step-by-step process, demonstrating how I transformed the data into a clean format.
Size: 20.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

dikshantnaik/Data-Cleaning-Assignment-Internship
A Python script that parses non-meaningful data into a meaningful form and saves it to a .csv file
Language: Python - Size: 27.3 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 0

ajaymache/data-analysis-using-python
Exploratory data analysis 📊 using Python 🐍 of a used car 🚘 database taken from Kaggle
Language: Jupyter Notebook - Size: 49.3 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 193 - Forks: 89

extremecode/stress-detection-in-social-networks
stress detection in social networks
Language: R - Size: 5.26 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 2

ojasphansekar/Zillow-Home-Value-Prediction
XGBoost, LightGBM, LSTM, Linear Regression, Exploratory Data Analysis
Language: Jupyter Notebook - Size: 1.81 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 10 - Forks: 7

julianacastilloaraujo/Google-Data-Analytics
⭐️ Google Data Analytics + Coursera ⭐️ 👩💻 Data, data, everywhere (this course) 🔍 Skills: Spreadsheet, Data Cleansing, Data Analysis, Data Visualization (DataViz), SQL
Size: 43.9 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

AVJdataminer/Squeaky
R package for data cleaning and pre-processing for data science
Language: R - Size: 79.1 KB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 1

AliiPmD/House_Prices_Advanced_Regression
Advanced regression for house prices, with data preprocessing steps (data exploration, cleansing, visualization, etc.) and a trained model that reaches a score of 0.945.
Language: Jupyter Notebook - Size: 661 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

muhammadhamzah8/Ecommerce-Shipping-Classification-Modeling
Exploratory Data Analysis & Modeling to predict whether the shipping deliveries will be received late or on-time by the customers
Language: Jupyter Notebook - Size: 34.4 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 2

edohgoka/Predict_Success_Of_a_Restaurant
Predicting whether or not a restaurant will succeed.
Language: Jupyter Notebook - Size: 33.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

aaron-evans-cruz/Python-Portfolio-Projects
Python Portfolio Projects. Highlighting skills in Python, Pandas, data cleaning, correlation...
Language: Jupyter Notebook - Size: 2.48 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AlisonYao/2020SummerResearch_ChineseResDatabase Fork of xuyou1999/2020SummerResearch_ChineseResDatabase
Language: Jupyter Notebook - Size: 29.7 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

kenlhlui/Brexit_referendum_data_cleaning
R data cleaning project for Brexit referendum voting data.
Language: R - Size: 5.52 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

scrab017/tgdl
Analysis of physicians registered under The Tripura State Medical Council. Data scraped from https://tsmc.tripura.gov.in/doc_list
Language: HTML - Size: 1.25 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

AP-State-Skill-Development-Corporation/Data-Science-Using-Python-Internship-EB1
This repo was created to share the files required and discussed during the online internship training program on Data Science Using Python in May 2021
Language: Jupyter Notebook - Size: 21.8 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 13 - Forks: 10

iweld/data_cleaning
An SQL data cleaning project
Size: 586 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 4

doratako/Data-Quality-Assurance
Data validation and data cleansing
Language: Jupyter Notebook - Size: 54.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

pmb-7684/Google-Data-Analytics-Professional-Certificate
Learning materials, assignments, and helpful resources for professional certification. Completed October 2022
Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

pmb-7684/Applied-Data-Science-with-Python-Specialization
Coursera specialization taught by the University of Michigan. Expected completion date: July 2023
Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

AlecVail/Preparing_Data_Using_Alteryx
Alteryx Academy Challenge #363
Size: 41 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

siegstedt/predict_credit_card_approval
Commercial banks receive a lot of applications for credit cards. Many are rejected for reasons such as high loan balances, low income levels, or too many inquiries on an individual's credit report. Manually analyzing these applications is mundane, error-prone, and time-consuming. Luckily, this task can be automated with the power of machine learning. Here is an automatic credit card approval predictor using machine learning techniques.
Language: Jupyter Notebook - Size: 32.2 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 1

DablewCodes/Nashville-Housing-Data-Cleaning
Performed data cleaning using advanced SQL features such as CTEs, joins, rank functions, and aggregate functions.
Size: 11.3 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

federicozukierman/data-cleaning-SQL
In this project I clean data from the Nashville (US) housing database
Size: 1000 Bytes - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

raihanjp98/DTS-KOMINFO-Data-Engineer-Career-Track-DQLab
A collection of scripts written to complete DQLab Data Engineer Career Track
Language: Python - Size: 28.9 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

grahman20/kDMI
kDMI employs two levels of horizontal partitioning (based on a decision tree and k-NN algorithm) of a data set, in order to find the records that are very similar to the one with missing value/s. Additionally, it uses a novel approach to automatically find the value of k for each record.
Language: Java - Size: 267 KB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

grahman20/FIMUS
FIMUS imputes numerical and categorical missing values by using a data set’s existing patterns including co-appearances of attribute values, correlations among the attributes and similarity of values belonging to an attribute.
Language: HTML - Size: 162 KB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

programindz/heartattackpredictor
A machine learning model using Support Vector Machine classification to predict the chances of an individual having a heart attack based on features like age, sex, cholesterol, blood pressure, chest pain, heartbeat, etc.
Language: Jupyter Notebook - Size: 88.9 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

hazem-alabiad/taxi-tip-estimator
Taxi Tip Estimator (TTS) is a Data Mining project that uses the data collected by taxi drivers to estimate the tips given by customers.
Language: HTML - Size: 53.9 MB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

abccastro/Canada-PR-Data-Analysis-and-Visualization
Size: 28 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

AchmadFachturrohman/FGA2022-Data-Engineer
This is my Data Engineer portfolio
Language: Python - Size: 38.1 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

S-Vijay-vj/imdb-rating_Data-wrangling-and-exploration-using-SQL
Size: 927 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

AlexLamson/DataWrangler
Quick and dirty data mining made easier in Sublime Text
Language: Python - Size: 353 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 11 - Forks: 2

PedroChaparro/PI202202-alako-data
This repository contains all the files related to the project's data collection, data normalization/cleansing, and database management.
Language: Jupyter Notebook - Size: 581 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 1

agungbudiwirawan/Data_Science_in_Telco-Data_Cleansing
Data cleansing using Python: handling missing values and outliers, and standardizing values.
Language: Jupyter Notebook - Size: 259 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

RochaErik/X-MoviesDatasetPt3-Tidy_up_data
Code for cleaning up data. Data from almost 46 thousand movies used.
Language: Jupyter Notebook - Size: 21.8 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

RochaErik/X-MoviesDatasetPt4-Merging_datasets
Code for cleaning and merging datasets.
Language: Jupyter Notebook - Size: 14.7 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

pariosur/food_waste_analysis
Exploratory Data Analysis of Food Waste and Food Loss Database (FAO)
Language: Jupyter Notebook - Size: 628 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

siegstedt/predict_blood_donation
This project works with data collected from the donor database of Blood Transfusion Service Center in Hsin-Chu City in Taiwan. The center passes its blood transfusion service bus to one university in Hsin-Chu City to gather blood donated about every three months. The dataset, obtained from the UCI Machine Learning Repository, consists of a random sample of 748 donors. The task is to predict if a blood donor will donate within a given time window. The work contains a full model-building process: from inspecting the dataset to using the tpot library to automate your Machine Learning pipeline.
Language: Python - Size: 23.4 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 3

siegstedt/super_bowl_halftime
Whether or not you like football, the Super Bowl is a spectacle. There's drama in the form of blowouts, comebacks, and controversy in the games themselves. There are the ridiculously expensive ads, some hilarious, others gut-wrenching, thought-provoking, and weird. And there are the half-time shows with the biggest musicians in the world, sometimes riding giant mechanical tigers or leaping from the roof of the stadium. Here, we find out how some of the elements of this show interact with each other.
Language: Jupyter Notebook - Size: 98.6 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 1

bastians/address_transformation Fork of thilohuellmann/address_transformation
Transform unstructured, inconsistent or incomplete address data into structured and complete address data with Google Maps Geocoding API.
Language: Python - Size: 12.7 KB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

shubhankar5/scrub-system-for-de-identification
A scrub system for de-identification and cleaning of data to maintain its privacy from the world.
Language: Python - Size: 1.72 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

thecodemancer/Residential_property_prices_2020
In this code, we're applying data cleansing to this dataset so that we can properly work with it later. The goal is to build a data model with a fact table and dimension tables.
Language: Jupyter Notebook - Size: 2.5 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

AREschweiz/microcensus-geodata-cleaning
R code used to clean the geodata of the Swiss Mobility and Transport Microcensus (MTMC)
Size: 822 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

Dimas263/Preprocessing-Data-into-Train-Test-Val-Data
Python Preprocessing for Sales Project Notebook
Language: Jupyter Notebook - Size: 3.06 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 1

NataliaAssange/littlejsontools
Some small JSON tools for my own use that may help you too
Language: Python - Size: 11.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

RizqiSeijuuro/walmart-weekly-sales-prediction
Weekly Sales Prediction at Walmart Dataset
Language: Jupyter Notebook - Size: 6.21 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

SabrinaSuraya/Project2-WorkerGarment
Determines garment worker productivity (a regression problem).
Language: Python - Size: 12.7 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

data-forge/data-forge-fs
This library contains the file system extensions to Data-Forge that allow it to directly read and write CSV and JSON files in Node.js
Language: TypeScript - Size: 265 KB - Last synced at: 12 days ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 2

mtimjones/dataprocessing
Data cleansing and clustering with Vector Quantization and Adaptive Resonance Theory
Language: C - Size: 37.1 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 9 - Forks: 1

zislam/CAIRAD
Implements the CAIRAD technique for detecting noisy values in a dataset, for Weka
Language: Java - Size: 36.1 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 2 - Forks: 0

almeidacastrogabriela/Black_Friday_Analysis_DS815
Data manipulation and assessment using Pandas
Language: HTML - Size: 1.44 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

alkashef/cleaning-excel-data
Tidying and cleaning data in Excel sheets
Size: 2.93 KB - Last synced at: over 2 years ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 0

derekngoh/HDB-Resale-Flat-Valuation
HDB flats resale price prediction. Neural network in Python. Machine learning models in R. Data pre-processing, feature engineering and feature selection mainly in R.
Language: Jupyter Notebook - Size: 5.45 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

dikoharyadhanto/Data-Preparation-Documentation
Learning documentation for the data cleansing stage
Language: HTML - Size: 900 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Data-Wrangling-with-JavaScript/Chapter-6
Code examples for Chapter 6 of Data Wrangling with JavaScript
Language: JavaScript - Size: 154 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

antonindanalet/microcensus-geodata-cleaning
R code used to clean the geodata of the Swiss Mobility and Transport Microcensus (MTMC)
Size: 8.79 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

Ketan2010/TCS-Talent-Ocean
TCS Talent Ocean Challenge submission. Finds suitable candidates for a project based on skills.
Language: Jupyter Notebook - Size: 735 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0
