GitHub topics: cleaning-dataset
Geoffrey3wu/sales-data-sas-project
SAS-based data cleaning and sales reporting project
Language: SAS - Size: 0 Bytes - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

fgiorgia/data-cleaning-for-housing-data
Cleaning the Nashville Housing Data dataset.
Language: PLpgSQL - Size: 3.29 MB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 0 - Forks: 0

Shivmalge/SQL-Data-Analysis-Healthcare-Project
SQL - Healthcare Dataset Analysis
Size: 561 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Alwin2397/MySQL_World_Corporate_Layoffs_Data_Analysis
MySQL project on world corporate layoffs: cleaning and analyzing a layoffs dataset.
Size: 0 Bytes - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

DouweHorsthuis/EEG_to_ERP_pipeline_stats_R
General pipeline for analyzing EEG data: raw EEG data is transformed into ERPs, and statistics are done in R (mixed-effects models).
Language: MATLAB - Size: 10.5 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 12 - Forks: 4

josedanielchg/Efficient-Data-Storage-for-Predictive-Modeling
DataCamp project from the Associate Data Scientist track, focusing on optimizing dataset storage by transforming data types and filtering. Prepares data for efficient machine learning workflows
Language: Jupyter Notebook - Size: 2.23 MB - Last synced at: 27 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

DipunMohapatra/Nashville-Housing-Dataset-Cleaning-Using-SQL
A data cleaning project for the Nashville Housing dataset, focused on handling missing values, removing duplicates, and standardising fields to improve data quality and reliability for real estate analysis.
Size: 5.14 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

abeylicious/SQL-Projects
Data cleaning, transformation, standardization and exploration of data in MySQL server
Language: TSQL - Size: 4.88 KB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

BaNamTheAnalyst/U.S-Housing-Market-Factors-Project
The impact of macroeconomic indicators on the housing price index in the United States during the period from 19xx to 2012.
Language: Jupyter Notebook - Size: 663 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

MoonmoonSamal/Data-Driven-Google-Ads-for-Listing-Sites-Analysis
Analyzed Google Ads performance to identify top channels, keywords, and geographical impact
Language: Jupyter Notebook - Size: 321 KB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

MoonmoonSamal/Meesho_Order_Financial_Analysis
Generating insights from Meesho sales data (Oct-Nov)
Language: Jupyter Notebook - Size: 200 KB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

vggm/neo4j_ML
Simple project that extracts, cleans, and processes a dataset and imports the data into a NoSQL database. Includes a simple app for working with the data.
Language: Jupyter Notebook - Size: 52.3 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Manar20575/Data-Science-Project
Build a model that predicts whether an individual makes over $50,000 per year.
Language: Jupyter Notebook - Size: 5.01 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

Luc1eSky/finsim_data_exploration
This repo is an initial data exploration of the FinSim Game.
Language: Rich Text Format - Size: 1.58 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

RAQUELFONT/Master-s-Projects
A compilation of impactful projects undertaken during my master's degree studies. 🎓
Language: Jupyter Notebook - Size: 4.06 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Chisom-chukwumerije/Netflix
Netflix is a streaming service that offers a wide variety of award-winning TV shows, movies, anime, documentaries, and more. The service primarily distributes original and acquired films and television shows from various genres, and it is available in multiple languages.
Size: 1.57 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Kaybhee/SQL_DA_CLEANING
Size: 5.63 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Photoroom/fast-dataset-cleaner
A simple tool for cleaning image datasets at a glance.
Language: TypeScript - Size: 3.55 MB - Last synced at: about 10 hours ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 3

gambros/PortfolioProject-NashvilleHousingData
In this project, I perform data cleaning using T-SQL to improve the quality of a dataset containing information about houses in Nashville, Tennessee.
Language: TSQL - Size: 3.91 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

YaroslavaVob/DataCleaning_Project
Data cleaning project for the 'Flats in Moscow and Moscow region' dataset
Language: Jupyter Notebook - Size: 6.11 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Kishawn-Dorman/Airbnb-Edinburgh-Housing-Dilemma-Analysis
Host and listing characteristics used to detect illegitimate rental listings
Size: 14.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ITISALLDATA/DATA-CLEANING-PROJECT-WITH-SQL
In this project, I cleaned up a large FIFA 2021 dataset with 18,000+ player records. The data was messy, with inconsistencies in 77 columns. I focused on making the data consistent and usable for analysis. This repository documents my step-by-step process, demonstrating how I transformed the data into a clean format.
Size: 20.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

divyansh1195/Halliburton-Landmark-Learning-ML-with-Python
Machine Learning with Python: Halliburton Landmark Learning
Language: Jupyter Notebook - Size: 12.5 MB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

Korede34/DATA-CLEANING-PROJECT-WITH-SQL
In this project, I cleaned up a large FIFA 2021 dataset with 18,000+ player records. The data was messy, with inconsistencies in 77 columns. I focused on making the data consistent and usable for analysis. This repository documents my step-by-step process, demonstrating how I transformed the data into a clean format.
Size: 62.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Arkantos-13/Clean_Airbnb_Dataset
Just cleaning an Airbnb dataset with no more digging
Language: Jupyter Notebook - Size: 45.9 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

zaha2020/Machine_Learning
Machine Learning projects
Language: Jupyter Notebook - Size: 167 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

zaha2020/Data_Analytics
Language: Jupyter Notebook - Size: 5.32 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

shrav-6/datapreparation-entip
This project involves cleaning and preparing data for the entip project
Language: Jupyter Notebook - Size: 2.93 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

DoriDoro/algoInvest_trade
Project 7 OpenClassrooms Path - AlgoInvest&Trade -- develop an algorithm to solve a problem
Language: Python - Size: 338 KB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sudhansuku/IMDB-Movie-Analysis
This project aims to carry out an in-depth analysis of the IMDB movie dataset. Excel is used to draw insights.
Size: 11.5 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

praoiticica/Titanic-traditional-ML
Data classification on Titanic dataset using traditional ML methods.
Language: Jupyter Notebook - Size: 14.2 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

afsanamimii/Movie-review-analysis
Language: Jupyter Notebook - Size: 9.77 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

Sriharish19/EDA-Hotel-Booking-Capstone-Project-1
After cleaning the data, EDA was performed using python libraries like matplotlib and seaborn to display the data and generate business insights that aid hotels in managing their inventories much more effectively.
Language: Jupyter Notebook - Size: 16.2 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

ConX/drpt
Tool for preparing a dataset for publishing by dropping, renaming, scaling, and obfuscating columns defined in a recipe.
Language: Python - Size: 68.4 KB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

JuanDanielMarin/Global-Superstore-Project
Performed the data exploration and cleaning using SQL for a dataset about an e-commerce store to provide answers for smart business questions.
Size: 7.51 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

brianmaleek/project_workspace_2_tweepy
Wrangling and analyzing data from the WeRateDogs Twitter account, which rates people's dogs with a humorous comment about the dog.
Language: Jupyter Notebook - Size: 2.57 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

santurini/PCA-Kmeans-From-Scratch
Application of K-means algorithm on a music dataset after a dimensionality reduction with PCA.
Language: Jupyter Notebook - Size: 18 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

AbiolaBajo10/WeRateDogs-Twitter-Dataset
The dataset for this project is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. This dataset was carefully analysed to find meaningful insights.
Language: Jupyter Notebook - Size: 917 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

darshitparmar/Bike-Sharing-Data-Cleaning-and-Prep
Using the bike sharing data, I demonstrate skills in Data Cleaning and Preparation along with testing the data for normality and transforming it.
Language: HTML - Size: 2.49 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

mdbinger/School_District_Analysis
Determined whether student test scores are impacted by factors such as school size, school budget, and student grade for a city school district, using a Python script in a Jupyter notebook with the Pandas dependency. Cleaned city school district data to eliminate problematic data that was impacting our analysis of student success on standardized tests.
Language: Jupyter Notebook - Size: 572 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Neel14-stack/ML-Tasks
Machine Learning Internship Assignment
Language: Jupyter Notebook - Size: 626 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

Ron0p/Dashboard
Dataset of 59 IPL matches from Kaggle, named IPL_Matches_2022.csv. Data analysis is in the IPL_2022.py file; Dash.py is the main application file, in which the GUI is created using Streamlit.
Language: Python - Size: 1.62 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

ZeroDarkHardy/School_District_Analysis
Analysis of District-wide school and student data, refactored to omit data sample with potential academic dishonesty
Language: Jupyter Notebook - Size: 1.76 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

junanda/preprocessing
Source code for training machine learning models and preprocessing text using Python
Language: Python - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

jacquie0583/Cryptocurrencies
Unsupervised Machine Learning - Cryptocurrency Analysis, using several models on cryptocurrency data in order to discover patterns and groups in the data. Analysis done to create a report that includes what cryptocurrencies are on the trading market and how they could be grouped in order to create a classification system for potential new investments into the cryptocurrency market.
Language: Jupyter Notebook - Size: 9.81 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

c-morey/challenge-data-analysis
This repository provides a Jupyter notebook on a basic data cleaning and exploratory data analysis process with a CSV file that was scraped from a real estate website in Belgium.
Language: Jupyter Notebook - Size: 84 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

BahramJannesar/ChocolateReveiwDataAnalysis
Data Analysis
Language: Jupyter Notebook - Size: 643 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 0
