An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: datacleansing

OpenRefine/OpenRefine

OpenRefine is a free, open source power tool for working with messy data and improving it

Language: Java - Size: 387 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 11,278 - Forks: 2,047

ShanYue03/Hand-Writing-Recognition-Model

Implementation of a Neural Network (NN) model for handwriting recognition using the MNIST dataset.

Language: Jupyter Notebook - Size: 55.7 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

veenapaul/Data-Visualisations-using-Power-BI

Data visualisations in Power BI

Size: 14 MB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 23 - Forks: 8

Manas5789/Survey-Analysis-using-Power-BI

Analyzed a received survey using the Power BI tool to draw useful insights.

Size: 5.17 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

AmruhaAhmed/Data-Cleaning-on-New-York-Airbnb-Listings

Language: Jupyter Notebook - Size: 3.11 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

AmruhaAhmed/Analyzing-Google-Play-Store-Data

Language: Jupyter Notebook - Size: 4.75 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

selvakrishnan/DataFusion_CDAP_Wrangler_Directives

Google Cloud Data Fusion - Data Transformation Logics using CDAP Wrangler Directives.

Size: 83 KB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

AkashJ1023/Survey-Data-Analysis

Assignment: Survey analysis of the T20 World Cup 2024

Size: 190 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

PavlaBla/czechitas_project

final project of the Digital Academy

Language: Python - Size: 3.91 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

anuppm9917/Super-Store-Sales-Analysis-Power-BI-Project

Driven to know which products, regions, categories and customer segments a company should target or avoid, I searched for and selected an appropriate dataset on Kaggle that matches a standard superstore's requirements.

Size: 10.1 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

ankitapiu/Income-qualification

Random Forest Classification

Language: Jupyter Notebook - Size: 1.97 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

TaiNguyen42/Data-Translation-Challenge-R

This repository addresses the "Data Translation Challenge in Data Communications," offering resources for professionals in data science and IT to tackle data translation complexities across formats, protocols, and standards. It provides insights, exercises, and methodologies for effective data translation.

Language: HTML - Size: 1.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

sowmya-kukkala/DataThirst

Language: Jupyter Notebook - Size: 1.9 MB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

rushikesh9595/Python

data = pd.read_github('My / Python / Repository.github') Contents = ["Practice", "Assignments"] sub_contents = ({"Practice" : "All notebooks created during course", "Assignments" : "Basic to Advance"})

Language: Jupyter Notebook - Size: 638 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AnAriz101/Portfolio

Versatile data analyst skilled in extracting actionable insights from complex datasets. Proficient in statistical analysis, data visualization, and trend identification. Proven track record in transforming raw data into strategic business recommendations.

Language: Jupyter Notebook - Size: 4.02 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Nikisharon229/DATA-ANALYSING

Cleaned Walmart data by removing null data, standardizing columns, and filling null values with the average.

Size: 523 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

himanshu-004/OnlineRetail

This project is based on an online retail store that wants to analyse the major factors contributing to its revenue so it can plan strategically for next year.

Size: 22 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

xguse/table_enforcer

Table Enforcer is my attempt to apply a sort of "test driven development" workflow to data cleaning and validation. A python package to facilitate the iterative process of developing and using schema-like representations of DataFrames in pandas for recoding and validating instances of these data.

Language: Python - Size: 222 KB - Last synced at: 9 days ago - Pushed at: about 7 years ago - Stars: 17 - Forks: 1

Rajdeep-Sutariya/Telewire-Analytics---Connetwork---Data-Science-project

"Telewire Analytics," an innovative project aimed at optimizing resource utilization within the telecom industry.

Language: CSS - Size: 4.69 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

mahmoudjobeel1/Data-Analysis-of-Egyptian-Movies

This project targets the textual analysis of Egyptian movie plot summaries that were curated from online sources, covering the four golden decades of Egyptian Cinema.

Language: Jupyter Notebook - Size: 13.5 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 3

Benazir023/NYC_schools_perceptions_analysis

The analysis seeks to understand how the perceptions of schools affect performance and demographics and vice versa

Size: 15.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

alv1nZ/Dashboard-data-visualization-with-Tableau-

Cleaned a movies dataset to present specific visuals to answer research questions

Size: 1.47 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

MuhammadHasaanWahid/Data-Cleaning-Pipeline-ETL

This project extracts data from Azure Data Lake Storage Gen2, transforms it, and then transfers it to a SQL database.

Size: 127 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

IbadDE/Excel-Project

This repo contains Excel-related material

Size: 35.6 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

DataScience-Lin/Learning-Equality-Curriculum-Recommendation

LECR EDA & Fine Tuning

Language: Jupyter Notebook - Size: 99.6 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

ironmussa/Optimus-examples 📦

Examples for Optimus, a data cleansing library for big data.

Size: 925 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 9 - Forks: 4

AmitPatel-analyst/SQL_data_cleaning_RealEstateDB

In this SQL project, I've cleaned the Nashville housing data table for better analysis using intermediate to advanced SQL queries.

Size: 5.64 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

rladiesparis/Tidyverse-Meetup Fork of meghall06/rladiesparis

GitHub Repo of our Tidyverse workshop organized on Sep 8, 2022

Language: HTML - Size: 2.19 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

alitheDev/CleanDataWithPython

Advanced guide to cleaning and 20+ ways of cleaning data with Python

Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Nagamohan419/WebScrapping_and_DataAnalysis_project_on_Cars24

Cars24 is an online company that sells second-hand cars; in this project, data analysis was done on the cars for sale.

Language: Jupyter Notebook - Size: 13.3 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Virementz/Hotel-Booking

My code and insights, based on open source data available on the internet. I want to provide comprehensive insight into and analysis of a whole year of hotel booking data to maximise impact for both the company and the customer. I also develop several machine learning models and do an in-depth evaluation of each one selected.

Language: Jupyter Notebook - Size: 1.39 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

nvnwadhwani/DCPP_Group_Assignment Fork of nagik17/DCPP_Group_Assignment

This is a dataset for Indian Food Recipes sourced from Padma Shri Award winner Sanjeev Kapoor and Archana's Kitchen websites.

Language: Jupyter Notebook - Size: 12.6 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

nancyalaswad90/Google-Data-Analytics-Certificate

There are 8 courses in this Professional Certificate: Data Foundations - by Google

Size: 13.7 KB - Last synced at: 9 months ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

Mohammad2527/kpmg_Virtual_Internship

Data cleaning and analysis in Excel, and finally creating a dashboard in Tableau, as part of the KPMG virtual internship.

Size: 30.6 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

chinmaykumar06/Coursera-Introduction-to-Data-Science-in-Python

This course by the University of Michigan introduces the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library. The course also introduces data manipulation and cleaning techniques using the Python pandas data science library.

Language: Jupyter Notebook - Size: 7.25 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

AMVamsi/Intro_Datascience_Python-Coursera

Coursera Assignments

Language: Jupyter Notebook - Size: 148 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

aliemamalinezhad/Health_level_classification_deep_learning

We use Keras, TensorFlow, and sklearn to classify the health level of students using the UCI Nursery dataset.

Language: Jupyter Notebook - Size: 129 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

ElXaxe/DataCleansing

Data cleansing and validation for a Data Science Master's degree

Language: Jupyter Notebook - Size: 10.9 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

thegothamstak/DataMining

Programs I write for my Data Mining course

Language: Python - Size: 9.77 KB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

gazalpatel/Exploratory-Data-Analysis-in-R

Language: HTML - Size: 1.27 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 1