Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-cleansing

aminkhod/TA--Course-ofData-mining--Fall-2018 📦

Implementations and usage examples of methods covered in the Topics on Data Mining course

Language: Python - Size: 32.1 MB - Last synced: 6 days ago - Pushed: over 4 years ago - Stars: 3 - Forks: 1

data-integrations/wrangler

Wrangler Transform: A DMD system for transforming Big Data

Language: Java - Size: 5.68 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 83 - Forks: 56

Abhigyan76/Pizza-Sales-Insight

Used SQL and Power BI to build an insightful dashboard

Size: 2.24 MB - Last synced: 12 days ago - Pushed: 12 days ago - Stars: 0 - Forks: 0

Ashbyt/Python

Ashley Bythell - Python

Language: Jupyter Notebook - Size: 5.68 MB - Last synced: 17 days ago - Pushed: about 1 month ago - Stars: 1 - Forks: 0

JoeRegnier/horkos

Data quality analysis and scoring system.

Language: Python - Size: 3.39 MB - Last synced: 24 days ago - Pushed: 11 months ago - Stars: 2 - Forks: 2

Desbordante/desbordante-core

Desbordante is a high-performance data profiler that is capable of discovering many different patterns in data using various algorithms. It can also run data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

Language: C++ - Size: 126 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 62 - Forks: 50

hi-primus/optimus

:truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

Language: Python - Size: 110 MB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 1,441 - Forks: 233

tneriaransom/data-analysis-portfolio

This repository houses a curated collection of projects designed to highlight my expertise in data analytics.

Size: 39.1 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

probcomp/PClean

A domain-specific probabilistic programming language for scalable Bayesian data cleaning

Language: Julia - Size: 1.34 MB - Last synced: about 1 month ago - Pushed: almost 2 years ago - Stars: 214 - Forks: 31

TimKong21/PwC-Switzerland-Power-BI-in-Data-Analytics-Virtual-Case-Experience

Comprehensive Power BI dashboards showcasing insights on Call Centre Trends, Customer Retention, and Diversity & Inclusion to drive business impact.

Size: 4.43 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 7 - Forks: 1

bakdata/dedupe

Java DSL for (online) deduplication

Language: Java - Size: 1.07 MB - Last synced: about 2 months ago - Pushed: 3 months ago - Stars: 19 - Forks: 2

RizqiSeijuuro/final-project-kelompok-03-aditya-bariq

Weekly Sales Prediction at Walmart Dataset. For submission as the Final Project of Studi Independen Batch 3

Language: Jupyter Notebook - Size: 21.9 MB - Last synced: 2 months ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 1

Irrelev4nt13/Customer-Personality-Analysis

📊 Customer Personality Analysis, using various Data Mining techniques and Machine Learning algorithms.

Language: Jupyter Notebook - Size: 1.6 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

fitria-dwi/Business-Decision-Research

Language: Jupyter Notebook - Size: 218 KB - Last synced: 2 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

Somu-cSs/Water-Quality-Analysis-and-Prediction.

Interactive Dashboard Web-app :

Language: Jupyter Notebook - Size: 3.09 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

weichi21/KNN-Model-Car-Price-Prediction

Predictive modeling project by implementing KNN regression model.

Language: Jupyter Notebook - Size: 438 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 1 - Forks: 0

ClimerLab/mrclean

Two Mixed Integer Programs for cleaning a data file.

Language: C++ - Size: 39.1 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

DataPreprocessing/DataCleaning

Data Cleaning is a python package for data preprocessing. This cleans the CSV file and returns the cleaned data frame. It does the work of imputation, removing duplicates, replacing special characters, and many more.

Language: Python - Size: 117 KB - Last synced: about 2 months ago - Pushed: about 3 years ago - Stars: 5 - Forks: 2

kshaikh23/NBA-Playoffs-Project

Statistical analysis comparing team play in the NBA regular season and playoffs. Linear Regression algorithm to predict players playoffs points per game based on their regular season stats. Collaborated with Stephan MacDougall.

Language: Jupyter Notebook - Size: 664 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

BDFD-LearningGround/Cousera_Applied-Data-Science-with-Python-Specialization-OP

Quizzes & Assignment Solutions for Applied Data Science with Python Specialization on Coursera. Also included a few resources on side that I found helpful.

Size: 7.81 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

Yaresh01/Finance-and-Risk-Analytics-project

Language: Jupyter Notebook - Size: 9.42 MB - Last synced: 4 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

data-forge/data-forge-ts

The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.

Language: TypeScript - Size: 3.26 MB - Last synced: 5 months ago - Pushed: 10 months ago - Stars: 1,270 - Forks: 76

rahulodedra30/House-Recommendation-Based-on-Neighbourhood

Recommended house based on neighbourhood using K-Means clustering after scraping data from Wikipedia website

Language: Jupyter Notebook - Size: 572 KB - Last synced: 6 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

ITISALLDATA/DATA-CLEANING-PROJECT-WITH-SQL

In this project, I cleaned up a large FIFA 2021 dataset with 18,000+ player records. The data was messy, with inconsistencies in 77 columns. I focused on making the data consistent and usable for analysis. This repository documents my step-by-step process, demonstrating how I transformed the data into a clean format.

Size: 20.5 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

dikshantnaik/Data-Cleaning-Assignment-Internship

A Python script that parses non-meaningful data into meaningful data and saves it to a .csv file

Language: Python - Size: 27.3 KB - Last synced: 7 months ago - Pushed: about 2 years ago - Stars: 3 - Forks: 0

ajaymache/data-analysis-using-python

Exploratory data analysis 📊 using python 🐍 of used car 🚘 database taken from Kaggle

Language: Jupyter Notebook - Size: 49.3 MB - Last synced: 6 months ago - Pushed: over 5 years ago - Stars: 193 - Forks: 89

extremecode/stress-detection-in-social-networks

stress detection in social networks

Language: R - Size: 5.26 MB - Last synced: 7 months ago - Pushed: over 4 years ago - Stars: 2 - Forks: 2

ojasphansekar/Zillow-Home-Value-Prediction

XGBoost, LightGBM, LSTM, Linear Regression, Exploratory Data Analysis

Language: Jupyter Notebook - Size: 1.81 MB - Last synced: 7 months ago - Pushed: over 4 years ago - Stars: 10 - Forks: 7

julianacastilloaraujo/Google-Data-Analytics

⭐️ Google Data Analytics + Coursera ⭐️ πŸ‘©β€πŸ’» Datos, datos, en todas partes(este curso) πŸ” Skills : Spreadsheet, Data Cleansing, Data Analysis, Data Visualization (DataViz), SQL

Size: 43.9 KB - Last synced: 2 months ago - Pushed: 11 months ago - Stars: 2 - Forks: 0

AVJdataminer/Squeaky

R package for data cleaning and pre-processing for data science

Language: R - Size: 79.1 KB - Last synced: 8 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 1

AliiPmD/House_Prices_Advanced_Regression

Advanced Regression for House Prices with data preprocessing steps (like Data exploration, Cleansing, visualization, etc.) and training a model with 0.945 score.

Language: Jupyter Notebook - Size: 661 KB - Last synced: 4 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

muhammadhamzah8/Ecommerce-Shipping-Classification-Modeling

Exploratory Data Analysis & Modeling to predict whether the shipping deliveries will be received late or on-time by the customers

Language: Jupyter Notebook - Size: 34.4 MB - Last synced: 9 months ago - Pushed: about 2 years ago - Stars: 0 - Forks: 2

edohgoka/Predict_Success_Of_a_Restaurant

Predicting the success or not of a restaurant.

Language: Jupyter Notebook - Size: 33.8 MB - Last synced: 9 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

aaron-evans-cruz/Python-Portfolio-Projects

Python Portfolio Projects. Highlighting skills in Python, Pandas, data cleaning, correlation...

Language: Jupyter Notebook - Size: 2.48 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

AlisonYao/2020SummerResearch_ChineseResDatabase Fork of xuyou1999/2020SummerResearch_ChineseResDatabase

Language: Jupyter Notebook - Size: 29.7 MB - Last synced: 9 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

brunocampos01/porto-seguro-safe-driver-prediction

Predict if a driver will file an insurance claim next year. (Kaggle Competition)

Language: Python - Size: 93.8 MB - Last synced: 10 days ago - Pushed: over 2 years ago - Stars: 10 - Forks: 5

kenlhlui/Brexit_referendum_data_cleaning

R data cleaning project Brexit Referendum voting data.

Language: R - Size: 5.52 MB - Last synced: 9 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

scrab017/tgdl

Analysis of physicians registered under The Tripura State Medical Council. Data scraped from https://tsmc.tripura.gov.in/doc_list

Language: HTML - Size: 1.25 MB - Last synced: 10 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

AP-State-Skill-Development-Corporation/Data-Science-Using-Python-Internship-EB1

This repo was created to share the files required or discussed during the online internship training program on Data Science Using Python in May 2021

Language: Jupyter Notebook - Size: 21.8 MB - Last synced: 25 days ago - Pushed: almost 3 years ago - Stars: 13 - Forks: 10

iweld/data_cleaning

An SQL data cleaning project

Size: 586 KB - Last synced: 10 months ago - Pushed: over 1 year ago - Stars: 8 - Forks: 4

doratako/Data-Quality-Assurance

Data validation and data cleansing

Language: Jupyter Notebook - Size: 54.7 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

pmb-7684/Google-Data-Analytics-Professional-Certificate

Learning materials, assignments, and helpful resources for professional certification. Completed October 2022

Size: 6.84 KB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 0 - Forks: 0

pmb-7684/Applied-Data-Science-with-Python-Specialization

Coursera specialization taught by University of Michigan. Expected completion Date July 2023

Size: 5.86 KB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 0 - Forks: 0

AlecVail/Preparing_Data_Using_Alteryx

Alteryx Academy Challenge #363

Size: 41 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

siegstedt/predict_credit_card_approval

Commercial banks receive a lot of applications for credit cards. Many of them get rejected for many reasons, like high loan balances, low income levels, or too many inquiries on an individual's credit report, for example. Manually analyzing these applications is mundane, error-prone, and time-consuming. Luckily, this task can be automated with the power of machine learning. Here is an automatic credit card approval predictor using machine learning techniques.

Language: Jupyter Notebook - Size: 32.2 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 3 - Forks: 1

DablewCodes/Nashville-Housing-Data-Cleaning

Performed Data Cleaning by using advanced SQL such as CTEs, Joins, Rank Functions, Aggregate Functions etc.

Size: 11.3 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

federicozukierman/data-cleaning-SQL

In this project I clean data from the Nashville (US) housing database

Size: 1000 Bytes - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

raihanjp98/DTS-KOMINFO-Data-Engineer-Career-Track-DQLab

A collection of scripts written to complete DQLab Data Engineer Career Track

Language: Python - Size: 28.9 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

programindz/heartattackpredictor

A machine learning model using Support Vector Machine classification to predict the chances of an individual having a heart attack based on features like age, sex, cholesterol, blood pressure, chest pain, heart beat etc.

Language: Jupyter Notebook - Size: 88.9 KB - Last synced: 4 months ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

hazem-alabiad/taxi-tip-estimator

Taxi Tip Estimator (TTS) is a Data Mining project that uses the data collected by taxi drivers to estimate the tips given by customers.

Language: HTML - Size: 53.9 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 1 - Forks: 0

BDFD-LearningGround/Cousera_Google-Data-Analytics-Professional-Certificate

Quizzes & Assignment Solutions for Google Data Analytics Professional Certificate on Coursera. Also included a few resources on side that I found helpful.

Size: 38.2 MB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 69 - Forks: 21

abccastro/Canada-PR-Data-Analysis-and-Visualization

Size: 28 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

AchmadFachturrohman/FGA2022-Data-Engineer

This is my Data Engineer portfolio

Language: Python - Size: 38.1 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

S-Vijay-vj/imdb-rating_Data-wrangling-and-exploration-using-SQL

Size: 927 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

AlexLamson/DataWrangler

Quick and dirty data mining made easier in Sublime Text

Language: Python - Size: 353 KB - Last synced: 7 months ago - Pushed: about 3 years ago - Stars: 11 - Forks: 2

HypertextAssassin0273/Excel_Data_Organizer_and_Cleaner-DS_Project

Data Structures project in C++11 language, uses custom Vector & String structures with Move Semantics (Rule of Five)

Language: C++ - Size: 1.39 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 7 - Forks: 2

PedroChaparro/PI202202-alako-data

This repository contains all the files related to project's data collection, data normalization / cleansing and database management.

Language: Jupyter Notebook - Size: 581 MB - Last synced: 12 months ago - Pushed: over 1 year ago - Stars: 4 - Forks: 1

RochaErik/X-MoviesDatasetPt3-Tidy_up_data

Code for cleaning up data. Data from almost 46 thousand movies used.

Language: Jupyter Notebook - Size: 21.8 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

RochaErik/X-MoviesDatasetPt4-Merging_datasets

Code for cleaning and merging datasets.

Language: Jupyter Notebook - Size: 14.7 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

pariosur/food_waste_analysis

Exploratory Data Analysis of Food Waste and Food Loss Database (FAO)

Language: Jupyter Notebook - Size: 628 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

siegstedt/predict_blood_donation

This project works with data collected from the donor database of Blood Transfusion Service Center in Hsin-Chu City in Taiwan. The center passes its blood transfusion service bus to one university in Hsin-Chu City to gather blood donated about every three months. The dataset, obtained from the UCI Machine Learning Repository, consists of a random sample of 748 donors. The task is to predict if a blood donor will donate within a given time window. The work contains a full model-building process: from inspecting the dataset to using the tpot library to automate your Machine Learning pipeline.

Language: Python - Size: 23.4 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 2 - Forks: 3

siegstedt/super_bowl_halftime

Whether or not you like football, the Super Bowl is a spectacle. There's drama in the form of blowouts, comebacks, and controversy in the games themselves. There are the ridiculously expensive ads, some hilarious, others gut-wrenching, thought-provoking, and weird. The half-time shows with the biggest musicians in the world, sometimes riding giant mechanical tigers or leaping from the roof of the stadium. Here, we find out how some of the elements of this show interact with each other.

Language: Jupyter Notebook - Size: 98.6 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 2 - Forks: 1

bastians/address_transformation Fork of thilohuellmann/address_transformation

Transform unstructured, inconsistent or incomplete address data into structured and complete address data with Google Maps Geocoding API.

Language: Python - Size: 12.7 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 1 - Forks: 1

shubhankar5/scrub-system-for-de-identification

A scrub system for de-identification and cleaning of data to maintain its privacy from the world.

Language: Python - Size: 1.72 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 1

thecodemancer/Residential_property_prices_2020

In this code, we're applying data cleansing to this dataset so that we can properly work with it later. The goal is to build a data model with a fact table and dimension tables.

Language: Jupyter Notebook - Size: 2.5 MB - Last synced: 12 months ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

AREschweiz/microcensus-geodata-cleaning

R code used to clean the geodata of the Swiss Mobility and Transport Microcensus (MTMC)

Size: 822 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

Dimas263/Preprocessing-Data-into-Train-Test-Val-Data

Python Preprocessing for Sales Project Notebook

Language: Jupyter Notebook - Size: 3.06 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 1

fpjnijweide/autoencoder-pdb-cleaning

This is the source code for the paper "A probabilistic database approach to autoencoder-based data cleaning".

Language: Jupyter Notebook - Size: 242 MB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 2 - Forks: 0

NataliaAssange/littlejsontools

Some little json tools for my own use and maybe can help you

Language: Python - Size: 11.7 KB - Last synced: 11 months ago - Pushed: almost 2 years ago - Stars: 3 - Forks: 0

RizqiSeijuuro/walmart-weekly-sales-prediction

Weekly Sales Prediction at Walmart Dataset

Language: Jupyter Notebook - Size: 6.21 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

SabrinaSuraya/Project2-WorkerGarment

Determine garment worker productivity. A regression problem.

Language: Python - Size: 12.7 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

data-forge/data-forge-fs

This library contains the file system extensions to Data-Forge that allow it to directly read and write CSV and JSON files in Node.js

Language: TypeScript - Size: 265 KB - Last synced: 5 days ago - Pushed: over 2 years ago - Stars: 10 - Forks: 2

Rahma-Farag/Rahma-Farag

Main Repository

Language: Jupyter Notebook - Size: 71.4 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

mtimjones/dataprocessing

Data cleanse, clustering with Vector Quantization and Adaptive Resonance Theory

Language: C - Size: 37.1 KB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 9 - Forks: 1

zislam/CAIRAD

Implements the CAIRAD technique for detecting noisy values in a dataset for Weka

Language: Java - Size: 36.1 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 2 - Forks: 0

almeidacastrogabriela/Black_Friday_Analysis_DS815

Data manipulation and assessment using Pandas

Language: HTML - Size: 1.44 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0

alkashef/cleaning-excel-data

Tidying and cleaning data in Excel sheets

Size: 2.93 KB - Last synced: about 1 year ago - Pushed: about 5 years ago - Stars: 2 - Forks: 0

derekngoh/HDB-Resale-Flat-Valuation

HDB flats resale price prediction. Neural network in Python. Machine learning models in R. Data pre-processing, feature engineering and feature selection mainly in R.

Language: Jupyter Notebook - Size: 5.45 MB - Last synced: 12 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

dikoharyadhanto/Data-Preparation-Documentation

Learning documentation for the data cleansing stage

Language: HTML - Size: 900 KB - Last synced: 12 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

Data-Wrangling-with-JavaScript/Chapter-6

Code examples for Chapter 6 of Data Wrangling with JavaScript

Language: JavaScript - Size: 154 KB - Last synced: 23 days ago - Pushed: almost 2 years ago - Stars: 2 - Forks: 0

antonindanalet/microcensus-geodata-cleaning

R code used to clean the geodata of the Swiss Mobility and Transport Microcensus (MTMC)

Size: 8.79 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

Ketan2010/TCS-Talent-Ocean

TCS Talent Ocean Challenge submission. Find suitable candidate for project based on skills.

Language: Jupyter Notebook - Size: 735 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

Gustavo-Hernandez/Wearable-Data-Cleaning

Final Project of the Getting and Cleaning Data certification imparted by the Johns Hopkins University at Coursera

Language: R - Size: 3.57 MB - Last synced: 4 months ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

iTrauco/vtt-to-csv-python-script

Python3 script to convert transcribed video VTT to CSV for import into Google Sheets

Language: Python - Size: 1000 Bytes - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 0 - Forks: 3

zislam/DMI

Implements the DMI imputation algorithm for imputing missing values in a dataset from Rahman, M. G., and Islam, M. Z. (2013): Missing Value Imputation Using Decision Trees and Decision Forests by Splitting and Merging Records: Two Novel Techniques

Language: Java - Size: 21.5 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0

pulkitmehta/bikeIntel

bikeIntel is an AI-based bike rental assistant that gives smart suggestions to the user.

Language: Jupyter Notebook - Size: 61.3 MB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 0 - Forks: 0

adamjbrennan/serverTippingAnalysis

Size: 324 KB - Last synced: 11 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 1

jankubierecki/python-ds

some practical examples to learn data science with python

Language: Jupyter Notebook - Size: 1.57 MB - Last synced: about 1 year ago - Pushed: about 5 years ago - Stars: 2 - Forks: 0

jcp/datafilter

Quickly find flags (words, phrases, etc) within your data. :male_detective:

Language: Python - Size: 88.9 KB - Last synced: 23 days ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

AhmedSalahBasha/data-cleaning

Data Integration - Data Cleaning Task

Language: Jupyter Notebook - Size: 10.8 MB - Last synced: about 1 year ago - Pushed: almost 6 years ago - Stars: 0 - Forks: 0

Related Keywords
data-cleansing (90), data-cleaning (27), python (26), machine-learning (24), data-science (22), data-visualization (20), data-analysis (17), data (15), data-wrangling (13), pandas (13), sql (10), data-preprocessing (8), exploratory-data-analysis (8), data-manipulation (7), data-preparation (7), data-mining (6), numpy (5), feature-engineering (5), matplotlib (5), data-analytics (5), excel (5), data-exploration (4), r (4), deep-learning (4), jupyter-notebook (4), feature-selection (4), javascript (4), classification (4), dataset (3), regression-models (3), random-forest (3), visualization (3), neural-networks (3), nodejs (3), json (3), machine-learning-algorithms (3), regression (3), data-munging (3), csv (3), data-transformation (3), preprocessing (3), scikit-learn (3), data-engineering (3), logistic-regression (3), data-quality (3), java (2), weka (2), mining (2), parsing (2), linq (2), data-management (2), data-forge (2), data-processing (2), python-programming (2), web-scraping (2), seaborn-plots (2), analysis (2), kaggle-competition (2), imputation (2), predictive-modeling (2), knn-regression (2), plot (2), geodata (2), mobility-data (2), switzerland (2), powerbi (2), microsoft-power-bi (2), dataviz (2), preparation (2), sklearn (2), automation (2), python3 (2), etl (2), big-data (2), xgboost (2), decision-making (2), data-profiling (2), feature-extraction (2), clustering (2), data-visualisation (2), tabular-data (2), excel-operations (1), move-semantics (1), object-oriented-programming (1), oop (1), open-source (1), easy-project (1), open-source-code (1), natural-language-toolkit (1), text-mining (1), open-source-project (1), rule-of-five (1), data-virtualization (1), tableau-software (1), sample-size-determination (1), string (1), university-project (1), r-studio (1), r-programming (1), r-markdown (1)