An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-wrangling

Desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

Language: C++ - Size: 143 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 401 - Forks: 76

iterative/datachain

ETL, Analytics, Versioning for Unstructured Data

Language: Python - Size: 10.5 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2,548 - Forks: 112

lucascorumba/study-projects

Python and Data projects

Language: Jupyter Notebook - Size: 18.1 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

ConservationInternational/Wildlife-Insights----Data-Migration

Data Migration Code and Scripts for Wildlife Insights Data Providers

Language: R - Size: 109 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 7 - Forks: 7

dathere/qsv

Blazing-fast Data-Wrangling toolkit

Language: Rust - Size: 63.9 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,794 - Forks: 79

swcarpentry/r-novice-gapminder

R for Reproducible Scientific Analysis

Language: R - Size: 179 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 166 - Forks: 543

burhanahmed1/VehicLens

Comprehensive automobile attribute prediction framework combining EDA, grid-searched polynomial regression, and ridge regularization to maximize predictive accuracy across diverse vehicular features.

Language: Jupyter Notebook - Size: 459 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

microsoft/prose

Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Microsoft Program Synthesis using Example SDK.

Language: C# - Size: 81.6 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 642 - Forks: 99

ReusJimenez/bikebuyer-datawrangling-eda

Exploration and preparation of the Bike Buyers dataset, focusing on transforming demographic variables and identifying patterns in bicycle purchasing behavior. 🚴‍♂️📈

Language: Jupyter Notebook - Size: 3.18 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

skrub-data/skrub

Machine learning with dataframes

Language: Python - Size: 12.4 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 1,380 - Forks: 128

brimdata/zui

Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.

Language: TypeScript - Size: 221 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,839 - Forks: 133

swcarpentry/r-novice-inflammation

Programming with R

Language: R - Size: 51.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 164 - Forks: 392

OpenRefine/OpenRefine

OpenRefine is a free, open source power tool for working with messy data and improving it

Language: Java - Size: 387 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 11,303 - Forks: 2,052

ReusJimenez/breastcancer-datawrangling-eda

Cleaning, transformation, and EDA of the Breast Cancer Wisconsin dataset to prepare medical data on benign and malignant tumors for predictive analysis. 🧬🩺

Language: Jupyter Notebook - Size: 479 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

audiomuze/tagminder

Import, maintain and export tag metadata to/from audio files and a dynamically created SQLite table. Automates incremental tag cleanup, enrichment and standardisation for your digital audio library at scale using pre-scripted SQL queries and Polars, achieving quality and consistency in your metadata not possible with a tagger

Language: Python - Size: 644 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 8 - Forks: 1

gagolews/datawranglingpy

Minimalist Data Wrangling with Python (Open-Access Textbook)

Size: 300 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 79 - Forks: 4

openswoop/isqool

ISQ Tool: scrape UNF ISQ data for courses and professors

Language: Go - Size: 8.81 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 1

GorkemCin/data_analiysis_with_pandas

In this repository, you can find pandas examples and theoretical information

Language: Jupyter Notebook - Size: 7.81 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

bugrayuksel90/pandas-experiments

Personal learning repository focused on mastering data with pandas in Python

Language: Jupyter Notebook - Size: 10.7 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

Burakkylmz/data_analysis_with_pandas

In this repository, you can find pandas examples and theoretical information

Language: Jupyter Notebook - Size: 6.84 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

TomWright/dasel

Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.

Language: Go - Size: 8.56 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 7,448 - Forks: 146

hi-primus/optimus

🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

Language: Python - Size: 110 MB - Last synced at: about 13 hours ago - Pushed at: 5 months ago - Stars: 1,508 - Forks: 232

stefmolin/Hands-On-Data-Analysis-with-Pandas

Materials for following along with Hands-On Data Analysis with Pandas.

Language: Jupyter Notebook - Size: 31.2 MB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 417 - Forks: 818

datacarpentry/python-ecology-lesson

Data Analysis and Visualization in Python for Ecologists

Language: Jupyter Notebook - Size: 28.4 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 168 - Forks: 309

anpabeltj/us-real-estate-sales-analysis

Comprehensive analysis of Connecticut residential and commercial property sales from 2001 to 2021—visualizing price trends, town‐level comparisons, and property‐type distributions using Python and an interactive dashboard.

Language: Jupyter Notebook - Size: 265 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

datacarpentry/r-intro-geospatial

Introduction to R for Geospatial Data

Language: R - Size: 130 MB - Last synced at: 1 day ago - Pushed at: 5 days ago - Stars: 46 - Forks: 70

lyrasis/kiba-extend

Extensions to Kiba ETL

Language: Ruby - Size: 12.7 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 6 - Forks: 0

data-forge/data-forge-ts

The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.

Language: TypeScript - Size: 3.68 MB - Last synced at: 5 days ago - Pushed at: 11 days ago - Stars: 1,361 - Forks: 79

singhsidhukuldeep/singhsidhukuldeep.github.io

This is a completely open-source repo of interview questions and answers for people preparing for such interviews. This is maintained by you and you can send the questions that you faced during interviews.

Language: HTML - Size: 8.55 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 12 - Forks: 6

anshsri08/Scientific_Language_Processing

Utilizing neural networks for scientific knowledge represented in natural language text data.

Size: 605 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

swcarpentry/r-novice-gapminder-es

R for Reproducible Scientific Analysis (Spanish edition)

Language: R - Size: 92.4 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 10 - Forks: 52

swcarpentry/python-novice-gapminder

Plotting and Programming in Python

Size: 17 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 170 - Forks: 433

macenkrace/Kidney-Disease-Prediction-Using-ML

This repository contains a comprehensive machine learning project for predicting Chronic Kidney Disease (CKD) using various classifiers. The project implements a systematic pipeline including data cleaning, preprocessing, model training, evaluation, and inference, enabling healthcare practitioners to leverage predictive analytics for early detection

Language: Jupyter Notebook - Size: 111 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

kjam/data-cleaning-101

Data Cleaning Libraries with Python

Language: Jupyter Notebook - Size: 6.78 MB - Last synced at: about 6 hours ago - Pushed at: over 1 year ago - Stars: 287 - Forks: 172

datacarpentry/genomics-r-intro

Intro to R and RStudio for Genomics

Language: R - Size: 135 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 28 - Forks: 89

pmgraham/datagrunt

Datagrunt is a Python library designed to simplify the way you work with CSV files. It provides a streamlined approach to reading, processing, and transforming your data into various formats, making data manipulation efficient and intuitive.

Language: Python - Size: 6.37 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 7 - Forks: 0

stefmolin/Hands-On-Data-Analysis-with-Pandas-2nd-edition

Materials for following along with Hands-On Data Analysis with Pandas – Second Edition

Language: Jupyter Notebook - Size: 70.1 MB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 622 - Forks: 1,462

datacarpentry/r-socialsci

R for Social Scientists

Language: R - Size: 221 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 121 - Forks: 208

kids-first/kf-lib-data-ingest

🏭 Kids First Data Ingest Library

Language: Python - Size: 11.2 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 5 - Forks: 0

Francis-Calingo/IBM-Capstone-Data-Science-for-Rocket-Science

Language: Jupyter Notebook - Size: 6.43 MB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

joshuapowell/preparing-your-mainframe-data-for-machine-learning

Mainframe Data Wrangling: Preparing Your Mainframe Data for Machine Learning

Language: TeX - Size: 178 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

ezechias463/Data-Science-Capstone-Project

This project is part of the Data Science Specialization provided by IBM on Coursera. Data scientists use data to solve many problems in society. This project is an example of what a real data science project looks like. The project aims to predict whether a newly built Falcon 9 spacecraft will land successfully.

Language: Jupyter Notebook - Size: 25.9 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

DCS-training/CDCS-Summer-School2021

2021 Text and Data Analysis Summer School

Language: Jupyter Notebook - Size: 101 MB - Last synced at: 16 days ago - Pushed at: 10 months ago - Stars: 10 - Forks: 7

DCS-training/intromachinelearning

This course aims to provide an introduction to machine learning for those with some beginner-level Python/RStudio skills. Go to the readme file

Language: Jupyter Notebook - Size: 9.21 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 2 - Forks: 2

burhanahmed1/LaptopPricing-MachineLearning-Analysis

Data Analysis, Machine Learning model training, and Model Evaluation and Refinement for the LaptopPricing dataset.

Language: Jupyter Notebook - Size: 95.7 KB - Last synced at: 19 days ago - Pushed at: 10 months ago - Stars: 5 - Forks: 1

KGVikas/SQL-Data-Analysis

Cleaned & analyzed layoff data from `layoffs.csv`. Performed data cleaning (removed duplicates, standardized values) in `data_cleaning_project.sql`, then explored trends in `data_analysis.sql`—top companies, industries, countries, and temporal patterns using SQL aggregations, CTEs, and window functions.

Size: 55.7 KB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

ContextLab/hypertools

A Python toolbox for gaining geometric insights into high-dimensional data

Language: Python - Size: 95.3 MB - Last synced at: 21 days ago - Pushed at: about 1 year ago - Stars: 1,843 - Forks: 161

datacarpentry/R-ecology-lesson

Data Analysis and Visualization in R for Ecologists

Language: R - Size: 608 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 321 - Forks: 508

EngNormie/Projects-Portfolio

Portfolio of my selected projects in Data Science, Data Analysis, Artificial Intelligence, Business Process Automation, Robotic Process Automation, etc. These projects reflect my long-standing career practice. I hope readers find insights and inspiration in them.

Language: Jupyter Notebook - Size: 41.5 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 2 - Forks: 0

SizaNcina/IBM-Data-Science-Professional-Certificate

Welcome to my Data Science project portfolio! This repository contains projects I am currently working on as part of my enrolled courses for the IBM Data Science Professional Certificate. I am fairly new to data analytics; all constructive feedback is welcome and will be appreciated!

Language: Jupyter Notebook - Size: 1.27 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

dathere/100.dathere.com

🧩 (WIP) The book "100 exercises with qsv". For new qsv users to read lessons and try out exercises either in-browser or locally in their terminal. Built with Jupyter Book.

Language: Jupyter Notebook - Size: 621 MB - Last synced at: 6 days ago - Pushed at: 27 days ago - Stars: 2 - Forks: 0

LucaCappelletti94/csv_trimming

Python package to remove common ugliness from a CSV-like file

Language: Python - Size: 130 KB - Last synced at: 18 days ago - Pushed at: 8 months ago - Stars: 99 - Forks: 0

happyhappyprojects/predicting_credit_card_approvals

This is a comprehensive personal project on supervised machine learning. The goal is to create a machine learning model that decides whether a new credit card application should be approved.

Language: Jupyter Notebook - Size: 655 KB - Last synced at: 25 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

tirthajyoti/Data-science-best-resources

Carefully curated resource links for data science in one place

Size: 8.93 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 3,038 - Forks: 1,001

nazif96/Sales-Analytics-dashboard

Sales Analytics Dashboard

Size: 14.6 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

fairtracks/omnipy

Omnipy is a high level Python library for type-driven data wrangling and scalable workflow orchestration (under development)

Language: Python - Size: 8.06 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 22 - Forks: 1

khanhnamle1994/cracking-the-data-science-interview

A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep

Language: Jupyter Notebook - Size: 235 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 3,978 - Forks: 1,099

audy21/kaggle

A practice-focused exploratory portfolio, using Kaggle datasets.

Language: Jupyter Notebook - Size: 1.96 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

stefmolin/pandas-workshop

An introductory workshop on pandas with notebooks and exercises for following along. Slides contain all solutions.

Language: Jupyter Notebook - Size: 27.1 MB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 386 - Forks: 773

BabithaRavindra/Data-Analytics

Diverse Data Analytics Projects: reports & dashboards for various domains. Includes Tata & Deloitte Virtual Internship Reports, along with their dashboards and datasets.

Size: 1.61 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

brenden-DS/diabetes-web-app

Language: Jupyter Notebook - Size: 438 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

mandliya/ml

A 60 days+ streak of daily learning of ML/DL/Maths concepts through projects

Language: Jupyter Notebook - Size: 101 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 77 - Forks: 16

mahadi-nahid/NormTab

[EMNLP 2024] NormTab: Improving Symbolic Reasoning in LLMs Through Tabular Data Normalization

Language: Python - Size: 13.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0

BdR76/CSVLint

CSV Lint plug-in for Notepad++ for syntax highlighting, csv validation, automatic column and datatype detecting, fixed width datasets, change datetime format, decimal separator, sort data, count unique values, convert to xml, json, sql etc. A plugin for data cleaning and working with messy data files.

Language: C# - Size: 12.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 182 - Forks: 16

edcr09/Delivery_app_orders_analysis

DA-3 Project analyzing customer orders in the Instacart app

Language: Jupyter Notebook - Size: 113 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

DCS-training/Digital-Method-of-the-Month

In this repository you will find the documents we produced to support the discussion in our Digital Methods of the Month. These documents will help you orient yourself if you want to pick up the method in your research. Go to the readme file

Size: 446 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 5 - Forks: 4

datacarpentry/r-raster-vector-geospatial

Introduction to Geospatial Raster and Vector Data with R

Language: R - Size: 249 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 112 - Forks: 112

rorrell/spotifyhistory

A Jupyter Notebook where I wrangle some data and plot a chart to draw some conclusions about a user's Spotify history

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Prem-ium/Tax-Merge

Consolidate multiple 1099 brokerage tax forms, visualize realized gains/losses, and track data for Fidelity, Chase, Vanguard, Schwab, and others!

Size: 65.4 KB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

vineet416/Chronic-Kidney-Disease-Prediction

This repository contains the code for the Chronic Kidney Disease Detection Prediction Project. The goal of this project is to predict chronic kidney disease using parameters such as Diabetes Mellitus, Blood Urea, Sugar, Hypertension, etc. I used multiple machine learning algorithms with hyperparameter tuning, reaching a highest accuracy score of 97.5

Language: Jupyter Notebook - Size: 3.15 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

Deller23/hotel_booking_data_cleaning

Efficiently transforming raw hotel booking data into actionable insights! This project leverages Python and Pandas for advanced data cleaning—handling missing values, detecting outliers, and optimizing features—ensuring a high-quality dataset ready for analysis and modeling.

Language: Jupyter Notebook - Size: 2.04 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

swcarpentry/sql-novice-survey

Databases and SQL

Size: 15.7 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 64 - Forks: 174

frank01101/quasar_candidates

Exploratory data analysis in Python of the quasar candidates catalog by Richards et al., ApJS 219 (2015).

Language: Jupyter Notebook - Size: 19.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

TempeHS/Practical-Application-of-NESA-Software-Engineering-MLOps

A Jupyter Notebook collection designed to develop a practical understanding of Machine Learning Operations (MLOps) defined in the NESA Software Engineering Course Specifications pg 27.

Language: Jupyter Notebook - Size: 1.46 MB - Last synced at: 30 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 5

dbohdan/sqawk

Like awk, but with SQL and table joins

Language: Tcl - Size: 574 KB - Last synced at: 15 days ago - Pushed at: 6 months ago - Stars: 313 - Forks: 14

sufwanmubeen/Data_Science_with_Python

This repository showcases my data science skills, including EDA, Python, data cleaning/wrangling, and visualization. It demonstrates my problem-solving abilities through interactive insights. Explore the notebooks and provide feedback.

Language: Jupyter Notebook - Size: 20.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

ohspc89/NEUR490

A repository containing R scripts and metadata to demonstrate data wrangling procedure to USC Neuroscience undergraduate students

Language: R - Size: 2.24 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

stemxresearch/dta

An R Library for Efficient Data Management and Manipulation

Language: R - Size: 1.67 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

PacktWorkshops/The-Data-Visualization-Workshop

A New, Interactive Approach to Learning Data Visualization

Language: Jupyter Notebook - Size: 254 MB - Last synced at: 26 days ago - Pushed at: almost 3 years ago - Stars: 86 - Forks: 99

moderndive/ModernDive_book

Statistical Inference via Data Science: A ModernDive into R and the Tidyverse

Language: HTML - Size: 1.35 GB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 771 - Forks: 501

MMBazel/Springboard-DataScienceTrack-Student

Springboard Program: Data Science Career Track - NLP

Language: Jupyter Notebook - Size: 63.3 MB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 146 - Forks: 81

mkearney/funique

⌚️ A faster unique() function

Language: R - Size: 7.15 MB - Last synced at: 29 days ago - Pushed at: over 6 years ago - Stars: 19 - Forks: 0

R-js/mangos

🥭's is a monorepo collecting data wrangling and data validation utilities

Language: JavaScript - Size: 424 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

hrbrmstr/fish-stocking-pdf-data-wrangling

🐠A fishy example of how to do PDF data wrangling in R

Language: R - Size: 1.81 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 7 - Forks: 0

jezcope/pyrefine

Execute OpenRefine JSON scripts without OpenRefine (or Java)

Language: Python - Size: 460 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 30 - Forks: 2

TrainingByPackt/Data-Wrangling-with-Python

Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices

Language: Jupyter Notebook - Size: 10.3 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 128 - Forks: 247

singhdivyank/PartisanshipClassifier

Text classification using PySpark and Seaborn

Language: Python - Size: 3.43 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

sethbr11/IntroToDataWrangling

A brief course intended to introduce non-programmers to Python and data wrangling. Also includes demonstrations of network optimization, PDF creation in Python, and a simple Monte Carlo simulation.

Language: Jupyter Notebook - Size: 719 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

vaxdata22/Amazon-Product-Sales

This is an Exploratory Data Analysis done on the Amazon Product Sales dataset from Kaggle.

Language: Jupyter Notebook - Size: 1.59 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

vaxdata22/Istanbul-Shopping

This is an Exploratory Data Analysis done on the Istanbul Shopping dataset from Kaggle.

Language: Jupyter Notebook - Size: 2.31 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

vaxdata22/Countries-Population

This is an Exploratory Data Analysis done on a Countries dataset from Kaggle

Language: Jupyter Notebook - Size: 379 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

vaxdata22/Automobile-Data

This is an Exploratory Data Analysis done on an Automobile dataset from Kaggle

Language: Jupyter Notebook - Size: 297 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

vaxdata22/Foresight-Institution

This is a Data Analysis case study done on the Foresight Institution dataset.

Size: 822 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

vaxdata22/Foresight-Pharmaceutical

This is a Data Analysis case study done on the Foresight Pharmaceutical Company dataset.

Size: 301 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

vaxdata22/Cyclistic-Ride-Sharing-Company

This is my Google Data Analytics Certificate case study for the Cyclistic ride-sharing company

Size: 98.6 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

mramshaw/Data-Cleaning

Data Cleaning with Python

Language: Python - Size: 1.17 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 44 - Forks: 17

Daviedavie100/Investigating_dataset_project

Udacity Nanodegree Program, investigating dataset project using the Soccer dataset.

Language: Jupyter Notebook - Size: 53.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

laura-budurlean/Data-Wrangling-Exercise-RO4532A

This R script performs data wrangling, cleaning, and transformation tasks for a fictitious study RO4532A. It processes multiple sheets from an Excel file, merges and reshapes the data, and generates a curated dataset.

Language: R - Size: 27.3 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

gagolews/teaching-data

Dr Marek's Data for Teaching/Training

Size: 168 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 22 - Forks: 49

whythawk/whyqd

data wrangling simplicity, complete audit transparency, and at speed

Language: Python - Size: 14 MB - Last synced at: 26 days ago - Pushed at: about 2 months ago - Stars: 34 - Forks: 1

Related Keywords
data-wrangling (996), python (338), data-visualization (338), data-analysis (293), data-science (249), pandas (212), data-cleaning (182), machine-learning (146), exploratory-data-analysis (118), r (113), numpy (110), data (89), jupyter-notebook (89), matplotlib (84), sql (78), seaborn (70), python3 (62), statistics (52), data-visualisation (52), eda (47), data-analytics (38), csv (36), data-mining (36), web-scraping (35), udacity-data-analyst-nanodegree (35), data-analysis-python (33), feature-engineering (33), data-preprocessing (33), udacity (29), visualization (29), json (27), excel (27), data-analyst-nanodegree (26), data-manipulation (26), database (25), data-exploration (24), webscraping (23), linear-regression (22), twitter-api (22), javascript (22), udacity-nanodegree (21), tidyverse (21), dplyr (21), data-collection (21), carpentries (21), lesson (20), english (20), data-processing (20), statistical-analysis (20), ggplot2 (19), data-engineering (19), machine-learning-algorithms (19), dashboard (18), matplotlib-pyplot (18), pandas-dataframe (18), data-munging (17), scikit-learn (17), data-preparation (17), rstats (17), nodejs (17), dataset (16), analytics (16), tableau (16), etl (16), api (15), tweepy (15), stable (15), r-programming (15), data-modeling (15), data-structures (14), model-evaluation (14), data-gathering (14), postgresql (13), data-carpentry (13), pyspark (13), data-cleansing (13), mongodb (13), logistic-regression (12), data-transformation (12), twitter (12), programming (12), predictive-modeling (12), deep-learning (11), pandas-python (11), hypothesis-testing (11), plotly (11), rmarkdown (11), requests (11), node-js (11), node (11), classification (11), data-scraping (11), anaconda (10), rstudio (10), xml (10), business-intelligence (10), model-development (10), feature-selection (10), clustering (10), data-management (10)