An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-munging

data-forge/data-forge-ts

The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.

Language: TypeScript - Size: 3.28 MB - Last synced at: 4 days ago - Pushed at: 6 months ago - Stars: 1,357 - Forks: 80

danielvartan/groomr

🧹 Tidy Tools

Language: R - Size: 1.6 MB - Last synced at: 9 days ago - Pushed at: 22 days ago - Stars: 2 - Forks: 0

TrainingByPackt/Data-Wrangling-with-Python

Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices

Language: Jupyter Notebook - Size: 10.3 MB - Last synced at: 15 days ago - Pushed at: over 3 years ago - Stars: 128 - Forks: 247

caltechlibrary/datatools

A set of tools for working with JSON, CSV and Excel workbooks

Language: Go - Size: 4.07 MB - Last synced at: 10 days ago - Pushed at: 3 months ago - Stars: 78 - Forks: 10

mramshaw/Data-Cleaning

Data Cleaning with Python

Language: Python - Size: 1.17 MB - Last synced at: 14 days ago - Pushed at: 10 months ago - Stars: 44 - Forks: 17

ledvinkao/zvzgr

Language: R - Size: 267 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

trinker/textclean

Tools for cleaning and normalizing text data

Language: R - Size: 23.8 MB - Last synced at: 17 days ago - Pushed at: over 3 years ago - Stars: 248 - Forks: 26

FalconSoft/dataPipe

dataPipe is a data processing and data analytics library for JavaScript. Inspired by LINQ (C#) and Pandas (Python)

Language: TypeScript - Size: 279 KB - Last synced at: 13 days ago - Pushed at: about 1 year ago - Stars: 20 - Forks: 2

btmonier/vidger

Make rapid visualizations of RNA-seq data in R

Language: R - Size: 4.25 MB - Last synced at: about 1 month ago - Pushed at: almost 6 years ago - Stars: 19 - Forks: 6

ManuKot/wlamart_globaltech

Walmart USA Advanced Software Engineering Virtual Experience Program on Forage - October 2024 * Completed the Advanced Software Engineering Job Simulation where I solved difficult technical projects for a variety of teams at Walmart. * Developed a novel version of a heap data structure in Java for Walmart’s shipping department, showcasing

Language: Python - Size: 458 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

LukasHedegaard/datasetops

Fluent dataset operations, compatible with your favorite libraries

Language: Python - Size: 24.7 MB - Last synced at: 10 days ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 0

rmelikov/computing_for_data_analysis

Georgia Tech's Spring 2020 CSE 6040 Computing for Data Analysis Class with Dr. Richard Vuduc

Language: Python - Size: 452 MB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

annajiat/2022-09-24-bracu-pr-hybrid

Language: HTML - Size: 2.08 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

fuchsia-programming/scrape 📦

When you need those jobs hypersonic 🚀 scrape 🔪

Language: JavaScript - Size: 2.79 MB - Last synced at: 12 months ago - Pushed at: over 5 years ago - Stars: 10 - Forks: 3

interruptinuse/txr

unofficial txr lisp mirror, unaffiliated with Kaz Kylheku, just to make builds easier i guess

Language: C - Size: 40 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

CleverInsight/cognito

🚀🤖 Cognito - Simplifies AutoML Data Preprocessing.

Language: Python - Size: 950 KB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 12 - Forks: 11

akashborigi/Countries-Project

Our investigative analysis within Tableau has offered a panoramic view of the factors influencing global development.

Size: 696 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

mx360-s/mungingtools

µnging

Language: Python - Size: 79.1 KB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

Govind-Asawa/Loan-Prediction

Comprehensive work on a loan dataset, including EDA, data munging, and classification models, to mention a few

Language: Python - Size: 52.7 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

verascity/verascity.github.io

This repository contains my portfolio site:

Language: HTML - Size: 4.75 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 1

settinge/Heroes_of_Pymoli

Pandas is used to report on insightful statistics around video game purchasing.

Language: Jupyter Notebook - Size: 83 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

allentv/DataGoose

A project to untangle the mystery of data wrangling

Size: 32.2 KB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 1

garynth41/University-of-Califonia-Davis-Data-Visualization-with-Tableau-Specialization

Visualize Business Data with Tableau. Create powerful business intelligence reports

Size: 20 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

aeglon97/Wrangle-Analyze-Twitter-WeRateDogs

Implemented the entire data wrangling process to analyze and visualize data from the WeRateDogs Twitter page.

Language: Jupyter Notebook - Size: 3.88 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

aeglon97/Apriori-Algorithm

Using the Apriori Algorithm to analyze different types of crime in NYC boroughs

Language: Python - Size: 120 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 2

Ikarthikmb/Machine-Learning-Notebook

Analysing, processing, visualizing the data using Machine Learning and Data Science in Python

Language: Jupyter Notebook - Size: 1.16 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 1

nathan-booth/weratedogs

Udacity data cleaning project in Jupyter

Language: Jupyter Notebook - Size: 2.05 MB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 7

nathan-booth/2016_NE_presidential_contributions

Udacity exploratory analysis in R

Language: HTML - Size: 5.23 MB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

yashpandey474/AI-Disease-Prediction-System

Project implementing a predictive model for medical diagnosis using a supervised linear regression model, with improvements to the data munging process, experimentation with feature selection, and exploration of other models

Language: Java - Size: 23.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

tara-nguyen/english-premier-league-datasets-for-10-seasons

Clean datasets for 10 seasons of the English Premier League, including league tables, match stats, and head-to-head performances

Language: R - Size: 161 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 11

chinmaybhalodia/walmart-virtual-internship

All the tasks submitted during the Advanced Software Engineering Virtual Training Program offered by Walmart Global Tech on Forage platform.

Language: Jupyter Notebook - Size: 171 KB - Last synced at: 17 days ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

baligoyem/BAG

A toolbox that contains methods to solve real-world data science problems

Language: Jupyter Notebook - Size: 298 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

bdoremus/BreakingLessBad

3 week Capstone project with BP - predicting part failure based on historical data

Size: 11.9 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

sayaliwalke30/Kaggle-Projects

This repo contains 4 different projects. Built various machine learning models for Kaggle competitions. Also carried out Exploratory Data Analysis, Data Cleaning, Data Visualization, Data Munging, Feature Selection, etc.

Language: Jupyter Notebook - Size: 4.03 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 14 - Forks: 11

phrazzld/kaggle 📦

Solutions to Kaggle problems.

Language: Jupyter Notebook - Size: 2.7 MB - Last synced at: about 1 year ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 0

AlexLamson/DataWrangler

Quick and dirty data mining made easier in Sublime Text

Language: Python - Size: 353 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 11 - Forks: 2

samkamau81/Walmart-USA-Advanced-Software-Engineering-Virtual-Experience

This Virtual Experience included in-depth tasks on Advanced Data Structures, Software Architecture, Relational Database Design and Data Munging

Language: Java - Size: 2.22 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

caheredia/caheredia.github.io

Cristian Heredia | Data exploration expert, maximizing signal-to-noise through storytelling. Ask me how to use data to inform the business decisions.

Language: HTML - Size: 1.69 MB - Last synced at: about 3 hours ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

data-forge/data-forge-fs

This library contains the file system extensions to Data-Forge that allow it to directly read and write CSV and JSON files in Node.js

Language: TypeScript - Size: 265 KB - Last synced at: 2 days ago - Pushed at: about 3 years ago - Stars: 10 - Forks: 3

bjschafer/pfutils

Utilities and helpers for the Pathfinder RPG (1e)

Language: Python - Size: 16.6 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Henrik-Kowalkowski/r_zillow_house_prices

An exploration of Zillow house prices with R

Size: 24.1 MB - Last synced at: 10 days ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

CleverInsight/sparx

Data Munging, Data Wrangling and Data Preparation Simplified

Language: Python - Size: 1.51 MB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 3

NTavou/Wrangle-OpenStreetMap-Data

Used data munging techniques to assess the quality of Liverpool OpenStreetMap dataset for validity, accuracy, completeness, consistency and uniformity. (Python, SQL, data verification, data cleaning)

Language: Python - Size: 7.69 MB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

augustinasn/_data_science_projects

Size: 48.3 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

sherirosalia/Stars-of-LA

How many stars is a restaurant in LA likely to get? This project explores the relationship between location, reviews and number of ratings for various cuisine types within Los Angeles. Link to site is just below this paragraph.

Language: Jupyter Notebook - Size: 14.1 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

wbuchanan/stataConference2020-readit

Slides for 2020 Stata Conference talk about the program readit

Language: JavaScript - Size: 1.63 MB - Last synced at: 9 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

Mantej-Singh/Data-Munging---Python

Data Wrangling using Pandas

Language: Jupyter Notebook - Size: 1.61 MB - Last synced at: 5 days ago - Pushed at: about 6 years ago - Stars: 11 - Forks: 3

ncov19-us/ds

Data Analytics, Data Munging, and Plotly Plots

Language: Jupyter Notebook - Size: 1.77 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 2

ldnicolasmay/Tidy_Eval_in_R_for_Munging_UDS_Data

NACC ADRC Data Core webinar 2019-07-31 https://www.youtube.com/watch?v=52R8fNbccx4

Language: HTML - Size: 1.2 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

filipok/2012ROElections

2012 Romania elections data (R learning project)

Language: R - Size: 2.72 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

learning-dev/data-munging-codekata-4

Language: JavaScript - Size: 26.4 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

ericoulster/DJ-Beatmatch-Encoder

A tool I use to format my songs when designing DJ sets.

Language: Python - Size: 20.5 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

heitorbaldo/Data-Analysis-with-Julia

Codes for Data Analysis with Julia: A Short Introduction (Summer School - 2016)

Size: 3.91 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

felsen/Python_for_Data_Science_Essentials

Python for Data Science Essentials

Language: Jupyter Notebook - Size: 12.7 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

StevenMMortimer/crmfunc

An R Package for Handling CRM Data

Language: R - Size: 9.77 KB - Last synced at: 29 days ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 0

nickcica/dand

Udacity's Data Analyst Nanodegree - Projects 1-7

Language: HTML - Size: 10 MB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

alemosie/zurich-osm-munging

Udacity DAND project: Wrangle OpenStreetMap (OSM) Data

Language: Python - Size: 875 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 1

Related Keywords
data-munging (57), data-wrangling (17), python (15), data-science (14), data-cleaning (13), data-analysis (11), pandas (11), r (10), data-visualization (10), machine-learning (9), python3 (7), visualization (5), data-manipulation (5), csv (5), json (4), data-mining (4), eda (4), numpy (4), sql (4), java (3), jupyter-notebook (3), linear-regression (3), exploratory-data-analysis (3), software-architecture (3), deep-learning (3), linq (3), javascript (3), data-cleansing (3), data (3), udacity-data-analyst-nanodegree (2), openstreetmap (2), dataset (2), relational-database-design (2), data-structures (2), data-preprocessing (2), tableau-desktop (2), kaggle (2), plotly (2), analytics (2), web-scraping (2), jupyter-notebooks (2), nodejs (2), data-visualisation (2), data-forge (2), data-management (2), machine-learning-algorithms (2), twitter-api (1), software-engineering (1), regex (1), house-price-prediction (1), python-packages (1), statistics (1), bayesian-nonparametric-models (1), compressor-failures (1), gaussian-processes (1), diabetes-prediction (1), series-data (1), bankloanprediction (1), creditcardfrauddetection (1), datacleaning (1), datamunging (1), datavisualization (1), new-york-city (1), spyder (1), algorithms (1), ml (1), exploratory-analysis (1), disease-prediction (1), heart-disease-prediction (1), improvement (1), homework (1), datasets (1), datamining (1), football (1), data-mining-algorithms (1), crime-data (1), premier-league (1), crime-analysis (1), soccer (1), crime (1), apriori-algorithm (1), weratedogs (1), coronavirus (1), covid19-data (1), gis-data (1), plotly-express (1), scraping-websites (1), metaprogramming (1), tidyeval (1), tidyverse (1), romania (1), codekata (1), codekata-4 (1), file-format (1), functional-programming (1), functional-python (1), music-library (1), music-notation (1), regular-expressions (1), toronto (1)