An open API service providing repository metadata for many open source software ecosystems.

Topic: "data-wrangling"

OpenRefine/OpenRefine

OpenRefine is a free, open source power tool for working with messy data and improving it

Language: Java - Size: 387 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 11,303 - Forks: 2,052

TomWright/dasel

Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.

Language: Go - Size: 8.56 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 7,448 - Forks: 146

khanhnamle1994/cracking-the-data-science-interview

A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep

Language: Jupyter Notebook - Size: 235 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 3,978 - Forks: 1,099

tirthajyoti/Data-science-best-resources

Carefully curated resource links for data science in one place

Size: 8.93 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 3,038 - Forks: 1,001

dathere/qsv

Blazing-fast Data-Wrangling toolkit

Language: Rust - Size: 63.9 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,794 - Forks: 79

iterative/datachain

ETL, Analytics, Versioning for Unstructured Data

Language: Python - Size: 10.5 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2,548 - Forks: 112

ContextLab/hypertools

A Python toolbox for gaining geometric insights into high-dimensional data

Language: Python - Size: 95.3 MB - Last synced at: 21 days ago - Pushed at: about 1 year ago - Stars: 1,843 - Forks: 161

brimdata/zui

Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.

Language: TypeScript - Size: 221 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,839 - Forks: 133

hi-primus/optimus

Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

Language: Python - Size: 110 MB - Last synced at: about 22 hours ago - Pushed at: 5 months ago - Stars: 1,508 - Forks: 232

skrub-data/skrub

Machine learning with dataframes

Language: Python - Size: 12.4 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,380 - Forks: 128

data-forge/data-forge-ts

The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.

Language: TypeScript - Size: 3.68 MB - Last synced at: 5 days ago - Pushed at: 12 days ago - Stars: 1,361 - Forks: 79

moderndive/ModernDive_book

Statistical Inference via Data Science: A ModernDive into R and the Tidyverse

Language: HTML - Size: 1.35 GB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 771 - Forks: 501

microsoft/prose

Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Microsoft Program Synthesis using Examples SDK.

Language: C# - Size: 81.6 MB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 642 - Forks: 99

stefmolin/Hands-On-Data-Analysis-with-Pandas-2nd-edition

Materials for following along with Hands-On Data Analysis with Pandas – Second Edition

Language: Jupyter Notebook - Size: 70.1 MB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 622 - Forks: 1,462

stefmolin/Hands-On-Data-Analysis-with-Pandas

Materials for following along with Hands-On Data Analysis with Pandas.

Language: Jupyter Notebook - Size: 31.2 MB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 417 - Forks: 818

Desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios with these algorithms. Desbordante has a console version and an easy-to-use web application.

Language: C++ - Size: 143 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 401 - Forks: 76

stefmolin/pandas-workshop

An introductory workshop on pandas with notebooks and exercises for following along. Slides contain all solutions.

Language: Jupyter Notebook - Size: 27.1 MB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 386 - Forks: 773

datacarpentry/R-ecology-lesson

Data Analysis and Visualization in R for Ecologists

Language: R - Size: 608 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 321 - Forks: 508

dbohdan/sqawk

Like awk, but with SQL and table joins

Language: Tcl - Size: 574 KB - Last synced at: 16 days ago - Pushed at: 6 months ago - Stars: 313 - Forks: 14

shawnbrown/datatest

Tools for test driven data-wrangling and data validation.

Language: Python - Size: 3.27 MB - Last synced at: 26 days ago - Pushed at: over 3 years ago - Stars: 294 - Forks: 13

georgevbsantiago/qsacnpj

Package that processes and organizes data from the Cadastro Nacional da Pessoa Jurídica (CNPJ), Brazil's national registry of legal entities

Language: R - Size: 2.83 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 294 - Forks: 74

kjam/data-cleaning-101

Data Cleaning Libraries with Python

Language: Jupyter Notebook - Size: 6.78 MB - Last synced at: about 14 hours ago - Pushed at: over 1 year ago - Stars: 287 - Forks: 172

tirthajyoti/Web-Database-Analytics

Web scraping and related analytics using Python tools

Language: Jupyter Notebook - Size: 4.24 MB - Last synced at: 21 days ago - Pushed at: almost 5 years ago - Stars: 273 - Forks: 168

ajaymache/data-analysis-using-python

Exploratory data analysis 📊 using Python 🐍 of a used car 🚘 database taken from Kaggle

Language: Jupyter Notebook - Size: 49.3 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 193 - Forks: 89

BdR76/CSVLint

CSV Lint plug-in for Notepad++ for syntax highlighting, csv validation, automatic column and datatype detecting, fixed width datasets, change datetime format, decimal separator, sort data, count unique values, convert to xml, json, sql etc. A plugin for data cleaning and working with messy data files.

Language: C# - Size: 12.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 182 - Forks: 16

LibreCat/Catmandu

Catmandu - a data processing toolkit

Language: Perl - Size: 53.1 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 176 - Forks: 31

swcarpentry/python-novice-gapminder

Plotting and Programming in Python

Size: 17 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 170 - Forks: 433

datacarpentry/python-ecology-lesson

Data Analysis and Visualization in Python for Ecologists

Language: Jupyter Notebook - Size: 28.4 MB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 168 - Forks: 309

swcarpentry/r-novice-gapminder

R for Reproducible Scientific Analysis

Language: R - Size: 179 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 166 - Forks: 543

swcarpentry/r-novice-inflammation

Programming with R

Language: R - Size: 51.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 164 - Forks: 392

strengejacke/sjmisc

Data transformation and utility functions for R

Language: R - Size: 6.83 MB - Last synced at: 8 days ago - Pushed at: 12 months ago - Stars: 159 - Forks: 24

MMBazel/Springboard-DataScienceTrack-Student

Springboard Program: Data Science Career Track - NLP

Language: Jupyter Notebook - Size: 63.3 MB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 146 - Forks: 81

data-forge/data-forge-js 📦

JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.

Language: JavaScript - Size: 2.09 MB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 142 - Forks: 11

dlab-berkeley/R-Fundamentals-Legacy

D-Lab's 12 hour introduction to R Fundamentals. Learn how to create variables and functions, manipulate data frames, make visualizations, use control flow structures, and more, using R in RStudio.

Language: R - Size: 14.3 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 140 - Forks: 50

sl-solution/InMemoryDatasets.jl

Multithreaded package for working with tabular data in Julia

Language: Julia - Size: 7.54 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 130 - Forks: 18

TrainingByPackt/Data-Wrangling-with-Python

Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices

Language: Jupyter Notebook - Size: 10.3 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 128 - Forks: 247

datacarpentry/r-socialsci

R for Social Scientists

Language: R - Size: 221 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 121 - Forks: 208

datacarpentry/r-raster-vector-geospatial

Introduction to Geospatial Raster and Vector Data with R

Language: R - Size: 249 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 112 - Forks: 112

LucaCappelletti94/csv_trimming

Python package to remove common ugliness from a CSV-like file

Language: Python - Size: 130 KB - Last synced at: 18 days ago - Pushed at: 8 months ago - Stars: 99 - Forks: 0

asavinov/prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

Language: Python - Size: 1.95 MB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 90 - Forks: 5

PacktWorkshops/The-Data-Visualization-Workshop

A New, Interactive Approach to Learning Data Visualization

Language: Jupyter Notebook - Size: 254 MB - Last synced at: 26 days ago - Pushed at: almost 3 years ago - Stars: 86 - Forks: 99

uc-r/uc-r.github.io

Main repository for R programming courses @ University of Cincinnati, courses and tutorials that focus on data wrangling, exploration, visualization, and analysis with R.

Language: HTML - Size: 173 MB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 86 - Forks: 52

r-rudra/tidycells

Automatic transformation of untidy spreadsheet-like data into tidy form

Language: R - Size: 2.81 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 83 - Forks: 10

gagolews/datawranglingpy

Minimalist Data Wrangling with Python (Open-Access Textbook)

Size: 300 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 79 - Forks: 4

mandliya/ml

A 60+ day streak of daily learning of ML/DL/Maths concepts through projects

Language: Jupyter Notebook - Size: 101 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 77 - Forks: 16

datacarpentry/wrangling-genomics

Data Wrangling and Processing for Genomics

Language: Shell - Size: 73.8 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 71 - Forks: 153

swcarpentry/sql-novice-survey

Databases and SQL

Size: 15.7 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 64 - Forks: 174

SISBID/Data-Wrangling

Teaching material for Summer Institute in Statistics for Big Data Module 1.

Language: HTML - Size: 456 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 59 - Forks: 85

R-Korea/weekly_R_quiz

Data wrangling & visualization quizzes for R users

Language: R - Size: 70.7 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 54 - Forks: 11

datacarpentry/sql-ecology-lesson

Data Management with SQL for Ecologists

Size: 11.9 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 51 - Forks: 148

rsquaredacademy-education/online-courses

Free online R courses

Language: R - Size: 125 MB - Last synced at: 5 months ago - Pushed at: almost 4 years ago - Stars: 49 - Forks: 35

datacarpentry/r-intro-geospatial

Introduction to R for Geospatial Data

Language: R - Size: 130 MB - Last synced at: 2 days ago - Pushed at: 6 days ago - Stars: 46 - Forks: 70

kaishengteh/Data-Analyst-Nanodegree

Kai Sheng Teh - Udacity Data Analyst Nanodegree

Language: HTML - Size: 33.7 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 45 - Forks: 34

mramshaw/Data-Cleaning

Data Cleaning with Python

Language: Python - Size: 1.17 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 44 - Forks: 17

TomFevrier/kiwis

A Pandas-inspired data wrangling toolkit in JavaScript

Language: JavaScript - Size: 546 KB - Last synced at: 9 days ago - Pushed at: almost 5 years ago - Stars: 38 - Forks: 2

pwwang/pipda

A framework for data piping in python

Language: Python - Size: 850 KB - Last synced at: 26 days ago - Pushed at: over 1 year ago - Stars: 37 - Forks: 3

dlab-berkeley/R-Data-Wrangling-Legacy

D-Lab's 6 hour introduction to data wrangling with R. Learn how to manipulate dataframes using the tidyverse in R.

Language: R - Size: 6.23 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 37 - Forks: 17

seifip/udacity-data-analyst-nanodegree

Project work for the Udacity Data Analyst Nanodegree

Language: HTML - Size: 15.4 MB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 37 - Forks: 41

datacarpentry/python-socialsci

Data Analysis and Visualization with Python for Social Scientists

Size: 15.2 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 36 - Forks: 71

whythawk/whyqd

Data wrangling with simplicity, complete audit transparency, and speed

Language: Python - Size: 14 MB - Last synced at: 26 days ago - Pushed at: about 2 months ago - Stars: 34 - Forks: 1

chrislicodes/Udacity-Data-Analyst-Nanodegree

Repository for the projects needed to complete the Data Analyst Nanodegree.

Language: Jupyter Notebook - Size: 93.1 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 34 - Forks: 22

InPhyT/COVID19-Italy-Integrated-Surveillance-Data

COVID-19 integrated surveillance data provided by the Italian Institute of Health and processed via UnrollingAverages.jl to deconvolve the weekly moving averages.

Language: Julia - Size: 1.94 GB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 31 - Forks: 4

jezcope/pyrefine

Execute OpenRefine JSON scripts without OpenRefine (or Java)

Language: Python - Size: 460 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 30 - Forks: 2

datacarpentry/genomics-r-intro

Intro to R and RStudio for Genomics

Language: R - Size: 135 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 28 - Forks: 89

UBOdin/mimir

Data-ish exploration through SQL+Uncertainty

Language: Scala - Size: 126 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 27 - Forks: 13

yassinassafi/WQU-data-science-challenges

I successfully completed the 2-unit, 16-week Data Science module at WorldQuant University, including 6 mini-projects. The mini-projects covered scientific computing, data wrangling, machine learning, and natural language processing with Python.

Language: Jupyter Notebook - Size: 44.9 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 26 - Forks: 21

umich-dbgroup/foofah

Foofah: programming-by-example data transformation program synthesizer

Language: CSS - Size: 4.31 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 25 - Forks: 10

kHarshit/udacity-nanodegree-projects

Udacity nanodegree projects: DLND, DRLND, DAND

Language: Jupyter Notebook - Size: 46.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 23 - Forks: 16

fairtracks/omnipy

Omnipy is a high level Python library for type-driven data wrangling and scalable workflow orchestration (under development)

Language: Python - Size: 8.06 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 22 - Forks: 1

gagolews/teaching-data

Dr Marek's Data for Teaching/Training

Size: 168 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 22 - Forks: 49

naqvis/CrysDA

Crystal library for Data Analysis, Wrangling, Munging

Language: Crystal - Size: 8.11 MB - Last synced at: 21 days ago - Pushed at: about 2 years ago - Stars: 22 - Forks: 2

Mr-Chang95/Portfolio

Daniel Chang's Portfolio

Language: Jupyter Notebook - Size: 57.8 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 22 - Forks: 7

maprihoda/data-analysis-with-python-and-pyspark

Language: Python - Size: 6.87 MB - Last synced at: 12 days ago - Pushed at: over 4 years ago - Stars: 22 - Forks: 12

VianneyMI/monggregate

Library to make the MongoDB aggregation framework and pipelines easy to use in Python.

Language: Python - Size: 1.56 MB - Last synced at: about 15 hours ago - Pushed at: about 16 hours ago - Stars: 21 - Forks: 3

buabaj/xplore

A Python package built for data scientists, analysts, and AI/ML engineers to explore the features of a dataset in a minimal number of lines of code, for quick analysis before data wrangling and feature extraction.

Language: Python - Size: 1.74 MB - Last synced at: about 22 hours ago - Pushed at: about 4 years ago - Stars: 21 - Forks: 11

FalconSoft/dataPipe

dataPipe is a data processing and data analytics library for JavaScript. Inspired by LINQ (C#) and Pandas (Python)

Language: TypeScript - Size: 279 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 20 - Forks: 2

chris-prener/qualmap

R package for working with semi-structured qualitative GIS data

Language: R - Size: 1.2 MB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 3

pradeepsinngh/Data-Science-101

Notes and tutorials on how to use python, pandas, seaborn, numpy, matplotlib, scipy for data science.

Language: Jupyter Notebook - Size: 6.21 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 20 - Forks: 4

datacarpentry/stata-economics

Economics Lesson with Stata

Language: Makefile - Size: 17.1 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 19 - Forks: 17

LukaIgnjatovic/DataCamp_-_Track_-_Data_Scientist_with_R_-_Course_03_-_Introduction_to_the_Tidyverse

Repository of DataCamp's "Introduction to the Tidyverse" course.

Size: 3.37 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 19 - Forks: 17

mkearney/funique

⌚️ A faster unique() function

Language: R - Size: 7.15 MB - Last synced at: 29 days ago - Pushed at: over 6 years ago - Stars: 19 - Forks: 0

csc-training/da-with-r-remote

Data Analysis with R (Remote Course)

Language: HTML - Size: 16.5 MB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 16 - Forks: 3

r-hyperspec/hyperSpec

hyperSpec: Tools for Spectroscopy (R package)

Language: R - Size: 107 MB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 16 - Forks: 4

Data-Wrangling-with-JavaScript/Chapter-2

Code examples for Chapter 2 of Data Wrangling with JavaScript

Language: JavaScript - Size: 69.3 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 9

dlab-berkeley/advanced-data-wrangling-in-R-legacy 📦

Advanced-data-wrangling-in-R, Workshop

Language: HTML - Size: 24.6 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 8

tara-nguyen/english-premier-league-datasets-for-10-seasons

Clean datasets for 10 seasons of the English Premier League, including league tables, match stats, and head-to-head performances

Language: R - Size: 161 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 11

LibraryCarpentry/lc-sql

Library Carpentry: SQL

Size: 20.5 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 13 - Forks: 36

ISUgenomics/datascience-workbook

Introduction to large scale computing and data wrangling with hands-on tutorials

Language: HTML - Size: 244 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 13 - Forks: 5

jananiravi/workshop-tidyverse

Workshop: Using R/tidyverse to analyze & visualize gapminder/processed transcriptomics data!

Language: HTML - Size: 115 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 20

singhsidhukuldeep/singhsidhukuldeep.github.io

This is a completely open-source repo of interview questions and answers for people preparing for such interviews. It is maintained by the community, and you can submit questions that you faced during interviews.

Language: HTML - Size: 8.55 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 12 - Forks: 6

CleverInsight/cognito

🚀🤖 Cognito - Simplifies AutoML Data Preprocessing.

Language: Python - Size: 950 KB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 12 - Forks: 11

datacarpentry/sql-socialsci

Data Management with SQL for Social Scientists

Size: 14.5 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 11 - Forks: 31

LukasHedegaard/datasetops

Fluent dataset operations, compatible with your favorite libraries

Language: Python - Size: 24.7 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 0

chuksoo/IBM-Data-Science-Capstone-SpaceX

In this project, we predicted if the Falcon 9 first stage will land successfully by following the data science methodology. We also summarized the results for the business stakeholders.

Language: Jupyter Notebook - Size: 25.2 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 11 - Forks: 68

AlexLamson/DataWrangler

Quick and dirty data mining made easier in Sublime Text

Language: Python - Size: 353 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 11 - Forks: 2

Mantej-Singh/Data-Munging---Python

Data Wrangling using Pandas

Language: Jupyter Notebook - Size: 1.61 MB - Last synced at: 6 days ago - Pushed at: about 6 years ago - Stars: 11 - Forks: 3

swcarpentry/r-novice-gapminder-es

R for Reproducible Scientific Analysis (Spanish translation)

Language: R - Size: 92.4 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 10 - Forks: 52

datacarpentry/python-ecology-lesson-es

Data Analysis and Visualization Using Python (Spanish translation)

Language: Jupyter Notebook - Size: 31.3 MB - Last synced at: 12 days ago - Pushed at: 13 days ago - Stars: 10 - Forks: 52

DCS-training/CDCS-Summer-School2021

2021 Text and Data Analysis Summer School

Language: Jupyter Notebook - Size: 101 MB - Last synced at: 16 days ago - Pushed at: 10 months ago - Stars: 10 - Forks: 7

ContextLab/data-wrangler

Wrangle messy numerical, image, and text data into consistent well-organized formats

Language: Python - Size: 1.26 MB - Last synced at: 30 days ago - Pushed at: almost 3 years ago - Stars: 10 - Forks: 2

Related Topics
python 338 data-visualization 338 data-analysis 293 data-science 249 pandas 212 data-cleaning 182 machine-learning 146 exploratory-data-analysis 118 r 113 numpy 110 data 89 jupyter-notebook 89 matplotlib 84 sql 78 seaborn 70 python3 62 data-visualisation 52 statistics 52 eda 47 data-analytics 38 csv 36 data-mining 36 web-scraping 35 udacity-data-analyst-nanodegree 35 data-preprocessing 34 feature-engineering 33 data-analysis-python 33 visualization 29 udacity 29 json 27 excel 27 data-manipulation 26 data-analyst-nanodegree 26 database 25 data-exploration 24 webscraping 23 linear-regression 22 twitter-api 22 javascript 22 carpentries 21 tidyverse 21 udacity-nanodegree 21 data-collection 21 dplyr 21 lesson 20 english 20 data-processing 20 statistical-analysis 20 ggplot2 19 machine-learning-algorithms 19 data-engineering 19 dashboard 18 pandas-dataframe 18 matplotlib-pyplot 18 rstats 17 nodejs 17 scikit-learn 17 data-munging 17 data-preparation 17 analytics 16 etl 16 dataset 16 tableau 16 tweepy 15 r-programming 15 stable 15 api 15 data-modeling 15 model-evaluation 14 data-gathering 14 data-structures 14 data-carpentry 13 data-transformation 13 mongodb 13 pyspark 13 postgresql 13 data-cleansing 13 logistic-regression 12 predictive-modeling 12 twitter 12 programming 12 deep-learning 11 pandas-python 11 hypothesis-testing 11 data-scraping 11 node-js 11 node 11 rmarkdown 11 requests 11 plotly 11 classification 11 data-analyst 10 rstudio 10 supervised-learning 10 xml 10 clustering 10 openstreetmap 10 feature-selection 10 business-intelligence 10 anaconda 10