GitHub / gandalf1819 / NYCOpenData-Profiling-Analysis
Open data profiling, quality assessment, and analysis of NYC OpenData datasets, with semantic profiling using fuzzy ratio, Levenshtein distance, and regular expressions
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gandalf1819%2FNYCOpenData-Profiling-Analysis
PURL: pkg:github/gandalf1819/NYCOpenData-Profiling-Analysis
Stars: 6
Forks: 4
Open issues: 0
License: MIT
Language: Jupyter Notebook
Size: 17.9 MB
Dependencies parsed at: Pending
Created at: over 5 years ago
Updated at: over 2 years ago
Pushed at: over 4 years ago
Last synced at: over 2 years ago
Topics: big-data, dask, dask-distributed, data-profiling, fuzzy-logic, fuzzywuzzy, hdfs, levenshtein-distance, modin, nyc-311-dataset, nyc-opendata, pandas, pyspark, regular-expressions
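
The description mentions semantic profiling with fuzzy ratio, Levenshtein distance, and regex. The following is a minimal sketch of how those three techniques might be combined to label column values, not the repository's actual code. The patterns, the `BOROUGHS` reference list, the `label_value` helper, and the `max_distance` threshold are all hypothetical, and `difflib.SequenceMatcher` from the standard library stands in for a fuzzy-ratio library such as fuzzywuzzy.

```python
import re
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; a stand-in for a fuzzy ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Hypothetical regex patterns for two semantic types.
PATTERNS = {
    "zip_code": re.compile(r"\d{5}(-\d{4})?"),
    "phone_number": re.compile(r"\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}"),
}

# Hypothetical reference vocabulary for a dictionary-backed semantic type.
BOROUGHS = ["bronx", "brooklyn", "manhattan", "queens", "staten island"]


def label_value(value: str, max_distance: int = 2) -> str:
    """Assign a semantic label: regex first, then edit distance to a vocabulary."""
    text = value.strip()
    for label, pattern in PATTERNS.items():
        if pattern.fullmatch(text):
            return label
    lowered = text.lower()
    closest = min(BOROUGHS, key=lambda name: levenshtein(lowered, name))
    if levenshtein(lowered, closest) <= max_distance:
        return "borough"  # tolerates small typos such as "Brookyln"
    return "other"
```

Checking regexes before the edit-distance lookup keeps well-structured values (ZIP codes, phone numbers) from being compared against the vocabulary at all, while the distance threshold lets misspelled categorical values still receive a label.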