An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-cleaning-pipeline

Shuyib/chronic-kidney-disease-kaggle

Uses machine learning models to predict whether patients have chronic kidney disease based on a few features. The model results are also interpreted to make them more understandable to health practitioners.

Language: Jupyter Notebook - Size: 3.78 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 8 - Forks: 1

apelullo/cobalt_health_wellness_platform_ops

Cobalt is a mental health and wellness platform created for Penn Medicine employees that serves as a hub for support services such as therapy, wellness coaching, topic- and population-specific group sessions, and a variety of self-help resources.

Language: Jupyter Notebook - Size: 194 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

jim-schwoebel/allie

🤖 An automated machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers). Python 3.6 required.

Language: Python - Size: 275 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 143 - Forks: 35

rrsmart8/Product-Deduplication

Language: Jupyter Notebook - Size: 29.3 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

Elysian01/Data-Purifier

A Python library for Automated Exploratory Data Analysis, Automated Data Cleaning, and Automated Data Preprocessing For Machine Learning and Natural Language Processing Applications in Python.

Language: Jupyter Notebook - Size: 7.51 MB - Last synced at: 18 days ago - Pushed at: almost 3 years ago - Stars: 44 - Forks: 6

LaureBerti/Learn2Clean

Learn2Clean: Optimizing the Sequence of Tasks for Data Preparation and Cleaning

Language: Python - Size: 34.6 MB - Last synced at: 20 days ago - Pushed at: over 2 years ago - Stars: 51 - Forks: 20

everks/dial-clean

Chinese dialogue data cleaning

Language: Python - Size: 646 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 27 - Forks: 6

vdechen/DataAnalysis_NGO

This data analysis and visualization project aims to present the work of the OBA-Floripa NGO to authorities and the general population. The goal is to make the case for continued funding, given the positive impact of the organization's activities on public health issues.

Language: Jupyter Notebook - Size: 12.8 MB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

AnalystHub-Hub/IBM-Data-Science-Professional-Certificate

I learnt data science through hands-on practice in the IBM Cloud using real data science tools and real-world data sets.

Language: Jupyter Notebook - Size: 14.3 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

125ryun/Espresso

Sogang University, 2023-2, course "Understanding Big Data and Its Educational Use (Capstone Design)", team "Espresso"

Language: Python - Size: 7.32 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

JamesHanZhang/table-data-format-transform-app

Batch or individual format conversion between excel, markdown, csv, and sql data sources

Language: Python - Size: 321 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

DeleLinus/WeRateDogs-Wrangle-Analyze-Data

The dataset I wrangled (and analysed and visualized) is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog.

Language: HTML - Size: 1.86 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

RashikaKarki/Auto-Wrangler

Automating the data preprocessing pipeline

Language: Jupyter Notebook - Size: 9.69 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 0

ved93/ml-express

A Python library for day-to-day data analysis and machine learning. It aims to make data building, cleaning, and machine learning much faster. A library of extension and helper modules for Python's data analysis and machine learning libraries.

Language: Python - Size: 68.4 KB - Last synced at: 25 days ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

259mit/MAHA (fork of FlintyTub49/MAHA)

MAHA is an in-progress ETL package that uses machine learning to clean your dataset with a one-line command.

Size: 59.6 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

DesiSanou/data-scraping

Scrapes product information from e-commerce sites

Language: Python - Size: 58.6 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

getiria-onsongo/itallic

A tool that automatically detects and corrects errors in location data and imputes missing values for location-dependent data, such as region name.

Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: 14 days ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

Related Keywords
data-cleaning-pipeline 17 data-visualization 6 machine-learning 6 data-science 5 data-cleaning 4 data-analysis 4 data-preprocessing 4 python 3 data-transformation 2 feature-engineering 2 data-wrangling 2 machine-learning-algorithms 2 data-analysis-python 2 big-data 2 eda 2 educational-technology 1 log-data 1 log-data-analysis 1 log-level 1 time-series 1 user-behavioral-sequences 1 big-data-processing 1 csv-to-excel 1 csv-to-sql 1 data-cleaning-and-preprocessing 1 big-data-analytics 1 topic-modeling 1 dialog 1 ibm-watson-services 1 ibm-cognos-analytics 1 data-scraping 1 data-extraction 1 tableau-public 1 dashboard 1 plant-breeding-data 1 conda 1 webscrapping 1 scrapy 1 scraping 1 data-collection 1 etl-pipeline 1 visualization 1 pandas-profiling 1 data-summarization 1 data-preparation 1 preprocessing 1 automation 1 automate 1 weratedogs 1 twitter 1 data-wrangling-twitter 1 data-interpretation 1 data-exploration 1 data-analytics 1 data-analyst-with-python 1 data-analyst-nanodegree 1 multifileupload 1 excel-to-md 1 etl-framework 1 easy-to-use 1 text-mining 1 reporting-pipeline 1 product-analytics 1 operations-research 1 nlp 1 mental-health-services 1 lda-model 1 key-performance-metrics 1 hipaa 1 healthcare-data 1 feature-development 1 decision-support 1 data-validation 1 customer-segmentation 1 academic-research 1 preventative-medicine 1 model-interpretability 1 machine-learning-algorithm 1 health-data-science 1 health-data-analysis 1 feature-selection 1 dimensionality-reduction 1 diagnostics 1 reinforcement-learning 1 data-curation 1 automated 1 python3 1 python-library 1 python-lib 1 jupyter 1 exploratory-data-analysis 1 datapurifier 1 clustering 1 voice-computing 1 tpot 1 model-deployment 1 model-compression 1 machine-learning-models 1 machine-learning-library 1 machine-learning-api 1