Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: datapreparation
sfu-db/dataprep
Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.
Language: Python - Size: 214 MB - Last synced: 3 days ago - Pushed: about 2 months ago - Stars: 1,941 - Forks: 200
AnjaliKumari021/Retail_Customer_Behavior_Analysis_using_SQL
Analysed retail data to understand customer behavior and transaction patterns using SQL
Size: 717 KB - Last synced: 17 days ago - Pushed: 18 days ago - Stars: 0 - Forks: 0
visokio/omniscope-custom-blocks
Public repository for custom blocks for Omniscope
Language: Python - Size: 6.05 MB - Last synced: 5 days ago - Pushed: about 1 month ago - Stars: 5 - Forks: 3
huseyincenik/data_science
Data Science materials
Language: Jupyter Notebook - Size: 51.1 MB - Last synced: 27 days ago - Pushed: 27 days ago - Stars: 3 - Forks: 1
CoDS-GCS/KGFarm
A Holistic Platform for Automating Data Preparation
Language: Python - Size: 290 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 6 - Forks: 2
elalfredoignacio/Customer-segmentation
Customer segmentation project using clustering.
Language: HTML - Size: 2.42 MB - Last synced: about 2 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
mehadihn/Data-Preparation-Techniques-Project
This project was completed for the data preparation techniques course.
Language: Jupyter Notebook - Size: 1.24 MB - Last synced: about 2 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0
MadhuBala11/DiabetesPrediction
In this project, I used logistic regression, a supervised machine learning algorithm, to predict whether a person has diabetes based on features such as age, blood pressure, glucose level, and body mass index. I used Python and popular libraries such as Pandas, Scikit-Learn, and Matplotlib to build the model.
Language: Jupyter Notebook - Size: 3.2 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 1 - Forks: 0
wsperger/dataprepping_generative_ai
A one-stop shop for tools to prepare datasets for generative AI
Language: Python - Size: 127 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 1 - Forks: 0
mahmudie/1_GDP_Analysis
India GDP Analysis using Python
Language: Jupyter Notebook - Size: 871 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 1 - Forks: 0
RafeyIqbalRahman/Data-Imputation-Techniques
This repository demonstrates data imputation using Scikit-Learn's SimpleImputer, KNNImputer, and IterativeImputer.
Language: Python - Size: 8.79 KB - Last synced: 8 months ago - Pushed: almost 4 years ago - Stars: 1 - Forks: 0
sfansaria/Data-Preparation-of-a-housing-dataset
Data Preparation and Data Visualization
Language: Jupyter Notebook - Size: 1.19 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0
sunilbabu1981/Learning_Path_for_NLP
A comprehensive path for NLP
Language: Jupyter Notebook - Size: 255 KB - Last synced: 10 months ago - Pushed: about 4 years ago - Stars: 0 - Forks: 0
imuhammadaasim/Bikes_Sales_Data_Analysis
The Bikes Sales Analysis Excel Project is a practical exploration of sales data analysis using Microsoft Excel. This project showcases how Excel can be a powerful tool for data cleaning, preprocessing, visualization, and dashboard creation, all within a familiar spreadsheet environment.
Size: 224 KB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0
venkatesh2022/logisticregression-telecom-churn
Language: Jupyter Notebook - Size: 1 MB - Last synced: 10 months ago - Pushed: almost 4 years ago - Stars: 0 - Forks: 0
rainaa0277/House-Price-Prediction-using-Linear-Regression
Building a house price prediction model for a real estate firm, based on various factors. Problem: regression. Algorithm used: linear regression using OLS.
Language: Jupyter Notebook - Size: 4.03 MB - Last synced: 11 months ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0
prathmesh444/Sentiment-Analysis-of-BlackCoffer-Blogs
This project performs sentiment analysis by calculating text metrics to derive sentiment scores, readability, passive words, personal pronouns, etc.
Language: Jupyter Notebook - Size: 57.1 MB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0
NAVEENDATAANALYST/HOTEL-RESERVATIONS-PREDICTION-IN-R
Can you correctly predict whether a customer will cancel the reservation? The dataset is available on Kaggle: https://www.kaggle.com/datasets/ahsan81/hotel-reservations-classification-dataset
Size: 453 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
JarredP/RStudio-projects-from-Data-Mining-Course
This repository contains several RMarkdown files that follow the tutorials from 'Introduction to data mining R examples' by M. Hahsler. The tutorials were completed during a Data Mining course taken as part of an MS in Applied Data Analytics.
Size: 51.8 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
ydataai/ydata-talkdatatome
Make your dataset talk to you. The AI assistant for data preparation.
Language: Python - Size: 9.77 KB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 3 - Forks: 0
NAVEENDATAANALYST/SPACESHIP-TITANIC-PASSENGER-TRANSPORT-PREDICTION
I participated in and completed this Kaggle competition on my own. The data is available at https://www.kaggle.com/competitions/spaceship-titanic
Size: 284 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
NAVEENDATAANALYST/CUSTOMER-ANALYTICS-ON-USA-BASED-COMPANY-DATA
This is my 6th semester Essentials of Data Analytics project.
Size: 157 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
deepu9962/Exploratory-Analysis-of-Geolocational-Data
This project involves the use of K-Means Clustering to find the best accommodation for students in Bangalore (or any other city of your choice) by classifying accommodation for incoming students on the basis of their preferences on amenities, budget and proximity to the location.
Language: Jupyter Notebook - Size: 4.17 MB - Last synced: 11 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
fazildgr8/wrist_control_CNN_AWEAR
This is the cumulative repository for the research project Deep Learning Approach to Robotic Prosthetic Wrist Control using EMG Signals, carried out in the AWEAR Lab. It will contain the data processing pipeline code, a custom data preprocessing library built for this project, and the time-series CNN training Jupyter notebooks, all using data collected within the AWEAR Lab, University at Buffalo.
Language: Jupyter Notebook - Size: 541 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0
muharienal/nordstrom-products-prep
Nordstrom Products dataset preparation includes collection, discovery, cleaning, normalization, enrichment, and validation using SQL
Size: 565 KB - Last synced: 11 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
zcebeci/odetector
Outlier Detection Using Cluster Analysis
Language: R - Size: 558 KB - Last synced: 12 days ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
victorcouste/trifacta-flows-examples
Trifacta Flows Examples and Templates. Flows zip files, recipes and datasets.
Size: 2.65 MB - Last synced: over 1 year ago - Pushed: over 3 years ago - Stars: 5 - Forks: 2
Ashleshk/Tableau-10-A-Z-Hands-on-Tableau-Training-for-Data-Science-Udemy
Learn data visualization through Tableau 2020 and create opportunities for you or key decision-makers to discover data patterns such as customer purchase behavior, sales trends, or production bottlenecks. This course is on Udemy.
Size: 4.7 MB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 2 - Forks: 1
ms8909/dptron
mltrons dptron: Dirty Data in, Clean Data Out!
Language: Python - Size: 75.5 MB - Last synced: 1 day ago - Pushed: over 1 year ago - Stars: 4 - Forks: 2
DaveChui/Data-Preparation-and-Cleaning---Geo-Data
Preparing and Cleaning Data
Language: Jupyter Notebook - Size: 26.4 KB - Last synced: over 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0
shinanna/Tripadvisor_NLP_Analysis
NLP Analysis on Tripadvisor Restaurant Reviews
Language: Jupyter Notebook - Size: 2.72 MB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0
chiranjeevbitm/US-House-Price-Prediction
The goal is to model the price of houses using the available independent variables. Management will then use this model to understand exactly how prices vary with the variables. They can adjust the firm's strategy accordingly and concentrate on areas that will yield high returns. The model will also be a good way for management to understand the pricing dynamics of a new market.
Language: Jupyter Notebook - Size: 728 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0
hedata/rapid-hackathon-2018
Data preparation and logistic regression model training and testing for the Rapid Hackathon 2018 ORF challenge
Language: Jupyter Notebook - Size: 255 KB - Last synced: over 1 year ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0
FlavioIsoni/Machine-Learning-Mastery-Course
Machine Learning Mastery Course (by Jason Brownlee)
Size: 1000 Bytes - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0
FlavioIsoni/Bootcamp-Machine-Learning-Analyst
Bootcamp - Machine Learning Analyst / Analista de Aprendizado de Máquina (by IGTI)
Size: 385 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0
prakhargurawa/Titanic-Survival-Predictor
Trying to predict the survival rate of passengers using algorithms like Logistic Regression, AdaBoost, Gradient Boost, Decision Tree Classifiers, Extra Tree Classifiers, Random Forest Classifiers, and XGBoost with appropriate data preprocessing techniques.
Language: Jupyter Notebook - Size: 53.7 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0
anthonychristian1997/Transaction-PYTHON-DataPREPARATION-PracticeCase7
In this repository, I implement a data preparation process. Data preparation is the stage where we prepare data for machine learning processes or other things related to data analysis.
Language: Jupyter Notebook - Size: 8.79 KB - Last synced: 9 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0
ankit013/Time-series-forecasting-and-sales-pipeline-prediction
Machine learning models built on real-time data
Language: R - Size: 57.6 KB - Last synced: 4 months ago - Pushed: almost 4 years ago - Stars: 0 - Forks: 0
ngupta23/data_prep_helper
A helper package for preparing and combining data from a variety of sources
Language: Python - Size: 50.8 KB - Last synced: 20 days ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0
vinayramegowda/DataPreparationPython
Data Preparation using Python (Automobile Dataset)
Language: Jupyter Notebook - Size: 1.62 MB - Last synced: about 1 year ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0