GitHub topics: preprocessing-data
Mohammed061/Transportation-and-logistics-Challenge
Analyzing logistics data to optimize shipment efficiency, reduce delays, and enhance supply chain visibility using Power BI. Insights include top routes, delays, supplier trends, and peak shipments.
Language: Jupyter Notebook - Size: 3.36 MB - Last synced at: about 16 hours ago - Pushed at: about 18 hours ago - Stars: 3 - Forks: 0
AlwaysDhruv/Images-Preprocessing
Hi there, I'm Dhruv. This repository is devoted entirely to image preprocessing.
Language: C++ - Size: 2.42 MB - Last synced at: about 11 hours ago - Pushed at: 1 day ago - Stars: 2 - Forks: 0
Goyam02/movie_recommend
Language: Jupyter Notebook - Size: 9.57 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0
0xPutri/Eksperimen_SML_Rozhak
This repository contains initial experiments and an automated preprocessing workflow for the final project of the course "Membangun Sistem Machine Learning" (Building Machine Learning Systems). The dataset is analyzed, processed, and prepared as training-ready data according to the specified criteria.
Language: Jupyter Notebook - Size: 246 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0
leansandoval/CienciaDeDatos
Class exercises and practical assignment (Trabajo Práctico) for the Data Science course (Ciencia de Datos) at UNLaM (3670) - 1C / 2C 2025.
Language: Jupyter Notebook - Size: 22 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0
AyaBoughanmi02/Salary-Prediction-Project
A comprehensive machine learning project using Linear and Logistic Regression to forecast salary value and classify six-figure earners
Language: Jupyter Notebook - Size: 725 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0
Mozilla-Data-Collective/dataset-preprocessing-scripts
Scripts to preprocess datasets and adapt them for the MDC platform
Language: Python - Size: 21.5 KB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0
Ryannn06/SQL-Case-Study-on-DepEd-Schools-Masterlist
This project uses the S.Y. 2020-2021 DepEd Schools Masterlist, which contains information on 64,000+ schools across the Philippines, including location, sectors, and classification details.
Language: Jupyter Notebook - Size: 7.85 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0
BadBoy0170/training-data_BOT
Enterprise-grade training data curation bot for LLM fine-tuning using Decodo and Python automation. It provides an async, modular pipeline for document loading, preprocessing, task-specific data generation (Q&A, summarization, classification), quality evaluation, and dataset export — all through a unified API.
Language: Python - Size: 33.2 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0
courtois-neuromod/ds_prep
All the scripts to prepare the Courtois-Neuromod dataset
Language: Python - Size: 67.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 4
Davide011/ML_project_South_African_Heart_Disease
Public Repository: Machine Learning & Data Mining project using the South African Heart Disease dataset. Applied PCA, Regularized Linear Regression, ANN, Logistic Regression, and Decision Trees with cross-validation for regression and classification. Includes feature scaling, EDA, and statistical tests.
Size: 1.32 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0
Multiomics-Analytics-Group/acore
Functionality to preprocess and analyse multi-omics data
Language: Python - Size: 8.62 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 1 - Forks: 1
ArthurMangussi/pymdatagen
A Python Library for the Generation of Artificial Missing Data
Language: Python - Size: 2.81 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 6 - Forks: 3
Abdullah-056/90-days-with-Buildables
This repository contains all the work done in the Buildables Fellowship.
Language: Jupyter Notebook - Size: 14.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0
andiachmad/olist-ml
Data Mining Final Project
Language: Jupyter Notebook - Size: 5.03 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0
AmanSharma01Prime/netflix-content-analysis
Netflix Content Analysis is a data analysis project using Python in Google Colab, SQL in PostgreSQL, and visualization in Google Sheets.
Language: Jupyter Notebook - Size: 3.42 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0
Atquiya-Labiba/Analyzing-Critic-and-User-Scores-in-Movies
Analyzing critic and user scores in movies to explore trends from 2000 to 2025
Language: Python - Size: 766 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0
subhadipsinha722133/Multiple-Disease-Prediction
🤖 This is an interactive Streamlit web application that predicts the likelihood of multiple diseases (Diabetes Prediction, Heart Disease Prediction, Parkinson's Disease Prediction) using Machine Learning models.
Language: Jupyter Notebook - Size: 104 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 5 - Forks: 2
vanderschaarlab/hyperimpute
A framework for prototyping and benchmarking imputation methods
Language: Python - Size: 428 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 196 - Forks: 16
hpazooki/precisionFDA
Data science challenge launched by the FDA and National Cancer Institute for detecting mislabeled genomics data.
Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0
ashir41/Comparative-Analysis-of-Dhaka-s-Rental-Market-by-Area-Using-Bproperty.com-
Explore Dhaka's rental market with 1494 listings from bproperty.com. This project features a Tableau dashboard analyzing pricing, bedroom trends, and value hotspots. Built with Python and Tableau Public.
Language: Jupyter Notebook - Size: 155 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0
Progati00/Return-Rate-Reduction-Analysis
E-commerce Return Rate Reduction Analysis – Data-driven project using SQL, Python (Logistic Regression), and Power BI to analyze return patterns, predict customer behavior, and provide actionable insights to reduce product returns.
Language: Jupyter Notebook - Size: 1.34 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0
Tszon/End-to-End_DS_ML_Project
I built an end-to-end customer churn segregation and prediction project.
Language: Jupyter Notebook - Size: 16.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0
cgizo/TwoPhotonPP
Preprocessing scripts for Two-Photon data. Compile, motion correct and downsample tifs.
Language: Python - Size: 16.6 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0
ThalesGroup/Iliad-custom-to-OIM-transformer
Scripts to preprocess ocean data files from custom apps in order to export the data to Ocean Information Model.
Language: Python - Size: 2.34 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0
Sabaudian/Music_Genre_Classification_project
Audio Pattern Recognition project - Music Genres Classification
Language: Python - Size: 1.33 GB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0
pgoyal77/Movie_Recommender_Website
I used CountVectorizer to convert movie data into vectors, Cosine Similarity to find similar movies, and PorterStemmer to clean the text data for better accuracy in recommendations.
Language: Jupyter Notebook - Size: 3.85 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0
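The approach described in the entry above (count vectors plus cosine similarity) can be sketched as follows. This is a minimal standard-library illustration, not the repository's code: `count_vector` is a hypothetical stand-in for a count vectorizer, the movie tag strings are invented, and the stemming step is omitted.

```python
import math
import re
from collections import Counter


def count_vector(text):
    """Lowercase the text, keep alphabetic tokens, and count them (bag of words)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(title, tags_by_title):
    """Rank every other title by the similarity of its tag text to `title`."""
    target = count_vector(tags_by_title[title])
    scores = {
        other: cosine_similarity(target, count_vector(tags))
        for other, tags in tags_by_title.items()
        if other != title
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical tag strings, invented purely for illustration.
movies = {
    "A": "space adventure robot",
    "B": "space robot war",
    "C": "romantic comedy wedding",
}
ranking = most_similar("A", movies)  # "B" shares two tags with "A", "C" shares none
```

Stemming the tokens before counting would merge word variants such as "robot" and "robots" into one vector dimension.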
HoangLeminh17/Ranks-Prediction-for-LOL
A method for predicting rankings based on player performance in the game League of Legends
Language: Jupyter Notebook - Size: 10.9 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 1
chollette/SEDNet_Shallow-Encoder-Decoder-Network-for-Brain-Tumor-Segmentation
Official Implementation for SEDNet
Language: Jupyter Notebook - Size: 57.9 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 2
Abdelrahman-Atef-Elsayed/NLP_Preprocessing_pipeline
This repo includes a generalized preprocessing pipeline for text data in NLP tasks.
Language: Jupyter Notebook - Size: 64.5 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0
Rachelnk/Customer-Churn-Prediction-ML
This repository contains an analysis of customer data to predict customer churn for a telecommunications company that provides home phone and internet services
Language: Jupyter Notebook - Size: 604 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0
Ayesha24banu/Customer-Purchase-Behaviour-Analysis-in-Retail
Customer Purchase Behaviour Analysis in Retail using Python, RFM Segmentation, Market Basket Analysis, and Power BI Dashboard.
Language: Jupyter Notebook - Size: 13 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0
ddihora1604/Advanced_Business_Analytics_on_World_Bank_Global_Financial_Inclusion_Data_2021
Bridging the Gaps in Financial Inclusion: Understanding the Cash-Credit Paradox, Divide between Cash and Digital Payments, and Financial Resilience.
Language: Jupyter Notebook - Size: 27 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0
lucianoscarpaci/News-Data-Classification
Using the Reuters dataset, this example illustrates the process of data preprocessing, model definition and training, and performance evaluation.
Language: Jupyter Notebook - Size: 94.7 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0
agailloty/preprocess
preprocess is a fast data analysis preprocessing tool.
Language: Go - Size: 423 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 1
DelphinKdl/home_price_prediction_using_regularized_polynomial_regression
Housing price prediction using regularized polynomial regression
Language: Jupyter Notebook - Size: 1.77 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0
anishdeshmukh9/AI-model-Training-Disease-prognosis
This was an academic project that showcases my knowledge of the steps before and after building an ML model, such as data collection, data preprocessing, AI model training (ML), and fine-tuning the model.
Language: Python - Size: 8.13 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0
gaurav-singh7092/ResuMatch
An AI-powered resume and job description matching application using natural language processing and machine learning techniques. This application provides intelligent analysis of resume-job compatibility with detailed scoring and recommendations.
Language: Python - Size: 1.45 MB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0
dlite-tools/NLPiper
NLPiper is a package that agglomerates different NLP tools and applies their transformations to the target document.
Language: Python - Size: 165 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 19 - Forks: 1
SherineTarek224/Credit_Score
This repo is for credit score classification based on financial and demographic data, using supervised machine learning algorithms.
Language: Jupyter Notebook - Size: 2.02 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0
sonjaove/ML-hands-on
Repo for some hands-on stuff
Language: Jupyter Notebook - Size: 137 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0
MohammedSaim-Quadri/networksecurity
This project is an end-to-end MLOps pipeline for a network security system that detects phishing and malicious activities using machine learning. It automates data ingestion, preprocessing, model training, and deployment while leveraging AWS S3 for model storage and GitHub Actions for CI/CD. The system includes realtime monitoring & a web interface
Language: Python - Size: 9.23 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0
LuisFelipePoma/Machine_Learning
Learning about the algorithms used in machine learning, along with techniques for training and testing models.
Language: Jupyter Notebook - Size: 17.3 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 2 - Forks: 0
JoseRuiz01/ChestXRayPneumoniaDetection
Pneumonia detection using Convolutional Neural Networks
Language: Jupyter Notebook - Size: 1.46 GB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0
Naeem1144/segmentation-project
Customer Segmentation using Machine learning models for clustering analysis
Language: Jupyter Notebook - Size: 16.8 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0
okfkzksk/traitement-automatique-des-langages
Training and evaluation of the OpenNMT neural machine translation engine on a corpus of inflected forms, then on a corpus of lemmas
Language: Python - Size: 3.35 MB - Last synced at: 23 days ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0
Lummy-A/montgomery-county-crime-analysis
Analysis of crime patterns in Montgomery County (2018-2022) using Python data science tools to identify trends, spatial hotspots, and temporal distributions across crime types. Includes visualizations and insights to inform prevention strategies.
Language: Jupyter Notebook - Size: 5.24 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0
tejaswirupa/Early-Prediction-of-Diabetes-Risk-Using-Machine-Learning
Built a predictive model using CDC health data to identify individuals at risk of developing diabetes. Achieved 90.6% F1-score using Logistic Regression and revealed key health indicators like BMI and blood pressure as top predictors.
Language: Jupyter Notebook - Size: 4.03 MB - Last synced at: 5 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0
Tomaslopera/Fifa_Analysis
Language: Jupyter Notebook - Size: 8.71 MB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0
Shakilgithub20/News-Classification
Language: Jupyter Notebook - Size: 11 MB - Last synced at: 5 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 1
RafiQamar/IMDb-Movie-Analysis
This project involves web scraping, data preprocessing, database storage and visualization of IMDb movie data from the last decade (2014-2024). The dataset includes details of 10,000 movies such as name, release year, genre, ratings, metascore and more. The project culminates in an interactive Power BI dashboard for in-depth insights and reporting.
Language: Jupyter Notebook - Size: 24.1 MB - Last synced at: 6 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0
BHARGAVPRAVEEN-CHINTAPALLI/Uber-Trends-Analysis
UBER TREND ANALYSIS
Size: 2.33 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0
DavidRichardson02/Standardized_CSV_Data_Analysis
Given the pathname of a file, it automates data extraction, statistical analysis, and modeling via MATLAB plotting scripts, facilitating a streamlined approach to handling analysis of datasets. This project provides a robust, standardized pipeline for reading, preprocessing, analyzing, and modeling data from CSV (or similarly delimited) files.
Language: C - Size: 2.88 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0
shellynagar27/Transportation-and-logistics-Challenge
Analyzing logistics data to optimize shipment efficiency, reduce delays, and enhance supply chain visibility using Power BI. Insights include top routes, delays, supplier trends, and peak shipments.
Language: Jupyter Notebook - Size: 3.38 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0
r-a-j/Social-Scope
"SocialScope harnesses the power of data science to Instagram's vast content, providing insightful analytics and trend predictions for informed decision-making."
Language: SCSS - Size: 16.6 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0
NavyaTrilok/Earthquake-Analysis-Dashboard
We have designed an Earthquake dashboard for Researchers, Emergency Response Teams and Educators studying earthquake patterns and trends.
Size: 4.88 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0
Jingvu/Anime-Database-Preprocessing-R-Project
This project cleans, merges, and preprocesses anime metadata to make it ready for predictive analysis. The processed dataset is now optimized for trend analysis, user rating predictions, and personalized content recommendations.
Size: 16.3 MB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0
RafiQamar/Customer-Churn-Prediction-App
Built and deployed a Streamlit-based customer churn prediction app using ML models. Preprocessed data with encoding and scaling, improving model accuracy. Designed for churn prediction and retention insights.
Language: Jupyter Notebook - Size: 2.52 MB - Last synced at: 5 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0
BirchKwok/spinesUtils
A library that provides template code for Python development to shorten the project development cycle.
Language: Python - Size: 209 KB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0
lisekarimi/ts_forecasting_notebook
Time series forecasting using ML models (ARIMA, SARIMA, SARIMAX and Prophet)
Language: Jupyter Notebook - Size: 22.9 MB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0
Himank-Khatri/ClassiFlow
A web app that automates tedious data preprocessing and machine learning model testing.
Language: Python - Size: 258 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0
ArtZaragozaGitHub/CV--P5_Plants_Seedling_Classification
A robust image classifier using CNNs to efficiently classify different plant seedlings and weeds to improve crop yields and minimize the extensive human effort to do this manually.
Language: Jupyter Notebook - Size: 7.82 MB - Last synced at: 6 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0
nlqthinh/WeaviateAnime
Explore your favorite anime with this interactive search app! 🚀 This project leverages Weaviate for vector search and Gradio for a seamless user interface. Using embeddings from a custom anime dataset, you can perform quick and accurate similarity searches for anime titles
Language: Python - Size: 8.87 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0
bindugayatri02/Employee-Data-Preprocessing-for-Tableau-Analysis-Coursera-Project-
For this project, I preprocessed employee data sourced from three Excel files hosted on Tableau Public: "Employee names," "Employee data," and "Employee travel responses." This dataset encompasses employee IDs, names, hire dates, travel survey responses, and other relevant information. The source files and the final processed data are attached.
Size: 120 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0
RafiQamar/HR-Analytics-Project
Cleaned and processed HR data using Python for analysis and visualization. Analyzed employee trends and performance using SQL and Python. Built an interactive Power BI dashboard connected to MySQL for dynamic insights.
Language: Jupyter Notebook - Size: 4.71 MB - Last synced at: 8 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0
msche81/2-Jedha_Fullstack
450h Data Scientist training - Collect and store large amounts of data - Build prediction models in Machine Learning and Deep Learning - Deploy your models in real conditions
Language: Jupyter Notebook - Size: 248 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0
khangbdd/Data-processing-CLI
CLI tools for preprocessing CSV data
Language: Python - Size: 23.4 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0
GMeghana19/solar-power-output
Solar power prediction using linear regression
Language: Jupyter Notebook - Size: 1.44 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0
drleniaw/Analysis_Sentiment_Twitter_Free_Sex_In_Indonesian
Sentiment analysis of Twitter posts about free sex in Indonesia
Language: Jupyter Notebook - Size: 2.35 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0
iliavrtn/final-project
This project explores whether Mathematics and Computer Science texts still retain enough linguistic patterns (metalanguage) for classification once domain-specific words are removed. 🤖📚
Language: Jupyter Notebook - Size: 15.5 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0
WaodeAnisaNurdinia/PreprocessingModelKNN
22.114966_Waode Nurdinia Anisa
Language: Jupyter Notebook - Size: 1.33 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0
MatanNafshi/Wine-Quality-Prediction-Machine-Learning-Python
This project predicts wine quality using machine learning based on chemical properties like acidity, sugar content, and alcohol. It includes data exploration, preprocessing, and applying models like Linear Regression, Random Forest, and SVM. Models are evaluated for accuracy to determine the best predictor of wine quality.
Language: Jupyter Notebook - Size: 2.24 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0
Nouran246/Credit-Card-Approval-Prediction-Classification (fork of Rowlkh/Credit-Card-Approval-Prediction-Classification-)
This project predicts credit card application approval by analyzing applicant data. It includes EDA, preprocessing, feature selection with Genetic Algorithms, and classification using KNN, Decision Trees, and MLP models.
Language: Jupyter Notebook - Size: 8.25 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 1
sarahloree/Project-2--Bank-Loan-Marketing-Model
This is the second project I completed as part of the Machine Learning Module from my post-graduate certification in AI/Machine Learning from the University of Texas' McCombs School of Business.
Language: Jupyter Notebook - Size: 3.62 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0
shahzadsiddiqi/NLP
This repository contains implementations and workflows for key NLP tasks like text classification, Generative AI, sentiment analysis, and entity recognition. It includes preprocessing scripts, annotated datasets, and fine-tuning methods for frameworks like Hugging Face and spaCy. Ideal for building and deploying scalable NLP solutions.
Size: 1000 Bytes - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0
alvaro-concha/animal-behavior-preprocessing
animal-behavior-preprocessing is a Python repository to preprocess animal behavior data. It works on the output spreadsheets from video-tracking of animal body parts with LEAP or DeepLabCut. It applies a Median Filter, an Ensemble Kalman Filter, transforms data to joint angles and computes their Morlet Wavelet Spectra.
Language: Python - Size: 251 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0
AlejandroLara11/MachineLearningCourse
Machine Learning Basics: From Setup to Clustering
Language: Python - Size: 1.1 MB - Last synced at: 8 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0
Pritam3355/Scripts
This repository contains different scripts for data collection
Language: HTML - Size: 2.12 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
girgisadel/RegressionUsingCsharp
A machine learning project to predict taxi fares using ML.NET. This solution includes end-to-end data preprocessing, training, evaluation, and prediction, designed for both learning and practical deployment.
Language: C# - Size: 0 Bytes - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
thiwaK/preprocess-50k-tiles-sri-lanka
Preprocessing scripts for 1:50K tiles issued by the survey department, Sri Lanka
Language: Python - Size: 16.6 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0
karthik-d/nyc-taxi-dataset-eda
Cleaning, transformation, and analysis of large datasets as part of coursework for UCS1629: Data Warehousing and Data Mining.
Language: Jupyter Notebook - Size: 9.79 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1
saadhaniftaj/AI-EssayScore-Automated-Essay-Scoring-Using-LSTM
AI-EssayScore is an automated essay scoring system using LSTM neural networks. It tokenizes and pads essays, processes them through an LSTM model, and predicts scores. The project includes data preprocessing, model training, evaluation, and saving the model for future use.
Language: Jupyter Notebook - Size: 8.8 MB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
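The tokenize-and-pad step mentioned in the entry above can be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the repository's code: the helper names `build_vocab` and `encode_and_pad`, the reserved ids (0 for padding, 1 for unknown words), and the choice to pad and truncate at the end are all hypothetical.

```python
PAD_ID = 0  # assumed id for padding positions
UNK_ID = 1  # assumed id for words not seen when the vocabulary was built


def build_vocab(texts):
    """Assign each distinct lowercase word an integer id, starting at 2."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab) + 2)
    return vocab


def encode_and_pad(text, vocab, max_len):
    """Map words to ids, cut to max_len, then right-pad with PAD_ID."""
    ids = [vocab.get(word, UNK_ID) for word in text.lower().split()]
    ids = ids[:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Invented example texts, for illustration only.
essays = ["The cat sat", "the dog"]
vocab = build_vocab(essays)
batch = [encode_and_pad(essay, vocab, max_len=4) for essay in essays]
```

Every row of `batch` has the same length, which is what lets a recurrent model such as an LSTM process the essays as one fixed-shape array.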
ChristianGoueguel/specProc
The specProc package is a collection of preprocessing tools for spectroscopy data analysis.
Language: R - Size: 68.6 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0
PhilaController/gun-violence-dashboard-data
Python toolkit for preprocessing data for the City Controller's Gun Violence Dashboard
Language: Python - Size: 355 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1
Pawel-Tomasz-Nowak/Scientific-collaboration
The repository highlights the results of my scientific collaboration with Dr. Eng. Adam Zagdański
Language: Jupyter Notebook - Size: 111 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
neosaffana/TugasDataMining1
Assignment 1 for the Data Mining course
Language: Jupyter Notebook - Size: 25.4 KB - Last synced at: 9 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
ELHoussineT/AutoDataCleaner
Simple and automatic data cleaning in one line of code! It performs one-hot encoding, casts date and time columns to datetime dtype, detects binary columns, safely converts non-numeric columns to numeric dtypes, cleans dirty/empty values, normalizes values, and removes unwanted columns, all in one line of code. Get your data ready for model training and fitting quickly.
Language: Python - Size: 647 KB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 19 - Forks: 4
UniFeat/unifeat
An open-source tool for performing feature selection process in different areas of research
Language: Java - Size: 30.5 MB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 3
sorrychoe/pyBigKinds
BigKinds Data Analysis Toolkit for Python
Language: Python - Size: 31.1 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0
Animesh-Chourey/Loan-Classifier
Trained machine learning algorithms (Logistic Regression, KNN, SVM, Decision Tree) after performing visualization and preprocessing tasks on a loan dataset. Computed evaluation metrics such as F1-score, log loss, and Jaccard similarity score to assess the algorithms' performance.
Language: Jupyter Notebook - Size: 29.3 KB - Last synced at: 9 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0
AmruhaAhmed/Data-Cleaning-on-New-York-Airbnb-Listings
Language: Jupyter Notebook - Size: 3.11 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0
mariotruss/ML-supportticket-classifyer-prep
🔬 For a paper on AI / ML in Support Ticket Systems, I used this code to clean my data.
Language: Python - Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
eZWALT/MVA-MultiVariate-Analysis
MDS-FIB Multivariate-Analysis (MVA) subject 2024-25 Q1
Language: R - Size: 135 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
DavidRichardson02/CSV_DataSet_Analysis
The program processes CSV files to capture and format file contents, generate custom directories of files, extract data, perform analysis, and generate MATLAB script(s) for visualization and further analysis.
Language: C - Size: 128 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0
istolesweetroll/Elimination-of-entry-preprocessing-errors
R language Shiny application using shiny.fluent, presenting methods of applying machine learning algorithms in elimination of entry preprocessing errors.
Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0
BhavinPatel4199/Machine-Learning-Framework
This repository showcases various projects that explore key concepts in both supervised and unsupervised learning, with a focus on real-world applications. The projects utilize a range of machine learning techniques, including data preprocessing, feature selection, exploratory data analysis (EDA), and model optimization.
Language: Jupyter Notebook - Size: 13.8 MB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
AndrewDettor/YouTubeMostPopularVideos
ETL data pipeline using YouTube API, AWS EC2, and AWS RDS, with EDA and Tableau visualizations.
Language: Jupyter Notebook - Size: 7.59 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
ManjiriSDS/Data-Science-Case-Study
Language: Jupyter Notebook - Size: 104 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
blleshi/Neural_Network_Binary_Classification
Venture Funding with Deep Learning (Neural Network Binary Classification)
Language: Jupyter Notebook - Size: 278 KB - Last synced at: 9 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
weiglszonja/meeg-tools
EEG/MEG data preprocessing and analyses framework
Language: Jupyter Notebook - Size: 120 MB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 12 - Forks: 5