An open API service providing repository metadata for many open source software ecosystems.

Topic: "preprocessing-data"

vanderschaarlab/hyperimpute

A framework for prototyping and benchmarking imputation methods

Language: Python - Size: 428 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 196 - Forks: 16

dlite-tools/NLPiper

NLPiper is a package that aggregates different NLP tools and applies their transformations to the target document.

Language: Python - Size: 165 KB - Last synced at: 16 days ago - Pushed at: over 2 years ago - Stars: 19 - Forks: 1

Unstructured-IO/community 📦

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Size: 5.7 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 19 - Forks: 6

ELHoussineT/AutoDataCleaner

Simple and automatic data cleaning in one line of code! It performs one-hot encoding, casts date and time columns to datetime dtype, detects binary columns, safely converts non-numeric columns to numeric dtypes, cleans dirty/empty values, normalizes values, and removes unwanted columns, all in one line of code. Get your data ready for model training and fitting quickly.

Language: Python - Size: 647 KB - Last synced at: 4 months ago - Pushed at: over 4 years ago - Stars: 19 - Forks: 4

weiglszonja/meeg-tools

EEG/MEG data preprocessing and analysis framework

Language: Jupyter Notebook - Size: 120 MB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 12 - Forks: 5

UniFeat/unifeat

An open-source tool for performing feature selection in different areas of research

Language: Java - Size: 30.5 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 3

cecivieira/cotas-genero-eleicoes-e-proposicoes-legislativas

Data analysis of gender quotas and their impact on elections and legislative proposals in the Federal Chamber of Deputies (Câmara dos Deputados) between 1934 and 2021. Part of the final project (TCC) for the postgraduate program in Artificial Intelligence and Machine Learning at @pucminas

Language: Jupyter Notebook - Size: 121 MB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 0

ArthurMangussi/pymdatagen

A Python Library for the Generation of Artificial Missing Data

Language: Python - Size: 2.81 MB - Last synced at: about 8 hours ago - Pushed at: 2 months ago - Stars: 7 - Forks: 3

tuanio/backend-recommender-system-book

Flask REST API for Recommender System Book App on Android

Language: Jupyter Notebook - Size: 1.14 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 7 - Forks: 1

Mohammed061/Transportation-and-logistics-Challenge

Analyzing logistics data to optimize shipment efficiency, reduce delays, and enhance supply chain visibility using Power BI. Insights include top routes, delays, supplier trends, and peak shipments.

Language: Jupyter Notebook - Size: 3.36 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 5 - Forks: 0

subhadipsinha722133/Multiple-Disease-Prediction

🤖 This is an interactive Streamlit web application that predicts the likelihood of multiple diseases (diabetes, heart disease, and Parkinson's disease) using machine learning models.

Language: Jupyter Notebook - Size: 104 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 5 - Forks: 2

courtois-neuromod/ds_prep

All the scripts to prepare the Courtois-Neuromod dataset

Language: Python - Size: 67.1 MB - Last synced at: 12 days ago - Pushed at: 14 days ago - Stars: 4 - Forks: 4

ChristianGoueguel/specProc

The specProc package is a collection of preprocessing tools for spectroscopy data analysis.

Language: R - Size: 68.6 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

imyjk729/Memristor

In-sensor reservoir computing for language learning via two-dimensional memristors

Language: Jupyter Notebook - Size: 458 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 2

Sabaudian/Music_Genre_Classification_project

Audio Pattern Recognition project - Music Genres Classification

Language: Python - Size: 1.33 GB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

bharadwaj-chukkala/Data-driven-motion-planning-using-various-machine-learning-algorithms

ENPM808A: Introduction to Machine Learning Final Project

Language: Jupyter Notebook - Size: 4.52 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

kkmk11/BLIGHT-VISION

This is an ML-based web app that aims to detect the presence of late blight or early blight on potato leaves, which are the primary causes of crop damage. Additionally, the system recommends appropriate precautions and pesticides to help farmers eliminate the blight, protect their crops, and increase their yields.

Language: PureBasic - Size: 79.3 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

drleniaw/Analysis_Sentiment_Twitter_Free_Sex_In_Indonesian

Sentiment analysis of Twitter posts about free sex in Indonesia

Language: Jupyter Notebook - Size: 2.35 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

Navaneeth-Sharma/Speech_Recognition_of_Digits

This project recognizes spoken digits and converts them to text. It uses signal processing techniques such as MFCC, along with other advanced signal processing techniques, to preprocess the data. The preprocessed data is then used by neural network algorithms to learn the pattern or structure of the sound.

Language: Jupyter Notebook - Size: 1.6 MB - Last synced at: almost 3 years ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

rifkyahmadsaputra/Hollywood-Movies-Visualizations-and-Recommender-System

In this project, I analyze and visualize IMDb data and then build a movie recommender system. My motivation is to learn more about movies, especially Hollywood movies. The IMDb data contains information about each movie, e.g. who produced it, when it was released, its rating, budget, and income. The recommender system then suggests the top 10 movies most similar to a movie entered by the user.

Language: Jupyter Notebook - Size: 2.89 MB - Last synced at: 5 months ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

AlwaysDhruv/Images-Preprocessing

Hi there, I'm Dhruv. This repository is entirely about image preprocessing.

Language: C++ - Size: 2.42 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 2 - Forks: 0

LuisFelipePoma/Machine_Learning

Learning about the algorithms used in machine learning, along with techniques for training and testing models.

Language: Jupyter Notebook - Size: 17.3 MB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 2 - Forks: 0

r-a-j/Social-Scope

"SocialScope harnesses the power of data science to Instagram's vast content, providing insightful analytics and trend predictions for informed decision-making."

Language: SCSS - Size: 16.6 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

BirchKwok/spinesUtils

A library that provides template code for Python development to shorten the project development cycle.

Language: Python - Size: 209 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

nlqthinh/WeaviateAnime

Explore your favorite anime with this interactive search app! 🚀 This project leverages Weaviate for vector search and Gradio for a seamless user interface. Using embeddings from a custom anime dataset, you can perform quick and accurate similarity searches for anime titles

Language: Python - Size: 8.87 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

msche81/2-Jedha_Fullstack

450h Data Scientist training - Collect and store large amounts of data - Build prediction models in Machine Learning and Deep Learning - Deploy your models in real conditions

Language: Jupyter Notebook - Size: 248 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

RafiQamar/HR-Analytics-Project

Cleaned and processed HR data using Python for analysis and visualization. Analyzed employee trends and performance using SQL and Python. Built an interactive Power BI dashboard connected to MySQL for dynamic insights.

Language: Jupyter Notebook - Size: 4.71 MB - Last synced at: 9 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

RafiQamar/IMDb-Movie-Analysis

This project involves web scraping, data preprocessing, database storage and visualization of IMDb movie data from the last decade (2014-2024). The dataset includes details of 10,000 movies such as name, release year, genre, ratings, metascore and more. The project culminates in an interactive Power BI dashboard for in-depth insights and reporting.

Language: Jupyter Notebook - Size: 24.1 MB - Last synced at: 7 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

alvaro-concha/animal-behavior-preprocessing

animal-behavior-preprocessing is a Python repository to preprocess animal behavior data. It works on the output spreadsheets from video-tracking of animal body parts with LEAP or DeepLabCut. It applies a Median Filter, an Ensemble Kalman Filter, transforms data to joint angles and computes their Morlet Wavelet Spectra.

Language: Python - Size: 251 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0
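
The first step named in the animal-behavior-preprocessing description above, a median filter, can be illustrated with a minimal plain-Python sketch. This is not code from that repository: the function name `median_filter`, the default window of 3, and the truncated-window handling at the edges are all assumptions made for illustration.

```python
from statistics import median


def median_filter(values, window=3):
    """Replace each sample with the median of a centered window.

    Hypothetical helper: near the edges the window is truncated
    rather than padded, which is one of several common choices.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        smoothed.append(median(values[start:stop]))
    return smoothed


# Toy body-part coordinate track with one tracking glitch (9.0).
track = [1.0, 1.1, 9.0, 1.2, 1.3]
smoothed_track = median_filter(track)
print(smoothed_track)
```

With a window of 3, the isolated spike is replaced by a neighboring value, which is why median filtering is a common first pass on video-tracking output before further smoothing.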

DavidRichardson02/CSV_DataSet_Analysis

The program processes CSV files to capture and format file contents, generate custom directories of files, extract data, perform analysis, and generate MATLAB script(s) for visualization and further analysis.

Language: C - Size: 128 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

CCaribe9/AdaptStdEPF

Code and experiments related to the paper: 'An adaptive standardisation model for Day-Ahead electricity price forecasting'

Language: Jupyter Notebook - Size: 96.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

XuanyiJennyMa/pupil_cloud_data_preprocessing_Phase_1

Scripts for pre-processing eye-tracker data from pupil cloud

Language: Python - Size: 2.08 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

Shaheer-khan-github/Natural-Language-Processing-in-Python-DataCamp

Language: Jupyter Notebook - Size: 10.2 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

Shakilgithub20/News-Classification

Language: Jupyter Notebook - Size: 11 MB - Last synced at: 6 months ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 1

Multiomics-Analytics-Group/acore

Functionality to preprocess and analyse multi-omics data

Language: Python - Size: 8.63 MB - Last synced at: 21 days ago - Pushed at: 24 days ago - Stars: 1 - Forks: 1

Ryannn06/SQL-Case-Study-on-DepEd-Schools-Masterlist

This project uses the S.Y. 2020-2021 DepEd Schools Masterlist, which contains information on 64,000+ schools across the Philippines, including location, sectors, and classification details.

Language: Jupyter Notebook - Size: 7.85 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

Progati00/Return-Rate-Reduction-Analysis

E-commerce Return Rate Reduction Analysis – Data-driven project using SQL, Python (Logistic Regression), and Power BI to analyze return patterns, predict customer behavior, and provide actionable insights to reduce product returns.

Language: Jupyter Notebook - Size: 1.34 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

Tszon/End-to-End_DS_ML_Project

I built an end-to-end customer churn segregation and prediction project.

Language: Jupyter Notebook - Size: 16.2 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

HoangLeminh17/Ranks-Prediction-for-LOL

A method to predict rankings based on player performance in the game League of Legends

Language: Jupyter Notebook - Size: 10.9 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 1

chollette/SEDNet_Shallow-Encoder-Decoder-Network-for-Brain-Tumor-Segmentation

Official Implementation for SEDNet

Language: Jupyter Notebook - Size: 57.9 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 2

Abdelrahman-Atef-Elsayed/NLP_Preprocessing_pipeline

This repo includes a generalized preprocessing pipeline for text data in NLP tasks.

Language: Jupyter Notebook - Size: 64.5 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

anishdeshmukh9/AI-model-Training-Disease-prognosis

This was an academic project that showcases my knowledge of the steps before and after ML model training, such as data collection, data preprocessing, AI (ML) model training, and model fine-tuning.

Language: Python - Size: 8.13 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

DavidRichardson02/Standardized_CSV_Data_Analysis

Given the pathname of a file, it automates data extraction, statistical analysis, and modeling via MATLAB plotting scripts, facilitating a streamlined approach to analyzing datasets. This project provides a robust, standardized pipeline for reading, preprocessing, analyzing, and modeling data from CSV (or similarly delimited) files.

Language: C - Size: 2.88 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

RafiQamar/Customer-Churn-Prediction-App

Built and deployed a Streamlit-based customer churn prediction app using ML models. Preprocessed data with encoding and scaling, improving model accuracy. Designed for churn prediction and retention insights.

Language: Jupyter Notebook - Size: 2.52 MB - Last synced at: 6 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

ArtZaragozaGitHub/CV--P5_Plants_Seedling_Classification

A robust image classifier using CNNs to efficiently classify different plant seedlings and weeds to improve crop yields and minimize the extensive human effort to do this manually.

Language: Jupyter Notebook - Size: 7.82 MB - Last synced at: 7 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

khangbdd/Data-processing-CLI

CLI tools for preprocessing CSV data

Language: Python - Size: 23.4 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

iliavrtn/final-project

This project explores whether Mathematics and Computer Science texts still retain enough linguistic patterns (metalanguage) for classification once domain-specific words are removed. 🤖📚

Language: Jupyter Notebook - Size: 15.5 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

lucianoscarpaci/News-Data-Classification

Using the Reuters dataset, this example illustrates the process of data preprocessing, model definition and training, and performance evaluation.

Language: Jupyter Notebook - Size: 94.7 KB - Last synced at: 2 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

karthik-d/nyc-taxi-dataset-eda

Cleaning, transformation, and analysis of large datasets as part of coursework for UCS1629: Data Warehousing and Data Mining.

Language: Jupyter Notebook - Size: 9.79 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

PhilaController/gun-violence-dashboard-data

Python toolkit for preprocessing data for the City Controller's Gun Violence Dashboard

Language: Python - Size: 355 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

sorrychoe/pyBigKinds

BigKinds Data Analysis Toolkit for python

Language: Python - Size: 31.1 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

AmruhaAhmed/Data-Cleaning-on-New-York-Airbnb-Listings

Language: Jupyter Notebook - Size: 3.11 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

ItsCodeBakery/K-Means-Clustering

Music Recommendation System using K-Means Clustering

Language: Jupyter Notebook - Size: 2.69 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

akshupande/Sales-Analysis-Enhancing-Customer-Experience-and-Boosting-Sales-through-Data-Insights

Unlock valuable data insights with Sales Analysis, a project focused on analyzing sales data to identify trends, patterns, and recommendations for enhancing customer experience and increasing sales.

Language: Jupyter Notebook - Size: 807 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

DavidRichardson02/CSV_Data_Set_Analysis

The program processes CSV files to capture and format file contents, generate custom directories of files, extract data, perform analysis, and generate MATLAB script(s) for visualization and further analysis.

Language: C - Size: 255 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

emkr-13/model_ta

Model for a final project (TA) on sentiment and topics in Indonesian news

Language: Jupyter Notebook - Size: 70 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

azevedontc/dataPreprocessing

Introduction to KDD and data preprocessing

Language: Python - Size: 396 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Rubenmarbez/Proyecto-HomeFinder

HomeFinder aims to create a tool that lets its users find the best offers suited to their needs and preferences, through analysis of sales data for second-hand properties in Madrid.

Language: Jupyter Notebook - Size: 2.62 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

SkullkyAI/ML-CLASIFICACION-TITANIC-KAGGLE

Machine learning classification exercise on the Titanic dataset, covering data exploration, preprocessing, and selection of metrics and models, with the aim of analyzing the results in detail.

Language: Jupyter Notebook - Size: 7.97 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

bilaloumehdi/TP_NLP

Language: Jupyter Notebook - Size: 71.3 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

functorism/snapcrop

CLI for crop/resize of large amounts of images with configurable resolutions

Language: Rust - Size: 17.5 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

m92vyas/Implementing_Attention_Mechanism_Language_Translation

Bahdanau Attention Mechanism | TensorFlow Custom Layers/Model/Loss Function/Metrics | LSTM | Encoder | Decoder | Cross-Attention | Language Translation | BLEU Score | Dropout

Language: Jupyter Notebook - Size: 48.4 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

BalajiN743/Multi-Linear-Regression-examples

Language: Jupyter Notebook - Size: 2.19 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

xxl4tomxu98/learn-you-from-text

This app predicts an author's sentiment and personality traits by analyzing simple text input that he or she writes. PyTorch neural network models optimized for the M1 MacBook.

Size: 179 MB - Last synced at: almost 3 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

PedramPeiro/Customer-Health-Score-Prediction

This project was done for Didar CRM, a leading CRM company in Iran. The aim was to assign a health score to each customer in order to identify unhealthy customers and decrease the churn rate.

Language: Jupyter Notebook - Size: 1.21 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

ShubhamAgr09/Chennai_Housing-Price_Pridiction

Regression model to precisely predict the price of a house based on various proposed features and to help sellers understand which factors fetch more money for houses.

Language: Jupyter Notebook - Size: 862 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

caesarmario/data-warehouse-credit-card-applicant-using-pentaho

This repository contains the OLTP, ETL process (using Pentaho Data Integration), and OLAP for a credit card dataset. The dataset is taken from Kaggle (https://www.kaggle.com/rikdifos/credit-card-approval-prediction); the work is part of the author's capstone project.

Size: 1010 KB - Last synced at: almost 3 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

abderrahman-bns/Data-Cleaning-and-Preprocessinng-with-Pandas

Introducing you to the fundamentals of the quintessential Python data analysis library, pandas, and its core data structures – the Series and DataFrame objects.

Language: Jupyter Notebook - Size: 604 KB - Last synced at: almost 3 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 2

Faroja/Practice-Machine-Learning-11

Machine learning practice with ensemble models (bagging): detailed EDA, a preprocessing scheme, a search for the model with the best F1 score, hyperparameter tuning for the best models, and interpretation.

Language: Jupyter Notebook - Size: 93.8 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

Shakilgithub20/Improving-Classification

Language: Jupyter Notebook - Size: 3.76 MB - Last synced at: 10 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

EslamElbassel/MNIST-Dataset-Classification-with-KNN-using-centroid-preprocessing

MNIST is a dataset of handwritten digit images. Classification with KNN, using centroid-based feature extraction.

Language: Python - Size: 1.71 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

MaxBubblegum47/Preprocessing

Preprocessing method for Information Retrieval System

Language: Python - Size: 13.7 KB - Last synced at: 9 months ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

playingwithgithub24/HackBio-Single-Cell-RNA-Seq-Stage-2

Single-Cell RNA-seq Analysis of Bone Marrow Dataset Using Scanpy: This repository reproduces a complete scRNA-seq analysis pipeline using the Scanpy library on a modified bone marrow dataset (originally from CZI). The workflow includes preprocessing, normalization, clustering, marker-based annotation, and biological interpretation.

Language: Python - Size: 639 KB - Last synced at: 20 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

andiachmad/olist-ml

Data Mining Final Project

Language: Jupyter Notebook - Size: 5.63 MB - Last synced at: 22 days ago - Pushed at: 24 days ago - Stars: 0 - Forks: 0

Mozilla-Data-Collective/dataset-preprocessing-scripts

Scripts to preprocess datasets and adapt them for the MDC platform

Language: Python - Size: 30.3 KB - Last synced at: 25 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

Goyam02/movie_recommend

Language: Jupyter Notebook - Size: 9.57 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

0xPutri/Eksperimen_SML_Rozhak

This repository contains the initial experiments and automated preprocessing for the final project of "Membangun Sistem Machine Learning" (Building Machine Learning Systems). The dataset is analyzed, processed, and prepared as training-ready data according to the specified criteria.

Language: Jupyter Notebook - Size: 246 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

leansandoval/CienciaDeDatos

Class exercises and practical assignment for the Data Science course (Ciencia de Datos) at UNLaM (3670) - 1C / 2C 2025.

Language: Jupyter Notebook - Size: 22 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

AyaBoughanmi02/Salary-Prediction-Project

A comprehensive machine learning project using Linear and Logistic Regression to forecast salary value and classify six-figure earners

Language: Jupyter Notebook - Size: 725 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

BadBoy0170/training-data_BOT

Enterprise-grade training data curation bot for LLM fine-tuning using Decodo and Python automation. It provides an async, modular pipeline for document loading, preprocessing, task-specific data generation (Q&A, summarization, classification), quality evaluation, and dataset export — all through a unified API.

Language: Python - Size: 33.2 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Davide011/ML_project_South_African_Heart_Disease

Public Repository: Machine Learning & Data Mining project using the South African Heart Disease dataset. Applied PCA, Regularized Linear Regression, ANN, Logistic Regression, and Decision Trees with cross-validation for regression and classification. Includes feature scaling, EDA, and statistical tests.

Size: 1.32 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Abdullah-056/90-days-with-Buildables

This Repository contains all the work done in Buildables Fellowship.

Language: Jupyter Notebook - Size: 14.9 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

AmanSharma01Prime/netflix-content-analysis

Netflix content analysis is a data analysis project using Python in Google Colab, SQL in PostgreSQL, and visualization in Google Sheets.

Language: Jupyter Notebook - Size: 3.42 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Atquiya-Labiba/Analyzing-Critic-and-User-Scores-in-Movies

Analyzing critic and user scores in movies to explore trends from 2000 to 2025

Language: Python - Size: 766 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

cgizo/TwoPhotonPP

Preprocessing scripts for Two-Photon data. Compile, motion correct and downsample tifs.

Language: Python - Size: 16.6 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ThalesGroup/Iliad-custom-to-OIM-transformer

Scripts to preprocess ocean data files from custom apps in order to export the data to Ocean Information Model.

Language: Python - Size: 2.34 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

pgoyal77/Movie_Recommender_Website

I used CountVectorizer to convert movie data into vectors, Cosine Similarity to find similar movies, and PorterStemmer to clean the text data for better accuracy in recommendations.

Language: Jupyter Notebook - Size: 3.85 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0
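
The approach described for pgoyal77/Movie_Recommender_Website above (count vectors compared by cosine similarity) can be sketched with the standard library alone. This is an illustration under stated assumptions, not that project's code: `collections.Counter` stands in for scikit-learn's `CountVectorizer`, the stemming step is omitted, and the `movies` data and the helper names `count_vector`, `cosine_similarity`, and `most_similar` are hypothetical.

```python
import math
from collections import Counter


def count_vector(text):
    """Bag-of-words term counts (a stand-in for a count vectorizer)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy data: title -> space-separated tags.
movies = {
    "Space Quest": "space adventure alien crew",
    "Alien Crew": "alien crew horror space",
    "Love Story": "romance drama love",
}


def most_similar(title):
    """Return the other title whose tag vector is closest to `title`."""
    query = count_vector(movies[title])
    scores = {
        other: cosine_similarity(query, count_vector(tags))
        for other, tags in movies.items()
        if other != title
    }
    return max(scores, key=scores.get)


print(most_similar("Space Quest"))
```

Stemming, which the description credits with improving accuracy, would be applied to each token before counting so that inflected forms of a word share one vector dimension.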

Rachelnk/Customer-Churn-Prediction-ML

This repository contains an analysis of customer data to predict customer churn for a telecommunications company that provides home phone and internet services

Language: Jupyter Notebook - Size: 604 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Ayesha24banu/Customer-Purchase-Behaviour-Analysis-in-Retail

Customer Purchase Behaviour Analysis in Retail using Python, RFM Segmentation, Market Basket Analysis, and Power BI Dashboard.

Language: Jupyter Notebook - Size: 13 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

ddihora1604/Advanced_Business_Analytics_on_World_Bank_Global_Financial_Inclusion_Data_2021

Bridging the Gaps in Financial Inclusion: Understanding the Cash-Credit Paradox, Divide between Cash and Digital Payments, and Financial Resilience.

Language: Jupyter Notebook - Size: 27 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

agailloty/preprocess

preprocess is a fast data analysis preprocessing tool.

Language: Go - Size: 423 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 1

DelphinKdl/home_price_prediction_using_regularized_polynomial_regression

Housing price prediction using regularized polynomial regression

Language: Jupyter Notebook - Size: 1.77 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

gaurav-singh7092/ResuMatch

An AI-powered resume and job description matching application using natural language processing and machine learning techniques. This application provides intelligent analysis of resume-job compatibility with detailed scoring and recommendations.

Language: Python - Size: 1.45 MB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

SherineTarek224/Credit_Score

This repo is for credit score classification based on financial and demographic data, using supervised machine learning algorithms.

Language: Jupyter Notebook - Size: 2.02 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

sonjaove/ML-hands-on

repo for some hands on stuff

Language: Jupyter Notebook - Size: 137 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

JoseRuiz01/ChestXRayPneumoniaDetection

Pneumonia detection using Convolutional Neural Networks

Language: Jupyter Notebook - Size: 1.46 GB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

Naeem1144/segmentation-project

Customer Segmentation using Machine learning models for clustering analysis

Language: Jupyter Notebook - Size: 16.8 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

Lummy-A/montgomery-county-crime-analysis

Analysis of crime patterns in Montgomery County (2018-2022) using Python data science tools to identify trends, spatial hotspots, and temporal distributions across crime types. Includes visualizations and insights to inform prevention strategies.

Language: Jupyter Notebook - Size: 5.24 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

Tomaslopera/Fifa_Analysis

Language: Jupyter Notebook - Size: 8.71 MB - Last synced at: 5 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

tejaswirupa/Early-Prediction-of-Diabetes-Risk-Using-Machine-Learning

Built a predictive model using CDC health data to identify individuals at risk of developing diabetes. Achieved 90.6% F1-score using Logistic Regression and revealed key health indicators like BMI and blood pressure as top predictors.

Language: Jupyter Notebook - Size: 4.03 MB - Last synced at: 5 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

Related Topics
machine-learning (66), python (55), data-science (27), exploratory-data-analysis (23), data-visualization (21), data-analysis (21), pandas (20), preprocessing (19), numpy (14), scikit-learn (14), machine-learning-algorithms (13), logistic-regression (12), feature-engineering (12), eda (11), seaborn (10), data (10), deep-learning (9), classification (8), feature-selection (8), streamlit (8), dataset (8), random-forest (7), random-forest-classifier (7), clustering (7), linear-regression (7), matplotlib (7), powerbi (7), predictive-modeling (6), nlp (6), cleaning-data (6), tensorflow (6), sklearn (6), artificial-intelligence (6), python3 (6), datacleaning (5), data-mining (5), statistical-analysis (5), data-cleaning (5), csv (5), decision-tree-classifier (5), jupyter-notebook (5), knn-classification (5), nltk-python (5), analysis (4), sql (4), scikitlearn-machine-learning (4), sklearn-library (4), keras-tensorflow (4), neural-network (4), statistics (4), nltk-library (4), svm-classifier (4), dimensionality-reduction (4), data-engineering (4), visualization (4), kaggle-dataset (4), kaggle (3), plotly (3), regression-models (3), pandas-python (3), svm-model (3), hyperparameter-tuning (3), machinelearning (3), neural-networks (3), image-processing (3), ml (3), docker (3), r (3), cross-validation (3), natural-language-processing (3), feature-extraction (3), numpy-library (3), preprocessor (3), business-analytics (3), datascience (3), supervised-learning (3), flask (3), tableau (3), naive-bayes-classifier (3), nlp-machine-learning (3), sentiment-analysis (3), matplotlib-pyplot (3), twitter (3), keras (3), data-structures (3), string-manipulation (2), standardization (2), vizualization (2), vizualize-data (2), time-series (2), dataset-generation (2), descision-tree (2), cpp (2), jupyter (2), database (2), confusion-matrix (2), standard-scaler (2), cosine-similarity (2), pandas-library (2), automation (2)