An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: imbalanced-data

sachinML/Churn-Prediction-Web-Application-using-Deep-Learning

End-to-end deep-learning-based project to predict whether a customer will churn.

Language: Jupyter Notebook - Size: 736 KB - Last synced at: 9 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

antorguez95/synthetic_data_generation_framework

This repository contains the code of our published work in IEEE JBHI. Our main objective was to demonstrate the feasibility of using synthetic data to effectively train Machine Learning algorithms, proving that it benefits classification performance most of the time.

Language: Python - Size: 36.9 MB - Last synced at: 10 months ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 1

GenTaylor/Traffic-Accident-Analysis

Traffic Accident Analysis using python machine learning

Language: Jupyter Notebook - Size: 36.4 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 23 - Forks: 10

daehan-lim/associative-classifier-mortality-prediction

An interpretable associative classifier for predicting patient mortality using Electronic Medical Records (EMRs). Designed to handle highly imbalanced healthcare datasets.

Language: Jupyter Notebook - Size: 19.5 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

miriamspsantos/dcai-ecai-tutorial-2024

A multi-view panorama of Data-Centric AI: Techniques, Tools, and Applications (ECAI Tutorial 2024)

Size: 1.8 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Sh-31/Credit-Card-Fraud-Detection

This repository contains the code for a Credit Card Fraud Detection project using a highly unbalanced Kaggle dataset of 284,807 transactions with only 492 frauds. To address the imbalance, the project implements a voting classifier and a neural network with focal loss in PyTorch, achieving an F1-score of 0.86 and PR_AUC of 0.85 for the positive class.

Language: Jupyter Notebook - Size: 175 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

MuafiraThasni/Credit-Card-Fraud-Prediction

Credit card fraud is a major concern in the financial industry nowadays. Analysing fraudulent transactions manually is infeasible due to the huge amount of data and its complexity. However, given sufficiently informative features, one could expect Machine Learning to make this analysis possible.

Language: Jupyter Notebook - Size: 438 KB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

marinafajardo/prevendo-nivel-satisfacao

Predicting the Satisfaction Level of Santander Customers.

Language: Jupyter Notebook - Size: 7.79 MB - Last synced at: 10 months ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

marinafajardo/prevendo-customer-churn

Predicting Customer Churn in Telecom Operators

Language: Jupyter Notebook - Size: 2.18 MB - Last synced at: 10 months ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

leabrodyheine/ML-Kaggle-Cirrhosis-Data

This project showcases skills in machine learning, data preprocessing, and model evaluation using Python libraries such as scikit-learn, XGBoost, and Optuna. It involves implementing various machine learning models, handling imbalanced data, and employing imputation techniques to enhance model performance for predicting cirrhosis outcomes.

Language: Jupyter Notebook - Size: 12.6 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ramiyappan/Credit-card-Fraud

Explored various resampling techniques to learn from an imbalanced dataset for detecting Credit card frauds.

Language: Jupyter Notebook - Size: 9.23 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

micahwiesner67/Decision_Tree_Classifier_Heart_Disease

This is an end-to-end machine learning model in which I implement random-forest and decision tree classifiers to predict heart disease. I utilized cross-validation and oversampling to deal with an imbalanced dataset.

Language: Python - Size: 302 KB - Last synced at: 10 months ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

As00-00/Fraud_detection

Credit card fraud detection from European cardholders' transactions

Language: Jupyter Notebook - Size: 4.11 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

paulinamoskwa/Pediatric-Pneumonia-Chest-X-Ray

Pediatric pneumonia image classification with (strongly) imbalanced data via PyTorch 🫁

Language: HTML - Size: 8.64 MB - Last synced at: 5 months ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 4

dhwabqryh/Data-Mining-I

Lab assignments for Data Mining I

Language: Jupyter Notebook - Size: 2.01 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

power-TY/Imbalnce_Handling_PySpark

Implementation of imbalanced data handling algorithms using PySpark

Language: Jupyter Notebook - Size: 43 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

yaxinhou/imFTP

imFTP: Deep Imbalance Learning via Fuzzy Transition and Prototypical Learning (imFTP, Information Sciences 2024)

Language: Python - Size: 4.74 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Tejas-Nakave/Fraud-Analytics-ML-

Fraud analytics for credit cards utilizes advanced algorithms and machine learning to monitor transaction patterns and detect suspicious activities. By analyzing real-time data, it identifies anomalies such as unusual spending behaviors, geographic inconsistencies, and high-risk transactions.

Language: Jupyter Notebook - Size: 8.19 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

ashishpatel26/datascienv

datascienv is a package that helps you set up your environment in a single line of code with all dependencies. It also includes pyforest, which imports all required ML libraries in a single line.

Language: Python - Size: 229 KB - Last synced at: 25 days ago - Pushed at: over 3 years ago - Stars: 58 - Forks: 12

abhishekdbihani/Home-Credit-Default-Risk-Recognition

The project provides a complete end-to-end workflow for building a binary classifier in Python to recognize the risk of housing loan default. It includes methods like automated feature engineering for connecting relational databases, comparison of different classifiers on imbalanced data, and hyperparameter tuning using Bayesian optimization.

Language: Jupyter Notebook - Size: 2.93 MB - Last synced at: 11 months ago - Pushed at: almost 5 years ago - Stars: 17 - Forks: 10

ArmanDavoodi/CS-SBU-MachineLearning-BSc-2022 (fork of alisharifi2000/CS-SBU-MachineLearning-BSc-2022)

Machine Learning Course of Computer Science Faculty of Shahid Beheshti University. Winter 2022

Language: Jupyter Notebook - Size: 93.7 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

SerkanGuldal/sentetik

Synthetic data generation package to balance imbalanced datasets

Language: Python - Size: 7.22 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

emrecanduran/Marketing-Campaign-Response-Prediction

Develop a model to predict which retail customers will respond to a marketing campaign. Logistic Regression shows the best performance.

Language: Jupyter Notebook - Size: 4.65 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

nguyendoanhoang/Graduation-Thesis

Using Machine Learning in predicting customer churn from bank credit card services

Language: R - Size: 1.44 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

viniciusds2020/ml_balaceamento_allknn

This repository contains Machine Learning code that uses the AllKNN algorithm from the imblearn package to balance data.

Language: Jupyter Notebook - Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

shinho123/23.11.10-1st-Korean-Society-of-Industrial-Engineers

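November 2023, Korean Institute of Industrial Engineers (UNIST): Game User Churn Prediction Considering Multi-Role Experience, Focusing on League of Legends (first author)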

Language: Jupyter Notebook - Size: 45 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

Davityak03/Credit-Card-Fraud-Detection

Language: Jupyter Notebook - Size: 37.1 KB - Last synced at: 2 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

Rahafzsh/CreditCardFraudDetection

An Artificial Neural Network (ANN) model detects whether a credit card is fraudulent or not. 

Language: Jupyter Notebook - Size: 3.26 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

mdh266/TextClassificationApp

Building and Deploying A Serverless Text Classification Web App

Language: Jupyter Notebook - Size: 8.61 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 18 - Forks: 10

aditya11ad/ML-course

Best for beginners | Well explained ML algorithms | organized Notebooks | Case Studies

Language: Jupyter Notebook - Size: 7.06 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 3 - Forks: 0

pradeepdev-1995/databalancer

Databalancer is a Python library used in machine learning applications to balance imbalanced text classification datasets before model training.

Language: Python - Size: 247 KB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 7 - Forks: 0

filipusarif/Imbalance-data-SVM-Python

Working with an imbalanced dataset for classification using an SVM model

Language: Jupyter Notebook - Size: 39.1 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

zhanghaoshuang/Data-Analytics-in-Business-Group-Project

Using R Markdown for Data Analysis, Machine Learning

Language: HTML - Size: 0 Bytes - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

foxcroftjn/PAKDD-Class-Ratio

Supplementary code for "Class ratio and its implications for reproducibility and performance in record linkage" presented at The Pacific-Asia Conference on Knowledge Discovery and Data Mining 2024.

Language: Jupyter Notebook - Size: 34.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

emrecanduran/Marketing-Campaign-Classification

In this repository, two tree-based models from different families, bagging (RF) and boosting (XGB), are implemented using the CRISP-DM process to predict responses to a marketing campaign, incorporating Borderline-SMOTE to tackle the imbalance in the dataset.

Language: Jupyter Notebook - Size: 3.64 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ekellbuch/longtail_ensembles

Evaluating ensemble performance in long-tailed datasets (NeurIPS 2023 Heavy Tails Workshop)

Language: Python - Size: 1.58 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

christopher-w-murphy/Class-Imbalance-in-WW-Polarization

Treating the measurement of the same-sign W polarization fraction as a class imbalance problem

Language: Jupyter Notebook - Size: 26.1 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 2

NaquibAlam/TheMisfits

Language: Jupyter Notebook - Size: 1.08 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

lavinomenezes/desafio_indicium

Multiclass classification project for imbalanced data.

Language: Jupyter Notebook - Size: 7.98 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

gkoays/Oversample-Image-Data-with-Augmentations

Language: Python - Size: 165 KB - Last synced at: 5 months ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

pgrondein/scoring_model_financial_company

Scoring model for financial company - all files

Language: Python - Size: 156 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 2

deBUGger404/Imbalanced-classification

Imbalanced Data Classification Repository - 📦🤖 Code for classifying products into categories using deep learning. Divided into dataset creation, model development, and transfer learning sections. Implements TensorFlow for efficient training, tackles imbalanced classes, and includes saved models and one-hot encoded labels.

Language: Jupyter Notebook - Size: 89.8 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

amy-panda/NBA_Career_Prediction

Predicting whether an NBA rookie player will last at least 5 years in the league

Language: HTML - Size: 68.1 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

zjukg/AdaMF-MAT

[Paper][LREC-COLING 2024] Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion

Language: Python - Size: 1.91 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 1

fsarshad/Covid19XRaysHw2

Covid-19 X-Rays Deep Learning

Language: Jupyter Notebook - Size: 6.13 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

rishisinghlive/Fraud-Detection

Transaction Fraud Detection

Language: Jupyter Notebook - Size: 156 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

GeorgeM2000/Vehicle-Insurance-Fraud-Detection-and-Credit-Risk-Assessment

This project endeavors to synthesize the challenges posed by varying misclassification costs and class imbalances, along with the corresponding solutions available for addressing these issues.

Language: Jupyter Notebook - Size: 703 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

MylieMudaliyar/Credit-Card-Fraud-Detection

Credit Fraud Detection on a highly imbalanced dataset of 280k transactions. Multiple ML algorithms (LogisticReg, ShallowNeuralNetwork, RandomForest, SVM, GradientBoosting) are compared for prediction purposes.

Language: Jupyter Notebook - Size: 305 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

LaurentVeyssier/Starbucks_case_study_Udacity_Data_Science

Case study from UDACITY Data Scientist Nanodegree

Language: Jupyter Notebook - Size: 1.75 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

gadiankit/Pseudo_Code

Pseudo Code or parts of code to speed up the process of data exploration and model building with little modifications.

Language: R - Size: 11.7 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

vyshaal/circleup-mltask

Exploratory Data Analysis & Data Modeling

Language: Jupyter Notebook - Size: 1.27 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

Samir-Zade/Feature-Engineering-and-Exploratory-Data-Analysis

This repository contains resources and code examples related to Feature Engineering and Exploratory Data Analysis (EDA) techniques in the field of data science and machine learning.

Language: Jupyter Notebook - Size: 1.58 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

KhoiDOO/ibla 📦

IBLA - Imbalance Learning Archive

Language: Python - Size: 132 KB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ireneban/imbalanced_classification_python

study repo for imbalanced data classification

Language: Jupyter Notebook - Size: 82 KB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

igarleni/Imbalanced-Data-analysis-in-R---KAGGLE-competition

Imbalanced data analysis in R combining different techniques.

Language: R - Size: 943 KB - Last synced at: about 1 year ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 1

igarleni/Imbalanced-Data-analysis-with-R---First-steps

Learning how to analyze imbalanced data, implementing SMOTE and using the unbalanced R package

Language: R - Size: 897 KB - Last synced at: about 1 year ago - Pushed at: about 8 years ago - Stars: 2 - Forks: 0

kaiquefreire05/imbalanced-data-attribute-selection

Notebook with my studies on how to handle unbalanced data and attribute selection.

Language: Jupyter Notebook - Size: 37.1 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

abdoghareeb46/NTI-Final-Assignment

NTI-Final-Assignment: uses Flask (Python) and a Shiny dashboard (R) to build a simple user interface showing how the choice of classification model may affect prediction accuracy, using a Customer Churn Dataset.

Language: Jupyter Notebook - Size: 534 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

WiktorPieklik/BaggFold

Official implementation of Bagging Folds using Synthetic Majority Oversampling for Imbalance Classification

Language: Python - Size: 1.08 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

abhirup-ghosh/credit-risk-modelling

LightGBM credit default predictor + AWS deployment (EC2 + ECR)

Language: Jupyter Notebook - Size: 867 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mohammad95labbaf/Outlier-Imbalanced-Fraud-Detection

The Credit Card Fraud Detection project uses statistical techniques and machine learning for identifying fraudulent transactions. It includes data preprocessing, outlier detection using Boxplots and Z-scores, and a decision tree model. Evaluation goes beyond accuracy, considering precision, recall, F1-score, and ROC AUC.

Language: Jupyter Notebook - Size: 646 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

phiyodr/multilabel-oversampling

Many algorithms for imbalanced data support binary and multiclass classification only. This approach is made for multi-label classification (aka multi-target classification). 🌻

Language: Python - Size: 358 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 0

Divya-Bhargavi/wids_datathon_2019

Women in Data Science Competition

Language: Jupyter Notebook - Size: 88.3 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

Dhrumil-Zion/Sentiments-Prediction-Using-NLP

Predicting customer sentiment from Amazon feedback. While exploring NLP and its fundamentals, I applied many data preprocessing techniques. In this repository, I implemented a bag-of-words representation using the CountVectorizer class from sklearn and trained a LogisticRegression model on these vectors, which gives approximately 93% accuracy. I extracted the top 20 positive and negative feedback words from thousands of feedback entries. I then automated the whole process in a single function so that it can be reused generically with many machine learning algorithms. I also tested DummyClassifier, which gives an accuracy of around 84%. After that, I applied TF-IDF and combined it with LogisticRegression, which gives almost 93% accuracy but deeper insights. I also addressed the imbalanced data problem with the RandomOverSampler class from the imblearn library.

Language: Jupyter Notebook - Size: 316 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

DocentSzachista/CNN-tackle-imbalance

Deep learning course project

Language: Python - Size: 2.65 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Hands-On-Fraud-Analytics/Chapter-12-Data-Preparation-for-Fraud-Analytics

Chapter 12: Data Preparation for Fraud Analytics

Language: Jupyter Notebook - Size: 3.57 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

rezamosavi8740/Model-Selection-in-Credit-Card-Transaction-Analysis

Explore model selection in credit card transaction analysis with Reza Mousavi's Git project. Addressing class imbalance, it employs undersampling and features tree-based models, SVM, and logistic regression for effective fraud detection

Language: Jupyter Notebook - Size: 3.36 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

shreyasngredd/SentimentAnalysis-Bank-Reviews

Data scraping, data pre-processing, exploratory data analysis, sentiment analysis of Bank Online Reviews

Language: Jupyter Notebook - Size: 3.36 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ashishrana1501/Feature-Engineering

This notebook consists of all the Feature Engineering and Feature Transformation techniques

Language: Jupyter Notebook - Size: 1.18 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

shivamkc01/Handling_Imbalanced_dataset

This project is about how you can deal with imbalanced data and which performance metrics are particularly important compared to usual practice with fairly balanced data.

Language: Jupyter Notebook - Size: 338 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

AjNavneet/Customer_LiabilityToAsset_PredictiveAnalysis

Classifier for predicting customers who can be converted from liability to asset.

Language: Jupyter Notebook - Size: 2.01 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

huongnd12/credit-card-fraud-detection

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

dipakexe/CREDIT_CARD_FRAUD_DETECTION

Language: Jupyter Notebook - Size: 73.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

atmacvit/bincrowd

Official Implementation of ACMMM'21 paper "Wisdom of (Binned) Crowds: A Bayesian Stratification Paradigm for Crowd Counting"

Language: Python - Size: 4.04 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 17 - Forks: 2

aaryadevg/Metro_PdM_Research

Failure prediction for APUs on a metro system: research project source code

Language: Jupyter Notebook - Size: 8.14 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

anudeepvanjavakam1/churn_prediction

A flask app to predict customer churn for a subscription service business

Language: Jupyter Notebook - Size: 66.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rezamosavi8740/Detect-Tuberculosis-Chest-X-ray-Dataset

This project concerns identifying tuberculosis patients from radiological images. The main challenges are data imbalance, data preprocessing, and implementing deep learning models to solve the task.

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Giut0/NASA-NEO-Project

Final project for Data Mining course (Uniba)

Language: Jupyter Notebook - Size: 25.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

creativeDev6/twitter_sentiment_analysis_with_word2vec

Twitter sentiment analysis with word2vec.

Language: Python - Size: 2.08 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Hands-On-Fraud-Analytics/Chapter-11-Handling-Imbalanced-Data-Sets

Handling Imbalanced Data Sets

Language: Jupyter Notebook - Size: 325 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ammahmoudi/Credit-Risk-Prediction

Predicting credit risk when a person requests a loan, using random forest on the South German dataset (fixing imbalanced data)

Language: Jupyter Notebook - Size: 352 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

alaminbhuyan/Handling-Imbalance-data-in-Machine-Learning

Language: Jupyter Notebook - Size: 38.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

seoruosa/ds-for-healthcare-final-project-autism

Repository for the final project on autism diagnosis for the Health Data Science and Visualization course at UNICAMP, developed by Gabriela Servidone, Felipe Labate, and Thiago Giachetto de Araujo.

Language: Jupyter Notebook - Size: 2.44 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 1

baibai25/MNDO

Multivariate Normal Distribution based Oversampling

Language: Jupyter Notebook - Size: 65.4 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 1

anjalysam/Health-Insurance-Cross-Sell-Prediction

Predict which health insurance owners will be interested in vehicle insurance

Language: Jupyter Notebook - Size: 6.23 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 4

sharmaroshan/Fraud-Detection-in-Insurace-Claims

This is a very important part of a Data Science case study, because detecting frauds, analyzing their behaviours, and finding the reasons behind them is one of the prime responsibilities of a Data Scientist. This branch comes under Anomaly Detection.

Language: Jupyter Notebook - Size: 2.23 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 7 - Forks: 3

chenxu93/imbalanced_flow

PCCN for imbalanced flow data classification

Language: Python - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 6

30lm32/ml-imbalanced-car-booking-data

Create an ML model using a Random Forest Classifier over skewed (imbalanced) booking data

Language: Jupyter Notebook - Size: 2.53 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 2

Albertsr/Class-Imbalance

Cost-Sensitive Learning / ReSampling / Weighting / Thresholding / BorderlineSMOTE / AdaCost / etc.

Language: Python - Size: 6 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 96 - Forks: 25

FarhanaTeli/Kyphosis_Disease_Prediction_with_NN_XGBoost

Kyphosis disease prediction using simple Neural Network (NN) model and XGBoost model with GridSearchCV

Language: Jupyter Notebook - Size: 221 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Soumyajit2709/Drug-Detection-using-Graph-Embedding

It is a simple machine learning algorithm to get the latent vector of the Molecules from the datasets. After that we address the imbalance problem in the dataset and handle it by using various resampling techniques. Then we measure the performance of the algorithm by deploying various Classifiers.

Language: Jupyter Notebook - Size: 20.1 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

cristinelpopescu/Resampling-strategies-for-imbalanced-datasets

Language: Jupyter Notebook - Size: 89.8 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

pankush9096/kaggle-Credit-Card-Fraud-Detection

This is a highly imbalanced dataset with only 1.72% minority and 98.28% majority class. I explain up- and down-sampling and the effect of sampling before versus during cross-validation. The model is evaluated using the precision-recall curve.

Language: Jupyter Notebook - Size: 2.21 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 4

Soumyajit2709/AN-APPROACH-TO-CLASSIFY-ASTRONOMICAL-OBJECT-USING-IMBALANCED-SLOAN-DIGITAL-SKY-SURVEY-DATA

Classify stars, galaxies, and quasars with SDSS DR16 data. Balanced dataset using resampling techniques improves AdaBoost classifier's performance, enhancing astronomical object classification accuracy.

Language: Jupyter Notebook - Size: 62.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

alecngo/cervical-cancer-project

Deploy SVM, Random Forest, and the Streamlit package to build a web app for early detection of cervical cancer

Language: Jupyter Notebook - Size: 5.46 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Swastik-25/Imbalanced-Data-with-SMOTE-Techniques

This repository contains implementations of techniques such as SMOTE, ADASYN, SMOTE + Tomek Links, and SMOTE + ENN to overcome class imbalance in a binary classification problem.

Language: Jupyter Notebook - Size: 2.74 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 31 - Forks: 22

LuChang-CS/MTGAN

Code for the paper: Multi-Label Clinical Time-Series Generation via Conditional GAN

Language: Python - Size: 671 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 16 - Forks: 7

xecyborg/fraud-transaction-detection

Fraud transaction detection using Machine Learning algorithms on highly imbalanced dataset

Language: Jupyter Notebook - Size: 940 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

arunsinghbabal/Automated-Defective-Substrate-Identification-for-Expedited-Manufacturing

Identifies the faulty wafer before it can be used for the fabrication of integrated circuits and, in photovoltaics, to manufacture solar cells. The project retrains itself after every prediction, making it more robust and generalized over time.

Language: Python - Size: 9.37 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 1

bnafack/Turorial-machine-learning-

This is the tutorial I gave to the undergraduate student in Cameroon

Language: Jupyter Notebook - Size: 169 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Related Keywords
imbalanced-data (424), machine-learning (176), classification (90), python (78), imbalanced-learning (56), data-science (54), logistic-regression (53), smote (53), deep-learning (53), random-forest (43), imbalanced-classification (38), fraud-detection (33), scikit-learn (28), oversampling (26), xgboost (26), sklearn (24), feature-engineering (21), pandas (20), random-forest-classifier (20), pytorch (19), tensorflow (18), undersampling (16), jupyter-notebook (16), feature-selection (15), exploratory-data-analysis (15), binary-classification (14), machine-learning-algorithms (14), data-visualization (14), imbalance-classification (14), imblearn (14), eda (14), svm (14), data-preprocessing (13), class-imbalance (13), python3 (13), xgboost-classifier (12), credit-card-fraud (12), outlier-detection (12), keras (12), neural-network (12), lightgbm (11), numpy (11), hyperparameter-tuning (11), supervised-learning (11), decision-tree-classifier (10), r (10), seaborn (10), ensemble-learning (10), cnn (10), flask (9), decision-trees (9), classification-model (9), multiclass-classification (9), transfer-learning (9), cross-validation (9), matplotlib (9), churn-prediction (8), data-analysis (8), classification-algorithm (8), regression (8), streamlit (8), data-augmentation (8), neural-networks (8), data-mining (8), image-classification (8), ml (7), svm-classifier (7), boosting (7), data-cleaning (7), computer-vision (7), natural-language-processing (7), pca (7), oversampling-technique (7), tensorflow2 (6), naive-bayes-classifier (6), gridsearchcv (6), confusion-matrix (6), hyperparameter-optimization (6), meta-learning (6), kaggle (6), knn (6), pipeline (6), optuna (6), text-classification (6), credit-card-fraud-detection (6), long-tail (6), credit-card (6), knn-classification (6), fraudulent-transactions (5), adasyn (5), resampling-methods (5), focal-loss (5), clustering (5), decision-tree (5), long-tailed-recognition (5), imbalanced-classes (5), deep-neural-networks (5), healthcare (5), smote-oversampler (5), shap (5)