An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: over-sampling

ChaitanyaC22/Telecom-Churn-Prediction

This project uses data analytics to analyze customer-level data from a leading telecom firm, build predictive models that identify customers at high risk of churn, and identify the main indicators of churn. It focuses on a four-month window: the first two months are the ‘good’ phase, the third month is the ‘action’ phase, and the fourth month is the ‘churn’ phase. The business objective is to predict churn in the last (i.e., fourth) month using data from the first three months.

Language: Jupyter Notebook - Size: 27.7 MB - Last synced at: 26 days ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 0

nickkunz/smogn

Synthetic Minority Over-Sampling Technique for Regression

Language: Python - Size: 730 KB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 308 - Forks: 76

baibai25/MNDO

Multivariate Normal Distribution based Oversampling

Language: Jupyter Notebook - Size: 65.4 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 1

sharmaroshan/Fraud-Detection-in-Online-Transactions

Detecting fraud in online transactions using anomaly detection techniques such as over-sampling and under-sampling. Because the ratio of frauds is less than 0.00005, simply applying a classification algorithm may result in overfitting.

Language: Jupyter Notebook - Size: 300 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 56 - Forks: 29

hanfei1986/Oversampling-of-imbalanced-data-with-RandomOverSampler--SMOTE-and-ADASYN

Imbalanced data are common in the real world, especially in anomaly-detection tasks. Handling the imbalance is important for these tasks; otherwise, predictions are biased towards the majority class. RandomOverSampler, SMOTE, and ADASYN are useful oversampling tools that generate additional data for minority classes and balance the dataset.

Language: Jupyter Notebook - Size: 8.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
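
As a rough illustration of the oversampling idea described in the entry above, here is a minimal standard-library sketch. It is not taken from the repository: the function names `random_oversample` and `smote_like_sample` and the toy data are hypothetical. The first function duplicates minority rows at random; the second only shows the interpolation step that SMOTE-style methods build on (real SMOTE also selects among k nearest neighbours, which is omitted here).

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def smote_like_sample(a, b, rng):
    """Return a synthetic point on the segment between two minority
    points a and b (the interpolation step only, not full SMOTE)."""
    t = rng.random()
    return [ai + t * (bi - ai) for ai, bi in zip(a, b)]


# Toy data (made up for illustration): four majority rows, two minority rows.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [5.0, 5.1], [5.2, 4.9]]
y = [0, 0, 0, 0, 1, 1]

X_bal, y_bal = random_oversample(X, y)
synthetic = smote_like_sample(X[4], X[5], random.Random(1))
print(Counter(y_bal))
print(synthetic)
```

In practice, the tools named in the description are available as classes in the imbalanced-learn package; the sketch above only conveys the underlying idea and is not a substitute for them.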

M-Hashemzadeh/RCSMOTE

RCSMOTE: Range-Controlled Synthetic Minority Over-sampling Technique for handling the class imbalance problem

Size: 6.28 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

NeonOstrich/Credit-Risk-Classification-using-Logistic-Regression

Trained and evaluated two supervised machine learning models using original and resampled data to identify 'healthy loan' and 'high risk loan' applicants from financial disclosures.

Language: Jupyter Notebook - Size: 932 KB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

alicevillar/student_admission_prediction

Predicting student admission with Logistic Regression, Decision Tree, SVM (SVC), and Random Forest

Language: Jupyter Notebook - Size: 174 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 1

chihangs/diabetes_classification

Uses random forest, gradient boosting, and a neural network with SMOTE-ENN and random over-sampling

Language: Jupyter Notebook - Size: 4.09 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

cbrito3/Credit_Risk_Analysis

Supervised Machine Learning and Credit Risk

Language: Jupyter Notebook - Size: 986 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

baibai25/MNDO-NC

Multivariate Normal Distribution Based Over-Sampling for Numerical and Categorical Features

Language: Jupyter Notebook - Size: 168 KB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

abhiram-ds/credit_card_fraud_detection

Credit Card Fraud detection based on anonymized data using multiple classification algorithms

Language: Jupyter Notebook - Size: 1.73 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

jabhinav/Data-Science-and-ML-for-Structured-Data-Classification

This repo contains scripts for data analysis on structured data. It also compares various ML algorithms at different stages of data preparation.

Language: Jupyter Notebook - Size: 522 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

JalajVora/Text-Analytics-with-Multi-Class-and-Imbalanced-Learning

Genre Identification task along with Text Analytics with Multi-Class and Imbalanced Learning on Gutenberg Corpus

Language: HTML - Size: 166 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

Related Keywords
over-sampling 14 machine-learning 7 logistic-regression 5 imbalanced-data 5 under-sampling 4 smote 4 classification 3 machine-learning-algorithms 3 random-forest 3 imbalanced-learning 3 data-analysis 2 classification-report 2 python 2 xgboost 2 pandas 2 neural-network 2 random-forest-classifier 2 class-imbalance 2 data-analytics 2 text-retrieval 1 naive-random-oversampler 1 imbalance-learning 1 easy-ensemble-classifier 1 performance-measurements 1 cluster-centroids-undersampling 1 prediction-model 1 resampling 1 balanced-random-forest 1 ada-boost-classifier 1 svc 1 diabetes-prediction 1 smote-enn 1 gradient-boosting 1 text-analytics 1 svm-rbf 1 naive-bayes-classifier 1 multiclass-classification 1 gutenberg 1 decision-tree-classifier 1 complement-navie-bayes 1 data-science 1 data-preparation 1 cost-sensitive-learning 1 binary-classification 1 skewness 1 decision-trees 1 credit-card-fraud 1 smote-oversampler 1 scikitlearn-machine-learning 1 scikit-learn 1 precision-recall 1 machine-learning-projects 1 data-visualization 1 confusion-matrix 1 auprc 1 anamoly-detection 1 synthetic-data 1 regression 1 telecom 1 statistics 1 rfe 1 pca 1 model-evaluation 1 model-building 1 hyperparameter-tuning 1 feature-engineering 1 evaluation-metrics 1 data-manipulation 1 data-cleaning 1 kfold-cross-validation 1 decision-tree 1 train-test-split 1 supervised-machine-learning 1 sklearn 1 reporting 1 pathlib 1 numpy 1 smote-sampling 1 imbalanced-datasets 1 imbalanced-classification 1 class-imbalance-problem 1 sampling 1 query 1 large-dataset 1 finance 1 deep-learning 1