Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: categorical-features

TienNguyen93/hospital-readmission

Applies the ensemble technique of model stacking to predict patient readmission.

Language: Jupyter Notebook - Size: 134 KB - Last synced: about 12 hours ago - Pushed: about 14 hours ago - Stars: 0 - Forks: 0

catboost/catboost

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Language: Python - Size: 1.6 GB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 7,793 - Forks: 1,155

bhattbhavesh91/catboost-tutorial

A small tutorial to demonstrate the power of CatBoost Algorithm

Language: Jupyter Notebook - Size: 293 KB - Last synced: 29 days ago - Pushed: almost 3 years ago - Stars: 6 - Forks: 14

serengil/chefboost

A Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting, Random Forest and Adaboost w/categorical features support for Python

Language: Python - Size: 1.08 MB - Last synced: 28 days ago - Pushed: 5 months ago - Stars: 444 - Forks: 101

soum-io/GPA_Predictor

Language: Python - Size: 2.02 MB - Last synced: about 1 month ago - Pushed: almost 6 years ago - Stars: 1 - Forks: 0

jessislearning/Medical-Data-Visualizer Fork of freeCodeCamp/boilerplate-medical-data-visualizer

Data Analysis with Python project from freeCodeCamp (3 of 5)

Language: Python - Size: 884 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

entron/entity-embedding-rossmann

Language: Jupyter Notebook - Size: 3.86 MB - Last synced: 2 months ago - Pushed: over 4 years ago - Stars: 865 - Forks: 328

ritika-0111/Customer-Segmentation-using-LightGBM-Classifier

Predicts which of the segments A, B, C, or D a new customer belongs to, using a LightGBM classifier.

Language: Jupyter Notebook - Size: 101 KB - Last synced: 3 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

c4pub/deodel

A mixed attributes predictive algorithm implemented in Python.

Language: Python - Size: 267 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 4 - Forks: 1

Arfua/tabnet

Data Science Session: TabNet

Language: Jupyter Notebook - Size: 466 KB - Last synced: 4 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 1

Atomu2014/product-nets

Tensorflow implementation of Product-based Neural Networks. An extended version is at https://github.com/Atomu2014/product-nets-distributed.

Language: Python - Size: 4.57 MB - Last synced: 7 months ago - Pushed: over 4 years ago - Stars: 373 - Forks: 127

ashishyadav24092000/MAchineLearning_FeatureEngineering1

In this repository I have performed complete feature engineering, from handling null values and categorical features up to feature scaling on the test_data and train_data.

Language: Jupyter Notebook - Size: 2.27 MB - Last synced: 7 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

ashishyadav24092000/EDA_on_HousePrice

In this repository I have performed Exploratory data analysis on the dataset famously known as House Price Prediction.

Language: Jupyter Notebook - Size: 1.52 MB - Last synced: 7 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

ashishyadav24092000/FE_categorical_missing_values

This code shows how to handle missing values in the categorical features of any dataset.

Language: Jupyter Notebook - Size: 145 KB - Last synced: 7 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

ashishyadav24092000/Encoding_categorical-variables

The most commonly used encoding techniques for categorical variables are performed here.

Language: Jupyter Notebook - Size: 1.13 MB - Last synced: 7 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

ahmedlrashed/housing-prediction-model

Built and optimized a predictive regression model of housing prices with historical CA housing data.

Language: Jupyter Notebook - Size: 1.64 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

Nikolay-Lysenko/dsawl 📦

A set of tools for machine learning (currently, there are active learning utilities and implementations of some stacking-based techniques).

Language: Python - Size: 194 KB - Last synced: 14 days ago - Pushed: 9 months ago - Stars: 2 - Forks: 0

daniele-salerno/Handle-missing-values-in-Categorical-Features

Medium Post: some techniques useful to deal with missing values of Categorical Features

Language: Jupyter Notebook - Size: 87.9 MB - Last synced: 9 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

ShrayanRoy/cda_project

Coursework project for Categorical Data Analysis (M.Stat Semester 2) under the supervision of Prof. Arindam Chatterjee, ISID

Language: HTML - Size: 661 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

PriyankaSett/obesity_multiclassification

Given a person's data, the task is to predict which weight category the person falls into. This is a multiclass classification project.

Language: Jupyter Notebook - Size: 1.91 MB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 0 - Forks: 0

konodyuk/kts

Interactive ML Toolset

Language: Python - Size: 2.24 MB - Last synced: 19 days ago - Pushed: about 4 years ago - Stars: 17 - Forks: 2

bfgray3/cattonum

Encode Categorical Features (unmaintained)

Language: R - Size: 212 KB - Last synced: 3 months ago - Pushed: over 1 year ago - Stars: 32 - Forks: 5

cpa-analytics/embedding-encoder

Scikit-Learn compatible transformer that turns categorical variables into dense entity embeddings.

Language: Jupyter Notebook - Size: 758 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 34 - Forks: 6

mmortazavi/EntityEmbedding-Working_Example

This repository contains a notebook demonstrating a practical implementation of the so-called Entity Embedding for Encoding Categorical Features for Training a Neural Network.

Language: Jupyter Notebook - Size: 653 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 75 - Forks: 32

marcello-calabrese/edatemplates

Standard exploratory data analysis templates in Markdown and txt format

Size: 2.93 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

raynardj/category

Category transformation

Language: Python - Size: 32.2 KB - Last synced: 19 days ago - Pushed: about 2 years ago - Stars: 3 - Forks: 1

adimajo/glmdisc_python

glmdisc Python package: discretization, factor level grouping, interaction discovery for logistic regression

Language: Python - Size: 5.92 MB - Last synced: 29 days ago - Pushed: 6 months ago - Stars: 6 - Forks: 1

spayot/mte-plus

Benchmarking various categorical encoding techniques for tabular data across 6 classification tasks, using 5 different downstream classifiers.

Language: Jupyter Notebook - Size: 14.3 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

victor7246/Notebooks

This repository contains notebooks on different topics across - linear algebra, image classification, language models etc.

Language: Jupyter Notebook - Size: 9.85 MB - Last synced: over 1 year ago - Pushed: about 4 years ago - Stars: 1 - Forks: 0

Maskar/chicago_traffic_crashes

This study creates machine learning models to predict the seriousness of car crashes using 2019 and 2020 crash reports from the publicly accessible database maintained by the Chicago Police Department. A car crash is considered serious if it results in an injury or the car is towed because of the crash. Models use categorical features that describe conditions at the time of the crash and crash causes to predict the target. The current focus is to classify whether a crash results in an injury. All machine learning models are trained, validated, and tested on randomly split 2019 crash reports. The best model (along with all others) is then tested using the full set of 2020 crash reports.

Language: Jupyter Notebook - Size: 39.4 MB - Last synced: 12 months ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0

licesonw/deepmm

Multimodal deep learning package that uses both categorical and text-based features in a single deep architecture for regression and binary classification use cases.

Language: Python - Size: 385 KB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 3 - Forks: 0

abhmalik/categorical-feature-importances-without-one-hot-encoding-dummies

Computing feature importance for categorical variables by converting them into dummy variables (one-hot encoding) can produce skewed or hard-to-interpret results. Here I present a method to get around this problem using H2O.

Language: Jupyter Notebook - Size: 108 KB - Last synced: about 1 year ago - Pushed: almost 5 years ago - Stars: 1 - Forks: 0

sumansahoo16/Categorical-Feature-Encoding-Challenge-II

Language: Jupyter Notebook - Size: 24.6 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 3 - Forks: 0

vc1492a/henosis

A Python framework for deploying recommendation models for form fields.

Language: Python - Size: 1.05 MB - Last synced: 29 days ago - Pushed: over 1 year ago - Stars: 11 - Forks: 3

Navadeeppasala/Data-Analysis-with-Python

Why data analysis, how to understand the problem, what to do for data analysis, and how to clean the data for building machine learning models

Language: Jupyter Notebook - Size: 201 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0

CircArgs/target_statistic_encoding

A lightweight library for encoding categorical features in your dataset with robust k-fold target statistics in training with credibility filtering, and custom statistics.

Language: Python - Size: 309 KB - Last synced: 27 days ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

ItsWajdy/categorical_features_euclidean_distance

A python package to compute pairwise Euclidean distances on datasets with categorical features in little time

Language: Python - Size: 17.6 KB - Last synced: about 1 month ago - Pushed: almost 4 years ago - Stars: 1 - Forks: 1

Helga-Helga/methods_of_artificial_intelligence

Laboratory works on Methods of Artificial Intelligence course

Language: Jupyter Notebook - Size: 653 KB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

ShrishailSGajbhar/Categorical-Feature-Encoding-Challenge-II

My solution for the Kaggle competition "Categorical Feature Encoding Challenge II", with a public and private score of 0.783.

Language: Jupyter Notebook - Size: 18.6 KB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 0 - Forks: 0

viktorsapozhok/cafeen

Kaggle Categorical Feature Encoding Challenge II, private score 0.78795 (110th place)

Language: Jupyter Notebook - Size: 22.5 MB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 2 - Forks: 1

davidmasse/US-supreme-court-prediction

Predicting the ideological direction of Supreme Court decisions: ensemble vs. unified case-based model

Language: Jupyter Notebook - Size: 4.74 MB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 5 - Forks: 3

anhtholee/zalo-hit-song

Solution to Zalo AI Challenge 2019's Hit Song Prediction.

Language: Python - Size: 5.57 MB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

ocramz/record-encode

Generic encoding of record types

Language: Haskell - Size: 39.1 KB - Last synced: 29 days ago - Pushed: over 5 years ago - Stars: 2 - Forks: 1

praxitelisk/CATegorical-Feature-Encoding-Challenge

Binary classification with every feature categorical

Language: Jupyter Notebook - Size: 12.4 MB - Last synced: 11 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

yurayli/kaggle-redhat

Red Hat dataset from a Kaggle competition

Language: Jupyter Notebook - Size: 29.9 MB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

Related Keywords
categorical-features (45), machine-learning (19), python (9), data-science (8), deep-learning (7), kaggle (7), feature-engineering (5), exploratory-data-analysis (5), categorical-data (5), missing-values (4), catboost (4), encoding (4), gradient-boosting (4), supervised-learning (3), tabular-data (3), tutorial (3), embeddings (3), random-forest (3), neural-networks (3), decision-trees (3), data-mining (3), xgboost (3), numerical-features (3), jupyter-notebook (3), python3 (2), scikit-learn (2), logistic-regression (2), entity-embedding (2), regression (2), eda (2), recommender-system (2), onehot-encoding (2), seaborn (2), one-hot-encode (2), feature-importance (2), pandas-dataframe (2), kaggle-competition (2), gbdt (2), gbm (2), r (2), gpu (2), gpu-computing (2), analysis (1), data-analysis (1), car-crashes (1), chicago (1), chicago-data-portal (1), chicago-police-department (1), crash-reports (1), injury (1), machinelearning (1), study (1), traffic-crashes (1), deep-and-cross (1), deepfm (1), zaloai (1), factorization-machine (1), multimodal (1), multimodal-deep-learning (1), zalo (1), embedding (1), bivariate-analysis (1), class-imbalance (1), preprocessing (1), business-analytics (1), markdown (1), problem-solving (1), statistical-analysis (1), template (1), univariate-analysis (1), discretization (1), gibbs-sampler (1), interactions (1), generic-programming (1), clustering (1), image-classification (1), language-models (1), linear-algebra (1), matrix-factorization (1), natural-language-processing (1), one-shot-learning (1), sentiment-classification (1), topic-modeling (1), categorical-feature-encoding (1), normalization (1), pre-processing-data (1), pandas (1), target-statistic (1), euclidean-distance (1), euclidean-distances (1), fast (1), kaggle-solution (1), kaggle-dataset (1), pairwise (1), artificial-intelligence (1), artificial-intelligence-algorithms (1), fuzzy-logic (1), linear-classification (1), perceptron (1), sales-prediction (1)