GitHub topics: categorical-features
serengil/chefboost
A Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting, Random Forest and Adaboost w/categorical features support for Python
Language: Python - Size: 1.09 MB - Last synced at: about 21 hours ago - Pushed at: about 2 months ago - Stars: 476 - Forks: 101

catboost/catboost
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Language: C++ - Size: 1.66 GB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 8,392 - Forks: 1,218

entron/entity-embedding-rossmann
Language: Jupyter Notebook - Size: 3.86 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 873 - Forks: 324

bhattbhavesh91/catboost-tutorial
A small tutorial to demonstrate the power of CatBoost Algorithm
Language: Jupyter Notebook - Size: 293 KB - Last synced at: 27 days ago - Pushed at: almost 4 years ago - Stars: 10 - Forks: 14

vc1492a/henosis
A Python framework for deploying recommendation models for form fields.
Language: Python - Size: 1.05 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 3

c4pub/deodel
A mixed attributes predictive algorithm implemented in Python.
Language: Python - Size: 267 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 5 - Forks: 2

konodyuk/kts
Interactive ML Toolset
Language: Python - Size: 2.24 MB - Last synced at: 7 months ago - Pushed at: 11 months ago - Stars: 16 - Forks: 2

98MM/msc_cc
MSC Project - Artificial Categorical Datasets
Language: Jupyter Notebook - Size: 2.15 MB - Last synced at: 9 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

rom1mouret/catclustering
Rust crate for clustering categorical data
Language: Rust - Size: 7.81 KB - Last synced at: 11 days ago - Pushed at: 11 months ago - Stars: 1 - Forks: 1

TienNguyen93/hospital-readmission
Applies the ensemble technique of model stacking to predict patient readmission
Language: Jupyter Notebook - Size: 134 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

jessislearning/Medical-Data-Visualizer (fork of freeCodeCamp/boilerplate-medical-data-visualizer)
Data Analysis with Python project from freeCodeCamp (3 of 5)
Language: Python - Size: 884 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

bfgray3/cattonum
Encode Categorical Features (unmaintained)
Language: R - Size: 212 KB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 32 - Forks: 4

ritika-0111/Customer-Segmentation-using-LightGBM-Classifier
Predicts which of the segments A, B, C, and D a new customer belongs to, using a LightGBM classifier.
Language: Jupyter Notebook - Size: 101 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

Arfua/tabnet
Data Science Session: TabNet
Language: Jupyter Notebook - Size: 466 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

Atomu2014/product-nets
Tensorflow implementation of Product-based Neural Networks. An extended version is at https://github.com/Atomu2014/product-nets-distributed.
Language: Python - Size: 4.57 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 373 - Forks: 127

ashishyadav24092000/MAchineLearning_FeatureEngineering1
In this repository I have performed complete feature engineering, from handling null values and categorical features up to performing feature scaling on our test_data and train_data.
Language: Jupyter Notebook - Size: 2.27 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

ashishyadav24092000/EDA_on_HousePrice
In this repository I have performed Exploratory data analysis on the dataset famously known as House Price Prediction.
Language: Jupyter Notebook - Size: 1.52 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

ashishyadav24092000/FE_categorical_missing_values
This code shows how to handle missing values for the categorical features of any dataset.
Language: Jupyter Notebook - Size: 145 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

ashishyadav24092000/Encoding_categorical-variables
The most commonly used encoding techniques for categorical variables are performed here.
Language: Jupyter Notebook - Size: 1.13 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

ahmedlrashed/housing-prediction-model
Built and optimized a predictive regression model of housing prices with historical CA housing data.
Language: Jupyter Notebook - Size: 1.64 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Nikolay-Lysenko/dsawl 📦
A set of tools for machine learning (currently, there are active learning utilities and implementations of some stacking-based techniques).
Language: Python - Size: 194 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

daniele-salerno/Handle-missing-values-in-Categorical-Features
Medium Post: some techniques useful to deal with missing values of Categorical Features
Language: Jupyter Notebook - Size: 87.9 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

ShrayanRoy/cda_project
Coursework project - Categorical Data Analysis (M.Stat Semester 2) under the supervision of Prof. Arindam Chatterjee, ISID
Language: HTML - Size: 661 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

PriyankaSett/obesity_multiclassification
Given a person's data, the task is to predict which category the person's weight falls into. This is a multiclass classification project.
Language: Jupyter Notebook - Size: 1.91 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

cpa-analytics/embedding-encoder
Scikit-Learn compatible transformer that turns categorical variables into dense entity embeddings.
Language: Jupyter Notebook - Size: 758 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 34 - Forks: 6

mmortazavi/EntityEmbedding-Working_Example
This repository contains a notebook demonstrating a practical implementation of the so-called Entity Embedding for Encoding Categorical Features for Training a Neural Network.
Language: Jupyter Notebook - Size: 653 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 75 - Forks: 32

marcello-calabrese/edatemplates
Exploratory Data Analysis standard templates in Markdown and txt format
Size: 2.93 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

raynardj/category
Category transformation
Language: Python - Size: 32.2 KB - Last synced at: 18 days ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 1

adimajo/glmdisc_python
glmdisc Python package: discretization, factor level grouping, interaction discovery for logistic regression
Language: Python - Size: 5.92 MB - Last synced at: 26 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 1

spayot/mte-plus
Benchmarking various categorical encoding techniques for tabular data across 6 classification tasks, using 5 different downstream classifiers.
Language: Jupyter Notebook - Size: 14.3 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

victor7246/Notebooks
This repository contains notebooks on different topics across - linear algebra, image classification, language models etc.
Language: Jupyter Notebook - Size: 9.85 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

Maskar/chicago_traffic_crashes
This study creates machine learning models to predict the seriousness of car crashes using 2019 and 2020 crash reports from the publicly accessible database maintained by the Chicago Police Department. A car crash is considered serious if the crash results in an injury or the car is towed due to the crash. Models use categorical features that describe conditions at the time of the crash and crash causes to predict the required target. The current focus is to classify whether a crash results in an injury. All machine learning models are trained, validated, and tested on randomly split 2019 crash reports. The best model (along with all others) is then tested using the full set of 2020 crash reports.
Language: Jupyter Notebook - Size: 39.4 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

licesonw/deepmm
Multimodal deep learning package that uses both categorical and text-based features in a single deep architecture for regression and binary classification use cases.
Language: Python - Size: 385 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 0

abhmalik/categorical-feature-importances-without-one-hot-encoding-dummies
Computing feature importance for categorical variables by converting them into dummy variables (one-hot encoding) can produce skewed or hard-to-interpret results. Here I present a method to get around this problem using H2O.
Language: Jupyter Notebook - Size: 108 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

sumansahoo16/Categorical-Feature-Encoding-Challenge-II
Language: Jupyter Notebook - Size: 24.6 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 0

Navadeeppasala/Data-Analysis-with-Python
Why data analysis, how to understand the problem, what to do for data analysis, and how to clean the data for building machine learning models
Language: Jupyter Notebook - Size: 201 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

CircArgs/target_statistic_encoding
A lightweight library for encoding categorical features in your dataset with robust k-fold target statistics in training with credibility filtering, and custom statistics.
Language: Python - Size: 309 KB - Last synced at: 9 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

ItsWajdy/categorical_features_euclidean_distance
A python package to compute pairwise Euclidean distances on datasets with categorical features in little time
Language: Python - Size: 17.6 KB - Last synced at: 8 days ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 1

Helga-Helga/methods_of_artificial_intelligence
Laboratory works on Methods of Artificial Intelligence course
Language: Jupyter Notebook - Size: 653 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

yvetteyyuan/ML-regression-mealkit-DTC
Develop a predictive model to understand the LTV of each customer for a DTC meal-kit business.
Language: Jupyter Notebook - Size: 1.13 MB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

ShrishailSGajbhar/Categorical-Feature-Encoding-Challenge-II
My solution for Kaggle competition "categorical feature encoding challenge II" with public and private score of 0.783.
Language: Jupyter Notebook - Size: 18.6 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

viktorsapozhok/cafeen
Kaggle Categorical Feature Encoding Challenge II, private score 0.78795 (110th place)
Language: Jupyter Notebook - Size: 22.5 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 1

saibharath2/logistic-regression-
log
Language: Jupyter Notebook - Size: 2.46 MB - Last synced at: 10 months ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

davidmasse/US-supreme-court-prediction
Predicting the ideological direction of Supreme Court decisions: ensemble vs. unified case-based model
Language: Jupyter Notebook - Size: 4.74 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 5 - Forks: 3

anhtholee/zalo-hit-song
Solution to Zalo AI Challenge 2019's Hit Song Prediction.
Language: Python - Size: 5.57 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

ocramz/record-encode
Generic encoding of record types
Language: Haskell - Size: 39.1 KB - Last synced at: 24 days ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 1

praxitelisk/CATegorical-Feature-Encoding-Challenge
Binary classification, with every feature categorical
Language: Jupyter Notebook - Size: 12.4 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

soum-io/GPA_Predictor
Language: Python - Size: 2.02 MB - Last synced at: about 1 month ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

yurayli/kaggle-redhat
Red Hat dataset from a Kaggle competition
Language: Jupyter Notebook - Size: 29.9 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0
