An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: feature-engineering

Sanchemtos/Multi-Label-Emotion-Recognition

This project focuses on detecting multiple emotions from English text using a fine-tuned **BERT** model. It leverages the [GoEmotions](https://huggingface.co/datasets/go_emotions) dataset — a large-scale human-annotated dataset of Reddit comments labeled with 27 emotions + neutral.

Language: Jupyter Notebook - Size: 118 KB - Last synced at: about 2 hours ago - Pushed at: about 3 hours ago - Stars: 0 - Forks: 0

chalk-ai/docs

Docs for Chalk AI

Language: MDX - Size: 5.68 MB - Last synced at: about 8 hours ago - Pushed at: about 8 hours ago - Stars: 3 - Forks: 2

bindugayatri02/Real-Estate-Price-Prediction-Project

To import data from multiple sources, clean and wrangle data, perform exploratory data analysis (EDA), and create meaningful data visualizations. I will then predict future trends from data by developing linear, multiple, and polynomial regression models and pipelines, and learn how to analyze them.

Language: Jupyter Notebook - Size: 67.4 KB - Last synced at: about 13 hours ago - Pushed at: about 14 hours ago - Stars: 0 - Forks: 0

Desbordante/desbordante-core

Desbordante is a high-performance data profiler that is capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

Language: C++ - Size: 143 MB - Last synced at: about 18 hours ago - Pushed at: about 20 hours ago - Stars: 401 - Forks: 76

chalk-ai/chalk-ts

TypeScript client for working with Chalk

Language: TypeScript - Size: 1.36 MB - Last synced at: about 6 hours ago - Pushed at: about 6 hours ago - Stars: 7 - Forks: 0

xLightless/uwe-enterprise-mlaas-models

A repository containing each machine learning model used in the UWE Enterprise MLAAS.

Language: Jupyter Notebook - Size: 20.3 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Thuraaung021822/pairs

A pair is a fundamental data structure that consists of two elements linked together. In programming, pairs are often used to store related data in a simple and convenient way.

Size: 1000 Bytes - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

AbhinavSharma07/Kaggle-Comp.

A repository showcasing solutions to Kaggle competitions with end-to-end workflows in machine learning and data science.

Language: Jupyter Notebook - Size: 32 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2 - Forks: 1

alibaba/Alink

Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.

Language: Java - Size: 18 MB - Last synced at: about 19 hours ago - Pushed at: 11 months ago - Stars: 3,605 - Forks: 800

FabianCormier/Cross-Domain-transfer-learning-from-Human-Motion-to-Robot-Fault-Detection

The code trains an LSTM-based residual model on human motion data and applies transfer learning to detect robotic joint faults. It preprocesses data, maps robot features to human-like patterns, and fine-tunes a model while freezing early layers. The optimized model is evaluated with class weighting, callbacks, and feature importance analysis.

Language: Jupyter Notebook - Size: 3.2 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Qanfat/PaySim-Fraud-Detection-XGBoost

An XGBoost-based fraud detection model to identify money laundering in mobile transactions using the PaySim synthetic dataset.

Size: 1.95 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

EpistasisLab/tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Language: Jupyter Notebook - Size: 86.9 MB - Last synced at: 1 day ago - Pushed at: 8 days ago - Stars: 9,897 - Forks: 1,578

chalk-ai/chalk-go

Go client for Chalk

Language: Go - Size: 4.92 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4 - Forks: 0

JensBender/loan-default-prediction

Leverage machine learning to predict loan defaults from customer application data of financial institutions.

Language: Jupyter Notebook - Size: 74.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

muthuganeshece/Business-Case-Study

This repository contains a collection of my work on business case studies of various industries, including e-commerce, logistics, retail, media, etc.

Language: Jupyter Notebook - Size: 123 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

feathr-ai/feathr

Feathr – A scalable, unified data and AI engineering platform for enterprise

Language: Scala - Size: 29.4 MB - Last synced at: 1 day ago - Pushed at: about 1 year ago - Stars: 1,894 - Forks: 230

microsoft/nni 📦

An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

Language: Python - Size: 127 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 14,185 - Forks: 1,820

gattsu001/Telecom-Churn-Predictor

Predicts which telecom customers are likely to churn with 95% accuracy using engineered features from usage, billing, and support data. Implements Sturges-based binning, one-hot encoding, stratified 80/20 train-test split, and a two-level ensemble pipeline with soft voting. Achieves 94.60% accuracy, 0.8968 AUC, 0.8675 precision, 0.7423 recall.

Language: Python - Size: 191 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

Smit-Parekh/deep-demand-forecast-retail

End-to-end Deep Learning (TFT) demand forecasting system for Retail/FMCG with automated MLOps pipeline on Google Cloud (Vertex AI) for inventory optimization. Demonstrates advanced time series modeling, feature engineering, explainability (SHAP), and scalable deployment.

Size: 3.91 KB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

AvaAvarai/Java_Tabular_Vis_Toolkit

Cross-platform tool for Computational Interactive Visual Learning using lossless General Line Coordinate data visualizations and human-in-the-loop guided classification by eight classifier algorithms to find, test, and boost robust machine learning models with a goal of high case to parameter ratio.

Language: Java - Size: 241 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 3 - Forks: 1

fasihfast/car-price-estimator

Car price predictor based on real-time data web-scraped from Pak Wheels

Language: Python - Size: 17.7 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

chalk-ai/chalk-elixir

Elixir client for Chalk

Language: Elixir - Size: 57.6 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 3 - Forks: 2

daemonX10/Machine-Learning

Comprehensive notes and code on Python, data analysis, visualization, machine learning, and deep learning from my data science learning journey. DON'T FORGET TO 🌟

Language: Jupyter Notebook - Size: 414 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

ReverendBayes/Telecom-Churn-Predictor

Predicts which telecom customers are likely to churn with 95% accuracy using real-world data features from usage, billing, and support data. Implements Sturges-based binning, one-hot encoding, stratified 80/20 train-test split, and a two-level ensemble pipeline with soft voting. Achieves 94.60% accuracy, 0.8968 AUC, 0.8675 precision, 0.7423 recall.

Language: Python - Size: 242 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

gayatri1505/Solana-Memcoin-Prediction-Using-Machine-Learning

This project predicts whether a memecoin launched on Solana’s Pump Fun platform will reach 85 SOL liquidity, using only the first 100 blocks of on-chain data. It features custom feature engineering, a stacking ensemble model, and optimized log loss performance.

Language: Python - Size: 15.6 KB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

DavidMembreno/Data-Science

This repository contains essential project files for various data science and data analysis projects.

Language: Jupyter Notebook - Size: 9.14 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

Rudra-G-23/100-Days-of-ML

A complete and in-depth machine learning resource containing detailed notes, mathematical explanations, Python code, Jupyter notebooks, and lectures.

Language: Jupyter Notebook - Size: 25.8 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

upgini/upgini

Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

Language: Python - Size: 164 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 331 - Forks: 24

ferrangarciarovira/SP500-ML-Forecasting

Forecasting S&P 500 returns using ML models across multiple time horizons (1-day, 1-week, 1-month). Includes feature engineering, rolling-window backtesting, and performance evaluation to assess predictive power and trading utility of each model.

Size: 15.6 KB - Last synced at: 2 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

Parnika798/Customer-Segmentation

Customer Behaviour Analysis and Churn Predictor

Language: Python - Size: 183 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

0290192029/apartment-price-predictor

Python project for predicting apartment rental prices using linear regression. Practical assignment on the topic "Fundamentals of Machine Learning" for the course "MDK 13.01: Fundamentals of Applying Artificial Intelligence Methods in Programming".

Size: 1.95 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

arabind-meher/Stock-Market-Data-Lifecycle-FAANG-Companies

“A full-stack data lifecycle project for stock market data using Python, MySQL, Feature Engineering, and EDA, focused on FAANG companies.”

Language: Jupyter Notebook - Size: 163 KB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

Brevidade/fleet-pattern

A demonstration of hierarchical Durable Objects in Cloudflare Workers, enabling infinite nesting of manager/agent relationships through URL paths.

Size: 1000 Bytes - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

XuegongLab/neoguider

NeoGuider, neoepitope detection using advanced feature engineering

Language: Python - Size: 12.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 1

coreymichaud/predicting-aids-deaths

Predicting AIDS deaths from a clinical study.

Language: Jupyter Notebook - Size: 1.42 MB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

Yimeng-Zhang/feature-engineering-and-feature-selection

A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.

Language: Jupyter Notebook - Size: 1.28 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 1,549 - Forks: 416

4paradigm/OpenMLDB

OpenMLDB is an open-source machine learning database that provides a feature platform computing consistent features for training and inference.

Language: C++ - Size: 163 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,630 - Forks: 317

phancykemunto/Automotive_sales

Sales Analysis of Automotive Manufacturing Data

Language: Jupyter Notebook - Size: 1.98 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

JensBender/machine-learning-template

A ready-to-use Jupyter Notebook template for machine learning projects.

Language: Jupyter Notebook - Size: 1.96 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

volga-project/volga

Real-time data processing/feature engineering in Python. Tailored for modern AI/ML systems.

Language: Python - Size: 11.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 57 - Forks: 4

HubertSzydlowski/Decision-Tree-Heart-Attack-Prediction

Language: Jupyter Notebook - Size: 498 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

Saravanan9698/Clickstream_Customer_Conversion

Analyzes clickstream data from an e-commerce platform to predict customer conversions, estimate potential revenue, and segment users for personalized marketing strategies. By leveraging machine learning techniques, the project enhances decision-making for businesses seeking to optimize user engagement and sales.

Language: Jupyter Notebook - Size: 41.5 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

fasihfast/demand_forecasting_for_retail

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

SimonBlanke/Hyperactive

An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.

Language: Python - Size: 30.5 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 517 - Forks: 48

KHATEEB-ARMAN/House_price_prediction

The House Price Prediction project is a machine learning model designed to estimate the selling price of a house based on various factors such as location, size, number of rooms, amenities, and market trends.

Language: Jupyter Notebook - Size: 806 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

harshineeshree/Machine-Learning

A curated resource hub for learning machine learning, featuring tutorials, code examples, datasets, and hands-on projects to build foundational skills and explore real-world applications.

Language: Python - Size: 11.7 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

ProfwareSystems/artdata-ml-process-control

Machine Learning in Process Control

Language: Python - Size: 5.39 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

SahashRaee/Machine_Learning_Notebooks

Machine Learning From Scratch

Language: Jupyter Notebook - Size: 6.61 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

jeongyoonlee/Kaggler

Code for Kaggle Data Science Competitions

Language: Python - Size: 2.16 MB - Last synced at: 1 day ago - Pushed at: about 1 year ago - Stars: 751 - Forks: 163

Adity-star/Complete-DataScience-Guide

Comprehensive repository for data science projects, tools, workflows, and resources across ML, DL, and NLP. It also contains interview questions, data science books, and some of the code I have written over my journey.

Language: Jupyter Notebook - Size: 350 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

functime-org/functime

Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.

Language: Python - Size: 278 MB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 1,098 - Forks: 63

alteryx/featuretools

An open source Python library for automated feature engineering

Language: Python - Size: 7.27 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 7,432 - Forks: 898

Tfat05/drug-discovery-project

Predicting RNA-binding activity of small molecules using machine learning. Includes data processing, feature analysis, and regression modeling.

Size: 6.84 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

bayudwimulyadi/Titanic-Survival-Prediction

Predicting passenger survival on the Titanic using an ensemble machine learning approach, achieving a Kaggle score of 0.77990. This project leverages stacking with Random Forest, Gradient Boosting, and SVM, enhanced by feature engineering and hyperparameter tuning, to model survival patterns effectively.

Language: Jupyter Notebook - Size: 587 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

azukds/tubular

Python package implementing transformers for preprocessing steps for machine learning.

Language: Python - Size: 2.45 MB - Last synced at: 2 days ago - Pushed at: 12 days ago - Stars: 60 - Forks: 18

VickyShapira/drug-discovery-project

Predicting RNA-binding activity of small molecules using machine learning. Includes data processing, feature analysis, and regression modeling.

Size: 4.08 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

AbhaySingh71/machine_learning-for-ds-

This repo contains machine learning code for data science using libraries such as PyTorch, TensorFlow, and scikit-learn, with a focus on a deep dive into ML.

Language: Jupyter Notebook - Size: 37.8 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 0

pixeltable/pixeltable

Pixeltable — AI Data infrastructure providing a declarative, incremental approach for multimodal workloads.

Language: Python - Size: 207 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 185 - Forks: 29

featureform/featureform

The Virtual Feature Store. Turn your existing data infrastructure into a feature store.

Language: Go - Size: 217 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1,888 - Forks: 97

jmson8/Study_BDA

Big Data Analysis - Assignments

Language: Jupyter Notebook - Size: 2.54 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

KennethanCeyer/awesome-llmops

Awesome series for LLMOps

Size: 224 KB - Last synced at: 5 days ago - Pushed at: about 2 months ago - Stars: 46 - Forks: 7

alteryx/evalml

EvalML is an AutoML library written in Python.

Language: Python - Size: 16.3 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 807 - Forks: 89

feature-engine/feature_engine

Feature engineering package with sklearn-like functionality

Language: Python - Size: 14.3 MB - Last synced at: 7 days ago - Pushed at: 12 days ago - Stars: 2,044 - Forks: 325

winedarksea/AutoTS

Automated Time Series Forecasting

Language: Python - Size: 46.9 MB - Last synced at: 5 days ago - Pushed at: 23 days ago - Stars: 1,273 - Forks: 110

mahnoorsheikh16/Sketchify-A-Quick-Draw-drawing-classifier

Language: Jupyter Notebook - Size: 6.1 MB - Last synced at: 5 days ago - Pushed at: 8 days ago - Stars: 1 - Forks: 0

nashish109/smart-ecommerce-fraud-detection

AI-powered system to detect fraudulent transactions in e-commerce using machine learning. Includes data preprocessing, feature engineering, and classification models like Random Forest and XGBoost. Achieved high accuracy with interpretable results for real-time detection.

Language: Jupyter Notebook - Size: 3.43 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

mahnoorsheikh16/NLP-Approach-to-AI-Text-Classification Fork of andrew-jxhn/STT811_StatsProject

Language: Jupyter Notebook - Size: 40.7 MB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

SushalReddySabbella/OFI-Feature-Engineering

"Order Flow Imbalance (OFI) signal construction and PCA-based feature compression for equity markets"

Language: Jupyter Notebook - Size: 601 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

Mryadav02/Credit_score_prediction

Watch me build a website that can predict a credit score based on the inputs given

Language: Jupyter Notebook - Size: 14.3 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

HubertSzydlowski/Decision-Tree---Heart-Attack-Prediction

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

JamesCWeber/British_Airways_Data_Project_Part_2_Flight_Booking_Prediction

The second task given to me while completing the British Airways Data Science micro internship. Conduct feature analysis on customer booking data and create a Random Forest model to predict which customers will book a trip.

Language: Jupyter Notebook - Size: 5.81 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

JamesCWeber/BCG-X-Data-Science-Project-Part-2-Feature-Engineering

The second task given to me while completing the BCG X Data Science microinternship. Conduct feature engineering by selecting, manipulating and transforming raw data into features that can be used in a supervised learning model

Language: Jupyter Notebook - Size: 4.79 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

ballet/ballet

☀️🦶 A lightweight framework for collaborative, open-source feature engineering

Language: Python - Size: 17.4 MB - Last synced at: 3 days ago - Pushed at: over 3 years ago - Stars: 33 - Forks: 6

achuthasubhash/Complete-Life-Cycle-of-a-Data-Science-Project

Complete-Life-Cycle-of-a-Data-Science-Project

Size: 156 MB - Last synced at: 3 days ago - Pushed at: 11 months ago - Stars: 609 - Forks: 250

NiranjanRao07/ADHD-ML-Project

This project used machine learning to classify ADHD based on EEG data. We preprocessed the EEG signals, extracted various features, and used LDA for dimensionality reduction. A voting ensemble of classifiers achieved 72% accuracy in distinguishing between ADHD and control groups.

Language: Jupyter Notebook - Size: 1.35 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

icicle-lang/icicle

Icicle Streaming Query Language

Language: Haskell - Size: 14.3 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 27 - Forks: 3

EmilHvitfeldt/feature-engineering-az

Source for book "Feature Engineering A-Z"

Language: HTML - Size: 45.3 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 141 - Forks: 14

MarcosScatolinoBR/previsao-demanda-varejo

Retail demand forecasting project with Random Forest, Linear Regression, exploratory analysis, and automatic forecast generation.

Language: Python - Size: 1.02 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

deniztemur00/VinDrMammo

Deep learning project for automated detection of breast abnormalities in mammogram images

Language: Python - Size: 4.82 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

pk142/DSML

Scaler_DSML

Language: Jupyter Notebook - Size: 7.94 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 1 - Forks: 0

chihabmahfouf/ML-Capstone-Prediction-Projects

Hello! I'm Palak Yaduvanshi, a data science enthusiast with a passion for applying machine learning to solve real-world problems. I’m constantly learning and experimenting with new algorithms and tools to build impactful prediction models.

Language: Jupyter Notebook - Size: 3.55 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

Cyberoctane29/Salifort-Motors-Predicting-Employee-Turnover-and-Improving-Retention-Analysis-and-Modeling

In this project, I work as a data analytics professional at Salifort Motors, a fictional leader in alternative energy vehicles. I analyze employee survey data to identify turnover drivers and build predictive models, including multiple logistic regression, decision trees, and random forests, to forecast attrition and support retention strategies.

Language: Jupyter Notebook - Size: 18.2 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

vijaykumar1799/Customer-Churn-Prediction-Retention-Strategy

Practical customer churn analysis and prediction using Python, XGBoost, and real-world business insights.

Language: Jupyter Notebook - Size: 2.62 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

heather-253/house-prices-prediction

Predict house prices using feature engineering, Random Forest, and Gradient Boosting models. Includes full machine learning workflow: EDA, preprocessing, modeling, and submission file creation.

Language: Jupyter Notebook - Size: 2.93 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

heather-253/taxi-fare-prediction

Predict NYC taxi fares using distance, time, and location features. A machine learning regression project with data cleaning, feature engineering, and visualization.

Language: Jupyter Notebook - Size: 58.6 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

aikho/awesome-feature-engineering

A curated list of resources dedicated to Feature Engineering Techniques for Machine Learning

Size: 19.5 KB - Last synced at: 11 days ago - Pushed at: over 6 years ago - Stars: 590 - Forks: 190

habedi/feature-factory

A high-performance feature engineering library for Rust powered by Apache DataFusion 🦀

Language: Rust - Size: 87.9 KB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 15 - Forks: 0

Shuyib/london_weather_prediction

The London Weather Project aims to predict the mean temperature in London using historical weather data, involving data cleaning, feature engineering, and modeling with techniques like imputation, transformation, scaling, and the use of MLflow for tracking model performance and hyperparameters.

Language: Jupyter Notebook - Size: 2.93 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

Shuyib/chronic-kidney-disease-kaggle

Using machine learning models to predict if patients have chronic kidney disease based on a few features. The results of the models are also interpreted to make them more understandable to health practitioners.

Language: Jupyter Notebook - Size: 3.78 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 8 - Forks: 1

MohammedAazam/Used-Car-Price-Prediction-App

A machine learning web application that predicts used car prices based on brand, age, and mileage using linear regression models. Built with Streamlit for an interactive, user-friendly interface and data visualization.

Language: Python - Size: 163 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

cemdurakk/housemaster-housing-price-prediction

A complete machine learning pipeline for predicting housing prices using EDA, feature engineering, and models like Random Forest and XGBoost.

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

Arj1998/Credit_Risk_Assessment_model-

A machine learning project to predict credit risk using Logistic Regression, SVC, and Random Forest, with model tuning via GridSearchCV and evaluation using standard classification metrics.

Language: Jupyter Notebook - Size: 1000 Bytes - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

ArunabhaPani/wine_quality_ml_model

Uses a Random Forest Classifier on physicochemical features like acidity, sugar, density, pH, alcohol, etc., with preprocessing (scaling) and evaluation via accuracy, confusion matrix, and classification report. Libraries used: pandas, numpy, matplotlib, seaborn, scikit-learn; dataset: WineQT.csv; target variable: quality; 80-20 train-test

Language: Jupyter Notebook - Size: 406 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

Amirreza81/Applied-Data-Science-Course

Comprehensive notes, practical exercises, and problem-solving solutions from the Applied Data Science course, covering data preprocessing, machine learning algorithms, statistical analysis, data visualization, and real-world applications.

Language: Jupyter Notebook - Size: 3.73 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 3 - Forks: 0

asavinov/intelligent-trading-bot

Intelligent Trading Bot: Automatically generating signals and trading based on machine learning and feature engineering

Language: Python - Size: 884 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 1,318 - Forks: 281

arsentag/Retail_Credit_Scoring_Model

Language: Jupyter Notebook - Size: 24 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

Machine-Learning-Related-Projects/Real-Fake-Job-Post

Real-Fake-Job-Post

Language: Jupyter Notebook - Size: 3.84 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

kurtispykes/fraud-detection-project

A mono-repository containing a packaged machine learning model and simple REST API.

Language: Jupyter Notebook - Size: 130 KB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 14 - Forks: 9

Moez-lab/House-Price-Prediction

A Machine Learning project that predicts California house prices using Linear Regression and Random Forest. It includes data preprocessing, feature engineering, visualizations, and model evaluation with hyperparameter tuning using GridSearchCV.

Language: Python - Size: 396 KB - Last synced at: about 13 hours ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

Ramtin-Karbaschi/HousingPriceAdvance_KERASmodel

Advanced machine learning implementation for housing price prediction, utilizing statistical modeling to analyze property attributes and their market impacts. Features comprehensive data visualization, feature engineering, and model comparison techniques.

Language: Jupyter Notebook - Size: 494 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

Related Keywords
feature-engineering (2,649), machine-learning (1,329), python (700), data-science (629), exploratory-data-analysis (408), feature-selection (379), data-visualization (371), pandas (295), data-analysis (290), feature-extraction (269), random-forest (234), eda (224), deep-learning (201), numpy (185), xgboost (185), scikit-learn (183), logistic-regression (180), classification (176), linear-regression (169), data-cleaning (154), regression (151), data-preprocessing (147), machine-learning-algorithms (142), hyperparameter-tuning (137), seaborn (132), matplotlib (132), sklearn (132), python3 (122), jupyter-notebook (119), predictive-modeling (104), kaggle (97), cross-validation (83), regression-models (83), supervised-learning (78), visualization (77), nlp (73), data (70), decision-trees (67), ml (63), data-mining (61), neural-network (58), natural-language-processing (57), hyperparameter-optimization (56), tensorflow (55), kaggle-competition (55), flask (54), clustering (53), statistics (53), model-evaluation (53), pca (53), random-forest-classifier (53), time-series (53), r (51), neural-networks (51), preprocessing (49), automl (49), gradient-boosting (48), unsupervised-learning (47), artificial-intelligence (44), mlops (42), datacleaning (42), outlier-detection (41), data-engineering (41), feature-importance (40), model-deployment (39), statistical-analysis (38), lightgbm (38), model-selection (37), time-series-analysis (37), classification-algorithm (37), regression-analysis (35), data-analytics (35), hypothesis-testing (34), prediction (34), gridsearchcv (34), datapreprocessing (34), kmeans-clustering (33), ensemble-learning (33), dimensionality-reduction (33), data-wrangling (33), streamlit (32), pytorch (32), svm (32), ai (31), computer-vision (31), matplotlib-pyplot (30), catboost (30), lasso-regression (30), datascience (30), pca-analysis (30), ridge-regression (29), modeling (29), decision-tree-classifier (29), sql (29), analysis (28), machinelearning (28), feature-scaling (28), svm-classifier (27), keras (27), supervised-machine-learning (27)