An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: feature-engineering

0290192029/apartment-price-predictor

Python project for predicting apartment rental prices using linear regression. Practical assignment on the topic "Fundamentals of Machine Learning" for the course "MDK 13.01: Fundamentals of Applying Artificial Intelligence Methods in Programming".

Size: 1.95 KB - Last synced at: 36 minutes ago - Pushed at: about 1 hour ago - Stars: 0 - Forks: 0

mahnoorsheikh16/NLP-Approach-to-AI-Text-Classification (fork of andrew-jxhn/STT811_StatsProject)

A text‐classification pipeline for identifying human‐ versus AI-generated responses using engineered linguistic/semantic features and PCA-reduced vectors; achieves ~85% accuracy with Logistic Regression, SVM, and MLP, and includes a fine-tuned bert-base-uncased model.

Language: Jupyter Notebook - Size: 40.5 MB - Last synced at: about 2 hours ago - Pushed at: about 2 hours ago - Stars: 1 - Forks: 0

Brevidade/fleet-pattern

A demonstration of hierarchical Durable Objects in Cloudflare Workers, enabling infinite nesting of manager/agent relationships through URL paths.

Size: 1000 Bytes - Last synced at: about 2 hours ago - Pushed at: about 3 hours ago - Stars: 1 - Forks: 0

asavinov/intelligent-trading-bot

Intelligent Trading Bot: Automatically generating signals and trading based on machine learning and feature engineering

Language: Python - Size: 997 KB - Last synced at: about 5 hours ago - Pushed at: about 6 hours ago - Stars: 1,337 - Forks: 281

Cuonghoangit/GeoMineralInsight

This project uses machine learning to analyze geological, geochemical, aeromagnetic, and remote sensing data over 39,000 sq. km in southern India. It identifies high-probability zones for concealed Au, Cu, and PGE deposits using XGBoost, SHAP, and GeoPandas. Key features include automated pipelines, explainable AI, and GIS-ready maps.

Language: Jupyter Notebook - Size: 7.38 MB - Last synced at: about 6 hours ago - Pushed at: about 7 hours ago - Stars: 0 - Forks: 0

LatiefDataVisionary/feature-engineering-college-task

Language: Jupyter Notebook - Size: 17.1 MB - Last synced at: about 6 hours ago - Pushed at: about 7 hours ago - Stars: 0 - Forks: 0

Saravanan9698/Clickstream_Customer_Conversion

Analyzes clickstream data from an e-commerce platform to predict customer conversions, estimate potential revenue, and segment users for personalized marketing strategies. By leveraging machine learning techniques, the project enhances decision-making for businesses seeking to optimize user engagement and sales.

Language: Jupyter Notebook - Size: 42.6 MB - Last synced at: about 6 hours ago - Pushed at: about 7 hours ago - Stars: 1 - Forks: 0

mljar/mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation

Language: Python - Size: 9.44 MB - Last synced at: about 10 hours ago - Pushed at: 29 days ago - Stars: 3,153 - Forks: 422

jasjeev013/GeoMineralInsight

This project uses machine learning to analyze geological, geochemical, aeromagnetic, and remote sensing data over 39,000 sq. km in southern India. It identifies high-probability zones for concealed Au, Cu, and PGE deposits using XGBoost, SHAP, and GeoPandas. Key features include automated pipelines, explainable AI, and GIS-ready maps.

Language: Jupyter Notebook - Size: 7.38 MB - Last synced at: about 8 hours ago - Pushed at: about 15 hours ago - Stars: 0 - Forks: 0

mahnoorsheikh16/Sketchify-A-Quick-Draw-drawing-classifier

Implementation of a sketch‐recognition pipeline inspired by Google’s Quick, Draw!—from raw stroke data to prediction. Includes data preprocessing and feature‐engineering scripts, three Bayesian classifiers alongside Logistic Regression, SVM, K-NN and XGBoost baselines, and an RNN model.

Language: Jupyter Notebook - Size: 6.11 MB - Last synced at: about 16 hours ago - Pushed at: about 17 hours ago - Stars: 1 - Forks: 0

EpistasisLab/tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Language: Jupyter Notebook - Size: 86.9 MB - Last synced at: about 13 hours ago - Pushed at: 3 days ago - Stars: 9,902 - Forks: 1,578

DAGWorks-Inc/hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

Language: Jupyter Notebook - Size: 75.8 MB - Last synced at: about 19 hours ago - Pushed at: about 19 hours ago - Stars: 2,128 - Forks: 145

Fertmeneses/titanic-ML-from-disaster

Code used in Kaggle competition: Titanic - Machine Learning from Disaster

Language: Jupyter Notebook - Size: 10.4 MB - Last synced at: about 24 hours ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

zahra-ahmadbeigloo/Machine-Learning-Projects

This repository contains machine learning projects, where models are trained for classification, regression, clustering, and deep learning tasks. Each project includes data preprocessing, feature engineering, model training, evaluation, and visualizations to support findings.

Language: Jupyter Notebook - Size: 12.9 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

functime-org/functime

Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.

Language: Python - Size: 278 MB - Last synced at: about 23 hours ago - Pushed at: 11 months ago - Stars: 1,101 - Forks: 63

jmson8/Study_BDA

Big Data Analysis - Assignments

Language: Jupyter Notebook - Size: 2.61 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Shanmukhi1920/Fraud-Detection

A Comparative Analysis of Machine Learning Models for Credit Card Transactions with an Emphasis on Maximizing Recall.

Language: Jupyter Notebook - Size: 1.09 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

guangyizhangbci/EEG_Riemannian

IEEE Transactions on Emerging Topics in Computational Intelligence

Language: Python - Size: 1.56 MB - Last synced at: about 12 hours ago - Pushed at: 27 days ago - Stars: 67 - Forks: 12

sushantdhruv2003/Weather-forecasting-with-machine-learning

A dataset of meteorological sensors with data cleaning and finding the best learning model for forecasting

Language: Jupyter Notebook - Size: 3.2 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

bayudwimulyadi/Titanic-Survival-Prediction

Predicting passenger survival on the Titanic using an ensemble machine learning approach, achieving a Kaggle score of 0.77990. This project leverages stacking with Random Forest, Gradient Boosting, and SVM, enhanced by feature engineering and hyperparameter tuning, to model survival patterns effectively.

Language: Jupyter Notebook - Size: 587 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

pixeltable/pixeltable

Pixeltable — AI Data infrastructure providing a declarative, incremental approach for multimodal workloads.

Language: Python - Size: 207 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 192 - Forks: 30

LuisFelipePoma/Machine_Learning

Learning about the algorithms used in machine learning, along with techniques for training and testing models.

Language: Jupyter Notebook - Size: 17.3 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2 - Forks: 0

feature-engine/feature_engine

Feature engineering package with sklearn like functionality

Language: Python - Size: 14.3 MB - Last synced at: about 7 hours ago - Pushed at: 15 days ago - Stars: 2,048 - Forks: 325

susmnty/Client-IQ

The project focuses on predicting customer status using machine learning. It classifies customers as active, inactive, or at risk. Python and ML libraries are used for data analysis and modeling. The goal is to help businesses reduce churn. It supports better decision-making through predictive insights.

Language: Jupyter Notebook - Size: 66.4 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

pbirthal/DataScience-Portfolio

📊 Projects Repository: Showcasing My Data Science and Machine Learning Expertise. Welcome to my GitHub repository, a curated collection of projects demonstrating my proficiency in data science, machine learning, and analytical problem-solving. This repository is designed to highlight my hands-on experience with real-world datasets.

Language: Jupyter Notebook - Size: 32.7 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

marcus-24/Stock-Predictor-Feature-Store

Creates a feature store to serve stock predictor features for offline batch training and online serving for the model inference service

Language: Python - Size: 31.3 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

breimanntools/aaanalysis

Python framework for interpretable protein prediction

Language: Jupyter Notebook - Size: 485 MB - Last synced at: about 21 hours ago - Pushed at: 11 days ago - Stars: 60 - Forks: 3

Sanchemtos/Multi-Label-Emotion-Recognition

This project focuses on detecting multiple emotions from English text using a fine-tuned BERT model. It leverages the GoEmotions dataset (https://huggingface.co/datasets/go_emotions), a large-scale human-annotated dataset of Reddit comments labeled with 27 emotions + neutral.

Language: Jupyter Notebook - Size: 118 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

chalk-ai/docs

Docs for Chalk AI

Language: MDX - Size: 5.68 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 3 - Forks: 2

bindugayatri02/Real-Estate-Price-Prediction-Project

Import data from multiple sources, clean and wrangle data, perform exploratory data analysis (EDA), and create meaningful data visualizations. I will then predict future trends from data by developing linear, multiple, and polynomial regression models and pipelines, and learn how to analyze them.

Language: Jupyter Notebook - Size: 67.4 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

noahho/CAAFE

Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering" by Hollmann, Müller, and Hutter (2023).

Language: Python - Size: 466 KB - Last synced at: 3 days ago - Pushed at: 5 months ago - Stars: 158 - Forks: 28

Desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

Language: C++ - Size: 143 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 401 - Forks: 76

chalk-ai/chalk-ts

Typescript client for working with Chalk

Language: TypeScript - Size: 1.36 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 7 - Forks: 0

xLightless/uwe-enterprise-mlaas-models

A repository containing each machine learning model used in the UWE Enterprise MLAAS.

Language: Jupyter Notebook - Size: 20.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

Thuraaung021822/pairs

A pair is a fundamental data structure that consists of two elements linked together. In programming, pairs are often used to store related data in a simple and convenient way.

Size: 1000 Bytes - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

AbhinavSharma07/Kaggle-Comp.

A repository showcasing solutions to Kaggle competitions with end-to-end workflows in machine learning and data science.

Language: Jupyter Notebook - Size: 32 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2 - Forks: 1

alibaba/Alink

Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.

Language: Java - Size: 18 MB - Last synced at: 4 days ago - Pushed at: 11 months ago - Stars: 3,605 - Forks: 800

FabianCormier/Cross-Domain-transfer-learning-from-Human-Motion-to-Robot-Fault-Detection

The code trains an LSTM-based residual model on human motion data and applies transfer learning to detect robotic joint faults. It preprocesses data, maps robot features to human-like patterns, and fine-tunes a model while freezing early layers. The optimized model is evaluated with class weighting, callbacks, and feature importance analysis.

Language: Jupyter Notebook - Size: 3.2 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

Qanfat/PaySim-Fraud-Detection-XGBoost

An XGBoost-based fraud detection model to identify money laundering in mobile transactions using the PaySim synthetic dataset.

Size: 1.95 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

chalk-ai/chalk-go

Go client for Chalk

Language: Go - Size: 4.92 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 4 - Forks: 0

JensBender/loan-default-prediction

Leverage machine learning to predict loan defaults from customer application data of financial institutions.

Language: Jupyter Notebook - Size: 74.3 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

muthuganeshece/Business-Case-Study

This repository contains a collection of my work on business case studies across various industries, including e-commerce, logistics, retail, and media.

Language: Jupyter Notebook - Size: 123 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

feathr-ai/feathr

Feathr – A scalable, unified data and AI engineering platform for enterprise

Language: Scala - Size: 29.4 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 1,894 - Forks: 230

microsoft/nni 📦

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

Language: Python - Size: 127 MB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 14,185 - Forks: 1,820

gattsu001/Telecom-Churn-Predictor

Predicts which telecom customers are likely to churn with 95% accuracy using engineered features from usage, billing, and support data. Implements Sturges-based binning, one-hot encoding, stratified 80/20 train-test split, and a two-level ensemble pipeline with soft voting. Achieves 94.60% accuracy, 0.8968 AUC, 0.8675 precision, 0.7423 recall.

Language: Python - Size: 191 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

Smit-Parekh/deep-demand-forecast-retail

End-to-end Deep Learning (TFT) demand forecasting system for Retail/FMCG with automated MLOps pipeline on Google Cloud (Vertex AI) for inventory optimization. Demonstrates advanced time series modeling, feature engineering, explainability (SHAP), and scalable deployment.

Size: 3.91 KB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

AvaAvarai/Java_Tabular_Vis_Toolkit

Cross-platform tool for Computational Interactive Visual Learning using lossless General Line Coordinate data visualizations and human-in-the-loop guided classification by eight classifier algorithms to find, test, and boost robust machine learning models with a goal of high case to parameter ratio.

Language: Java - Size: 241 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 3 - Forks: 1

fasihfast/car-price-estimator

Car price predictor based on real-time data web-scraped from Pak Wheels

Language: Python - Size: 17.7 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

chalk-ai/chalk-elixir

Elixir client for Chalk

Language: Elixir - Size: 57.6 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 3 - Forks: 2

daemonX10/Machine-Learning

Comprehensive notes and code on Python, data analysis, visualization, machine learning, and deep learning from my data science learning journey.

Language: Jupyter Notebook - Size: 414 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 0

ReverendBayes/Telecom-Churn-Predictor

Predicts which telecom customers are likely to churn with 95% accuracy using real-world data features from usage, billing, and support data. Implements Sturges-based binning, one-hot encoding, stratified 80/20 train-test split, and a two-level ensemble pipeline with soft voting. Achieves 94.60% accuracy, 0.8968 AUC, 0.8675 precision, 0.7423 recall.

Language: Python - Size: 242 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

gayatri1505/Solana-Memcoin-Prediction-Using-Machine-Learning

This project predicts whether a memcoin launched on Solana’s Pump Fun platform will reach 85 SOL liquidity, using only the first 100 blocks of on-chain data. It features custom feature engineering, a stacking ensemble model, and optimized log loss performance.

Language: Python - Size: 15.6 KB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

DavidMembreno/Data-Science

This repository contains essential project files for various data science and data analysis projects.

Language: Jupyter Notebook - Size: 9.14 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

Rudra-G-23/100-Days-of-ML

A complete and in-depth machine learning resource containing detailed notes, mathematical explanations, Python code, Jupyter notebooks, and lectures.

Language: Jupyter Notebook - Size: 25.8 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

upgini/upgini

Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

Language: Python - Size: 164 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 331 - Forks: 24

thomas-young-2013/mindware

An efficient open-source AutoML system for automating the machine learning lifecycle, including feature engineering, neural architecture search, and hyper-parameter tuning.

Language: Python - Size: 63 MB - Last synced at: 3 days ago - Pushed at: over 3 years ago - Stars: 59 - Forks: 28

ferrangarciarovira/SP500-ML-Forecasting

Forecasting S&P 500 returns using ML models across multiple time horizons (1-day, 1-week, 1-month). Includes feature engineering, rolling-window backtesting, and performance evaluation to assess predictive power and trading utility of each model.

Size: 15.6 KB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

Parnika798/Customer-Segmentation

Customer Behaviour Analysis and Churn Predictor

Language: Python - Size: 183 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

arabind-meher/Stock-Market-Data-Lifecycle-FAANG-Companies

“A full-stack data lifecycle project for stock market data using Python, MySQL, Feature Engineering, and EDA, focused on FAANG companies.”

Language: Jupyter Notebook - Size: 163 KB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

XuegongLab/neoguider

NeoGuider, neoepitope detection using advanced feature engineering

Language: Python - Size: 12.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 3 - Forks: 1

coreymichaud/predicting-aids-deaths

Predicting AIDS deaths from a clinical study.

Language: Jupyter Notebook - Size: 1.42 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

bytehub-ai/bytehub

ByteHub: making feature stores simple

Language: Python - Size: 363 KB - Last synced at: 1 day ago - Pushed at: almost 4 years ago - Stars: 60 - Forks: 4

Yimeng-Zhang/feature-engineering-and-feature-selection

A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.

Language: Jupyter Notebook - Size: 1.28 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 1,549 - Forks: 416

4paradigm/OpenMLDB

OpenMLDB is an open-source machine learning database that provides a feature platform computing consistent features for training and inference.

Language: C++ - Size: 163 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,630 - Forks: 317

phancykemunto/Automotive_sales

Sales Analysis of Automotive Manufacturing Data

Language: Jupyter Notebook - Size: 1.98 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

JensBender/machine-learning-template

A ready-to-use Jupyter Notebook template for machine learning projects.

Language: Jupyter Notebook - Size: 1.96 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

DeepWisdom/AutoDL

Automated Deep Learning without ANY human intervention. 1st solution for the AutoDL challenge@NeurIPS.

Language: Python - Size: 4.46 MB - Last synced at: about 20 hours ago - Pushed at: over 2 years ago - Stars: 1,160 - Forks: 217

volga-project/volga

Real-time data processing/feature engineering in Python. Tailored for modern AI/ML systems.

Language: Python - Size: 11.1 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 57 - Forks: 4

HubertSzydlowski/Decision-Tree-Heart-Attack-Prediction

Language: Jupyter Notebook - Size: 498 KB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

fasihfast/demand_forecasting_for_retail

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

SimonBlanke/Hyperactive

An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.

Language: Python - Size: 30.5 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 517 - Forks: 48

KHATEEB-ARMAN/House_price_prediction

The House Price Prediction project is a machine learning model designed to estimate the selling price of a house based on various factors such as location, size, number of rooms, amenities, and market trends.

Language: Jupyter Notebook - Size: 806 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

harshineeshree/Machine-Learning

A curated resource hub for learning machine learning, featuring tutorials, code examples, datasets, and hands-on projects to build foundational skills and explore real-world applications.

Language: Python - Size: 11.7 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

ProfwareSystems/artdata-ml-process-control

Machine Learning in Process Control

Language: Python - Size: 5.39 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

Aura-healthcare/hrv-analysis

Package for Heart Rate Variability analysis in Python

Language: Python - Size: 8.97 MB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 407 - Forks: 100

SahashRaee/Machine_Learning_Notebooks

Machine Learning From Scratch

Language: Jupyter Notebook - Size: 6.61 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

jeongyoonlee/Kaggler

Code for Kaggle Data Science Competitions

Language: Python - Size: 2.16 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 751 - Forks: 163

Adity-star/Complete-DataScience-Guide

Comprehensive repository for data science projects, tools, workflows, and resources across ML, DL, and NLP. It also contains interview questions, data science books, and some of the code I have written over my journey.

Language: Jupyter Notebook - Size: 350 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1 - Forks: 0

EpistasisLab/tpot2

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Language: Jupyter Notebook - Size: 10.5 MB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 230 - Forks: 31

alteryx/featuretools

An open source Python library for automated feature engineering

Language: Python - Size: 7.27 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 7,432 - Forks: 898

Tfat05/drug-discovery-project

Predicting RNA-binding activity of small molecules using machine learning. Includes data processing, feature analysis, and regression modeling.

Size: 6.84 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

azukds/tubular

Python package implementing transformers for preprocessing steps for machine learning.

Language: Python - Size: 2.45 MB - Last synced at: 5 days ago - Pushed at: 14 days ago - Stars: 60 - Forks: 18

VickyShapira/drug-discovery-project

Predicting RNA-binding activity of small molecules using machine learning. Includes data processing, feature analysis, and regression modeling.

Size: 4.08 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

AbhaySingh71/machine_learning-for-ds-

This repo contains machine learning and library code for data science using PyTorch, TensorFlow, and scikit-learn, with a focus on a deep dive into ML.

Language: Jupyter Notebook - Size: 37.8 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

featureform/featureform

The Virtual Feature Store. Turn your existing data infrastructure into a feature store.

Language: Go - Size: 217 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,888 - Forks: 97

KennethanCeyer/awesome-llmops

Awesome series for LLMOps

Size: 224 KB - Last synced at: 8 days ago - Pushed at: about 2 months ago - Stars: 46 - Forks: 7

alteryx/evalml

EvalML is an AutoML library written in Python.

Language: Python - Size: 16.3 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 807 - Forks: 89

winedarksea/AutoTS

Automated Time Series Forecasting

Language: Python - Size: 46.9 MB - Last synced at: 7 days ago - Pushed at: 26 days ago - Stars: 1,273 - Forks: 110

nashish109/smart-ecommerce-fraud-detection

AI-powered system to detect fraudulent transactions in e-commerce using machine learning. Includes data preprocessing, feature engineering, and classification models like Random Forest and XGBoost. Achieved high accuracy with interpretable results for real-time detection.

Language: Jupyter Notebook - Size: 3.43 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

SushalReddySabbella/OFI-Feature-Engineering

"Order Flow Imbalance (OFI) signal construction and PCA-based feature compression for equity markets"

Language: Jupyter Notebook - Size: 601 KB - Last synced at: 2 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

Mryadav02/Credit_score_prediction

Watch me build a website that predicts a credit score based on the given inputs.

Language: Jupyter Notebook - Size: 14.3 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

HubertSzydlowski/Decision-Tree---Heart-Attack-Prediction

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

JamesCWeber/British_Airways_Data_Project_Part_2_Flight_Booking_Prediction

The second task given to me while completing the British Airways Data Science micro internship. Conduct feature analysis on customer booking data and create a Random Forest model to predict which customers will book a trip.

Language: Jupyter Notebook - Size: 5.81 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

JamesCWeber/BCG-X-Data-Science-Project-Part-2-Feature-Engineering

The second task given to me while completing the BCG X Data Science micro internship. Conduct feature engineering by selecting, manipulating, and transforming raw data into features that can be used in a supervised learning model.

Language: Jupyter Notebook - Size: 4.79 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

ballet/ballet

☀️🦶 A lightweight framework for collaborative, open-source feature engineering

Language: Python - Size: 17.4 MB - Last synced at: 5 days ago - Pushed at: over 3 years ago - Stars: 33 - Forks: 6

achuthasubhash/Complete-Life-Cycle-of-a-Data-Science-Project

Complete-Life-Cycle-of-a-Data-Science-Project

Size: 156 MB - Last synced at: 6 days ago - Pushed at: 11 months ago - Stars: 609 - Forks: 250

NiranjanRao07/ADHD-ML-Project

This project used machine learning to classify ADHD based on EEG data. We preprocessed the EEG signals, extracted various features, and used LDA for dimensionality reduction. A voting ensemble of classifiers achieved 72% accuracy in distinguishing between ADHD and control groups.

Language: Jupyter Notebook - Size: 1.35 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

icicle-lang/icicle

Icicle Streaming Query Language

Language: Haskell - Size: 14.3 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 27 - Forks: 3

EmilHvitfeldt/feature-engineering-az

Source for book "Feature Engineering A-Z"

Language: HTML - Size: 45.3 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 141 - Forks: 14

MarcosScatolinoBR/previsao-demanda-varejo

Retail demand forecasting project using Random Forest, linear regression, exploratory analysis, and automatic forecast generation.

Language: Python - Size: 1.02 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

Related Keywords
feature-engineering 2,655; machine-learning 1,332; python 702; data-science 629; exploratory-data-analysis 408; feature-selection 380; data-visualization 371; pandas 296; data-analysis 291; feature-extraction 269; random-forest 234; eda 224; deep-learning 201; xgboost 186; scikit-learn 185; numpy 185; logistic-regression 180; classification 176; linear-regression 169; data-cleaning 154; regression 151; data-preprocessing 148; machine-learning-algorithms 143; hyperparameter-tuning 138; matplotlib 133; sklearn 132; seaborn 132; python3 122; jupyter-notebook 119; predictive-modeling 104; kaggle 97; cross-validation 83; regression-models 83; supervised-learning 78; visualization 78; nlp 73; data 70; decision-trees 67; ml 63; data-mining 61; neural-network 59; hyperparameter-optimization 57; natural-language-processing 57; tensorflow 55; kaggle-competition 55; model-evaluation 55; flask 54; random-forest-classifier 53; time-series 53; pca 53; clustering 53; statistics 53; neural-networks 51; r 51; gradient-boosting 49; automl 49; preprocessing 49; unsupervised-learning 47; artificial-intelligence 44; mlops 42; datacleaning 42; outlier-detection 41; data-engineering 41; feature-importance 40; model-deployment 39; statistical-analysis 38; lightgbm 38; classification-algorithm 37; model-selection 37; time-series-analysis 37; regression-analysis 35; data-analytics 35; prediction 34; gridsearchcv 34; datapreprocessing 34; hypothesis-testing 34; kmeans-clustering 33; ensemble-learning 33; dimensionality-reduction 33; data-wrangling 33; svm 32; streamlit 32; pytorch 32; ai 31; computer-vision 31; matplotlib-pyplot 31; datascience 30; pca-analysis 30; lasso-regression 30; catboost 30; sql 29; ridge-regression 29; modeling 29; decision-tree-classifier 29; machinelearning 28; supervised-machine-learning 28; feature-scaling 28; analysis 28; svm-classifier 27; keras 27