An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: preprocessing

sunlabuiuc/PyHealth

A Deep Learning Python Toolkit for Healthcare Applications.

Language: Python - Size: 122 MB - Last synced at: 18 minutes ago - Pushed at: about 2 hours ago - Stars: 1,283 - Forks: 447

oggtgt/AI-Powered-Loan-Eligibility-Risk-Scoring-System

🤖 Build an AI-driven loan eligibility and risk scoring system to facilitate smarter loan decisions with advanced machine learning techniques.

Language: Jupyter Notebook - Size: 5.22 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 1

dongrixinyu/JioNLP

A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com

Language: Python - Size: 159 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3,746 - Forks: 441

fantaskiss/ComfyUI-node-img88

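A set of image preprocessing nodes for ComfyUI. Because models require image side lengths to be divisible by a given number, these nodes were made to avoid the distortion that ordinary scaling introduces. Includes two nodes that automatically extend image side lengths before outpainting, and one node that loads image size presets before image generation.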

Language: Python - Size: 83 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

PavelGrigoryevDS/olist-deep-dive

🌊 Deep Sales Analysis of Olist E-Commerce: EDA | Time Series | Viz | RFM | NLP | Geospatial | Segmentation & Actionable Business Recommendations.

Language: Jupyter Notebook - Size: 116 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3 - Forks: 0

feqq1/Air-Aware-smart-Air-Quality-prediction-system

🌍 Monitor and forecast air quality efficiently with AI-driven analytics and interactive dashboards for informed decision-making.

Language: Jupyter Notebook - Size: 4.87 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

raydac/jcp-ai

Connectors for Java Comment Preprocessor (JCP) to work with LLM clients

Language: Java - Size: 1.07 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 4 - Forks: 0

jbferet/preprocs2

preprocS2 is an R package dedicated to basic preprocessing of Sentinel-2 Level-2A reflectance images.

Language: R - Size: 22.7 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 3 - Forks: 0

Chau2873/UIU-DataMining-Lab

📊 Explore data mining concepts and hands-on Python examples with exercises for the UIU Data Mining Course. Enhance your skills in ML and data visualization.

Language: Jupyter Notebook - Size: 396 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

DaveBNU/cortexai

Language: JavaScript - Size: 256 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

yassineahmed/preq

preq is the community-driven problem detector for Common Reliability Enumerations (CREs).

Language: Go - Size: 79.1 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

fkie-cad/Logprep

log data pre processing, generation and shipping in python

Language: Python - Size: 9.62 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 34 - Forks: 10

Renen343/ai-flavor-remover

🌟 Enhance your text by removing AI-generated flavors, making it more natural and engaging while preserving the original meaning.

Size: 7.81 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

Mohid-Water-Modelling-System/MOHID_Jupyter-Notebooks

Jupyter Notebooks for the MOHID Water Modelling System

Language: Fortran - Size: 82.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 1

pratham-ak2004/sms-spam-classifier

This repository is deployed on the web using the Streamlit hosting service.

Language: Jupyter Notebook - Size: 801 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

nipreps/nifreeze

A flexible framework for volume-wise artifact estimation and correction across multiple 4D neuroimaging modalities (diffusion MRI, functional MRI, and PET)

Language: Python - Size: 128 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 5 - Forks: 5

nipreps/dmriprep

dMRIPrep is a robust and easy-to-use pipeline for preprocessing of diverse dMRI data. The transparent workflow dispenses with manual intervention, thereby ensuring the reproducibility of the results.

Language: Python - Size: 115 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 71 - Forks: 25

Davide011/ML_project_South_African_Heart_Disease

Public Repository: Machine Learning & Data Mining project using the South African Heart Disease dataset. Applied PCA, Regularized Linear Regression, ANN, Logistic Regression, and Decision Trees with cross-validation for regression and classification. Includes feature scaling, EDA, and statistical tests.

Size: 1.32 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

MinishLab/semhash

Fast Semantic Text Deduplication & Filtering

Language: Python - Size: 6.18 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 825 - Forks: 51

songyz2019/hsi-preprocessing-toolkit

A Hyperspectral Image Preprocessing Toolkit from HSI Camera to Machine Learning Dataset

Language: Python - Size: 18.1 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0

sappelhoff/pyprep

PyPREP: A Python implementation of the Preprocessing Pipeline (PREP) for EEG data

Language: Python - Size: 26 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 161 - Forks: 35

zamirmehdi/Data-and-Information-Analysis-Course

Implementation of data analysis algorithms — normalization, outlier detection, and K-Means — from scratch in Python

Language: Jupyter Notebook - Size: 1.85 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 3 - Forks: 0

winedarksea/AutoTS

Automated Time Series Forecasting

Language: Python - Size: 50.3 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1,330 - Forks: 117

jamal919/pycaz

Collection of functions for data analysis, model input preparation, post-processing, analysis.

Language: Python - Size: 1.12 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 5 - Forks: 2

Gersha2024/Alzheimer-MRI-Preprocessing-FreeSurfer-SliceSelection-DeepLearning-TransferLearning-EnsembleLearning

🧠 Detect Alzheimer's disease using MRI scans with transfer learning, deep learning, and ensemble methods for accurate stage classification and progression prediction.

Language: Python - Size: 38.1 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

KeeVeeGames/Shady.gml

GameMaker shader preprocessor for code reuse! Import and inline directives, generating shader variants.

Language: C# - Size: 41.3 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 24 - Forks: 1

qd-cae/awesome-CAE

A curated list of awesome CAE frameworks, libraries and software.

Size: 57.6 KB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 443 - Forks: 109

Awais-Asghar/SkinSense-Multi-Model-Skin-Cancer-Classifier

A machine learning project for binary classification of skin cancer as malignant or benign, utilizing models like XGBoost, LGBM Classifier, Adaboost, SVM, and Logistic Regression. Features comprehensive data preprocessing, model training, and evaluation for accurate diagnosis.

Language: Jupyter Notebook - Size: 8.56 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1 - Forks: 0

calvinmccarter/kditransform

Kernel density integral transformation: feature preprocessing and univariate clustering (TMLR, 2023)

Language: Python - Size: 15.4 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 9 - Forks: 0

TheAlgorithms/R

Collection of various algorithms implemented in R.

Language: R - Size: 1.37 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1,036 - Forks: 342

OpenGene/fastp

An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)

Language: C++ - Size: 691 KB - Last synced at: 7 days ago - Pushed at: 17 days ago - Stars: 2,204 - Forks: 354

SaadatMilad1792/PhysioPrep

A preprocessing pipeline for physiological waveform datasets.

Language: Python - Size: 6.13 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 2 - Forks: 0

EttoreRocchi/MaldiAMRKit

Comprehensive toolkit for MALDI-TOF mass spectrometry data preprocessing for antimicrobial resistance (AMR) prediction purposes

Language: Python - Size: 7.35 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 2 - Forks: 0

labex-labs/scikit-learn-for-beginners

This comprehensive course covers the fundamental concepts and practical techniques of Scikit-learn, the essential machine learning library in Python. Learn to build, train, and evaluate machine learning models using various algorithms and preprocessing techniques.

Size: 37.1 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

itsatefe/RoboShiraz-AI-Basics

An educational tutorial for beginners to learn the fundamentals of Artificial Intelligence and Machine Learning using Python.

Language: Jupyter Notebook - Size: 9.09 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 0

jbusecke/xMIP

Analysis ready CMIP6 data in python the easy way with pangeo tools.

Language: Jupyter Notebook - Size: 20.4 MB - Last synced at: 3 days ago - Pushed at: 25 days ago - Stars: 204 - Forks: 44

geometric-intelligence/polpo

A Geometric Intelligence Lab's collection of weakly-related tools.

Language: Python - Size: 165 MB - Last synced at: 13 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 1

l-ramirez-lopez/prospectr

R package: Misc. Functions for Processing and Sample Selection of Spectroscopic Data

Language: R - Size: 17.4 MB - Last synced at: 10 days ago - Pushed at: 14 days ago - Stars: 44 - Forks: 21

veldhub/veld_chain__demo_nlp_generic_preprocessing

Demo of encapsulation of several commonly used NLP preprocessing workflows

Size: 300 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

veldhub/veld_code__nlp_generic_preprocessing

Encapsulation of several commonly used NLP preprocessing workflows

Language: Python - Size: 122 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

ikegami-yukino/jaconv

Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, and Zenkaku

Language: Python - Size: 369 KB - Last synced at: 12 days ago - Pushed at: 3 months ago - Stars: 335 - Forks: 32

4t2de/disease-classification-ml

Prototype-Based Classifier for Disease Classification

Language: Jupyter Notebook - Size: 6.46 MB - Last synced at: about 5 hours ago - Pushed at: 19 days ago - Stars: 1 - Forks: 0

pawlyk/dsml-tools

set of Data Science and Machine Learning tools

Language: Python - Size: 265 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

autoreject/autoreject

Automated rejection and repair of bad trials/sensors in M/EEG

Language: Python - Size: 704 KB - Last synced at: 10 days ago - Pushed at: about 2 months ago - Stars: 146 - Forks: 59

Mecanik/Modern-Text-Tokenizer

Modern UTF-8 aware C++ tokenizer with vocabulary support, ideal for NLP and transformer models. Header-only and zero-dependency.

Language: C++ - Size: 42 KB - Last synced at: 10 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

elcorto/pwtools

pwtools is a Python package for pre- and postprocessing of atomistic calculations, mostly targeted to Quantum Espresso, CPMD, CP2K and LAMMPS. It is almost, but not quite, entirely unlike ASE, with some tools extending numpy/scipy. It has a set of powerful parsers and data types for storing calculation data.

Language: Python - Size: 21.3 MB - Last synced at: 9 days ago - Pushed at: 5 months ago - Stars: 71 - Forks: 17

yosina-lib/yosina

Yosina is a transliteration library that deals with the letters and symbols used in Japanese writing.

Language: Rust - Size: 1.96 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 19 - Forks: 1

AnaAquiles/MiniscopePipeLine

Preprocessing functions for Inscopix movies

Language: Jupyter Notebook - Size: 9.77 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

mlr-org/mlr3pipelines

Dataflow Programming for Machine Learning in R

Language: R - Size: 23.4 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 146 - Forks: 29

shawntz/eyeris

Fully featured R package for reproducible pupillometry preprocessing | Interactive reports, BIDS-compliant, High-throughput database tooling out-of-the-box | Developed by neuroscientists at Stanford

Language: R - Size: 106 MB - Last synced at: 10 days ago - Pushed at: 24 days ago - Stars: 5 - Forks: 4

k-vashpanova/rsl-slp

A model for translating from Russian into Russian Sign Language

Language: Python - Size: 33.2 KB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 0 - Forks: 0

speg03/jiren

jinja2 template renderer

Language: Python - Size: 315 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 2 - Forks: 0

moosmann/matlab

Data reconstruction and analysis tools for tomography data acquired at the P05 Imaging Beamline (IBL) and the P07 High-Energy Material Science (HEMS) beamline at PETRA III at DESY, both operated by Helmholtz-Zentrum Hereon.

Language: MATLAB - Size: 20.4 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 9 - Forks: 7

Chandrashekar0123/Students_Passout_Predictions

This repository predicts whether students pass or fail using machine learning techniques.

Language: Jupyter Notebook - Size: 948 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 1 - Forks: 0

LaurentDardenne/Template

Code generation by using text templates

Language: PowerShell - Size: 170 KB - Last synced at: 19 days ago - Pushed at: over 8 years ago - Stars: 2 - Forks: 0

DmitryRyumin/OpenAV

An open-source library for recognition of speech commands in the user dictionary using audiovisual data of the speaker

Language: Python - Size: 113 MB - Last synced at: 20 days ago - Pushed at: 8 months ago - Stars: 6 - Forks: 3

pytorch/torcharrow 📦

High performance model preprocessing library on PyTorch

Language: Python - Size: 11.3 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 644 - Forks: 81

OpenTabular/DeepTabular

Mambular is a Python package that simplifies tabular deep learning by providing a suite of models for regression, classification, and distributional regression tasks. It includes models such as Mambular, TabM, FT-Transformer, TabulaRNN, TabTransformer, and tabular ResNets.

Language: Python - Size: 9.06 MB - Last synced at: 25 days ago - Pushed at: 3 months ago - Stars: 264 - Forks: 16

pharo-ai/data-preprocessing

Project including data preprocessing algorithms. We aim to include scaling, centering, normalization, and binarization methods.

Language: Smalltalk - Size: 32.2 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 1

keurfonluu/toughio

Pre- and post-processing Python library for TOUGH

Language: Python - Size: 18.6 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 66 - Forks: 9

PylarBear/pybear

pybear is a Python computing library that augments data analytics functionality found in popular packages that use the scikit-learn API, such as scikit-learn and xgboost.

Language: Python - Size: 50.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

AlwaysDhruv/Image-Classification-CPP

Hi there, I'm Dhruv. This project is developed in C++ and Python for image recognition. C++ is the main engine and Python is used for preprocessing only. More information is in the README file.

Language: C++ - Size: 1.06 MB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

oshinrathor/ML-NLP-Projects

This repository contains a collection of Machine Learning and NLP projects, including sentiment analysis with NLTK, text preprocessing, and deep learning models. It covers techniques like tokenization, stopword removal, lemmatization, rule-based analysis, and transformer models like BERT for practical NLP applications.

Language: Jupyter Notebook - Size: 2.85 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0

OpenTabular/PreTab

pretab is a flexible and extensible preprocessing library for tabular data, built on top of scikit-learn. It provides advanced transformations, spline and neural feature expansions, and seamless integration with embeddings – all designed for modern tabular ML workflows.

Language: Python - Size: 113 KB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 11 - Forks: 1

obtic-sorbonne/Toolbox-site

Pandore offers a set of tools that facilitate the most common corpus processing tasks for digital humanities research. Automatic pipelines for a set of tasks are also available

Language: HTML - Size: 168 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 1

Unstructured-IO/unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking and embedding.

Language: HTML - Size: 193 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 12,775 - Forks: 1,042

0xferit/ITU-Turkish-NLP-Pipeline-Caller 📦

A Python3 wrapper tool to help using ITU Turkish NLP Pipeline API -- UNMAINTAINED --

Language: Python - Size: 131 KB - Last synced at: about 18 hours ago - Pushed at: over 7 years ago - Stars: 45 - Forks: 9

anlijun/awesome-CAE-software

A curated list of awesome CAE frameworks, libraries, and software from a full CAE workflow perspective, including the integration of AI technologies.

Size: 324 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 18 - Forks: 0

veldhub/veld_code__wordembeddings_preprocessing

Code velds encapsulating preprocessing for training of wordembeddings.

Language: Python - Size: 51.8 KB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

NirLab-TAU/sleepeegpy

Language: Jupyter Notebook - Size: 166 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 33 - Forks: 12

Ashly1991/rnn-text-classification-tf2

IMDB sentiment analysis with a from-scratch RNN in low-level TensorFlow 2 (no Keras RNN layers). Padding/truncation, vocab limits, and BPTT training.

Language: Jupyter Notebook - Size: 9.77 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

exponentialR/QUB-HRI

Preprocessing Repository of QUB-Perception of Human Engagement in Assembly Operations Dataset

Language: Python - Size: 91 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

lennymalard/melpy-project

A NumPy-based deep learning library for building neural networks. It features an automatic differentiation engine and supports training models like LSTM, CNN, and FNN.

Language: Python - Size: 159 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 3 - Forks: 0

kashinathbiradar/Bangalore-Housing-Price-Prediction

The objective of the project is to create a machine learning model. It uses supervised learning, and the aim is predictive analysis of housing prices.

Language: HTML - Size: 84 KB - Last synced at: 25 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

AxeldeRomblay/MLBox

MLBox is a powerful Automated Machine Learning python library.

Language: Python - Size: 50 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 1,518 - Forks: 273

Hyedryn/elikopy

ElikoPy is a Python library that aims to ease the processing of diffusion imaging for microstructural analysis.

Language: Python - Size: 4.52 MB - Last synced at: 14 days ago - Pushed at: 3 months ago - Stars: 19 - Forks: 5

KinWaiCheuk/nnAudio

Audio processing by using pytorch 1D convolution network

Language: Python - Size: 94.7 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 1,085 - Forks: 96

Alaeddin-B/Movie-Rental-Durations-Predictor

A comprehensive machine learning project to predict the number of days customers will rent DVDs based on movie features and rental characteristics, enabling optimized inventory planning for rental businesses.

Language: Jupyter Notebook - Size: 981 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

EttoreRocchi/combatlearn

The ComBat algorithm for a learning framework (scikit-learn compatible)

Language: Python - Size: 2.44 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 5 - Forks: 0

mervewhereubeen/VCTS

Visual symptom-based early warning (hybrid: classical + DL), FastAPI + SQL, patient/doctor workflow

Size: 17.6 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Stan-create/ODC

This project uses a CNN to classify osteoporosis, osteopenia, and normal bone structure.

Language: Python - Size: 1.17 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Shriyaak/MachineLearning.studyjournal.1st

Language: Jupyter Notebook - Size: 246 KB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

gtkacz/undergrad_thesis

Code for my undergraduate thesis: Quantitative Analysis of the Impact of Image Pre-Processing on the Accuracy of Computer Vision Models Trained to Identify Dermatological Skin Diseases

Language: Jupyter Notebook - Size: 2.96 GB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

stanstrup/QC4Metabolomics

QC systems for metabolomics studies

Language: R - Size: 351 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 10 - Forks: 0

stefantaubert/english-text-normalization

Command-line interface (CLI) and library to normalize English texts.

Language: Python - Size: 235 KB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 1

NVIDIA-Merlin/NVTabular

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

Language: Python - Size: 98.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1,100 - Forks: 144

TapasviNomula/Air-Aware-smart-Air-Quality-prediction-system

AI-driven Air Quality Monitoring & Forecasting Project 🚀 | Air Quality Data Preprocessing, EDA, Visualization & Forecast-ready Dashboard

Language: Jupyter Notebook - Size: 3.97 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

bcrist/limp

Lua Inline Metaprogramming Preprocessor

Language: C - Size: 938 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

0xyydeca/arduino

🚀 C++ Machine Learning Project: Digit Recognition with Support Vector Machine (SVM) 🖥️ This project is a robust implementation of digit recognition using Support Vector Machine (SVM) in C++. The SVM algorithm, a powerful supervised learning technique, is employed to classify handwritten digits from the famous MNIST dataset.

Language: C++ - Size: 188 KB - Last synced at: 12 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

LuongHuuPhuc/LibBuild_techC

How static and dynamic/shared libraries in C/C++ work

Language: SWIG - Size: 409 KB - Last synced at: 8 days ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

google/tensorflow-recorder 📦

TFRecorder makes it easy to create TensorFlow records (TFRecords) from Pandas DataFrames and CSV files containing images or structured data.

Language: Python - Size: 6.54 MB - Last synced at: 8 days ago - Pushed at: over 3 years ago - Stars: 180 - Forks: 32

SyncfusionExamples/Creating-the-WPF-chart-with-Azure-open-AI-for-Data-Cleaning-and-Preprocessing

This article illustrates how to clean and preprocess data using Azure OpenAI in conjunction with a Syncfusion WPF chart.

Language: C# - Size: 17.6 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Awais11227/loan_prediction_Analysis

Predict loan approvals using the Kaggle Loan Prediction dataset. This project covers data preprocessing, exploratory data analysis (EDA), feature engineering, and building machine learning models to classify loan approval status.

Language: Jupyter Notebook - Size: 314 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

jknafou/TransCorpus

TransCorpus is a scalable toolkit for large-scale, parallel translation and preprocessing of text corpora, built for language model pretraining and research.

Language: Python - Size: 5.92 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

bids-apps/HCPPipelines

A BIDS App for minimal preprocessing using the HCP Pipelines

Language: Python - Size: 152 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 37 - Forks: 31

ropensci/MODIStsp

An "R" package for automatic download and preprocessing of MODIS Land Products Time Series

Language: R - Size: 180 MB - Last synced at: 7 days ago - Pushed at: 6 months ago - Stars: 159 - Forks: 53

preprocessy/preprocessy

Python package for Customizable Data Preprocessing Pipelines

Language: Jupyter Notebook - Size: 993 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 44 - Forks: 14

danpacho/obsidian_blog

🔨 Plugin based post preprocessing & CI/CD tool for obsidian

Language: TypeScript - Size: 95.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

bids-apps/freesurfer

BIDS app wrapping recon-all from FreeSurfer

Language: Python - Size: 223 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 43 - Forks: 35

karakatic/EvoPreprocess

A Python Toolkit for Data Preprocessing with Evolutionary and Nature-Inspired Algorithms.

Language: Python - Size: 129 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 3

Related Keywords
preprocessing 1,519 machine-learning 458 python 420 data-science 183 nlp 166 pandas 130 deep-learning 113 classification 112 data-visualization 81 data 81 data-analysis 80 numpy 79 eda 71 python3 71 sklearn 66 feature-engineering 66 logistic-regression 65 visualization 63 natural-language-processing 61 tensorflow 60 linear-regression 60 dataset 60 scikit-learn 56 exploratory-data-analysis 56 random-forest 55 matplotlib 54 machine-learning-algorithms 51 data-cleaning 51 regression 48 clustering 48 jupyter-notebook 47 data-mining 46 sentiment-analysis 42 seaborn 42 keras 40 image-processing 40 pytorch 38 neural-network 37 pipeline 34 r 34 nltk 34 feature-extraction 33 computer-vision 32 analysis 31 svm 31 ml 30 neural-networks 30 ai 30 artificial-intelligence 29 supervised-learning 28 cnn 28 preprocessor 27 decision-trees 26 xgboost 25 svm-classifier 25 datascience 24 nlp-machine-learning 24 prediction 23 normalization 22 feature-selection 22 pca 22 time-series 22 predictive-modeling 22 kaggle 21 streamlit 21 text-processing 20 knn-classification 20 statistics 20 text-classification 20 tf-idf 19 naive-bayes-classifier 19 eeg 19 preprocessing-data 19 knn 18 datacleaning 18 pca-analysis 18 opencv 18 lemmatization 18 tokenizer 17 text-mining 17 text 17 confusion-matrix 17 tokenization 17 java 17 kmeans-clustering 17 css 16 regression-models 16 word2vec 16 html 15 postprocessing 15 data-preprocessing 15 hyperparameter-tuning 15 random-forest-classifier 15 mri 15 unsupervised-learning 15 neuroimaging 15 pandas-dataframe 15 cross-validation 15 outlier-detection 15 flask 14