An open API service providing repository metadata for many open source software ecosystems.

Topic: "preprocessing"

Unstructured-IO/unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking and embedding.

Language: HTML - Size: 194 MB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 13,460 - Forks: 1,111

dongrixinyu/JioNLP

A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com

Language: Python - Size: 162 MB - Last synced at: 3 days ago - Pushed at: 27 days ago - Stars: 3,785 - Forks: 446

nidhaloff/igel

a delightful machine learning tool that allows you to train, test, and use models without writing code

Language: Python - Size: 18.8 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 3,127 - Forks: 193

OpenGene/fastp

An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)

Language: C++ - Size: 708 KB - Last synced at: 14 days ago - Pushed at: about 1 month ago - Stars: 2,243 - Forks: 364

AxeldeRomblay/MLBox

MLBox is a powerful Automated Machine Learning Python library.

Language: Python - Size: 50 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 1,518 - Forks: 273

sunlabuiuc/PyHealth

A Deep Learning Python Toolkit for Healthcare Applications.

Language: Python - Size: 131 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 1,349 - Forks: 548

winedarksea/AutoTS

Automated Time Series Forecasting

Language: Python - Size: 47.6 MB - Last synced at: 17 days ago - Pushed at: 18 days ago - Stars: 1,334 - Forks: 117

NVIDIA-Merlin/NVTabular

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

Language: Python - Size: 98.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1,100 - Forks: 144

KinWaiCheuk/nnAudio

Audio processing using PyTorch 1D convolution networks

Language: Python - Size: 94.7 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 1,085 - Forks: 96

TheAlgorithms/R

Collection of various algorithms implemented in R.

Language: R - Size: 1.37 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1,036 - Forks: 342

MinishLab/semhash

Fast Semantic Text Deduplication & Filtering

Language: Python - Size: 6.18 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 825 - Forks: 51

pytorch/torcharrow 📦

High performance model preprocessing library on PyTorch

Language: Python - Size: 11.3 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 644 - Forks: 81

qd-cae/awesome-CAE

A curated list of awesome CAE frameworks, libraries and software.

Size: 57.6 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 443 - Forks: 109

R1j1t/contextualSpellCheck

✔️Contextual word checker for better suggestions (not actively maintained)

Language: Python - Size: 2.45 MB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 417 - Forks: 64

msamogh/nonechucks

Deal with bad samples in your dataset dynamically, use Transforms as Filters, and more!

Language: Python - Size: 25.4 KB - Last synced at: 23 days ago - Pushed at: over 3 years ago - Stars: 378 - Forks: 27

MaxHalford/xam

🎯 Personal data science and machine learning toolbox

Language: Python - Size: 1.12 MB - Last synced at: 4 months ago - Pushed at: almost 6 years ago - Stars: 365 - Forks: 75

DataCanvasIO/HyperGBM

A full pipeline AutoML tool for tabular data

Language: Python - Size: 11 MB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 355 - Forks: 47

ikegami-yukino/jaconv

Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, and Zenkaku

Language: Python - Size: 379 KB - Last synced at: 22 days ago - Pushed at: 25 days ago - Stars: 336 - Forks: 32

advaitsave/Introduction-to-Time-Series-forecasting-Python

Introduction to time series preprocessing and forecasting in Python using AR, MA, ARMA, ARIMA, SARIMA and Prophet model with forecast evaluation.

Language: Jupyter Notebook - Size: 2.02 MB - Last synced at: 9 months ago - Pushed at: about 7 years ago - Stars: 323 - Forks: 138

cylondata/cylon

Cylon is a fast, scalable, distributed-memory, parallel runtime with a Pandas-like DataFrame.

Language: C++ - Size: 10.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 293 - Forks: 44

ikegami-yukino/neologdn

Japanese text normalizer for mecab-neologd

Language: Cython - Size: 649 KB - Last synced at: 20 days ago - Pushed at: 22 days ago - Stars: 287 - Forks: 20

OpenTabular/DeepTab

DeepTab is a Python package that simplifies tabular deep learning by providing a suite of models for regression, classification, and distributional regression tasks. It includes models such as Mambular, TabM, FT-Transformer, TabulaRNN, TabTransformer, and tabular ResNets.

Language: Python - Size: 9.15 MB - Last synced at: 8 days ago - Pushed at: 18 days ago - Stars: 283 - Forks: 19

nlpcl-lab/ace2005-preprocessing

ACE 2005 corpus preprocessing for Event Extraction task

Language: Python - Size: 45.9 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 280 - Forks: 71

dunky11/voicesmith

[WIP] VoiceSmith makes training text to speech models easy.

Language: Python - Size: 57 MB - Last synced at: 8 months ago - Pushed at: about 3 years ago - Stars: 224 - Forks: 32

Deffro/text-preprocessing-techniques

16 Text Preprocessing Techniques in Python for Twitter Sentiment Analysis.

Language: Python - Size: 2.36 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 217 - Forks: 82

jbusecke/xMIP

Analysis ready CMIP6 data in python the easy way with pangeo tools.

Language: Jupyter Notebook - Size: 20.4 MB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 204 - Forks: 43

free-astro/siril

The Siril image processing software for amateur astronomy

Last synced at: 1 day ago - Stars: 186 - Forks: 109

google/tensorflow-recorder 📦

TFRecorder makes it easy to create TensorFlow records (TFRecords) from Pandas DataFrames and CSV files containing images or structured data.

Language: Python - Size: 6.54 MB - Last synced at: 4 days ago - Pushed at: over 3 years ago - Stars: 181 - Forks: 31

quqixun/BrainPrep 📦

Preprocessing pipeline on Brain MR Images through FSL and ANTs, including registration, skull-stripping, bias field correction, enhancement and segmentation.

Language: Python - Size: 43.7 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 172 - Forks: 51

sappelhoff/pyprep

PyPREP: A Python implementation of the Preprocessing Pipeline (PREP) for EEG data

Language: Python - Size: 26 MB - Last synced at: 1 day ago - Pushed at: 3 days ago - Stars: 167 - Forks: 35

ropensci/MODIStsp

An "R" package for automatic download and preprocessing of MODIS Land Products Time Series

Language: R - Size: 180 MB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 159 - Forks: 53

Razor12911/xtool 📦

Just some tool repackers like to use...

Language: Pascal - Size: 22.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 152 - Forks: 11

githubharald/DeslantImg

The deslanting algorithm sets text upright in images. Python, C++ and OpenCL implementations provided.

Language: C++ - Size: 591 KB - Last synced at: 7 months ago - Pushed at: about 4 years ago - Stars: 150 - Forks: 38

mlr-org/mlr3pipelines

Dataflow Programming for Machine Learning in R

Language: R - Size: 24.9 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 148 - Forks: 28

autoreject/autoreject

Automated rejection and repair of bad trials/sensors in M/EEG

Language: Python - Size: 704 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 147 - Forks: 59

jaeho3690/LIDC-IDRI-Preprocessing

This is the preprocessing step of the LIDC-IDRI dataset

Language: Jupyter Notebook - Size: 1.84 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 140 - Forks: 39

chakki-works/chariot

Deliver the ready-to-train data to your NLP model.

Language: Jupyter Notebook - Size: 5.61 MB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 122 - Forks: 9

calebevans/cordon

Reduce logs to their semantic anomalies

Language: Python - Size: 14.9 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 120 - Forks: 5

KananVyas/BoxDetection

A Box detection algorithm for any image containing boxes.

Language: Jupyter Notebook - Size: 411 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 118 - Forks: 53

lozuwa/impy

Impy is a Python3 library with features that help you in your computer vision tasks.

Language: Python - Size: 91.4 MB - Last synced at: 9 months ago - Pushed at: almost 7 years ago - Stars: 116 - Forks: 32

chrise96/3D_Ground_Segmentation

A ground segmentation algorithm for 3D point clouds based on the work described in “Fast segmentation of 3D point clouds: a paradigm on LIDAR data for Autonomous Vehicle Applications”, D. Zermas, I. Izzat and N. Papanikolopoulos, 2017. Distinguish between road and non-road points. Road surface extraction. Plane fit ground filter

Language: C++ - Size: 2.91 MB - Last synced at: 9 months ago - Pushed at: almost 4 years ago - Stars: 108 - Forks: 14

methlabUZH/automagic

Automagic

Language: MATLAB - Size: 414 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 104 - Forks: 32

acroucher/PyTOUGH

A Python library for automating TOUGH2 simulations of subsurface fluid and heat flow

Language: Python - Size: 40.4 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 102 - Forks: 38

kharchenkolab/dropEst

Pipeline for initial analysis of droplet-based single-cell RNA-seq data

Language: C++ - Size: 47.1 MB - Last synced at: 9 days ago - Pushed at: over 3 years ago - Stars: 95 - Forks: 42

GiftMungmeeprued/document-parsers-list

A comprehensive list of document parsers, covering PDF-to-text conversion and layout extraction. Each tested for support of tables, equations, handwriting, two-column layouts, and multi-column layouts.

Size: 4.25 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 94 - Forks: 1

MLD3/FIDDLE

FlexIble Data-Driven pipeLinE – a preprocessing pipeline that transforms structured EHR data into feature vectors to be used with ML algorithms. https://doi.org/10.1093/jamia/ocaa139

Language: Jupyter Notebook - Size: 6.41 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 94 - Forks: 19

madyankin/postcss-each 📦

PostCSS plugin to iterate through values

Language: JavaScript - Size: 581 KB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 94 - Forks: 19

VisLab/EEG-Clean-Tools

Contains tools for EEG standardized preprocessing

Language: MATLAB - Size: 4.32 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 92 - Forks: 30

damianhorna/multi-imbalance

Python package for tackling multi-class imbalance problems. http://www.cs.put.poznan.pl/mlango/publications/multiimbalance/

Language: Python - Size: 66 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 79 - Forks: 12

Yu-Group/veridical-flow

Making it easier to build stable, trustworthy data-science pipelines based on the PCS framework.

Language: Jupyter Notebook - Size: 13.4 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 72 - Forks: 8

nipreps/dmriprep

dMRIPrep is a robust and easy-to-use pipeline for preprocessing of diverse dMRI data. The transparent workflow dispenses with manual intervention, thereby ensuring the reproducibility of the results.

Language: Python - Size: 115 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 71 - Forks: 25

elcorto/pwtools

pwtools is a Python package for pre- and postprocessing of atomistic calculations, mostly targeted to Quantum Espresso, CPMD, CP2K and LAMMPS. It is almost, but not quite, entirely unlike ASE, with some tools extending numpy/scipy. It has a set of powerful parsers and data types for storing calculation data.

Language: Python - Size: 21.3 MB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 71 - Forks: 17

keurfonluu/toughio

Pre- and post-processing Python library for TOUGH

Language: Python - Size: 18.3 MB - Last synced at: 15 days ago - Pushed at: 19 days ago - Stars: 67 - Forks: 10

ALebrun-108/BoxSERS

Python package that provides a full range of functionality to process and analyze vibrational spectra (Raman, SERS, FTIR, etc.).

Language: Jupyter Notebook - Size: 20 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 66 - Forks: 15

ildoonet/remote-dataloader

PyTorch DataLoader processed on multiple remote computation machines for heavy data processing

Language: Python - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 66 - Forks: 2

hirofumi0810/asr_preprocessing

Python implementation of pre-processing for End-to-End speech recognition

Language: Python - Size: 1.67 MB - Last synced at: over 2 years ago - Pushed at: almost 8 years ago - Stars: 66 - Forks: 22

gregversteeg/gaussianize

Transforms univariate data into normally distributed data

Language: Python - Size: 121 KB - Last synced at: almost 3 years ago - Pushed at: over 3 years ago - Stars: 63 - Forks: 24

wajuqi/Sentinel-1-preprocessing-using-Snappy

Sentinel-1 image pre-processing using snappy.

Language: Python - Size: 17.6 KB - Last synced at: almost 3 years ago - Pushed at: almost 4 years ago - Stars: 63 - Forks: 22

AlessioZanga/PyEEGLab 📦

Analyze and manipulate EEG data using PyEEGLab.

Language: Python - Size: 1.04 GB - Last synced at: 27 days ago - Pushed at: about 5 years ago - Stars: 62 - Forks: 23

TakeLab/podium

Podium: a framework agnostic Python NLP library for data loading and preprocessing

Language: Python - Size: 2.19 MB - Last synced at: 27 days ago - Pushed at: about 3 years ago - Stars: 60 - Forks: 2

YuxinZhaozyx/pytorch-VideoDataset

Tools for loading video dataset and transforms on video in pytorch. You can directly load video files without preprocessing.

Language: Python - Size: 7.81 KB - Last synced at: almost 3 years ago - Pushed at: over 3 years ago - Stars: 58 - Forks: 16

lucasrla/wsi-preprocessing

Simple library for preprocessing histopathological whole-slide images (WSI) into tiles (a.k.a. patches) towards deep learning

Language: Python - Size: 18.6 KB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 55 - Forks: 14

taknev83/pywedge

Makes Interactive Chart Widget, Cleans raw data, Runs baseline models, Interactive hyperparameter tuning & tracking

Language: Jupyter Notebook - Size: 9.62 MB - Last synced at: 27 days ago - Pushed at: about 4 years ago - Stars: 55 - Forks: 10

MASILab/PreQual

An automated pipeline for integrated preprocessing and quality assurance of diffusion weighted MRI images

Language: Python - Size: 396 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 52 - Forks: 10

olivierhagolle/Start_maja

To process a Sentinel-2 time series with MAJA cloud detection and atmospheric correction processor

Language: Python - Size: 483 MB - Last synced at: 4 months ago - Pushed at: almost 6 years ago - Stars: 50 - Forks: 15

VincentStimper/mclahe

NumPy and Tensorflow implementation of the Multidimensional Contrast Limited Adaptive Histogram Equalization (MCLAHE) procedure

Language: Python - Size: 16.8 MB - Last synced at: 2 days ago - Pushed at: over 3 years ago - Stars: 49 - Forks: 6

paulross/cpip

CPIP - a C/C++ preprocessor implemented in Python.

Language: Python - Size: 37.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 46 - Forks: 4

nlgranger/SeqTools

A python library to manipulate and transform indexable data (lists, arrays, ...)

Language: Python - Size: 1.56 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 46 - Forks: 4

SilentFlame/Named-Entity-Recognition

Corpus and a baseline neural network system for Named Entity Recognition in Hindi-English Code-Mixed social media text.

Language: Python - Size: 29.2 MB - Last synced at: 9 months ago - Pushed at: about 5 years ago - Stars: 45 - Forks: 16

0xferit/ITU-Turkish-NLP-Pipeline-Caller 📦

A Python3 wrapper tool to help use the ITU Turkish NLP Pipeline API -- UNMAINTAINED --

Language: Python - Size: 131 KB - Last synced at: about 2 months ago - Pushed at: over 7 years ago - Stars: 45 - Forks: 9

preprocessy/preprocessy

Python package for Customizable Data Preprocessing Pipelines

Language: Jupyter Notebook - Size: 993 KB - Last synced at: 20 days ago - Pushed at: 23 days ago - Stars: 44 - Forks: 14

l-ramirez-lopez/prospectr

R package: Misc. Functions for Processing and Sample Selection of Spectroscopic Data

Language: R - Size: 17.4 MB - Last synced at: 15 days ago - Pushed at: 2 months ago - Stars: 44 - Forks: 21

bids-apps/freesurfer

BIDS app wrapping recon-all from FreeSurfer

Language: Python - Size: 224 KB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 43 - Forks: 35

karakurai/visual_inspection

An application for visual inspection written in Python, running on Windows, Linux, and macOS. This software enables high-performance visual inspection even with an inexpensive web camera. No GPU machine required. It is possible to automate the inspection in a factory.

Language: Python - Size: 9.21 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 42 - Forks: 13

data-science-lab-amsterdam/skippa

SciKIt-learn Pipeline in PAndas

Language: Python - Size: 423 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 42 - Forks: 1

Aura-healthcare/ecg_qc

A library to compute ECG signal quality indicators

Language: Jupyter Notebook - Size: 50.4 MB - Last synced at: 27 days ago - Pushed at: about 3 years ago - Stars: 42 - Forks: 10

OanaIgnat/I3D_Keras

I3D implementation in Keras + video preprocessing + visualization of results

Language: Python - Size: 83 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 41 - Forks: 10

TextDatasetCleaner/TextDatasetCleaner

🔬 Cleaning junk out of datasets (normalization, preprocessing)

Language: Python - Size: 72.3 KB - Last synced at: 3 months ago - Pushed at: almost 5 years ago - Stars: 40 - Forks: 10

ag-ds-bubble/swtloc

Python package for Stroke Width Transform - Localizing the Text (Letters & Words) in a Natural Image

Language: Python - Size: 126 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 39 - Forks: 6

ParkerICI/premessa

R package for pre-processing of mass and flow cytometry data

Language: R - Size: 247 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 39 - Forks: 23

Puneet2000/In-Depth-ML

In depth machine learning resources

Language: Jupyter Notebook - Size: 130 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 38 - Forks: 16

bids-apps/HCPPipelines

A BIDS App for minimal preprocessing using the HCP Pipelines

Language: Python - Size: 153 KB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 37 - Forks: 31

raj-sutariya/indic-num2words

Python library for converting numbers to words for all Indian Languages.

Language: Python - Size: 117 KB - Last synced at: 14 days ago - Pushed at: 7 months ago - Stars: 37 - Forks: 13

SIMEXP/load_confounds 📦

Load fMRIprep confounds in python

Language: Python - Size: 3.15 MB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 37 - Forks: 12

Clearailhc/ACE2005-toolkit

Focusing on ACE 2005 data preprocessing, we provide doc-level, sentence-level and BIO-style golden data preprocessing. The only thing you need is the ACE05 raw data. Hope you enjoy!😎

Language: Python - Size: 46.6 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 37 - Forks: 6

rachellea/ct-volume-preprocessing

End-to-end Python CT volume preprocessing pipeline to convert raw DICOMs into clean 3D numpy arrays for ML. From paper Draelos et al. "Machine-Learning-Based Multiple Abnormality Prediction with Large-Scale Chest Computed Tomography Volumes."

Language: Python - Size: 25.4 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 36 - Forks: 15

allenai/smashed

SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batching, and more. Supports datasets from Huggingface, torchdata iterables, or simple lists of dictionaries.

Language: Python - Size: 4.56 MB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 35 - Forks: 5

FareedKhan-dev/Most-powerful-NLP-library

Gemini, as capable as GPT-4, provides a free API with limited access. I tested it with the help of prompt engineering and found that it can solve almost any NLP task you want to tackle.

Language: Jupyter Notebook - Size: 107 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 35 - Forks: 9

fitushar/Brain-Tissue-Segmentation-Using-Deep-Learning-Pipeline-NeuroNet

This repository is for the MISA course final project, which was brain tissue segmentation. We adopt NeuroNet, a comprehensive brain image segmentation tool based on a novel multi-output CNN architecture, which has been trained and tuned using the IBSR18 dataset.

Language: Jupyter Notebook - Size: 5.16 MB - Last synced at: 8 months ago - Pushed at: over 5 years ago - Stars: 35 - Forks: 9

huseinzol05/Machine-Learning-Data-Science-Reuse 📦

Gathers machine learning and data science techniques for problem solving.

Language: Jupyter Notebook - Size: 38.1 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 35 - Forks: 32

fkie-cad/Logprep

Log data preprocessing, generation and shipping in Python

Language: Python - Size: 10.3 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 34 - Forks: 10

daniellwdb/roka

🤖 Rise of Kingdoms bot to manage kingdom titles and DKP through Discord.

Language: TypeScript - Size: 35.6 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 34 - Forks: 16

maruedt/chemometrics

Python library for chemometric data analysis

Language: Python - Size: 37.9 MB - Last synced at: 3 months ago - Pushed at: about 3 years ago - Stars: 34 - Forks: 6

NirLab-TAU/sleepeegpy

Language: Jupyter Notebook - Size: 166 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 33 - Forks: 12

hellosunking/Ktrim

Ktrim: an extra-fast and accurate adapter- and quality-trimmer for sequencing data

Language: C++ - Size: 336 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 32 - Forks: 7

JuliaML/MLLabelUtils.jl

Utility package for working with classification targets and label-encodings

Language: Julia - Size: 170 KB - Last synced at: about 2 months ago - Pushed at: almost 4 years ago - Stars: 31 - Forks: 13

hscspring/pnlp

NLP pre-/post-processing tools.

Language: Python - Size: 106 KB - Last synced at: 27 days ago - Pushed at: 9 months ago - Stars: 30 - Forks: 6

intuition-dev/INTUITION

Intuition v1. CLI for Pug, CRUD and docs/blogs as staticGen, and much more.

Size: 197 MB - Last synced at: 7 months ago - Pushed at: almost 3 years ago - Stars: 29 - Forks: 3

SudhakarKuma/Machine_Learning

A repository of resources for understanding the concepts of machine learning/deep learning. 

Language: Jupyter Notebook - Size: 615 MB - Last synced at: almost 3 years ago - Pushed at: over 4 years ago - Stars: 29 - Forks: 26

prat96/FLIR_to_Yolo

This script converts FLIR thermal dataset annotations to YOLO format

Language: Python - Size: 16.6 KB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 29 - Forks: 4

Related Topics
machine-learning (294), python (279), nlp (115), data-science (113), deep-learning (77), pandas (73), classification (67), data (56), data-analysis (53), data-visualization (52), feature-engineering (45), tensorflow (43), python3 (43), numpy (43), sklearn (42), dataset (41), natural-language-processing (40), logistic-regression (38), random-forest (35), eda (35), visualization (33), regression (33), data-mining (32), scikit-learn (29), exploratory-data-analysis (29), neural-network (28), machine-learning-algorithms (28), image-processing (28), keras (28), clustering (27), pytorch (27), linear-regression (27), feature-extraction (25), nltk (25), r (25), pipeline (25), matplotlib (24), sentiment-analysis (23), jupyter-notebook (23), artificial-intelligence (22), seaborn (22), data-cleaning (22), ai (21), preprocessor (21), ml (21), neural-networks (20), computer-vision (20), eeg (18), svm (18), cnn (17), nlp-machine-learning (17), xgboost (17), java (17), svm-classifier (17), postprocessing (17), text-processing (17), statistics (16), supervised-learning (16), decision-trees (15), streamlit (15), time-series (15), opencv (15), normalization (15), prediction (14), datascience (14), text (14), knn (14), analysis (14), knn-classification (14), ensemble-learning (13), predictive-modeling (13), tokenizer (13), kaggle (13), feature-selection (13), preprocessing-data (12), neuroimaging (12), mri (12), datacleaning (12), pca (11), text-classification (11), lemmatization (11), c (11), regression-models (11), tf-idf (11), naive-bayes-classifier (11), twitter (10), matlab (10), css (10), pandas-dataframe (10), text-mining (10), word2vec (10), datamining (10), ocr (10), bioinformatics (9), fmri (9), outlier-detection (9), flask (9), weka (9), data-preprocessing (9), tokenization (9)