An open API service providing repository metadata for many open source software ecosystems.

Topic: "data-preparation"

hi-primus/optimus

:truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

Language: Python - Size: 110 MB - Last synced at: about 18 hours ago - Pushed at: 5 months ago - Stars: 1,505 - Forks: 232

skrub-data/skrub

Machine learning with dataframes

Language: Python - Size: 12.4 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,360 - Forks: 121

NVIDIA/NeMo-Curator

Scalable data preprocessing and curation toolkit for LLMs

Language: Jupyter Notebook - Size: 7.66 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 879 - Forks: 124

data-prep-kit/data-prep-kit

Open source project for data preparation for LLM application builders

Language: HTML - Size: 219 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 622 - Forks: 193

developmentseed/label-maker

Data Preparation for Satellite Machine Learning

Language: Python - Size: 18.8 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 465 - Forks: 111

PacktWorkshops/The-Data-Science-Workshop

A New, Interactive Approach to Learning Data Science

Language: Jupyter Notebook - Size: 169 MB - Last synced at: 21 days ago - Pushed at: over 2 years ago - Stars: 226 - Forks: 218

pablo14/data-science-live-book

An open source book to learn data science, data analysis and machine learning, suitable for all ages!

Language: TeX - Size: 58.4 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 215 - Forks: 106

hi-primus/bumblebee

🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)

Language: Vue - Size: 23 MB - Last synced at: 19 days ago - Pushed at: almost 2 years ago - Stars: 141 - Forks: 35

whwu95/MVFNet

【AAAI'2021】MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Language: Python - Size: 20.3 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 136 - Forks: 12

asavinov/prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

Language: Python - Size: 1.95 MB - Last synced at: 3 days ago - Pushed at: over 3 years ago - Stars: 90 - Forks: 5

sbcgua/mockup_loader

ABAP unit testing framework: prepare in Excel, reuse in ABAP code

Language: ABAP - Size: 992 KB - Last synced at: 19 days ago - Pushed at: 2 months ago - Stars: 68 - Forks: 16

Talend/data-prep 📦

OS code of Data-prep project

Language: Java - Size: 67.2 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 56 - Forks: 28

ruchikaverma-iitg/MoNuSAC

This repository contains my implementations of the algorithms which MoNuSAC participants could use for data preparation to train their models at ISBI 2020.

Language: Jupyter Notebook - Size: 137 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 46 - Forks: 11

soumyadip007/Data-Science-Using-Python-University-Course-Module

“Data science” is just about as broad of a term as they come. It may be easiest to describe what it is by listing its more concrete components: Data exploration & analysis. Included here: Pandas; NumPy; SciPy; a helping hand from Python's Standard Library.

Language: Jupyter Notebook - Size: 34.1 MB - Last synced at: 23 days ago - Pushed at: about 5 years ago - Stars: 45 - Forks: 46

Kukuster/SumStatsRehab

GWAS summary statistics files QC tool

Language: Python - Size: 1.87 MB - Last synced at: 16 days ago - Pushed at: 4 months ago - Stars: 38 - Forks: 6

ashish-kamboj/Market-Mix-Modeling

Market Mix Modelling for an eCommerce firm to estimate the impact of various marketing levers on sales

Language: R - Size: 5.05 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 35 - Forks: 28

ELToulemonde/dataPreparation

Data preparation for data science projects.

Language: R - Size: 5.18 MB - Last synced at: about 13 hours ago - Pushed at: almost 2 years ago - Stars: 31 - Forks: 10

neuro-ml/reskit

A library for creating and curating reproducible pipelines for scientific and industrial machine learning

Language: Jupyter Notebook - Size: 36.4 MB - Last synced at: 2 days ago - Pushed at: almost 8 years ago - Stars: 27 - Forks: 7

umich-dbgroup/foofah

Foofah: programming-by-example data transformation program synthesizer

Language: CSS - Size: 4.31 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 25 - Forks: 10

abrazinskas/machine-learning-data-pipeline

Pipeline module for parallel real-time data processing for machine learning models development and production purposes.

Size: 3.41 MB - Last synced at: 9 days ago - Pushed at: over 5 years ago - Stars: 22 - Forks: 2

dataclr/dataclr

Feature selection for tabular datasets using advanced filter and wrapper methods

Language: Python - Size: 107 KB - Last synced at: 15 days ago - Pushed at: about 2 months ago - Stars: 17 - Forks: 1

salehjg/Shapenet2_Preparation

A python script to convert and down-sample mesh data into pointclouds using FPS algorithm.

Language: Python - Size: 7.81 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 16 - Forks: 0

hegongshan/Storage-for-AI-Paper

Accelerating AI Training and Inference from Storage Perspective (Must-read Papers on Storage for AI)

Size: 16.6 KB - Last synced at: about 16 hours ago - Pushed at: about 17 hours ago - Stars: 15 - Forks: 2

SagarGaniga/Data-Preprocessing

Data preprocessing is a data mining technique that involves transforming raw data into an understandable format.

Language: Jupyter Notebook - Size: 422 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 15 - Forks: 21

ksm26/Pretraining-LLMs

Master the essential steps of pretraining large language models (LLMs). Learn to create high-quality datasets, configure model architectures, execute training runs, and assess model performance for efficient and effective LLM pretraining.

Language: Jupyter Notebook - Size: 29.3 KB - Last synced at: 29 days ago - Pushed at: 9 months ago - Stars: 13 - Forks: 5

rmsandu/segmentation-eval

Extract and evaluate radiomics for liver cancer tumors from DICOM segmentation masks. Using SimpleITK, PyRadiomics and PyDicom.

Language: Python - Size: 1.44 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 5

daya6489/DriveML

Self-Drive Machine Learning Projects

Language: R - Size: 6.17 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 13 - Forks: 4

halil/sau-ml

SAU Machine Learning Training Materials

Language: Python - Size: 14.8 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 13 - Forks: 3

vzhomeexperiments/R_selflearning

Developing self learning robot

Language: R - Size: 89 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 12 - Forks: 35

Bharat-Reddy/Bank-Marketing-Analysis

The data is related to direct marketing campaigns (phone calls) of a Portuguese banking institution. The classification goal is to predict if the client will subscribe to a term deposit.

Language: Jupyter Notebook - Size: 2.34 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 12 - Forks: 9

dustin-decker/featuremill

general-purpose fast, stateless, and deterministic feature extractor written in golang for use in machine learning

Language: Go - Size: 64.5 KB - Last synced at: 23 days ago - Pushed at: about 7 years ago - Stars: 12 - Forks: 0

AiCorsair/Dataquest-Data-Science-Analysis-Projects

A repository dedicated to storing guided projects completed while learning data science concepts with Dataquest.

Language: Jupyter Notebook - Size: 74 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 11 - Forks: 3

aws-samples/sm-data-wrangler-mlops-workflows

Integrate SageMaker Data Wrangler into your MLOps workflows with Amazon SageMaker Pipelines, AWS Step Functions, and Amazon Managed Workflow for Apache Airflow (MWAA)

Language: Jupyter Notebook - Size: 2 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 2

arrahtech/osdq-desktop

The classic desktop version of osDQ

Language: Java - Size: 106 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 9 - Forks: 8

RashadGarayev/Image-ClassificationNN

Image classification with SVM and a simple neural network.

Language: Python - Size: 3.52 MB - Last synced at: 7 days ago - Pushed at: almost 5 years ago - Stars: 9 - Forks: 1

18520339/finding-similar-images

Finding similar images from image URLs using ImageHash

Language: Python - Size: 1.72 MB - Last synced at: 11 days ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 2

labrijisaad/Prediction-du-cours-de-Bourse

Forecast Apple stock prices using Python, machine learning, and time series analysis. Compare performance of four models for comprehensive analysis and prediction.

Language: Jupyter Notebook - Size: 4.74 MB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 2

CSFelix/Data-Science-Mental-Maps

🐍 Mental Maps Related to Contents in Data Science 🐍

Size: 51.8 KB - Last synced at: 27 days ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 0

dataiku/dss-plugin-timeseries-preparation

This Dataiku DSS plugin provides visual recipes to perform resampling, windowing, interval extraction, extrema extraction, and decomposition on time series data.

Language: Python - Size: 665 KB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 6 - Forks: 5

ArchAngelAries/TagScribeR

A tool to streamline AI image captioning

Language: Python - Size: 190 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 6 - Forks: 0

azeezat123/Bank-statement-Analysis

Documenting the data cleaning process on a bank statement dataset using the Python libraries NumPy and Pandas.

Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 0

SSusantAchary/Data-Annotator-for-SpaCy

🚀 SpAnnor: an easy-to-use annotator for Named Entity Recognition. The annotator allows users to quickly assign custom labels to one or more entities in the text. Easy to set up for preparing training data for SpaCy 🔥.

Language: HTML - Size: 3.99 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 1

wefindx/metaform

A utility for defining metadata for data types and formats.

Language: Python - Size: 2.15 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 0

HROlive/From-Data-to-Insights-with-Google-Cloud-Platform

Four-course accelerated online specialization teaches course participants how to derive insights through data analysis and visualization using the Google Cloud Platform

Language: Jupyter Notebook - Size: 715 KB - Last synced at: 6 days ago - Pushed at: almost 6 years ago - Stars: 5 - Forks: 3

bharatsdev/production-ready-model

Make machine learning application production ready

Language: Python - Size: 143 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 6

kozodoi/dptools

Python package with utilities for data processing, aggregation, feature engineering and data versioning

Language: Python - Size: 108 KB - Last synced at: 13 days ago - Pushed at: about 3 years ago - Stars: 4 - Forks: 2

kbelisar/datalark

Like the mudlark finding treasures on the foreshore, the datalark seeks treasures hidden within messy data!

Language: R - Size: 32.2 KB - Last synced at: 13 days ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 0

danielhaake/covid19-monitor-germany

The COVID-19 Monitor Germany is an interactive dashboard that gives a better overview of the pandemic situation in Germany. It provides a multitude of plots and daily calculated figures. The data come from official sources: the Robert-Koch-Institut (RKI) and the Intensivregister.

Language: Python - Size: 39 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 2

Ashleshk/Power-BI-A-Z-Hands-On-Power-BI-Training-For-Data-Science-Udemy

Learn data visualization through Microsoft Power BI and create opportunities for you or key decision makers to discover data patterns such as customer purchase behavior, sales trends, or production bottlenecks. You'll learn all of the features in Power BI that allow you to explore, experiment with, fix, prepare, and present data easily, quickly, and beautifully.

Size: 5.82 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

konradmalik/ann-laminar-burning-velocity 📦

Models trained in my article on LBV predictions.

Language: C - Size: 3.91 KB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 4 - Forks: 0

nragland37/Event-Optimization-Tool

R-based Shiny application that maps availability and identifies optimal engagement times to enhance participation within an organization

Language: R - Size: 32.7 MB - Last synced at: 12 days ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

alirezaniki/DPSA

A GUI-based seismic data processing and source analysis app leveraging KIWI tools and Pyrocko package.

Language: Shell - Size: 101 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

ahmadtc1/datasetBuilder

🗂 Simple and convenient dataset generation at the press of a key

Language: Python - Size: 13 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

hrtnisri2016/celestial-bodies-database

This is one of the required projects to earn the Relational Databases certification from freeCodeCamp. For this project, I built a database of celestial bodies using PostgreSQL.

Language: Jupyter Notebook - Size: 729 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 4

Arvindhh931/Mileage-prediction

Fuel Efficiency of car in miles per gallon

Language: Jupyter Notebook - Size: 3.05 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 2

nisheethjaiswal/Data-Annotator-for-SpaCy

🚀 SpAnnor: an easy-to-use annotator for Named Entity Recognition. The annotator allows users to quickly assign custom labels to one or more entities in the text. Easy to set up for preparing training data for SpaCy 🔥.

Language: HTML - Size: 3.71 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

ved93/ml-express

A Python library for day to day data analysis and machine learning. This aims to make data building, cleaning and machine learning much much faster. A library of extension and helper modules for Python's data analysis and machine learning libraries.

Language: Python - Size: 68.4 KB - Last synced at: 20 days ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

ChaitanyaC22/Investment-Analysis-for-an-Asset-Management-Company

Data analysis to identify the best sectors, countries, and a suitable investment type for making investments.

Language: Jupyter Notebook - Size: 6.78 MB - Last synced at: 30 days ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 0

KwokHing/Exploratory-Data-Analysis-on-SMRT-Tweets

Demo of exploratory data analysis (EDA) on train service disruptions, based on scraped (user-generated content) tweets from the train operator's (SMRT) Twitter account

Language: Jupyter Notebook - Size: 1.25 MB - Last synced at: 22 days ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 4

jranaraki/NCBIdataPrep

An R code to convert NCBI data files into CSV

Language: R - Size: 25.4 KB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 1

imarranz/data-science-workflow-management

This repository is a collection of code, documentation, and other resources that support the management and automation of a Data Science project.

Language: Makefile - Size: 42 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 2 - Forks: 0

carpentries-incubator/rna-seq-data-for-ml

RNA-Seq: Data Readiness for Machine Learning Applications

Language: R - Size: 47.5 MB - Last synced at: 23 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 2

shettyvarshaa/ML-LAB

Machine Learning Lab Programs in the curriculum

Language: Python - Size: 749 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 2 - Forks: 1

furk4neg3/Sales-Forecasting

Created AI models to forecast Walmart's sales. Used different models, such as dense, LSTM, GRU and a naive model. Different window and horizon sizes are used too. Compared models visually at the end.

Language: Jupyter Notebook - Size: 448 KB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

Muneeb1030/FineTune-Tiny-Llama

Fine-tuning the Tiny Llama model to mimic my professor's writing style using the Llama Factory. The project involves data collection, preprocessing, preparation, fine-tuning, and evaluation.

Language: Jupyter Notebook - Size: 390 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

OgeAno/Hotel-KPI-Analysis

An analysis of some trends in hotel KPIs

Language: Jupyter Notebook - Size: 257 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

dmandache/sleek-patch

Python 3 Package for optimally sampling big images with texture-aware patchification based on SLIC superpixels. So Sleek !

Language: Python - Size: 30.1 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

lahmacunradio/analytics

Utils for analytics

Language: Python - Size: 176 KB - Last synced at: 5 days ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

OgeAno/Online-Sales-Analysis

An analysis of a webshop's sales over a 2-year period

Language: Jupyter Notebook - Size: 405 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

to-schi/ASR-Deepspeech2-Tensorflow

An end-to-end speech recognition engine similar to DeepSpeech2

Language: Jupyter Notebook - Size: 2.19 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

Subhrajit91939/Preprocessing-CLI

Data Pre-processing CLI⚡- Command Line Interface python app to automate data pre-processing

Size: 24.4 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

mzguntalan/vegetable

Vegetable contains a design/definition of a Vector Graphic that allows it to be easily rendered as an equally spaced point cloud/sequence. From this, vegetable offers a way to read .ttf font files and render their glyphs into point clouds/sequences.

Language: Python - Size: 1.45 MB - Last synced at: 30 days ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

Data-Wrangling-with-JavaScript/Chapter-6

Code examples for Chapter 6 of Data Wrangling with JavaScript

Language: JavaScript - Size: 154 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

phueb/Preppy

prepare ordered language data for RNN training

Language: Python - Size: 147 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

rai-harshit/simple_image_labeler

Labeling tool for Image Classification tasks.

Language: Python - Size: 3.91 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 0

ashishpatel26/Audio-Classification-Data-Preparation

Dynamic Data-set preparation for audio, video, images

Language: Jupyter Notebook - Size: 73.3 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 2

gaoisbest/Pytorch_notes_and_projects

Pytorch notes and projects

Language: Jupyter Notebook - Size: 101 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

alkashef/cleaning-excel-data

Tidying and cleaning data in Excel sheets

Size: 2.93 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 2 - Forks: 0

HaivuUK/lua-regression

A lualatex package for adding different polynomial regressions to graphs. Additionally calculates R Squared and confidence intervals.

Language: TeX - Size: 1.1 MB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

sergezaugg/xeno_canto_organizer

A python tool to prepare Xeno-Canto audio files for machine learning projects

Language: Python - Size: 1.75 MB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 1 - Forks: 0

arkapatra31/ML

Learning and Implementation of my Machine Learning Journey

Language: Jupyter Notebook - Size: 1.84 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

RezaMoammadi/Book-Data-Science

If you're eager to explore data science, data analysis, and machine learning, 'Uncovering Data Science with R' is the perfect starting point. This book offers a clear, hands-on introduction to the field, requiring no prior experience in analytics or programming.

Language: HTML - Size: 103 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

manishdevdi/Instacart-Market-Basket-Analysis

The objective of this project is to analyze the 3 million grocery orders from more than 200,000 Instacart users and predict which previously purchased item will be in a user's next order. Customer segmentation and affinity analysis are done to study customer purchase patterns and for better product marketing and cross-selling.

Language: Jupyter Notebook - Size: 6.59 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

francesco-pastori/effects-of-data-preparation-on-algorithms

Analyzing the effect of data preparation on different algorithms by introducing different problems into the dataset

Language: Jupyter Notebook - Size: 8.73 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

ndomah/The-Data-Engineering-Academy

Materials from The Data Engineering Academy

Size: 18.5 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Gracysapra/R-in-data-Science

This repository contains essential guides for data analysis using R, covering topics like data preparation, data reshaping, and data visualization. Each file focuses on fundamental techniques to manipulate, clean, and visualize data effectively using R programming.

Language: Jupyter Notebook - Size: 40 KB - Last synced at: 23 days ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

M-Fatoni/Improving-Employee-Retention-by-Predicting-Employee-Attrition-Using-Machine-Learning

This project aims to leverage machine learning techniques to predict employee attrition, allowing organizations to identify at-risk employees and implement strategies to improve retention rates.

Language: Jupyter Notebook - Size: 1000 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

M-Fatoni/Predict-Clicked-Ads-Customer-Classification-by-using-Machine-Learning

This project aims to classify customers who are likely to click on ads using machine learning techniques. By predicting customer behavior, businesses can optimize their ad targeting strategies, resulting in improved ad performance and increased return on investment (ROI).

Language: Jupyter Notebook - Size: 1.35 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

M-Fatoni/Predict-Customer-Personality-to-boost-marketing-campaign-by-using-Machine-Learning

This project aims to enhance marketing campaign effectiveness by predicting customer personalities using machine learning techniques. By understanding customer personality traits, businesses can tailor their marketing strategies to better meet the needs and preferences of their target audience.

Language: Jupyter Notebook - Size: 2.45 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

rubypoddar/HappinessScoreVisualizer

visualizing and analyzing global happiness scores using data visualization techniques and statistical tests.

Language: Jupyter Notebook - Size: 779 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

hitesh22rana/sourcecollector

A simple tool to consolidate multiple files into a single .txt file. Perfect for feeding your files to AI tools without any fuss.

Language: Go - Size: 10.7 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

aditijoshi613/Brazilian-E-commerce-Analytics

Analytics for a leading Brazilian E-commerce firm, Olist Store

Size: 41.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 2

victorantoniassi/jr_analytics_engineer_practical_test

My solution to a practical test for a Junior Analytics Engineer position

Language: Python - Size: 34 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

parsa-abbasi/Data-Preparation-and-Visualization-in-Python

Data Preparation and Visualization in Python

Language: Jupyter Notebook - Size: 1.53 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

srikanth-gedela/SriMLModels

Quick reference on various aspects of machine learning that I have come across, and my Machine Learning portfolio.

Language: Jupyter Notebook - Size: 66.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

MiladNooraei/Quera-Superstore (fork of FarzanehSoltanzadeh/Quera-Superstore)

Conducted data pre-processing, optimized data warehousing, applied statistical analysis and machine learning techniques, and created visually compelling Power BI visualizations to derive valuable insights for informed decision-making.

Language: Jupyter Notebook - Size: 22.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

ejay34/06_recovery_of_gold

Using raw data on the extraction and purification parameters of gold-bearing ore, build a prototype model that predicts the gold recovery coefficient from the ore with the best sMAPE metric.

Language: Jupyter Notebook - Size: 416 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

Arckitechttt/Data-Preprocessing-Projects

Perform Data Preprocessing including “Handling Missing Values”, “Handling Outliers”, “Handling Irrelevant Data”, “Handling Imbalanced Dataset”, “Handling Unstandardized Data”, and “Feature Selection based on Features Reduction algorithms and Features Correlation method”.

Language: Jupyter Notebook - Size: 212 KB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

Benazir023/BookReviewAnalysis_efficient_workflow

This is a Dataquest project that focuses on creating an efficient workflow

Size: 318 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

MouhtaramSoufiane/Projets-Machine-Learning

This repository contains two projects: the first applies an ML algorithm (logistic regression) for classification on the Titanic dataset, both from scratch and with scikit-learn; the second analyzes this data (understanding data, data preprocessing).

Language: Jupyter Notebook - Size: 1.28 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

Related Topics
machine-learning 78 python 78 data-preprocessing 73 data-science 67 data-analysis 66 data-visualization 60 data-cleaning 52 pandas 34 exploratory-data-analysis 30 feature-engineering 22 deep-learning 22 classification 19 numpy 18 data 18 data-wrangling 17 sql 16 matplotlib 16 data-processing 15 python3 15 logistic-regression 14 r 14 seaborn 14 eda 13 scikit-learn 12 machine-learning-algorithms 11 random-forest 10 tableau 10 linear-regression 9 clustering 9 tensorflow 9 regression 9 jupyter-notebook 9 predictive-modeling 9 data-mining 8 data-analytics 8 data-manipulation 8 statistics 8 image-processing 7 neural-network 7 feature-selection 7 dataset 7 nlp 7 excel 7 data-cleansing 7 visualization 7 neural-networks 6 data-engineering 6 opencv 6 feature-extraction 6 statistical-analysis 6 artificial-intelligence 6 data-transformation 6 data-collection 6 datasets 5 data-exploration 5 text-processing 5 docker 5 pca 5 dashboard 5 data-quality 5 supervised-learning 5 keras 5 plotly 5 data-visualisation 5 time-series-analysis 5 powerbi 4 random-forest-classifier 4 pytorch 4 large-language-models 4 data-prep 4 decision-tree-classifier 4 svm-classifier 4 mysql 4 natural-language-processing 4 analysis 4 missing-values 4 analytics 4 image-classification 4 train-test-split 4 pipeline 4 sklearn 4 model-training-and-evaluation 4 streamlit 4 preprocessing 4 sentiment-analysis 4 computer-vision 4 web-scraping 4 data-normalization 4 decission-tree 4 hypothesis-testing 4 ml 4 deep-neural-networks 4 data-modeling 4 named-entity-recognition 4 webscraping 3 sklearn-library 3 bioinformatics 3 spark 3 data-insights 3 selenium 3