Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-preparation

IBM/data-prep-kit

Open source project for data preparation for LLM application builders

Language: Python - Size: 128 MB - Last synced: about 11 hours ago - Pushed: about 12 hours ago - Stars: 9 - Forks: 10

carpentries-incubator/rna-seq-data-for-ml

RNA-Seq: Data Readiness for Machine Learning Applications

Language: R - Size: 47.5 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 2 - Forks: 3

ArchAngelAries/TagScribeR

A tool to streamline AI image captioning

Language: Python - Size: 91.8 KB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 5 - Forks: 0

mohawk2/data-prepare

Module to prepare CSV (etc) data for automatic processing

Language: Perl - Size: 170 KB - Last synced: 7 days ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

Putriarrum/Predict-Customer-Personality-to-Boost-Up-Marketing-Campaign-Performance

This is my personal project on marketing campaigns, using a dataset from a big tech company provided by Rakamin Academy. I built a clustering model with Python (Sklearn) to find the best model for the dataset, which can be used to plan the company's next marketing strategy.

Language: Jupyter Notebook - Size: 2.48 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 0 - Forks: 0

developmentseed/label-maker

Data Preparation for Satellite Machine Learning

Language: Python - Size: 18.8 MB - Last synced: 9 days ago - Pushed: 8 months ago - Stars: 454 - Forks: 111

skrub-data/skrub

Prepping tables for machine learning

Language: Python - Size: 7.95 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 1,012 - Forks: 87

cevheryilmaz/Honey_Production_in_the_USA_in_Machine_Learning

Language: Jupyter Notebook - Size: 7.81 KB - Last synced: 13 days ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

DataRish/MBTI-Personality-Predictor

This project predicts MBTI personality types from users' recent 50 posts using NLP and ML techniques.

Language: Jupyter Notebook - Size: 24.3 MB - Last synced: 16 days ago - Pushed: 17 days ago - Stars: 0 - Forks: 0

Chan-dre-yi/industry-4.0-exploratory-data-analysis

An exploratory data analysis of an Industry 4.0 dataset uncovered insights indicating that Business Intelligence and IoT systems will have the greatest impact in the field over the next decade.

Language: MATLAB - Size: 1.32 MB - Last synced: 18 days ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0

aditijoshi613/Brazilian-E-commerce-Analytics

Analytics for a leading Brazilian E-commerce firm, Olist Store

Size: 41.5 MB - Last synced: 22 days ago - Pushed: 22 days ago - Stars: 1 - Forks: 2

SerhatDerya/medical_examination_research

This repository contains research on medical examinations (such as body measurements, results from various blood tests, and lifestyle choices).

Language: Jupyter Notebook - Size: 1.08 MB - Last synced: 23 days ago - Pushed: about 2 years ago - Stars: 1 - Forks: 0

kakarot11/Logistic_Regression_NeuralNetwork

Multiple models for binary classification and checking the accuracy with each model.

Language: Jupyter Notebook - Size: 3.5 MB - Last synced: 27 days ago - Pushed: 28 days ago - Stars: 0 - Forks: 0

Arckitechttt/Data-Preprocessing-Projects

Perform Data Preprocessing including “Handling Missing Values”, “Handling Outliers”, “Handling Irrelevant Data”, “Handling Imbalanced Dataset”, “Handling Unstandardized Data”, and “Feature Selection based on Features Reduction algorithms and Features Correlation method”.

Language: Jupyter Notebook - Size: 212 KB - Last synced: about 1 month ago - Pushed: 12 months ago - Stars: 1 - Forks: 0

hi-primus/optimus

🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

Language: Python - Size: 110 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1,441 - Forks: 233

Ganeshkarwa/Diwali-Sales-Analysis-Project-

Diwali-Sales-Analysis-Project

Language: Jupyter Notebook - Size: 893 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

kozodoi/dptools

Python package with utilities for data processing, aggregation, feature engineering and data versioning

Language: Python - Size: 108 KB - Last synced: 29 days ago - Pushed: about 2 years ago - Stars: 3 - Forks: 2

Ogefest/refinator-site

Public repo for the refinator.xyz website. My new project, a no-code tool for working with messy data

Language: HTML - Size: 7.73 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

brprojects/MLS_model

In this project I predict the 2016 MLS season using historical data and Poisson regression. The project includes cleaning, preprocessing and analyzing the dataset, building and evaluating predictive models for match outcomes, forecasting team performance and simulating the league table. It uses the Pandas, NumPy, Matplotlib and statsmodels libraries.

Language: Python - Size: 1.35 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

victorantoniassi/jr_analytics_engineer_practical_test

My solution to a practical test for a Junior Analytics Engineer position

Language: Python - Size: 34 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 1 - Forks: 0

salehjg/Shapenet2_Preparation

A Python script to convert and down-sample mesh data into point clouds using the FPS algorithm.

Language: Python - Size: 7.81 KB - Last synced: about 2 months ago - Pushed: almost 3 years ago - Stars: 13 - Forks: 0

asavinov/prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

Language: Python - Size: 1.95 MB - Last synced: 2 days ago - Pushed: over 2 years ago - Stars: 89 - Forks: 4

bodybuilders-team/ist-meic-cd-g03

Data Science project of group 03 - MEIC @ IST 2023/2024.

Language: Python - Size: 146 MB - Last synced: 28 days ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

mzguntalan/vegetable

Vegetable contains a design/definition of a Vector Graphic that allows it to be easily rendered as an equally spaced point cloud/sequence. From this, vegetable offers a way to read .ttf font files and render their glyphs into point clouds/sequences.

Language: Python - Size: 1.45 MB - Last synced: 2 months ago - Pushed: almost 2 years ago - Stars: 2 - Forks: 1

ArthurSrz/Introduction-aux-Interactions-Homme-Donn-es (fork of microsoft/Data-Science-For-Beginners)

A course for learning to build human-data interactions

Language: Jupyter Notebook - Size: 79.3 MB - Last synced: 2 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

piebro/simple-image-classification-labeling-website

A simple website to label images for classification locally.

Language: HTML - Size: 5.86 KB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

sbcgua/mockup_loader

ABAP unit testing framework: prepare in Excel, reuse in ABAP code

Language: ABAP - Size: 789 KB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 62 - Forks: 16

lprtk/pyTCTK

Python Text Cleaning ToolKit library (pyTCTK)

Language: Python - Size: 21.5 KB - Last synced: 3 months ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0

damaniayesh/KPMG_FORAGE_JOB_SIMULATIONS

This project advises a client on customer targeting, working with the Data, Analytics & Modelling team. Assessed data quality and completeness in preparation for analysis, then analysed the data to target high-value customers based on demographics and attributes.

Language: Jupyter Notebook - Size: 6.45 MB - Last synced: 3 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

Cyrill98/Extract-Invoice-PDF-file-to-CSV

Language: Jupyter Notebook - Size: 266 KB - Last synced: 3 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

ELToulemonde/dataPreparation

Data preparation for data science projects.

Language: R - Size: 5.18 MB - Last synced: 21 days ago - Pushed: 11 months ago - Stars: 31 - Forks: 10

ka00ri/sumIT

Computes the sum or difference of two digits, given two images and an operation to perform +/-.

Language: Python - Size: 14.6 KB - Last synced: 3 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

dmandache/sleek-patch

Python 3 Package for optimally sampling big images with texture-aware patchification based on SLIC superpixels. So Sleek !

Language: Python - Size: 30.1 MB - Last synced: about 2 months ago - Pushed: 4 months ago - Stars: 2 - Forks: 0

ArtemKornev0/Data_preparation-resume_analysis

Data preparation (resume analysis from HeadHunter)

Language: Jupyter Notebook - Size: 2.37 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

Cloud-SPAN/02genomics

Data preparation and organisation

Language: Python - Size: 51.2 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

oliverweissl/KnowledgeAndData-Project

Visualisation of codon usage for species in the NCBI Taxonomy.

Size: 55.9 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

iTechArt/convtools-ita

convtools is a python library to declaratively define conversions for processing collections, doing complex aggregations and joins.

Language: Python - Size: 332 KB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 183 - Forks: 11

OgeAno/HR--Employee-Turnover-Analysis

An analysis of employee turnover for a given 12-month period

Language: Jupyter Notebook - Size: 187 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

itzhak0estrella/Energy_Data_Analytics_GNN

Undergraduate research project funded by the ECE Next Program. Carried out with Professor Hao Zhu and my graduate mentors Shaohui Liu and Young-ho Cho.

Language: Jupyter Notebook - Size: 1.27 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

AnvithaChaluvadi/Whale-Analysis_Module4Challenge

In this assignment, I'll get to use what I've learned this week to evaluate the performance among various algorithmic, hedge, and mutual fund portfolios and compare them against the S&P 500 Index.

Language: Jupyter Notebook - Size: 6.67 MB - Last synced: 4 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

mpokojovy/COVID.LOS.prep

Time-to-Event Modeling for Hospital Length of Stay Prediction for COVID-19 Patients: Data Preparation

Language: R - Size: 7.81 KB - Last synced: 5 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

FarhanaTeli/Factors-Influencing-US-Home-Prices

Using publicly available data on the national factors that affect the supply of and demand for homes in the US, build a data science model to study the effect of these variables on home prices.

Language: Jupyter Notebook - Size: 4.08 MB - Last synced: 4 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

SanaeSaccomano/Intelligence-Artificielle

Summary of my artificial intelligence projects

Language: Jupyter Notebook - Size: 2.89 MB - Last synced: 6 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

parsa-abbasi/Data-Preparation-and-Visualization-in-Python

Data Preparation and Visualization in Python

Language: Jupyter Notebook - Size: 1.53 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 1 - Forks: 1

PacktWorkshops/The-Data-Science-Workshop

A New, Interactive Approach to Learning Data Science

Language: Jupyter Notebook - Size: 169 MB - Last synced: 6 months ago - Pushed: over 1 year ago - Stars: 179 - Forks: 195

tirthgala/Automation-of-Operations-for-Zola

This repository contains my work on VBA macros while working in the e-commerce department of an Indian fashion brand called Zola.

Language: HTML - Size: 14.1 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

tirthgala/Data-Science-For-Business

This repository contains my coursework from the Data Science for Business course, taken while pursuing my Master's in Quantitative Management: Business Analytics program at Fuqua School of Business

Language: R - Size: 5.92 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

hi-primus/bumblebee

🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)

Language: Vue - Size: 23 MB - Last synced: 6 months ago - Pushed: 10 months ago - Stars: 130 - Forks: 34

nisheethjaiswal/Data-Annotator-for-SpaCy

🚀 SpAnnor: an easy-to-use annotator tool for Named Entity Recognition. The annotator allows users to quickly assign custom labels to one or more entities in the text. Easy to set up for SpaCy data training 🔥.

Language: HTML - Size: 3.71 MB - Last synced: 3 months ago - Pushed: over 2 years ago - Stars: 3 - Forks: 1

RosanaFSS/Data-Visualization-Nanodegree

Data Visualization Nanodegree

Size: 5.36 MB - Last synced: 7 months ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

pablo14/data-science-live-book

An open source book to learn data science, data analysis and machine learning, suitable for all ages!

Language: TeX - Size: 58.4 MB - Last synced: 7 months ago - Pushed: over 4 years ago - Stars: 215 - Forks: 106

georgezoto/Tableau-Advanced

Udemy's Tableau 10 Advanced Training: Master Tableau in Data Science. Harness the power of your data. Unleash the potential of your team. Learn data visualization through Tableau and create opportunities for you or key decision makers to discover data patterns such as customer purchase behavior, sales trends, or production bottlenecks.

Size: 742 KB - Last synced: 7 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 3

nischaybikramthapa/Physical-Activity-Recognition

Can we predict what a person is doing based on their movements?

Language: Jupyter Notebook - Size: 23.4 MB - Last synced: 7 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

franc136/2022_Cyclistic_Case_Study

A case study analyzing 2022 bicycle rideshare data, to identify trends in rider behavior.

Size: 3.39 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

lemarigo/PortifolioProjects-Data-Prep-and-Machine-Learning

Folder contains python scripts and reports around Data Preparation and Machine Learning implementation.

Language: Jupyter Notebook - Size: 37.8 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

whwu95/MVFNet

【AAAI'2021】MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Language: Python - Size: 20.3 MB - Last synced: 6 months ago - Pushed: about 2 years ago - Stars: 136 - Forks: 12

vzhomeexperiments/R_selflearning

Developing self learning robot

Language: R - Size: 89 MB - Last synced: 4 months ago - Pushed: about 3 years ago - Stars: 12 - Forks: 35

WaliUllahbaig/OCR-with-VisionEncoderDecoder-Model

The project focuses on building an OCR system using state-of-the-art deep learning models, specifically VisionEncoderDecoder models, which have demonstrated impressive performance in various computer vision tasks.

Language: Jupyter Notebook - Size: 386 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

WaliUllahbaig/Exploring-Hyperparameters-and-Weight-Initializations-in-Neural-Networks

This project delves into artificial neural networks, using Python and Keras, to build and analyze these networks. Neural networks are computational models inspired by the human brain, consisting of interconnected nodes (neurons) that process information.

Language: Jupyter Notebook - Size: 1.24 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

bayhaqy/Data-Preparation-Analysis-Mico

A simple way to do data preparation and analysis with Miro

Language: Python - Size: 3.91 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

sdixit5/Analysis-of-Barack-Obama-s-Presidency

This project compares the effective minimum wage and the unemployment rate, statistically and analytically, at the start and end of former President Barack Obama's term.

Language: Jupyter Notebook - Size: 3.54 MB - Last synced: 8 months ago - Pushed: almost 3 years ago - Stars: 1 - Forks: 0

nilot-pal/Membrane-permeability-using-ML

Source code for "Prediction of Membrane Permeability of Molecules Using Machine Learning"

Language: Python - Size: 4.42 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

OpsDataHub/data_analytics_portfolio

Portfolio containing projects to showcase data skills

Language: HTML - Size: 2.52 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 1

SamuelBarbosaDev/Roof_Imoveis_Data_Analysis

The company hired you because they want to know what would be the 5 properties they should invest in and why, and which 5 you would not recommend investing in at all.

Language: Jupyter Notebook - Size: 4.31 MB - Last synced: 19 days ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

SamuelBarbosaDev/Walrmart_Data_Analysis

You have been hired by Walmart to survey the revenue of their stores in the USA and point out which store would be best to expand its size. It is necessary to analyze the weekly sales of each store, calculate some important information that will be asked, and at the end of it all, indicate which store should be invested in.

Language: Jupyter Notebook - Size: 2.94 MB - Last synced: 19 days ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

RashadGarayev/Image-ClassificationNN

Image classification with an SVM and a simple neural network.

Language: Python - Size: 3.52 MB - Last synced: 8 months ago - Pushed: about 4 years ago - Stars: 8 - Forks: 1

martamanevska/Big-Data-Kaggle-Dataset-Project

Finding insights for further marketing decisions using a dataset covering order status, price, payment and freight performance as well as customer location, products and reviews. According to the dataset's description on Kaggle, the data used in this project refers to orders made at multiple marketplaces in Brazil.

Size: 20.1 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

noernimat/data_preparation_covid19_dataset

Data preparation covid19 dataset for Machine Learning Model

Language: Jupyter Notebook - Size: 5.45 MB - Last synced: 9 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 1

dartwinshu/rakamin-digital-festival-data-science

Data Science course by Rakamin Academy

Language: Jupyter Notebook - Size: 448 KB - Last synced: 9 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

dartwinshu/revou-mini-couse-data-analytics

Data analytics course by RevoU

Size: 7.81 KB - Last synced: 9 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

ghulam-ahmad-1/Movie_Recommendation_system

Movie Recommendation System

Language: Jupyter Notebook - Size: 225 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

KwokHing/Exploratory-Data-Analysis-on-SMRT-Tweets

Demo of exploratory data analysis (EDA) on train service disruptions, based on scraped (user-generated content) tweets from the train operator's (SMRT) Twitter account

Language: Jupyter Notebook - Size: 1.25 MB - Last synced: 9 months ago - Pushed: over 4 years ago - Stars: 3 - Forks: 4

MiladNooraei/Quera-Superstore (fork of FarzanehSoltanzadeh/Quera-Superstore)

Conducted data pre-processing, optimized data warehousing, applied statistical analysis and machine learning techniques, and created visually compelling Power BI visualizations to derive valuable insights for informed decision-making.

Language: Jupyter Notebook - Size: 22.8 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 1 - Forks: 0

E7su/hypno

Data analysis with pandas, numpy, scikit-learn

Language: Python - Size: 2.85 MB - Last synced: 10 months ago - Pushed: over 7 years ago - Stars: 1 - Forks: 0

viniavskyi-ostap/recommender_systems

Implementation of different approaches to recommendation on Amazon Review dataset

Language: Jupyter Notebook - Size: 7.28 MB - Last synced: 10 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

halil/sau-ml

SAU Machine Learning Training Materials

Language: Python - Size: 14.8 MB - Last synced: 10 months ago - Pushed: about 6 years ago - Stars: 13 - Forks: 3

neuro-ml/reskit

A library for creating and curating reproducible pipelines for scientific and industrial machine learning

Language: Jupyter Notebook - Size: 36.4 MB - Last synced: 10 months ago - Pushed: almost 7 years ago - Stars: 27 - Forks: 6

sadnanMohosin/Data-Science-Machine-Learning-Literacy

The purpose of this repository is to learn the underlying theory and concepts of DL/ML, from data preparation to feeding the prepared data to models.

Size: 2.96 MB - Last synced: 10 months ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0

mymickiewicz/data-preprocessor

A data preprocessing tool for `MyMickiewicz`.

Language: TypeScript - Size: 40 KB - Last synced: 10 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

RodrigoSdeCarvalho/rsEasyML

Rust version of my machine learning framework that provides data preprocessing, feature selection, classification, regression and even more complex deep learning models, model persistence, autoencoders and anomaly detection

Language: Rust - Size: 3.91 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

CSFelix/Data-Science-Mental-Maps

🐍 Mental Maps Related to Contents in Data Science 🐍

Size: 51.8 KB - Last synced: 14 days ago - Pushed: over 1 year ago - Stars: 7 - Forks: 0

doratako/Data-Quality-Assurance

Data validation and data cleansing

Language: Jupyter Notebook - Size: 54.7 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

virchan/predictive_modeling_workflow

This project explores the predictive modeling workflow using the Kaggle competition "Titanic - Machine Learning from Disaster." It emphasizes key stages like data analysis and model evaluation, aiming to identify the optimal model. Through a real-world approach, we enhance our understanding of the workflow and emphasize rigorous model evaluation.

Language: Jupyter Notebook - Size: 2.48 MB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

Sof-AI/fanfiction_project

A passion project focused on analyzing my own reading lists & fanworks hosted on Archive Of Our Own!

Language: Jupyter Notebook - Size: 48.4 MB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

LucienCastle/loan-delinquency-prediction

Predicts whether a customer will become delinquent using ML classification models

Language: Jupyter Notebook - Size: 6.28 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

lucoliv23/KC-Roasters-Classification-

Language: Jupyter Notebook - Size: 0 Bytes - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

lucoliv23/Celestial-Object-Detection

Language: Jupyter Notebook - Size: 1.16 MB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

lucoliv23/Genomic-Data-Clustering

Language: Jupyter Notebook - Size: 0 Bytes - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

ejay34/01_real_estate_market

Using data from the Yandex.Realty service, determine the market value of real estate properties and the typical parameters of apartments

Language: Jupyter Notebook - Size: 581 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

ejay34/06_recovery_of_gold

Based on raw data with the parameters of gold ore mining and purification, build a prototype model that predicts the gold recovery rate from gold-bearing ore with the best sMAPE metric.

Language: Jupyter Notebook - Size: 416 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 1 - Forks: 0

ejay34/05_location_for_the_well

Based on geological exploration data, build models to forecast oil well reserves for several regions, and select a region for development with an acceptable break-even risk threshold and the most promising resources.

Language: Jupyter Notebook - Size: 230 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

ejay34/04_churn_forecast

Based on customer behavior data, build a classification model with the highest possible F1 score that identifies customers prone to churn.

Language: Jupyter Notebook - Size: 135 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

ejay34/03_recommendations_tariff_plan

Based on customer behavior data, build a classification model with the highest possible accuracy that recommends a suitable tariff plan.

Language: Jupyter Notebook - Size: 11.7 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

ejay34/02_computer_games_sales

Using historical data on computer game sales, user and expert ratings, genres and platforms, identify the patterns that determine a game's success.

Language: Jupyter Notebook - Size: 386 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

GeeKboy2/Data_preparation_4_ML_algorithm

This project focuses on data preparation and follows these steps: data cleaning, handling text and categorical attributes, and feature scaling.

Language: Jupyter Notebook - Size: 1.65 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

lahmacunradio/analytics

Utils for analytics

Language: Python - Size: 176 KB - Last synced: 6 months ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

konradmalik/ann-laminar-burning-velocity 📦

Models trained in my article on LBV predictions.

Language: C - Size: 3.91 KB - Last synced: 4 months ago - Pushed: almost 4 years ago - Stars: 4 - Forks: 0

Benazir023/BookReviewAnalysis_efficient_workflow

This is a Dataquest project that focuses on creating an efficient workflow

Size: 318 KB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 1 - Forks: 0

MouhtaramSoufiane/Projets-Machine-Learning

This repository contains two projects: the first applies an ML algorithm (logistic regression) for classification on the Titanic dataset, both from scratch and with scikit-learn; the second analyzes the same data (understanding the data and data preprocessing).

Language: Jupyter Notebook - Size: 1.28 MB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 1 - Forks: 0

hrtnisri2016/celestial-bodies-database

This is one of the required projects to earn the Relational Databases certification from freeCodeCamp. For this project, I built a database of celestial bodies using PostgreSQL.

Language: Jupyter Notebook - Size: 729 KB - Last synced: 12 months ago - Pushed: over 1 year ago - Stars: 3 - Forks: 4

Related Keywords
data-preparation 243, python 61, machine-learning 59, data-preprocessing 58, data-science 55, data-analysis 49, data-visualization 39, data-cleaning 35, pandas 26, classification 17, exploratory-data-analysis 16, feature-engineering 16, deep-learning 16, python3 14, logistic-regression 14, data-wrangling 12, r 11, numpy 11, data-processing 11, data 10, eda 10, matplotlib 10, random-forest 9, scikit-learn 9, tableau 8, machine-learning-algorithms 8, sql 8, image-processing 7, data-mining 7, predictive-modeling 7, linear-regression 7, data-cleansing 7, seaborn 7, dataset 6, jupyter-notebook 6, regression 6, nlp 6, datasets 5, neural-network 5, tensorflow 5, artificial-intelligence 5, feature-selection 5, data-manipulation 5, time-series-analysis 5, feature-extraction 5, data-collection 5, opencv 5, data-transformation 5, clustering 5, visualization 4, pca 4, neural-networks 4, hypothesis-testing 4, data-analytics 4, statistical-analysis 4, image-classification 4, computer-vision 4, decision-tree-classifier 4, dashboard 4, ml 4, preprocessing 4, decission-tree 4, keras 4, missing-values 4, sentiment-analysis 4, natural-language-processing 4, data-modeling 4, statistics 4, normalization 3, web-scraping 3, data-exploration 3, supervised-learning 3, data-profiling 3, data-engineering 3, pipeline 3, selenium 3, deep-neural-networks 3, unit-testing 3, cnn-classification 3, excel 3, pytorch 3, streamlit 3, bioinformatics 3, modeling 3, svm-classifier 3, data-visualisation 3, sklearn 3, plotly 3, datascience 3, outliers 3, business-intelligence 3, data-augmentation 3, exploratory-data-visualizations 3, named-entity-recognition 3, kaggle-competition 3, correlation 2, spacy-nlp 2, text-mining 2, sampling 2, data-annotation-tools 2