Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-cleaning

FAIZANTKHAN/Regression-Project-Bangalore-Property-Price-Prediction

🏠 Bangalore Property Price Prediction is a comprehensive project designed to accurately predict property prices in Bangalore. Leveraging advanced regression techniques and a dataset sourced from Kaggle, the model undergoes meticulous feature engineering, data cleaning, and parameter tuning to ensure high accuracy.

Language: Python - Size: 527 KB - Last synced: about 4 hours ago - Pushed: about 5 hours ago - Stars: 0 - Forks: 1

psrc/travel-studies

Repo for PSRC's Regional Travel Studies, 2014 onward

Language: HTML - Size: 747 MB - Last synced: about 4 hours ago - Pushed: about 5 hours ago - Stars: 6 - Forks: 1

mohawk2/data-prepare

Module to prepare CSV (etc) data for automatic processing

Language: Perl - Size: 170 KB - Last synced: about 7 hours ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

roniantoniius/Scraping-Sneakers-with-BeautifulSoup

Language: Jupyter Notebook - Size: 25.4 KB - Last synced: about 10 hours ago - Pushed: about 11 hours ago - Stars: 1 - Forks: 0

roniantoniius/Analyze-House-Price-Scrapping-Tableau

Language: Jupyter Notebook - Size: 282 KB - Last synced: about 10 hours ago - Pushed: about 11 hours ago - Stars: 1 - Forks: 0

sfirke/janitor

simple tools for data cleaning in R

Language: R - Size: 6.55 MB - Last synced: about 7 hours ago - Pushed: 2 months ago - Stars: 1,343 - Forks: 130

RandomGamingDev/grabcraft-to-schema

A Python library and its cli for converting grabcraft to schema (more specifically litematica schematic) files

Language: Python - Size: 55.5 MB - Last synced: about 18 hours ago - Pushed: about 19 hours ago - Stars: 1 - Forks: 0

AndreyKhamid/DataScience_StudyProjects

This repository contains projects (mostly Data Science) created while studying in a Skill Factory cohort.

Language: HTML - Size: 2.2 MB - Last synced: about 24 hours ago - Pushed: 1 day ago - Stars: 0 - Forks: 0

davejoe506/song-of-the-week

A script that automates the scoring and record-keeping for a silly little Song of the Week competition amongst colleagues.

Language: Python - Size: 12.7 KB - Last synced: 1 day ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

LibraryCarpentry/lc-open-refine

Library Carpentry: OpenRefine

Size: 18.5 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 49 - Forks: 135

ChhavikKapoor20/Vrinda-Store-Data-Analysis

This repository includes all the files used in the Vrinda Store Data Analysis using Excel with Interactive Dashboard Project.

Size: 8.34 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 0 - Forks: 0

akanz1/klib

Easy to use Python library of customized functions for cleaning and analyzing data.

Language: Python - Size: 46.7 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 478 - Forks: 51

nits302/SQL__Nashville-Housing-Data-Cleaning-Project

Size: 5.66 MB - Last synced: about 15 hours ago - Pushed: 1 day ago - Stars: 0 - Forks: 0

cleanlab/cleanlab

The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Language: Python - Size: 11.1 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 8,710 - Forks: 670

Mohitsachdev1507/Vrinda-Store-Analysis-MSExcel-

Analyzed Data By Creating Interactive Dashboard Using MS Excel

Size: 6.59 MB - Last synced: about 9 hours ago - Pushed: 1 day ago - Stars: 0 - Forks: 0

theadithya/Power-BI_Super_Store_Analysis

Super Store Sales Analysis using Power BI

Size: 430 KB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 0 - Forks: 0

ECNU-ICALK/EduChat

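An open-source educational chat model from ICALK, East China Normal University. An open-source Chinese-English educational dialogue large language model. (General-purpose base model, GPU deployment, data cleaning.) Tribute to: LLaMA, MOSS, BELLE, Ziya, vLLM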

Language: Python - Size: 210 MB - Last synced: about 20 hours ago - Pushed: 5 months ago - Stars: 607 - Forks: 58

yugantgajera/Data-Analytics-Projects

Analytics is the systematic computational analysis of data or statistics. It is used for the discovery, interpretation, and communication of meaningful patterns in data. It also entails applying data patterns toward effective decision-making.

Language: Jupyter Notebook - Size: 3.58 MB - Last synced: 2 days ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

Ishank-jn/Requirement-Engineering

This project is a literature survey on data cleaning techniques such as ActiveClean, BoostClean, HoloClean, Data Tamer.

Size: 414 KB - Last synced: 2 days ago - Pushed: almost 5 years ago - Stars: 2 - Forks: 0

cleanlab/cleanlab-studio

Client interface for all things Cleanlab Studio

Language: Python - Size: 2.88 MB - Last synced: about 24 hours ago - Pushed: 2 days ago - Stars: 21 - Forks: 4

owenwienczkowski/StockMarketMachineLearningModel

Regression machine learning model that analyzes past stock market history to predict future stock market activity. Read more in README.

Language: Python - Size: 1.57 GB - Last synced: 1 day ago - Pushed: 2 days ago - Stars: 0 - Forks: 0

AndreyKhamid/DataCleaningProject

This repository presents a study project on data cleaning, based on a real estate dataset for Moscow and the Moscow region.

Language: Jupyter Notebook - Size: 6.27 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 0 - Forks: 0

brunocampos01/allstate-claims-severity

Udacity Machine Learning Engineer Nanodegree capstone proposal.

Language: Python - Size: 266 MB - Last synced: 2 days ago - Pushed: over 2 years ago - Stars: 7 - Forks: 2

ninadpatil09/Heart_Disease_Detection_Analysis

The Heart Disease Detection Analysis aims to create a predictive model for identifying individuals at risk of heart disease. Using a dataset with attributes like age, sex, and health metrics, the project focuses on distinguishing patients with and without heart disease.

Language: Jupyter Notebook - Size: 699 KB - Last synced: 2 days ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

CSU-Agricultural-Water-Quality-Program/ALS-Data-Cleaning-Tool

A coding tool developed in R to take water analysis results exported from the ALS WEBTRIEVE™ data portal. Exported data are cleaned, merged, and exported into archiving (e.g., CSV) or visual (e.g., HTML) formats.

Language: HTML - Size: 33.2 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 0 - Forks: 0

johnkerl/miller

Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

Language: Go - Size: 200 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 8,578 - Forks: 202

buchananja/dpyp

A convenience tool for small-scale data pipelines in Python

Language: Python - Size: 4.07 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 2 - Forks: 0

awesome-mlops/awesome-ml-monitoring

A curated list of awesome open source tools and commercial products for monitoring data quality, monitoring model performance, and profiling data 🚀

Size: 4.88 KB - Last synced: 2 days ago - Pushed: 5 months ago - Stars: 48 - Forks: 5

zachpinto/fast-food-nutrition

GPT-4-based data augmentation for simple EDA of the caloric content of items from popular fast food chains in the US.

Language: Jupyter Notebook - Size: 4.37 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 0 - Forks: 0

Aniket066/HR-Data-Analysis

Language: Jupyter Notebook - Size: 73.2 KB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 0 - Forks: 0

zachpinto/growth-rate-imputations

Streamlit application for imputing missing values in time-series data based on implied growth rates

Language: Python - Size: 31.3 KB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 0 - Forks: 0

saidulIslam1602/Analysis-of-Demographic-Profile-Dataset-with-Microsoft-Excel

This GitHub repository features a project focused on demographic analysis and visualization. It employs diverse data manipulation techniques and visualization tools to explore real-world demographic trends.

Size: 190 KB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 0 - Forks: 0

MilanKalkenings/small_data_analytics_projects

practicing some basic pandas, pyplot and plotly routines

Language: Jupyter Notebook - Size: 5.81 MB - Last synced: 2 days ago - Pushed: 3 days ago - Stars: 0 - Forks: 0

minhaj-313/Vrinda-Store-Annual-Report--Data-Analyst-Project

Discover the Vrinda Store Annual Report Project, where we analyze sales data to reveal actionable insights for business growth. This Excel-based dashboard offers a complete overview of customers, regional performance, sales channels, and more. Dive into data-driven decision-making and strategic planning.

Size: 6.45 MB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 1 - Forks: 1

skrub-data/skrub

Prepping tables for machine learning

Language: Python - Size: 7.95 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 1,012 - Forks: 87

minhaj-313/Simple-Income-Statement-Dashboard-Using-PowerBI

Welcome to the Simple Income Statement Dashboard project repository! This project features an income statement dashboard developed using Power BI. The dashboard offers visualizations and insights based on Microsoft's income statement data for FY-21 and FY-22, obtained from the official website's financial statements section.

Size: 539 KB - Last synced: 2 days ago - Pushed: 3 days ago - Stars: 1 - Forks: 0

tusharpandey003/Data-Science

Data science, including Data Analysis, Machine Learning, EDA, PCA, and Data Structures and Algorithms

Language: Jupyter Notebook - Size: 17.9 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 0 - Forks: 0

JeffWang0325/Microsoft-DAT275X-Principles-of-Machine-Learning-Python-Edition

In this data science course, you will be given clear explanations of machine learning theory combined with practical scenarios and hands-on experience building, validating, and deploying machine learning models. You will learn how to build and derive insights from these models using Python, and Azure Notebooks.

Language: Jupyter Notebook - Size: 7.4 MB - Last synced: 4 days ago - Pushed: over 2 years ago - Stars: 1 - Forks: 1

minhaj-313/Sales-Dashboard-Using-Excel---Data-Analyst-Project

Explore the power of data analysis with this Sales Dashboard project! Gain insights into sales trends, customer behavior, and profitability using Excel. Follow along to dive deep into the world of data analytics!

Size: 4.37 MB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 1 - Forks: 0

Nika6s/Project-1

Analysis of job vacancies on the HeadHunter portal

Language: Jupyter Notebook - Size: 1.4 MB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 0 - Forks: 0

DicksonC96/Covid-Mobility-Malaysia

A daily auto-updating interactive dashboard project that visualizes the impact of community mobility on daily new COVID-19 cases, using data from MOH Malaysia, Google, Apple, Waze and TonTon. https://public.tableau.com/views/COVID-19MobilityDashboard/MobilityTrends (Tableau) https://datastudio.google.com/reporting/54616e0e-19c9-4097-bca7-22d2fa9e7541 (Data Studio)

Language: Jupyter Notebook - Size: 41.2 MB - Last synced: 4 days ago - Pushed: 5 days ago - Stars: 3 - Forks: 0

samagra44/Virat-Kohli-Century-Analysis

Virat Kohli Century Analysis

Language: Jupyter Notebook - Size: 7.57 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 2 - Forks: 0

Faridghr/FraudDetectivePy

Python-based supervised machine learning project.

Language: Jupyter Notebook - Size: 3.09 MB - Last synced: 4 days ago - Pushed: 5 days ago - Stars: 0 - Forks: 0

gabriel-braga-uc/me115-Linguagem_R

R Language course from University. Emphasis on visualization, cleaning and manipulation of data. *ongoing

Language: HTML - Size: 246 KB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 0 - Forks: 0

Vineet-Karmakar/Car_Sales_Analysis

We analyze a sample car sales dataset and create an interactive, dynamic dashboard in Tableau to present key insights in an appealing and easy-to-understand way.

Size: 2.6 MB - Last synced: 4 days ago - Pushed: 5 days ago - Stars: 0 - Forks: 0

scribe-org/Scribe-Data

Wikidata and Wikipedia language data extraction

Language: Python - Size: 97.6 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 16 - Forks: 17

ascender1729/CDS-IISC-P1-DataSci-PreDoc

A data science project utilizing machine learning to predict movie release years and genres based on directors' previous works.

Language: Jupyter Notebook - Size: 11.7 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 1 - Forks: 0

Vineet-Karmakar/E-Commerce_Sales_Analysis

We analyze a sample sales dataset and create an interactive, dynamic dashboard to present key insights in an appealing and easy-to-understand way.

Size: 3.48 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 0 - Forks: 0

tohidkhanbagani/Global-Pandemic-Insights-A-Data-Driven-Analysis-of-COVID-19.

Language: Jupyter Notebook - Size: 11.7 KB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 0 - Forks: 0

threnjen/boardgamegeek

BoardGameGeek Recommender System is a start-to-finish project, from sourcing the data to a hybrid recommender system utilizing both content-based and collaborative filtering.

Language: Jupyter Notebook - Size: 105 MB - Last synced: 5 days ago - Pushed: 6 days ago - Stars: 4 - Forks: 0

ccb-hms/ontology-mapper

Tool for mapping (uncontrolled) terms to ontology terms

Language: Python - Size: 286 KB - Last synced: 6 days ago - Pushed: 7 days ago - Stars: 11 - Forks: 2

EricOkoe/World_Layoffs_Data_Analysis

The aim of this project is to showcase various techniques employed in Data Cleaning and conducting Exploratory Data Analysis (EDA) to derive meaningful insights from the database using SQL

Size: 148 KB - Last synced: 5 days ago - Pushed: 6 days ago - Stars: 0 - Forks: 0

Ananya48/PRODIGY_DS_02

Task2- EDA with Titanic dataset

Language: Jupyter Notebook - Size: 324 KB - Last synced: 6 days ago - Pushed: 7 days ago - Stars: 1 - Forks: 0

unionai-oss/pandera

A light-weight, flexible, and expressive statistical data testing library

Language: Python - Size: 3.41 MB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 3,009 - Forks: 273

datacarpentry/OpenRefine-ecology-lesson

Data Cleaning with OpenRefine for Ecologists

Size: 13.6 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 24 - Forks: 114

azaz9026/EDA

Exploratory Data Analysis (EDA) refers to the method of studying and exploring record sets to apprehend their predominant traits, discover patterns, locate outliers, and identify relationships between variables. EDA is normally carried out as a preliminary step before undertaking extra formal statistical analyses or modeling.

Language: Jupyter Notebook - Size: 3.34 MB - Last synced: 7 days ago - Pushed: 7 days ago - Stars: 0 - Forks: 0

BioPsyk/cleansumstats

Convert GWAS sumstat files into a common format with a common reference for positions, rsids and effect alleles.

Language: Shell - Size: 7.96 MB - Last synced: 7 days ago - Pushed: 7 days ago - Stars: 7 - Forks: 1

Digital-Dermatology/SelfClean

🧼🔎 A holistic self-supervised data cleaning strategy to detect irrelevant samples, near duplicates and label errors.

Language: Python - Size: 18.6 MB - Last synced: 7 days ago - Pushed: 7 days ago - Stars: 9 - Forks: 1

CambioML/uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering

LLM-based text extraction from unstructured data like PDFs, Words and HTMLs. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster R&D!

Language: Python - Size: 39.9 MB - Last synced: 7 days ago - Pushed: 8 days ago - Stars: 110 - Forks: 41

rakeshbangla41/8_Week_SQL_Challenge

Solutions for #8WeekSQLChallenge using MySQL

Size: 302 KB - Last synced: 7 days ago - Pushed: 8 days ago - Stars: 1 - Forks: 0

kiranshahi/Impact-of-AID-on-Poverty-Alleviation

Data visualisation on Impact of AID on Poverty Alleviation using D3.js and Tableau.

Language: JavaScript - Size: 7.22 MB - Last synced: 8 days ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

yaph/world-aid-transparency

World aid transparency data scripts for creating a visualization with D3

Language: Python - Size: 1.83 MB - Last synced: 8 days ago - Pushed: over 11 years ago - Stars: 3 - Forks: 5

yaph/rail-suicides

Rail suicides data munging with Python pandas

Language: Python - Size: 168 KB - Last synced: 8 days ago - Pushed: over 10 years ago - Stars: 0 - Forks: 1

yaph/james-bond-actors

Script to grab Freebase data about James Bond actors and generate gexf data file.

Language: Python - Size: 134 KB - Last synced: 8 days ago - Pushed: almost 11 years ago - Stars: 7 - Forks: 6

yaph/gh-commit-locations

Scripts used for analyzing GitHub commit locations to create a map visualization

Language: Python - Size: 179 KB - Last synced: 8 days ago - Pushed: almost 12 years ago - Stars: 3 - Forks: 4

yaph/evolution-internet-users

Process data for Evolution of Internet Users dataviz

Language: Python - Size: 320 KB - Last synced: 8 days ago - Pushed: about 10 years ago - Stars: 1 - Forks: 2

OlaPietka/Data-Cleaning-project

Final project for the data cleaning course in a University of Illinois master's degree program

Language: Jupyter Notebook - Size: 48.7 MB - Last synced: 8 days ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0

Iqrar99/data-analytics-portfolio

Portfolio of data science and data analyst projects completed by me for academic, self learning, and hobby purposes.

Language: Jupyter Notebook - Size: 11.8 MB - Last synced: 8 days ago - Pushed: over 2 years ago - Stars: 84 - Forks: 22

KulikDM/pythresh

Outlier Detection Thresholding

Language: Jupyter Notebook - Size: 14.5 MB - Last synced: 6 days ago - Pushed: 2 months ago - Stars: 116 - Forks: 5

StevenMMortimer/crmfunc

An R Package for Handling CRM Data

Language: R - Size: 9.77 KB - Last synced: 8 days ago - Pushed: almost 7 years ago - Stars: 1 - Forks: 0

Renumics/sliceguard

A library for detecting problematic data segments in structured and unstructured data with few lines of code.

Language: Python - Size: 4.28 MB - Last synced: 2 days ago - Pushed: 4 months ago - Stars: 57 - Forks: 1

sail-sg/sailcraft

Data Toolkit for Sailor Language Models

Language: Python - Size: 195 KB - Last synced: 9 days ago - Pushed: 10 days ago - Stars: 8 - Forks: 0

AlexiaChen/DataBlackHole

Data Erasure Library

Language: C++ - Size: 2.81 MB - Last synced: 9 days ago - Pushed: about 7 years ago - Stars: 1 - Forks: 0

mainguyen2911/Customers-Churn-Data-Prediction-Model

Predicting whether a customer will exit the bank's service using multiple Machine Learning models.

Language: Jupyter Notebook - Size: 592 KB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 0 - Forks: 0

charlesdedampierre/BunkaTopics

🗺️ Data Cleaning and Textual Data Visualization 🗺️

Language: Python - Size: 227 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 92 - Forks: 7

TheNJineer/NJRealtor-Scrapper

Full-scale portfolio project that scrapes the monthly median sales data PDFs from the NJ Realtor website. The PDF contents are extracted, cleaned, and transformed, then stored in a SQL database for future use in a machine learning project.

Language: Python - Size: 3.37 MB - Last synced: 9 days ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

Manish7272/Retail_Sales_Prediction-main

Language: Jupyter Notebook - Size: 49.8 MB - Last synced: 9 days ago - Pushed: 10 days ago - Stars: 1 - Forks: 0

yutanagano/tidytcells

Standardise TR/MH data

Language: Python - Size: 40.3 MB - Last synced: 10 days ago - Pushed: 24 days ago - Stars: 4 - Forks: 2

ishwarighule/HousingDataRevive---Data-Cleaning-Project

This is a data analysis project working with a comprehensive dataset of housing properties. The objective is to perform a thorough analysis of the data using SQL to gain insights and prepare the dataset for further analysis.

Size: 5.64 MB - Last synced: 10 days ago - Pushed: 5 months ago - Stars: 1 - Forks: 0

nirdesh17/movie-recommender-system

A movie recommendation system is an AI/ML-based approach to filtering or predicting users' film preferences based on their past choices and behavior. It's an advanced filtration mechanism that predicts the possible movie choices of the concerned user and their preferences towards a domain-specific item, in this case a movie.

Language: Jupyter Notebook - Size: 8.77 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 3 - Forks: 0

Rawlingsofficial/Data-Science-on-Qwasar-platform

This repository serves as a comprehensive showcase of my skills and expertise in data science, encompassing various projects and exercises completed throughout the bootcamp.

Language: Jupyter Notebook - Size: 1.92 MB - Last synced: 10 days ago - Pushed: 11 days ago - Stars: 1 - Forks: 0

emmaarenas/data-quality-analysis

A collection of Jupyter Notebooks, in both English and Spanish, dedicated to performing data quality analysis using the R programming language

Language: HTML - Size: 968 KB - Last synced: 10 days ago - Pushed: 11 days ago - Stars: 0 - Forks: 0

vishnu-t-r/Data-Analytics-Portfolio-Projects

This repository contains data analyst portfolio projects developed using various data analytics tools, including SQL, Python, Tableau, and Looker.

Language: Jupyter Notebook - Size: 2.14 MB - Last synced: 12 days ago - Pushed: 12 days ago - Stars: 2 - Forks: 0

zdravoj/water-pump-classifier

Predicts the operational status of water pumps in Tanzania.

Language: Jupyter Notebook - Size: 11.7 MB - Last synced: 12 days ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

umairqadir97/Recommendation_Engine

A Recommendation Engine (Python, Machine Learning, Spark, Pandas, Clickhouse )

Size: 0 Bytes - Last synced: 12 days ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

umairqadir97/Entity_Name_Standardizatin

Machine Learning, Data Science, Names Standardization, Python, Spark, Pandas

Size: 0 Bytes - Last synced: 12 days ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

SudhanshuBlaze/Bengaluru-House-Prices-prediction

Uses Python's scikit-learn library to train a model and predict Bengaluru house prices.

Language: Jupyter Notebook - Size: 290 KB - Last synced: 12 days ago - Pushed: over 3 years ago - Stars: 2 - Forks: 1

columbustech/cdrive

A scalable, collaborative, cloud-native and customizable solution to data cleaning and integration

Language: Python - Size: 3.2 MB - Last synced: 13 days ago - Pushed: 13 days ago - Stars: 0 - Forks: 1

opendataval/opendataval

OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023)

Language: Python - Size: 22.5 MB - Last synced: 3 days ago - Pushed: 3 months ago - Stars: 71 - Forks: 5

Daviedavie100/Daviedavie100

I'm an aspiring data analyst. I'm currently into Data Analysis and Python Programming 💖

Size: 76.2 KB - Last synced: 13 days ago - Pushed: 13 days ago - Stars: 1 - Forks: 0

datacarpentry/openrefine-socialsci

OpenRefine for Social Science Data

Size: 5.39 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 22 - Forks: 48

shubhamsawant0601/Data_Science

Study and notes of Data Science lifecycle.

Language: Jupyter Notebook - Size: 20 MB - Last synced: 13 days ago - Pushed: 14 days ago - Stars: 3 - Forks: 1

Ilu27/Regression--Retail_Sales_Prediction_ML

Language: Jupyter Notebook - Size: 4.8 MB - Last synced: 13 days ago - Pushed: 14 days ago - Stars: 0 - Forks: 0

anothersoham/Understanding-Correlation-Between-Music-Mental-Health

A Python Data Analysis & Visualisation Project

Language: Jupyter Notebook - Size: 19.5 MB - Last synced: 15 days ago - Pushed: 15 days ago - Stars: 0 - Forks: 0

MohaMedFRy/Investigate-a-TMDB

In this project, I wrote a Python program to explore The Movie Database (TMDB). The code imports and cleans the data, then answers interesting questions about it by computing descriptive statistics and visualizing the results, using NumPy, Matplotlib, and pandas.

Language: Jupyter Notebook - Size: 5.92 MB - Last synced: 15 days ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

christopherBelter/qvr_processing

A function to process QVR data using R

Language: R - Size: 35.2 KB - Last synced: 15 days ago - Pushed: 15 days ago - Stars: 1 - Forks: 0

duoan/recsys

An end-to-end open source recommender platform, including data collection, feature engineering, A/B testing, and recommendation algorithms.

Size: 5.86 KB - Last synced: 15 days ago - Pushed: over 5 years ago - Stars: 1 - Forks: 0

AtharvaKate2001/Room-Occupancy-Prediction

Conduct a comprehensive Exploratory Data Analysis (EDA) to uncover patterns in environmental variables and develop a predictive machine learning model for accurate room occupancy predictions, aligning strategic insights with the optimization of operational efficiency and resource management.

Language: Jupyter Notebook - Size: 2.07 MB - Last synced: 15 days ago - Pushed: 15 days ago - Stars: 0 - Forks: 0

ashen007/Australia_rain_prediction

Build, train, and evaluate different classification algorithms, and compare prebuilt scikit-learn models with custom algorithms written from scratch.

Language: Jupyter Notebook - Size: 62.4 MB - Last synced: 16 days ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

Dhrumi-Kansara-1/data-analysis-excel-clinic-dataset

Using Excel, data cleaning, manipulation, and analysis were performed to examine the clinic data and identify factors that contribute to long wait times in the clinic.

Size: 11 MB - Last synced: 16 days ago - Pushed: 11 months ago - Stars: 0 - Forks: 0

Related Keywords
data-cleaning (2,069), data-visualization (748), data-analysis (647), python (590), data-science (522), machine-learning (355), pandas (332), exploratory-data-analysis (194), jupyter-notebook (182), sql (159), numpy (156), r (153), data-wrangling (149), matplotlib (144), data (133), seaborn (118), excel (118), data-preprocessing (117), eda (117), data-analytics (116), python3 (113), feature-engineering (112), powerbi (101), tableau (91), data-mining (84), data-processing (75), visualization (74), data-manipulation (71), data-exploration (66), statistics (61), data-transformation (59), dashboard (51), web-scraping (50), machine-learning-algorithms (50), data-modeling (46), classification (45), data-engineering (44), deep-learning (43), statistical-analysis (43), linear-regression (42), data-collection (42), dataset (42), predictive-modeling (41), logistic-regression (38), data-visualisation (38), outlier-detection (38), random-forest (38), feature-selection (36), sklearn (36), analysis (35), pandas-dataframe (35), data-analysis-python (35), data-preparation (35), nlp (35), pivot-tables (34), matplotlib-pyplot (33), csv (32), scikit-learn (32), feature-extraction (30), regression-models (29), mysql (29), sentiment-analysis (28), etl (28), supervised-learning (28), preprocessing (27), data-cleansing (27), plotly (25), kaggle (25), natural-language-processing (24), regression (24), postgresql (24), sql-server (24), database (23), clustering (23), business-analytics (22), webscraping (22), json (22), javascript (21), power-query (20), hypothesis-testing (20), pandas-python (19), decision-trees (19), data-extraction (19), data-quality (18), tidyverse (18), artificial-intelligence (18), streamlit (17), covid-19 (17), business-intelligence (17), api (17), microsoft-excel (17), cross-validation (16), xgboost (16), analytics (16), data-validation (16), regression-analysis (16), ggplot2 (15), hyperparameter-tuning (14), rstudio (14), naive-bayes-classifier (14)