An open API service providing repository metadata for many open source software ecosystems.

Topic: "exploratory-data-analysis"

ydataai/ydata-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

Language: Python - Size: 839 MB - Last synced at: 4 days ago - Pushed at: 17 days ago - Stars: 13,224 - Forks: 1,753

cleanlab/cleanlab

Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Language: Python - Size: 11.5 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 10,935 - Forks: 855

great-expectations/great_expectations

Always know what to expect from your data.

Language: Python - Size: 228 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 10,884 - Forks: 1,640

evidence-dev/evidence

Business intelligence as code: build fast, interactive data visualizations in SQL and markdown

Language: JavaScript - Size: 282 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 5,636 - Forks: 301

lux-org/lux

Automatically visualize your pandas dataframe via a single print! 📊 💡

Language: Python - Size: 51.4 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 5,308 - Forks: 375

fbdesignpro/sweetviz

Visualize and compare datasets, target values and associations, with one line of code.

Language: Python - Size: 15.3 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 3,055 - Forks: 288

JasonKessler/scattertext

Beautiful visualizations of how language differs among document types.

Language: Python - Size: 39.4 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 2,312 - Forks: 289

sfu-db/dataprep

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.

Language: Python - Size: 214 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2,203 - Forks: 219

Renumics/spotlight

Interactively explore unstructured datasets from your dataframe.

Language: TypeScript - Size: 47.3 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,202 - Forks: 84

hurshd0/must-read-papers-for-ml

Collection of must-read papers for Data Science, Machine Learning, and Deep Learning engineers

Size: 19.5 KB - Last synced at: 7 months ago - Pushed at: almost 2 years ago - Stars: 1,136 - Forks: 163

cleanlab/cleanvision

Automatically find issues in image datasets and practice data-centric computer vision.

Language: Python - Size: 2.12 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 1,116 - Forks: 74

dataprofessor/code

Compilation of R and Python code from the Data Professor YouTube channel.

Language: Jupyter Notebook - Size: 30.9 MB - Last synced at: 5 months ago - Pushed at: 12 months ago - Stars: 972 - Forks: 1,440

latitude-dev/latitude

Developer-first embedded analytics

Language: TypeScript - Size: 4.9 MB - Last synced at: 26 days ago - Pushed at: about 1 year ago - Stars: 929 - Forks: 51

dataprofessor/streamlit_freecodecamp

Build 12 Data Apps in Python with Streamlit

Language: Jupyter Notebook - Size: 607 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 639 - Forks: 565

tommyod/KDEpy

Kernel Density Estimation in Python

Language: Jupyter Notebook - Size: 11.5 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 627 - Forks: 98

achuthasubhash/Complete-Life-Cycle-of-a-Data-Science-Project

Complete-Life-Cycle-of-a-Data-Science-Project

Size: 156 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 609 - Forks: 250

jadianes/data-science-your-way

Ways of doing Data Science Engineering and Machine Learning in R and Python

Language: Jupyter Notebook - Size: 9.12 MB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 608 - Forks: 255

InfuseAI/piperider

Code review for data in dbt

Language: Python - Size: 32.6 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 492 - Forks: 24

aeturrell/skimpy

skimpy is a lightweight tool that provides summary statistics about variables in data frames within the console.

Language: Python - Size: 4.99 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 486 - Forks: 25

ropensci/visdat

Preliminary Exploratory Visualisation of Data

Language: R - Size: 20.9 MB - Last synced at: 4 days ago - Pushed at: 7 days ago - Stars: 459 - Forks: 46

mstaniak/autoEDA-resources

A list of software and papers related to automatic and fast Exploratory Data Analysis

Language: HTML - Size: 21.4 MB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 430 - Forks: 78

Desbordante/desbordante-core

Desbordante is a high-performance data profiler that can discover many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

Language: C++ - Size: 149 MB - Last synced at: 13 days ago - Pushed at: 14 days ago - Stars: 425 - Forks: 82

rasbt/musicmood

A machine learning approach to classify songs by mood.

Language: OpenEdge ABL - Size: 53.3 MB - Last synced at: 7 months ago - Pushed at: about 9 years ago - Stars: 419 - Forks: 107

data-describe/data-describe

data⎰describe: Pythonic EDA Accelerator for Data Science

Language: Python - Size: 126 MB - Last synced at: 20 days ago - Pushed at: over 2 years ago - Stars: 302 - Forks: 18

rasgointelligence/feature-engineering-tutorials

Data Science Feature Engineering and Selection Tutorials

Language: Jupyter Notebook - Size: 2.76 MB - Last synced at: 5 months ago - Pushed at: 9 months ago - Stars: 286 - Forks: 101

amanovishnu/ineuron-full-stack-data-science-assignments

This repository features assignments and projects from the iNeuron full stack data science course, providing resources for learners to enhance their skills and apply their knowledge.

Language: Jupyter Notebook - Size: 157 MB - Last synced at: 5 months ago - Pushed at: 10 months ago - Stars: 284 - Forks: 208

yangboz/LotteryPrediction

Lottery prediction. Besides following the "law of probability" and "Probability: Independent Events", there is still the belief that "a Tail is due" or "just one more go, my luck is due to change", which is called the Gambler's Fallacy.

Language: Jupyter Notebook - Size: 19 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 281 - Forks: 126

neerjad/DataVisualization

Tutorials on visualizing data using python packages like bokeh, plotly, seaborn and igraph

Language: Jupyter Notebook - Size: 2.79 MB - Last synced at: 7 months ago - Pushed at: over 5 years ago - Stars: 266 - Forks: 61

Jean-njoroge/Breast-cancer-risk-prediction

Classification of Breast Cancer diagnosis Using Support Vector Machines

Language: Jupyter Notebook - Size: 1.74 MB - Last synced at: about 1 month ago - Pushed at: about 3 years ago - Stars: 257 - Forks: 136

alastairrushworth/inspectdf

🛠️ 📊 Tools for Exploring and Comparing Data Frames

Language: R - Size: 24.9 MB - Last synced at: 28 days ago - Pushed at: about 1 year ago - Stars: 251 - Forks: 23

mebauer/data-analysis-using-python

Data Analysis Using Python: A Beginner’s Guide Featuring NYC Open Data.

Language: Jupyter Notebook - Size: 572 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 247 - Forks: 39

ank0409/Ditching-Excel-for-Python

Functionalities in Excel translated to Python

Language: Jupyter Notebook - Size: 14.6 KB - Last synced at: 7 months ago - Pushed at: over 4 years ago - Stars: 231 - Forks: 90

harunurrashid97/100-Days-Of-ML-Code

A day-to-day plan for this challenge. Covers both theoretical and practical aspects.

Language: Jupyter Notebook - Size: 11.8 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 228 - Forks: 111

dvgodoy/handyspark

HandySpark - bringing pandas-like capabilities to Spark dataframes

Language: Jupyter Notebook - Size: 1.68 MB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 196 - Forks: 27

ajaymache/data-analysis-using-python

Exploratory data analysis 📊 using Python 🐍 of a used car 🚘 database taken from Kaggle

Language: Jupyter Notebook - Size: 49.3 MB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 193 - Forks: 89

mirador/mirador

Tool for visual exploration of complex data.

Language: Java - Size: 29.2 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 191 - Forks: 24

HPInc/AI-Blueprints

📁 This repository hosts a growing collection of AI blueprint projects that run end-to-end using Jupyter notebooks, MLflow deployments, and Streamlit web apps. 🛠️ All projects are built using HP AI Studio with ❤️. If you find this useful, please don’t forget to star the repository ⭐ and support our work 🚀

Language: Jupyter Notebook - Size: 616 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 184 - Forks: 15

drshahizan/Python_EDA

This topic explains the implementation of exploratory data analysis (EDA). A total of 21 EDA case studies have been implemented using the Malaysian dataset.

Language: Jupyter Notebook - Size: 502 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 174 - Forks: 90

trr266/ExPanDaR

R Package for Interactive Panel Data Exploration

Language: R - Size: 52.6 MB - Last synced at: 11 days ago - Pushed at: 7 months ago - Stars: 160 - Forks: 46

kanaverse/kana

Single cell analysis in the browser

Language: JavaScript - Size: 119 MB - Last synced at: 8 days ago - Pushed at: 10 days ago - Stars: 152 - Forks: 18

mchav/dataframe

A fast, safe, and intuitive DataFrame library.

Language: Haskell - Size: 53.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 146 - Forks: 23

business-science/correlationfunnel

Speed Up Exploratory Data Analysis (EDA)

Language: R - Size: 25.2 MB - Last synced at: 12 days ago - Pushed at: almost 2 years ago - Stars: 138 - Forks: 29

ahmedbesbes/How-to-score-0.8134-in-Titanic-Kaggle-Challenge

Solution of the Titanic Kaggle competition

Language: Jupyter Notebook - Size: 745 KB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 129 - Forks: 96

jadianes/spark-r-notebooks

R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks

Language: Jupyter Notebook - Size: 10.3 MB - Last synced at: 6 months ago - Pushed at: about 8 years ago - Stars: 121 - Forks: 71

ujjwalkarn/xda

R package for exploratory data analysis

Language: R - Size: 722 KB - Last synced at: 6 months ago - Pushed at: almost 8 years ago - Stars: 120 - Forks: 50

lozuwa/impy

Impy is a Python3 library with features that help you in your computer vision tasks.

Language: Python - Size: 91.4 MB - Last synced at: 7 months ago - Pushed at: over 6 years ago - Stars: 116 - Forks: 32

pachterlab/voyager

From geospatial to spatial -omics

Language: R - Size: 4.33 GB - Last synced at: 5 days ago - Pushed at: about 2 months ago - Stars: 98 - Forks: 14

Kushal997-das/THE-SPARKS-FOUNDATION

📌 This repo contains basic to advanced level data science / machine learning / business analysis projects. 👨‍💻

Language: Jupyter Notebook - Size: 7.25 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 98 - Forks: 64

dgwozdz/HN_SO_analysis

Is there a relationship between popularity of a given technology on Stack Overflow (SO) and Hacker News (HN)? And a few words about causality

Language: Python - Size: 21.1 MB - Last synced at: 12 months ago - Pushed at: over 7 years ago - Stars: 97 - Forks: 11

jadianes/data-journalism

Data journalism and easy to replicate notebooks using Python, R, and Web visualisations

Language: HTML - Size: 9.32 MB - Last synced at: 29 days ago - Pushed at: over 7 years ago - Stars: 90 - Forks: 17

SmooSenseAI/smoosense

Interactively browse multimodal tabular data

Language: TypeScript - Size: 9.99 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 89 - Forks: 10

Data-Centric-AI-Community/awesome-python-for-data-science

A curated list of awesome resources such as books, tutorials, courses, open-source libraries, exercises, and other materials that support Pythonistas in the making, and Pythonistas migrating into Data Science! 📊

Language: Jupyter Notebook - Size: 51.8 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 19

StephanieStallworth/Exploratory_Data_Analysis_Visualization_Python

Data analysis and visualization with the PyData ecosystem: Pandas, Matplotlib, NumPy, and Seaborn

Language: Jupyter Notebook - Size: 23.5 MB - Last synced at: over 2 years ago - Pushed at: almost 8 years ago - Stars: 89 - Forks: 96

tusharnankani/whatsapp-chat-data-analysis

An Exhaustive WhatsApp Chat Data Analysis.

Language: Jupyter Notebook - Size: 24.6 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 79 - Forks: 25

duttashi/learnr

Exploratory, Inferential and Predictive data analysis. Feel free to show your ❤️ by giving a star ⭐

Language: R - Size: 54.7 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 78 - Forks: 53

mandliya/ml

A 60 days+ streak of daily learning of ML/DL/Maths concepts through projects

Language: Jupyter Notebook - Size: 101 MB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 77 - Forks: 17

nbarrowman/vtree

An R package for calculating and drawing variable trees

Language: R - Size: 29.9 MB - Last synced at: 14 days ago - Pushed at: about 2 months ago - Stars: 76 - Forks: 7

PetrKorab/Arabica

Python package for text mining of time-series data

Language: Python - Size: 107 MB - Last synced at: 8 days ago - Pushed at: 6 months ago - Stars: 76 - Forks: 16

kianweelee/Edator

A python package that performs exploratory data analysis for users. Additionally, it generates 3 types of output files (cleaned CSV, plots and a text report).

Language: Python - Size: 348 KB - Last synced at: about 1 month ago - Pushed at: about 5 years ago - Stars: 76 - Forks: 9

pyaf/DenseNet-MURA-PyTorch

Implementation of a DenseNet model on Stanford's MURA dataset using PyTorch

Language: Python - Size: 211 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 75 - Forks: 33

devsgnr/breadroll

breadroll 🥟 is a simple, lightweight library for data processing operations written in TypeScript and powered by Bun.

Language: TypeScript - Size: 15.2 MB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 74 - Forks: 0

prathameshtari/Predicting-Football-Match-Outcome-using-Machine-Learning

Football Match prediction using machine learning algorithms in jupyter notebook

Language: Jupyter Notebook - Size: 1.11 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 74 - Forks: 61

ben519/mltools

Exploratory and diagnostic machine learning tools for R

Language: R - Size: 172 KB - Last synced at: 14 days ago - Pushed at: about 4 years ago - Stars: 73 - Forks: 26

PacktWorkshops/The-Data-Analysis-Workshop

A New Interactive Approach to Learning Data Analysis

Language: Jupyter Notebook - Size: 63.3 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 72 - Forks: 56

kevinadhiguna/dqlab-career-track

A collection of scripts written to complete DQLab Data Analyst Career Track 📊

Language: Python - Size: 5.5 MB - Last synced at: 7 months ago - Pushed at: about 3 years ago - Stars: 72 - Forks: 49

hsbc/tslumen

A library for Time Series EDA (exploratory data analysis)

Language: Python - Size: 85.9 MB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 71 - Forks: 9

zmjones/edarf

exploratory data analysis using random forests

Language: R - Size: 6.92 MB - Last synced at: 7 days ago - Pushed at: over 7 years ago - Stars: 68 - Forks: 11

Renumics/sliceguard

A library for detecting problematic data segments in structured and unstructured data with few lines of code.

Language: Python - Size: 4.28 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 64 - Forks: 3

ucd-dnp/leila

Library for data quality assessment and interaction with the datos.gov.co portal

Language: Jupyter Notebook - Size: 29.7 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 61 - Forks: 22

mukulsinghal001/lead-scoring-model-python

Lead scoring is a powerful metric for quantifying leads, and it is now used by nearly every CRM. This repository looks at the UpGrad lead scoring case study and shows how the problem can be solved with several supervised machine learning models.

Language: Jupyter Notebook - Size: 7.07 MB - Last synced at: 4 months ago - Pushed at: over 4 years ago - Stars: 59 - Forks: 25

datamole-ai/edvart

An open-source Python library for Data Scientists & Data Analysts designed to simplify the exploratory data analysis process. Using Edvart, you can explore data sets and generate reports with minimal coding.

Language: Python - Size: 25.3 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 55 - Forks: 7

Spratiher9/Sparkora

Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟

Language: HTML - Size: 1.23 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 53 - Forks: 7

ahmed-mohamed-sn/olliePy

OlliePy is a python package which can help data scientists in exploring their data and evaluating and analysing their machine learning experiments by utilising the power and structure of modern web applications. The data scientist only needs to provide the data and any required information and OlliePy will generate the rest.

Language: Python - Size: 190 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 51 - Forks: 3

exploripy/exploripy

Pre-modelling analysis of data through various exploratory data analyses and statistical tests.

Language: Python - Size: 3.21 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 51 - Forks: 22

TysonStanley/furniture

The furniture R package contains table1 for publication-ready simple and stratified descriptive statistics, tableC for publication-ready correlation matrixes, and other tables #rstats

Language: R - Size: 4.2 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 50 - Forks: 7

gitsuraaj/Data-Science-Series

For all those who are struggling to find a good hands-on resource (with case studies) to master their Data Science skills, here's all you need!

Language: Jupyter Notebook - Size: 24.5 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 50 - Forks: 17

wpinvestigative/kushner_eb5_census

Jared Kushner and his partners used a program meant for job-starved areas to build a luxury skyscraper

Language: HTML - Size: 31.4 MB - Last synced at: over 2 years ago - Pushed at: over 8 years ago - Stars: 49 - Forks: 2

great-northern-diver/loon

A Toolkit for Interactive Statistical Data Visualization

Language: Tcl - Size: 43.4 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 47 - Forks: 16

hemansnation/Python-For-Data-Professionals

This course is designed to get a good grip on python programming, logic building, solving algorithm-based questions, data structures, understanding of data analytics, working with pandas, professional practices, and API building.

Language: Jupyter Notebook - Size: 35.6 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 46 - Forks: 12

sauravmishra1710/Heart-Failure-Condition-And-Survival-Analysis

Perform a survival analysis based on the time-to-event (death event) for the subjects. Compare machine learning models to assess the likelihood of a death by heart failure condition. This can be used to help hospitals in assessing the severity of patients with cardiovascular diseases and heart failure condition.

Language: Jupyter Notebook - Size: 27.8 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 45 - Forks: 20

daya6489/SmartEDA

An R package for exploratory data analysis

Language: HTML - Size: 17.6 MB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 45 - Forks: 14

AvinashSingh786/Fraud-Analysis

Insurance fraud claims analysis project

Language: Python - Size: 17 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 45 - Forks: 28

Elysian01/Data-Purifier

A Python library for Automated Exploratory Data Analysis, Automated Data Cleaning, and Automated Data Preprocessing For Machine Learning and Natural Language Processing Applications in Python.

Language: Jupyter Notebook - Size: 7.51 MB - Last synced at: 29 days ago - Pushed at: over 3 years ago - Stars: 45 - Forks: 6

mast-group/sequence-mining 📦

Probabilistic Sequence Mining

Language: Java - Size: 5.5 MB - Last synced at: 7 months ago - Pushed at: over 7 years ago - Stars: 45 - Forks: 8

anishsingh20/Human-Resource-Analytics-and-Employee-Churn-Prediction

A data science and analytics project that performs descriptive and exploratory data analysis and then applies predictive modelling to determine why, and which of, the best and most experienced employees are leaving prematurely.

Language: R - Size: 4.14 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 42 - Forks: 31

s1dewalker/Airbnb-listings-NYC

Exploratory Data Analysis | SQL, Tableau, Python

Language: Jupyter Notebook - Size: 25.2 MB - Last synced at: 4 months ago - Pushed at: 12 months ago - Stars: 40 - Forks: 17

wikistat/Exploration

Data Science Season 2: Multidimensional statistical exploration, principal component analysis (PCA), correspondence analysis, discriminant analysis, unsupervised classification

Language: Jupyter Notebook - Size: 83.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 40 - Forks: 67

exactpro/nostradamus

🧠 An open-source machine learning application for analyzing software defect reports extracted from bug tracking systems.

Language: TypeScript - Size: 198 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 40 - Forks: 12

sanithps98/Automobile-Dataset-Analysis

This project analyzes and visualizes the Used Car Prices from the Automobile dataset in order to predict the most probable car price

Language: Jupyter Notebook - Size: 805 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 40 - Forks: 34

KaziAmitHasan/data-inspector

Data Inspector is an open-source Python library that brings 15+ different types of functions to make EDA and data cleaning easier.

Language: Jupyter Notebook - Size: 540 KB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 39 - Forks: 2

gabrielpreda/Kaggle

Kaggle Kernels (Python, R, Jupyter Notebooks)

Language: Jupyter Notebook - Size: 290 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 38 - Forks: 18

evoluteur/kaggle-look-alike

Kaggle Data Explorer UI look-alike built in React.

Language: JavaScript - Size: 1.96 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 36 - Forks: 3

awslabs/amazon-accessible-rl-sdk

A2RL is a Python library for offline reinforcement learning

Language: Python - Size: 744 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 36 - Forks: 8

zhihanyue/qgridnext

Advancing QGrid, an interactive grid for exploring DataFrames in JupyterLab/Notebook

Language: Python - Size: 54.3 MB - Last synced at: 21 days ago - Pushed at: 12 months ago - Stars: 35 - Forks: 2

aedin/PCAworkshop

An introduction to matrix factorization, PCA, and SVD.

Language: TeX - Size: 9.26 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 35 - Forks: 12

urmi-21/MetaOmGraph

MetaOmGraph: a workbench for interactive exploratory data analysis of large expression datasets

Language: Java - Size: 175 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 32 - Forks: 14

erdogant/hnet

Association rule-based networks using graphical Hypergeometric Networks.

Language: Python - Size: 75.3 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 31 - Forks: 3

aatmunbaxi/orgroamtools

Helper library for data analysis of org-roam collections

Language: Python - Size: 3.39 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 29 - Forks: 1

roshankoirala/pySpark_tutorial

Implementation of Spark code in Jupyter notebook. Topics include: RDDs and DataFrame, exploratory data analysis (EDA), handling multiple DataFrames, visualization, Machine Learning

Language: Jupyter Notebook - Size: 202 KB - Last synced at: 7 months ago - Pushed at: about 5 years ago - Stars: 29 - Forks: 26

DiegoUsaiUK/Market_Basket_Analysis

Market Basket Analysis with Recommendation Algorithms & Shiny App Implementation of a Product Recommendation System for an Online Retailer

Language: R - Size: 1.03 MB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 29 - Forks: 20

Related Topics
python (2,199), machine-learning (1,785), data-visualization (1,760), data-science (1,544), data-analysis (1,253), pandas (1,169), seaborn (768), matplotlib (728), eda (708), jupyter-notebook (652), numpy (592), feature-engineering (482), visualization (435), data-cleaning (404), python3 (398), logistic-regression (366), r (352), random-forest (316), exploratory-data-visualizations (302), linear-regression (302), sql (296), machine-learning-algorithms (280), classification (276), data (272), scikit-learn (248), statistics (219), data-preprocessing (210), predictive-modeling (203), regression (200), statistical-analysis (194), tableau (172), matplotlib-pyplot (166), plotly (162), deep-learning (152), sklearn (149), data-analytics (145), supervised-learning (143), xgboost (137), powerbi (136), streamlit (131), data-wrangling (130), random-forest-classifier (129), kaggle (127), analysis (126), regression-models (125), clustering (122), decision-trees (117), dataanalysis (114), datacleaning (113), feature-selection (111), hypothesis-testing (109), dashboard (106), hyperparameter-tuning (105), data-mining (102), analytics (101), datavisualization (98), natural-language-processing (96), excel (96), time-series-analysis (90), datascience (90), sentiment-analysis (89), nlp (88), mysql (83), decision-tree-classifier (83), kaggle-dataset (80), dataset (80), kmeans-clustering (79), data-analysis-python (76), time-series (75), business-analytics (72), pandas-dataframe (70), unsupervised-learning (68), univariate-analysis (67), flask (67), prediction (66), classification-algorithm (65), regression-analysis (65), tensorflow (62), outlier-detection (60), seaborn-plots (59), cross-validation (59), business-intelligence (58), k-means-clustering (57), preprocessing (56), model-evaluation (55), neural-network (55), bivariate-analysis (55), kaggle-competition (55), ggplot2 (53), predictive-analytics (53), xgboost-classifier (53), datapreprocessing (53), supervised-machine-learning (52), data-visualisation (52), data-cleaning-and-preprocessing (51), support-vector-machines (50), confusion-matrix (50), webscraping (50), artificial-intelligence (50), feature-extraction (50)