An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: dask-dataframes

kaladabrio2020/HELPS_

Language: Jupyter Notebook - Size: 1.57 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

dask-contrib/dask-deltatable

A Delta Lake reader for Dask

Language: Python - Size: 260 KB - Last synced at: 17 days ago - Pushed at: 7 months ago - Stars: 49 - Forks: 15

MBrugnaroto/Trips-Data-Pipeline

This repository presents a simple streaming data pipeline to get statistics per vehicle from a data source.

Language: Python - Size: 42.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

HamedAlemo/dask-tutorial

A tutorial to learn Dask DataArray and Dask DataFrames with examples from geospatial data catalogs.

Language: Jupyter Notebook - Size: 18.5 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Ramy-Badr-Ahmed/Higgs-Dataset-Training

Training Higgs Dataset with Keras - https://doi.org/10.5281/zenodo.13133945

Language: Python - Size: 3.46 MB - Last synced at: 1 day ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

sbl-sdsc/df-parallel

Comparison of DataFrame libraries for parallel processing of large tabular files on CPU and GPU.

Language: Jupyter Notebook - Size: 3.33 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 6 - Forks: 3

Navya0203/BigData-Recommender-System-along-with-Sentiment-Analysis-Using-Dask-and-Pyspark

This repository develops an advanced recommendation system to enhance the e-commerce shopping experience by automating product suggestions and analyzing user preferences through machine learning techniques and big data technologies.

Language: Jupyter Notebook - Size: 588 KB - Last synced at: 11 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 1

pranjalpruthi/bhedi

βHΞDI (Biomarker-based Heuristic Engine for Dengue Identification) is a computational tool designed for the identification of Dengue virus serotypes in wastewater next-generation sequencing data.

Language: Go - Size: 6.76 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

somjit101/NYC-Taxi-Demand-Prediction

A time series forecasting and regression solution to project the number of pick-ups in and around a given region at a given time in New York City, USA.

Language: Jupyter Notebook - Size: 12.5 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

momentoscope/hextof-processor

Code for preprocessing data from the HEXTOF instrument at FLASH, DESY in Hamburg (DE)

Language: Jupyter Notebook - Size: 12.7 MB - Last synced at: 11 months ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 4

D3struf/Distributed-Collaborative-Filtering-Book-Recommendation-System

It uses Dask as a distributed framework and is inspired by the work of https://github.com/entbappy/ML-Based-Book-Recommender-System

Language: Jupyter Notebook - Size: 370 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

fares-ds/pandas

Complete data analysis and data visualization project notebooks using pandas, NumPy, Matplotlib, and seaborn

Language: Jupyter Notebook - Size: 8.07 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Circadiaware/circalizer

Flexible stacked visualization of circadian data from multiple sources and devices

Language: Jupyter Notebook - Size: 5.49 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

jackmadden246/Toothbrush

This is a Xander project involving a Python script containing toothbrush sales data, which is scheduled to run in a cloud environment

Language: Python - Size: 7.82 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

stivenramireza/pocs

POCs in order to explore new technologies.

Language: Python - Size: 675 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

IncubatorShokuhou/dask-tutorial (fork of dask/dask-tutorial)

Dask tutorial; Dask tutorial translated into Chinese

Language: Jupyter Notebook - Size: 87.2 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

sarmad9987/Machine-learning-with-Dask

This project demonstrates and compares machine learning with pandas DataFrames and Dask DataFrames.

Language: Jupyter Notebook - Size: 107 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

rishigoutam/nycdemo

Code for a talk on wrangling large datasets in pandas

Language: Jupyter Notebook - Size: 115 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 3

kuntala-c/Crimes-Analysis-in-Chicago-using-Dask

Data Analysis on an extensive dataset of crimes in Chicago (2005-2016) using Dask

Language: Jupyter Notebook - Size: 2.26 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 3

nafisa-samia/Analysis-of-Crimes-in-Chicago-using-Dask (fork of kuntala-c/Crimes-Analysis-in-Chicago-using-Dask)

Data Analysis on an extensive dataset of crimes in Chicago (2005 - 2016) using Dask

Language: Jupyter Notebook - Size: 1.02 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Related Keywords
dask-dataframes (20), dask (12), python (7), data-visualization (5), dask-distributed (3), data-science (3), dask-array (3), pandas (3), pandas-dataframe (2), scikit-learn (2), s3-bucket (2), numpy (2), cuda-toolkit (2), matplotlib (2), dask-ml (2), machine-learning (2), data-analysis (2), hvplot (1), holoviews (1), data (1), pandas-python (1), seaborn (1), circadian-data (1), circadian-rhythm (1), datashader (1), sleep-cycles (1), sleep-research (1), streamlit (1), distributed-computing (1), distributed-collaborative-filtering (1), collaborative-filtering (1), ultrafast-spectroscopy (1), solid-state-physics (1), photoemission (1), pes (1), mpes (1), materials-science (1), free-electron-laser (1), fel (1), distributed-processing (1), condensed-matter-physics (1), arpes (1), sqlite3 (1), jupyter-notebook (1), delayed (1), chinese-translation (1), splash (1), selenium (1), scrapy (1), requests (1), rabbitmq (1), mongodb (1), gitflow (1), fastapi (1), dagster (1), cassandra (1), blockchain (1), beautifulsoup (1), apache-airflow (1), python3 (1), php (1), mariadb-database (1), ec2-instance (1), cronjob (1), cloudwatch (1), aws-lambda (1), visualizations (1), sleep-tracker (1), xgboost-regression (1), parallel-processing (1), gpu-computing (1), dataframes (1), dask-cudf (1), uci-machine-learning (1), uci-dataset (1), matplotlib-python (1), keras-tensorflow (1), keras (1), higgs-boson (1), cupy (1), binary-classification (1), geospatial-data (1), geospatial-analysis (1), geospatial (1), sql (1), postgres (1), kafka (1), docker (1), airflow (1), parquet (1), delta-lake (1), plotly (1), parallel-programming (1), matplotlib-pyplot (1), dash (1), time-series-forecasting (1), time-series-analysis (1), random-forest-regression (1), performance-analysis (1), new-york-taxi-demand-prediction (1)