An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: dask-distributed

pyiron/pylammpsmpi

Parallel LAMMPS Python interface - control an mpi4py parallel LAMMPS instance from a serial Python process or a Jupyter notebook

Language: Python - Size: 778 KB - Last synced at: about 5 hours ago - Pushed at: about 6 hours ago - Stars: 32 - Forks: 5

AmishiDesai04/Distributed-Machine-Learning

A lightweight, scalable system that demonstrates model and data parallelism in machine learning using Dask, PyTorch, and Flask. Features distributed CNN inference and linear regression training across multiple networked devices.

Language: HTML - Size: 295 KB - Last synced at: 2 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

DataCanvasIO/HyperGBM

A full pipeline AutoML tool for tabular data

Language: Python - Size: 11 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 347 - Forks: 47

shauryashaurya/learn-data-munging

Notes on Data Engineering with Pandas, PySpark, Dask, Ray, Arrow DataFusion, Polars etc.

Language: Jupyter Notebook - Size: 627 MB - Last synced at: 4 days ago - Pushed at: 10 days ago - Stars: 48 - Forks: 21

ubunye-ai-ecosystems/tfilterspy

Python library for implementing state-of-the-art Bayesian filtering techniques like Kalman Filters and Particle Filters.

Language: Python - Size: 5.75 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 1 - Forks: 0

elcorto/psweep

Loop like a pro, make parameter studies fun.

Language: Python - Size: 6.86 MB - Last synced at: 7 days ago - Pushed at: 12 days ago - Stars: 17 - Forks: 2

kaladabrio2020/HELPS_

Language: Jupyter Notebook - Size: 1.57 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

TimeEval/TimeEval

Evaluation Tool for Anomaly Detection Algorithms on Time Series

Language: Jupyter Notebook - Size: 24.7 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 121 - Forks: 18

sulis-hpc/sulis-hpc.github.io

User documentation website for the Sulis tier 2 HPC service. Built using Jekyll.

Language: SCSS - Size: 2.04 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 6

OleksandrZadvornyi/dask-weather-analysis

Distributed processing and analysis of daily weather station summaries using the Dask library

Language: Jupyter Notebook - Size: 6.84 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

shiv3679/ClimEval

EvalMetrics: Precision in Prediction

Language: Python - Size: 300 KB - Last synced at: 21 days ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Daniel-Elston/real-time-reddit-scalable-processing

Scaling NLP processing pipelines with Dask and PySpark, utilising Apache Kafka real-time data streaming, for optimal LLM training

Language: Python - Size: 3.07 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

KayDVC/semmed-neo4j

A project using the National Library of Medicine's Semantic Medline Database to create a graphical-relational database.

Language: Jupyter Notebook - Size: 4.11 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

jameslamb/lightgbm-dask-testing

Test LightGBM's Dask integration on different cluster types

Language: Jupyter Notebook - Size: 104 KB - Last synced at: about 20 hours ago - Pushed at: 4 months ago - Stars: 12 - Forks: 5

rolani/dask-ecs-lib

dask-ecs-lib is a Python library that effortlessly spins up a Dask cluster on AWS ECS using Fargate, allowing you to seamlessly execute and parallelize your functions.

Language: Python - Size: 13.7 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

HamedAlemo/dask-tutorial

A tutorial to learn Dask DataArray and Dask DataFrames with examples from geospatial data catalogs.

Language: Jupyter Notebook - Size: 18.5 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

VorGeo/earthengine-dask

Scale up concurrent requests to Earth Engine interactive endpoints with Dask

Language: HTML - Size: 883 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 3 - Forks: 0

ScalableCytometryImageProcessing/SCIP

Scalable Cytometry Image Processing (SCIP) is an open-source tool that implements an image processing pipeline on top of Dask, a distributed computing framework written in Python. SCIP performs projection, illumination correction, image segmentation and masking, and feature extraction.

Language: Python - Size: 20.5 MB - Last synced at: 7 days ago - Pushed at: 11 months ago - Stars: 7 - Forks: 0

pleiszenburg/scherbelberg

HPC cluster deployment and management for the Hetzner Cloud

Language: Python - Size: 743 KB - Last synced at: 9 days ago - Pushed at: about 3 years ago - Stars: 5 - Forks: 1

gjoseph92/sneks

Launch a Dask cluster from a Poetry environment

Language: Python - Size: 661 KB - Last synced at: 18 days ago - Pushed at: about 2 years ago - Stars: 6 - Forks: 0

datapartnership/pipelines

⚗️ Pipelines

Language: Python - Size: 74.2 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ivanbgd/dask_demo_reins

A demo of the Dask library for big data processing in Python

Language: Python - Size: 11.7 KB - Last synced at: 23 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

JBris/pycaret-fugue-dask-test

Testing PyCaret, Fugue, and Dask

Language: Python - Size: 107 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

modin-project/unidist

Unified Distributed Execution

Language: Python - Size: 624 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 41 - Forks: 8

LimnoTech/Xarray-DataAccessor

Efficiently read climate/meteorology data into Xarray using Dask for parallelization. Transform the data for your modelling needs.

Language: Python - Size: 754 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 4 - Forks: 1

JSybrandt/agatha

AGATHA: Automatic Graph-mining And Transformer based Hypothesis generation Approach

Language: Python - Size: 6.67 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 52 - Forks: 9

epiviz/epivizFileServer

Python library to query and transform genomic data from indexed files

Language: Python - Size: 16.7 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 4

aws-solutions-library-samples/distributed-compute-on-aws-with-cross-regional-dask

Perform I/O intensive workloads on high-volume data sparsely located across multiple AWS regions through the use of Dask.

Language: TypeScript - Size: 1.35 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 2

jkanche/asynchronous-api-dask-terraform

Asynchronous API using Dask and AWS Fargate

Language: HCL - Size: 378 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

gyasi87/NYC_Taxi_Data

NY City Taxi Analysis using Dask

Language: Jupyter Notebook - Size: 754 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

maawoo/stac-access-performance

Testing access performance of Sentinel-1 RTC metadata catalogs

Language: HTML - Size: 14.4 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

arjunsawhney1/scalable-ML Fork of rajeevdixit19/Scaleable-Ml

In this repo, I build a LogisticRegression prediction model with Dask and PySpark and initialize an AWS EMR cluster to run the entire pipeline.

Size: 131 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

comp-dev-cms-ita/dask-remote-jobqueue

A custom dask remote jobqueue for HTCondor.

Language: Python - Size: 330 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 1

mabaszadeh/distributed-tsp

Distributed solution for the Traveling Salesman Problem using Dask.distributed and OR-Tools

Language: Python - Size: 1.75 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

eth-cscs/ipcluster_magic

Magic commands to support running MPI Python code as well as multi-node Dask workloads in Jupyter notebooks.

Language: Python - Size: 364 KB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 1

sandal-tan/dtop

Top - For Dask!

Language: Python - Size: 42 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

antarcticrainforest/esm_analysis

Python 3 tools for distributed analysis and visualisation of big climate data on HPC systems.

Language: Python - Size: 8.75 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

fabidick22/add-worker-DaskCluster

Script for configuring and installing the requirements of a Dask Distributed worker

Language: Shell - Size: 25.4 KB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

octoenergy/dask-remote 📦

Procurement: Dask Cluster as a Process.

Language: Python - Size: 389 KB - Last synced at: 12 days ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

leosmerling-hopeit/fraud-poc

Fraud detection ML pipeline and serving POC using Dask and hopeit.engine. Project created with nbdev: https://www.fast.ai/2019/12/02/nbdev/

Language: Jupyter Notebook - Size: 8.06 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 4

gandalf1819/NYCOpenData-Profiling-Analysis

Open Data Profiling, Quality and Analysis on NYC OpenData dataset with semantic profiling using fuzzy ratio, Levenshtein distance and regex

Language: Jupyter Notebook - Size: 17.9 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 4

matbinder/ase-dask-example

Language: Jupyter Notebook - Size: 34.2 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

gdmarmerola/big-data-ml-training

Code for "Training models when data doesn't fit in memory" post

Language: Jupyter Notebook - Size: 2.94 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 9 - Forks: 4

IncubatorShokuhou/dask-tutorial Fork of dask/dask-tutorial

Dask tutorial; Chinese translation of the Dask tutorial

Language: Jupyter Notebook - Size: 87.2 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

vlfom/nyc-taxi-data

Code for fetching, sampling, and analysis of NYC taxi data from TLC and Uber for 2009-2018

Language: Jupyter Notebook - Size: 2.64 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

alexberndt/machine-learning-sandbox

Collection of machine learning algorithms ...

Language: Jupyter Notebook - Size: 352 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

lebedov/dask-ml-on-azure-ml

Using Dask-ML on Azure ML

Language: Python - Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

Estivador/python-dask

Language: Dockerfile - Size: 1000 Bytes - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

JulianWgs/dask-log-server

Preserve all necessary runtime data of a Dask client in order to "replay" and analyze the performance and behavior of the client after the fact

Language: Python - Size: 289 KB - Last synced at: 8 days ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0