An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with dask-distributed

A curated list of projects in awesome lists tagged with dask-distributed.

https://github.com/shauryashaurya/learn-data-munging

Notes on Data Engineering with Pandas, PySpark, Dask, Ray, Arrow DataFusion, Polars etc.

arrow dask dask-distributed data-engineering datafusion jupyter numpy pandas polars pyspark ray spark

Last synced: 16 Apr 2025

https://github.com/pyiron/pylammpsmpi

Parallel LAMMPS Python interface - control an mpi4py parallel LAMMPS instance from a serial Python process or a Jupyter notebook - based on executorlib

dask-distributed lammps lammps-python-interface mpi4py openmpi

Last synced: 13 Feb 2026

https://github.com/aws-solutions-library-samples/distributed-compute-on-aws-with-cross-regional-dask

Perform I/O intensive workloads on high-volume data sparsely located across multiple AWS regions through the use of Dask.

dask dask-distributed dask-worker-pools

Last synced: 14 Oct 2025

https://github.com/jameslamb/lightgbm-dask-testing

Test LightGBM's Dask integration on different cluster types

aws dask dask-distributed docker lightgbm machine-learning

Last synced: 06 Sep 2025

https://github.com/gjoseph92/sneks

Launch a Dask cluster from a Poetry environment

coiled dask dask-distributed poetry-python

Last synced: 20 Mar 2025

https://github.com/eth-cscs/ipcluster_magic

Magic commands to support running MPI Python code as well as multi-node Dask workloads in Jupyter notebooks.

dask-distributed ipyparallel jupyter-notebook mpi4py

Last synced: 04 Apr 2025

https://github.com/comp-dev-cms-ita/dask-remote-jobqueue

A custom Dask remote jobqueue for HTCondor.

dask dask-distributed dask-jobqueue htcondor

Last synced: 28 Feb 2026

https://github.com/maawoo/stac-access-performance

Testing access performance of Sentinel-1 RTC metadata catalogs

analysis-ready-data dask-distributed earth-observation metadata sentinel-1 xarray

Last synced: 17 Jan 2026

https://github.com/jbris/pycaret-fugue-dask-test

Testing PyCaret, Fugue, and Dask

dask dask-distributed fugue pycaret pycaret-library

Last synced: 13 May 2026

https://github.com/lebedov/dask-ml-on-azure-ml

Using Dask-ML on Azure ML

azure-ml dask-distributed dask-ml

Last synced: 06 May 2026

https://github.com/jkanche/asynchronous-api-dask-terraform

Asynchronous API using Dask and AWS Fargate

aws dask-distributed fargate-containers fastapi

Last synced: 17 May 2026

https://github.com/amishidesai04/distributed-machine-learning

A lightweight, scalable system that demonstrates model and data parallelism in machine learning using Dask, PyTorch, and Flask. Features distributed CNN inference and linear regression training across multiple networked devices.

dask-distributed distributed-computing distributed-machine-learning flask machine-learning pytorch

Last synced: 30 Apr 2026

https://github.com/hamedalemo/dask-tutorial

A tutorial to learn Dask DataArray and Dask DataFrames with examples from geospatial data catalogs.

dask dask-dataframes dask-distributed geospatial geospatial-analysis geospatial-data

Last synced: 06 Jun 2026

https://github.com/daniel-elston/real-time-reddit-scalable-processing

Scaling NLP processing pipelines with Dask and PySpark, utilising Apache Kafka real-time data streaming, for optimal LLM training

apache-kafka dask-distributed embeddings llm llm-training nlp pyspark scalability

Last synced: 12 May 2026

https://github.com/kaydvc/semmed-neo4j

A project using the National Library of Medicine's Semantic Medline Database to create a graphical-relational database.

aws dask dask-distributed graphical-data neo4j relational-databases semmeddb

Last synced: 17 Mar 2025