Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with dask

A curated list of projects in awesome lists tagged with dask.

https://github.com/dask/dask

Parallel computing with task scheduling

dask numpy pandas pydata python scikit-learn scipy

Last synced: 29 Sep 2024

https://github.com/pydata/xarray

N-D labeled arrays and datasets in Python

dask netcdf numpy pandas python xarray

Last synced: 29 Sep 2024

https://github.com/mars-project/mars

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.

dask dataframe joblib lightgbm machine-learning numpy pandas python pytorch ray scikit-learn statsmodels tensor tensorflow xgboost

Last synced: 29 Sep 2024

https://github.com/jmcarpenter2/swifter

A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner

dask modin pandas pandas-dataframe parallel-computing parallelization

Last synced: 30 Sep 2024

https://github.com/fugue-project/fugue

A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.

dask data-practitioners distributed distributed-computing distributed-systems machine-learning pandas spark sql

Last synced: 28 Sep 2024

https://github.com/dask/distributed

A distributed task scheduler for Dask

dask distributed-computing hacktoberfest pydata python

Last synced: 31 Jul 2024

https://github.com/pytroll/satpy

Python package for earth-observing satellite data processing

closember dask hacktoberfest python satellite weather xarray

Last synced: 01 Aug 2024

https://github.com/Nixtla/mlforecast

Scalable machine 🤖 learning for time series forecasting.

dask forecast forecasting lightgbm machine-learning python time-series xgboost

Last synced: 01 Aug 2024

https://github.com/ranaroussi/pystore

Fast data store for Pandas time-series data

dask database dataframe datastore pandas parquet timeseries

Last synced: 01 Aug 2024

https://github.com/capitalone/datacompy

Pandas and Spark DataFrame comparison for humans and more!

compare dask data data-science dataframes fugue numpy pandas polars pyspark python spark

Last synced: 28 Sep 2024

https://github.com/dask-contrib/dask-sql

Distributed SQL Engine in Python using Dask

dask distributed ml python sql sql-engines sql-server

Last synced: 28 Sep 2024

https://github.com/pytroll/pyresample

Geospatial image resampling in Python

closember dask hacktoberfest kd-tree numpy python resampling xarray

Last synced: 03 Aug 2024

https://github.com/Ouranosinc/xclim

Library of derived climate variables, i.e., climate indicators, based on xarray.

anuclim climate-analysis climate-science dask icclim netcdf4 python xarray xclim

Last synced: 01 Aug 2024

https://github.com/gjoseph92/stackstac

Turn a STAC catalog into a dask-based xarray

cog dask geospatial rasterio stac xarray

Last synced: 03 Oct 2024

https://github.com/dask/dask-jobqueue

Deploy Dask on job schedulers like PBS, SLURM, and SGE

dask distributed hpc pbs-cluster python sge-cluster slurm-cluster

Last synced: 02 Aug 2024

https://github.com/pangeo-data/climpred

🌎 Verification of weather and climate forecasts 🌍

climate climate-analysis dask forecasting pangeo prediction python s2d s2s xarray

Last synced: 01 Aug 2024

https://github.com/allencellmodeling/aicsimageio

Image Reading, Metadata Conversion, and Image Writing for Microscopy Images in Python

bio-formats dask image-metadata imageio microscopy python scientific-computing scientific-formats xarray

Last synced: 01 Oct 2024

https://github.com/jgrss/geowombat

GeoWombat: Utilities for geospatial data

dask geography python raster rasterio remote-sensing satellite xarray

Last synced: 31 Jul 2024

https://github.com/google/xarray-beam

Distributed Xarray with Apache Beam

beam dask xarray zarr

Last synced: 03 Aug 2024

https://github.com/xarray-contrib/flox

Fast & furious GroupBy operations for dask.array

dask map-reduce xarray

Last synced: 02 Aug 2024

https://github.com/dymaxionlabs/dask-rasterio

Read and write rasters in parallel using Rasterio and Dask

dask gdal python rasterio

Last synced: 03 Aug 2024

https://github.com/xarray-contrib/xeofs

Comprehensive EOF analysis in Python with xarray: A versatile, multidimensional, and scalable tool for advanced climate data analysis

climate-science dask dimensionality-reduction eof-analysis pattern-recognition pca xarray

Last synced: 08 Aug 2024

https://github.com/ncar/ncar-python-tutorial

Numerical & Scientific Computing with Python Tutorial

cartopy dask jupyter matplotlib numpy python scipy tutorial xarray

Last synced: 01 Oct 2024

https://github.com/saturncloud/dask-pytorch-ddp

dask-pytorch-ddp is a Python package that makes it easy to train PyTorch models on dask clusters using distributed data parallel.

computer-vision dask deep-learning distributed-computing machine-learning nlp pytorch

Last synced: 03 Aug 2024

https://github.com/ml-tooling/lazycluster

🎛 Distributed machine learning made simple.

cluster dask distributed-computing hyperopt machine-learning python ssh

Last synced: 06 Aug 2024

https://github.com/sinhrks/daskperiment

Reproducibility for Humans: A lightweight tool to perform reproducible machine learning experiments.

dask machine-learning reproducibility

Last synced: 03 Aug 2024

https://github.com/itamarst/dask-memusage

A low-impact profiler to figure out how much memory each task in Dask is using

dask memory profiler profiling python

Last synced: 03 Aug 2024

https://github.com/nci/scores

scores: verification scores and metrics, supporting the earth system modelling community

climate dask forecast-evaluation forecast-verification forecasting model-validation oceanography pandas python verification weather xarray

Last synced: 08 Aug 2024

https://github.com/mansenfranzen/pywrangler

Advanced data wrangling for python

dask dataframe datawrangling pyspark python

Last synced: 02 Oct 2024

https://github.com/casangi/cngi_prototype

Prototype Development of CNGI

astronomy dask numba scipy xarray zarr

Last synced: 30 Sep 2024

https://github.com/developmentseed/label-maker-dask

Library for running label-maker as a dask job

dask machine-learning microsoft osm

Last synced: 03 Aug 2024

https://github.com/osoceanacoustics/echodataflow

Orchestrated sonar data processing workflow

dask docker elasticsearch kafka kibana logstash prefect python

Last synced: 28 Sep 2024

https://github.com/ornl/flowcept

Runtime data integration system that empowers any data processing system to capture and query workflow provenance using data observability.

big-data dask data-integration lineage machine-learning mlflow model-management parallel-processing provenance reproducibility responsible-ai scientific-workflows tensorboard trustworthy-ai workflows

Last synced: 29 Sep 2024

https://github.com/rhasanm/airflower

Airflower adds intelligent decision-making to Apache Airflow by capturing real-time metadata through listeners and sending it to an external brain via gRPC. The brain, equipped with analytical and rule engines, processes the data and sends decisions back to Airflow to optimize task execution dynamically.

airflow dask data-engineering etl grpc machine-learning pandas python rabbitmq tensorflow

Last synced: 29 Sep 2024