{"id":16285408,"url":"https://github.com/constantinpape/cluster_tools","last_synced_at":"2025-09-08T01:32:22.543Z","repository":{"id":29074193,"uuid":"118263492","full_name":"constantinpape/cluster_tools","owner":"constantinpape","description":"Distributed segmentation for bio-image-analysis","archived":false,"fork":false,"pushed_at":"2025-05-13T06:21:51.000Z","size":1793,"stargazers_count":38,"open_issues_count":16,"forks_count":14,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-05-13T06:27:46.178Z","etag":null,"topics":["3d-segmentation","bio-image-analysis","cluster-computing","connectomics","lifted-multicut","microscopy-images","multicut","mutex-watershed","segmentation","watershed"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/constantinpape.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-01-20T17:09:39.000Z","updated_at":"2025-05-13T06:20:41.000Z","dependencies_parsed_at":"2022-08-07T14:01:11.670Z","dependency_job_id":"9ed1fb16-9abc-472e-86b5-7631e884b99e","html_url":"https://github.com/constantinpape/cluster_tools","commit_stats":{"total_commits":894,"total_committers":8,"mean_commits":111.75,"dds":0.3221476510067114,"last_synced_commit":"366f2a1a57d5c01a7d51548c74e1d53465a25924"},"previous_names":[],"tags_count":21,"template":false,"template_full_name":null,"purl":"pkg:github/constantinpape/cluster_tools","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/constantinpape%2Fcluster_tools","tags_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/constantinpape%2Fcluster_tools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/constantinpape%2Fcluster_tools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/constantinpape%2Fcluster_tools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/constantinpape","download_url":"https://codeload.github.com/constantinpape/cluster_tools/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/constantinpape%2Fcluster_tools/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274121922,"owners_count":25225801,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-07T02:00:09.463Z","response_time":67,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-segmentation","bio-image-analysis","cluster-computing","connectomics","lifted-multicut","microscopy-images","multicut","mutex-watershed","segmentation","watershed"],"created_at":"2024-10-10T19:23:26.923Z","updated_at":"2025-09-08T01:32:22.116Z","avatar_url":"https://github.com/constantinpape.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Anaconda-Server 
Badge](https://anaconda.org/conda-forge/cluster_tools/badges/version.svg)](https://anaconda.org/conda-forge/cluster_tools)\n\n# Cluster Tools\n\nWorkflows for distributed Bio Image Analysis and Segmentation.\nSupports Slurm, LSF and local execution, and is easy to extend to more scheduling systems.\n\n\n## Workflows\n\n- [Hierarchical Multicut](http://openaccess.thecvf.com/content_ICCV_2017_workshops/papers/w1/Pape_Solving_Large_Multicut_ICCV_2017_paper.pdf) / [Hierarchical lifted Multicut](https://arxiv.org/abs/1905.10535)\n  - Distance Transform Watersheds\n  - Region Adjacency Graph\n  - Edge Feature Extraction from Boundary-or-Affinity Maps\n  - Agglomeration via (lifted) Multicut\n- [Sparse lifted Multicut from biological priors](https://arxiv.org/abs/1905.10535)\n- [Mutex Watershed](https://link.springer.com/chapter/10.1007/978-3-030-01225-0_34)\n- Connected Components\n- Downscaling and Pyramids\n  - [Paintera Format](https://github.com/saalfeldlab/paintera)\n  - [BigDataViewer Format](https://imagej.net/BigDataViewer)\n  - [Bigcat Format](https://github.com/saalfeldlab/bigcat)\n- [Ilastik Prediction](https://www.ilastik.org/)\n- Skeletonization\n- Distributed Neural Network Prediction (originally implemented [here](https://github.com/constantinpape/simpleference))\n- Validation with Rand Index and Variation of Information\n\n\n## Installation\n\nYou can install the package via conda:\n```\nconda install -c conda-forge cluster_tools\n```\n\nTo set up a development environment with all necessary dependencies, you can use the `environment.yml` file:\n```\nconda env create -f environment.yml\n```\nand then install the package in development mode via\n```\npip install -e . --no-deps\n```\n\n## Citation\n\nIf you use this software in a publication, please cite\n```\nPape, Constantin, et al. \"Solving large multicut problems for connectomics via domain decomposition.\" Proceedings of the IEEE International Conference on Computer Vision. 
2017.\n```\n\nFor the lifted multicut workflows, please cite\n```\nPape, Constantin, et al. \"Leveraging Domain Knowledge to improve EM image segmentation with Lifted Multicuts.\" arXiv preprint. 2019.\n```\nYou can find code for the experiments in `publications/lifted_domain_knowledge`.\n\nIf you are using another algorithm that is not part of these two publications, please also cite the appropriate publication ([see the links here](https://github.com/constantinpape/cluster_tools#workflows)).\n\n\n## Getting Started\n\nThis repository uses [luigi](https://github.com/spotify/luigi) for workflow management.\nWe support different cluster schedulers, so far:\n- [`slurm`](https://slurm.schedmd.com/documentation.html)\n- [`lsf`](https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_welcome/lsf_kc_ss.html)\n- `local` (local execution based on `ProcessPool`)\n\nThe scheduler can be selected by the keyword `target`.\nInter-process communication is achieved through files stored in a temporary folder, and\nmost workflows use [n5](https://github.com/saalfeldlab/n5) storage. You can use [z5](https://github.com/constantinpape/z5) to convert files to this format with Python.\n\nSimplified, running a workflow from this repository looks like this:\n```py\nimport json\nimport luigi\nfrom cluster_tools import SimpleWorkflow  # this is just a mock class, not actually part of this repository\n\n# folder for temporary scripts and files\ntmp_folder = 'tmp_wf'\n\n# directory for configurations for workflow sub-tasks stored as json\nconfig_dir = 'configs'\n\n# get the default configurations for all sub-tasks\ndefault_configs = SimpleWorkflow.get_config()\n\n# global configuration for shebang to proper python interpreter with all dependencies,\n# group name and block-shape\nglobal_config = default_configs['global']\nshebang = '#! 
/path/to/bin/python'\nglobal_config.update({'shebang': shebang, 'groupname': 'mygroup'})\nwith open('configs/global.config', 'w') as f:\n    json.dump(global_config, f)\n\n# run the example workflow with `max_jobs` number of jobs\nmax_jobs = 100\ntask = SimpleWorkflow(tmp_folder=tmp_folder, config_dir=config_dir,\n                      target='slurm', max_jobs=max_jobs,\n                      input_path='/path/to/input.n5', input_key='data',\n                      output_path='/path/to/output.n5', output_key='data')\nluigi.build([task])\n```\nFor a list of the available segmentation workflows, have a look at [this](https://github.com/constantinpape/cluster_tools/blob/master/cluster_tools/workflows.py).\nUnfortunately, there is no proper documentation yet. For more details, have a look at the\n[examples](https://github.com/constantinpape/cluster_tools/blob/master/example), in particular\n[this example](https://github.com/constantinpape/cluster_tools/blob/master/example/multicut.py).\nYou can download the example data (also used for the tests) [here](https://drive.google.com/file/d/1E_Wpw9u8E4foYKk7wvx5RPSWvg_NCN7U/view?usp=sharing).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fconstantinpape%2Fcluster_tools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fconstantinpape%2Fcluster_tools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fconstantinpape%2Fcluster_tools/lists"}