https://github.com/janelia-cellmap/dacapo
A framework for easy application of established machine learning techniques on large, multi-dimensional images.
- Host: GitHub
- URL: https://github.com/janelia-cellmap/dacapo
- Owner: janelia-cellmap
- License: bsd-3-clause
- Created: 2023-08-24T20:18:00.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2025-09-15T19:14:05.000Z (9 months ago)
- Last Synced: 2026-01-07T22:45:11.747Z (6 months ago)
- Topics: deep-learning, machine-learning, segmentation
- Language: Python
- Homepage: https://janelia-cellmap.github.io/dacapo/
- Size: 54.7 MB
- Stars: 60
- Watchers: 4
- Forks: 10
- Open Issues: 30
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Citation: CITATION.cff

# DaCapo  

A framework for easy application of established machine learning techniques on large, multi-dimensional images.
`dacapo` allows you to configure machine learning jobs as combinations of
[DataSplits](https://janelia-cellmap.github.io/dacapo/autoapi/dacapo/experiments/datasplits/index.html),
[Architectures](https://janelia-cellmap.github.io/dacapo/autoapi/dacapo/experiments/architectures/index.html),
[Tasks](https://janelia-cellmap.github.io/dacapo/autoapi/dacapo/experiments/tasks/index.html),
and [Trainers](https://janelia-cellmap.github.io/dacapo/autoapi/dacapo/experiments/trainers/index.html)
on arbitrarily large volumes of
multi-dimensional images. `dacapo` is not tied to a particular learning
framework, but currently only supports [`torch`](https://pytorch.org/) with
plans to support [`tensorflow`](https://www.tensorflow.org/).

## Installation and Setup
DaCapo currently supports Python 3.10 and later. We recommend creating a new conda environment for DaCapo with Python 3.10.
```
conda create -n dacapo python=3.10
conda activate dacapo
```
Then install DaCapo using pip with the following command:
```
pip install dacapo-ml
```
This will install the minimum required dependencies.
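
To confirm that the environment matches the requirements above, a quick check along the following lines may help. This is a minimal sketch using only the Python standard library; the helper names `python_ok` and `installed_version` are made up for illustration and are not part of DaCapo. It assumes the installed distribution is named `dacapo-ml`, as in the `pip` command above.

```python
import sys
from importlib import metadata


def python_ok(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the interpreter meets the minimum version from this README."""
    return tuple(version_info[:2]) >= minimum


def installed_version(dist_name="dacapo-ml"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    print("dacapo-ml version:", installed_version() or "not installed")
```

If the second line reports "not installed", the `pip install` step most likely ran in a different environment than the one currently active.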
You may additionally use a MongoDB server to store outputs. To install and run MongoDB locally, refer to the [MongoDB documentation](https://www.mongodb.com/docs/manual/installation/).
Whether to use MongoDB, and which compute context to run in (on a cluster or not), should be specified in the `dacapo.yaml` file in the main directory.
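
Before pointing DaCapo at a MongoDB server, it can be useful to confirm that something is listening at the expected address. The sketch below is an assumption-laden illustration, not part of DaCapo: it uses only the standard library, the function name `port_open` is hypothetical, and it assumes a local server on MongoDB's default port 27017. It only tests that a TCP connection succeeds; it does not verify that the service is MongoDB, and it does not read `dacapo.yaml`.

```python
import socket


def port_open(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("MongoDB port reachable:", port_open())
```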
## Functionality Overview
Tasks we support and approaches for those tasks:
- Instance segmentation
    - Affinities
    - Local Shape Descriptors
- Semantic segmentation
    - Signed distances
    - One-hot encoding of different types of objects
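
To make two of the target representations listed above concrete, here is a small NumPy sketch of nearest-neighbor affinities (for instance segmentation) and one-hot encoding (for semantic segmentation). It is an illustration under simple assumptions, not DaCapo's implementation: label 0 is treated as background, affinities are computed only between directly adjacent voxels, and the function names are made up for this example.

```python
import numpy as np


def affinities(labels, axis):
    """Affinity between each voxel and its next neighbor along `axis`.

    The value is 1 where both voxels carry the same non-zero label, else 0.
    The output is one element shorter than `labels` along `axis`.
    """
    n = labels.shape[axis]
    a = np.take(labels, np.arange(0, n - 1), axis=axis)
    b = np.take(labels, np.arange(1, n), axis=axis)
    return ((a == b) & (a != 0)).astype(np.uint8)


def one_hot(classes, num_classes):
    """One-hot encode an integer class map into shape (num_classes, *classes.shape)."""
    channel_ids = np.arange(num_classes).reshape((-1,) + (1,) * classes.ndim)
    return (channel_ids == classes).astype(np.uint8)


# Tiny 2D example: 0 is background, 1 and 2 are two separate objects.
labels = np.array([[1, 1, 0, 2],
                   [1, 1, 0, 2]])
aff_x = affinities(labels, axis=1)   # affinities between horizontal neighbors
aff_y = affinities(labels, axis=0)   # affinities between vertical neighbors

# Semantic example: each pixel holds a class index in {0, 1, 2}.
classes = np.array([[0, 1],
                    [2, 1]])
encoded = one_hot(classes, num_classes=3)
print(aff_x.shape, aff_y.shape, encoded.shape)
```

Affinities drop to 0 across object boundaries and in the background, which is what lets a post-processing step separate touching instances.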
## Example Tutorial
A minimal example tutorial can be found in the examples directory and can also be opened in Colab.
## Helpful Resources & Tools
- Chunked data, zarr, and n5
    - OME-Zarr: a cloud-optimized bioimaging file format with international community support (doi: [10.1101/2023.02.17.528834](https://pubmed.ncbi.nlm.nih.gov/36865282/))
    - Videos about N5 and Fiji can be found in [this playlist](https://www.youtube.com/playlist?list=PLmZHHIZ9Gz-IJA7HtW8quZcuLViz9Em6e). For other questions, join the discussion on the [Image.sc forum](https://forum.image.sc/tag/n5).
    - Read about chunked storage plugins in Fiji in this blog post: [N5 plugins for Fiji](https://openorganelle.janelia.org/news/2023-02-06-n5-plugins-for-fiji)
    - A script for converting TIFF to zarr can be found [here](https://github.com/yuriyzubov/tiff-to-zarr)
- Segmentations
    - A description of the local shape descriptors used for the affinities task can be found in [this blog post](https://localshapedescriptors.github.io/), which includes an example image showing the difference between segmentations.
- CellMap Models
    - [GitHub Repo](https://github.com/janelia-cellmap/cellmap-models) of published models
        - For example, the COSEM-trained PyTorch networks are located [here](https://github.com/janelia-cellmap/cellmap-models/tree/main/src/cellmap_models/pytorch/cosem).
    - [OpenOrganelle.org](https://openorganelle.janelia.org)
        - Example of [unprocessed distance predictions](https://tinyurl.com/3kw2tuab)
        - Example of [refined segmentations](https://tinyurl.com/k59pba98) that have undergone post-processing (e.g., thresholding, masking, smoothing)
        - Example of [groundtruth data](https://tinyurl.com/pu8mespz)
- Visualization
    - [Neuroglancer GitHub Repo](https://github.com/google/neuroglancer)
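
The chunked formats listed above (zarr, N5) store a volume as a grid of fixed-size blocks, so a large image can be processed one block at a time instead of being loaded whole. The following sketch illustrates that idea in pure Python by enumerating the slices that tile an array of a given shape. It is a generic illustration, not DaCapo's or zarr's API; `chunk_slices` is a hypothetical helper name.

```python
from itertools import product


def chunk_slices(shape, chunk_shape):
    """Yield one tuple of slices per chunk, covering an array of `shape`.

    Edge chunks are clipped so they never extend past the array bounds.
    """
    ranges = [range(0, size, step) for size, step in zip(shape, chunk_shape)]
    for starts in product(*ranges):
        yield tuple(
            slice(start, min(start + step, size))
            for start, step, size in zip(starts, chunk_shape, shape)
        )


# A 5 x 4 array tiled with 2 x 3 chunks needs 3 x 2 = 6 chunks.
chunks = list(chunk_slices((5, 4), (2, 3)))
print(len(chunks))
```

Each yielded tuple can be used directly as an index into a NumPy-style array, for example `volume[chunk]`, so the same loop works for reading, predicting on, and writing back one block at a time.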
## Citing this repo
If you use our code, please cite us and spread the news!
```
@article{Patton_DaCapo_a_modular_2024,
  author = {Patton, William and Rhoades, Jeff L. and Zouinkhi, Marwan and Ackerman, David G. and Malin-Mayor, Caroline and Adjavon, Diane and Heinrich, Larissa and Bennett, Davis and Zubov, Yurii and {CellMap Project Team} and Weigel, Aubrey V. and Funke, Jan},
  doi = {10.48550/arXiv.2408.02834},
  journal = {arXiv-cs.CV},
  title = {{DaCapo: a modular deep learning framework for scalable 3D image segmentation}},
  year = {2024}
}
```