https://github.com/imagingdatacommons/idc-sm-annotations-conversion

conversion dicom pathomics


Host: GitHub
URL: https://github.com/imagingdatacommons/idc-sm-annotations-conversion
Owner: ImagingDataCommons
License: mit
Created: 2023-06-13T14:33:19.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2025-03-01T22:22:32.000Z (11 months ago)
Last Synced: 2025-03-01T23:22:55.433Z (11 months ago)
Topics: conversion, dicom, pathomics
Language: Python
Homepage:
Size: 463 KB
Stars: 0
Watchers: 6
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# IDC Annotation Conversion

Python project for converting various pathology annotations into DICOM
format for ingestion into the Imaging Data Commons.

The code in this repository is currently under development.

### Installation

This repository is structured to be directly installable as a Python
distribution named `idc-annotation-conversion` via pip. Run this command from
the root of the cloned repository to install the package along with all its
dependencies (defined in `pyproject.toml`) in your current Python environment:

```bash
pip install .
```
Alternatively, you can install the package directly from the remote repository with:

```bash
pip install git+https://github.com/ImagingDataCommons/idc-sm-annotations-conversion.git
```
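
To confirm that the installation succeeded, one option is to query the installed distribution metadata. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of this package, and the distribution name is the one stated above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "idc-annotation-conversion") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("idc-annotation-conversion is not installed")
    else:
        print(f"idc-annotation-conversion {found} is installed")
```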

### Cloud Authentication

You need to authenticate to the relevant Google Cloud buckets to run the code
in this package. Specifically, access to the following resources is required:

- Project `idc-etl-processing`
- Bucket `public-datasets-idc`, the public bucket containing DICOM-format whole
slide images.
- Bucket `idc-annotation-conversion-outputs`, or any other bucket specified
as the output bucket, if any.

Depending on the conversion process that you are running, you may also need
access to:

- Bucket `tcia-nuclei-seg`, which contains the original (CSV format)
segmentations for the `pan_cancer_nuclei_seg` conversion process.
- Project `idc-external-031` and bucket `rms_annotation_test_oct_2023`, which contains the
original (XML format) annotations for the `rms` conversion process.

If you are using an IDC cloud VM, this should be handled
automatically for you. Otherwise, you should run:

```bash
gcloud auth application-default login --billing-project idc-etl-processing
```

and then once you are finished:

```bash
gcloud auth application-default revoke
```
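
Before starting a long conversion run, it may help to check whether file-based application-default credentials are present. The sketch below is an illustration only, not part of this package: the helper name `find_adc_file` is hypothetical. It looks at the `GOOGLE_APPLICATION_CREDENTIALS` environment variable and then at gcloud's default configuration directory, and it assumes that directory has not been relocated (for example through `CLOUDSDK_CONFIG`). It will not detect credentials supplied automatically by a cloud VM, so a `None` result there does not indicate a problem.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_adc_file(
    env: Mapping[str, str] = os.environ,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the path of an application-default credentials file, if one exists."""
    # An explicitly configured credentials file takes precedence.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit and Path(explicit).is_file():
        return Path(explicit)

    # Otherwise look in gcloud's default configuration directory.
    if os.name == "nt":
        base = Path(env.get("APPDATA", "")) / "gcloud"
    else:
        base = (home or Path.home()) / ".config" / "gcloud"
    candidate = base / "application_default_credentials.json"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    path = find_adc_file()
    print(path if path is not None else "no file-based credentials found")
```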

### Use

Each conversion process is implemented as a submodule of the `idc_annotation_conversion`
module, which is installed when you install this package. Each submodule has
an entrypoint (a `__main__.py` file), so once this package is installed you can
run a process with:

```bash
python -m idc_annotation_conversion.<submodule_name>
```

So for example to run the `pan_cancer_nuclei_seg` conversion process:

```bash
python -m idc_annotation_conversion.pan_cancer_nuclei_seg
```

In each case, the default parameters should be sufficient to run a conversion process
on the entire collection, but there are a number of optional arguments to control the process.
You can see the options by passing `--help` when calling the submodule. E.g.:

```bash
python -m idc_annotation_conversion.pan_cancer_nuclei_seg --help
```
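
If you want to launch a conversion from another Python script rather than from the shell, the same `python -m` invocation can be assembled programmatically. The following is a minimal sketch, assuming the package is installed in the current environment; the helper names `build_command` and `run_conversion` are hypothetical and are not provided by this package.

```python
import subprocess
import sys
from typing import List, Sequence


def build_command(process: str, extra_args: Sequence[str] = ()) -> List[str]:
    """Build the argv list that runs one conversion submodule with `python -m`."""
    if not process.isidentifier():
        raise ValueError(f"not a valid submodule name: {process!r}")
    module = f"idc_annotation_conversion.{process}"
    # sys.executable keeps the call inside the current Python environment.
    return [sys.executable, "-m", module, *extra_args]


def run_conversion(process: str, extra_args: Sequence[str] = ()) -> None:
    """Run a conversion submodule; raises CalledProcessError if it fails."""
    subprocess.run(build_command(process, extra_args), check=True)


if __name__ == "__main__":
    # Show the command equivalent to the `--help` example above without running it.
    print(" ".join(build_command("pan_cancer_nuclei_seg", ["--help"])))
```

Calling `run_conversion("pan_cancer_nuclei_seg", ["--help"])` would then execute the command and print the submodule's options.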

### Modules

The following modules are currently available:

- `pan_cancer_nuclei_seg`: Conversion of Pan Cancer Nuclei segmentations from
CSV to ANN and SEGs for various TCGA collections.
- `rms`: Conversion of annotations related to the "RMS-Mutation-Prediction"
collection. Specifically, conversion of hand-annotated regions to SR, and
ML-generated segmentations to SEG.
