https://github.com/cyrilzakka/phantom
Effective DICOM anonymization in Python
- Host: GitHub
- URL: https://github.com/cyrilzakka/phantom
- Owner: cyrilzakka
- Created: 2023-01-19T18:57:17.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-01-20T08:22:14.000Z (over 2 years ago)
- Last Synced: 2025-04-13T05:42:58.859Z (about 1 month ago)
- Topics: anonymization, dicom, medical-image-processing, medical-imaging, pydicom
- Language: Python
- Homepage:
- Size: 2.28 MB
- Stars: 4
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Citation: CITATION.cff
README
# Phantom: DICOM Anonymization
Phantom is a simple Python module intended to simplify DICOM anonymization for medical machine-learning applications. Please keep in mind that we do NOT guarantee IRB-validated outputs and are not liable for any breaches of patient privacy.

## Installation
```bash
git clone https://github.com/cyrilzakka/Phantom.git
cd Phantom
```
Phantom relies on a few dependencies to run:
```bash
pip install -r requirements.txt
```

## Usage
### Metadata (Alpha)
Phantom maintains a very strict paradigm for anonymization to ensure a maximum degree of privacy. Identifiers specified in a config such as `base.yaml` are kept within the metadata of the DICOM file, while everything else, including private metadata, is purged. Pixel data is always preserved.

To anonymize DICOM metadata, simply run:
```bash
python meta/metadata.py -s /path/to/dicoms -c /path/to/config.yaml
```
where `config.yaml` is a simple YAML file specifying the attributes to keep in the DICOM.
```yaml
# base.yaml
base:
  keep: ['PatientsSex']               # list: keys to keep in the DICOM metadata
  jitter: {'PatientBirthDate': 30}    # dict: keys to jitter
  generate: ['PatientID']             # list: keys to replace with a randomized ID
```

As DICOMs become more complex, we rely on modularity to keep anonymization manageable. Phantom configs can be composed together to create more complex anonymization pipelines. As an example, here we create a new `echo.yaml` configuration that inherits from `base.yaml` above while appending modality-specific attributes like `HeartRate` or `NumberOfFrames`. Please keep in mind that with inheritance, any keys specified in either configuration file will be kept.
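The "keys from either file are kept" rule can be read as a union-style merge. The sketch below is only an illustration of that reading, not Phantom's actual implementation: `resolve_config` is a hypothetical helper, the configs are plain dictionaries rather than parsed YAML, and letting a child's `jitter` value override the parent's for the same key is an assumption.

```python
def resolve_config(name, configs):
    """Resolve `name`, merging in the config named by its `defaults` key.

    Hypothetical helper for illustration; Phantom's own loader may differ.
    """
    section = configs[name]
    parent_name = section.get("defaults")
    if parent_name is None:
        parent = {"keep": [], "jitter": {}, "generate": []}
    else:
        # Resolve the parent first so chains of `defaults` also work.
        parent = resolve_config(parent_name, configs)

    def union(first, second):
        # Order-preserving union: parent keys first, duplicates dropped.
        return list(dict.fromkeys([*first, *second]))

    return {
        "keep": union(parent["keep"], section.get("keep", [])),
        # Assumption: for a key present in both, the child's value wins.
        "jitter": {**parent["jitter"], **section.get("jitter", {})},
        "generate": union(parent["generate"], section.get("generate", [])),
    }


# Plain-dict stand-ins for the two YAML files in this section.
configs = {
    "base": {
        "keep": ["PatientsSex"],
        "jitter": {"PatientBirthDate": 30},
        "generate": ["PatientID"],
    },
    "echo": {
        "defaults": "base",
        "keep": ["HeartRate", "NumberOfFrames"],
        "jitter": {"AcquisitionDateTime": 30, "StudyDate": 30},
        "generate": [],
    },
}

resolved = resolve_config("echo", configs)
print(resolved["keep"])
```

Under this reading, resolving `echo` keeps `PatientsSex` from the parent alongside the echo-specific keys. The `echo.yaml` file that the second dictionary mirrors is: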
```yaml
# echo.yaml
echo:
  defaults: 'base'
  keep: ['HeartRate', 'NumberOfFrames']
  jitter: {'AcquisitionDateTime': 30, 'StudyDate': 30}
  generate: []
```

### Burned Annotations
DICOM files occasionally contain burned annotations, that is, patient data embedded within the pixels of the images. While this is often difficult to detect and remove without some sort of machine-learning approach, medical imaging modalities with a temporal dimension (e.g. echocardiograms) offer a simple solution via static pixel masking. This can be quickly achieved using the `TemporalMaskingPipeline`:
```bash
python pixel/temporal_mask.py -s /path/to/dicoms -d /path/to/destination
```
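To illustrate the idea behind static pixel masking, here is a minimal NumPy sketch. It is not the code behind `TemporalMaskingPipeline`; the function name `mask_static_pixels` and the `tolerance` parameter are hypothetical, and the input is assumed to be a grayscale clip shaped `(num_frames, rows, cols)`. Pixels whose value never changes over time (as burned-in text typically does not) are zeroed, while moving anatomy is left alone.

```python
import numpy as np


def mask_static_pixels(frames: np.ndarray, tolerance: float = 0.0) -> np.ndarray:
    """Zero out pixels whose value stays constant across all frames.

    frames: grayscale clip shaped (num_frames, rows, cols).
    tolerance: largest max-minus-min spread still treated as static.
    """
    if frames.ndim != 3:
        raise ValueError("expected an array shaped (num_frames, rows, cols)")
    # Cast to float so the subtraction cannot wrap around for unsigned dtypes.
    as_float = frames.astype(np.float64)
    # Per-pixel spread over time: 0 means identical in every frame.
    spread = as_float.max(axis=0) - as_float.min(axis=0)
    static = spread <= tolerance
    masked = frames.copy()  # leave the input untouched
    masked[:, static] = 0   # apply the same 2D mask to every frame
    return masked


# Tiny 3-frame, 2x2 clip: only the top-right pixel changes over time.
clip = np.array(
    [
        [[5, 1], [9, 0]],
        [[5, 2], [9, 0]],
        [[5, 3], [9, 0]],
    ],
    dtype=np.uint8,
)
masked = mask_static_pixels(clip)
```

A zero tolerance only catches perfectly constant pixels; lossy-compressed clips would likely need a small positive tolerance, and the right value depends on the data.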
## Disclaimer
This project is still in the alpha phase of development and is likely to experience some breaking changes as a result. If you run into any errors, please make sure to update the package first before opening an issue.

## Issues
If you have any issues or feature requests, or simply want to contribute, please do not hesitate to submit a pull request.