https://github.com/mirzaevinom/data_science_bowl_2018

My 5th place (out of 816 teams) solution to The 2018 Data Science Bowl organized by Booz Allen Hamilton

data-science-bowl-2018 deep-learning instance-segmentation kaggle-competition object-detection

Host: GitHub
URL: https://github.com/mirzaevinom/data_science_bowl_2018
Owner: mirzaevinom
Created: 2018-06-27T20:15:44.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2019-03-29T20:49:47.000Z (over 6 years ago)
Last Synced: 2024-10-31T03:35:31.770Z (12 months ago)
Topics: data-science-bowl-2018, deep-learning, instance-segmentation, kaggle-competition, object-detection
Language: Python
Homepage:
Size: 1.16 MB
Stars: 157
Watchers: 8
Forks: 63
Open Issues: 3
Metadata Files:
- Readme: readme.md

# The 2018 Data Science Bowl: "Spot Nuclei. Speed Cures."

This repository contains the scripts for my solution to [The 2018 Data Science Bowl](https://www.kaggle.com/c/data-science-bowl-2018). The goal of the competition was to create an algorithm that automates nucleus detection in biomedical images.

Weights are now available [here](https://www.dropbox.com/s/1kql7tsug876xfn/kaggle_bowl.h5?dl=0).

## Model overview

For this competition, I modified [Matterport's](https://github.com/matterport/Mask_RCNN) implementation of the [Mask-RCNN](https://arxiv.org/abs/1703.06870) deep neural network for object instance segmentation. I adapted the existing model configurations to detect small nuclei in images of varying size and modality. To keep the model from overfitting, I used an [external dataset](https://www.kaggle.com/voglinio/external-h-e-data-with-mask-annotations) and relied heavily on image augmentation. I also generated mosaics from the train images based on [this notebook](https://www.kaggle.com/bonlime/train-test-image-mosaic). To improve the generalizability of the model, I split (using stratification) the `stage1_train` dataset into train and validation sets based on the 5 image modalities provided by [Allen Goodman](https://www.kaggle.com/c/data-science-bowl-2018/discussion/48130). After training the model with ResNet101 as the backbone encoder and Adam as the optimizer, I improved prediction accuracy with test-time augmentation and mask post-processing.

## Training Method(s)

### Pre-processing
- I noticed some issues with the provided masks, so I used the corrected annotations and masks provided by [Konstantin Lopuhin](https://github.com/lopuhin/kaggle-dsbowl-2018-dataset-fixes) in the [data quality issues](https://www.kaggle.com/c/data-science-bowl-2018/discussion/47572) thread.
- Removed the alpha channel from the images.
- Filled holes in the masks.
- Split (using stratification) the `stage1_train` dataset into 90% train and 10% validation sets based on the 5 image modalities provided by [Allen Goodman](https://www.kaggle.com/c/data-science-bowl-2018/discussion/48130).
- Used an [external dataset](https://www.kaggle.com/voglinio/external-h-e-data-with-mask-annotations) provided in the forum. Divided the images and the masks into 4 pieces due to their large sizes. External dataset [download links](https://nucleisegmentationbenchmark.weebly.com/dataset.html).
- Generated mosaics from train images based on [Emil's](https://www.kaggle.com/bonlime/train-test-image-mosaic) notebook.
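
As a rough illustration of the steps above, here is a minimal sketch of the alpha-channel removal, hole filling, four-way splitting, and stratified 90/10 split. It is not the code from `augment_preprocess.py`; the function names, the one-validation-image-per-modality minimum, and the fixed seed are assumptions made for the example.

```python
import numpy as np
from scipy import ndimage


def drop_alpha(image):
    """Keep only the first three (RGB) channels of an H x W x C image."""
    return image[..., :3]


def fill_mask_holes(mask):
    """Fill enclosed background regions inside one binary instance mask."""
    return ndimage.binary_fill_holes(np.asarray(mask) > 0)


def split_into_quadrants(array):
    """Split an image or mask into four pieces along its two spatial axes."""
    h, w = array.shape[:2]
    top, bottom = array[: h // 2], array[h // 2:]
    return [top[:, : w // 2], top[:, w // 2:],
            bottom[:, : w // 2], bottom[:, w // 2:]]


def stratified_split(modalities, val_fraction=0.1, seed=0):
    """Return (train_idx, val_idx) with each modality represented in both.

    Assumption for this sketch: every modality contributes at least one
    validation image, even if val_fraction would round down to zero.
    """
    rng = np.random.RandomState(seed)
    modalities = np.asarray(modalities)
    train_idx, val_idx = [], []
    for modality in np.unique(modalities):
        idx = rng.permutation(np.flatnonzero(modalities == modality))
        n_val = max(1, int(round(val_fraction * len(idx))))
        val_idx.extend(idx[:n_val])
        train_idx.extend(idx[n_val:])
    return np.array(sorted(train_idx)), np.array(sorted(val_idx))
```

The split is done per modality so that rare image types do not end up entirely in one of the two sets.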

### Model and Training
* Modified [Matterport's](https://github.com/matterport/Mask_RCNN) implementation of [Mask-RCNN](https://arxiv.org/abs/1703.06870) deep neural network for object instance segmentation.
* Tuned hyperparameters to detect small nuclei in the images. (I found [this tutorial](https://engineering.matterport.com/splash-of-color-instance-segmentation-with-mask-r-cnn-and-tensorflow-7c761e238b46) very useful for understanding the model hyperparameters.)
    + The original Matterport implementation validated on only one image, so I fixed this [validation issue](https://github.com/matterport/Mask_RCNN/issues/89).
    + Reduced the RPN (region proposal network) anchor sizes since the nuclei are mostly small.
    + Increased the number of anchors used, since the nuclei are small and can be found anywhere in an image.
    + Increased the maximum number of predicted objects since an image can contain 300 or more nuclei.
    + Increased `POST_NMS_ROIS_TRAINING` to get more region proposals during training.
    + Added an extra parameter, `DETECTION_MASK_THRESHOLD`, to the model configuration. The default was hardcoded in the model as 0.5, but setting it to 0.35 helped in detecting small nuclei boundaries.
    + Resized images and masks to 512x512.
* Relied heavily on image augmentation due to the small training set:
    - Random horizontal or vertical flips
    - Random 90 or -90 degree rotations
    - [Random rotations](https://www.kaggle.com/c/data-science-bowl-2018/discussion/49692) in the range of (-15, 15) degrees
    - [Random cropping](https://www.kaggle.com/c/data-science-bowl-2018/discussion/49692) of bigger images and masks to 256x256x3
    - [Random scaling](https://www.kaggle.com/c/data-science-bowl-2018/discussion/49692) of images and masks in the range (0.5, 2.0)

* Used the ResNet101 architecture as the backbone encoder but initialized the first 50 layers of the model with pre-trained ResNet50 weights from the [ImageNet competition](https://github.com/fchollet/deep-learning-models/releases/).
* Trained the model with the [Adam](https://arxiv.org/abs/1412.6980) optimizer for 75 epochs:
- 25 epochs with learning rate 1e-4
- 25 epochs with learning rate 1e-5
- 25 epochs with learning rate 1e-6
* Did not retrain the model with stage1 test data during stage 2 as I was afraid of overfitting on black and white images.
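
To make the hyperparameter changes above more concrete, here is a hedged sketch of what such a configuration could look like. The attribute names follow Matterport's `Config` class, and `DETECTION_MASK_THRESHOLD` is the extra parameter described above. Only the 512x512 image size and the 0.35 threshold come from this README; every value marked "assumed" is illustrative and not the setting used in `train.py`. The class is a standalone stand-in so that the snippet runs without Mask-RCNN installed.

```python
import numpy as np


class NucleiConfigSketch:
    """Standalone stand-in for a Mask-RCNN config (not the real class)."""

    NUM_CLASSES = 1 + 1                        # background + nucleus
    IMAGE_MIN_DIM = 512                        # README: resized to 512x512
    IMAGE_MAX_DIM = 512
    RPN_ANCHOR_SCALES = (8, 16, 32, 64, 128)   # assumed: small anchors
    RPN_TRAIN_ANCHORS_PER_IMAGE = 320          # assumed: more anchors
    POST_NMS_ROIS_TRAINING = 2048              # assumed: more proposals
    DETECTION_MAX_INSTANCES = 512              # assumed: room for 300+ nuclei
    DETECTION_MASK_THRESHOLD = 0.35            # README: lowered from 0.5


def binarize_mask(soft_mask,
                  threshold=NucleiConfigSketch.DETECTION_MASK_THRESHOLD):
    """Turn per-pixel mask probabilities into a binary mask."""
    return np.asarray(soft_mask) >= threshold
```

A lower threshold keeps more of the uncertain pixels near the edge of each nucleus, which is why it mostly affects the mask boundaries.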

### Post-processing

- Combined predictions on actual image and horizontally flipped image: took unions of masks with maximum overlap and removed false positive masks with small overlap.
- Due to the configured RPN anchor sizes, the model predicts small nuclei very well but struggles with large nuclei. Therefore, if the model predicts no masks for an image, I scale down the image and predict once again.
- Removed overlaps between predicted nuclei based on their objectness score. In other words, removed intersections from the masks with lower scores.
- If this intersection removal split a mask into multiple objects, removed all the small pieces.
- Closed small holes inside the masks using morphological operations (dilation followed by erosion).
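
The overlap removal and hole closing described above can be sketched as follows. This is an illustration under stated assumptions and not the code from `predict.py`: the function names are hypothetical, "removing all the small pieces" is approximated by keeping only the largest connected component, and the closing uses SciPy's default structuring element.

```python
import numpy as np
from scipy import ndimage


def keep_largest_component(mask):
    """Drop every connected piece of a binary mask except the largest."""
    labels, count = ndimage.label(mask)
    if count <= 1:
        return mask
    sizes = np.bincount(labels.ravel())[1:]   # skip background label 0
    return labels == (np.argmax(sizes) + 1)


def remove_overlaps(masks, scores):
    """Resolve overlaps in N x H x W masks in favour of higher scores."""
    masks = np.asarray(masks, dtype=bool)
    claimed = np.zeros(masks.shape[1:], dtype=bool)
    cleaned = np.zeros_like(masks)
    for i in np.argsort(scores)[::-1]:        # highest score first
        remainder = masks[i] & ~claimed       # cut away claimed pixels
        remainder = keep_largest_component(remainder)
        cleaned[i] = remainder
        claimed |= remainder
    return cleaned


def close_small_holes(mask, iterations=1):
    """Morphological closing: dilation followed by erosion.

    Note: with SciPy's default border handling, pixels on the image
    border can be eroded away, so pad the mask first if that matters.
    """
    return ndimage.binary_closing(mask, iterations=iterations)
```

Processing masks in descending score order means a lower-scoring mask can only lose pixels to masks the model was more confident about.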

## Interesting findings

- Mask-RCNN model overfits easily without image augmentation.
- Removing false positive mask predictions improves the overall score significantly.
- Since images are on different scales, predicting masks on scaled images helps the model generalize.
- Dilating and then eroding individual masks helped me achieve a slightly better result.
- Matterport's original implementation [validated on only one image](https://github.com/matterport/Mask_RCNN/issues/89). Fixing this issue made the training process reproducible.
- I found that the model reaches a local minimum faster when trained with the Adam optimizer than with the default SGD optimizer.
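
Since augmentation mattered this much, here is a minimal NumPy sketch of the simpler geometric augmentations listed under Model and Training (flips, 90 or -90 degree rotations, and random crops of at most 256x256). It is an assumption-laden illustration, not the repository's augmentation code: the function name, the 50% flip probabilities, and the omission of the (-15, 15) degree rotations and random scaling are choices made for brevity. The key point it shows is that the image and its masks must receive exactly the same transform.

```python
import numpy as np


def augment(image, masks, crop=256, rng=None):
    """Apply one random flip/rotate/crop to an image and its masks.

    image: H x W x 3 array, masks: H x W x N array (one channel per nucleus).
    """
    if rng is None:
        rng = np.random.RandomState()
    if rng.rand() < 0.5:                      # horizontal flip
        image, masks = image[:, ::-1], masks[:, ::-1]
    if rng.rand() < 0.5:                      # vertical flip
        image, masks = image[::-1], masks[::-1]
    k = rng.choice([0, 1, 3])                 # none, +90 or -90 degrees
    image, masks = np.rot90(image, k), np.rot90(masks, k)
    h, w = image.shape[:2]
    ch, cw = min(h, crop), min(w, crop)       # crop only bigger images
    top = rng.randint(0, h - ch + 1)
    left = rng.randint(0, w - cw + 1)
    image = image[top:top + ch, left:left + cw]
    masks = masks[top:top + ch, left:left + cw]
    return np.ascontiguousarray(image), np.ascontiguousarray(masks)
```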

## Unsuccessful approaches tried
- Trained the model with Dice coefficient loss instead of the default binary cross-entropy loss for the mask heads. Although the results were comparable, it could not beat the prediction accuracy of binary cross-entropy on my validation set.
- Trained with random Gaussian and Poisson (or shot) noise for image augmentation. It actually hurt overall model performance.
- Tried ensembling actual image predictions with horizontal and vertical flip predictions. Used non-maximum suppression for removing overlaps. Did not improve prediction accuracy on the validation set.
- Trained end-to-end without initializing with pre-trained ImageNet weights. Mostly got to a mean IoU score of 0.35 on the stage1 test set.
- Trained on images preprocessed with adaptive histogram equalization (CLAHE). The model performed much worse.
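
For reference, the two mask losses compared in the first item can be written in a few lines of NumPy. This is a generic soft Dice formulation, shown only to make the comparison concrete; the exact variant and smoothing term used in my experiments are not documented here, so the `eps` value is an assumption.

```python
import numpy as np


def dice_coefficient(y_true, y_pred, eps=1e-7):
    """Soft Dice: 2 * |A intersect B| / (|A| + |B|), on probabilities."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + eps) / (y_true.sum() + y_pred.sum() + eps)


def dice_loss(y_true, y_pred):
    """Loss that is 0 for a perfect overlap and close to 1 for none."""
    return 1.0 - dice_coefficient(y_true, y_pred)


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean per-pixel binary cross-entropy."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    losses = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(losses))
```

Dice scores the overlap of the whole mask at once, while binary cross-entropy averages an independent penalty over every pixel.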

## Example model predictions

For the following figures red lines represent ground truth boundaries and blue lines represent prediction boundaries.

* Model predictions for some stage 1 test image samples:

* Model predictions for some stage 2 test image samples:

# Appendix

## A1. Model Execution Time

The following execution times were measured on NVIDIA P100 GPUs provided by the [Ohio Supercomputer Center](https://www.osc.edu/).

+ Each training epoch takes about 12 minutes.
+ It takes about 15 hours to train the model from scratch (75 epochs at about 12 minutes each).
+ It takes 1 to 4 seconds to predict all the nuclei on a single image.
+ Plotting the boundaries of each nucleus and saving the plots adds an extra 1 second on average.
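
These figures are consistent with the 75-epoch schedule from the Model and Training section: 75 epochs at about 12 minutes each is 900 minutes, or 15 hours. The short sketch below encodes that schedule and the arithmetic; the names `SCHEDULE` and `learning_rate_for_epoch` are hypothetical and do not come from `train.py`.

```python
# (number of epochs, learning rate) for each stage, as listed in this README
SCHEDULE = [(25, 1e-4), (25, 1e-5), (25, 1e-6)]
MINUTES_PER_EPOCH = 12  # approximate, measured on a P100


def learning_rate_for_epoch(epoch):
    """Return the learning rate for a 0-based epoch index."""
    start = 0
    for n_epochs, learning_rate in SCHEDULE:
        if epoch < start + n_epochs:
            return learning_rate
        start += n_epochs
    raise ValueError("epoch %d is beyond the schedule" % epoch)


total_epochs = sum(n_epochs for n_epochs, _ in SCHEDULE)
total_hours = total_epochs * MINUTES_PER_EPOCH / 60
```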

## A2. Dependencies

The code is written in Python (3.6.3) and was tested on Red Hat Enterprise Linux Server (7.4). The scripts depend on the following Python libraries available on PyPI:

* `tensorflow (1.3.0), keras (2.1.3), numpy (1.13.3), scipy (0.19.1)` for computations
* `cv2 (3.4.0), skimage (0.13.0), matplotlib (2.1.0)` for image processing and plotting (on PyPI, `cv2` is packaged as `opencv-python` and `skimage` as `scikit-image`)
* `tqdm (4.11.2)` for progress bar

## A3. How To Generate the Solution

1. Download and extract the training datasets (external dataset [download links](https://nucleisegmentationbenchmark.weebly.com/dataset.html)) and place them in the following folder structure:

2. Run `python augment_preprocess.py` to pre-process the external data and create mosaics from the dataset. (You can skip this step if you only want to train on the provided training set.)

3. Run `python train.py` to train the model. Model weights are saved at `../data/logs/kaggle_bowl/mask_rcnn.h5`.

4. Run `python predict.py` to evaluate model performance on validation set and predict nuclei boundaries on test set.

### Acknowledgements

* This material is based upon work supported by the National Science Foundation under Agreement No. 0931642 ([The Ohio Supercomputer Center](https://www.osc.edu/)).
* I would also like to thank the [Mathematical Biosciences Institute](http://mbi.osu.edu) (MBI) at Ohio State University for partially supporting this research. MBI receives its funding through the National Science Foundation grant DMS 1440386.
