https://github.com/gsarti/cancer-detection

Team Capybara final project "Histopathologic Cancer Detection" for the Statistical Machine Learning course @ University of Trieste


![header](img/header.png)

## Description

"Histopathologic Cancer Detection" was developed by Team Capybara ([gsarti](https://github.com/gsarti), [stinco](https://github.com/stinco), [andrealorenzon](https://github.com/andrealorenzon)) as the final project for the Statistical Machine Learning course held by Prof. Luca Bortolussi at the University of Trieste.

The project is based on the Kaggle competition ["Histopathologic Cancer Detection"](https://www.kaggle.com/c/histopathologic-cancer-detection), in which participants create an algorithm to identify metastatic cancer in small image patches taken from larger digital pathology scans.

## Data

The data are a slightly modified version of the **PatchCamelyon (PCam)** [benchmark dataset](https://github.com/basveeling/pcam) in which duplicates generated by probabilistic sampling were removed.

> **PCam** packs the clinically-relevant task of metastasis detection into a straight-forward binary image classification task, akin to CIFAR-10 and MNIST. Models can easily be trained on a single GPU in a couple hours, and achieve competitive scores in the Camelyon16 tasks of tumor detection and whole-slide image diagnosis. Furthermore, the balance between task-difficulty and tractability makes it a prime suspect for fundamental machine learning research on topics as active learning, model uncertainty, and explainability.

The data are provided under the [CC0 License](https://choosealicense.com/licenses/cc0-1.0/). They can be found in the `data` folder.
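
As a quick sanity check before training, it can help to look at the class balance of the labels. The sketch below is only illustrative: it assumes a labels file in the Kaggle competition format (a CSV with `id` and `label` columns), and both the helper name `label_counts` and the inline sample rows are hypothetical. The actual layout of the `data` folder may differ.

```python
import csv
import io
from collections import Counter


def label_counts(csv_file):
    """Count how many patches fall in each class of a labels file.

    Assumes a header row with `id` and `label` columns (an assumption
    based on the Kaggle competition format, not on this repository).
    """
    reader = csv.DictReader(csv_file)
    return Counter(int(row["label"]) for row in reader)


# Made-up rows standing in for a real labels file.
sample = io.StringIO("id,label\npatch_a,0\npatch_b,1\npatch_c,0\n")
counts = label_counts(sample)
print(counts)
```

With a real file, the same function would be called on an open file handle, for example `open(path, newline="")`, where `path` points to wherever the labels CSV is stored.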

## Approaches

We compared three approaches for this classification task:

* Unsupervised segmentation of cell nuclei followed by a random forest on full-image cell statistics.

* DenseNet-169 convolutional neural network with pretrained weights and adaptive learning rate, based on the top Kaggle kernel for the challenge.

* Capsule networks for tumor detection.
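
To make the first approach more concrete, here is a minimal sketch of the general idea, not the code used in the project. It assumes that nuclei can be separated from background by a two-cluster k-means on grayscale intensities and that the darker cluster corresponds to stained nuclei. The function names and the choice of statistics are hypothetical. The resulting feature vector is the kind of per-image summary that could then be passed to a random forest classifier.

```python
import numpy as np


def two_means_threshold(gray, n_iter=20):
    """Cluster pixel intensities into two groups with 1-D k-means (k=2).

    Returns the midpoint between the two cluster centres, which can be
    used as a segmentation threshold.
    """
    values = gray.astype(float).ravel()
    lo, hi = values.min(), values.max()  # initial centres: intensity extremes
    for _ in range(n_iter):
        threshold = (lo + hi) / 2.0
        dark = values[values <= threshold]
        bright = values[values > threshold]
        if dark.size == 0 or bright.size == 0:
            break  # degenerate patch, e.g. constant intensity
        new_lo, new_hi = dark.mean(), bright.mean()
        if new_lo == lo and new_hi == hi:
            break  # centres stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def image_features(gray):
    """Summarise one grayscale patch as a small feature vector."""
    threshold = two_means_threshold(gray)
    # Assumption: stained nuclei form the darker of the two clusters.
    nuclei_mask = gray <= threshold
    dark_fraction = nuclei_mask.mean()
    mean_dark = gray[nuclei_mask].mean() if nuclei_mask.any() else 0.0
    return np.array([dark_fraction, mean_dark, gray.mean(), gray.std()])


# Synthetic patch: bright background with one dark square standing in for nuclei.
patch = np.full((96, 96), 200, dtype=np.uint8)
patch[10:34, 10:34] = 50
features = image_features(patch)
print(features)
```

Stacking one such vector per image gives a feature matrix with one row per patch. A real pipeline would likely work on stain-specific color channels and add per-nucleus shape statistics rather than rely on a single intensity threshold.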

More information on the project and resources can be found in our [project presentation](https://github.com/gsarti/cancer-detection/blob/master/HCD_Presentation.pdf).