https://github.com/gsarti/cancer-detection
Team Capybara final project "Histopathologic Cancer Detection" for the Statistical Machine Learning course @ University of Trieste
cancer cancer-detection capsule-network capsule-networks convolutional-neural-networks data-science dssc healthcare image-segmentation random-forest university-of-trieste university-project unsupervised-clustering
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/gsarti/cancer-detection
- Owner: gsarti
- License: mit
- Created: 2019-05-21T14:27:37.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2019-06-28T16:19:29.000Z (almost 6 years ago)
- Last Synced: 2025-05-08T03:17:34.840Z (about 2 months ago)
- Topics: cancer, cancer-detection, capsule-network, capsule-networks, convolutional-neural-networks, data-science, dssc, healthcare, image-segmentation, random-forest, university-of-trieste, university-project, unsupervised-clustering
- Language: Jupyter Notebook
- Homepage:
- Size: 76.1 MB
- Stars: 9
- Watchers: 2
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

## Description
"Histopathologic Cancer Detection" was developed by Team Capybara ([gsarti](https://github.com/gsarti), [stinco](https://github.com/stinco), [andrealorenzon](https://github.com/andrealorenzon)) as the final project for the Statistical Machine Learning course held by Prof. Luca Bortolussi at the University of Trieste.
The project is based on the Kaggle competition ["Histopathologic Cancer Detection"](https://www.kaggle.com/c/histopathologic-cancer-detection), in which participants create an algorithm to identify metastatic cancer in small image patches taken from larger digital pathology scans.
## Data
The data are a slightly modified version of the **PatchCamelyon (PCam)** [benchmark dataset](https://github.com/basveeling/pcam) in which duplicates generated by probabilistic sampling were removed.
> **PCam** packs the clinically-relevant task of metastasis detection into a straight-forward binary image classification task, akin to CIFAR-10 and MNIST. Models can easily be trained on a single GPU in a couple hours, and achieve competitive scores in the Camelyon16 tasks of tumor detection and whole-slide image diagnosis. Furthermore, the balance between task-difficulty and tractability makes it a prime suspect for fundamental machine learning research on topics as active learning, model uncertainty, and explainability.
The data are provided under the [CC0 License](https://choosealicense.com/licenses/cc0-1.0/). They can be found in the `data` folder.
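The duplicate removal mentioned above can be illustrated with a small sketch. This is not the procedure used by the dataset maintainers, which is not described here. It assumes only that each patch is available as raw bytes (for example, the contents of an image file) and that exact byte-level duplicates are the ones to drop. The function name `first_occurrences` is hypothetical.

```python
import hashlib


def first_occurrences(patches):
    """Return the indices of the first occurrence of each distinct patch.

    `patches` is an iterable of bytes objects (assumed: raw image bytes).
    """
    seen = set()
    keep = []
    for index, raw in enumerate(patches):
        # Identical byte strings always produce identical digests.
        digest = hashlib.sha256(raw).hexdigest()
        if digest not in seen:
            seen.add(digest)
            keep.append(index)
    return keep


# Tiny synthetic example: the third "patch" repeats the first one.
demo = [b"\x00\x01\x02", b"\x10\x20\x30", b"\x00\x01\x02"]
kept = first_occurrences(demo)
print(kept)
```

Hashing only detects exact copies; near-duplicates (re-encoded or shifted patches) would need a different similarity measure.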
## Approaches
We compared three approaches for this classification task:
* Unsupervised segmentation of cell nuclei followed by a random forest on full-image cell statistics.
* DenseNet-169 convolutional neural network with pretrained weights and an adaptive learning rate, based on the top Kaggle kernel for the challenge.
* Capsule networks for tumor detection.
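
The first approach in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: it assumes that nuclei appear darker than the surrounding tissue, uses a two-cluster k-means on grayscale intensity as the unsupervised segmentation step, and computes three illustrative per-image statistics. The function names and the choice of features are hypothetical.

```python
import numpy as np


def two_means_threshold(gray, iterations=20):
    """1-D k-means with k=2 on pixel intensities; returns the split threshold."""
    threshold = (float(gray.min()) + float(gray.max())) / 2.0
    for _ in range(iterations):
        dark = gray[gray <= threshold]
        bright = gray[gray > threshold]
        if dark.size == 0 or bright.size == 0:
            break  # only one cluster present, nothing to refine
        new_threshold = (dark.mean() + bright.mean()) / 2.0
        if abs(new_threshold - threshold) < 1e-6:
            break  # converged
        threshold = new_threshold
    return threshold


def patch_features(rgb):
    """Segment the darker cluster and return simple full-image statistics."""
    gray = rgb.astype(np.float64).mean(axis=2)
    nuclei = gray <= two_means_threshold(gray)  # assumption: nuclei are darker
    background = ~nuclei
    return np.array([
        nuclei.mean(),                                        # fraction of nuclei pixels
        gray[nuclei].mean(),                                  # mean intensity inside the mask
        gray[background].mean() if background.any() else 0.0  # mean intensity outside it
    ])


# Synthetic patch: bright background with one dark 8x8 square.
patch = np.full((32, 32, 3), 220, dtype=np.uint8)
patch[4:12, 4:12, :] = 60
features = patch_features(patch)
print(features)
```

Feature vectors of this kind, one per image, could then be passed to a random forest classifier such as scikit-learn's `RandomForestClassifier`. The statistics actually used in the project are described in the presentation rather than here.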
More information on the project and resources can be found in our [project presentation](https://github.com/gsarti/cancer-detection/blob/master/HCD_Presentation.pdf).