# MHSMA: The Modified Human Sperm Morphology Analysis Dataset

The MHSMA dataset is a collection of human sperm images from 235 patients with male factor infertility.
Each image is labeled by experts for normal or abnormal sperm acrosome, head, vacuole, and tail.

The training, validation, and test sets contain 1000, 240, and 300 images, respectively.

Images are available in two crop sizes: 128x128 and 64x64 pixels.
The following figure shows two versions of the same instance.

| 128x128-pixel | 64x64-pixel |
| :----------------------------------------------: | :--------------------------------------------: |
| ![MHSMA-128 sample](sample/mhsma-128-sample.png) | ![MHSMA-64 sample](sample/mhsma-64-sample.png) |

In MHSMA, each instance is a grayscale image capturing a single sperm.
The head of the sperm is roughly located at the center of the image.
Also, the sperm tail is not entirely visible in the images.

Each label is either `0` (normal, the positive class) or `1` (abnormal, the negative class).

The dataset is available in `.npy` format.
You can load the `.npy` files using [numpy.load](https://numpy.org/doc/stable/reference/generated/numpy.load.html).
The details of the files are described in the table below.

| File | Shape | Type | Description |
| ---------------------- | ------------------ | ------- | ------------------------------------- |
| `x_128_train.npy` | `(1000, 128, 128)` | `uint8` | Training set, 128x128-pixel version |
| `x_128_valid.npy` | `(240, 128, 128)` | `uint8` | Validation set, 128x128-pixel version |
| `x_128_test.npy` | `(300, 128, 128)` | `uint8` | Test set, 128x128-pixel version |
| `x_64_train.npy` | `(1000, 64, 64)` | `uint8` | Training set, 64x64-pixel version |
| `x_64_valid.npy` | `(240, 64, 64)` | `uint8` | Validation set, 64x64-pixel version |
| `x_64_test.npy` | `(300, 64, 64)` | `uint8` | Test set, 64x64-pixel version |
| `y_acrosome_train.npy` | `(1000,)` | `uint8` | Training set labels for acrosome |
| `y_acrosome_valid.npy` | `(240,)` | `uint8` | Validation set labels for acrosome |
| `y_acrosome_test.npy` | `(300,)` | `uint8` | Test set labels for acrosome |
| `y_head_train.npy` | `(1000,)` | `uint8` | Training set labels for head |
| `y_head_valid.npy` | `(240,)` | `uint8` | Validation set labels for head |
| `y_head_test.npy` | `(300,)` | `uint8` | Test set labels for head |
| `y_vacuole_train.npy` | `(1000,)` | `uint8` | Training set labels for vacuole |
| `y_vacuole_valid.npy` | `(240,)` | `uint8` | Validation set labels for vacuole |
| `y_vacuole_test.npy` | `(300,)` | `uint8` | Test set labels for vacuole |
| `y_tail_train.npy` | `(1000,)` | `uint8` | Training set labels for tail |
| `y_tail_valid.npy` | `(240,)` | `uint8` | Validation set labels for tail |
| `y_tail_test.npy` | `(300,)` | `uint8` | Test set labels for tail |
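
As a minimal sketch of how the files in the table fit together, the snippet below loads the images and all four label vectors for one split. It assumes the `.npy` files sit together in one local directory; the function name `load_split` and its `data_dir` argument are illustrative and not part of the dataset.

```python
from pathlib import Path

import numpy as np

LABELS = ("acrosome", "head", "vacuole", "tail")
SPLITS = ("train", "valid", "test")


def load_split(data_dir, split, size=64):
    """Load the images and the four label vectors for one split.

    data_dir is a hypothetical local directory holding the .npy files.
    Returns (x, y), where x has shape (N, size, size) and y maps each
    label name to a vector of shape (N,).
    """
    if split not in SPLITS:
        raise ValueError(f"split must be one of {SPLITS}, got {split!r}")
    if size not in (64, 128):
        raise ValueError(f"size must be 64 or 128, got {size!r}")

    data_dir = Path(data_dir)
    x = np.load(data_dir / f"x_{size}_{split}.npy")
    y = {name: np.load(data_dir / f"y_{name}_{split}.npy") for name in LABELS}

    # Each label vector should hold exactly one entry per image.
    for name, labels in y.items():
        if labels.shape != (x.shape[0],):
            raise ValueError(
                f"{name} labels have shape {labels.shape}, "
                f"expected {(x.shape[0],)}"
            )
    return x, y
```

For example, `x_train, y_train = load_split("path/to/files", "train", size=128)` should give `x_train.shape == (1000, 128, 128)` and `y_train["head"].shape == (1000,)`, per the table above.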

The following table shows the number of positive and negative examples in the dataset.



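| Set            | Label    | # Positive | # Negative | % Positive |
| -------------- | -------- | ---------: | ---------: | ---------: |
| Whole dataset  | Acrosome |      1,086 |        454 |      70.52 |
|                | Head     |      1,122 |        418 |      72.86 |
|                | Vacuole  |      1,301 |        239 |      84.48 |
|                | Tail     |      1,471 |         69 |      95.52 |
| Training set   | Acrosome |        699 |        301 |      69.90 |
|                | Head     |        727 |        273 |      72.70 |
|                | Vacuole  |        830 |        170 |      83.00 |
|                | Tail     |        954 |         46 |      95.40 |
| Validation set | Acrosome |        174 |         66 |      72.50 |
|                | Head     |        176 |         64 |      73.33 |
|                | Vacuole  |        209 |         31 |      87.08 |
|                | Tail     |        233 |          7 |      97.08 |
| Test set       | Acrosome |        213 |         87 |      71.00 |
|                | Head     |        219 |         81 |      73.00 |
|                | Vacuole  |        262 |         38 |      87.33 |
|                | Tail     |        284 |         16 |      94.67 |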
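
The counts above can be recomputed from any label vector. The sketch below keeps in mind that MHSMA encodes positive (normal) as `0` and negative (abnormal) as `1`. The helper name `class_balance` is illustrative, and the demo array is a synthetic stand-in with the same balance as the test-set acrosome labels, not the real file.

```python
import numpy as np


def class_balance(labels):
    """Return (n_positive, n_negative, percent_positive) for a label vector.

    MHSMA convention: 0 = positive (normal), 1 = negative (abnormal).
    """
    labels = np.asarray(labels)
    n_negative = int(np.count_nonzero(labels))  # entries equal to 1
    n_positive = int(labels.size - n_negative)  # entries equal to 0
    pct_positive = round(100.0 * n_positive / labels.size, 2)
    return n_positive, n_negative, pct_positive


# Synthetic stand-in: 213 zeros (positive) and 87 ones (negative).
# With the real data, use np.load on y_acrosome_test.npy instead.
demo = np.concatenate([
    np.zeros(213, dtype=np.uint8),
    np.ones(87, dtype=np.uint8),
])
print(class_balance(demo))  # (213, 87, 71.0)
```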

## Results

If you would like to add a new result, you can [open a pull request](https://github.com/soroushj/mhsma-dataset/pulls).



| Method | Label | Accuracy | Precision | Recall | F0.5 score | G-mean | ROC AUC | MCC |
| ------ | ----- | -------: | --------: | -----: | ---------: | -----: | ------: | --: |
| A novel deep learning method for automatic assessment of human sperm images (Apr 2019) | Acrosome | 76.67 | 85.93 | 80.28 | 84.74 | 83.06 | 83.89 | +0.4618 |
| | Head | 77.00 | 83.48 | 85.39 | 83.86 | 84.43 | 77.80 | +0.4053 |
| | Vacuole | 91.33 | 94.36 | 95.80 | 94.65 | 95.08 | 88.08 | +0.5910 |
| Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Acrosome (DTL) | 79.00 | 80.24 | 93.42 | 82.57 | 86.58 | 79.65 | +0.4447 |
| | Acrosome (DMTL) | 80.66 | 82.42 | 92.48 | 84.26 | 87.31 | 78.19 | +0.4984 |
| | Head (DTL) | 84.00 | 87.01 | 91.78 | 87.92 | 89.36 | 81.56 | +0.5775 |
| | Head (DMTL) | 82.00 | 82.60 | 95.43 | 84.89 | 88.78 | 78.40 | +0.5021 |
| | Vacuole (DTL) | 94.00 | 95.18 | 98.09 | 95.75 | 96.62 | 94.73 | +0.7082 |
| | Vacuole (DMTL) | 92.33 | 94.75 | 96.56 | 95.11 | 95.65 | 93.64 | +0.6348 |
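
For anyone preparing a new result, the sketch below shows one way to compute the threshold-based columns from hard predictions, treating label `0` as the positive class. Two things here are assumptions rather than statements from the source. First, G-mean is taken to be the geometric mean of precision and recall, which is the definition that agrees with the reported figures. Second, the confusion counts in the demo (171 true positives, 28 false positives, 42 false negatives, 59 true negatives) are inferred from the first Acrosome row and the test-set class balance; they are not published in this README. ROC AUC is left out because it needs continuous scores rather than hard predictions. The function name `binary_metrics` is illustrative.

```python
import math

import numpy as np


def binary_metrics(y_true, y_pred, positive=0, beta=0.5):
    """Threshold-based metrics with `positive` as the positive class.

    Assumes both classes occur in y_true and in y_pred, so that no
    denominator below is zero.
    """
    true_pos = np.asarray(y_true) == positive
    pred_pos = np.asarray(y_pred) == positive

    tp = int(np.sum(true_pos & pred_pos))
    fp = int(np.sum(~true_pos & pred_pos))
    fn = int(np.sum(true_pos & ~pred_pos))
    tn = int(np.sum(~true_pos & ~pred_pos))

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    # Assumption: G-mean = sqrt(precision * recall).
    g_mean = math.sqrt(precision * recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f_beta": f_beta,
        "g_mean": g_mean,
        "mcc": mcc,
    }


# Hypothetical predictions built from the inferred confusion counts:
# 213 positives (171 predicted positive, 42 predicted negative) and
# 87 negatives (28 predicted positive, 59 predicted negative).
y_true = np.array([0] * 213 + [1] * 87)
y_pred = np.array([0] * 171 + [1] * 42 + [0] * 28 + [1] * 59)

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    scale = 1 if name == "mcc" else 100  # MCC is reported unscaled
    print(f"{name}: {value * scale:.4f}")
```

With these counts the accuracy, precision, recall, F0.5, and G-mean come out at 76.67, 85.93, 80.28, 84.74, and 83.06 percent, and MCC at about +0.4618, in line with the first Acrosome row of the table.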

## Citation

If you use this dataset in your research, please cite [our work](https://doi.org/10.1016/j.compbiomed.2019.04.030) as:

```bibtex
@article{javadi2019novel,
title={A novel deep learning method for automatic assessment of human sperm images},
author={Javadi, Soroush and Mirroshandel, Seyed Abolghasem},
journal={Computers in Biology and Medicine},
volume={109},
pages={182--194},
year={2019},
doi={10.1016/j.compbiomed.2019.04.030}
}
```

## License

This dataset is made available under the [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license.

## Credits

MHSMA is based on the Human Sperm Morphology Analysis Dataset (HSMA-DS) [(Ghasemian et al., 2015)](https://doi.org/10.1016/j.cmpb.2015.08.013).