https://github.com/shenggan/bccd_dataset

BCCD (Blood Cell Count and Detection) Dataset is a small-scale dataset for blood cell detection.

# BCCD Dataset

BCCD Dataset is a small-scale dataset for blood cell detection.

Thanks to [cosmicad](https://github.com/cosmicad/dataset) and [akshaylamba](https://github.com/akshaylamba/all_CELL_data) for the original data and annotations. The original dataset has been reorganized into VOC format. BCCD Dataset is released under the *[MIT licence](./LICENSE)*.

You can [download](https://github.com/Shenggan/BCCD_Dataset/releases) the `.rec` file for MXNet directly. The `.rec` file can be loaded with [mxnet.image.ImageDetIter](http://mxnet.incubator.apache.org/api/python/image/image.html?highlight=imagedetiter#mxnet.image.ImageDetIter).

### Data preparation
Data preparation is an essential step before applying machine learning. In this project, the Faster R-CNN implementation from [keras-frcnn](https://github.com/kbardool/keras-frcnn) is used for object detection.
Starting from this [dataset](https://github.com/Shenggan/BCCD_Dataset), [nicolaschen1](https://github.com/nicolaschen1) developed two Python scripts that prepare the data (a CSV file and images) for recognizing abnormalities in blood cells in medical images.

- export.py: creates the file "test.csv" with all the data needed: filename, class_name, x1, y1, x2, y2.
- plot.py: plots the boxes on each image and saves the result in a new directory.
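
As a rough illustration of the kind of export described above, the following is a minimal sketch and not the actual `export.py`. It assumes the standard Pascal VOC tag names (`filename`, `object`, `name`, `bndbox`, `xmin`, `ymin`, `xmax`, `ymax`), assumes that x1, y1, x2, y2 map to xmin, ymin, xmax, ymax, and writes a header row, which the original script may or may not do. The function name `export_annotations` is hypothetical.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path


def export_annotations(annotations_dir, csv_path):
    """Write one CSV row per labeled box found in the VOC .xml files.

    Returns the number of boxes written. Column order follows the list
    above: filename, class_name, x1, y1, x2, y2.
    """
    rows = []
    for xml_file in sorted(Path(annotations_dir).glob("*.xml")):
        root = ET.parse(xml_file).getroot()
        filename = root.findtext("filename")
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            rows.append([
                filename,
                obj.findtext("name"),
                int(box.findtext("xmin")),  # assumed to correspond to x1
                int(box.findtext("ymin")),  # assumed to correspond to y1
                int(box.findtext("xmax")),  # assumed to correspond to x2
                int(box.findtext("ymax")),  # assumed to correspond to y2
            ])

    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        # Header row is an assumption made for readability.
        writer.writerow(["filename", "class_name", "x1", "y1", "x2", "y2"])
        writer.writerows(rows)
    return len(rows)
```

Calling `export_annotations("BCCD/Annotations", "test.csv")` from the repository root would then produce one row per annotated cell.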

#### Overview of dataset

* You can see an example of a labeled cell image below.

We have three kinds of labels:

* RBC (Red Blood Cell)
* WBC (White Blood Cell)
* Platelets

![example](./example.jpg)

* The structure of the `BCCD_dataset`:

```
├── BCCD
│   ├── Annotations
│   │   └── BloodImage_00XYZ.xml (364 items)
│   ├── ImageSets       # Contains four Main/*.txt files which split the dataset
│   └── JPEGImages
│       └── BloodImage_00XYZ.jpg (364 items)
├── dataset
│   └── mxnet           # Some preprocessing scripts for mxnet
├── scripts
│   ├── split.py        # A script to generate the four .txt files in ImageSets
│   └── visualize.py    # A script to generate a labeled image like example.jpg
├── example.jpg         # An example labeled image generated by visualize.py
├── LICENSE
└── README.md
```
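
To make the role of the four `ImageSets/Main/*.txt` files more concrete, here is a minimal sketch of a split script. It is not the repository's `split.py`: the file names `train.txt`, `val.txt`, `trainval.txt` and `test.txt` follow the usual Pascal VOC convention and are an assumption, as are the ratios, the seed, and the function name `split_dataset`.

```python
import random
from pathlib import Path


def split_dataset(annotations_dir, output_dir, val_ratio=0.2, test_ratio=0.2, seed=0):
    """Split annotation IDs into train/val/trainval/test lists (VOC-style).

    Each output file holds one image ID per line (the file stem without
    extension). Ratios and seed are illustrative defaults.
    """
    ids = sorted(p.stem for p in Path(annotations_dir).glob("*.xml"))
    random.Random(seed).shuffle(ids)  # seeded, so the split is reproducible

    n_test = int(len(ids) * test_ratio)
    n_val = int(len(ids) * val_ratio)
    test = ids[:n_test]
    val = ids[n_test:n_test + n_val]
    train = ids[n_test + n_val:]

    splits = {
        "train": train,
        "val": val,
        "trainval": train + val,  # union of train and val
        "test": test,
    }

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, subset in splits.items():
        text = "".join(f"{image_id}\n" for image_id in sorted(subset))
        (out / f"{name}.txt").write_text(text)
    return {name: len(subset) for name, subset in splits.items()}
```

For example, `split_dataset("BCCD/Annotations", "BCCD/ImageSets/Main")` would write the four lists next to each other and return the size of each subset.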

* The `JPEGImages`:

    * **Image Type** : *jpeg(JPEG)*
    * **Width** x **Height** : *640 x 480*

* The `Annotations`: VOC-format `.xml` files for object detection, automatically generated by the labeling tool. Below is an example `.xml` file.

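```xml
<annotation>
	<folder>JPEGImages</folder>
	<filename>BloodImage_00000.jpg</filename>
	<path>/home/pi/detection_dataset/JPEGImages/BloodImage_00000.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>640</width>
		<height>480</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>WBC</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>260</xmin>
			<ymin>177</ymin>
			<xmax>491</xmax>
			<ymax>376</ymax>
		</bndbox>
	</object>
	...
	<object>
		...
	</object>
</annotation>
```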
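
Since the dataset targets both counting and detection, a short sketch of reading one annotation and tallying cells per label may help. It assumes the standard Pascal VOC tag names (`object`, `name`, `bndbox`, `xmin`, `ymin`, `xmax`, `ymax`). The `SAMPLE` string is a trimmed stand-in for an annotation file: the WBC box uses the coordinates from the example above, and the helper names `read_boxes` and `count_cells` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed stand-in for one annotation file; only the fields used below.
SAMPLE = """
<annotation>
  <filename>BloodImage_00000.jpg</filename>
  <object>
    <name>WBC</name>
    <bndbox>
      <xmin>260</xmin><ymin>177</ymin><xmax>491</xmax><ymax>376</ymax>
    </bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = [int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return boxes


def count_cells(boxes):
    """Count how many boxes carry each label (RBC, WBC, Platelets)."""
    return Counter(label for label, *_ in boxes)


boxes = read_boxes(SAMPLE)
counts = count_cells(boxes)
```

To process a real file, one could pass `Path("BCCD/Annotations/BloodImage_00000.xml").read_text()` to `read_boxes` instead of `SAMPLE`.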