https://github.com/shenggan/bccd_dataset
BCCD (Blood Cell Count and Detection) Dataset is a small-scale dataset for blood cells detection.
- Host: GitHub
- URL: https://github.com/shenggan/bccd_dataset
- Owner: Shenggan
- License: mit
- Created: 2017-12-07T11:54:25.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2021-09-16T17:47:24.000Z (over 3 years ago)
- Last Synced: 2025-01-23T02:06:58.954Z (12 days ago)
- Topics: cell-detection, dataset, detection, medical-imaging
- Language: Python
- Homepage:
- Size: 7.22 MB
- Stars: 386
- Watchers: 11
- Forks: 211
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# BCCD Dataset
BCCD Dataset is a small-scale dataset for blood cells detection.
Thanks to [cosmicad](https://github.com/cosmicad/dataset) and [akshaylamba](https://github.com/akshaylamba/all_CELL_data) for the original data and annotations. The original dataset has been reorganized into VOC format. BCCD Dataset is released under the *[MIT licence](./LICENSE)*.
You can [download](https://github.com/Shenggan/BCCD_Dataset/releases) the `.rec` format for mxnet directly. The `.rec` file can be loaded with [mxnet.image.ImageDetIter](http://mxnet.incubator.apache.org/api/python/image/image.html?highlight=imagedetiter#mxnet.image.ImageDetIter).
### Data preparation
Data preparation is an important step before applying machine learning. In this project, the Faster R-CNN implementation from [keras-frcnn](https://github.com/kbardool/keras-frcnn) is used for object detection.
Starting from this [dataset](https://github.com/Shenggan/BCCD_Dataset), [nicolaschen1](https://github.com/nicolaschen1) developed two Python scripts that prepare the data (a CSV file and images) for recognizing abnormalities in blood cells in medical images:
- export.py: creates the file "test.csv" with all the data needed: filename, class_name, x1, y1, x2, y2.
- plot.py: plots the boxes for each image and saves the result in a new directory.
#### Overview of dataset
* Here is an example of a labeled cell image.
  There are three kinds of labels:
  * RBC (Red Blood Cell)
  * WBC (White Blood Cell)
  * Platelets

  ![example](./example.jpg)
* The structure of the `BCCD_dataset`
```
├── BCCD
│   ├── Annotations
│   │   └── BloodImage_00XYZ.xml (364 items)
│   ├── ImageSets       # Contains four Main/*.txt files which split the dataset
│   └── JPEGImages
│       └── BloodImage_00XYZ.jpg (364 items)
├── dataset
│   └── mxnet           # Some preprocessing scripts for mxnet
├── scripts
│   ├── split.py        # A script to generate the four .txt files in ImageSets
│   └── visualize.py    # A script to generate labeled images like example.jpg
├── example.jpg         # An example labeled image generated by visualize.py
├── LICENSE
└── README.md
```
* The `JPEGImages`:
  * **Image Type**: *jpeg (JPEG)*
  * **Width** x **Height**: *640 x 480*
* The `Annotations`: VOC-format `.xml` files for object detection, automatically generated by the labeling tool. Below is an example of an `.xml` file.
```xml
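<annotation>
	<folder>JPEGImages</folder>
	<filename>BloodImage_00000.jpg</filename>
	<path>/home/pi/detection_dataset/JPEGImages/BloodImage_00000.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>640</width>
		<height>480</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>WBC</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>260</xmin>
			<ymin>177</ymin>
			<xmax>491</xmax>
			<ymax>376</ymax>
		</bndbox>
	</object>
	...
	...
</annotation>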
```
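
An annotation like the one above can be flattened into the `filename, class_name, x1, y1, x2, y2` rows described under Data preparation. The following is a minimal sketch using only the Python standard library. It is not the actual `export.py`: the helper names `voc_to_rows` and `rows_to_csv`, the inline sample, and the CSV header line are assumptions made for illustration, and the tag names (`filename`, `object`, `name`, `bndbox`, `xmin`, ...) assume the usual Pascal VOC layout.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Inline sample mirroring the README's annotation example (one WBC box).
# Tag names assume the usual Pascal VOC layout.
SAMPLE_XML = """<annotation>
    <filename>BloodImage_00000.jpg</filename>
    <size><width>640</width><height>480</height><depth>3</depth></size>
    <object>
        <name>WBC</name>
        <bndbox>
            <xmin>260</xmin><ymin>177</ymin><xmax>491</xmax><ymax>376</ymax>
        </bndbox>
    </object>
</annotation>"""


def voc_to_rows(xml_text):
    """Return one (filename, class_name, x1, y1, x2, y2) tuple per object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        rows.append((filename, obj.findtext("name"), *coords))
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text; the header line is an assumption."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["filename", "class_name", "x1", "y1", "x2", "y2"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = voc_to_rows(SAMPLE_XML)
print(rows_to_csv(rows))
```

To process the whole dataset, the same function could be applied to the text of each file under `BCCD/Annotations` and the resulting rows concatenated before writing a single CSV.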