
# YOLOv1 with TensorFlow 2

![tf-v2.5.0](https://img.shields.io/badge/TensorFlow-v2.5.0-orange)

For ease of implementation, this project does not follow the paper exactly. The following components are implemented differently from the paper:

- Backbone network (uses **Xception** instead of the network described in the paper)
- Learning rate schedule (uses `tf.keras.optimizers.schedules.ExponentialDecay`)
- Data augmentations
- Hyperparameters
- And so on ...
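As a rough illustration of the schedule above, `ExponentialDecay` scales the learning rate by a fixed factor every `decay_steps` steps. A minimal pure-Python sketch of the formula (the hyperparameter values below are placeholders, not this repo's defaults):

```python
def exponential_decay(step, init_lr, decay_rate, decay_steps, staircase=False):
    """Mirrors tf.keras.optimizers.schedules.ExponentialDecay:
    lr = init_lr * decay_rate ** (step / decay_steps)."""
    exponent = step / decay_steps
    if staircase:
        exponent = step // decay_steps  # decay in discrete jumps instead of continuously
    return init_lr * decay_rate ** exponent

# Example: learning rate halves every 1000 steps (placeholder values)
print(exponential_decay(0, 1e-3, 0.5, 1000))     # 0.001
print(exponential_decay(2000, 1e-3, 0.5, 1000))  # 0.00025
```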



## Preview

### TensorBoard

### Inference Visualization

## Build Environment with Docker

### Build Docker Image

```bash
docker build -t ${ImageName}:${ImageTag} .
```

### Create a Container

- Example

```bash
docker run -d -it --gpus all --shm-size=${ShmSize} ${ImageName}:${ImageTag} /bin/bash
```



## Training Dataset: Pascal VOC Dataset ([Link](http://host.robots.ox.ac.uk/pascal/VOC/))

> Pascal VOC Dataset with [TFDS](https://www.tensorflow.org/datasets/overview)

### Number of Images

| | Train | Validation | Test |
|-----------------|-------|------------|---------------------------|
| Pascal VOC 2007 | 2501 | 2510 | 4952 (used as validation) |
| Pascal VOC 2012 | 5717 | 5823 | 10991 (no labels) |

- Training Set: VOC2007 trainval + VOC2012 trainval (Total: 16551)
- Validation Set: VOC2007 test (Total: 4952)
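The split totals follow directly from the table; a quick sanity check of the counts above:

```python
# Per-split image counts from the table above
voc2007 = {"train": 2501, "val": 2510, "test": 4952}
voc2012 = {"train": 5717, "val": 5823}

# Training set = VOC2007 trainval + VOC2012 trainval
train_total = voc2007["train"] + voc2007["val"] + voc2012["train"] + voc2012["val"]
print(train_total)  # 16551

# Validation set = VOC2007 test
print(voc2007["test"])  # 4952
```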



## Pretrained PB File

Trained with this repository's default configuration (total epochs: 105).

- **Download pb file: \<[Google Drive Link](https://drive.google.com/file/d/1s-3HXGmlRUhuZ5uiZzxQDGrfOA5Y-5_a/view?usp=sharing)\>**

The pb file is uploaded as a `tar.gz` archive, so decompress it as follows:

```bash
tar -zxvf yolo_voc_448x448.tar.gz
```

To run inference with this pb file, refer to [inference_tutorial.ipynb](./inference_tutorial.ipynb).


### Performance

#### Evaluation with VOC2007 test

| class name | AP |
|-------------|------------|
| dog | 0.7464 |
| pottedplant | 0.2250 |
| car | 0.5021 |
| person | 0.4482 |
| tvmonitor | 0.5213 |
| diningtable | 0.4564 |
| bicycle | 0.5927 |
| chair | 0.2041 |
| motorbike | 0.5595 |
| sofa | 0.4801 |
| bus | 0.6215 |
| boat | 0.3274 |
| horse | 0.7049 |
| aeroplane | 0.5872 |
| sheep | 0.4223 |
| bottle | 0.1312 |
| train | 0.7917 |
| cat | 0.8044 |
| bird | 0.4824 |
| cow | 0.4548 |
| **mAP** | **0.5032** |
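The reported mAP is simply the unweighted mean of the 20 per-class APs, which can be recomputed from the table:

```python
# Per-class AP values from the table above
aps = {
    "dog": 0.7464, "pottedplant": 0.2250, "car": 0.5021, "person": 0.4482,
    "tvmonitor": 0.5213, "diningtable": 0.4564, "bicycle": 0.5927,
    "chair": 0.2041, "motorbike": 0.5595, "sofa": 0.4801, "bus": 0.6215,
    "boat": 0.3274, "horse": 0.7049, "aeroplane": 0.5872, "sheep": 0.4223,
    "bottle": 0.1312, "train": 0.7917, "cat": 0.8044, "bird": 0.4824,
    "cow": 0.4548,
}

# mAP is the unweighted mean over all 20 Pascal VOC classes
mean_ap = sum(aps.values()) / len(aps)
print(round(mean_ap, 4))  # 0.5032
```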


#### Inference Speed

- GPU (GeForce RTX 3090): about 8 FPS
- CPU (AMD Ryzen 5 5600X, 6 cores): about 4 FPS



## Training Script

> Path: [./voc_scripts/train_voc.py](./voc_scripts/train_voc.py)

```bash
python train_voc.py
```

**Options**

Default option values are defined in [./configs/configs.py](./configs/configs.py); any options given on the command line override them.

- `--epochs`: Number of training epochs
- `--init_lr`: Initial learning rate
- `--lr_decay_rate`: Learning rate decay rate
- `--lr_decay_steps`: Learning rate decay steps
- `--batch_size`: Training batch size
- `--val_step`: Validation interval during training
- `--tb_img_max_outputs`: Number of prediction images visualized in TensorBoard
- `--train_ds_sample_ratio`: Training dataset sampling ratio
- `--val_ds_sample_ratio`: Validation dataset sampling ratio


## Evaluation Script

> Path: [./voc_scripts/eval_voc.py](./voc_scripts/eval_voc.py)

Evaluates the pretrained model on the VOC2007 test dataset.

```bash
python eval_voc.py
```

**Options**

- `--batch_size`: Evaluation batch size (default: `batch_size` in [./configs/configs.py](./configs/configs.py))
- `--pb_dir`: Path to the saved pb directory (default: `./ckpts/voc_ckpts/yolo_voc_448x448`)



## Citation

**You Only Look Once: Unified, Real-Time Object Detection** \<[arxiv link](https://arxiv.org/abs/1506.02640)\>

```
@misc{redmon2016look,
      title={You Only Look Once: Unified, Real-Time Object Detection},
      author={Joseph Redmon and Santosh Divvala and Ross Girshick and Ali Farhadi},
      year={2016},
      eprint={1506.02640},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```