https://github.com/ternaus/midv-500-models

Model for document segmentation trained on the MIDV-500 dataset.

computer-vision deep-learning document-scanner image-segmentation python pytorch semantic-segmentation

Host: GitHub
URL: https://github.com/ternaus/midv-500-models
Owner: ternaus
License: mit
Created: 2020-05-19T18:00:02.000Z (about 6 years ago)
Default Branch: master
Last Pushed: 2020-11-09T02:24:25.000Z (over 5 years ago)
Last Synced: 2024-12-28T14:26:46.577Z (over 1 year ago)
Topics: computer-vision, deep-learning, document-scanner, image-segmentation, python, pytorch, semantic-segmentation
Language: Python
Homepage:
Size: 42.9 MB
Stars: 73
Watchers: 4
Forks: 11
Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE

# midv-500-models
[![DOI](https://zenodo.org/badge/265323358.svg)](https://zenodo.org/badge/latestdoi/265323358)

The repository contains a model for binary semantic segmentation of documents.

![](https://habrastorage.org/webt/gy/-t/xn/gy-txnzezlnurcwwlv7q5vs77x4.jpeg)

* **Left**: input.
* **Center**: prediction.
* **Right**: overlay of the image and predicted mask.

## Installation

`pip install -U midv500models`

### Example inference

Jupyter notebook with an example: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1lNv88MJOKgc-50XeYcHlJODpvT2JF9ru?usp=sharing)

## Dataset
The model is trained on the dataset from [MIDV-500: A Dataset for Identity Documents Analysis and Recognition on Mobile Devices in Video Stream](https://arxiv.org/abs/1807.05786).

### Preparation

Download the dataset from the FTP server:
```bash
wget -r ftp://smartengines.com/midv-500/
```

Unpack the dataset:
```bash
cd smartengines.com/midv-500/dataset/
unzip \*.zip
```
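On systems without `unzip`, the same step can be sketched in Python with the standard library. This is a minimal sketch, not part of the repository: the function name `unpack_archives` is hypothetical, and it assumes all archives sit directly in the dataset folder, as the shell command above implies.

```python
import zipfile
from pathlib import Path
from typing import List


def unpack_archives(dataset_dir: Path) -> List[str]:
    """Extract every .zip found directly in dataset_dir into dataset_dir.

    Returns the names of the extracted archives in sorted order.
    (Hypothetical helper, mirrors `unzip \\*.zip` run inside dataset_dir.)
    """
    extracted = []
    for archive in sorted(dataset_dir.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            # Archive members keep their relative paths under dataset_dir.
            zf.extractall(dataset_dir)
        extracted.append(archive.name)
    return extracted
```

Calling `unpack_archives(Path("smartengines.com/midv-500/dataset"))` would then correspond to the two shell lines above.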

The resulting folder structure will be

```bash
smartengines.com
    midv-500
        dataset
            01_alb_id
                ground_truth
                    CA
                        CA01_01.json
                        ...
                images
                    CA
                        CA01_01.tif
                        ...
                    ...
                ...
            ...
        ...
```

To preprocess the data, run the script:
```bash
python midv500models/preprocess_data.py -i <input_folder> \
                                        -o <output_folder>
```

where `input_folder` is the folder with the unpacked dataset. The output folder will look like this:

```bash
images
    CA01_01.jpg
    ...
masks
    CA01_01.png
```
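The layout above pairs each image with a mask of the same file stem. A minimal sketch of collecting those pairs follows. The helper name `pair_images_and_masks` is hypothetical and this is not the repository's own data loader. It assumes only the `images/*.jpg` and `masks/*.png` naming shown in the listing.

```python
from pathlib import Path
from typing import List, Tuple


def pair_images_and_masks(output_folder: Path) -> List[Tuple[Path, Path]]:
    """Return (image, mask) path pairs matched by file stem.

    Raises ValueError if any image lacks a mask or vice versa.
    (Hypothetical helper, based only on the folder layout shown above.)
    """
    images = {p.stem: p for p in (output_folder / "images").glob("*.jpg")}
    masks = {p.stem: p for p in (output_folder / "masks").glob("*.png")}

    # Stems present on one side only indicate an incomplete preprocessing run.
    unmatched = sorted(set(images) ^ set(masks))
    if unmatched:
        raise ValueError(f"Files without a counterpart: {unmatched}")

    return [(images[stem], masks[stem]) for stem in sorted(images)]
```

Failing loudly on unmatched files is a design choice for this sketch: silently skipping them would hide a partially preprocessed dataset.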

The target binary masks take the values 0 and 255, where 0 is background and 255 is the document.
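The 0/255 convention can be illustrated with a small dependency-free sketch. The function names are hypothetical, and the 0.5 threshold is an assumption for illustration, not a value taken from the repository. Masks are plain nested lists here, whereas real code would more likely use NumPy arrays or tensors.

```python
from typing import List

BACKGROUND = 0   # mask value for background pixels
DOCUMENT = 255   # mask value for document pixels


def mask_to_labels(mask: List[List[int]]) -> List[List[int]]:
    """Map a {0, 255} mask to {0, 1} class labels, rejecting other values."""
    labels = []
    for row in mask:
        for value in row:
            if value not in (BACKGROUND, DOCUMENT):
                raise ValueError(f"Unexpected mask value: {value}")
        labels.append([1 if value == DOCUMENT else 0 for value in row])
    return labels


def probabilities_to_mask(
    probs: List[List[float]], threshold: float = 0.5
) -> List[List[int]]:
    """Turn per-pixel document probabilities into a {0, 255} mask.

    The default threshold is an assumption made for this sketch.
    """
    return [
        [DOCUMENT if p > threshold else BACKGROUND for p in row] for row in probs
    ]


def document_fraction(mask: List[List[int]]) -> float:
    """Fraction of pixels marked as document."""
    total = sum(len(row) for row in mask)
    return sum(value == DOCUMENT for row in mask for value in row) / total
```

For example, `mask_to_labels([[0, 255], [255, 255]])` yields `[[0, 1], [1, 1]]`, and `document_fraction` of the same mask is `0.75`.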

## Training

```bash
python midv500models/train.py -c midv500models/configs/2020-05-19.yaml \
                              -i <data_path>
```

## Inference

```bash
python midv500models/inference.py -c midv500models/configs/2020-05-19.yaml \
                                  -i <input_path> \
                                  -o <output_path> \
                                  -w <weights_path>
```

## Weights
Unet with Resnet34 backbone: [Config](midv500models/configs/2020-05-19.yaml) [Weights](Unet_Resnet34.pth)
