https://github.com/vndee/multi-mnist
MNIST dataset with multiple digits. This dataset can be used to train a recognizer for numbers with more than one digit.
- Host: GitHub
- URL: https://github.com/vndee/multi-mnist
- Owner: vndee
- License: mit
- Created: 2019-04-26T05:28:57.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-04-27T05:49:51.000Z (over 6 years ago)
- Last Synced: 2025-05-12T22:53:40.560Z (5 months ago)
- Topics: data-augmentation, deep-learning, mnist, seq2seq
- Language: Python
- Homepage:
- Size: 11.4 MB
- Stars: 4
- Watchers: 1
- Forks: 5
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## multi-mnist
MNIST dataset with multiple digits. We generated a dataset for the
multi-digit recognition task from MNIST (http://yann.lecun.com/exdb/mnist/index.html).
You can download the generated data from the release tab.
See the `examples` folder:
```
examples
- train
+ labels.csv
+ 1/
+ 2/
...
+ 8/
+ 9/
- test
+ labels.csv
+ 1/
+ 2/
...
+ 8/
+ 9/
```
Each folder `1`, `2`, `3`, `4`, ... contains generated
images whose number of digits equals the folder name. `labels.csv`
lists each image name followed by its ground-truth number.
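As a quick sanity check on a downloaded split, the folder layout described above can be walked with the standard library. This is a minimal sketch, not code from the repository: the helper name `count_images_per_length` is hypothetical, and it assumes images are stored as `.png` files directly inside the numbered folders. The temporary directory only stands in for a real `examples/train` folder.

```python
import tempfile
from pathlib import Path


def count_images_per_length(split_dir):
    """Return {number_of_digits: image_count} for one split directory."""
    counts = {}
    for sub in sorted(Path(split_dir).iterdir()):
        # Only subfolders with an integer name (1, 2, ..., 9) hold images;
        # labels.csv and anything else is skipped.
        if sub.is_dir() and sub.name.isdigit():
            counts[int(sub.name)] = sum(1 for _ in sub.glob("*.png"))
    return counts


# Build a tiny stand-in layout (assumption: mirrors examples/train).
with tempfile.TemporaryDirectory() as tmp:
    train = Path(tmp) / "train"
    for n_digits, n_images in [(1, 2), (2, 3)]:
        folder = train / str(n_digits)
        folder.mkdir(parents=True)
        for i in range(n_images):
            (folder / f"{i}.png").touch()
    (train / "labels.csv").touch()
    counts = count_images_per_length(train)

print(counts)  # {1: 2, 2: 3}
```

On a real download, pointing the function at `examples/train` or `examples/test` should report one entry per numbered folder.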
```bash
labels.csv
1.png,1
2.png,4
3.png,45
4.png,785,
5.png,1479,
...
```
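The listing above can be read with Python's `csv` module. The sketch below is an illustration rather than code from the repository: the helper name `parse_labels` is hypothetical, it assumes the file has no header row, and it tolerates the trailing commas seen on some lines. Labels are kept as strings so that the number of digits is preserved exactly as written.

```python
import csv
import io


def parse_labels(text):
    """Parse labels.csv content into (image_name, label) pairs."""
    pairs = []
    for row in csv.reader(io.StringIO(text)):
        # Drop empty fields, which appear when a line ends with a comma.
        fields = [field.strip() for field in row if field.strip()]
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        pairs.append((fields[0], fields[1]))
    return pairs


# Sample content copied from the listing above.
sample = "1.png,1\n2.png,4\n3.png,45\n4.png,785,\n5.png,1479,\n"
pairs = parse_labels(sample)
for name, label in pairs:
    print(name, label, len(label))
```

For a real file, pass the result of `open(path, newline="").read()` to `parse_labels`; the path depends on where the release archive was extracted.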
#### Create your own dataset
Clone this repository:
```bash
git clone https://github.com/vndee/multi-mnist
```
Requirements:
- python 3
- numpy
- idx2numpy
- tqdm
- opencv-python

Install requirements:
```bash
pip3 install -r requirements.txt
```
Change the following parameters in `main.py`:
- `output_dir`: Path to the output directory.
- `number_of_samples_per_class`: Number of samples to generate for each digit count.

Run `python3 main.py`, then check `output_dir` for the generated data.
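This README does not describe how `main.py` composes its images. One plausible approach, shown here only as a sketch, is to place single MNIST digits side by side and join their labels into one string. The function `compose_number` is hypothetical, and random arrays stand in for the real 28x28 MNIST digits so the example runs without downloading anything.

```python
import numpy as np


def compose_number(digit_images, digit_labels, indices):
    """Concatenate the selected digit images left to right.

    Returns the combined image and its label as a string of digits.
    """
    tiles = [digit_images[i] for i in indices]
    image = np.concatenate(tiles, axis=1)  # join along the width axis
    label = "".join(str(int(digit_labels[i])) for i in indices)
    return image, label


# Stand-in data (assumption): 100 random 28x28 "digits" with random labels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(100, 28, 28)).astype(np.uint8)
fake_labels = rng.integers(0, 10, size=100)

# Pick three digits to form one three-digit sample.
picked = rng.integers(0, 100, size=3)
image, label = compose_number(fake_images, fake_labels, picked)
print(image.shape, label)
```

Under this assumption, a sample with `k` digits is 28 pixels high and `28 * k` pixels wide. The actual script may pad, resize, or augment the digits differently.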