https://github.com/eora-ai/torchok

Production-oriented Computer Vision models training pipeline for common tasks: classification, segmentation, detection and representation🥤

computer-vision deep-learning image-classification image-retrieval image-segmentation representation-learning


TorchOk

**The toolkit for fast Deep Learning experiments in Computer Vision**

## A day-to-day Computer Vision Engineer backpack

[![Build Status](https://github.com/eora-ai/torchok/actions/workflows/flake8_checks.yaml/badge.svg?branch=main)](https://github.com/eora-ai/torchok/actions/workflows/flake8_checks.yaml)

TorchOk is based on [PyTorch](https://github.com/pytorch/pytorch) and utilizes [PyTorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning) for training pipeline routines.

The toolkit consists of:
- Neural Network models that have proven themselves not only on [PapersWithCode](https://paperswithcode.com/) but also in practice. All models share a plug-and-play interface that connects backbones, necks and heads so they can be reused across tasks
- Out-of-the-box support for common Computer Vision tasks: classification, segmentation and image representation, with detection coming soon
- Commonly used datasets, image augmentations and transformations (from [Albumentations](https://albumentations.ai/))
- Fast implementations of retrieval metrics (with the help of [FAISS](https://github.com/facebookresearch/faiss) and [ranx](https://github.com/AmenRa/ranx)) and lots of other metrics from [torchmetrics](https://torchmetrics.readthedocs.io/)
- Export of models to ONNX, with the ability to test the exported model without changing the datasets
- All components can be customized by inheriting the unified interfaces: Lightning's training loop, tasks, models, datasets, augmentations and transformations, metrics, loss functions, optimizers and LR schedulers
- Training, validation and testing configurations are represented by YAML config files and managed by [Hydra](https://hydra.cc/)
- Only straightforward training techniques are implemented. No bells and whistles
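
The plug-and-play idea from the first bullet can be illustrated with a small, framework-free sketch. Everything in it is hypothetical: the `REGISTRY` dictionary, the `register` decorator, `build_model` and the `toy_*` component names are made up for illustration and are not TorchOk's actual API. In the real toolkit the components would be PyTorch modules rather than plain functions.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps a component name to a factory function.
REGISTRY: Dict[str, Callable[..., Callable[[List[float]], List[float]]]] = {}


def register(name: str):
    """Decorator that stores a component factory under a string name."""
    def wrapper(factory):
        REGISTRY[name] = factory
        return factory
    return wrapper


@register("toy_backbone")
def toy_backbone(scale: float = 2.0):
    # Stands in for a feature extractor: scales every input value.
    return lambda x: [v * scale for v in x]


@register("toy_neck")
def toy_neck():
    # Stands in for pooling: reduces the features to their mean.
    return lambda x: [sum(x) / len(x)]


@register("toy_head")
def toy_head(num_classes: int = 3):
    # Stands in for a classifier: repeats the pooled value per class.
    return lambda x: [x[0]] * num_classes


def build_model(config: dict):
    """Chain backbone -> neck -> head from a config of names and params."""
    stages = []
    for part in ("backbone", "neck", "head"):
        spec = config[part]
        stages.append(REGISTRY[spec["name"]](**spec.get("params", {})))

    def forward(x):
        for stage in stages:
            x = stage(x)
        return x

    return forward


config = {
    "backbone": {"name": "toy_backbone", "params": {"scale": 2.0}},
    "neck": {"name": "toy_neck"},
    "head": {"name": "toy_head", "params": {"num_classes": 3}},
}
model = build_model(config)
print(model([1.0, 2.0, 3.0]))
```

Swapping a component then only requires changing a name in the config, which is the property the unified interfaces are meant to provide.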

## Installation
### pip
Installation via pip can be done in two steps:
1. Install PyTorch that meets your hardware requirements via [official instructions](https://pytorch.org/get-started/locally/)
2. Install TorchOk by running `pip install --upgrade torchok`
### Conda
To remove a previously installed TorchOk environment, run:
```bash
conda remove --name torchok --all
```
To install TorchOk locally, run:
```bash
conda env create -f environment.yml
```
This will create a new conda environment **torchok** with all dependencies.
### Docker
Another way to install TorchOk is through Docker. The built image supports SSH access and exposes ports for Jupyter Lab and TensorBoard. If you don't need any of these, just omit the corresponding arguments. Build the image and run the container:
```bash
docker build -t torchok --build-arg SSH_PUBLIC_KEY="" .
docker run -d --name _torchok --gpus=all -v :/workdir -p :22 -p :8888 -p :6006 torchok
```

## Getting started
The folder `examples/configs` contains YAML config files with some predefined training and inference configurations.
### Train
For a training example, we can use the default configuration `examples/configs/classification_cifar10.yml`, where the CIFAR-10 dataset and the classification task are specified. The CIFAR-10 dataset will be automatically downloaded into your `~/.cache/torchok/data/cifar10` folder (341 MB).

**To train on all available GPU devices (default config):**
```bash
python -m torchok -cp ../examples/configs -cn classification_cifar10
```
**To train on all available CPU cores:**
```bash
python -m torchok -cp ../examples/configs -cn classification_cifar10 trainer.accelerator='cpu'
```
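
The `trainer.accelerator='cpu'` argument above is a command-line override: a dotted key path followed by a new value. The sketch below imitates that behaviour on a plain nested dictionary to show what the override does to the configuration. It is not Hydra's implementation; the `apply_override` helper is hypothetical, it treats every value as a string, and the sample config keys other than `trainer.accelerator` are assumptions.

```python
import copy


def apply_override(config: dict, override: str) -> dict:
    """Return a copy of config with one 'a.b.c=value' override applied."""
    key_path, _, raw_value = override.partition("=")
    value = raw_value.strip("'\"")  # drop optional shell-style quotes
    result = copy.deepcopy(config)  # leave the original config untouched
    node = result
    *parents, leaf = key_path.split(".")
    for key in parents:
        node = node.setdefault(key, {})  # create missing levels on the way
    node[leaf] = value
    return result


# Hypothetical excerpt of a training config; only trainer.accelerator
# is taken from the command above.
base = {"trainer": {"accelerator": "gpu", "max_epochs": 30}}
cpu = apply_override(base, "trainer.accelerator='cpu'")
print(cpu["trainer"])
```

Only the addressed key changes; the rest of the config file stays as written, which is why a single YAML file can serve both GPU and CPU runs.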
During training, you can view the training and validation logs by starting a local TensorBoard:
```bash
tensorboard --logdir ~/.cache/torchok/logs/cifar10
```
### Find learning rate
To automatically find the initial learning rate, we use the PyTorch Lightning tuner, whose algorithm is based on the article [Cyclical Learning Rates for Training Neural Networks](https://arxiv.org/abs/1506.01186).
```bash
python -m torchok -cp ../examples/configs -cn classification_cifar10 +mode=find_lr
```
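
For intuition, the learning-rate range test described in the cited article can be sketched on a toy problem: train for a few steps while growing the learning rate exponentially, then pick the rate at which the loss fell fastest. This is a simplified illustration and not the PyTorch Lightning code; the function names, the quadratic loss and the divergence threshold are all assumptions.

```python
import math


def lr_range_test(loss_fn, grad_fn, w0, min_lr=1e-5, max_lr=10.0, steps=100):
    """Run gradient descent while growing the learning rate each step."""
    w = w0
    lrs, losses = [], []
    for i in range(steps):
        # Exponential schedule from min_lr (first step) to max_lr (last step).
        lr = min_lr * (max_lr / min_lr) ** (i / (steps - 1))
        loss = loss_fn(w)
        # Stop once the loss diverges (assumed threshold: 4x the initial loss).
        if not math.isfinite(loss) or (losses and loss > 4 * losses[0]):
            break
        lrs.append(lr)
        losses.append(loss)
        w = w - lr * grad_fn(w)
    return lrs, losses


def suggest_lr(lrs, losses):
    """Return the learning rate whose step produced the largest loss drop."""
    drops = [losses[i + 1] - losses[i] for i in range(len(losses) - 1)]
    steepest = min(range(len(drops)), key=drops.__getitem__)
    return lrs[steepest]


# Toy problem: minimise w^2 starting from w = 5.
lrs, losses = lr_range_test(lambda w: w * w, lambda w: 2 * w, w0=5.0)
best = suggest_lr(lrs, losses)
print(f"suggested learning rate: {best:.4f}")
```

On this toy loss any rate above 1.0 makes the loss grow, so the suggestion always lands below that point. A real run replaces the quadratic with mini-batch losses from the model.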

### Export to ONNX
TODO
### Run ONNX model
To run an ONNX model, we can use the `examples/configs/onnx_infer.yaml` config.
First, set the `path_to_onnx` field in that config to the location of the exported model.

**To test the ONNX model:**
```bash
python test.py -cp examples/configs -cn onnx_infer +mode=test
```

**To run prediction with the ONNX model:**
```bash
python test.py -cp examples/configs -cn onnx_infer +mode=predict
```

## Run tests
```bash
python -m unittest discover -s tests/ -p "test_*.py"
```
## To be added soon (TODO)
### Tasks
* MOBY (unsupervised training)
* InstanceSegmentationTask

### Detection models
* YOLOR neck + head
* DETR neck + head

### Datasets
* ImageNet
* Cityscapes

### Losses
* PyTorch Metric Learning losses
* NT-Xent (for unsupervised training)