https://github.com/graykode/vision-tutorial

Computer Vision Tutorial for Deep Learning Researchers
https://github.com/graykode/vision-tutorial

Last synced: 11 months ago
JSON representation

Computer Vision Tutorial for Deep Learning Researchers

Host: GitHub
URL: https://github.com/graykode/vision-tutorial
Owner: graykode
Created: 2019-02-21T03:18:43.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2019-08-14T03:31:04.000Z (almost 7 years ago)
Last Synced: 2025-04-30T20:03:36.888Z (about 1 year ago)
Language: Python
Size: 966 KB
Stars: 33
Watchers: 4
Forks: 9
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

          ## vision-tutorial



`vision-tutorial` is a tutorial for who is studying `Computer Vision Basic Architectures` using **Pytorch** and **Keras**. Most of the models about Vision were implemented with less than **100 lines** of code(except comments or blank lines). The list of these papers is a list that Professor [Sung Kim](https://github.com/hunkim) recommended.

- Data was used as overfitting to show simple model learning. [One image about Cat or Dog](https://github.com/graykode/vision-tutorial/tree/master/data)

- The accuracy of the model is not important in this project because it is affected by data. I recommend that you **focus on the structure of the model, the number of parameters, the learning process and paper detailed implementation. **

  

## SOTA Basic Vision Models - Introduction

- How to handle image in Pytorch and Keras

  - Image Resizing, Cropping

- Introduction CNN(Convolutional Neural Networks) in Pytorch and Keras

  - How does number of channels, filter size (=kernel), grid, and padding affect Convolution?

  - Paper : [Object Recognition with Gradient-Based Learning](http://yann.lecun.com/exdb/publis/pdf/lecun-99.pdf)

- AlexNet(2012.09)

  - Paper : [ImageNet Classification with Deep Convolutional Neural Networks](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)

  - Model

    ![](2.AlexNet/model.jpg)

- ZFNet(2013.11)

  - Paper : [Visualizing and Understanding Convolutional Networks](https://arxiv.org/abs/1311.2901)

- VGG16(2014.09)

  - Paper : [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)

- Inception.v1(a.k.a GoogLeNet)(2014.09)

  - Paper : [Going Deeper with Convolutions](https://arxiv.org/abs/1409.4842)

- Inception.v2, v3(2015.12)

  - Paper : [Rethinking the Inception Architecture for Computer Vision](https://arxiv.org/abs/1512.00567)

- ResNet(2015.12)

  - Paper : [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)

  - Model

    ![](7.ResNet/model.jpeg)

- Inception.v4(2016.02)

  - Paper : [Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning](https://arxiv.org/abs/1602.07261)

- DenseNet(2016.08)

  - Paper : [Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993)

  - Model

    ![](9.DenseNet/model.jpg)

- Xception(2016.10)

  - Paper : [Xception: Deep Learning with Depthwise Separable Convolutions](https://arxiv.org/abs/1610.02357)

- MobileNet(2017.04)

  - Paper : [MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications](https://arxiv.org/abs/1704.04861)

- SENet(2017.09)

  - Paper : [Squeeze-and-Excitation Networks](https://arxiv.org/abs/1709.01507)

## To be Continue Implementation in Other Repository

#### v Semantic Segmentation

- FCN(2014.11) : [Fully Convolutional Networks for Semantic Segmentation](https://arxiv.org/abs/1411.4038)

- U-Net(2015.05) : [U-Net: Convolutional Networks for Biomedical Image Segmentation](https://arxiv.org/abs/1505.04597)](https://arxiv.org/abs/1606.00915)

- SegNet(2015.11) : [SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation](https://arxiv.org/abs/1511.00561)

- DeepLab(2016.06) : [DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs](https://arxiv.org/abs/1606.00915)

- ENet(2016.07) : [ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation](https://arxiv.org/abs/1606.02147)

- PSPNet(2016.12) : [Pyramid Scene Parsing Network](https://arxiv.org/abs/1612.01105)

- ICNet(2017.04) : [ICNet for Real-Time Semantic Segmentation on High-Resolution Images](https://arxiv.org/abs/1704.08545)

#### v Generative adversarial networks

- GAN(2014.06) : [Generative Adversarial Networks](https://arxiv.org/abs/1406.2661)

- DCGAN(2015.11) : [Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://arxiv.org/abs/1511.06434)

- Pix2Pix(2016.11) : [Image-to-Image Translation with Conditional Adversarial Networks](https://arxiv.org/abs/1611.07004)

- WGAN(2017.01) : [Wasserstein GAN](https://arxiv.org/abs/1701.07875)

- CycleGAN(2017.05) : [Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks](https://arxiv.org/abs/1703.10593)

#### v Object Detection

- RCNN(2013.11) : [Rich feature hierarchies for accurate object detection and semantic segmentation](https://arxiv.org/abs/1311.2524)

- Fast-RCNN(2015.04) : [Fast R-CNN](https://arxiv.org/abs/1504.08083)

- Faster-RCNN(2015.06) : [Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks](https://arxiv.org/abs/1506.01497)

- YOLO(2015.06) : [You Only Look Once: Unified, Real-Time Object Detection](https://arxiv.org/abs/1506.02640)

- SSD(2015.12) : [SSD: Single Shot MultiBox Detector](https://arxiv.org/abs/1512.02325)

- YOLO9000(2016.12) : [YOLO9000: Better, Faster, Stronger](https://arxiv.org/abs/1612.08242)

- Mask R-CNN(2017.05) : [Mask R-CNN](https://arxiv.org/abs/1703.06870)

- RetinaNet(2017.08):  [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002)

## Author

- Tae Hwan Jung(Jeff Jung) @graykode

- Author Email : [nlkey2022@gmail.com](mailto:nlkey2022@gmail.com)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/graykode/vision-tutorial

Awesome Lists containing this project

README