https://github.com/mk2112/mobileyolov3
YOLOv3 on a MobileNetV3_Small architecture; trained, explained, pruned and quantized for text detection.
- Host: GitHub
- URL: https://github.com/mk2112/mobileyolov3
- Owner: MK2112
- License: MIT
- Created: 2024-09-24T16:12:11.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-02-18T19:41:15.000Z (3 months ago)
- Last Synced: 2025-02-18T20:36:33.220Z (3 months ago)
- Topics: icdar2015, mobilenetv2, pruning, pytorch, quantization, yolov3
- Language: Jupyter Notebook
- Homepage:
- Size: 26.3 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# mobileYOLOv3
YOLOv3 with a MobileNetV3 backbone for text detection; pruned, quantized, optimized, and explained for deployment on mobile devices. Primarily intended as a single resource for learning about YOLOv3 in an applied manner.
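
As a rough orientation, the core construction can be sketched in a few lines of PyTorch: a pretrained MobileNetV3-Small from torchvision serves as the feature extractor, and a YOLOv3-style convolutional head predicts box offsets, objectness, and a class score per anchor. This is a minimal single-scale sketch, not the repository's actual code; `TinyYOLOHead` and its channel sizes are illustrative assumptions (full YOLOv3 predicts at three scales).

```python
# Minimal single-scale sketch (illustrative, not the repository's code):
# a pretrained MobileNetV3-Small backbone feeding a YOLOv3-style head.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v3_small

class TinyYOLOHead(nn.Module):
    """Predicts (tx, ty, tw, th, objectness, classes) per anchor per grid cell."""
    def __init__(self, in_channels: int, num_anchors: int = 3, num_classes: int = 1):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_channels, 256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256),
            nn.Hardswish(),
            nn.Conv2d(256, num_anchors * (5 + num_classes), kernel_size=1),
        )

    def forward(self, x):
        return self.head(x)

backbone = mobilenet_v3_small(weights="DEFAULT").features  # feature extractor only
head = TinyYOLOHead(in_channels=576)  # MobileNetV3-Small's last feature map has 576 channels

x = torch.randn(1, 3, 416, 416)   # YOLOv3's canonical input resolution
features = backbone(x)            # -> (1, 576, 13, 13), stride 32
preds = head(features)            # -> (1, 3 * (5 + 1), 13, 13); one class: "text"
print(preds.shape)
```
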
## Roadmap
- [x] Pretrained MobileNetV2 backbone
- [x] Introduce the YOLOv3 paradigm
- [x] Basic pruning and quantization integration
- [x] Training pipeline (for ICDAR 2015)
- [x] Switch backbone to MobileNetV3
- [x] Mixed precision training
- [x] Pruning and quantization (sketched after this list)
- [x] Add textbook-style explanations for YOLOv3
- [ ] Optimize; expand applicability to other datasets
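
As a rough illustration of the two compression steps, the sketch below applies PyTorch's built-in magnitude pruning and eager-mode post-training static quantization to a small stand-in network. The repository's actual pipeline operates on the full detector and may differ in pruning amounts, qconfig, and calibration data; `StandInNet` and all hyperparameters here are placeholders.

```python
# Illustrative compression sketch on a stand-in model (assumptions:
# pruning amount, qconfig, and calibration data are placeholders).
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torch.ao.quantization import QuantStub, DeQuantStub, get_default_qconfig, prepare, convert

class StandInNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()      # marks the float -> int8 boundary
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.dequant = DeQuantStub()  # marks the int8 -> float boundary

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv1(x))
        x = self.conv2(x)
        return self.dequant(x)

model = StandInNet().eval()

# 1) Unstructured L1 magnitude pruning: zero the 30% smallest weights
#    of every conv layer, then bake the mask into the weights.
for m in model.modules():
    if isinstance(m, nn.Conv2d):
        prune.l1_unstructured(m, name="weight", amount=0.3)
        prune.remove(m, "weight")

# 2) Post-training static quantization: attach observers, calibrate on
#    sample inputs, then convert weights and activations to int8.
model.qconfig = get_default_qconfig("fbgemm")  # x86 backend; "qnnpack" targets ARM
prepared = prepare(model)
with torch.no_grad():
    for _ in range(8):  # calibration pass (real data in practice)
        prepared(torch.randn(1, 3, 416, 416))
quantized = convert(prepared)
```

For on-device deployment, the `qnnpack` backend is the one relevant to ARM phones, which matches the repository's mobile focus.

## References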
- YOLOv3: An incremental improvement [\[Redmon, J. & Farhadi, A., 2018\]](https://arxiv.org/abs/1804.02767)
- ICDAR 2015 Dataset [Kaggle.com](https://www.kaggle.com/datasets/bestofbests9/icdar2015)
- Mobile data science and intelligent apps: concepts, AI-based modeling and research directions [\[Sarker, et al. 2021\]](https://link.springer.com/content/pdf/10.1007/s11036-020-01650-z.pdf)
- Faster R-CNN: Towards real-time object detection with region proposal networks [\[Ren, et al. 2016\]](https://arxiv.org/abs/1506.01497)
- Histograms of oriented gradients for human detection [\[Dalal, N., & Triggs, B. 2005\]](http://vision.stanford.edu/teaching/cs231b_spring1213/papers/CVPR05_DalalTriggs.pdf)
- Distance-IoU loss: Faster and better learning for bounding box regression [\[Zheng, et al. 2020\]](https://arxiv.org/abs/1911.08287)
- Focal loss for dense object detection [\[Lin, et al. 2017\]](https://arxiv.org/abs/1708.02002)
- Does label smoothing mitigate label noise? [\[Lukasik, et al. 2020\]](https://arxiv.org/abs/2003.02819)
- Searching for activation functions [\[Ramachandran, et al. 2017\]](https://arxiv.org/abs/1710.05941)
- Searching for MobileNetV3 [\[Howard, et al. 2019\]](https://arxiv.org/abs/1905.02244)
- ECA-Net: Efficient channel attention for deep convolutional neural networks [\[Wang, et al. 2020\]](https://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_ECA-Net_Efficient_Channel_Attention_for_Deep_Convolutional_Neural_Networks_CVPR_2020_paper.pdf)
- Xception: Deep learning with depthwise separable convolutions [\[Chollet, F. 2017\]](https://arxiv.org/abs/1610.02357)
- Dropout: a simple way to prevent neural networks from overfitting [\[Srivastava, et al. 2014\]](https://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf)
- mixup: Beyond empirical risk minimization [\[Zhang, et al. 2017\]](https://openreview.net/forum?id=r1Ddp1-Rb)
- Super-convergence: Very fast training of neural networks using large learning rates [\[Smith, L. N., & Topin, N. 2019\]](https://arxiv.org/abs/1708.07120)
- Lookahead optimizer: k steps forward, 1 step back [\[Zhang, et al. 2019\]](https://arxiv.org/abs/1907.08610)
- Methods for pruning deep neural networks [\[Vadera, S., & Ameen, S. 2022\]](https://arxiv.org/abs/2011.00241)
- Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding [\[Han, et al. 2015\]](https://arxiv.org/abs/1510.00149)