{"id":13443883,"url":"https://github.com/nightsnack/YOLObile","last_synced_at":"2025-03-20T17:32:09.163Z","repository":{"id":37545440,"uuid":"285068545","full_name":"nightsnack/YOLObile","owner":"nightsnack","description":"This is the implementation of YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design","archived":false,"fork":false,"pushed_at":"2024-07-25T11:00:09.000Z","size":7860,"stargazers_count":361,"open_issues_count":12,"forks_count":96,"subscribers_count":12,"default_branch":"master","last_synced_at":"2024-10-28T07:41:21.237Z","etag":null,"topics":["deep-learning","object-detection","yolov4"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nightsnack.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-08-04T18:44:10.000Z","updated_at":"2024-10-03T14:47:40.000Z","dependencies_parsed_at":"2024-10-28T05:53:53.839Z","dependency_job_id":"87dfe4a2-5d08-433b-a7b3-dc141dc3aca9","html_url":"https://github.com/nightsnack/YOLObile","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nightsnack%2FYOLObile","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nightsnack%2FYOLObile/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nightsnack%2FYOLObile/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nightsnack%2FYOLObile/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nightsnack","download_url":"https://codeload.github.com/nightsnack/YOLObile/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244660626,"owners_count":20489367,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","object-detection","yolov4"],"created_at":"2024-07-31T03:02:12.844Z","updated_at":"2025-03-20T17:32:04.143Z","avatar_url":"https://github.com/nightsnack.png","language":"Python","funding_links":[],"categories":["Python","Lighter and Deployment Frameworks"],"sub_categories":[],"readme":"\n\n# YOLObile \nThis is the implementation of [YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design](https://arxiv.org/abs/2009.05697) using [ultralytics/yolov3](https://github.com/ultralytics/yolov3). Thanks to the original author.\n\narXiv: [https://arxiv.org/abs/2009.05697](https://arxiv.org/abs/2009.05697)\nIn Proceedings of AAAI 2021\n\n**For those who may be interested in the compiler code (How to deploy it onto Android?):** The compiler source code is associated with our collaborator at William \u0026 Mary, and involves jointly held IP. **We have no plans to open source this part now.** Sorry for the inconvenience.\n\n**For iOS developers:** We only use the Android platform to build and test the compiler because it is highly open source. 
We also believe the same techniques can be applied on the Apple iOS platform, but we haven't tested this yet.\n\n![Image of YOLObile](figure/yolo_demo.jpg)\n\n\n\n## Introduction\nThe rapid development and wide utilization of object detection techniques have drawn attention to both the accuracy and speed of object detectors. However, the current state-of-the-art object detection works are either accuracy-oriented, using a large model but leading to high latency,\nor speed-oriented, using a lightweight model but sacrificing accuracy. In this work, we propose the YOLObile framework for real-time object detection on mobile devices via compression-compilation co-design. A novel block-punched pruning scheme is proposed for any kernel size. To improve computational efficiency on mobile devices, a GPU-CPU collaborative scheme is adopted along with advanced compiler-assisted optimizations. Experimental results indicate that our pruning scheme achieves a 14x compression rate of YOLOv4 with 49.0 mAP. \nUnder our YOLObile framework, we achieve 17 FPS inference speed using GPU on Samsung Galaxy S20. \nBy incorporating our proposed GPU-CPU collaborative scheme, the inference speed is increased to 19.1 FPS, a 5x speedup over the original YOLOv4.\n\n## Environments\nPython 3.7 or later with all packages from `pip install -U -r requirements.txt`, including `torch == 1.4`. Docker images come with all dependencies preinstalled. 
Docker requirements are: \n- Nvidia Driver \u003e= 440.44\n- Docker Engine - CE \u003e= 19.03\n\n### Download COCO Dataset: (18 GB)\n```shell script\ncd ../ \u0026\u0026 sh YOLObile/data/get_coco2014.sh\n```\nThe default path for the COCO data folder is outside the project root folder.\n\n``` shell script\n/Project\n/Project/YOLObile (Project root)\n/Project/coco (coco data)\n```\n\n### Download Model Checkpoints:\nGoogle Drive: [Google Drive Download](https://drive.google.com/drive/folders/10FRZo9WC1vZA1w6xxysgjptQmbqk3Sz3?usp=sharing)\n\nBaidu Netdisk: [Baidu Netdisk Download](https://pan.baidu.com/s/1FMTOQF6ebH6OJWEAq9F0KQ) code: r3nk\n\nAfter downloading, please put the weight files under the ./weights folder\n\n\n## Docker build instructions\n\n### 1. Install Docker and Nvidia-Docker\n\nDocker images come with all dependencies preinstalled. However, Docker itself requires installation, and it relies on the Nvidia driver installation in order to interact properly with local GPU resources. The requirements are: \n- Nvidia Driver \u003e= 440.44 https://www.nvidia.com/Download/index.aspx\n- Nvidia-Docker https://github.com/NVIDIA/nvidia-docker\n- Docker Engine - CE \u003e= 19.03 https://docs.docker.com/install/\n\n### 2. Build the project\n```bash\n# Build the image (Docker image names must be lowercase)\nt=yolobile \u0026\u0026 sudo docker build -t $t .\n```\n\n### 3. Run Container\n```bash\n# Run with local directory access\nt=yolobile \u0026\u0026 sudo docker run -it --gpus all --ipc=host -v \"$(pwd)\"your/cocodata/path:/usr/src/coco $t bash\n```\n\n### 4. 
Run Commands\nOnce the container is launched and you are inside it, you will have a terminal window in which you can run all regular bash commands, such as:\n- `ls .`\n- `ls ../coco`\n- `python train.py`\n- `python test.py`\n- `python detect.py`\n\n## Configurations:\n\n**Train Options and Model Config:** \n```\n./cfg/csdarknet53s-panet-spp.cfg (model configuration)\n./cfg/darknet_admm.yaml (pruning configuration)\n./cfg/darknet_retrain.yaml (retrain configuration)\n```\n\n**Weights:** \n```\n./weights/yolov4dense.pt (dense model)\n./weights/best8x-514.pt (pruned model)\n```\n\n**Prune Config**\n```\n./prune_config/config_csdarknet53pan_v*.yaml\n```\n\n## Training\n\nThe training process includes two steps:\n\n**Pruning:** `python train.py --img-size 320 --batch-size 64 --device 0,1,2,3 --epoch 25 --admm-file darknet_admm --cfg cfg/csdarknet53s-panet-spp.cfg --weights weights/yolov4dense.pt --data data/coco2014.data` \n\nThe pruning process does **NOT** support resume.\n\n**Masked Retrain:** `python train.py --img-size 320 --batch-size 64 --device 0,1,2,3 --epoch 280 --admm-file darknet_retrain --cfg cfg/csdarknet53s-panet-spp.cfg --weights weights/yolov4dense.pt --data data/coco2014.data --multi-scale`.\n\nThe masked retrain process support resume.\n\nYou can run the total process via `sh ./runprune.sh`\n\n\n\n## Check model Weight Parameters \u0026 Flops:\n```shell script\npython check_compression.py\n```\n## Test model MAP:\n\n```shell script\npython test.py --img-size 320 --batch-size 64 --device 0 --cfg cfg/csdarknet53s-panet-spp.cfg --weights weights/best8x-514.pt --data data/coco2014.data\n\n```\n```\n               Class    Images   Targets         P         R   mAP@0.5        F1: 100%|| 79/79 [00:\n                 all     5e+03  3.51e+04     0.501     0.544     0.508     0.512\n              person     5e+03  1.05e+04     0.643     0.697     0.698     0.669\n             bicycle     5e+03       313     0.464     0.409     0.388     0.435\n                 
car     5e+03  1.64e+03     0.492     0.547     0.503     0.518\n          motorcycle     5e+03       388     0.602     0.635     0.623     0.618\n            airplane     5e+03       131     0.676     0.786     0.804     0.727\n                 bus     5e+03       259      0.67     0.788     0.792     0.724\n               train     5e+03       212     0.731     0.797     0.805     0.763\n               truck     5e+03       352     0.414     0.526     0.475     0.463\n          toothbrush     5e+03        77      0.35     0.301     0.269     0.323\nSpeed: 3.6/1.4/5.0 ms inference/NMS/total per 320x320 image at batch-size 64\n\nCOCO mAP with pycocotools...\nloading annotations into memory...\nDone (t=3.87s)\ncreating index...\nindex created!\nLoading and preparing results...\nDONE (t=3.74s)\ncreating index...\nindex created!\nRunning per image evaluation...\nEvaluate annotation type *bbox*\nDONE (t=83.06s).\nAccumulating evaluation results...\nDONE (t=9.39s).\n Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.334\n Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.514\n Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.350\n Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.117\n Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.374\n Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.519\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.295\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.466\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.504\n Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.240\n Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.583\n Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.727\n```\n\n## FPS vs mAP on COCO dataset\n![Image of 
YOLObile](figure/yolobilemap.png)\n\n\n## Known Issues\n- The accuracy printed during the retraining process is not reliable. Please run test.py separately to check the accuracy. I raised this issue against an older version of the Ultralytics/YOLOv3 repository, and I am not sure whether it has been fixed since. \n\n\n- When you use multi-card training (4 cards or more), the training process may stop after a few hours without printing any errors.\nI suggest using Docker instead if you use 4 cards or more. The Docker build instructions can be found above.\n\n- PyTorch 1.5+ might have multi-card issues.\n\n## Acknowledgements\n[https://github.com/ultralytics/yolov3](https://github.com/ultralytics/yolov3)\n\n[https://github.com/AlexeyAB/darknet](https://github.com/AlexeyAB/darknet)\n\n## Contact Me\nGithub: [https://github.com/nightsnack](https://github.com/nightsnack) \n\nEmail : nightsnackc@gmail.com\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnightsnack%2FYOLObile","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnightsnack%2FYOLObile","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnightsnack%2FYOLObile/lists"}