{"id":13419274,"url":"https://github.com/doonny/PipeCNN","last_synced_at":"2025-03-15T05:30:43.048Z","repository":{"id":39310771,"uuid":"73077215","full_name":"doonny/PipeCNN","owner":"doonny","description":"An OpenCL-based FPGA Accelerator for Convolutional Neural Networks","archived":false,"fork":false,"pushed_at":"2022-02-14T23:59:36.000Z","size":3876,"stargazers_count":1252,"open_issues_count":42,"forks_count":369,"subscribers_count":72,"default_branch":"master","last_synced_at":"2024-11-08T09:48:00.458Z","etag":null,"topics":["altera-opencl-sdk","deep-learning","deep-neural-networks","fpga","fpga-accelerator","hardware","hls","opencl"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/doonny.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-11-07T12:41:32.000Z","updated_at":"2024-11-07T12:57:37.000Z","dependencies_parsed_at":"2022-09-12T12:31:16.004Z","dependency_job_id":null,"html_url":"https://github.com/doonny/PipeCNN","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/doonny%2FPipeCNN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/doonny%2FPipeCNN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/doonny%2FPipeCNN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/doonny%2FPipeCNN/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/doonny","download_url":"https://codeload.github.com/doonny/PipeCNN/tar.gz/refs/heads/mas
ter","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243690113,"owners_count":20331726,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["altera-opencl-sdk","deep-learning","deep-neural-networks","fpga","fpga-accelerator","hardware","hls","opencl"],"created_at":"2024-07-30T22:01:13.692Z","updated_at":"2025-03-15T05:30:42.313Z","avatar_url":"https://github.com/doonny.png","language":"C","readme":"# PipeCNN\n\n## About \n**PipeCNN** is an OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks (CNNs).\nThere is a growing trend among the FPGA community to utilize High Level Synthesis (HLS) tools to design\nand implement customized circuits on FPGAs. Compared with the RTL-based design methodology, HLS tools provide a faster hardware development\ncycle by automatically synthesizing an algorithm written in a high-level language (e.g. C/C++) to RTL/hardware. [OpenCL™](https://www.khronos.org/opencl/) is an open, emerging cross-platform parallel programming language that can be used in both GPU and FPGA development. The main goal of this project is to provide a generic, yet efficient OpenCL-based design of a CNN accelerator on FPGAs. PipeCNN utilizes ***Pipe**lined **CNN*** functional kernels to achieve improved throughput in inference computation. Our design is scalable in both performance and hardware resources, and thus can be deployed on a variety of FPGA platforms. 
PipeCNN supports both the Intel OpenCL SDK and the Xilinx Vitis based FPGA design flows.\n\n## How to Use\n\nFirst, download the pre-trained CNN models, input test vectors and golden reference files from PipeCNN's own ModelZoo (instructions are located in the \"data\" folder inside each project folder). Place the data in the correct folder. Then, compile the project by using the Makefile provided. After finishing the compilation, simply type the following command to run PipeCNN:\n```\n./run.exe conv.aocx\n```\nThe ModelZoo now provides pre-quantized models for the following networks:\n* VGG-16\n* ResNet-50\n\nFor more detailed instructions, please check out the [User Instructions](https://github.com/doonny/PipeCNN/tree/master/documents).\n\n## Supported Tools\nCurrently, we are using the [Intel's OpenCL SDK](https://www.intel.com/content/www/us/en/software/programmable/sdk-for-opencl/overview.html) and [Xilinx Vitis](https://china.xilinx.com/products/design-tools/vitis/vitis-platform.html) toolkits to compile the OpenCL/HLS code and implement the generated RTL on FPGAs. \n\n* Intel OpenCL SDK Pro v20.1\n* Xilinx Vitis 2020.1\n\n## Tested Boards\nThe following boards have been tested and are working:\n* Terasic's DE5a-net-ddr4 (Arria-10 GX1150 FPGA)\n* Intel's Arria-10 Dev Kit (Arria-10 GX1150 FPGA)\n* Xilinx's U50 Acceleration Card (VU35P FPGA)\n* Xilinx's ZCU102 Dev Board (ZU9EG FPGA)\n* Xilinx's ZC706 Dev Board (Zynq-7045 FPGA)\n\nPipeCNN may also run on other FPGA boards, including Terasic's DE10-standard/DE10-nano, Intel's PAC cards, and Xilinx's Ultra96-v2 boards. However, due to limited time and resources, we have not verified this yet. Please let us know if you would like to share your results on other FPGA boards.\n\n## Demos\nNow you can run classification on the ImageNet dataset by using PipeCNN, and measure the top-1/5 accuracy for different CNN models.\n\nTo run this demo, first, set **USE_OPENCV = 1** in the Makefile. 
Second, download the ImageNet validation dataset, then extract and place all the pictures in the \"/data\" folder. Set the variable \"picture_file_path_head\" in the host file to the correct image data set path. Finally, recompile the host program and run PipeCNN.\n\nThe following picture shows the demo running on our own computer with the DE5-net board.\n\n![DE5-net-Demo](documents/Demo-DE5-net.gif)\n\n## Performances\nIt's been four years since the release of this project. Deep Learning Architecture (DLA) is constantly evolving, and lots of new techniques have been invented to improve the efficiency of DLA. The performance of PipeCNN is no longer comparable to the state-of-the-art designs. Therefore, the current goal of this project is to provide a complete design that can be used to learn DLA and try out new ideas. \n\nThe following table lists the performance and cost information on some of the boards we used as a reference. For each FPGA device, one needs to perform design space exploration (with hardware parameters VEC_SIZE, LANE_NUM and CONV_GP_SIZE_X) to find the optimal design that maximizes the throughput or minimizes the execution time. Suggested hardware parameters for the above boards are summarized [here](https://github.com/doonny/PipeCNN/tree/master/documents). Since we are constantly optimizing the design and updating the code, the performance data in the following table might be outdated; please use the latest version to get the exact data. We welcome other vendors/researchers to provide the latest performance and cost information on other FPGA platforms/boards.\n\n| Boards     | Execution Time* | Batch Size | DSP Consumed |  Frequency|\n| :--------: |--------------:| ----------:| ------------:|----------:|\n| DE5a-net-ddr4    |          -- |         -- |           --|     --|\n\n*Note: ResNet-50 was used as the benchmark. 
Image size is 227x227x3.\n\n## Citation\nPlease kindly cite our work of PipeCNN if it helped your research:\n```\nDong Wang, Ke Xu and Diankun Jiang, “PipeCNN: An OpenCL-Based Open-Source FPGA Accelerator for Convolution Neural Networks”, FPT 2017.\n```\nArchitectural and algorithm level optimizations can be conducted to further improve the performance of PipeCNN. We list a few recent research achievements that are based on PipeCNN for reference:\n* Improving the throughput by introducing a new OpenCL-friendly sparse-convolution algorithm\n```\nDong Wang, Ke Xu, Qun Jia and  Soheil Ghiasi, “ABM-SpConv: A Novel Approach to FPGA-Based Acceleration of Convolutional Neural Network Inference”, DAC 2019.\n```\n\n## Contributors\n\nThe following people have also contributed to this project:\n\n[Diankun Jiang](https://github.com/dkjiang2018), [Ke Xu](https://github.com/xuke225), Qun Jia, Jianjing An, Xiaoyun Wang, Shihang Fu, Zhihong Bai, Dezheng Zhang.\n\n## Research Opportunity\nOur Research Lab is also looking for outstanding students who are interested in designing hardware accelerators for deep-learning algorithms on FPGAs. Please send me an e-mail if you are interested.\n\n## Related Works\nThere are other FPGA accelerators that also adopt an HLS-based design scheme. Some brilliant works are listed as follows. Note that PipeCNN is the first, and the only one, that is Open-Source （￣︶￣）↗\n* U. Aydonat, S. O'Connell, D. Capalija, A. C. Ling, and G. R. Chiu. \"An OpenCL™ Deep Learning Accelerator on Arria 10,\" *in Proc. FPGA 2017*.\n* N. Suda, V. Chandra, G. Dasika, A. Mohanty, Y. F. Ma, S. Vrudhula, J. S. Seo, and Y. Cao, \"Throughput-Optimized OpenCL-based FPGA accelerator for large-scale convolutional neural networks,\" *in Proc. FPGA 2016*.\n* C. Zhang, P. Li, G. Sun, Y. Guan, B. J. Xiao, and J. Cong, \"Optimizing FPGA-based accelerator design for deep convolutional neural networks,\" *in Proc. 
FPGA 2015*.\n","funding_links":[],"categories":["TODO scan for Android support in followings","FPGA Tools","C","Applications","Tools"],"sub_categories":["Interfaces","Mesh networks"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdoonny%2FPipeCNN","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdoonny%2FPipeCNN","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdoonny%2FPipeCNN/lists"}