{"id":28470480,"url":"https://github.com/icarogabryel/cnn-accelerator","last_synced_at":"2025-08-07T23:08:33.880Z","repository":{"id":284431415,"uuid":"953275796","full_name":"icarogabryel/cnn-accelerator","owner":"icarogabryel","description":"CNN accelerator using radix-4 Booth's algorithm described in VHDL . It multiplies a 32-bit integer with a 7-bit constant from a 3x3 kernel and accumulates the results.","archived":false,"fork":false,"pushed_at":"2025-07-04T20:22:53.000Z","size":255,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-04T20:34:30.779Z","etag":null,"topics":["accelerator","booths-algorithm","cnn","compressor","computer-architecture","computer-organization","hardware","hardware-acceleration","hardware-designs","hdl","ia","integrated-circuits","multiplier"],"latest_commit_sha":null,"homepage":"","language":"VHDL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/icarogabryel.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-23T01:01:41.000Z","updated_at":"2025-07-04T20:22:56.000Z","dependencies_parsed_at":"2025-05-14T20:29:38.843Z","dependency_job_id":"417fd586-2240-44c9-9fea-c4dc5facaa96","html_url":"https://github.com/icarogabryel/cnn-accelerator","commit_stats":null,"previous_names":["icarogabryel/cnn-accelerator"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/icarogabryel/cnn-accelerator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/icarogabryel%
2Fcnn-accelerator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/icarogabryel%2Fcnn-accelerator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/icarogabryel%2Fcnn-accelerator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/icarogabryel%2Fcnn-accelerator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/icarogabryel","download_url":"https://codeload.github.com/icarogabryel/cnn-accelerator/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/icarogabryel%2Fcnn-accelerator/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":269338065,"owners_count":24400180,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-07T02:00:09.698Z","response_time":73,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accelerator","booths-algorithm","cnn","compressor","computer-architecture","computer-organization","hardware","hardware-acceleration","hardware-designs","hdl","ia","integrated-circuits","multiplier"],"created_at":"2025-06-07T09:30:40.363Z","updated_at":"2025-08-07T23:08:33.866Z","avatar_url":"https://github.com/icarogabryel.png","language":"VHDL","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CNN Accelerator\n\nIn this repository, you will find the hardware design for a Convolutional Neural 
Network (CNN) accelerator described in VHDL. It multiplies a 32-bit integer by a 7-bit constant from a 3x3 kernel and accumulates the results.\n\n## The multiplier\n\nThe multiplier uses a parallel implementation of radix-4 Booth's algorithm. The 7-bit constant serves as the multiplier operand, so only 4 partial products are generated. The original plan was to sum the partial products with a Wallace tree. However, since such a tree would have only two levels, a 4-way adder built on 4:2 compressors was used instead to reduce the size of the circuit.\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"doc/multiplier.jpg\" alt=\"Figure 1\" width=\"65%\"/\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003eFigure 1\u003c/p\u003e\n\nAs seen in figure 1, the multiplier has 4 partial product generators and a 4-way adder that sums their outputs. Inside each partial product generator, Booth's algorithm is implemented as shown in figure 2. A multiplexer selects the value of the partial product based on the 3-bit block taken from the multiplier operand.\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"doc/partial_gen.jpg\" alt=\"Figure 2\" width=\"65%\"/\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003eFigure 2\u003c/p\u003e\n\nThe partial products are generated according to the following radix-4 table, where `mtpcd` is the multiplicand:\n\n| Operation  | Block |\n|------------|-------|\n| mtpcd * 0  | 000   |\n| mtpcd * 1  | 001   |\n| mtpcd * 1  | 010   |\n| mtpcd * 2  | 011   |\n| mtpcd * -2 | 100   |\n| mtpcd * -1 | 101   |\n| mtpcd * -1 | 110   |\n| mtpcd * 0  | 111   |\n\nAlso, the multiplexer is only 33 bits wide because sign extension is only necessary when summing the partial products. The left shift that aligns the partial products for Booth's algorithm is applied at the same time as the sign extension.\n\nThe 4-way adder is implemented with a 4:2 compressor chain. 
The chain was implemented as follows:\n\n```vhdl\ncompressor_gen : for i in 0 to 38 generate\n    p_carry(i) \u003c= carry_bus(i) or c_out_bus(i);\n\n    compressor_inst : compressor\n        port map(\n            b0    =\u003e a(i),\n            b1    =\u003e b(i),\n            b2    =\u003e c(i),\n            b3    =\u003e d(i),\n            c_in  =\u003e p_carry(i),\n            c_out =\u003e c_out_bus(i + 1),\n            carry =\u003e carry_bus(i + 1),\n            sum   =\u003e sum(i)\n        );\n\nend generate compressor_gen;\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ficarogabryel%2Fcnn-accelerator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ficarogabryel%2Fcnn-accelerator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ficarogabryel%2Fcnn-accelerator/lists"}