{"id":13641512,"url":"https://github.com/rockcarry/ffcnn","last_synced_at":"2026-02-15T22:14:15.187Z","repository":{"id":41743681,"uuid":"391063800","full_name":"rockcarry/ffcnn","owner":"rockcarry","description":"ffcnn is a cnn neural network inference framework, written in 600 lines C language.","archived":false,"fork":false,"pushed_at":"2022-07-06T03:12:29.000Z","size":1907,"stargazers_count":75,"open_issues_count":0,"forks_count":28,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-08-03T01:23:30.258Z","etag":null,"topics":["cnn","darknet","ncnn","yolo","yolo-fastest","yolov3"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rockcarry.png","metadata":{"files":{"readme":"readme.txt","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-07-30T12:52:39.000Z","updated_at":"2024-05-31T12:03:45.000Z","dependencies_parsed_at":"2022-08-19T00:00:40.975Z","dependency_job_id":null,"html_url":"https://github.com/rockcarry/ffcnn","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rockcarry%2Fffcnn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rockcarry%2Fffcnn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rockcarry%2Fffcnn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rockcarry%2Fffcnn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rockcarry","download_url":"https://codeload.github.com/rockcarry/ffcnn/tar.gz/refs/heads/master","host":{"name":"GitHub"
,"url":"https://github.com","kind":"github","repositories_count":223827363,"owners_count":17209780,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cnn","darknet","ncnn","yolo","yolo-fastest","yolov3"],"created_at":"2024-08-02T01:01:21.400Z","updated_at":"2026-02-15T22:14:15.160Z","avatar_url":"https://github.com/rockcarry.png","language":"C","funding_links":[],"categories":["Other Versions of YOLO","Data \u0026 Science"],"sub_categories":["Machine Learning Framework"],"readme":"+--------------------------------------+\r\n ffcnn: CNN forward inference library\r\n+--------------------------------------+\r\n\r\nffcnn is a convolutional neural network forward-inference library written in C.\r\nIt implements complete forward inference for the yolov3 and yolo-fastest networks in just over 500 lines of code.\r\nIt depends on no third-party libraries, compiles in a standard C environment, and builds and runs correctly\r\non multiple platforms, including VC, msys2+gcc, and ubuntu+gcc.\r\n\r\nCompared with darknet and ncnn, this code has no instruction-set-specific optimizations, but it is simpler and easier to read,\r\nand can serve as a reference for learning about convolutional neural networks.\r\n\r\n\r\nNotes on darknet and yolov3\r\n---------------------------\r\nThe yolov3 network structure contains only these layer types: convolutional, dropout, shortcut, route, maxpool,\r\nupsample, and yolo. It is therefore fairly easy to implement.\r\n\r\nConvolutional layer:\r\n1. Understand what convolution means and how it is computed.\r\n2. Understand the meaning of pad and stride in a convolution.\r\n3. Each filter also has a bias parameter, which must be added to every output point after it is computed.\r\n   Without batch normalization (batch_normalize), the computation is:\r\n   x += bias;\r\n   x  = activate(x, type);\r\n4. Understand what grouped convolution is.\r\n5. Every output point of the convolution must pass through the activation function.\r\n6. 
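When a convolutional layer uses batch normalization (batch_normalize), the computation is:\r\n   x  = (x - rolling_mean) / sqrt(rolling_variance + 0.00001f);\r\n   x *= scale;\r\n   x += bias;\r\n   x  = activate(x, type);\r\n   where rolling_mean, rolling_variance, scale, and bias can be read from the darknet weights file.\r\n\r\nDropout layer:\r\nDuring forward inference this layer can be treated as absent: the input data is passed to the next layer unchanged.\r\n\r\nShortcut layer:\r\nAdds the data of the specified layer to the data of the current layer and passes the result to the next layer.\r\n\r\nRoute layer:\r\nConcatenates the specified layers (up to 4); width and height stay the same while the channel count grows, and the result is passed to the next layer.\r\n\r\nMaxpool layer:\r\nMax pooling layer: the maximum of the values covered by the filter is taken as the result.\r\n\r\nUpsample layer:\r\nUpsampling layer: it can be thought of as enlarging the image; stride gives the scale factor, and nearest-neighbor interpolation is usually sufficient.\r\n\r\nYolo layer:\r\nThis layer mainly computes bboxes from the input feature map.\r\nTaking yolo-fastest as an example, there are two yolo layers whose inputs are 10x10x255 and 20x20x255 respectively.\r\nHere 255 is the number of channels, broken down as follows:\r\n255 = 3 * (4 + 1 + 80)\r\n3 is the number of bbox results in each grid cell.\r\nEach bbox result holds 4 coordinate values (x, y, w, h), 1 object score, and then 80 class scores.\r\nFor each bbox, the highest-scoring of the 80 classes is taken as its class; the bbox is discarded if the score is below the threshold (ignore_thresh).\r\nAll qualifying bboxes are collected in a list, and an NMS pass over that list gives the final result.\r\n\r\nHow the score and (x, y, w, h) of each bbox are computed:\r\nLet tx, ty, tw, th, bs be the values of channels 0, 1, 2, 3, 4 (followed by the 80 class scores).\r\n\r\nScore: score = sigmoid(bs); (the 80 class scores are computed the same way)\r\nCoordinates:\r\nfloat bbox_cx = (j + sigmoid(tx)) * grid_width;  (grid_width is the width of the network input layer, i.e. layer 0, divided by the number of grid cells: the pixel width of one cell)\r\nfloat bbox_cy = (i + sigmoid(ty)) * grid_height; (same method as bbox_cx)\r\nfloat bbox_w  = (float)exp(tw) * anchor_box_w;   (if there is a scale factor, multiply by it as well)\r\nfloat bbox_h  = (float)exp(th) * anchor_box_h;   (same method as bbox_w)\r\n\r\nbbox_cx and bbox_cy are the center coordinates, and bbox_w and bbox_h are the width and height. Converting to corner coordinates gives:\r\nx1 = bbox_cx - bbox_w * 0.5f;\r\ny1 = bbox_cy - bbox_h * 0.5f;\r\nx2 = bbox_cx + bbox_w * 0.5f;\r\ny2 = bbox_cy + bbox_h * 0.5f;\r\n\r\n\r\nThe darknet weights file\r\n------------------------\r\n\r\nThe file begins with a header:\r\n#pragma pack(1)\r\ntypedef struct {\r\n    int32_t  ver_major, ver_minor, ver_revision;\r\n    uint64_t net_seen;\r\n} WEIGHTS_FILE_HEADER;\r\n#pragma pack()\r\n\r\nAll of the weight data follows. For yolov3 and yolo-fastest, essentially only the convolutional layers have weights; the other layers carry no weight data.\r\nImage and filter data are both in NCHW layout. The weight data of a convolutional layer is stored in this order:\r\n\r\nn bias values\r\nif (batchnorm) {\r\n    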
n scale values\r\n    n rolling_mean values\r\n    n rolling_variance values\r\n}\r\nn * c * h * w weight values\r\n\r\n\r\nffcnn features\r\n--------------\r\n1. Extremely concise, easy-to-understand C implementation\r\n2. The core algorithm is only 600 lines\r\n3. No dependency on any third-party library\r\n4. Easy to port to various platforms\r\n5. Layers that are no longer needed are freed automatically during inference to reduce memory usage\r\n6. At this stage the goal is to make it work first; performance will be optimized later when time permits\r\n7. Uses darknet .cfg and .weights files directly (no conversion needed)\r\n\r\n\r\nffcnn vs ncnn performance benchmark\r\n-----------------------------------\r\n\r\nTest environment:\r\n1. Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz 1.90GHz, 8GB RAM\r\n2. win7 64bit + msys2 + mingw32 + gcc version 10.3.0\r\n3. ffcnn + yolo-fastest code: https://github.com/rockcarry/ffcnn\r\n4. ncnn  + yolo-fastest code: https://github.com/rockcarry/ffyolodet\r\n5. Test image test.bmp, 100 inference runs\r\n\r\nTest results:\r\n+-------------+--------------+-------------------+------------------+\r\n| Item        | ffcnn-v1.2.0 | ncnn with avx off | ncnn with avx on |\r\n+-------------+--------------+-------------------+------------------+\r\n| Time        | 14555 ms     | 14649 ms          | 8424 ms          |\r\n+-------------+--------------+-------------------+------------------+\r\n| Memory      | 5 MB         | 41 MB             | 41 MB            |\r\n+-------------+--------------+-------------------+------------------+\r\n| Binary size | 68 KB        | 1.2 MB            | 1.2 MB           |\r\n+-------------+--------------+-------------------+------------------+\r\n\r\nAs the table shows, ffcnn is already on par with ncnn when AVX optimization is disabled.\r\n\r\n\r\nrockcarry@163.com\r\n20:22 2021/8/7\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frockcarry%2Fffcnn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frockcarry%2Fffcnn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frockcarry%2Fffcnn/lists"}