{"id":18928207,"url":"https://github.com/bbuf/image-processing-algorithm-speed","last_synced_at":"2025-04-09T22:18:48.485Z","repository":{"id":50327642,"uuid":"194971363","full_name":"BBuf/Image-processing-algorithm-Speed","owner":"BBuf","description":"opencv","archived":false,"fork":false,"pushed_at":"2020-10-28T14:49:10.000Z","size":5697,"stargazers_count":244,"open_issues_count":3,"forks_count":84,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-04-09T22:18:43.423Z","etag":null,"topics":["avx","gaussian-filter","opencv","rgb2gray","sobel","sse"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BBuf.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-07-03T03:15:27.000Z","updated_at":"2025-03-04T09:38:27.000Z","dependencies_parsed_at":"2022-09-24T12:46:38.004Z","dependency_job_id":null,"html_url":"https://github.com/BBuf/Image-processing-algorithm-Speed","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BBuf%2FImage-processing-algorithm-Speed","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BBuf%2FImage-processing-algorithm-Speed/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BBuf%2FImage-processing-algorithm-Speed/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BBuf%2FImage-processing-algorithm-Speed/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BBuf","download_url":"https://codeload.github.com/BBuf/Im
age-processing-algorithm-Speed/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248119285,"owners_count":21050755,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["avx","gaussian-filter","opencv","rgb2gray","sobel","sse"],"created_at":"2024-11-08T11:24:09.763Z","updated_at":"2025-04-09T22:18:48.455Z","avatar_url":"https://github.com/BBuf.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Introduction\n\n## speed_histogram_algorithm_framework \n\n- A local-histogram acceleration framework that uses some approximate computation and SSE instruction-set acceleration internally. It quickly handles algorithms such as median filtering, maximum filtering, minimum filtering, and surface blur.\n\n## resources\n- Resources related to SSE optimization.\n\n#### The PC's CPU is an I5-3230, 64-bit.\n\n#### The OpenCV version is 3.4.0.\n\n\n\n- sse_implementation_of_common_functions_in_image_processing.cpp SSE implementations of several functions commonly used in image processing.\n- speed_rgb2gray_sse.cpp Uses SSE to accelerate RGB-to-grayscale conversion, nearly 5x faster than the original implementation. Algorithm details: https://mp.weixin.qq.com/s/SagVQ5gfXWWA7NATv-zvBQ  Speed test results:\n\n\u003eTest CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz\n\n| Resolution | Optimization                             | Iterations | Time |\n| --------- | ---------------------------------------- | -------- | ---- |\n| 4032x3024 | Original implementation                  | 1000      |  12.139ms    |\n| 4032x3024 | Version 1 (float-\u003eINT)                 | 1000      |   7.629ms   |\n| 4032x3024 | OpenCV built-in function                 | 1000      |   4.287ms   |\n| 4032x3024 | Version 2 (manual 4-way parallelism)     | 1000      |   10.528ms   |\n| 4032x3024 | Version 3 (OpenMP, 4 threads)            | 1000      |   7.632ms   |\n| 4032x3024 | Version 4 (SSE, 12 pixels per iteration) | 1000      |   5.579ms   |\n| 4032x3024 | 
Version 5 (SSE, 15 pixels per iteration)  | 1000      |  5.843ms    |\n| 4032x3024 | Version 6 (AVX2, 10 pixels per iteration) | 1000      |   3.576ms   |\n| 4032x3024 | Version 7 (AVX2 + std::async)            | 1000      |   2.626ms   |\n\n\n\n- speed_vibrance_algorithm.cpp Uses SSE to accelerate the vibrance (natural saturation) algorithm, a 9x speedup. Algorithm details: https://mp.weixin.qq.com/s/26UVvqMNLgnquXY21Xu3OQ . Speed test results:\n\n|Resolution|Optimization|Iterations|Time|\n|----|----|----|----|\n|4032x3024|Original implementation|100|115.36ms|\n|4032x3024|Version 1|100|62.43ms|\n|4032x3024|Version 2 (4 threads)|100|28.89ms|\n|4032x3024|Version 3 (SSE)|100|12.69ms|\n\n\n\n- speed_sobel_edgedetection_sse.cpp Uses SSE to accelerate Sobel edge detection, roughly 22x faster from the plain implementation to the fastest version in the table. Algorithm details: https://mp.weixin.qq.com/s/5lCfO_jmSfP7DbsgM7qbpg . Speed test results:\n\n|Resolution|Optimization|Iterations|Time|\n|-|-|-|-|\n|4032x3024|Plain implementation|1000|126.54 ms|\n|4032x3024|Float-\u003eINT + lookup table|1000|81.62 ms|\n|4032x3024|SSE version 1|1000|34.95 ms|\n|4032x3024|SSE version 2|1000|28.87 ms|\n|4032x3024|AVX2 version 1|1000|15.42 ms  |\n|4032x3024|AVX2 + std::async|1000| 5.69 ms |\n\n- speed_skin_detection_sse.cpp Uses SSE to accelerate skin detection, nearly 9x faster from the plain implementation to the fastest version in the table. Algorithm details: https://mp.weixin.qq.com/s/UFzY1s6ohTM-dnNg0P4kkw . Speed test results:\n\n|Resolution|Optimization|Iterations|Time|\n|-|-|-|-|\n|4272x2848|Plain implementation|1000|41.40ms|\n|4272x2848|OpenMP, 4 threads|1000|36.54ms|\n|4272x2848|SSE version 1|1000|6.77ms|\n|4272x2848|SSE version 2 (std::async)|1000|4.73ms|\n\n- speed_rgb2yuv_sse.cpp Heavily SSE-optimized conversion between the RGB and YUV color spaces. Algorithm details: https://mp.weixin.qq.com/s/ryGocz-0YpqZ1CjYXJbd7Q . Speed test results:\n\n|Resolution|Optimization|Iterations|Time|\n|-|-|-|-|\n|4032x3024|Plain implementation|1000|150.58ms|\n|4032x3024|Floating point removed, division replaced by bit operations|1000|76.70ms|\n|4032x3024|OpenMP, 4 threads|1000|50.48ms|\n|4032x3024|Plain SSE vectorization|1000|48.92ms|\n|4032x3024|Second pass using _mm_madd_epi16|1000|33.04ms|\n|4032x3024|SSE + 4 threads|1000|23.70ms|\n\n\n\n- speed_median_filter_3x3_sse.cpp A heavily optimized 3x3 median filter. Algorithm details: https://blog.csdn.net/just_sort/article/details/98617050 . Speed test results:\n\n|Resolution|Optimization|Iterations|Time|\n|-|-|-|-|\n|4032x3024|Plain implementation|10| 8293.79 ms |\n|4032x3024|Logic optimization, better pipelining|10|  83.75 ms |\n|4032x3024|SSE|10| 11.93 ms |\n|4032x3024|AVX|10| 9.32 ms |\n\n----------------------------------------------------------------------------------\n\n- speed_gaussian_filter_sse.cpp 
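Uses SSE to accelerate Gaussian filtering. Algorithm details: https://blog.csdn.net/just_sort/article/details/95212099 . Speed test results:\n\n| Optimization | Image resolution | Time |\n| ------------------- | ---------- | ---- |\n| Plain C implementation, single thread | 4032*3024  | 290.43ms |\n| SSE, single thread      | 4032*3024  | 265.96ms |\n\n- speed_integral_graph_sse.cpp Uses SSE to accelerate integral image computation, although it gives no speedup on the PC. Algorithm details: https://www.cnblogs.com/Imageshop/p/6897233.html . Speed test results:\n\n|Optimization|Image resolution|Time|\n|---------|----------|-------|\n|C implementation, single thread|4032*3024|66.66ms|\n|C implementation, 4 threads|4032*3024|65.34ms|\n|SSE, single thread|4032*3024|66.10ms|\n|SSE, 4 threads|4032*3024|66.20ms|\n\n\n- speed_common_functions.cpp Fast implementations of some common image processing functions; a few of them use SSE.\n- speed_max_filter_sse.cpp Implements maximum filtering with the speed_histogram_algorithm_framework framework; the larger the radius, the larger the speedup. Algorithm details: https://blog.csdn.net/just_sort/article/details/97280807 . Before running, turn off the SDL checks in the project properties, otherwise an uninitialized-variable error is reported. Speed test results:\n\n|Optimization|Image resolution|Radius|Time|\n|---------|----------|-------|-------|\n|C implementation, single thread|4272*2848|7|9445.90ms|\n|SSE, single thread|4272*2848|7|2234.55ms|\n|C implementation, single thread|4272*2848|9|14468.76ms|\n|SSE, single thread|4272*2848|9|2221.68ms|\n|C implementation, single thread|4272*2848|11|23069.10ms|\n|SSE, single thread|4272*2848|11|2180.95ms|\n\n- speed_box_filter_sse.cpp Implements an O(1) box filter with the speed_histogram_algorithm framework, using SSE optimization. Algorithm details: https://blog.csdn.net/just_sort/article/details/98075712 . Run it the same way as speed_max_filter_sse.cpp. Speed test results:\n\n|Optimization|Image resolution|Radius|Time|\n|---------|----------|-------|-------|\n|C implementation, single thread|4272*2848|11|163.16ms|\n|SSE, single thread|4272*2848|11|123.83ms|\n|C implementation, single thread|4272*2848|21|167.81ms|\n|SSE, single thread|4272*2848|21|126.98ms|\n|C implementation, single thread|4272*2848|31|168.62ms|\n|SSE, single thread|4272*2848|31|126.17ms|\n\n- speed_multi_scale_detail_boosting_see.cpp Builds on the SSE-optimized box filter in speed_box_filter_sse.cpp and further uses SIMD instructions to optimize the algorithm from the paper *DARK IMAGE ENHANCEMENT BASED ON PAIRWISE TARGET CONTRAST AND MULTI-SCALE DETAIL BOOSTING*. Algorithm details: https://blog.csdn.net/just_sort/article/details/98485746  . Speed test results on a Core i7-3770:\n\n|Optimization|Image resolution|Radius|Time|\n|---------|----------|-------|-------|\n|C implementation, single thread|4272*2848|7|206.00ms|\n|SSE, single thread|4272*2848|7|57.12ms|\n\n- speed_bicubic_zoom_sse.cpp 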
SSE-optimized bicubic interpolation. Algorithm details: https://blog.csdn.net/just_sort/article/details/100119653 . Speed test results:\n\n|Optimization|Image resolution|Output size|Time|\n|---------|----------|-------|-------|\n|Original C implementation|4272*2848|1.5x the original width and height|1856.29ms|\n|C implementation + lookup table + boundary optimization|4272*2848|1.5x the original width and height|839.10ms|\n|SSE + boundary optimization|4272*2848|1.5x the original width and height|315.70ms|\n|OpenCV 3.1.0 built-in function|4272*2848|1.5x the original width and height|118.77ms|\n\n\n\n\n# I maintain a WeChat official account that shares papers, algorithms, competitions, and everyday life. You are welcome to join.\n\n- If the image does not load, just search for GiantPandaCV.\n\n![](image/weixin.jpg)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbbuf%2Fimage-processing-algorithm-speed","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbbuf%2Fimage-processing-algorithm-speed","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbbuf%2Fimage-processing-algorithm-speed/lists"}