
https://github.com/antosiowsky/gaussian-filter-assembly



          

Gaussian Filter


Project Overview


Topic: Image filtering using a Gaussian filter


Objective: Implement an image processing application that applies a Gaussian filter. The application keeps filtering fast by combining assembly-level optimizations with multi-core processing.



Algorithm Description


The filtering process involves applying a 3x3 Gaussian filter matrix to each pixel in an image. The steps include:



  • Retrieving neighboring pixels within a 3x3 region.

  • Multiplying pixels by the corresponding weights in the Gaussian filter matrix.

  • Summing the results and normalizing with a normalization coefficient.

  • Writing the filtered pixel to the output image.
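The steps above can be sketched in C++ as follows. This is a scalar illustration, not the project's code: the kernel weights (1 2 1 / 2 4 2 / 1 2 1, normalized by 16), the single-channel grayscale layout, the border handling, and the name gaussian3x3 are all assumptions, since the README does not state them.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: applies a 3x3 Gaussian blur to an 8-bit grayscale
// image stored row by row. Border pixels are copied unchanged (assumption).
std::vector<std::uint8_t> gaussian3x3(const std::vector<std::uint8_t>& src,
                                      int width, int height) {
    // Assumed kernel; the weights sum to 16, the normalization coefficient.
    static const int kernel[3][3] = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};
    std::vector<std::uint8_t> dst(src);  // start from a copy so borders are kept
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            int sum = 0;
            // Steps 1 and 2: fetch the 3x3 neighborhood and weight each pixel.
            for (int ky = -1; ky <= 1; ++ky)
                for (int kx = -1; kx <= 1; ++kx)
                    sum += kernel[ky + 1][kx + 1] *
                           src[(y + ky) * width + (x + kx)];
            // Steps 3 and 4: normalize and write the filtered pixel.
            dst[y * width + x] = static_cast<std::uint8_t>(sum / 16);
        }
    }
    return dst;
}
```

Because the assumed weights sum to 16, a uniform image passes through unchanged, which makes a convenient sanity check.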


Optimization is achieved through the use of SIMD (Single Instruction Multiple Data) instructions and multi-threading, allowing simultaneous processing of multiple pixels.
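One common way to spread this kind of filter across threads is to give each thread a contiguous band of rows. The sketch below shows only that partitioning step; the name splitRows and the band-per-thread scheme are assumptions, not something the README confirms.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: splits rows [0, height) into up to `threads`
// contiguous bands of near-equal size. Each pair is a half-open
// range [first, last).
std::vector<std::pair<int, int>> splitRows(int height, int threads) {
    std::vector<std::pair<int, int>> bands;
    if (height <= 0 || threads <= 0) return bands;
    const int count = std::min(height, threads);  // never more bands than rows
    const int base = height / count;              // minimum rows per band
    const int extra = height % count;             // first `extra` bands get +1
    int start = 0;
    for (int i = 0; i < count; ++i) {
        const int rows = base + (i < extra ? 1 : 0);
        bands.emplace_back(start, start + rows);
        start += rows;
    }
    return bands;
}
```

Each band could then be handed to its own worker (for example a std::thread). Every worker writes only its own output rows and reads the input image without modifying it, so no locking is needed, although a worker does read one input row above and below its band.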



Input Parameters




  • Input Image: BMP format image file to be processed.


  • Number of Threads: Specifies the number of threads used for processing (1, 2, 4, 8, 16, 32, 64).


  • Input Data Type: Various image types (e.g., uniform, gradient, random) for testing the algorithm.


  • Computation Library: Specifies the computational method (pure assembly vs. C++ implementation).


Assembly Code Snippet



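; Loading neighboring pixels
pinsrb xmm1, byte ptr[RCX + R11 - 3], 0   ; insert one byte into byte lane 0 of xmm1
pinsrb xmm3, byte ptr[RCX + R11], 1       ; byte lane 1 of xmm3
pinsrb xmm1, byte ptr[RCX + R11 + 3], 2   ; byte lane 2 of xmm1
pinsrb xmm3, byte ptr[RCX - 3], 3         ; byte lane 3 of xmm3
pinsrb xmm3, byte ptr[RCX + 3], 5         ; byte lane 5 of xmm3

; Multiplying pixels by filter weights
pmullw xmm3, xmm4    ; multiply packed 16-bit words, keeping the low 16 bits
pxor xmm2, xmm2      ; zero xmm2
psadbw xmm1, xmm2    ; sum of absolute differences against zero: adds up the bytes in each 64-bit half of xmm1
paddsw xmm1, xmm3    ; add packed 16-bit words with signed saturation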

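This excerpt relies on SSE instructions operating on the XMM registers, so several neighboring pixel values are weighted and accumulated per instruction instead of one at a time, reducing memory overhead and increasing processing speed.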
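The offsets of -3 and +3 in the snippet are consistent with 24-bit pixels, where the same color channel of the left and right neighbor lies 3 bytes away. The C++ sketch below shows that addressing scheme under stated assumptions: the interleaved 3-byte pixel layout, the reading of R11 as a row stride, and the names rowStride and byteOffset are guesses for illustration and are not confirmed by the README. The padding of each BMP row to a multiple of 4 bytes is part of the BMP format itself.

```cpp
#include <cstddef>

// Bytes per pixel in a 24-bit image (assumption: three interleaved channels).
constexpr std::size_t kBytesPerPixel = 3;

// Row stride in bytes for a 24-bit BMP: each row is padded to a multiple of 4.
constexpr std::size_t rowStride(std::size_t width) {
    return (width * kBytesPerPixel + 3) / 4 * 4;
}

// Byte offset of channel c (0..2) of pixel (x, y) from the start of the
// pixel data, given the row stride in bytes.
constexpr std::size_t byteOffset(std::size_t x, std::size_t y, std::size_t c,
                                 std::size_t stride) {
    return y * stride + x * kBytesPerPixel + c;
}
```

With this layout, the horizontal neighbors of a byte sit at offsets of -3 and +3, and the vertical neighbors sit one stride before and after it.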



User Interface


The application provides a graphical user interface (GUI) where users can:



  • Select a BMP image file for processing.

  • Specify the number of filtering iterations.

  • Choose a processing library (C++ or Assembly).

  • Adjust the number of threads using a slider.

  • Apply the Gaussian filter and save the output image.



Performance Measurements


Testing was performed on three different image sizes: small (640x426), medium (1280x853), and large (1920x1280).


Performance comparisons were made between ASM and C++ implementations using various threading configurations (1, 2, 4, 8, 16, 32, 64 threads).


For each configuration, execution time was measured over 5 runs, with the first run excluded as a warm-up.



Sample Performance Data (Small Image, Assembly)




Threads  Run 1  Run 2  Run 3  Run 4  Run 5  Avg Time (ms)  Standard Deviation
1        57     13     12     17     16     14.5           2.38
2        18     6      9      7      8      7.5            1.29
4        24     5      8      6      5      6              1.41
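The averaging described above (five runs, the first discarded as a warm-up) can be sketched in C++ as follows. The names RunStats and summarize are hypothetical, and the n - 1 (sample) form of the standard deviation is an assumption that happens to match the sample data.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct RunStats {
    double mean;
    double stddev;
};

// Hypothetical helper: mean and sample standard deviation (n - 1 denominator)
// of the runs that remain after dropping the first one as a warm-up.
RunStats summarize(const std::vector<double>& runs) {
    RunStats stats{0.0, 0.0};
    if (runs.size() < 2) return stats;      // nothing left after the warm-up
    const std::size_t n = runs.size() - 1;  // number of runs actually counted
    double sum = 0.0;
    for (std::size_t i = 1; i < runs.size(); ++i) sum += runs[i];
    stats.mean = sum / n;
    if (n < 2) return stats;                // a deviation needs two samples
    double squares = 0.0;
    for (std::size_t i = 1; i < runs.size(); ++i) {
        const double d = runs[i] - stats.mean;
        squares += d * d;
    }
    stats.stddev = std::sqrt(squares / (n - 1));
    return stats;
}
```

For the 1-thread row (57, 13, 12, 17, 16 ms) this gives a mean of 14.5 ms and a standard deviation of about 2.38 ms, matching the reported values.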



Conclusion


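The project shows that multi-threading combined with SIMD assembly speeds up filtering substantially: in the sample data for the small image, the average assembly time drops from 14.5 ms with 1 thread to 6 ms with 4 threads. The assembly implementation outperforms the C++ version, particularly at higher thread counts.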



License


This project is licensed under the MIT License.