Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/congard/gaussian-elimination-optimizations
This repository contains several low-level optimizations of the Gaussian Elimination algorithm. Made as a part of Code Optimizations on Different Architectures course at the AGH University of Science and Technology.
https://github.com/congard/gaussian-elimination-optimizations
agh agh-university agh-ust agh-wi avx c-language gauss-elimination optimizations sse
Last synced: 24 days ago
JSON representation
This repository contains several low-level optimizations of the Gaussian Elimination algorithm. Made as a part of Code Optimizations on Different Architectures course at the AGH University of Science and Technology.
- Host: GitHub
- URL: https://github.com/congard/gaussian-elimination-optimizations
- Owner: congard
- License: mit
- Created: 2024-06-30T13:51:27.000Z (4 months ago)
- Default Branch: master
- Last Pushed: 2024-06-30T13:56:40.000Z (4 months ago)
- Last Synced: 2024-09-30T01:03:37.134Z (about 1 month ago)
- Topics: agh, agh-university, agh-ust, agh-wi, avx, c-language, gauss-elimination, optimizations, sse
- Language: C
- Homepage:
- Size: 688 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# GaussianElimination - Optimizations
This repository contains several low-level optimizations of the Gaussian Elimination algorithm.
Made as a part of Code Optimizations on Different Architectures course
at the AGH University of Science and Technology.## Optimizations
Optimizations are located in the `src` directory and named `ge#.c`, where `#` - optimization number.
### ge1
Basic Gaussian Elimination implementation.
### ge2
Counters are moved to registers.
### ge3
Multiplier, i.e.
```c
multiplier = A[i][k] / A[k][k]
```is moved to register and extracted from the most inner loop.
### ge4
The third loop is unrolled by 8.
### ge5
Matrix representation was changed to 1D array.
### ge6
SSE instructions are used.
### ge7
AVX-256 instructions are used.
### ge8
AVX-512 instructions are used.
## Dependencies & Requirements
The project requires **PAPI** to be installed (tested with PAPI 7.0.1.0).
PAPI provides a programmer interface to monitor the performance of running programs.On Fedora, PAPI can be installed by executing the following command:
```bash
sudo dnf install papi-devel
```> [!IMPORTANT]
> In order to run the prepared benchmarks, your CPU must at least support the following counters:
>
> | Counter | Description |
> |----------------|----------------------------------------------|
> | `PAPI_FP_OPS` | Floating point operations |
> | `PAPI_TOT_INS` | Instructions completed |
> | `PAPI_TOT_CYC` | Total cycles |
> | `PAPI_BR_TKN` | Conditional branch instructions taken |
> | `PAPI_BR_MSP` | Conditional branch instructions mispredicted |
>
> Supported counters can be checked by executing
> ```bash
> papi_avail | grep Yes
> ```
>
> Moreover, in order to run `ge6`, `ge7` and `ge8` optimizations,
> your CPU must support SSE, AVX-256 and AVX-512 accordingly.
> It can be checked by executing `lscpu` command.## Bibliografy
1. Materiały do laboratorium 2 – dr hab. inż. Maciej Woźniak
2. Materiały do laboratorium 3 – dr hab. inż. Maciej Woźniak
3. Materiały do laboratorium 4 – dr hab. inż. Maciej Woźniak
4. How To Optimize Gemm – flame: [https://github.com/flame/how-to-optimize-gemm](https://github.com/flame/how-to-optimize-gemm)