https://github.com/gante/opencl-fpga-fir-filter
OpenCL code for big FIR Filters, on FPGAs (also works for GPUs, with the commented code)
- Host: GitHub
- URL: https://github.com/gante/opencl-fpga-fir-filter
- Owner: gante
- License: mit
- Created: 2018-01-29T21:45:25.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2018-02-13T15:32:44.000Z (almost 7 years ago)
- Last Synced: 2024-12-03T15:49:51.859Z (about 2 months ago)
- Language: C++
- Size: 11.7 KB
- Stars: 3
- Watchers: 1
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# OpenCL-FPGA-FIR-Filter
OpenCL code for big FIR Filters, on FPGAs (also works for GPUs, with the commented code).
Minimal performance loss when there are more taps than fit in a single FPGA pipeline.
Resulted in: http://ieeexplore.ieee.org/document/7828456/
Problem type: Compute Bound (several operations per load/store)
[each (complex) filter tap requires 4 ADD and 4 MUL = 8 FP operations;
e.g. for a filter with 64 taps, each output requires 64*8 = 512 = 2^9 FP operations]
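
The operation count above can be illustrated with a minimal host-side sketch in C++. This is not the repository's OpenCL kernel: the `Complex` struct and the `mac`, `fir`, and `flops_per_output` names are hypothetical, chosen only to show where the 4 MUL and 4 ADD per complex tap come from.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical plain-struct complex sample; the actual kernels may lay out data differently.
struct Complex {
    float re;
    float im;
};

// One complex multiply-accumulate: 4 MUL + 4 ADD/SUB = 8 FP operations.
inline Complex mac(Complex acc, Complex x, Complex h) {
    acc.re += x.re * h.re - x.im * h.im;  // 2 MUL, 1 SUB, 1 ADD
    acc.im += x.re * h.im + x.im * h.re;  // 2 MUL, 2 ADD
    return acc;
}

// Direct-form FIR: y[n] = sum over k of h[k] * x[n - k],
// treating samples before x[0] as zero.
std::vector<Complex> fir(const std::vector<Complex>& x,
                         const std::vector<Complex>& h) {
    std::vector<Complex> y(x.size(), Complex{0.0f, 0.0f});
    for (std::size_t n = 0; n < x.size(); ++n) {
        for (std::size_t k = 0; k < h.size() && k <= n; ++k) {
            y[n] = mac(y[n], x[n - k], h[k]);
        }
    }
    return y;
}

// FP operations per output sample for a filter with `taps` complex taps.
constexpr std::size_t flops_per_output(std::size_t taps) {
    return taps * 8;
}
```

With 64 taps, `flops_per_output(64)` evaluates to 512, matching the 2^9 figure above.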
////////////////////////////////////////////////////////////////////////////////
Simulations: 2^22 floating point elements, 64 taps
BW [GB/s] = 16 [MB] / Exec. Time [ms]
TP [GFLOPS] = ((2^22 * 2^9 FP ops) / 10^9) / Exec. Time [s] = 2147.4836 / Exec. Time [ms]
(The exec. time doesn't include the time it requires to move the memory from the host to the device)
////////////////////////////////////////////////////////////////////////////////
@NVIDIA GTX860M: t = 7.122 ms; avg bandwidth = 13.48 GB/s; avg GFLOPS = 301.53
@DE5-Net (Stratix V 5SGXA7): t = 17.22 ms; avg bandwidth = 0.09 GB/s; avg GFLOPS = 124.71