https://github.com/kokkos/kokkos-fft
A shared-memory FFT for the Kokkos ecosystem
- Host: GitHub
- URL: https://github.com/kokkos/kokkos-fft
- Owner: kokkos
- License: other
- Created: 2023-11-27T07:27:13.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-25T07:52:36.000Z (12 months ago)
- Last Synced: 2024-06-11T19:36:49.669Z (12 months ago)
- Topics: fft, fft-library, kokkos
- Language: C++
- Homepage: https://kokkosfft.readthedocs.io/
- Size: 4.36 MB
- Stars: 16
- Watchers: 7
- Forks: 1
- Open Issues: 20
Metadata Files:
- Readme: README.md
- License: LICENSES/Apache-2.0.txt
- Authors: AUTHORS
README
# kokkos-fft
> [!WARNING]
> EXPERIMENTAL FFT interfaces for Kokkos C++ Performance Portability Programming EcoSystem

kokkos-fft implements local interfaces between [Kokkos](https://github.com/kokkos/kokkos) and de facto standard FFT libraries, including [fftw](http://www.fftw.org), [cufft](https://developer.nvidia.com/cufft), [hipfft](https://github.com/ROCm/hipFFT) ([rocfft](https://github.com/ROCm/rocFFT)), and [oneMKL](https://spec.oneapi.io/versions/latest/elements/oneMKL/source/index.html). "Local" means not using MPI, or running within a single MPI process without knowing about MPI. We aim to provide [numpy.fft](https://numpy.org/doc/stable/reference/routines.fft.html)-like interfaces adapted for [Kokkos](https://github.com/kokkos/kokkos).
The key concept is **"As easy as numpy, as fast as vendor libraries"**. Accordingly, our API follows the [numpy.fft](https://numpy.org/doc/stable/reference/routines.fft.html) API with minor differences. The FFT library dedicated to the Kokkos device backend (e.g. [cufft](https://developer.nvidia.com/cufft) for the CUDA backend) is used automatically. If runtime values (say, `View` extents) are inconsistent, a runtime error (C++ `std::runtime_error`) is raised. See the [documentation](https://kokkosfft.readthedocs.io/) for more information.

Here is an example of a 1D real-to-complex transform with `rfft` in kokkos-fft.
```C++
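#include <Kokkos_Core.hpp>
#include <Kokkos_Complex.hpp>
#include <Kokkos_Random.hpp>
#include <KokkosFFT.hpp>
using execution_space = Kokkos::DefaultExecutionSpace;
template <typename T> using View1D = Kokkos::View<T*, execution_space>;
constexpr int n = 4;

// Real input of length n; complex output of length n/2+1
View1D<double> x("x", n);
View1D<Kokkos::complex<double> > x_hat("x_hat", n/2+1);

// Fill x with random values in [0, 1)
Kokkos::Random_XorShift64_Pool<> random_pool(12345);
Kokkos::fill_random(x, random_pool, 1);
Kokkos::fence();

KokkosFFT::rfft(execution_space(), x, x_hat);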
```

This is equivalent to the following Python code.
```python3
import numpy as np
x = np.random.rand(4)
x_hat = np.fft.rfft(x)
```

There are two major differences: the [`execution_space`](https://kokkos.org/kokkos-core-wiki/API/core/execution_spaces.html) argument, and the output (`x_hat`), which is passed as an argument to the API rather than returned from it. kokkos-fft only accepts [Kokkos Views](https://kokkos.org/kokkos-core-wiki/API/core/View.html) as input data. Whether the Views are accessible from `execution_space` is checked statically (inaccessible Views cause compilation errors).
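The `n/2+1` extent of `x_hat` in the example above follows from the Hermitian symmetry of a real-input transform: only the first `n/2+1` coefficients are non-redundant, which is also what `numpy.fft.rfft` returns. The sketch below illustrates this with a naive O(n²) DFT in standard C++ only. It is not how kokkos-fft computes the transform, and `naive_rfft` is a hypothetical helper introduced here purely for illustration.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of kokkos-fft): naive O(n^2) real-to-complex DFT.
// Returns only the n/2+1 non-redundant coefficients of a real input of length n.
std::vector<std::complex<double>> naive_rfft(const std::vector<double>& x) {
  const std::size_t n = x.size();
  const double pi = std::acos(-1.0);
  std::vector<std::complex<double>> x_hat(n / 2 + 1);
  for (std::size_t k = 0; k < x_hat.size(); ++k) {
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
      // Forward transform uses exp(-2*pi*i*k*j/n) with no scaling
      const double angle =
          -2.0 * pi * static_cast<double>(k * j) / static_cast<double>(n);
      sum += x[j] * std::complex<double>(std::cos(angle), std::sin(angle));
    }
    x_hat[k] = sum;
  }
  return x_hat;
}
```

For an input of length 4, this returns 3 coefficients, matching the `n/2+1` extent used for `x_hat`.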
Depending on the View dimension, batched plans are used automatically, as in the following example.
```C++
#include <Kokkos_Core.hpp>
#include <Kokkos_Complex.hpp>
#include <Kokkos_Random.hpp>
#include <KokkosFFT.hpp>
using execution_space = Kokkos::DefaultExecutionSpace;
template <typename T> using View2D = Kokkos::View<T**, execution_space>;
constexpr int n0 = 4, n1 = 8;

// Only the transformed (last) axis shrinks to n1/2+1
View2D<double> x("x", n0, n1);
View2D<Kokkos::complex<double> > x_hat("x_hat", n0, n1/2+1);

Kokkos::Random_XorShift64_Pool<> random_pool(12345);
Kokkos::fill_random(x, random_pool, 1);
Kokkos::fence();

int axis = -1;
// FFT along -1 axis and batched along 0th axis
KokkosFFT::rfft(execution_space(), x, x_hat, KokkosFFT::Normalization::backward, axis);
```

This is equivalent to
```python3
import numpy as np
x = np.random.rand(4, 8)
x_hat = np.fft.rfft(x, axis=-1)
```

In this example, a batched 1D `rfft` is executed over a 2D View along axis `-1`. Some basic examples can be found in [examples](examples).
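To make the axis convention concrete, the sketch below shows how a negative axis can be mapped to a non-negative index (following the numpy convention) and how the output extents of a batched `rfft` follow from the input extents. It uses standard C++ only. `normalize_axis` and `rfft_extents` are hypothetical helpers, not part of the kokkos-fft API, and throwing `std::runtime_error` on an invalid axis is an assumption made for this illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not kokkos-fft API): map a possibly negative axis
// to an index in [0, rank), as numpy does (axis -1 is the last axis).
std::size_t normalize_axis(int axis, std::size_t rank) {
  const int r = static_cast<int>(rank);
  if (axis < -r || axis >= r) {
    throw std::runtime_error("axis is out of range for the given rank");
  }
  return static_cast<std::size_t>(axis < 0 ? axis + r : axis);
}

// Hypothetical helper: extents of the rfft output for given real input extents.
// Only the transformed axis shrinks (to n/2+1); the other axes are batch axes.
std::vector<std::size_t> rfft_extents(std::vector<std::size_t> extents, int axis) {
  const std::size_t a = normalize_axis(axis, extents.size());
  extents[a] = extents[a] / 2 + 1;
  return extents;
}
```

With input extents `{4, 8}` and `axis = -1`, this gives `{4, 5}`, which matches the `(n0, n1/2+1)` shape of `x_hat` in the example above.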
## Disclaimer
**kokkos-fft is under development and subject to change without warning. The authors do not guarantee that this code runs correctly in all environments.**

## Using kokkos-fft
For the moment, there are two ways to use kokkos-fft: including it as a subdirectory in a CMake project, or installing it as a library. First, clone this repository.
```bash
git clone --recursive https://github.com/kokkos/kokkos-fft.git
```

### Prerequisites
To use kokkos-fft, the following are required:
* `CMake 3.22+`
* `Kokkos 4.4+`
* `gcc 8.3.0+` (CPUs)
* `IntelLLVM 2023.0.0+` (CPUs, Intel GPUs)
* `nvcc 11.0.0+` (NVIDIA GPUs)
* `rocm 5.3.0+` (AMD GPUs)

### CMake
Since kokkos-fft is a header-only library, it is enough to add it as a subdirectory. It is assumed that kokkos and kokkos-fft are placed under the `tpls` directory of the project.

Here is an example of using kokkos-fft in the following CMake project.
```
---/
|
└──/
|--tpls
| |--kokkos/
| └──kokkos-fft/
|--CMakeLists.txt
└──hello.cpp
```

The `CMakeLists.txt` would be
```CMake
cmake_minimum_required(VERSION 3.23)
project(kokkos-fft-as-subdirectory LANGUAGES CXX)

add_subdirectory(tpls/kokkos)
add_subdirectory(tpls/kokkos-fft)

add_executable(hello-kokkos-fft hello.cpp)
target_link_libraries(hello-kokkos-fft PUBLIC Kokkos::kokkos KokkosFFT::fft)
```

For compilation, we rely on the CMake options for Kokkos. For example, the configure and build commands for an A100 GPU are as follows.
```
cmake -B build \
-DCMAKE_CXX_COMPILER=g++ \
-DCMAKE_BUILD_TYPE=Release \
-DKokkos_ENABLE_CUDA=ON \
-DKokkos_ARCH_AMPERE80=ON
cmake --build build -j 8
```
Built this way, all functionality is executed on A100 GPUs. For installation as a library, details are provided in the [documentation](https://kokkosfft.readthedocs.io/en/latest/intro/building.html#install-kokkosfft-as-a-library).

## LICENCE
kokkos-fft is distributed under either the [MIT license](https://opensource.org/licenses/MIT) or, at your option, the Apache-2.0 licence with [LLVM exception](https://spdx.org/licenses/LLVM-exception.html).