Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/mikaelslevinsky/fasttransforms

:bullettrain_front: Fast orthogonal polynomial transforms :surfer:
https://github.com/mikaelslevinsky/fasttransforms

chebyshev harmonics jacobi laguerre legendre proriol spherical spin ultraspherical zernike

Last synced: 7 days ago
JSON representation

:bullettrain_front: Fast orthogonal polynomial transforms :surfer:

Awesome Lists containing this project

README

        

# FastTransforms

[![Build Status](https://github.com/MikaelSlevinsky/FastTransforms/workflows/CI/badge.svg)](https://github.com/MikaelSlevinsky/FastTransforms/actions?query=workflow%3ACI) [![Build status](https://ci.appveyor.com/api/projects/status/er98t0q3bsx4a5l9/branch/master?svg=true)](https://ci.appveyor.com/project/MikaelSlevinsky/fasttransforms/branch/master) [![codecov](https://codecov.io/gh/MikaelSlevinsky/FastTransforms/branch/master/graph/badge.svg)](https://codecov.io/gh/MikaelSlevinsky/FastTransforms) [![](https://img.shields.io/badge/docs-master-blue.svg)](https://MikaelSlevinsky.github.io/FastTransforms)

[FastTransforms](https://github.com/MikaelSlevinsky/FastTransforms) provides computational kernels and driver routines for orthogonal polynomial transforms. The univariate algorithms have a runtime complexity of O(n log n), while the multivariate algorithms are 2-normwise backward stable with a runtime complexity of O(nd+1), where n is the polynomial degree and d is the spatial dimension of the problem.

The transforms are separated into computational kernels that offer SSE, AVX, and AVX-512 vectorization on applicable Intel processors, and driver routines that are easily parallelized by OpenMP.

If you feel you need help getting started, please do not hesitate to e-mail me. If you use this library for your research, please cite the references.

## Installation Notes

The library makes use of OpenBLAS, FFTW3, and MPFR, which are easily installed via package managers such as Homebrew, apt-get, or vcpkg. When `FastTransforms` is compiled with OpenMP, the environment variable that controls multithreading is `OMP_NUM_THREADS`.

### macOS

Sample installation:
```
brew install libomp fftw mpfr
make CC=clang FT_USE_APPLEBLAS=1
```
On macOS, the OpenBLAS dependency is optional in light of the vecLib framework. It is also important to have `export SDKROOT="/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk"` defined in your environment. In case the library is compiled with vecLib, then the environment variable `VECLIB_MAXIMUM_THREADS` partially controls the multithreading.

### Linux

To access functions from the library, you must ensure that you append the current library path to the default. Sample installation:
```
apt-get install libomp-dev libblas-dev libopenblas-base libfftw3-dev libmpfr-dev
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:.
make CC=gcc
```
See the [GitHub build](https://github.com/MikaelSlevinsky/FastTransforms/blob/master/.github/workflows/ci.yml) for further details.

### Windows

We use GCC 7.2.0 distributed through MinGW-w64 on Visual Studio 2017. Sample installation:
```
vcpkg install openblas:x64-windows fftw3[core,threads]:x64-windows mpir:x64-windows mpfr:x64-windows
mingw32-make CC=gcc FT_BLAS=openblas FT_FFTW_WITH_COMBINED_THREADS=1
```
See the [AppVeyor build](https://github.com/MikaelSlevinsky/FastTransforms/blob/master/.appveyor.yml) for further details.

## Performance Benchmark

The tables below shows the current timings and accuracies for the transforms with pseudorandom coefficients drawn from U(-1,1) and converted back. The timings are reported in seconds and the error is measured in the relative 2- and ∞-norms.

The library is compiled by Apple LLVM version 10.0.1 as above with the compiler optimization flags `-O3 -march=native`, and executed on an iMac Pro (Early 2018) with a 2.3 GHz Intel Xeon W-2191B processor with 18 threads on 18 logical cores.

### Spherical harmonic series to bivariate Fourier series (`sph2fourier`)

| Degree | 63 | 127 | 255 | 511 | 1023 | 2047 | 4095 | 8191 |
| ----------------: | -------: | -------: | -------: | -------: | -------: | -------: | -------: | -------: |
| **Forward** | 0.000101 | 0.000446 | 0.001952 | 0.006277 | 0.026590 | 0.141962 | 0.775724 | 4.870621 |
| **Backward** | 0.000101 | 0.000448 | 0.001918 | 0.006357 | 0.027123 | 0.145149 | 0.815881 | 5.040577 |
| ϵ2 | 5.42e-16 | 7.79e-16 | 9.23e-16 | 1.27e-15 | 1.80e-15 | 2.52e-15 | 3.54e-15 | 4.98e-15 |
| ϵ | 1.33e-15 | 2.55e-15 | 4.55e-15 | 5.22e-15 | 9.33e-15 | 1.11e-14 | 1.81e-14 | 3.80e-14 |

### Proriol series to bivariate Chebyshev series (`tri2cheb`) with (α, β, γ) = (0, 0, 0)

| Degree | 63 | 127 | 255 | 511 | 1023 | 2047 | 4095 | 8191 |
| ----------------: | -------: | -------: | -------: | -------: | -------: | -------: | -------: | -------: |
| **Forward** | 0.000063 | 0.000300 | 0.001048 | 0.003886 | 0.019123 | 0.113299 | 0.699465 | 4.555086 |
| **Backward** | 0.000065 | 0.000289 | 0.000987 | 0.003837 | 0.019529 | 0.114718 | 0.699051 | 4.697958 |
| ϵ2 | 2.30e-15 | 3.94e-15 | 8.14e-15 | 1.68e-14 | 3.55e-14 | 7.30e-14 | 1.44e-13 | 2.97e-13 |
| ϵ | 1.09e-14 | 1.84e-14 | 5.49e-14 | 1.49e-13 | 9.06e-13 | 1.95e-12 | 3.73e-12 | 1.05e-11 |

The error growth rate is significantly reduced if (α, β, γ) = (-0.5, -0.5, -0.5).

### Generalized Zernike series to Chebyshev×Fourier series (`disk2cxf`) with (α, β) = (0, 0)

| Degree | 63 | 127 | 255 | 511 | 1023 | 2047 | 4095 | 8191 |
| ----------------: | -------: | -------: | -------: | -------: | -------: | -------: | -------: | -------: |
| **Forward** | 0.000209 | 0.001077 | 0.003618 | 0.013243 | 0.064321 | 0.369272 | 2.182001 | 16.15872 |
| **Backward** | 0.000208 | 0.001027 | 0.003487 | 0.012903 | 0.064544 | 0.372493 | 2.191747 | 16.43340 |
| ϵ2 | 7.60e-16 | 9.35e-16 | 1.31e-15 | 1.84e-15 | 2.56e-15 | 3.59e-15 | 5.05e-15 | 7.15e-15 |
| ϵ | 1.78e-15 | 2.66e-15 | 5.33e-15 | 7.11e-15 | 1.27e-14 | 1.79e-14 | 2.91e-14 | 1.66e-13 |

### Rotation of a spherical harmonic series (`sph_rotation`) with ZYZ Euler angles (α, β, γ) = (0.1, 0.2, 0.3)

| Degree | 63 | 127 | 255 | 511 | 1023 | 2047 | 4095 | 8191 |
| ----------------: | -------: | -------: | -------: | -------: | -------: | -------: | -------: | -------: |
| **Time** | 0.000141 | 0.000453 | 0.001598 | 0.006742 | 0.031256 | 0.170632 | 1.082894 | 10.35029 |
| ϵ2 | 7.75e-15 | 8.29e-15 | 1.08e-14 | 1.77e-14 | 2.93e-14 | 5.12e-14 | 1.09e-13 | 1.87e-13 |
| ϵ | 1.47e-14 | 2.55e-14 | 8.01e-14 | 2.59e-13 | 7.36e-13 | 2.45e-12 | 6.74e-12 | 1.55e-11 |

## References

[1] J. Molina and R. M. Slevinsky. A rapid and well-conditioned algorithm for the Helmholtz–Hodge decomposition of vector fields on the sphere, arXiv:1809.04555, 2018.

[2] S. Olver, R. M. Slevinsky, and A. Townsend. Fast algorithms using orthogonal polynomials, *Acta Numerica*, **29**:573—699, 2020.

[3] R. M. Slevinsky. Fast and backward stable transforms between spherical harmonic expansions and bivariate Fourier series, *Appl. Comput. Harmon. Anal.*, **47**:585—606, 2019.

[4] R. M. Slevinsky, Conquering the pre-computation in two-dimensional harmonic polynomial transforms, arXiv:1711.07866, 2017.