https://github.com/nekon69/fastnoiselitecuda

A wrapper around C++ FastNoiseLite library for CUDA

Host: GitHub
URL: https://github.com/nekon69/fastnoiselitecuda
Owner: NeKon69
License: mit
Created: 2025-09-11T18:12:35.000Z (27 days ago)
Default Branch: main
Last Pushed: 2025-09-26T15:36:54.000Z (12 days ago)
Last Synced: 2025-09-26T17:37:56.781Z (12 days ago)
Topics: cellular-noise, computer-graphics, cpp, cuda, fastnoiselite, gamedev, generative-art, gpgpu, gpu, header-only, noise, opensimplex2-noise, pcg, perlin-noise, procedural-generation, simplex-noise, terrain-generation, texture-generation, worley-noise
Language: C++
Homepage:
Size: 82 KB
Stars: 2
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Notice: NOTICE.md

# FastNoiseLiteCUDA

This is a CUDA-compatible wrapper for Auburn's popular [FastNoiseLite](https://github.com/Auburn/FastNoiseLite) library. It allows for the high-performance generation of various noise types directly on the GPU within CUDA kernels.

This port is designed to be a drop-in replacement for the original single-header file, enabling its use in both host (`__host__`) and device (`__device__`) code.

---

## *A Critical Note on Host-Side Usage*

The library's behavior changes depending on whether you are compiling a `.cu` file with NVCC or a standard `.cpp` file with a host compiler (like GCC/Clang/MSVC). This is crucial for understanding where you can safely call the `GetNoise` functions.

The reason for this is the `NOISE_CONSTANT` macro, which places large lookup tables into `__constant__` GPU memory when processed by NVCC, but compiles them as regular CPU memory otherwise.

*   **In `.cu` files (compiled with NVCC):**

    Any function that uses the lookup tables (like `GetNoise`) is prepared by NVCC for device execution. If you call such a function from host code within a `.cu` file, the CPU would have to read lookup tables that live in `__constant__` memory on the GPU. Host code cannot access that memory directly, so the call **will not work**.

    *   **Rule:** Inside `.cu` files, only configure `FastNoiseLite` on the host. Noise generation must happen inside a `__global__` kernel.

*   **In `.cpp` files (or other non-NVCC compiled sources):**

    When you include `FastNoiseLiteCUDA.h` in a standard `.cpp` file, `NOISE_CONSTANT` is empty. The lookup tables are just standard global arrays in CPU memory.

    *   **Rule:** Inside `.cpp` files, you can safely use the **entire** `FastNoiseLite` object on the host, including calling `GetNoise`. This allows for CPU-based noise generation for testing or other logic.
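
The compile-time switch behind these two rules can be sketched as follows. This is a minimal illustration, assuming the macro is keyed on NVCC's predefined `__CUDACC__` symbol; the actual definition in `FastNoiseLiteCUDA.h` may differ. The `sketch` namespace, `kGradients` table, and `lookup_gradient` function are hypothetical stand-ins, not part of the library.

```cpp
#include <cassert>
#include <cstddef>

// Assumed shape of the switch; the real header may define it differently.
#ifdef __CUDACC__
  #define NOISE_CONSTANT __constant__  // NVCC: table is placed in GPU constant memory
#else
  #define NOISE_CONSTANT               // host compiler: macro expands to nothing
#endif

namespace sketch {

// Stand-in table with made-up values, not the library's gradient data.
NOISE_CONSTANT const float kGradients[4] = {1.0f, -1.0f, 0.5f, -0.5f};

// In a .cpp file this reads an ordinary CPU array.
// In a .cu file the same read from host code would target GPU constant
// memory, which is the situation the rules above tell you to avoid.
inline float lookup_gradient(std::size_t i) {
    assert(i < 4);
    return kGradients[i];
}

}  // namespace sketch
```

Compiled as plain C++, `sketch::lookup_gradient(0)` returns `1.0f` because `NOISE_CONSTANT` disappears and the table is a normal global array.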

**Recommendation:**

For consistency and to avoid mistakes, the best practice is to configure your `FastNoiseLite` instance on the CPU and then pass it by value to your kernels for noise generation, as shown in *Example 2*. Use host-side `GetNoise` calls from `.cpp` files only when you have a specific need for them.

---

## Key Modifications for CUDA Compatibility

To make the library compatible with CUDA, several key changes were made to the original source code:

*   **CUDA Function Specifiers**: All class methods and helper functions are now decorated with `__device__ __host__` (via the `NOISE_DH` macro). This allows them to be called from both CPU (host) and GPU (device) code seamlessly.

*   **Lookup Table Refactoring**: The original `Lookup` struct, which was a nested static member of the class, caused compilation issues with NVCC. The compiler struggles with the definition of static device-side members. To resolve this:

    *   The large lookup arrays (for gradients and random vectors) have been moved into a global `detail` namespace.

    *   These arrays are declared in `__constant__` memory using the `NOISE_CONSTANT` macro. Constant memory is a cached, read-only memory space on the GPU, making it highly efficient for data that is accessed uniformly by all threads in a warp.

    *   A new `FastNoise` namespace now contains the `Lookup` struct, which safely references these constant memory arrays. This restructuring resolves compilation errors while improving performance on the GPU.
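
The restructuring described above might look roughly like the sketch below. The macro names and the `detail` and `FastNoise` namespaces come from this README; the macro definitions, the table contents, the `Gradients2D` name, and the `GradX`/`GradY` accessors are assumptions made for illustration and are not taken from the header. The macros are defined inline so the snippet compiles on its own with a host compiler.

```cpp
#include <cassert>

// Assumed definitions; the real header may differ.
#ifdef __CUDACC__
  #define NOISE_DH __device__ __host__
  #define NOISE_CONSTANT __constant__
#else
  #define NOISE_DH
  #define NOISE_CONSTANT
#endif

namespace detail {
// Placeholder data: four 2D unit vectors stored as x,y pairs.
// The library's real tables are much larger.
NOISE_CONSTANT const float Gradients2D[8] = {
    1.0f, 0.0f,   -1.0f, 0.0f,
    0.0f, 1.0f,    0.0f, -1.0f
};
}  // namespace detail

namespace FastNoise {
// Holds no data itself; it only reads the namespace-scope arrays, so no
// static device-side class member has to be defined.
struct Lookup {
    // Hypothetical accessor: x component of gradient `index` (0..3).
    NOISE_DH static float GradX(int index) {
        assert(index >= 0 && index < 4);
        return detail::Gradients2D[index * 2];
    }
    // Hypothetical accessor: y component of gradient `index` (0..3).
    NOISE_DH static float GradY(int index) {
        assert(index >= 0 && index < 4);
        return detail::Gradients2D[index * 2 + 1];
    }
};
}  // namespace FastNoise
```

The design point is that the arrays are free variables at namespace scope rather than static class members, which is the arrangement the list above says resolved the NVCC compilation errors.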

## Usage

Include the `FastNoiseLiteCUDA.h` header in your `.cu` file. You can then instantiate and use the `FastNoiseLite` object directly inside your CUDA kernels.

### Example 1: Creating Noise Object Inside Kernel

Here is a simple example of a kernel that generates 2D OpenSimplex2 noise for a grid by creating the noise object on the device.

```cuda
#include "FastNoiseLiteCUDA.h"

__global__ void generate_noise_kernel(float* output, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;

    if (x >= width || y >= height)
    {
        return;
    }

    // Create a FastNoiseLite instance on the stack for each thread
    FastNoiseLite noise(1337); // Seed
    noise.SetNoiseType(FastNoiseLite::NoiseType_OpenSimplex2);
    noise.SetFrequency(0.05f);

    // Calculate noise value
    float noiseValue = noise.GetNoise((float)x, (float)y);

    // Write the result to the output array
    output[y * width + x] = noiseValue;
}
```

### Example 2: Configuring on Host, Passing to Kernel

A more common pattern is to configure the noise generator on the host and pass it by value to the kernel.

```cuda
#include "FastNoiseLiteCUDA.h"
#include <cuda_runtime.h> // cudaMalloc, cudaFree

// Kernel accepts a configured FastNoiseLite object
__global__ void generate_noise_from_host_config(float* output, int width, int height, FastNoiseLite noise)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;

    if (x >= width || y >= height)
    {
        return;
    }

    // Use the pre-configured noise object passed from the host
    float noiseValue = noise.GetNoise((float)x, (float)y);
    output[y * width + x] = noiseValue;
}

int main()
{
    int width = 1024;
    int height = 1024;

    // Widen before multiplying so large images cannot overflow int
    size_t bufferSize = static_cast<size_t>(width) * height * sizeof(float);

    float* d_output;
    cudaMalloc(&d_output, bufferSize);

    // 1. Configure FastNoiseLite on the host
    FastNoiseLite host_noise_generator;
    host_noise_generator.SetNoiseType(FastNoiseLite::NoiseType_Perlin);
    host_noise_generator.SetFrequency(0.02f);
    host_noise_generator.SetFractalType(FastNoiseLite::FractalType_FBm);
    host_noise_generator.SetFractalOctaves(5);

    // Round the grid size up so every pixel is covered by a thread
    dim3 threadsPerBlock(16, 16);
    dim3 numBlocks((width + threadsPerBlock.x - 1) / threadsPerBlock.x, (height + threadsPerBlock.y - 1) / threadsPerBlock.y);

    // 2. Pass the configured object by value to the kernel
    generate_noise_from_host_config<<<numBlocks, threadsPerBlock>>>(d_output, width, height, host_noise_generator);

    // ... copy data back to host and process ...

    cudaFree(d_output);
    return 0;
}
```
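
The grid-size expression in `main` is an integer ceiling division, and it is the reason both kernels start with a bounds check. The host-only sketch below works through the arithmetic; `blocks_for` and `surplus_threads` are hypothetical helper names introduced here, not part of the library or of CUDA.

```cpp
#include <cassert>

// Number of blocks needed to cover `size` elements along one axis when each
// block holds `block` threads (integer ceiling division).
inline unsigned int blocks_for(unsigned int size, unsigned int block) {
    assert(block > 0);
    return (size + block - 1) / block;
}

// Threads launched past the edge of the data along one axis.
// These are the threads the kernel's bounds check sends straight to `return`.
inline unsigned int surplus_threads(unsigned int size, unsigned int block) {
    return blocks_for(size, block) * block - size;
}

// blocks_for(1024, 16) == 64   exact fit, no surplus threads
// blocks_for(1000, 16) == 63   62.5 rounded up; 63 * 16 = 1008 threads
// surplus_threads(1000, 16) == 8
```

With the 1024 x 1024 image from the example the grid fits exactly, but for any dimension that is not a multiple of 16 the last block along that axis contains threads with no pixel to write, so the `x >= width || y >= height` guard is needed to prevent out-of-bounds writes.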

## Original Library

This project is a wrapper and is entirely based on the fantastic work by Jordan Peck (Auburn). All noise generation algorithms and logic belong to the original author.

For more in-depth documentation on the noise algorithms, features, and settings, please refer to the [official repository](https://github.com/Auburn/FastNoiseLite).

## License

This wrapper is distributed under the MIT License, consistent with the original FastNoiseLite library. See the [LICENSE](LICENSE) file for details.
