Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/bobmcdear/neural-network-cuda
Neural network from scratch in CUDA/C++
- Host: GitHub
- URL: https://github.com/bobmcdear/neural-network-cuda
- Owner: BobMcDear
- License: gpl-3.0
- Created: 2021-05-04T22:13:20.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2025-01-15T16:44:04.000Z (15 days ago)
- Last Synced: 2025-01-15T19:03:44.210Z (15 days ago)
- Topics: cplusplus, cuda, deep-learning, machine-learning, neural-network
- Language: Cuda
- Homepage:
- Size: 23.5 MB
- Stars: 71
- Watchers: 3
- Forks: 15
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# Neural network in CUDA/C++
• [Description](#description)
• [Usage](#usage)

## Description

This is an implementation of a neural net, completely from scratch, in CUDA/C++. A technical report with a more comprehensive overview of this project can be found [here](https://github.com/BobMcDear/neural-network-cuda/blob/main/Neural%20Network%20in%20CUDA.pdf).

## Usage
The code is by no means efficient and is meant only as an introduction to CUDA.

Everything is implemented in both pure C++ (under ```CPU/```) and CUDA/C++ (under ```GPU/```). The syntax remains virtually identical, and there are only two points to bear in mind when switching between C++ and CUDA/C++:

1. C++ and CUDA/C++ modules end with the suffixes ```CPU``` and ```GPU``` respectively.
2. Don't forget to allocate and destroy CUDA arrays via ```cudaMallocManaged``` and ```cudaFree```.

Here is an overview of the various classes and functions:

* ```linear.h/Linear_SUFFIX```:
    * Initialization:
        Required arguments:
        * ```_bs``` (```int```): Batch size.
        * ```_n_in``` (```int```): Number of input features.
        * ```_n_out``` (```int```): Number of output features.

        Optional arguments:
        * ```_lr``` (```float```): Learning rate.
    * ```forward```: Runs a linear forward pass.
        Required arguments:
        * ```_inp``` (```float*```): Pointer to the input data.
        * ```_out``` (```float*```): Pointer for storing the output data.
    * ```update```: Updates the weights and biases.
    * ```backward```: Performs a backward pass, storing the gradients in ```_inp```.
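As an illustration of the buffer layout implied by ```_bs```, ```_n_in```, and ```_n_out```, here is a minimal sketch of a linear forward pass over flat ```float*``` buffers. This is a hypothetical standalone function for exposition; the repository's actual ```Linear_SUFFIX``` implementation may differ.

```cpp
// Hypothetical sketch of a linear layer's forward pass over flat buffers.
// Layout assumptions: inp is bs x n_in (row-major), w is n_in x n_out,
// out is bs x n_out, i.e. out[b * n_out + o] = bias[o] + sum_i inp[b * n_in + i] * w[i * n_out + o].
void linear_forward(const float* inp, const float* w, const float* bias,
                    float* out, int bs, int n_in, int n_out) {
    for (int b = 0; b < bs; ++b) {
        for (int o = 0; o < n_out; ++o) {
            float acc = bias[o];
            for (int i = 0; i < n_in; ++i)
                acc += inp[b * n_in + i] * w[i * n_out + o];
            out[b * n_out + o] = acc;
        }
    }
}
```

The same triple loop maps naturally onto a CUDA kernel by assigning each (batch, output) pair to a thread.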
* ```relu.h/ReLU_SUFFIX```:
    * Initialization:
        Required argument:
        * ```_sz_out``` (```int```): The number of input/output elements.
    * ```forward```, ```backward```: Like ```Linear_SUFFIX``` but for ReLU.
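The ReLU interface above can be sketched in plain C++ as follows. These are hypothetical free functions, mirroring the repository's convention of storing gradients in ```_inp```.

```cpp
// Hypothetical sketch of ReLU over sz_out elements.
void relu_forward(const float* inp, float* out, int sz_out) {
    for (int i = 0; i < sz_out; ++i)
        out[i] = inp[i] > 0.f ? inp[i] : 0.f;
}

// Backward: the gradient flows only where the input was positive;
// the result overwrites inp, matching the modules described above.
void relu_backward(float* inp, const float* grad_out, int sz_out) {
    for (int i = 0; i < sz_out; ++i)
        inp[i] = inp[i] > 0.f ? grad_out[i] : 0.f;
}
```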
* ```mse.h/MSE_SUFFIX```:
    * Initialization: Like ReLU.
    * ```forward```: Dummy method for compatibility with the other modules and for performing backpropagation; it does not actually calculate the loss.
        Required arguments:
        * ```_inp``` (```float*```): Pointer to the predictions.
        * ```_out``` (```float*```): Pointer to the target values.
    * ```_forward```: Calculates the MSE. This method is solely for calculating the loss and cannot be used during backpropagation.
        Required arguments: Like ```forward```, but ```_out``` must have an extra element for storing the loss.
    * ```backward```: Performs a backward pass, storing the gradients in ```_inp```.
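To make the loss/gradient split concrete, here is a hedged sketch of the two computations as hypothetical free functions; the extra-element convention for the loss follows the description above.

```cpp
// Hypothetical sketch: mean squared error over n predictions.
// inp holds the predictions; out holds the targets, with out[n]
// reserved as the extra element that receives the loss.
void mse_loss(const float* inp, float* out, int n) {
    float sum = 0.f;
    for (int i = 0; i < n; ++i) {
        float d = inp[i] - out[i];
        sum += d * d;
    }
    out[n] = sum / n; // extra element stores the loss
}

// Gradient of the mean squared error w.r.t. the predictions,
// stored in inp as the modules above do.
void mse_backward(float* inp, const float* out, int n) {
    for (int i = 0; i < n; ++i)
        inp[i] = 2.f * (inp[i] - out[i]) / n;
}
```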
* ```sequential.h/Sequential_SUFFIX```:
    * Initialization:
        Required arguments:
        * ```layers``` (```std::vector```): Layers to be chained together.
    * ```forward```: Cascades the modules in ```layers```.
        Required arguments:
        * ```inp``` (```float*```): Pointer to the input data.
        * ```out``` (```float*```): Dummy argument, present only for compatibility with the other forward methods; it is not used. The output is accessible via the last layer's ```out``` attribute.
    * ```update```: Updates every module in ```layers```.
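A minimal sketch of the cascading behaviour, using hypothetical types; the actual ```Sequential_SUFFIX``` holds the repository's module classes rather than ```std::function```s.

```cpp
#include <functional>
#include <vector>

// Hypothetical layer: a forward callable plus an output buffer.
// As described above, the network's output lives in the last
// layer's out buffer after a forward pass.
struct Layer {
    std::function<void(const float*, float*)> forward;
    std::vector<float> out;
};

// Cascade: each layer's output buffer feeds the next layer's input.
void sequential_forward(std::vector<Layer>& layers, const float* inp) {
    const float* cur = inp;
    for (auto& l : layers) {
        l.forward(cur, l.out.data());
        cur = l.out.data();
    }
}
```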
* ```train_SUFFIX```: Trains a network with gradient descent.
    Required arguments:
    * ```seq``` (```Sequential_SUFFIX```): Sequential module to train.
    * ```inp``` (```float*```): Pointer to the input data.
    * ```targ``` (```float*```): Pointer to the target data.
    * ```bs``` (```int```): Batch size.
    * ```n_in``` (```int```): Number of input features.
    * ```n_epochs``` (```int```): Number of epochs.

For end-to-end training with speed benchmarks, please run ```main.cpp``` or ```main.cu``` for the CPU and GPU respectively.
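The forward/backward/update cycle that ```train_SUFFIX``` repeats for ```n_epochs``` can be condensed into a self-contained sketch. This hypothetical function fits a single weight to ```targ = 2 * inp``` and stands in for the full network; it is illustrative only, not the repository's training code.

```cpp
// Hypothetical sketch of one-weight gradient descent, mirroring the
// loop structure of train_SUFFIX: forward pass, MSE backward, update.
float train_one_weight(const float* inp, const float* targ,
                       int bs, int n_epochs, float lr) {
    float w = 0.f;
    for (int e = 0; e < n_epochs; ++e) {
        float grad_w = 0.f;
        for (int b = 0; b < bs; ++b) {
            float pred = w * inp[b];                   // forward
            float dpred = 2.f * (pred - targ[b]) / bs; // MSE backward
            grad_w += dpred * inp[b];                  // chain rule into w
        }
        w -= lr * grad_w;                              // update
    }
    return w;
}
```

With targets generated by ```targ = 2 * inp```, the returned weight converges toward 2.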