Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/bobmcdear/neural-network-cuda
Neural network from scratch in CUDA/C++
- Host: GitHub
- URL: https://github.com/bobmcdear/neural-network-cuda
- Owner: BobMcDear
- License: gpl-3.0
- Created: 2021-05-04T22:13:20.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2025-01-15T16:44:04.000Z (15 days ago)
- Last Synced: 2025-01-15T19:03:44.210Z (15 days ago)
- Topics: cplusplus, cuda, deep-learning, machine-learning, neural-network
- Language: Cuda
- Homepage:
- Size: 23.5 MB
- Stars: 71
- Watchers: 3
- Forks: 15
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# Neural network in CUDA/C++
• [Description](#description)
• [Usage](#usage)

## Description

This is an implementation of a neural net, completely from scratch, in CUDA/C++. A technical report with a more comprehensive overview of this project can be found [here](https://github.com/BobMcDear/neural-network-cuda/blob/main/Neural%20Network%20in%20CUDA.pdf).

## Usage
The code is by no means efficient and is meant only as an introduction to CUDA.

Everything is implemented in both pure C++ (under ```CPU/```) and CUDA/C++ (under ```GPU/```). The syntax remains virtually identical, and there are only two points to bear in mind when switching between C++ and CUDA/C++:

1. C++ and CUDA/C++ modules end with the suffixes ```CPU``` and ```GPU``` respectively.
2. Don't forget to allocate and destroy CUDA arrays via ```cudaMallocManaged``` and ```cudaFree```.

Here is an overview of the various classes and functions:

* ```linear.h/Linear_SUFFIX```:
    * Initialization:
        Required arguments:
        * ```_bs``` (```int```): Batch size.
        * ```_n_in``` (```int```): Number of input features.
        * ```_n_out``` (```int```): Number of output features.

        Optional arguments:
        * ```_lr``` (```float```): Learning rate.
    * ```forward```: Runs a linear forward pass.
        Required arguments:
        * ```_inp``` (```float*```): Pointer to the input data.
        * ```_out``` (```float*```): Pointer for storing the output data.
    * ```update```: Updates the weights and biases.
    * ```backward```: Performs a backward pass, storing the gradients in ```_inp```.
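As an illustration of the buffer layout implied by ```_bs```, ```_n_in```, and ```_n_out```, here is a minimal sketch of a linear forward pass over flat ```float*``` buffers. This is a hypothetical standalone function for exposition; the repository's actual ```Linear_SUFFIX``` implementation may differ.

```cpp
// Hypothetical sketch of a linear layer's forward pass over flat buffers.
// Layout assumptions: inp is bs x n_in (row-major), w is n_in x n_out,
// out is bs x n_out, i.e. out[b * n_out + o] = bias[o] + sum_i inp[b * n_in + i] * w[i * n_out + o].
void linear_forward(const float* inp, const float* w, const float* bias,
                    float* out, int bs, int n_in, int n_out) {
    for (int b = 0; b < bs; ++b) {
        for (int o = 0; o < n_out; ++o) {
            float acc = bias[o];
            for (int i = 0; i < n_in; ++i)
                acc += inp[b * n_in + i] * w[i * n_out + o];
            out[b * n_out + o] = acc;
        }
    }
}
```

The same triple loop maps naturally onto a CUDA kernel by assigning each (batch, output) pair to a thread.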
* ```relu.h/ReLU_SUFFIX```:
    * Initialization:
        Required argument:
        * ```_sz_out``` (```int```): The number of input/output elements.
    * ```forward```, ```backward```: Like ```Linear_SUFFIX``` but for ReLU.
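The ReLU interface above can be sketched in plain C++ as follows. These are hypothetical free functions, mirroring the repository's convention of storing gradients in ```_inp```.

```cpp
// Hypothetical sketch of ReLU over sz_out elements.
void relu_forward(const float* inp, float* out, int sz_out) {
    for (int i = 0; i < sz_out; ++i)
        out[i] = inp[i] > 0.f ? inp[i] : 0.f;
}

// Backward: the gradient flows only where the input was positive;
// the result overwrites inp, matching the modules described above.
void relu_backward(float* inp, const float* grad_out, int sz_out) {
    for (int i = 0; i < sz_out; ++i)
        inp[i] = inp[i] > 0.f ? grad_out[i] : 0.f;
}
```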
* ```mse.h/MSE_SUFFIX```:
    * Initialization: Like ReLU.
    * ```forward```: Dummy method for compatibility with the other modules and for performing backpropagation; it does not actually calculate the loss.
        Required arguments:
        * ```_inp``` (```float*```): Pointer to the predictions.
        * ```_out``` (```float*```): Pointer to the target values.
    * ```_forward```: Calculates the MSE. This method is solely for calculating the loss and cannot be used during backpropagation.
        Required arguments: Like ```forward```, but ```_out``` must have an extra element for storing the loss.
    * ```backward```: Performs a backward pass, storing the gradients in ```_inp```.
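To make the loss/gradient split concrete, here is a hedged sketch of the two computations as hypothetical free functions; the extra-element convention for the loss follows the description above.

```cpp
// Hypothetical sketch: mean squared error over n predictions.
// inp holds the predictions; out holds the targets, with out[n]
// reserved as the extra element that receives the loss.
void mse_loss(const float* inp, float* out, int n) {
    float sum = 0.f;
    for (int i = 0; i < n; ++i) {
        float d = inp[i] - out[i];
        sum += d * d;
    }
    out[n] = sum / n; // extra element stores the loss
}

// Gradient of the mean squared error w.r.t. the predictions,
// stored in inp as the modules above do.
void mse_backward(float* inp, const float* out, int n) {
    for (int i = 0; i < n; ++i)
        inp[i] = 2.f * (inp[i] - out[i]) / n;
}
```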
* ```sequential.h/Sequential_SUFFIX```:
    * Initialization:
        Required arguments:
        * ```layers``` (```std::vector```): Layers to be chained together.
    * ```forward```: Cascades the modules in ```layers```.
        Required arguments:
        * ```inp``` (```float*```): Pointer to the input data.
        * ```out``` (```float*```): Dummy argument, present only for compatibility with the other forward methods; it is not used. The output is accessible via the last layer's ```out``` attribute.
    * ```update```: Updates every module in ```layers```.
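A minimal sketch of the cascading behaviour, using hypothetical types; the actual ```Sequential_SUFFIX``` holds the repository's module classes rather than ```std::function```s.

```cpp
#include <functional>
#include <vector>

// Hypothetical layer: a forward callable plus an output buffer.
// As described above, the network's output lives in the last
// layer's out buffer after a forward pass.
struct Layer {
    std::function<void(const float*, float*)> forward;
    std::vector<float> out;
};

// Cascade: each layer's output buffer feeds the next layer's input.
void sequential_forward(std::vector<Layer>& layers, const float* inp) {
    const float* cur = inp;
    for (auto& l : layers) {
        l.forward(cur, l.out.data());
        cur = l.out.data();
    }
}
```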
* ```train_SUFFIX```: Trains a network with gradient descent.
    Required arguments:
    * ```seq``` (```Sequential_SUFFIX```): Sequential module to train.
    * ```inp``` (```float*```): Pointer to the input data.
    * ```targ``` (```float*```): Pointer to the target data.
    * ```bs``` (```int```): Batch size.
    * ```n_in``` (```int```): Number of input features.
    * ```n_epochs``` (```int```): Number of epochs.

For end-to-end training with speed benchmarks, please run ```main.cpp``` or ```main.cu``` for the CPU and GPU respectively.
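The forward/backward/update cycle that ```train_SUFFIX``` repeats for ```n_epochs``` can be condensed into a self-contained sketch. This hypothetical function fits a single weight to ```targ = 2 * inp``` and stands in for the full network; it is illustrative only, not the repository's training code.

```cpp
// Hypothetical sketch of one-weight gradient descent, mirroring the
// loop structure of train_SUFFIX: forward pass, MSE backward, update.
float train_one_weight(const float* inp, const float* targ,
                       int bs, int n_epochs, float lr) {
    float w = 0.f;
    for (int e = 0; e < n_epochs; ++e) {
        float grad_w = 0.f;
        for (int b = 0; b < bs; ++b) {
            float pred = w * inp[b];                   // forward
            float dpred = 2.f * (pred - targ[b]) / bs; // MSE backward
            grad_w += dpred * inp[b];                  // chain rule into w
        }
        w -= lr * grad_w;                              // update
    }
    return w;
}
```

With targets generated by ```targ = 2 * inp```, the returned weight converges toward 2.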