{"id":21813594,"url":"https://github.com/bobmcdear/neural-network-cuda","last_synced_at":"2025-07-28T14:38:50.233Z","repository":{"id":40254867,"uuid":"364399099","full_name":"BobMcDear/neural-network-cuda","owner":"BobMcDear","description":"Neural network from scratch in CUDA/C++","archived":false,"fork":false,"pushed_at":"2025-01-17T00:42:54.000Z","size":24619,"stargazers_count":78,"open_issues_count":0,"forks_count":16,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-31T10:02:11.139Z","etag":null,"topics":["cplusplus","cuda","deep-learning","machine-learning","neural-network"],"latest_commit_sha":null,"homepage":"","language":"Cuda","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BobMcDear.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-05-04T22:13:20.000Z","updated_at":"2025-03-21T08:49:17.000Z","dependencies_parsed_at":"2025-03-24T09:15:33.774Z","dependency_job_id":null,"html_url":"https://github.com/BobMcDear/neural-network-cuda","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BobMcDear%2Fneural-network-cuda","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BobMcDear%2Fneural-network-cuda/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BobMcDear%2Fneural-network-cuda/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BobMcDear%2Fneural-network-cuda/manifests","owner_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BobMcDear","download_url":"https://codeload.github.com/BobMcDear/neural-network-cuda/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247640465,"owners_count":20971557,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cplusplus","cuda","deep-learning","machine-learning","neural-network"],"created_at":"2024-11-27T14:30:19.986Z","updated_at":"2025-04-07T11:11:19.941Z","avatar_url":"https://github.com/BobMcDear.png","language":"Cuda","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Neural network in CUDA/C++\n\n• \u003cstrong\u003e[Description](#description)\u003c/strong\u003e\u003cbr\u003e\n• \u003cstrong\u003e[Usage](#usage)\u003c/strong\u003e\u003cbr\u003e\n\n## Description\nThis is an implementation of a neural net, completely from scratch, in CUDA/C++. A technical report with a more comprehensive overview of this project can be found [here](https://github.com/BobMcDear/neural-network-cuda/blob/main/Neural%20Network%20in%20CUDA.pdf).\n\n## Usage\nThe code is by no means efficient and is meant as an introduction to CUDA only. Here is an overview of the various classes and functions:\n\nEverything is implemented in both pure C++ (under ```CPU/```) and CUDA/C++ (under ```GPU/```). The syntax remains virtually identical, and there are only two points to bear in mind when switching between C++ and CUDA/C++:\n\n1. C++ and CUDA/C++ modules end with the suffixes ```CPU``` and ```GPU``` respectively.\n2. 
Don't forget to allocate and free CUDA arrays via ```cudaMallocManaged``` and ```cudaFree```, respectively.\n\n* ```linear.h/Linear_SUFFIX```:\n  * Initialization:\n\n    Required arguments:\n     * ```_bs``` (```int```): Batch size.\n     * ```_n_in``` (```int```): Number of input features.\n     * ```_n_out``` (```int```): Number of output features.\n\n    Optional arguments:\n     * ```_lr``` (```float```): Learning rate.\n\n  * ```forward```: Runs a linear forward pass.\n\n    Required arguments:\n     * ```_inp``` (```float*```): Pointer to the input data.\n     * ```_out``` (```float*```): Pointer for storing the output data.\n\n  * ```update```: Updates the weights and biases.\n\n  * ```backward```: Performs a backward pass, storing the gradients in ```_inp```.\n\n* ```relu.h/ReLU_SUFFIX```:\n  * Initialization:\n\n    Required argument:\n     * ```_sz_out``` (```int```): The number of input/output elements.\n\n  * ```forward```, ```backward```: Like ```Linear_SUFFIX``` but for ReLU.\n\n* ```mse.h/MSE_SUFFIX```:\n  * Initialization: Like ReLU.\n\n  * ```forward```: Dummy method for compatibility with the other modules and performing backpropagation; does not actually calculate the loss.\n\n    Required arguments:\n     * ```_inp``` (```float*```): Pointer to the predictions.\n     * ```_out``` (```float*```): Pointer to the target values.\n\n  * ```_forward```: Calculates the MSE. 
This method is solely for calculating the loss and cannot be used during backpropagation.\n\n    Required arguments: Like ```forward```, but ```_out``` must have an extra element for storing the loss.\n\n  * ```backward```: Performs a backward pass, storing the gradients in ```_inp```.\n\n* ```sequential.h/Sequential_SUFFIX```:\n\n  * Initialization:\n\n    Required arguments:\n     * ```layers``` (```std::vector\u003cModule*\u003e```): Layers to be chained together.\n\n  * ```forward```: Cascades the modules in ```layers```.\n\n    Required arguments:\n     * ```inp``` (```float*```): Pointer to the input data.\n     * ```out``` (```float*```): Dummy argument kept only for compatibility with the other forward methods; it is not used. The output is accessible via the last layer's ```out``` attribute.\n\n  * ```update```: Updates every module in ```layers```.\n\n* ```train_SUFFIX```: Trains a network with gradient descent.\n\n  Required arguments:\n   * ```seq``` (```Sequential_SUFFIX```): Sequential module to train.\n   * ```inp``` (```float*```): Pointer to the input data.\n   * ```targ``` (```float*```): Pointer to the target data.\n   * ```bs``` (```int```): Batch size.\n   * ```n_in``` (```int```): Number of input features.\n   * ```n_epochs``` (```int```): Number of epochs.\n\nFor end-to-end training with speed benchmarks, please run ```main.cpp``` or ```main.cu``` for the CPU and GPU, respectively.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbobmcdear%2Fneural-network-cuda","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbobmcdear%2Fneural-network-cuda","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbobmcdear%2Fneural-network-cuda/lists"}