# Tiny-NN — Fully Connected Neural Networks in C++20 + CUDA 12.8

Tiny-NN is a high-performance implementation of fully connected neural networks supporting both CPU and GPU execution. It's designed for easy experimentation and benchmarking, featuring:
- CPU execution (parallelized)
- CUDA execution with memory reuse (weights and biases uploaded only once per layer)
- Training with backpropagation and SGD
- Model serialization using [`json.hpp`](https://github.com/nlohmann/json) (MIT licensed), included in the repository
- Simple MNIST dataset integration and ASCII preview

## Requirements
- C++20-compatible compiler
- CUDA 12.8 (for GPU support)
- CMake >= 3.24
- Python 3.12 (optional, for dataset download and preview)

> Although development was done on Windows 10/11 using Visual Studio 2022, the project can be built on any OS with a compatible C++20 compiler and CUDA installation.

## Setup

1. Clone the repository:
```bash
git clone https://github.com/Vicen-te/tiny-nn.git
cd tiny-nn
```

2. Download the MNIST dataset using the Python script (recommended):
```bash
python scripts/download_mnist.py
```
- This downloads and saves the MNIST dataset in `data/mnist/`.
- Alternatively, you can download the dataset manually from [Kaggle](https://www.kaggle.com/datasets/hojjatk/mnist-dataset).

3. Optional: generate a small model using Python (arguments: `input layer`, `hidden layer`, `output layer`):
```bash
python data/generate_model.py 128 64 10
```

4. Optional: preview MNIST digits:
- Python: `python scripts/preview.py`
- C++: the `ascii_preview()` function in MNISTLoader

## Build
### Visual Studio
- Open Visual Studio -> File -> Open -> Folder... and select the project folder.
- Visual Studio will detect CMake. For GPU usage, choose the x64 configuration.
- Build -> Build All.

### PowerShell / Developer Command Prompt (recommended)
#### Option 1: Specify all options manually
```powershell
mkdir build
cd build
cmake .. -G "Visual Studio 17 2022" -A x64 -DCUDA_TOOLKIT_ROOT_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.8"
cmake --build . --config Release
```

- `-G "Visual Studio 17 2022"` selects Visual Studio 2022
- `-A x64` selects the 64-bit architecture (recommended for CUDA)
- `-DCUDA_TOOLKIT_ROOT_DIR` is optional; CMake can auto-detect CUDA
> Note: The `-A x64` option is recommended if you want to use CUDA on Windows. On Linux or macOS it is not necessary.

#### Option 2: Let CMake detect everything automatically (recommended)
```powershell
cmake -B build -S .
cmake --build build --config Release
```

- CMake will detect Visual Studio and CUDA if they are installed in standard locations
- `-S` is the source folder, `-B` is the build folder
> Both methods produce the same result. Use Option 2 for simplicity and fewer manual settings.

## Run

From the `build/Release` folder:
```powershell
tiny-nn.exe <mode>
```

Modes:

- `train` or `t` → train the model
- `inference` or `i` → run inference on a sample
- `benchmark` or `b` → compare CPU vs CUDA performance

### Expected output

#### Training (`train` / `t`)

- Training progress printed to the console
- Training duration in seconds
- Model JSON saved to `./data/models/fc_digit_classification.json`
- ASCII MNIST preview of a single sample image

#### Inference (`inference` / `i`)

- Output values for the selected sample
- Maximum value and its index
- ASCII preview of the sample

#### Benchmark (`benchmark` / `b`)

- CPU vs GPU inference correctness check
- Average inference timings per method
- CSV results saved to `./data/results/bench.csv`

> Currently, the benchmark measures only inference, not training. Measuring training performance would require additional implementation.

## Notes & Improvements

- Currently, weights `W` and biases `b` are uploaded to the GPU **once per layer**. The input vector is uploaded for each inference.
- cuBLAS GEMM is already used for matrix multiplications, replacing the simple custom FC kernel.
- Intermediate GPU buffers (`dX`/`dY`) are allocated per layer and batch and are **not fully reused**, though CUDA streams enable asynchronous execution.
- Possible future performance improvements:
  - Reuse intermediate GPU buffers across layers and batches via CUDA streams.
  - Implement more efficient batching and overlap data transfers with computation.
- Profiling can be done with Nsight Systems / Nsight Compute.
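The buffer-reuse improvement suggested above can be sketched as follows: instead of allocating `dX`/`dY` per layer, keep two scratch buffers sized to the widest layer and ping-pong between them across layers on one stream. This is an untested illustration of the idea, not Tiny-NN's actual API; all names (`fc_forward`, `infer`) are hypothetical, and error checking is omitted for brevity:

```cuda
// Sketch: reuse two device buffers across all layers of one inference.
#include <cuda_runtime.h>
#include <algorithm>
#include <vector>

__global__ void fc_forward(const float* w, const float* b, const float* x,
                           float* y, int in, int out) {
    int o = blockIdx.x * blockDim.x + threadIdx.x;
    if (o >= out) return;
    float acc = b[o];
    for (int i = 0; i < in; ++i) acc += w[o * in + i] * x[i];
    y[o] = acc;
}

void infer(const std::vector<const float*>& dW,  // per-layer weights (device)
           const std::vector<const float*>& dB,  // per-layer biases (device)
           const std::vector<int>& widths,       // layer widths, input first
           const float* hostInput, float* hostOutput, cudaStream_t stream) {
    int maxW = *std::max_element(widths.begin(), widths.end());
    float *bufA, *bufB;  // two scratch buffers, reused by every layer
    cudaMalloc(&bufA, maxW * sizeof(float));
    cudaMalloc(&bufB, maxW * sizeof(float));
    cudaMemcpyAsync(bufA, hostInput, widths[0] * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
    float *src = bufA, *dst = bufB;
    for (size_t l = 0; l + 1 < widths.size(); ++l) {
        int in = widths[l], out = widths[l + 1];
        fc_forward<<<(out + 255) / 256, 256, 0, stream>>>(dW[l], dB[l], src,
                                                          dst, in, out);
        std::swap(src, dst);  // this layer's output feeds the next layer
    }
    cudaMemcpyAsync(hostOutput, src, widths.back() * sizeof(float),
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);
    cudaFree(bufA);
    cudaFree(bufB);
}
```

Because all work is enqueued on one stream, the copies and kernels stay ordered without host synchronization until the final `cudaStreamSynchronize`; batching over samples would additionally let transfers for one batch overlap with compute for another on a second stream.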