{"id":21831061,"url":"https://github.com/wiktor2718/matrix_flow","last_synced_at":"2025-03-21T13:21:09.927Z","repository":{"id":264588112,"uuid":"864573740","full_name":"Wiktor2718/matrix_flow","owner":"Wiktor2718","description":"Matrix Flow is a simple machine learning library written in Rust and CUDA. It was created as a portfolio project to deepen my understanding of machine learning, GPU programming, and Rust. It provides an API for matrix manipulation and includes specially optimized neural networks.","archived":false,"fork":false,"pushed_at":"2024-11-05T23:54:18.000Z","size":47,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-01-26T09:15:04.101Z","etag":null,"topics":["adam-optimizer","benchmarking","cuda","deep-learning","gpu-computing","machine-learning","matrix-operations","neural-networks","portfolio-project","rust"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Wiktor2718.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-28T15:26:17.000Z","updated_at":"2024-11-06T00:42:32.000Z","dependencies_parsed_at":"2024-11-25T08:20:15.532Z","dependency_job_id":"04539ba6-35b2-4e70-a1c7-d2162371b19c","html_url":"https://github.com/Wiktor2718/matrix_flow","commit_stats":null,"previous_names":["wiktor2718/matrix_flow"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wiktor2718%2Fmatrix_flow","tags_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wiktor2718%2Fmatrix_flow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wiktor2718%2Fmatrix_flow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wiktor2718%2Fmatrix_flow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Wiktor2718","download_url":"https://codeload.github.com/Wiktor2718/matrix_flow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244803364,"owners_count":20512897,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adam-optimizer","benchmarking","cuda","deep-learning","gpu-computing","machine-learning","matrix-operations","neural-networks","portfolio-project","rust"],"created_at":"2024-11-27T19:08:16.560Z","updated_at":"2025-03-21T13:21:09.905Z","avatar_url":"https://github.com/Wiktor2718.png","language":"Rust","readme":"# Matrix Flow\nMatrix Flow is a simple machine learning library written in Rust and CUDA.\nIt was created as a portfolio project to deepen my understanding of machine\nlearning, GPU programming, and Rust.\nIt provides an API for matrix manipulation and includes specially optimized neural networks.\n\n## Features\n- **GPU-accelerated computation** using CUDA\n- **Multi-layer perceptron (MLP)** with customizable layers and activation functions\n- **Adam optimizer** for efficient gradient-based learning\n- Supports **batch training** for improved efficiency\n- **NVTX benchmarking** for performance profiling\n\n## Prerequisites\n### 
Install Rust\n- **Rust**: Install Rust from the [official website](https://www.rust-lang.org/).\n\n### Install CUDA Toolkit\n- **CUDA Toolkit**: Make sure you have CUDA installed on your system. The default library paths\n    are set up for Linux, but you may need to adjust these for your specific environment.\n\n### Install Sample Data Sets\n- **CSV Data**: The example provided expects labeled datasets in CSV format for both training and testing.\n\n## Running The Sample MLP\n1. Make sure that the GPU architecture in `build.rs` is correct\n    ```rs\n    let cuda_arch = \"sm_86\"; // Adjust the architecture as needed\n    ```\n1. If you have a non-standard path to CUDA libraries, modify this line  \n    ```rs\n    let cuda_lib_path = \"/usr/local/cuda/lib64\"; // Adjust this path as necessary\n    ```\n1. Run `cargo run`\n\n## Memory Optimizations\nThis library avoids additional memory allocations by reusing the memory of the\nowned operand. For instance:\n- `A + \u0026B` reuses the memory of `A`\n- `\u0026A + B` reuses the memory of `B`\n- `\u0026A + \u0026B` allocates new memory\n- `A + B` reuses the memory of `A` and drops `B`\n\n## Example Usage (Networks)\n```rs\nuse std::{fs::File, iter::zip, path::Path, error::Error};\nuse matrix_flow::prelude::*;\n\nfn read_labeled_data\u003cP: AsRef\u003cPath\u003e\u003e(path: P, output_size: usize, batch_size: usize, max_value: ValueType) -\u003e Result\u003c(Vec\u003cMatrix\u003e, Vec\u003cMatrix\u003e), Box\u003cdyn Error\u003e\u003e {\n    let file = File::open(path)?;\n    let mut rdr = csv::Reader::from_reader(file);\n\n    let mut res_items = Vec::new();\n    let mut res_labels = Vec::new();\n\n    let mut item_buffer = Vec::new();\n    let mut label_buffer = Vec::new();\n\n    for (idx, result) in rdr.deserialize().enumerate() {\n        let (label_index, item): (usize, Vec\u003cf32\u003e) = result?;\n\n        // Vectorize label\n        let mut label = vec![0.; output_size];\n        label[label_index] = 
1.;\n\n        // Store in buffers\n        item_buffer.extend(\u0026item);\n        label_buffer.extend(\u0026label);\n        \n        // Accumulate until full\n        if (idx + 1) % batch_size != 0 {\n            continue;\n        }\n\n        // Move to GPU\n        let mut item_batch = Matrix::from(\u0026item_buffer, batch_size, item.len());\n        let label_batch = Matrix::from(\u0026label_buffer, batch_size, output_size);\n\n        // Normalize items on the GPU\n        item_batch = item_batch / max_value;\n\n        // Clear buffers\n        item_buffer.clear();\n        label_buffer.clear();\n\n        // Push batches to results\n        res_items.push(item_batch);\n        res_labels.push(label_batch);\n    }\n\n    Ok((res_labels, res_items))\n}\n\nfn main() {\n    // Parameters\n    const EPOCHS: u32 = 100;\n    const BATCH_SIZE: usize = 128;\n\n    let layers = [\n        Layer::new(28*28, 100, ActivationType::Tanh),\n        Layer::new(100, 100, ActivationType::Tanh),\n        Layer::new(100, 10, ActivationType::Linear),\n    ];\n\n    range_push(\"Data Loading\");\n    let (output_data, input_data) = read_labeled_data(\n        \"data_sets/mnist_train.csv\",\n        10,\n        BATCH_SIZE,\n        255.0\n    ).expect(\"Unable to read data\");\n\n    range_pop();\n\n    range_push(\"Adam Initialization\");\n    let optim = Optimizer::adam(layers, 0.9, 0.999, 1e-8);\n    range_pop();\n\n    range_push(\"Network Initialization\");\n    let network = MLP::new(BATCH_SIZE, 0.001, optim, layers);\n    range_pop();\n\n    range_push(\"Training\");\n    for e in 0..EPOCHS {\n        let mut error = 0.;\n        for (x, y) in zip(\u0026input_data, \u0026output_data) {\n            range_push(\"Forward Pass\");\n            let output = network.forward(x);\n            range_pop();\n\n            range_push(\"Error Calculation\");\n            error += mse(y, \u0026output);\n            range_pop();\n            \n            range_push(\"Gradient 
Calculation\");\n            let gradient = mse_prime(y, \u0026output);\n            range_pop();\n\n            range_push(\"Backward Pass\");\n            let _ = network.backward(\u0026gradient);\n            range_pop();\n        }\n        println!(\"{e}: {}\", error / input_data.len() as f32);\n    }\n    range_pop();\n\n    // Free up VRAM space\n    drop(output_data);\n    drop(input_data);\n\n    let (output_data, input_data) = read_labeled_data(\n        \"data_sets/mnist_test.csv\",\n        10,\n        128,\n        255.\n    ).expect(\"Unable to read data\");\n\n    // Test run\n    let mut error = 0.0;\n    for (x, y) in zip(\u0026input_data, \u0026output_data) {\n        let output = network.forward(x);\n        error += mse(y, \u0026output);\n    }\n    println!(\"Test loss: {}\", error / input_data.len() as f32);\n}\n```\n\n## Important Functions\n- `MLP::new(batch_size, learning_rate, optimizer, layers)`\n    Creates a MLP struct and allocates its data as a **contiguous block of memory on the GPU (device)**\n- `Optimizer::adam(layers, beta1, beta2, epsilon)` creates an `Optimizer` `enum` with the `Adam` variant\n    and allocates its data as a **contiguous block of memory on the GPU (device)**\n- `forward(batch)` performs a forward pass on a `batch`\n- `backward(gradient)` **updates the parameters** with `Optimizer` and **returns its gradient**\n- `range_push(label)` and `range_pop()` are used for **NVTX debugging**\n\n## License\nThis project is licensed under the **MIT License**","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwiktor2718%2Fmatrix_flow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwiktor2718%2Fmatrix_flow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwiktor2718%2Fmatrix_flow/lists"}