https://github.com/charles-r-earp/autograph

A machine learning library for Rust.

cuda machine-learning neural-networks rust


[![LicenseBadge]][License]
[![DocsBadge]][Docs]
[![build](https://github.com/charles-r-earp/autograph/actions/workflows/ci.yml/badge.svg)](https://github.com/charles-r-earp/autograph/actions/workflows/ci.yml)

[License]: https://github.com/charles-r-earp/autograph/blob/main/LICENSE-APACHE
[LicenseBadge]: https://img.shields.io/badge/license-MIT/Apache_2.0-blue.svg
[Docs]: https://docs.rs/autograph
[DocsBadge]: https://docs.rs/autograph/badge.svg

# autograph

A machine learning library for Rust.

GPGPU kernels implemented with [krnl](https://github.com/charles-r-earp/krnl).

- Host and device execution.
- Tensors emulate [ndarray](https://github.com/rust-ndarray/ndarray).
  - Host tensors can be borrowed as arrays.
- Tensors, models, and optimizers can be serialized with [serde](https://github.com/serde-rs/serde).
  - Portable between platforms.
  - Save and resume training progress.
- Fully extensible, in Rust.

## Neural Networks

```rust
// Imports are omitted in this excerpt. `Result` is assumed to be a
// single-parameter alias such as `anyhow::Result`.

// `Layer` and `Forward` are derived for a struct composed of layers.
#[derive(Layer, Forward)]
#[autograph(forward(Variable4, Output=Variable2))]
struct LeNet5 {
    conv1: Conv2,
    relu1: Relu,
    pool1: MaxPool2,
    conv2: Conv2,
    relu2: Relu,
    pool2: MaxPool2,
    flatten: Flatten,
    dense1: Dense,
    relu3: Relu,
    dense2: Dense,
    relu4: Relu,
    dense3: Dense,
}

impl LeNet5 {
    fn new(device: Device, scalar_type: ScalarType) -> Result<Self> {
        // First convolution: 1 input channel, 6 output channels, 5x5 filter.
        let conv1 = Conv2::builder()
            .device(device.clone())
            .scalar_type(scalar_type)
            .inputs(1)
            .outputs(6)
            .filter([5, 5])
            .build()?;
        let relu1 = Relu;
        let pool1 = MaxPool2::builder().filter([2, 2]).build();
        // Second convolution: 6 input channels, 16 output channels, 5x5 filter.
        let conv2 = Conv2::builder()
            .device(device.clone())
            .scalar_type(scalar_type)
            .inputs(6)
            .outputs(16)
            .filter([5, 5])
            .build()?;
        let relu2 = Relu;
        let pool2 = MaxPool2::builder().filter([2, 2]).build();
        let flatten = Flatten;
        // Dense layers: 16 * 4 * 4 -> 128 -> 84 -> 10.
        let dense1 = Dense::builder()
            .device(device.clone())
            .scalar_type(scalar_type)
            .inputs(16 * 4 * 4)
            .outputs(128)
            .build()?;
        let relu3 = Relu;
        let dense2 = Dense::builder()
            .device(device.clone())
            .scalar_type(scalar_type)
            .inputs(128)
            .outputs(84)
            .build()?;
        let relu4 = Relu;
        let dense3 = Dense::builder()
            .device(device.clone())
            .scalar_type(scalar_type)
            .inputs(84)
            .outputs(10)
            .bias(true)
            .build()?;
        Ok(Self {
            conv1,
            relu1,
            pool1,
            conv2,
            relu2,
            pool2,
            flatten,
            dense1,
            relu3,
            dense2,
            relu4,
            dense3,
        })
    }
}

// One training step. `device`, `x` (assumed: input batch), `t` (assumed:
// targets), `learning_rate`, and `optimizer` are defined elsewhere.
let mut model = LeNet5::new(device.clone(), ScalarType::F32)?;
model.init_parameter_grads()?;
let y = model.forward(x)?;
let loss = y.cross_entropy_loss(t)?;
loss.backward()?;
model.update(learning_rate, &optimizer)?;
```

See the [Neural Network MNIST](examples/neural-network-mnist) example.
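
The `16 * 4 * 4` passed to `dense1` follows from the layer shapes. The sketch below works through that arithmetic in plain Rust without using autograph. It assumes a 28x28 single-channel input (the usual MNIST image size), unpadded stride-1 convolutions, and pooling with a stride equal to the filter size. The helper names `valid_out` and `lenet5_flat_features` are hypothetical and not part of the library.

```rust
/// Output side length of an unpadded ("valid") sliding window.
fn valid_out(input: usize, filter: usize, stride: usize) -> usize {
    (input - filter) / stride + 1
}

/// Number of features reaching the first dense layer for a square input.
/// Assumes stride-1 5x5 convolutions and stride-2 2x2 max pooling.
fn lenet5_flat_features(side: usize) -> usize {
    let conv1 = valid_out(side, 5, 1); // 28 -> 24
    let pool1 = valid_out(conv1, 2, 2); // 24 -> 12
    let conv2 = valid_out(pool1, 5, 1); // 12 -> 8
    let pool2 = valid_out(conv2, 2, 2); // 8 -> 4
    16 * pool2 * pool2 // conv2 produces 16 channels
}

fn main() {
    let features = lenet5_flat_features(28);
    println!("flattened features: {features}");
}
```

Under these assumptions the flattened size is 256, which matches the `inputs(16 * 4 * 4)` value in the model above.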

# Benchmarks

_NVIDIA GeForce GTX 1060 with Max-Q Design_

## LeNet5(training, batch_size = 100)

| | `autograph` | `tch` | `candle` |
|:------------------|:--------------------------|:---------------------------------|:-------------------------------- |
| **`bf16_host`** | `498.54 ms` (✅ **1.00x**) | `75.26 ms` (🚀 **6.62x faster**) | `N/A` |
| **`f32_host`** | `8.25 ms` (✅ **1.00x**) | `3.14 ms` (🚀 **2.63x faster**) | `34.17 ms` (❌ *4.14x slower*) |
| **`bf16_device`** | `1.76 ms` (✅ **1.00x**) | `17.63 ms` (❌ *10.02x slower*) | `N/A` |
| **`f32_device`** | `1.73 ms` (✅ **1.00x**) | `1.19 ms` (✅ **1.45x faster**) | `9.76 ms` (❌ *5.64x slower*) |

## LeNet5(inference, batch_size = 1,000)

| | `autograph` | `tch` | `candle` |
|:------------------|:-------------------------|:---------------------------------|:-------------------------------- |
| **`bf16_host`** | `1.81 s` (✅ **1.00x**) | `193.60 ms` (🚀 **9.37x faster**) | `N/A` |
| **`f32_host`** | `15.56 ms` (✅ **1.00x**) | `9.46 ms` (✅ **1.64x faster**) | `94.23 ms` (❌ *6.06x slower*) |
| **`bf16_device`** | `4.65 ms` (✅ **1.00x**) | `48.63 ms` (❌ *10.46x slower*) | `N/A` |
| **`f32_device`** | `4.65 ms` (✅ **1.00x**) | `1.84 ms` (🚀 **2.52x faster**) | `10.81 ms` (❌ *2.33x slower*) |

See the [Neural Network](benches/neural-network-benches) benchmark.
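
The multipliers in the tables appear to be each library's time relative to the `autograph` column, which is the `1.00x` baseline. As a rough sketch of that arithmetic (the function name `relative` is hypothetical, and the formatting only imitates the table labels):

```rust
/// Describe `other_ms` relative to `baseline_ms` (both in milliseconds).
fn relative(baseline_ms: f64, other_ms: f64) -> String {
    if other_ms <= baseline_ms {
        // The other library took less time than the baseline.
        format!("{:.2}x faster", baseline_ms / other_ms)
    } else {
        // The other library took more time than the baseline.
        format!("{:.2}x slower", other_ms / baseline_ms)
    }
}

fn main() {
    // Training, bf16_host: autograph 498.54 ms vs tch 75.26 ms.
    println!("tch: {}", relative(498.54, 75.26));
    // Training, f32_host: autograph 8.25 ms vs candle 34.17 ms.
    println!("candle: {}", relative(8.25, 34.17));
}
```

Recomputing from the rounded times shown in the tables can differ from the printed ratio in the last digit, presumably because the benchmark tool divides unrounded measurements.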

# License

Dual-licensed to be compatible with the Rust project.

Licensed under the Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0) or the MIT license (http://opensource.org/licenses/MIT), at your option. This file may not be copied, modified, or distributed except according to those terms.

# Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.