Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/denizyuret/autograd.jl
Julia port of the Python autograd package.
- Host: GitHub
- URL: https://github.com/denizyuret/autograd.jl
- Owner: denizyuret
- License: other
- Created: 2016-08-08T20:24:26.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2022-02-12T19:22:35.000Z (over 2 years ago)
- Last Synced: 2024-09-10T22:28:14.536Z (2 months ago)
- Topics: autograd, automatic-differentiation, data-science, deep-learning, knet, machine-learning, neural-networks
- Language: Julia
- Size: 763 KB
- Stars: 169
- Watchers: 20
- Forks: 26
- Open Issues: 21
Metadata Files:
- Readme: README.md
- Changelog: ChangeLog
- License: LICENSE.md
README
# AutoGrad
[![Build Status](https://travis-ci.org/denizyuret/AutoGrad.jl.svg?branch=master)](https://travis-ci.org/denizyuret/AutoGrad.jl)
[![coveralls](https://coveralls.io/repos/github/denizyuret/AutoGrad.jl/badge.svg?branch=master)](https://coveralls.io/github/denizyuret/AutoGrad.jl?branch=master)
[![codecov](https://codecov.io/gh/denizyuret/AutoGrad.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/denizyuret/AutoGrad.jl)

AutoGrad.jl is an automatic differentiation package for Julia. It started as a port of the
popular Python [autograd](https://github.com/HIPS/autograd) package and forms the foundation
of the [Knet](https://github.com/denizyuret/Knet.jl) Julia deep learning framework.
AutoGrad can differentiate regular Julia code that includes loops, conditionals, helper
functions, closures etc. by keeping track of the primitive operations and using this
execution trace to compute gradients. It uses reverse mode differentiation
(a.k.a. backpropagation) so it can efficiently handle functions with large array inputs and
scalar outputs. It can compute gradients of gradients to handle higher order derivatives.
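As a quick illustration of tracing through ordinary control flow, here is a small sketch
using the `grad` interface described below (the `poly` function is made up for illustration;
it is not part of the package):

```julia
using AutoGrad

# A regular Julia function with a loop: computes 1 + x + x^2 + x^3.
function poly(x)
    s, p = 0.0, 1.0
    for i in 1:4
        s += p      # accumulate the current term
        p *= x      # next power of x
    end
    return s
end

g = grad(poly)      # g(x) returns dpoly/dx
g(2.0)              # => 17.0, i.e. 1 + 2x + 3x^2 at x = 2
```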
## Installation

You can install AutoGrad in Julia using:
```julia
julia> using Pkg; Pkg.add("AutoGrad")
```

To use it in your code, start with:
```julia
using AutoGrad
```

## Interface
```julia
x = Param([1,2,3]) # user declares parameters
x => P([1,2,3]) # they are wrapped in a struct
value(x) => [1,2,3] # we can get the original value
sum(abs2,x) => 14 # they act like regular values outside of differentiation
y = @diff sum(abs2,x) # if you want the gradients
y => T(14) # you get another struct
value(y) => 14 # which represents the same value
grad(y,x) => [2,4,6] # but also contains gradients for all Params
```
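These pieces combine into a basic parameter update. A minimal sketch (not from the package
documentation), assuming that `value(x)` returns the underlying array rather than a copy so
it can be updated in place:

```julia
x = Param([1.0, 2.0, 3.0])
y = @diff sum(abs2, x)     # differentiate with respect to all Params
gx = grad(y, x)            # => [2.0, 4.0, 6.0]
value(x) .-= 0.1 .* gx     # a plain gradient-descent step on the underlying array
```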
## Old Interface

Before v1.1, AutoGrad only provided the following `grad` interface; it is still supported.
```julia
x = [1,2,3]
f(x) = sum(abs2,x)
g = grad(f)
f(x) => 14
g(x) => [2,4,6]
```
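Since `grad(f)` returns an ordinary Julia function, one way to obtain the higher order
derivatives mentioned above is to apply `grad` repeatedly. A small sketch, assuming a
scalar-input, scalar-output function:

```julia
f(x) = sin(x)
df  = grad(f)        # first derivative:  cos(x)
ddf = grad(df)       # second derivative: -sin(x)
df(0.0), ddf(0.0)    # ≈ (1.0, 0.0)
```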
## Example

Here is a linear regression example using [callable objects](https://docs.julialang.org/en/stable/manual/methods/#Function-like-objects-1):
```julia
using LinearAlgebra            # axpy! lives in LinearAlgebra

struct Linear; w; b; end       # user defines a model
(f::Linear)(x) = (f.w * x .+ f.b)

# Initialize a model as a callable object with parameters:
f = Linear(Param(randn(10,100)), Param(randn(10)))

# SGD training loop (data is any iterable of (x,y) minibatches):
for (x,y) in data
    loss = @diff sum(abs2,f(x)-y)
    for w in params(f)
        g = grad(loss,w)
        axpy!(-0.01, g, w)     # w .-= 0.01*g
    end
end
```
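The `data` variable in the loop above is assumed to be any iterable of `(x, y)` minibatches.
A hedged sketch of synthetic data matching the 100-input, 10-output model (the names
`wtrue`, `btrue` and `make_batch` are made up for illustration):

```julia
wtrue, btrue = randn(10,100), randn(10)        # ground-truth parameters
make_batch() = (x = randn(100,32); (x, wtrue*x .+ btrue .+ 0.1 .* randn(10,32)))
data = [make_batch() for _ in 1:20]            # 20 minibatches of 32 examples each
```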
See the [examples directory](https://github.com/denizyuret/AutoGrad.jl/blob/master/examples)
for more examples.

## Extending AutoGrad
AutoGrad can only handle a function if the primitives it uses have known gradients. You can
add your own primitives with gradients using the `@primitive` and `@zerograd` macros in
[macros.jl](https://github.com/denizyuret/AutoGrad.jl/blob/master/src/macros.jl). Here is an
example:

```julia
@primitive log(x),dy,y (dy .* (1 ./ x))
```

The `@primitive` macro marks the `log(::Any)` method as a new primitive and the next
expression defines a gradient function with respect to the first argument. The gradient
expressions can refer to the parameter(s) `x`, the return variable `y` and its gradient `dy`
(optionally indicated after the argument list) in the method declaration. For functions with
multiple inputs, multiple gradient expressions may be given, one per argument (see the
sketch at the end of this section). Non-existent or zero gradients can be specified by
omitting a gradient expression or using `nothing` in place of one. By default the
broadcasting version `log.(x)` is also defined as a primitive; use the `@primitive1` macro
if you don't want this.

Note that Julia supports multiple dispatch, i.e. a function may have multiple methods each
supporting different argument types. For example `log(::Float32)` and `log(::BigFloat)` are
two different log methods. In AutoGrad.jl each method can be defined independently as a
primitive and can have its own specific gradient. Generally AutoGrad defines gradients
without using argument types to keep the rules generic.
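As a sketch of the multiple-input case, here is how gradient expressions could be supplied
for each argument; `hypot` and `sign` may already have rules defined by AutoGrad, so treat
these lines as illustrations of the syntax rather than definitions you need to add:

```julia
# hypot(x1,x2) = sqrt(x1^2 + x2^2): one gradient expression per argument,
# referring to the inputs x1, x2, the output y, and its gradient dy.
@primitive hypot(x1,x2),dy,y  (dy .* (x1 ./ y))  (dy .* (x2 ./ y))

# A primitive whose derivative is zero (almost) everywhere can be declared with @zerograd:
@zerograd sign(x)
```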
## Debugging and Profiling

To view the contents of the computational graph after differentiating a function you can use
the following:

```julia
julia> AutoGrad.gcnode(::AutoGrad.Node)=nothing # without this some values may be lost
julia> w = Param(rand(2,3)); b = Param(rand(2,1)); x = rand(3,4); y = rand(2,4);
julia> J = @diff sum(abs2, w*x .+ b - y)
T(14.695603907991153)
julia> [J] # displaying J in an Array causes pretty printing
1. P(Array{Float64,2}(2,3)) ∇=Array{Float64,2}(2,3)
2. Array{Float64,2}(2,4) = *(Array{Float64,2}(2,3), Array{Float64,2}(3,4))) ∇=Array{Float64,2}(2,4)
3. P(Array{Float64,2}(2,1)) ∇=Array{Float64,2}(2,1)
4. Array{Float64,2}(2,4) = broadcast(+, Array{Float64,2}(2,4), Array{Float64,2}(2,1))) ∇=Array{Float64,2}(2,4)
5. Array{Float64,2}(2,4) = -(Array{Float64,2}(2,4), Array{Float64,2}(2,4))) ∇=Array{Float64,2}(2,4)
6. 14.695603907991153 = sum(abs2, Array{Float64,2}(2,4))) ∇=1.0
julia> z = collect(J.list) # collect creates a Node array with reverse order
julia> dump(z[5], maxdepth=1) # allowing you to look at individual Nodes and Values
AutoGrad.Node
Value: AutoGrad.Result{Array{Float64,2}}
parents: Array{AutoGrad.Node}((2,))
children: Array{AutoGrad.Node}((1,))
outgrad: Array{Float64}((2, 4)) [3.82753 2.19124 3.26769 3.0075; 2.81565 2.3903 1.84373 1.60228]
cdr: AutoGrad.Node
julia> dump(z[5].Value, maxdepth=2)
AutoGrad.Result{Array{Float64,2}}
value: Array{Float64}((2, 4)) [1.16724 1.07224 0.935047 0.895262; 0.687182 0.589704 0.517114 0.495718]
func: * (function of type typeof(*))
args: Tuple{Param{Array{Float64,2}},Array{Float64,2}}
1: Param{Array{Float64,2}}
2: Array{Float64}((3, 4)) [0.515282 0.257471 0.140791 0.127632; 0.705288 0.783289 0.361965 0.311965; 0.780549 0.691645 0.853317 0.843374]
kwargs: Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}
data: NamedTuple{(),Tuple{}} NamedTuple()
itr: Tuple{} ()
```

To profile AutoGrad using TimerOutputs.jl, set the environment variable
`ENV["AUTOGRAD_TIMER"]="true"` and rebuild AutoGrad with `Pkg.build("AutoGrad")`, before
evaluating `using AutoGrad`. The environment variable `AUTOGRAD_TIMER` is only checked at
compile time, not at run time, for performance reasons. This will collect detailed timing
information but slows the code down; when you are done, don't forget to
`delete!(ENV,"AUTOGRAD_TIMER")` and rebuild AutoGrad. In the example below, the symbol `sum`
indicates the time spent on the forward pass of the `sum` function and `sum[2]` indicates
the time spent on the backward pass for the second argument. `record` and `sum_outgrads` are
functions internal to AutoGrad.

```julia
julia> ENV["AUTOGRAD_TIMER"]="true"
julia> using Pkg; Pkg.build("AutoGrad")
julia> using AutoGrad, TimerOutputs
julia> reset_timer!(AutoGrad.to)
julia> w = Param(rand(2,3)); b = Param(rand(2,1)); x = rand(3,4); y = rand(2,4);
julia> J = @diff sum(abs2, w*x .+ b - y)
julia> AutoGrad.to
───────────────────────────────────────────────────────────────────────
                                Time                   Allocations
                        ──────────────────────   ───────────────────────
   Tot / % measured:         4.62s / 30.4%            546MiB / 25.0%

Section          ncalls     time   %tot      avg     alloc   %tot      avg
───────────────────────────────────────────────────────────────────────
+.[2]                 1    328ms  23.3%    328ms   46.4MiB  34.1%  46.4MiB
sum[2]                1    288ms  20.5%    288ms   40.0MiB  29.4%  40.0MiB
*                     1   38.8ms  2.76%   38.8ms    595KiB  0.43%   595KiB
*                     1    269ms  19.2%    269ms    955KiB  0.68%   955KiB
+.                    1    139ms  9.92%    139ms   20.4MiB  15.0%  20.4MiB
*[1]                  1    117ms  8.33%    117ms   9.41MiB  6.90%  9.41MiB
record                4   88.7ms  6.31%   22.2ms   3.49MiB  2.56%   894KiB
-[1]                  1   65.9ms  4.69%   65.9ms   10.0MiB  7.32%  10.0MiB
-                     1   55.8ms  3.97%   55.8ms    929KiB  0.67%   929KiB
sum                   1   50.0ms  3.56%   50.0ms   4.68MiB  3.44%  4.68MiB
+.[1]                 1   1.78ms  0.13%   1.78ms   37.7KiB  0.03%  37.7KiB
sum_outgrads          5   1.41ms  0.10%    282μs   28.2KiB  0.02%  5.64KiB
───────────────────────────────────────────────────────────────────────
```

## Code structure
[core.jl](https://github.com/denizyuret/AutoGrad.jl/blob/master/src/core.jl) implements the
main functionality and acts as the main documentation source.
[macros.jl](https://github.com/denizyuret/AutoGrad.jl/blob/master/src/macros.jl) has some
support functions to define and test new primitives.
[getindex.jl](https://github.com/denizyuret/AutoGrad.jl/blob/master/src/getindex.jl),
[iterate.jl](https://github.com/denizyuret/AutoGrad.jl/blob/master/src/iterate.jl) and
[cat.jl](https://github.com/denizyuret/AutoGrad.jl/blob/master/src/cat.jl) set up support
for common data structures including Arrays, Tuples, and Dictionaries. The numerical
gradients are defined in files such as `base.jl` and `math.jl`.

## Current status and future work
The gradient coverage and unit testing are spotty; I am still adding more gradients and
tests to cover the Julia base. Documentation needs to be improved. Functions that overwrite
their arguments (e.g. `setindex!`) are not supported. Efficiency could be improved by
reducing runtime compilation, adding memoization, and supporting static computation.

## Acknowledgments and references
AutoGrad.jl was written by [Deniz Yuret](http://www.denizyuret.com). Parts of the code were
initially ported from the Python [autograd](https://github.com/HIPS/autograd) package. I'd
like to thank autograd author Dougal Maclaurin for his support. See [(Baydin et
al. 2015)](https://arxiv.org/abs/1502.05767) for a general review of automatic
differentiation, [autograd
tutorial](https://github.com/HIPS/autograd/blob/master/docs/tutorial.md) for some Python
examples, and Dougal's PhD thesis for design principles.
[JuliaDiff](http://www.juliadiff.org/) and [FluxML](https://github.com/FluxML) have
alternative differentiation tools for Julia. I would like to thank the current
contributors:

* Carlo Lucibello
* Ekin Akyürek
* Emre Yolcu
* Jarrett Revels
* Mike Innes
* Ozan Arkan Can
* Rene Donner

The suggested citation for AutoGrad is:
```
@inproceedings{knet2016mlsys,
author={Yuret, Deniz},
title={Knet: beginning deep learning with 100 lines of Julia},
year={2016},
booktitle={Machine Learning Systems Workshop at NIPS 2016}
}
```