{"id":13741976,"url":"https://github.com/christopher-hesse/tenet","last_synced_at":"2025-05-08T22:32:53.225Z","repository":{"id":127980074,"uuid":"364163193","full_name":"christopher-hesse/tenet","owner":"christopher-hesse","description":"Automatic differentiation prototype in Zig","archived":false,"fork":false,"pushed_at":"2021-05-06T04:33:52.000Z","size":60,"stargazers_count":15,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-11-15T12:37:14.829Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Zig","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/christopher-hesse.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2021-05-04T06:39:42.000Z","updated_at":"2024-01-23T04:56:37.000Z","dependencies_parsed_at":"2024-01-25T05:49:49.600Z","dependency_job_id":null,"html_url":"https://github.com/christopher-hesse/tenet","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/christopher-hesse%2Ftenet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/christopher-hesse%2Ftenet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/christopher-hesse%2Ftenet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/christopher-hesse%2Ftenet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/christopher-hesse","download_url":"https://codeload.github.com/christopher-hesse/tenet/tar.gz/refs/h
eads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253158633,"owners_count":21863344,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T04:01:04.766Z","updated_at":"2025-05-08T22:32:52.910Z","avatar_url":"https://github.com/christopher-hesse.png","language":"Zig","funding_links":[],"categories":["Libraries"],"sub_categories":[],"readme":"# tenet\r\n\r\nA [torch](https://github.com/pytorch/pytorch)-inspired automatic differentiation prototype for [Zig](https://ziglang.org/).\r\n\r\nImagine the [numpy](https://numpy.org/) NDArray, only you can also compute backward in time using inverted functions.  Well, not quite, but you *can* calculate derivatives with respect to the inputs of your computation.\r\n\r\n## Usage\r\n\r\nThe main struct is `Tensor`, an N-dimensional array of numbers, usually floating point numbers.  
Here's a short example showing how to do a `+` operation along with a backward pass:\r\n\r\n```zig\r\nconst std = @import(\"std\");\r\nconst tenet = @import(\"tenet.zig\");\r\nconst alc = std.testing.allocator;\r\nvar a = try tenet.Tensor.allocWithValue(f32, alc, \u0026[_]u64{2, 3, 4}, 1.0, tenet.tensor.REQUIRES_GRAD);\r\ndefer a.release();\r\nvar b = try tenet.Tensor.allocWithValue(f32, alc, \u0026[_]u64{2, 3, 4}, 2.0, tenet.tensor.REQUIRES_GRAD);\r\ndefer b.release();\r\nvar out = try tenet.tensor.plusAlloc(alc, a, b);\r\ndefer out.release();\r\nvar grad_out = try tenet.Tensor.allocWithValue(f32, alc, \u0026[_]u64{2, 3, 4}, 4.0, 0);\r\ndefer grad_out.release();\r\ntry tenet.tensor.backwardAlloc(alc, out, grad_out);\r\nstd.testing.expect(tenet.array.equal(a.grad.?, grad_out.data));\r\nstd.testing.expect(tenet.array.equal(b.grad.?, grad_out.data));\r\n```\r\n\r\nFor a full example, look at the [MNIST example](src/main.zig).\r\n\r\n## Automatic Differentiation\r\n\r\nIf you have a function `z = f(x, y)` and you want to know how to change `x` and `y` to minimize `z`, how do you find that out?  One way would be to increase and decrease `x` and `y` individually to see how much `z` changes, then move them in whichever direction is better.  That method is called [\"finite differences\"](https://en.wikipedia.org/wiki/Finite_difference#Relation_with_derivatives).\r\n\r\nFor a couple of input variables, this is fine, but it needs at least one extra evaluation of the function per input, so it's not very efficient with a large number of input variables.  Instead of doing that, you can find the derivatives by constructing a sort of backward version of the computation graph of your function.  
If the function `f` looked like this:\r\n\r\n```py\r\ndef square(x):\r\n    return x ** 2\r\n\r\ndef cube(x):\r\n    return x ** 3\r\n\r\ndef multiply(x, y):\r\n    return x * y\r\n\r\ndef f(x, y):\r\n    a = square(x)\r\n    b = cube(y)\r\n    c = multiply(a, b)\r\n    return c\r\n```\r\n\r\nYou might have a backward function like this:\r\n\r\n```py\r\ndef backward_multiply(x, y, grad_out):\r\n    grad_in_x = y * grad_out\r\n    grad_in_y = x * grad_out\r\n    return grad_in_x, grad_in_y\r\n\r\ndef backward_square(x, grad_out):\r\n    grad_in = 2 * x * grad_out\r\n    return grad_in\r\n\r\ndef backward_cube(x, grad_out):\r\n    grad_in = 3 * x ** 2 * grad_out\r\n    return grad_in\r\n\r\ndef backward_f(x, y, grad_z):\r\n    # we actually need the intermediate values to call the backward functions\r\n    # so re-calculate them here (normally we would just store them when running f() the first time)\r\n    a = square(x)\r\n    b = cube(y)\r\n    _c = multiply(a, b)\r\n\r\n    grad_a, grad_b = backward_multiply(a, b, grad_z)\r\n    grad_y = backward_cube(y, grad_b)\r\n    grad_x = backward_square(x, grad_a)\r\n    return grad_x, grad_y\r\n```\r\n\r\nWhere the `backward_` functions are the derivatives of the original functions, using the chain rule to combine them together.  Each `backward_` function takes the original inputs to the normal function, plus an extra `grad_out` parameter, then returns `grad_in_\u003cname\u003e` for each of the original inputs.  
You end up with the same information about how the output changes as you would get from changing each input variable individually, only with fewer calculations:\r\n\r\n```py\r\n# run the function normally\r\nx = 1.0\r\ny = 2.0\r\nz = f(x, y)\r\nprint(f\"f(x,y): {z}\")\r\n\r\n# run the backward function\r\ngrad_z = 1.0  # the initial grad value is set to 1\r\ngrad_x, grad_y = backward_f(x, y, grad_z)\r\nprint(f\"backward_f(x, y, grad_z): grad_x = {grad_x}, grad_y = {grad_y}\")\r\n\r\n# check the backward function using finite differences\r\n# by making small changes to each input to find how the output changes\r\ndef finite_differences(x, y, f, epsilon=1e-6):\r\n    grad_x = (f(x + epsilon, y) - f(x - epsilon, y)) / (2 * epsilon)\r\n    grad_y = (f(x, y + epsilon) - f(x, y - epsilon)) / (2 * epsilon)\r\n    return grad_x, grad_y\r\n\r\ngrad_x_fd, grad_y_fd = finite_differences(x, y, f)\r\nprint(f\"finite differences approximation: grad_x = {grad_x_fd}, grad_y = {grad_y_fd}\")\r\n```\r\n\r\nSee [scripts/grad_example.py](scripts/grad_example.py) for the full script.  In the case where the inputs and outputs are matrices instead of scalars, `grad_out` will have the shape of the output, and each `grad_in_\u003cname\u003e` will have the shape of the corresponding input.\r\n\r\nIn automatic differentiation, you create `backward_f` automatically based on the operations done by `f`.  Like in torch, no explicit graph is defined when using this prototype.  Arrays in `tenet` track the series of operations used to create them, so when you do the backward pass, each `backward_` function is run for you, automatically.\r\n\r\n## Interesting Features\r\n\r\nThere's only one sort of interesting feature about this prototype.  Zig does not support operator overloading, but it would still be nice to write out equations.  
Writing out the operations by hand is a bit of a pain:\r\n\r\n```zig\r\n// (x * y + z) ^ 2.0\r\nvar a = try multiplyAlloc(alc, x, y);\r\ndefer a.release();\r\nvar b = try addAlloc(alc, a, z);\r\ndefer b.release();\r\nvar two = try Tensor.allocWithValue(f32, alc, \u0026[_]u64{}, 2, tensor.NO_FLAGS);\r\ndefer two.release();\r\nvar c = try powerAlloc(alc, b, two);\r\ndefer c.release();\r\n```\r\n\r\nThe `expr` function does all the same stuff, but uses a string at compile time:\r\n\r\n```zig\r\nvar c = try expr(alc, \"(x .* y + z) .^ 2.0\", .{.x=x, .y=y, .z=z});\r\ndefer c.release();\r\n```\r\n\r\nActually, it only parses the expression at compile time; it doesn't fully unroll all the operations. I suspect the only thing keeping it from fully unrolling is some Zig compiler bug.\r\n\r\nBecause operator overloading is not used, the `expr` syntax has far fewer limitations.  For this prototype, it uses [MATLAB style operators](https://www.mathworks.com/help/matlab/matlab_prog/matlab-operators-and-special-characters.html).\r\n\r\n## Downsides\r\n\r\n* Defining an explicit graph may be a better approach than this and is used in the [kann](https://github.com/attractivechaos/kann) library\r\n* Deallocating memory immediately is kind of annoying when you don't use `expr`.  If you use `defer`, it won't be deallocated until the end of the block\r\n* Performance is mediocre; there has been no tuning for performance beyond an option to use [Intel's MKL library](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html#gs.zou9ms).  
The option is `-Duse-mkl` when using `zig build`.\r\n* CPU only for now\r\n* Only tested on Windows\r\n* Probably contains serious bugs\r\n* This is mostly a proof-of-concept, and will likely not be maintained as a generally useful library.\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchristopher-hesse%2Ftenet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchristopher-hesse%2Ftenet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchristopher-hesse%2Ftenet/lists"}