Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/botkop/numsca
numsca is numpy for scala
https://github.com/botkop/numsca
deep-learning nd4j numpy scala
Last synced: about 2 months ago
JSON representation
numsca is numpy for scala
- Host: GitHub
- URL: https://github.com/botkop/numsca
- Owner: botkop
- License: bsd-2-clause
- Created: 2017-12-06T10:18:37.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2024-07-14T14:20:53.000Z (6 months ago)
- Last Synced: 2024-11-16T01:41:58.483Z (about 2 months ago)
- Topics: deep-learning, nd4j, numpy, scala
- Language: Jupyter Notebook
- Size: 735 KB
- Stars: 185
- Watchers: 18
- Forks: 18
- Open Issues: 4
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-scala - numsca - activity/y/botkop/numsca) (Table of Contents / Science and Data Analysis)
README
"What I cannot create, I do not understand." - Richard Feynman.
Numsca: Numpy for Scala
=========================[![Maven Central](https://maven-badges.herokuapp.com/maven-central/be.botkop/numsca_2.13/badge.svg)](https://maven-badges.herokuapp.com/maven-central/be.botkop/numsca_2.13)
[![Build Status](https://travis-ci.org/botkop/numsca.svg?branch=master)](https://travis-ci.org/botkop/numsca)Numsca is Numpy for Scala.
I invite you to have a look at [this notebook](https://nbviewer.jupyter.org/github/botkop/numsca/blob/master/notebooks/dl-from-scratch.ipynb),
which explains in simple terms how you can implement a neural net framework with Numsca.(If nbviewer barfs, then you can try [this notebook](notebooks/dl-from-scratch.ipynb))
Here's the famous [neural network in 11 lines of Python](http://iamtrask.github.io/2015/07/12/basic-python-network/), translated to Numsca:
```scala
import botkop.{numsca => ns}
val x = ns.array(0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1).reshape(4, 3)
val y = ns.array(0, 1, 1, 0).T
val w0 = 2 * ns.rand(3, 4) - 1
val w1 = 2 * ns.rand(4, 1) - 1
for (j <- 0 until 60000) {
val l1 = 1 / (1 + ns.exp(-ns.dot(x, w0)))
val l2 = 1 / (1 + ns.exp(-ns.dot(l1, w1)))
val l2Delta = (y - l2) * (l2 * (1 - l2))
val l1Delta = l2Delta.dot(w1.T) * (l1 * (1 - l1))
w1 += l1.T.dot(l2Delta)
w0 += x.T.dot(l1Delta)
}
```Another example: a Scala translation of Andrej Karpathy's
['Minimal character-level language model with a Vanilla Recurrent Neural Network'](src/main/scala/botkop/numsca/samples/MinCharRnn.scala).
(Compare with Andrej Karpathy's original [post](https://gist.github.com/karpathy/d4dee566867f8291f086).)Also have a look at [Scorch](https://github.com/botkop/scorch), a neural net framework in the spirit of [PyTorch](http://pytorch.org/), which uses Numsca.
## Why?
I love Scala. I teach myself deep learning. Everything in deep learning is written in Python.
This library helps me to quickly translate Python and Numpy code to my favorite language.I hope you find it useful.
Pull requests welcome.
## Disclaimer
This is far from an exhaustive copy of Numpy's functionality. I'm adding functionality as I go.
That being said, I think many of the most interesting aspects of Numpy like slicing, broadcasting and indexing
have been successfully implemented.## Under the hood
Numsca piggybacks on [Nd4j](https://nd4j.org/). Thanks, people!## Dependency
Add this to build.sbt:For Scala 2.13:
```scala
libraryDependencies += "be.botkop" %% "numsca" % "0.1.7"
```For Scala 2.11 and 2.12:
```scala
libraryDependencies += "be.botkop" %% "numsca" % "0.1.5"
```## Importing Numsca
```scala
import botkop.{numsca => ns}
import ns.Tensor
```## Creating a Tensor
```scala
scala> Tensor(3, 2, 1, 0)
[3.00, 2.00, 1.00, 0.00]scala> ns.zeros(3, 3)
[[0.00, 0.00, 0.00],
[0.00, 0.00, 0.00],
[0.00, 0.00, 0.00]]scala> ns.ones(3, 2)
[[1.00, 1.00],
[1.00, 1.00],
[1.00, 1.00]]
scala> val ta: Tensor = ns.arange(10)
[0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00]scala> val tb: Tensor = ns.reshape(ns.arange(9), 3, 3)
[[0.00, 1.00, 2.00],
[3.00, 4.00, 5.00],
[6.00, 7.00, 8.00]]
scala> val tc: Tensor = ns.reshape(ns.arange(2 * 3 * 4), 2, 3, 4)
[[[0.00, 1.00, 2.00, 3.00],
[4.00, 5.00, 6.00, 7.00],
[8.00, 9.00, 10.00, 11.00]],
[[12.00, 13.00, 14.00, 15.00],
[16.00, 17.00, 18.00, 19.00],
[20.00, 21.00, 22.00, 23.00]]]
```## Access
Single element
```scala
scala> ta(0)
res10: botkop.numsca.Tensor = 0.00scala> tc(0, 1, 2)
res14: botkop.numsca.Tensor = 6.00
```
Get the value of a single element Tensor:
```scala
scala> ta(0).squeeze()
res11: Double = 0.0
```
Slice
```scala
scala> tc(0)
res7: botkop.numsca.Tensor =
[[0.00, 1.00, 2.00, 3.00],
[4.00, 5.00, 6.00, 7.00],
[8.00, 9.00, 10.00, 11.00]]
scala> tc(0, 1)
res8: botkop.numsca.Tensor = [4.00, 5.00, 6.00, 7.00]
```## Update
In place
```scala
scala> val t = ta.copy()
t: botkop.numsca.Tensor = [0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00]scala> t(3) := -5
scala> t
res16: botkop.numsca.Tensor = [0.00, 1.00, 2.00, -5.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00]scala> t(0) += 7
scala> t
res18: botkop.numsca.Tensor = [7.00, 1.00, 2.00, -5.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00]
```Array wise
```scala
scala> val a2 = 2 * ta
val a2 = 2 * ta
a2: botkop.numsca.Tensor = [0.00, 2.00, 4.00, 6.00, 8.00, 10.00, 12.00, 14.00, 16.00, 18.00]
```## Slicing
Note:
- negative indexing is supported
- Python notation ```t[:3]``` must be written as ```t(0 :> 3)``` or ```t(:>(3))```Not supported (yet):
- step size
- ellipsis### Single dimension
#### Slice over a single dimension```scala
scala> val a0 = ta.copy().reshape(10, 1)
a0: botkop.numsca.Tensor = [0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00]scala> val a1 = a0(1 :>)
a1: botkop.numsca.Tensor = [1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00]scala> val a2 = a0(0 :> -1)
a2: botkop.numsca.Tensor = [0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00]scala> val a3 = a1 - a2
a3: botkop.numsca.Tensor = [1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00]scala> ta(:>, 5 :>)
res19: botkop.numsca.Tensor = [5.00, 6.00, 7.00, 8.00, 9.00]scala> ta(:>, -3 :>)
res4: botkop.numsca.Tensor = [7.00, 8.00, 9.00]
```#### Update single dimension slice
```scala
scala> val t = ta.copy()
t: botkop.numsca.Tensor = [0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00]
```
Assign another tensor
```scala
scala> t(2 :> 5) := -ns.ones(3)
scala> t
res6: botkop.numsca.Tensor = [0.00, 1.00, -1.00, -1.00, -1.00, 5.00, 6.00, 7.00, 8.00, 9.00]
```
Assign a value
```scala
scala> t(2 :> 5) := 33
scala> t
res8: botkop.numsca.Tensor = [0.00, 1.00, 33.00, 33.00, 33.00, 5.00, 6.00, 7.00, 8.00, 9.00]
```
Update in place
```scala
scala> t(2 :> 5) -= 1
scala> t
res10: botkop.numsca.Tensor = [0.00, 1.00, 32.00, 32.00, 32.00, 5.00, 6.00, 7.00, 8.00, 9.00]```
### Multidimensional slices
```scala
scala> tb
res11: botkop.numsca.Tensor =
[[0.00, 1.00, 2.00],
[3.00, 4.00, 5.00],
[6.00, 7.00, 8.00]]
scala> tb(2:>, :>)
res15: botkop.numsca.Tensor = [6.00, 7.00, 8.00]
```
Mixed range/integer indexing. Note that integers are implicitly translated to ranges,
and this differs from Python.
```scala
scala> tb(1, 0 :> -1)
res1: botkop.numsca.Tensor = [3.00, 4.00]
```## Fancy indexing
### Boolean indexing
```scala
scala> val c = ta < 5 && ta > 1
c: botkop.numsca.Tensor = [0.00, 0.00, 1.00, 1.00, 1.00, 0.00, 0.00, 0.00, 0.00, 0.00]
```
This returns a TensorSelection:
```scala
scala> val d = ta(c)
d: botkop.numsca.TensorSelection = TensorSelection([0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00],[[I@153ea1aa,None)
```
Which is implicitly converted to a Tensor when needed:
```scala
scala> val d: Tensor = ta(c)
d: botkop.numsca.Tensor = [2.00, 3.00, 4.00]
```
Or you can force it to become a Tensor:
```scala
scala> ta(c).asTensor
res10: botkop.numsca.Tensor = [2.00, 3.00, 4.00]
```
Updating:
```scala
scala> val t = ta.copy()
scala> t(ta < 5 && ta > 1) := -7
res6: botkop.numsca.Tensor = [0.00, 1.00, -7.00, -7.00, -7.00, 5.00, 6.00, 7.00, 8.00, 9.00]
```
Selection over multiple dimensions:
```scala
scala> val c: Tensor = tc(tc % 5 == 0)
c: botkop.numsca.Tensor = [0.00, 5.00, 10.00, 15.00, 20.00]
```
Updating over multiple dimensions:
```scala
scala> val t1 = tc.copy()
t1: botkop.numsca.Tensor =
[[[0.00, 1.00, 2.00, 3.00],
[4.00, 5.00, 6.00, 7.00],
[8.00, 9.00, 10.00, 11.00]],[[12.00, 13.00, 14.00, 15.00],
[16.00, 17.00, 18.00, 19.00],
[20.00, 21.00, 22.00, 23.00]]]
scala> t1(t1 > 5 && t1 < 15) *= 2
res21: botkop.numsca.Tensor =
[[[0.00, 1.00, 2.00, 3.00],
[4.00, 5.00, 12.00, 14.00],
[16.00, 18.00, 20.00, 22.00]],[[24.00, 26.00, 28.00, 15.00],
[16.00, 17.00, 18.00, 19.00],
[20.00, 21.00, 22.00, 23.00]]]
```
### List of location indexing
```scala
scala> val primes = Tensor(2, 3, 5, 7, 11, 13, 17, 19, 23)scala> val idx = Tensor(3, 4, 1, 2, 2)
scala> primes(idx).asTensor
res23: botkop.numsca.Tensor = [7.00, 11.00, 3.00, 5.00, 5.00]```
Reshape according to index:
```scala
scala> tb
res25: botkop.numsca.Tensor =
[[0.00, 1.00, 2.00],
[3.00, 4.00, 5.00],
[6.00, 7.00, 8.00]]scala> primes(tb).asTensor
res24: botkop.numsca.Tensor =
[[2.00, 3.00, 5.00],
[7.00, 11.00, 13.00],
[17.00, 19.00, 23.00]]
```
Use as a look-up table:
```scala
scala> val numSamples = 4
val numClasses = 3
val x = ns.arange(numSamples * numClasses).reshape(numSamples, numClasses)
val y = Tensor(0, 1, 2, 1)
val z: Tensor = x(ns.arange(numSamples), y)
res26: botkop.numsca.Tensor = [0.00, 4.00, 8.00, 10.00]
```
Update along a single dimension:
```scala
scala> val primes = Tensor(2, 3, 5, 7, 11, 13, 17, 19, 23)
primes: botkop.numsca.Tensor = [2.00, 3.00, 5.00, 7.00, 11.00, 13.00, 17.00, 19.00, 23.00]scala> val idx = Tensor(3, 4, 1, 2, 2)
idx: botkop.numsca.Tensor = [3.00, 4.00, 1.00, 2.00, 2.00]scala> primes(idx) := 0
scala> primes
res1: botkop.numsca.Tensor = [2.00, 0.00, 0.00, 0.00, 0.00, 13.00, 17.00, 19.00, 23.00]
```
Multiple dimensions
```scalascala> val a = ns.arange(6).reshape(3, 2) + 1
a: botkop.numsca.Tensor =
[[1.00, 2.00],
[3.00, 4.00],
[5.00, 6.00]]scala> val s1 = Tensor(0, 1, 2)
s1: botkop.numsca.Tensor = [0.00, 1.00, 2.00]scala> val s2 = Tensor(0, 1, 0)
s2: botkop.numsca.Tensor = [0.00, 1.00, 0.00]scala> val r1: Tensor = a(s1, s2)
r1: botkop.numsca.Tensor = [1.00, 4.00, 5.00]
```
An index will be broadcast if needed:
```scala
scala> val y = ns.arange(35).reshape(5, 7)
y: botkop.numsca.Tensor =
[[0.00, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00],
[7.00, 8.00, 9.00, 10.00, 11.00, 12.00, 13.00],
[14.00, 15.00, 16.00, 17.00, 18.00, 19.00, 20.00],
[21.00, 22.00, 23.00, 24.00, 25.00, 26.00, 27.00],
[28.00, 29.00, 30.00, 31.00, 32.00, 33.00, 34.00]]scala> val r5: Tensor = y(Tensor(0, 2, 4), Tensor(1))
r5: botkop.numsca.Tensor = [1.00, 15.00, 29.00]
```Update along multiple dimensions:
```scala
scala> val a = ns.arange(6).reshape(3, 2) + 1
a: botkop.numsca.Tensor =
[[1.00, 2.00],
[3.00, 4.00],
[5.00, 6.00]]scala> val s1 = Tensor(1, 1, 2)
s1: botkop.numsca.Tensor = [1.00, 1.00, 2.00]scala> val s2 = Tensor(0, 1, 0)
s2: botkop.numsca.Tensor = [0.00, 1.00, 0.00]scala> a(s1, s2) := 0
res1: botkop.numsca.Tensor =
[[1.00, 2.00],
[0.00, 0.00],
[0.00, 6.00]]
```
## Broadcasting```scala
scala> val x = ns.arange(4)
x: botkop.numsca.Tensor = [0.00, 1.00, 2.00, 3.00]scala> val xx = x.reshape(4, 1)
xx: botkop.numsca.Tensor = [0.00, 1.00, 2.00, 3.00]scala> val y = ns.ones(5)
y: botkop.numsca.Tensor = [1.00, 1.00, 1.00, 1.00, 1.00]scala> val z = ns.ones(3, 4)
val z = ns.ones(3, 4)
[[1.00, 1.00, 1.00, 1.00],
[1.00, 1.00, 1.00, 1.00],
[1.00, 1.00, 1.00, 1.00]]scala> (xx + y)
[[1.00, 1.00, 1.00, 1.00, 1.00],
[2.00, 2.00, 2.00, 2.00, 2.00],
[3.00, 3.00, 3.00, 3.00, 3.00],
[4.00, 4.00, 4.00, 4.00, 4.00]]scala> x + z
[[1.00, 2.00, 3.00, 4.00],
[1.00, 2.00, 3.00, 4.00],
[1.00, 2.00, 3.00, 4.00]]
```
Outer sum:
```scala
scala> val a = Tensor(0.0, 10.0, 20.0, 30.0).reshape(4, 1)
a: botkop.numsca.Tensor = [0.00, 10.00, 20.00, 30.00]scala> val b = Tensor(1.0, 2.0, 3.0)
b: botkop.numsca.Tensor = [1.00, 2.00, 3.00]scala> a + b
res6: botkop.numsca.Tensor =
[[1.00, 2.00, 3.00],
[11.00, 12.00, 13.00],
[21.00, 22.00, 23.00],
[31.00, 32.00, 33.00]]```
Vector Quantization from [EricsBroadcastingDoc](http://scipy.github.io/old-wiki/pages/EricsBroadcastingDoc):
```scala
scala> val observation = Tensor(111.0, 188.0)scala> val codes = Tensor( 102.0, 203.0, 132.0, 193.0, 45.0, 155.0, 57.0, 173.0).reshape(4, 2)
codes: botkop.numsca.Tensor =
[[102.00, 203.00],
[132.00, 193.00],
[45.00, 155.00],
[57.00, 173.00]]scala> val diff = codes - observation
diff: botkop.numsca.Tensor =
[[-9.00, 15.00],
[21.00, 5.00],
[-66.00, -33.00],
[-54.00, -15.00]]scala> val dist = ns.sqrt(ns.sum(ns.square(diff), axis = -1))
dist: botkop.numsca.Tensor = [17.49, 21.59, 73.79, 56.04]scala> val nearest = ns.argmin(dist).squeeze()
nearest: Double = 0.0```