https://github.com/tony-aw/broadcast
R Package Broadcast: Broadcasted Array Operations Like ‘NumPy’
- Host: GitHub
- URL: https://github.com/tony-aw/broadcast
- Owner: tony-aw
- License: mpl-2.0
- Created: 2024-12-31T09:17:18.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-02-18T19:28:28.000Z (3 months ago)
- Last Synced: 2026-02-18T23:41:05.348Z (3 months ago)
- Topics: cran, cran-r, data-manipulation, fastverse, high-performance, multidimensional-arrays, numpy, r, r-package, rstats, rstats-package, scientific-computing
- Language: R
- Homepage: https://tony-aw.github.io/broadcast/
- Size: 287 MB
- Stars: 27
- Watchers: 3
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: NEWS.md
- License: LICENSE
README
‘R’-package ‘broadcast’: Broadcasted Array Operations Like ‘NumPy’
## 🗺️Overview
‘broadcast’ is an efficient ‘C’/‘C++’-based ‘R’ package that, as the
name suggests, performs “array broadcasting” (similar to broadcasting in
the ‘NumPy’ module for ‘Python’).
In the context of operations involving 2 (or more) arrays,
“broadcasting” refers to efficiently recycling array dimensions, without
making copies.
This is considerably **faster** and **more memory-efficient** than
replicating array dimensions explicitly, which is what base ‘R’
otherwise requires.
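As a rough illustration of the idea, the sketch below computes the dimensions that two arrays would broadcast to. It assumes the NumPy-style rule (two dimension sizes are compatible when they are equal or when one of them is 1) and, for simplicity, that both arrays have the same number of dimensions. `bc_dim_sketch()` is a hypothetical helper written for this example; it is not part of ‘broadcast’, and the package's own conformability checks may differ in detail.

``` r
# Hypothetical helper (not part of 'broadcast'): result dimensions under
# the NumPy-style broadcasting rule, for arrays of equal dimensionality.
bc_dim_sketch <- function(dx, dy) {
  stopifnot(length(dx) == length(dy))
  # Each pair of sizes must match, or one of them must be 1.
  ok <- dx == dy | dx == 1L | dy == 1L
  if (!all(ok)) stop("non-conformable dimensions")
  # The result takes the larger size along every dimension.
  pmax(dx, dy)
}

bc_dim_sketch(c(3L, 5L), c(1L, 5L))  # a 3x5 array with a 1x5 array
bc_dim_sketch(c(3L, 1L), c(1L, 5L))  # a 3x1 array with a 1x5 array
```

Both calls give a 3 by 5 result; in neither case does the smaller array need to be copied to that size first.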
Key Features:
Consider computing the addition of these arrays `x` and `y`:
``` r
(x <- array(1:15, c(3, 5)))
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 4 7 10 13
#> [2,] 2 5 8 11 14
#> [3,] 3 6 9 12 15
(y <- array(1:5 * 100, c(1, 5)))
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 100 200 300 400 500
```
This cannot be done efficiently in base ‘R’; it **can** be done fast and
memory-efficiently with the ‘broadcast’ package:
Base ‘R’:

``` r
x + y
#> Error in x + y : non-conformable arrays

# You *could* do the following...
x + y[rep(1L, 3L), ]
# ... but if x or y is very large:
#> Error: cannot allocate vector of size
```

‘broadcast’ package:

``` r
library(broadcast)
broadcaster(x) <- TRUE
broadcaster(y) <- TRUE
x + y
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 101 204 307 410 513
#> [2,] 102 205 308 411 514
#> [3,] 103 206 309 412 515
#> broadcaster
```
‘broadcast’ supports a wide range of infix operators, including
arithmetic, relational, Boolean, string, and bit-wise operators.
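To sanity-check broadcasted results, it can help to compute a base ‘R’ reference with `sweep()`. The sketch below does this for one arithmetic and one relational operator. The array `z` is a hypothetical set of per-column thresholds invented for this example. The ‘broadcast’ part only runs if the package is installed, and it compares values only, since the attributes of the broadcasted result may differ from those of the reference.

``` r
x <- array(1:15, c(3, 5))
y <- array(1:5 * 100, c(1, 5))
z <- array(c(2, 5, 8, 11, 14), c(1, 5))  # hypothetical per-column thresholds

# Base R reference: apply the single row of y (or z) to every row of x.
sum_ref <- sweep(x, 2, y[1, ], "+")  # arithmetic
gt_ref  <- sweep(x, 2, z[1, ], ">")  # relational
print(sum_ref)
print(gt_ref)

# Optional comparison against 'broadcast', if it is installed.
if (requireNamespace("broadcast", quietly = TRUE)) {
  library(broadcast)
  broadcaster(x) <- TRUE
  broadcaster(y) <- TRUE
  broadcaster(z) <- TRUE
  # as.vector() drops dimensions and other attributes before comparing.
  print(all(as.vector(x + y) == as.vector(sum_ref)))
  print(all(as.vector(x > z) == as.vector(gt_ref)))
}
```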
Using broadcasting, `bind_array()` from the ‘broadcast’ package can bind
arrays together in ways that cannot efficiently be done with `rbind()`,
`cbind()`, or `abind::abind()`. Consider, for example, column-binding
these arrays `x` and `y`:
``` r
(x <- array(1:12, c(3, 4)))
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
(y <- array(1:4 * 100, c(1, 4)))
#> [,1] [,2] [,3] [,4]
#> [1,] 100 200 300 400
```
This cannot be done efficiently in base ‘R’; it **can** be done fast and
memory-efficiently with the ‘broadcast’ package:
Base ‘R’:

``` r
cbind(x, y)
#> Error in cbind(x, y) :
#>   number of rows of matrices must match

# You *could* do the following...
cbind(x, y[rep(1L, 3L), ])
# ... but if x or y is very large:
#> Error: cannot allocate vector of size
```

‘broadcast’ package:

``` r
bind_array(list(x, y), along = 2L)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 1 4 7 10 100 200 300 400
#> [2,] 2 5 8 11 100 200 300 400
#> [3,] 3 6 9 12 100 200 300 400
```
`bind_array()` is also considerably faster and more memory efficient
than `abind()`. See the
[benchmarks](https://tony-aw.github.io/broadcast/about/f_benchmarks_other.html).
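To make the memory argument concrete, the base ‘R’ sketch below measures the temporary object that the manual `y[rep(1L, n), ]` workaround creates. The row count `n_rows` is an arbitrary value chosen for illustration; exact sizes depend on the platform.

``` r
y <- array(1:4 * 100, c(1, 4))
n_rows <- 10000L  # arbitrary illustrative size

# The manual workaround materialises a full n_rows x 4 copy of y
# before any binding or arithmetic takes place.
y_rep <- y[rep(1L, n_rows), , drop = FALSE]

print(dim(y_rep))
print(object.size(y), units = "Kb")
print(object.size(y_rep), units = "Kb")
```

The replicated copy grows linearly with the number of rows, even though it carries no information beyond the original single row.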
Broadcasted General Functions
The idea of broadcasted infix operations and broadcasted array binding
has been generalized to also include `bcapply()` (a broadcasted
apply-like function), `bc_ifelse()` (a broadcasted version of
`ifelse()`), and `bc_strrep()` (a broadcasted version of `strrep()`).
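To show what a broadcasted `ifelse()` means semantically, here is a base ‘R’ emulation. It is only a sketch: `expand_to()` is a hypothetical helper written for this example, and it deliberately makes the full-size copies that broadcasting is designed to avoid. It does not show the signature or the implementation of `bc_ifelse()`.

``` r
# Hypothetical helper: materialise an array at the target dimensions by
# repeating the single index along every dimension of size 1.
expand_to <- function(a, target) {
  idx <- Map(
    function(d, t) if (d == 1L) rep(1L, t) else seq_len(d),
    dim(a), target
  )
  do.call(`[`, c(list(a), idx, list(drop = FALSE)))
}

test <- array(c(TRUE, FALSE, TRUE), c(3, 1))  # one condition per row
yes  <- array(1:4, c(1, 4))                   # one value per column
no   <- array(0L, c(1, 1))                    # a single fallback value

# Common result dimensions: the larger size along each dimension.
target <- pmax(dim(test), dim(yes), dim(no))

res <- ifelse(
  expand_to(test, target),
  expand_to(yes, target),
  expand_to(no, target)
)
print(res)
```

Rows where `test` is `TRUE` take the values of `yes`, and the remaining row is filled with the fallback value.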
‘broadcast’ provides casting functions that cast subset-groups of an
array to a new dimension, and that cast nested lists to dimensional
lists and vice-versa.
These functions are useful for facilitating complex broadcasted
operations, though they are also valuable outside of broadcasting.
For example, you cannot broadcast through hierarchies of a list, but you
**can** broadcast along dimensions. So suppose you have the following
list:
``` r
x <- list(
  student1 = list(
    homework1 = sample(0:100, 5),
    homework2 = sample(0:100, 5),
    homework3 = sample(0:100, 5)
  ),
  student2 = list(
    homework1 = sample(0:100, 5),
    homework2 = sample(0:100, 5),
    homework3 = sample(0:100, 5)
  ),
  student3 = list(
    homework1 = sample(0:100, 5),
    homework2 = sample(0:100, 5),
    homework3 = sample(0:100, 5)
  )
)
```
Since all values in the list are numbers, you might want to turn this
into a numeric array, to make mathematical computations and analyses on
it easier.
This can be done with the ‘broadcast’ package with the following steps.
First, turn the nested list into a shallow (i.e. non-nested),
dimensional list using `cast_hier2dim()`:
``` r
x2 <- cast_hier2dim(x, in2out = FALSE, direction.names = 1L)
print(x2)
#> homework1 homework2 homework3
#> student1 integer,5 integer,5 integer,5
#> student2 integer,5 integer,5 integer,5
#> student3 integer,5 integer,5 integer,5
```
Second, turn the shallow (i.e. non-nested), dimensional list into an
atomic array using `cast_shallow2atomic()`:
``` r
x3 <- cast_shallow2atomic(x2, 1L)
print(x3)
#> , , homework1
#>
#> student1 student2 student3
#> [1,] 67 6 73
#> [2,] 38 72 41
#> [3,] 0 78 37
#> [4,] 33 84 19
#> [5,] 86 36 27
#>
#> , , homework2
#>
#> student1 student2 student3
#> [1,] 42 88 19
#> [2,] 13 36 43
#> [3,] 81 33 86
#> [4,] 58 100 69
#> [5,] 50 43 39
#>
#> , , homework3
#>
#> student1 student2 student3
#> [1,] 96 78 43
#> [2,] 84 32 24
#> [3,] 20 83 69
#> [4,] 53 34 38
#> [5,] 73 69 50
```
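For comparison, a perfectly regular list like this one can also be converted in base ‘R’. The sketch below is an assumption-laden alternative, not what the casting functions do internally: it only works when every student has the same homeworks and every homework has the same length. `make_student()` is a hypothetical helper used to build the example data.

``` r
set.seed(1)

# Hypothetical helper that builds one student's homework scores.
make_student <- function() {
  list(
    homework1 = sample(0:100, 5),
    homework2 = sample(0:100, 5),
    homework3 = sample(0:100, 5)
  )
}
x <- list(
  student1 = make_student(),
  student2 = make_student(),
  student3 = make_student()
)

# Bind each student's homeworks into a 5 x 3 matrix, then stack the
# matrices into a 5 x homework x student array.
arr <- sapply(x, function(s) do.call(cbind, s), simplify = "array")

# Reorder to 5 x student x homework, the layout printed above.
arr <- aperm(arr, c(1, 3, 2))
print(dim(arr))

# With an atomic array, summaries are a single call,
# e.g. the mean score per student and homework:
print(apply(arr, c(2, 3), mean))
```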
A few Linear Algebra Functions for Statistics
‘broadcast’ comes with a few linear algebra functions for statistics.
For example, the `sd_lc()` function computes the standard deviation of
a linear combination of variables, regardless of how the variables are
distributed.
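The distribution-free claim rests on a standard identity: for weights `a` and a variance-covariance matrix `vc`, the variance of the linear combination is `t(a) %*% vc %*% a`. The base ‘R’ sketch below checks that identity on simulated data. It illustrates the underlying formula only; the data, the weights, and the variable names are made up for this example, and it makes no claim about the signature or the internals of `sd_lc()`.

``` r
set.seed(42)

# Simulated data: 100 observations of 3 variables (values are arbitrary).
Y  <- matrix(rnorm(300), ncol = 3)
vc <- cov(Y)            # sample variance-covariance matrix
a  <- c(2, -1, 0.5)     # hypothetical weights of the linear combination

# Standard deviation from the quadratic form t(a) %*% vc %*% a ...
sd_formula <- sqrt(drop(t(a) %*% vc %*% a))

# ... and directly from the combined variable itself.
sd_direct <- sd(drop(Y %*% a))

print(all.equal(sd_formula, sd_direct))
```

The two values agree up to floating-point error, and nothing in the calculation depends on the variables being normally distributed.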
The **Quick-Start Guide** can be found
[here](https://tony-aw.github.io/broadcast/vignettes/b_quickstart.html).
## 🤷🏽Why use ‘broadcast’
**Efficiency**
Broadcasting as implemented in the ‘broadcast’ package is about as
fast as - and sometimes even faster than - ‘NumPy’.
The implementations in the ‘broadcast’ package are also much faster and
much more memory efficient than using base ‘R’ solutions like
`sweep()`.
Efficient programs use less energy and fewer resources, and are thus
better for the environment.
Benchmarks can be found in the “About” section on the website.
**Convenience**
Have you ever been bothered by any of the following while programming in
‘R’:
- Receiving the “non-conformable arrays” error message in a simple array
operation, when it intuitively should work?
- Receiving the “cannot allocate vector of size…” error message because
‘R’ unnecessarily allocated too much memory in array operations?
- `abind::abind()` being too slow, or ruining the structure of recursive
arrays?
- The `sweep()` and `outer()` functions being too slow or too limiting?
- There being no array analogue to `data.table::dcast()`?
- Difficulties in handling deeply nested lists?
- Certain ‘NumPy’ operations having no equivalent operation in ‘R’?
If you answered “YES” to any of the above, ‘broadcast’ may be the ‘R’
package for you.
**Minimal Dependencies**
Besides linking to ‘Rcpp’, ‘broadcast’ does not depend on, vendor, link
to, include, or otherwise use any external libraries; ‘broadcast’ was
essentially made from scratch and can be installed out-of-the-box.
Not using external libraries brings a number of advantages:
- Avoid dependency hell.
- Avoid wasting time, memory and computing resources for translating
between language structures.
- Ensure consistent behaviour with the rest of ‘R’.
- Access to CRAN’s quality control (no comparable quality control
organization exists for most other programming languages).
See the [Other
Packages](https://tony-aw.github.io/broadcast/about/d_other_pkgs.html)
page for more details on the above points regarding the advantages of
minimizing dependencies.
**Tested**
The ‘broadcast’ package is frequently checked using a large suite of
unit tests via the [tinytest](https://github.com/markvanderloo/tinytest)
package. These tests have a
[coverage](https://tony-aw.github.io/broadcast/about/g_unit_test_covr.html)
of over 95%. So the chance of a function from this package breaking
completely is relatively low.
‘broadcast’ is still a relatively new package, however, so (small) bugs
are still very much possible. I encourage users who find bugs to report
them promptly to the
[issues](https://github.com/tony-aw/broadcast/issues) tab on the GitHub
page, and I will fix them as soon as time permits.
## 🔧Installation
``` r
install.packages("broadcast", type = "source")
```
## 📊Status
‘broadcast’ is now available on CRAN! 🎉
If you have any suggestions or feedback on the package, its
documentation, or even the benchmarks, I encourage you to let me know
(either as an [Issue](https://github.com/tony-aw/broadcast/issues) or a
[Discussion](https://github.com/tony-aw/broadcast/discussions)).
I’m eager to read your input!
## 📖Documentation
The documentation on the ‘broadcast’ website is divided into 3 main
parts:
- [Guides and
Vignettes](https://tony-aw.github.io/broadcast/vignettes/a_readme.html):
contains the topic-oriented guides in the form of a few Vignettes.
- [Reference
Manual](https://tony-aw.github.io/broadcast/man/aaa00_broadcast_help.html):
contains the function-oriented reference manual.
- [About](https://tony-aw.github.io/broadcast/about/a_acknowledgements.html):
contains the Acknowledgements, Change logs and License file. Here
you’ll also find some information regarding the relationship between
‘broadcast’ and other packages/modules. Benchmarks can also be found
here.