https://github.com/dkogan/numpysane

more-reasonable core functionality for numpy

* TALK
I just gave a talk about this at [[https://www.socallinuxexpo.org/scale/18x][SCaLE 18x]]. Here are the [[https://www.youtube.com/watch?v=YOOapXNtUWw][video of the talk]] and
the [[https://github.com/dkogan/talk-numpysane-gnuplotlib/raw/master/numpysane-gnuplotlib.pdf]["slides"]].

* NAME
numpysane_pywrap: Python-wrap C code with broadcasting awareness

* SYNOPSIS

Let's implement a broadcastable and type-checked inner product that is

- Written in C (i.e. it is fast)
- Callable from python using numpy arrays (i.e. it is convenient)

We write a bit of python to generate the wrapping code. "genpywrap.py":

#+BEGIN_EXAMPLE
import numpy as np
import numpysane as nps
import numpysane_pywrap as npsp

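m = npsp.module( name      = "innerlib",
                 docstring = "An inner product module in C")
m.function( "inner",
            "Inner product pywrapped with npsp",

            args_input       = ('a', 'b'),
            prototype_input  = (('n',), ('n',)),
            prototype_output = (),

            Ccode_slice_eval = \
                {np.float64:
                 r"""
                   double* out = (double*)data_slice__output;
                   const int N = dims_slice__a[0];

                   *out = 0.0;

                   for(int i=0; i<N; i++)
                     *out += *(const double*)(data_slice__a +
                                              i*strides_slice__a[0]) *
                             *(const double*)(data_slice__b +
                                              i*strides_slice__b[0]);
                   return true;""" })
m.write()
#+END_EXAMPLE

We run this, and save the generated code to "inner_pywrap.c":

#+BEGIN_EXAMPLE
python3 genpywrap.py > inner_pywrap.c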
#+END_EXAMPLE

We build this into a python module:

#+BEGIN_EXAMPLE
COMPILE=(`python3 -c "
import sysconfig
conf = sysconfig.get_config_vars()
print('{} {} {} -I{}'.format(*[conf[x] for x in ('CC',
                                                 'CFLAGS',
                                                 'CCSHARED',
                                                 'INCLUDEPY')]))"`)
LINK=(`python3 -c "
import sysconfig
conf = sysconfig.get_config_vars()
print('{} {} {}'.format(*[conf[x] for x in ('BLDSHARED',
                                            'BLDLIBRARY',
                                            'LDFLAGS')]))"`)
EXT_SUFFIX=`python3 -c "
import sysconfig
print(sysconfig.get_config_vars('EXT_SUFFIX')[0])"`

${COMPILE[@]} -c -o inner_pywrap.o inner_pywrap.c
${LINK[@]} -o innerlib$EXT_SUFFIX inner_pywrap.o
#+END_EXAMPLE

Here we used the build commands directly. This could be done with
setuptools/distutils instead; it's a normal extension module. And now we can
compute broadcasted inner products from a python script "tst.py":

#+BEGIN_EXAMPLE
import numpy as np
import innerlib
print(innerlib.inner( np.arange(4, dtype=float),
                      np.arange(8, dtype=float).reshape( 2,4)))
#+END_EXAMPLE

Running it to compute inner([0,1,2,3],[0,1,2,3]) and inner([0,1,2,3],[4,5,6,7]):

#+BEGIN_EXAMPLE
$ python3 tst.py
[14. 38.]
#+END_EXAMPLE

* DESCRIPTION
This module provides routines to python-wrap existing C code by generating C
sources that define the wrapper python extension module.

To create the wrappers we

1. Instantiate a new numpysane_pywrap.module class
2. Call module.function() for each wrapper function we want to add to this
   module
3. Call module.write() to write the C sources defining this module to standard
   output

The sources can then be built and executed normally, as any other python
extension module. The resulting functions are called as one would expect:

#+BEGIN_EXAMPLE
output = f_one_output (input0, input1, ...)
(output0, output1, ...) = f_multiple_outputs(input0, input1, ...)
#+END_EXAMPLE

depending on whether we declared a single output, or multiple outputs (see
below). It is also possible to pre-allocate the output array(s), and call the
functions like this (see below):

#+BEGIN_EXAMPLE
output = np.zeros(...)
f_one_output (input0, input1, ..., out = output)

output0 = np.zeros(...)
output1 = np.zeros(...)
f_multiple_outputs(input0, input1, ..., out = (output0, output1))
#+END_EXAMPLE

Each wrapped function is broadcasting-aware. The normal numpy broadcasting rules
(as described in 'broadcast_define' and on the numpy website:
http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) apply. In
summary:

- Dimensions are aligned at the end of the shape list, and must match the
  prototype

- Extra dimensions left over at the front must be consistent for all the
  input arguments, meaning:

  - All dimensions of length != 1 must match
  - Dimensions of length 1 match corresponding dimensions of any length in
    other arrays
  - Missing leading dimensions are implicitly set to length 1

- The output(s) have a shape where
  - The trailing dimensions match the prototype
  - The leading dimensions come from the extra dimensions in the inputs
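
The shape rules above can be sketched in plain Python. This is only an illustration: 'broadcast_output_shape' is a hypothetical helper, not part of numpysane or numpysane_pywrap, and it assumes that every input has at least as many dimensions as its prototype.

```python
from itertools import zip_longest

def broadcast_output_shape(prototype_input, prototype_output, shapes):
    """Hypothetical helper: apply the broadcasting rules to a set of input shapes."""
    named = {}   # named prototype dimension -> length seen so far
    extras = []  # leading (broadcast) dimensions of each input

    for proto, shape in zip(prototype_input, shapes):
        ncore = len(proto)
        if len(shape) < ncore:
            raise ValueError(f"shape {shape} has fewer dimensions than prototype {proto}")
        split = len(shape) - ncore
        # Trailing dimensions must match the prototype; a named dimension
        # must have the same length everywhere it appears
        for dim, length in zip(proto, shape[split:]):
            expected = named.setdefault(dim, length) if isinstance(dim, str) else dim
            if expected != length:
                raise ValueError(f"dimension {dim!r}: expected {expected}, got {length}")
        extras.append(shape[:split])

    # Leading dimensions are aligned at the end; missing ones count as length 1
    lead = []
    for lengths in zip_longest(*(reversed(e) for e in extras), fillvalue=1):
        distinct = {n for n in lengths if n != 1}
        if len(distinct) > 1:
            raise ValueError(f"leading dimensions {lengths} cannot be broadcast together")
        lead.append(distinct.pop() if distinct else 1)
    lead.reverse()

    core = tuple(named[d] if isinstance(d, str) else d for d in prototype_output)
    return tuple(lead) + core

# The inner product from the synopsis: shapes (4,) and (2,4) give 2 results
print(broadcast_output_shape((('n',), ('n',)), (), [(4,), (2, 4)]))
```

With the inner-product prototype, shapes (3,1,4) and (5,4) produce an output of shape (3,5), while shapes (5,) and (3,) are rejected because the two 'n' dimensions disagree.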

When we create a wrapper function, we only define how to compute a single
broadcasted slice. If the generated function is called with higher-dimensional
inputs, this slice code will be called multiple times. This broadcast loop is
produced by the numpysane_pywrap generator automatically. The generated code
also

- parses the python arguments
- generates python return values
- validates the inputs (and any pre-allocated outputs) to make sure the given
  shapes and types all match the declared shapes and types. For instance,
  computing an inner product of a 5-vector and a 3-vector is illegal
- creates the output arrays as necessary
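
As a rough model of what the generated code does around the slice function, here is a pure-numpy stand-in for the inner product. It is a sketch of the behavior described above, not the generated C: 'inner_slice' and 'inner_broadcast' are hypothetical names, and 'np.broadcast_shapes' needs a reasonably recent numpy.

```python
import numpy as np

def inner_slice(a, b):
    """Stand-in for the C slice code: inner product of two 1D slices."""
    return float(np.dot(a, b))

def inner_broadcast(a, b, out=None):
    """Hypothetical model of a wrapped function: validate, allocate, loop."""
    # Validate types and shapes; nothing is converted or copied implicitly
    for name, x in (("a", a), ("b", b)):
        if not isinstance(x, np.ndarray) or x.dtype != np.float64:
            raise TypeError(f"'{name}' must be a float64 numpy array")
        if x.ndim < 1:
            raise ValueError(f"'{name}' must have at least 1 dimension")
    if a.shape[-1] != b.shape[-1]:
        raise ValueError("the trailing dimension 'n' must match in 'a' and 'b'")

    # Leading dimensions of the inputs define the shape of the output
    lead = np.broadcast_shapes(a.shape[:-1], b.shape[:-1])

    # Create the output if needed; otherwise validate the given one
    if out is None:
        out = np.empty(lead, dtype=np.float64)
    elif out.shape != lead or out.dtype != np.float64:
        raise ValueError("pre-allocated output has the wrong shape or type")

    # The broadcast loop: one call of the slice code per output element
    a_b = np.broadcast_to(a, lead + a.shape[-1:])
    b_b = np.broadcast_to(b, lead + b.shape[-1:])
    for idx in np.ndindex(*lead):
        out[idx] = inner_slice(a_b[idx], b_b[idx])
    return out

print(inner_broadcast(np.arange(4, dtype=float),
                      np.arange(8, dtype=float).reshape(2, 4)))
```

Passing a 5-vector and a 3-vector raises a ValueError before any slice is evaluated, and passing 'out = np.zeros(2)' fills and returns that same array.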

This code-generator module does NOT produce any code to implicitly make copies
of the input. If the inputs fail validation (unknown types given, contiguity
checks failed, etc) then an exception is raised. Copying the input is
potentially slow, so we require the user to do that, if necessary.
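
If an input does fail validation, the caller can make the copy explicitly. A minimal sketch, assuming the wrapper in question accepts only C-contiguous float64 arrays ('as_wrapper_input' is a hypothetical helper, not part of this module):

```python
import numpy as np

def as_wrapper_input(x, dtype=np.float64):
    """Return x as a C-contiguous array of the given type, copying only if needed."""
    return np.ascontiguousarray(x, dtype=dtype)

strided = np.arange(8, dtype=np.float64)[::2]   # a view that skips every other element
print(strided.flags['C_CONTIGUOUS'])            # not contiguous

fixed = as_wrapper_input(strided)               # explicit copy into contiguous memory
print(fixed.flags['C_CONTIGUOUS'], fixed)

integers = np.arange(4)                         # integer type, not float64
print(as_wrapper_input(integers).dtype)
```

The cost of the copy is then visible at the call site, and arrays that are already suitable pass through without being copied.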

** Explicated example

In the synopsis we declared the wrapper module like this:

#+BEGIN_EXAMPLE
m = npsp.module( name      = "innerlib",
                 docstring = "An inner product module in C")
#+END_EXAMPLE

This produces a module named "innerlib". Note that the python importer will look
for this module in a file called "innerlib$EXT_SUFFIX" where EXT_SUFFIX comes
from the python configuration. This is normal behavior for python extension
modules.
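
To see which filename the importer expects on a particular system, the suffix can be queried from the standard library. This sketch uses only 'sysconfig' and 'importlib'; the module name "innerlib" is the one from the example above.

```python
import sysconfig
from importlib.machinery import EXTENSION_SUFFIXES

# The suffix used by the build commands in the synopsis
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")
filename = "innerlib" + ext_suffix
print(filename)

# All the suffixes this interpreter accepts for extension modules
print(EXTENSION_SUFFIXES)
```

The printed filename is platform-specific and interpreter-specific, so a module built for one python version will generally not be picked up by another.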

A module can contain many wrapper functions. Each one is added by calling
'm.function()'. We did this:

#+BEGIN_EXAMPLE
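m.function( "inner",
            "Inner product pywrapped with numpysane_pywrap",

            args_input       = ('a', 'b'),
            prototype_input  = (('n',), ('n',)),
            prototype_output = (),

            Ccode_slice_eval = \
                {np.float64:
                 r"""
                   double* out = (double*)data_slice__output;
                   const int N = dims_slice__a[0];

                   *out = 0.0;

                   for(int i=0; i<N; i++)
                     *out += *(const double*)(data_slice__a +
                                              i*strides_slice__a[0]) *
                             *(const double*)(data_slice__b +
                                              i*strides_slice__b[0]);
                   return true;""" })
#+END_EXAMPLE

Extra, non-broadcasted arguments can be given to the wrapped function as well.
With a 'scale' and a 'scale_string' argument declared through 'extra_args' (as
in the cookie example in the next section), and slice code that multiplies each
result by 'scale' and by 'atof(scale_string)', we get a scaled inner product:

#+BEGIN_EXAMPLE
>>> print(innerlib.inner( np.arange(4, dtype=float),
                          np.arange(8, dtype=float).reshape( 2,4),
                          scale_string = "1.0"))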
[14. 38.]

>>> print(innerlib.inner( np.arange(4, dtype=float),
np.arange(8, dtype=float).reshape( 2,4),
scale = 2.0,
scale_string = "10.0"))
[280. 760.]
#+END_EXAMPLE

** Precomputing a cookie outside the slice computation
Sometimes it is useful to generate some resource once, before any of the
broadcasted slices are evaluated. The slice evaluation code can then make use
of this resource. Example: allocating memory, opening files. This is supported
using a 'cookie'. We define a structure that contains data that will be
available to all the generated functions. This structure is initialized at the
beginning, used by the slice computation functions, and then cleaned up at the
end. This is most easily described with an example. The scaled inner product
demonstrated immediately above has an inefficiency: we compute
'atof(scale_string)' once for every slice, even though the string does not
change. We should compute the atof() ONCE, and use the resulting value each
time. And we can:

#+BEGIN_EXAMPLE
m.function( "inner",
            "Inner product pywrapped with numpysane_pywrap",

            args_input       = ('a', 'b'),
            prototype_input  = (('n',), ('n',)),
            prototype_output = (),
            extra_args = (("double",      "scale",        "1",    "d"),
                          ("const char*", "scale_string", "NULL", "s")),
            Ccode_cookie_struct = r"""
              double scale; /* from BOTH scale arguments: "scale", "scale_string" */
            """,
            Ccode_validate = r"""
              if(scale_string == NULL)
              {
                  PyErr_Format(PyExc_RuntimeError,
                               "The 'scale_string' argument is required" );
                  return false;
              }
              cookie->scale = *scale * (scale_string ? atof(scale_string) : 1.0);
              return true; """,
            Ccode_slice_eval = \
                {np.float64:
                 r"""
                   double* out = (double*)data_slice__output;
                   const int N = dims_slice__a[0];

                   *out = 0.0;

                   for(int i=0; i<N; i++)
                     *out += *(const double*)(data_slice__a +
                                              i*strides_slice__a[0]) *
                             *(const double*)(data_slice__b +
                                              i*strides_slice__b[0]);
                   *out *= cookie->scale;

                   return true;""" },

            # Cleanup, such as free() or close() goes here
            Ccode_cookie_cleanup = ''
          )
#+END_EXAMPLE

We defined a cookie structure that contains one element: 'double scale'. We
compute the scale factor (from BOTH of the extra arguments) before any of the
slices are evaluated: in the validation function. Then we apply the
already-computed scale with each slice. Both the validation and slice
computation functions have the whole cookie structure available in '*cookie'. It
is expected that the validation function will write something to the cookie, and
the slice functions will read it, but this is not enforced: this structure is
not const, and both functions can do whatever they like.

If the cookie initialization did something that must be cleaned up (like a
malloc() for instance), the cleanup code can be specified in the
'Ccode_cookie_cleanup' argument to function(). Note: this cleanup code is ALWAYS
executed, even if there were errors that raise an exception, EVEN if we haven't
initialized the cookie yet. When the cookie object is first initialized, it is
filled with 0, so the cleanup code can detect whether the cookie has been
initialized or not:

#+BEGIN_EXAMPLE
m.function( ...
            Ccode_cookie_struct = r"""
              ...
              bool initialized;
            """,
            Ccode_validate = r"""
              ...
              cookie->initialized = true;
              return true;
            """,
            Ccode_cookie_cleanup = r"""
              if(cookie->initialized) cleanup();
            """ )
#+END_EXAMPLE
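
The lifecycle of the cookie can be modeled in a few lines of plain Python. This is only a sketch of the order of operations described above, not the generated code: every name here is hypothetical, the zero-filled C structure is modeled with default field values, and 'float()' stands in for 'atof()' (unlike atof(), it raises an exception on malformed strings).

```python
from dataclasses import dataclass

@dataclass
class Cookie:
    # Models the zero-filled C structure
    scale: float = 0.0
    initialized: bool = False

def validate(cookie, scale, scale_string):
    """Runs once, before any slice: checks arguments and fills in the cookie."""
    if scale_string is None:
        return False
    cookie.scale = scale * float(scale_string)   # computed ONCE
    cookie.initialized = True
    return True

def slice_eval(cookie, a, b):
    """Runs once per broadcasted slice: only reads the cookie."""
    return cookie.scale * sum(x * y for x, y in zip(a, b))

def cleanup(cookie, log):
    """ALWAYS runs, so it has to check whether there is anything to release."""
    log.append("cleanup (initialized)" if cookie.initialized
               else "cleanup (nothing to release)")

def call_wrapped(a, b_slices, scale=1.0, scale_string=None, log=None):
    log = [] if log is None else log
    cookie = Cookie()
    try:
        if not validate(cookie, scale, scale_string):
            raise RuntimeError("The 'scale_string' argument is required")
        return [slice_eval(cookie, a, b) for b in b_slices]
    finally:
        cleanup(cookie, log)

print(call_wrapped([0, 1, 2, 3], [[0, 1, 2, 3], [4, 5, 6, 7]],
                   scale=2.0, scale_string="10.0"))
```

The 'try'/'finally' plays the role of the unconditional cleanup: if validation fails, the cleanup still runs, sees 'initialized' unset, and releases nothing.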

** Examples
For some sample usage, see the wrapper-generator used in the test suite:
https://github.com/dkogan/numpysane/blob/master/test/genpywrap.py

** Planned functionality
Currently, each broadcasted slice is computed sequentially. But since the slices
are inherently independent, this is a natural place to add parallelism.
Implementing this with something like OpenMP should be straightforward. I'll get
around to doing this eventually, but in the meantime, patches are welcome.

* COMPATIBILITY

Python 2 and Python 3 should both be supported. Please report a bug if either
one doesn't work.

* REPOSITORY

https://github.com/dkogan/numpysane

* AUTHOR

Dima Kogan

* LICENSE AND COPYRIGHT

Copyright 2016-2020 Dima Kogan.

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU Lesser General Public License (any version) as published by
the Free Software Foundation.

See https://www.gnu.org/licenses/lgpl.html