https://github.com/dkogan/numpysane

more-reasonable core functionality for numpy
Host: GitHub
URL: https://github.com/dkogan/numpysane
Owner: dkogan
License: other
Created: 2016-05-11T08:00:48.000Z (almost 9 years ago)
Default Branch: master
Last Pushed: 2023-12-23T23:50:11.000Z (about 1 year ago)
Last Synced: 2024-04-25T12:21:57.062Z (10 months ago)
Topics: broadcasting, linear-algebra, numpy, python-wrapper-api
Language: Python
Homepage:
Size: 629 KB
Stars: 28
Watchers: 5
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README-pywrap.org
- Changelog: Changes
- License: LICENSE

* TALK

I just gave a talk about this at [[https://www.socallinuxexpo.org/scale/18x][SCaLE 18x]]. Here are the [[https://www.youtube.com/watch?v=YOOapXNtUWw][video of the talk]] and
the [[https://github.com/dkogan/talk-numpysane-gnuplotlib/raw/master/numpysane-gnuplotlib.pdf]["slides"]].

* NAME

numpysane_pywrap: Python-wrap C code with broadcasting awareness

* SYNOPSIS

Let's implement a broadcastable and type-checked inner product that is

- Written in C (i.e. it is fast)

- Callable from python using numpy arrays (i.e. it is convenient)

We write a bit of python to generate the wrapping code. "genpywrap.py":

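#+BEGIN_EXAMPLE
import numpy     as np
import numpysane as nps
import numpysane_pywrap as npsp

m = npsp.module( name      = "innerlib",
                 docstring = "An inner product module in C")
m.function( "inner",
            "Inner product pywrapped with npsp",

            args_input       = ('a', 'b'),
            prototype_input  = (('n',), ('n',)),
            prototype_output = (),

            Ccode_slice_eval = \
                {np.float64:
                 r"""
                   double* out = (double*)data_slice__output;
                   const int N = dims_slice__a[0];

                   *out = 0.0;

                   for(int i=0; i<N; i++)
                     *out += *(const double*)(data_slice__a +
                                              i*strides_slice__a[0]) *
                             *(const double*)(data_slice__b +
                                              i*strides_slice__b[0]);
                   return true;""" })
m.write()
#+END_EXAMPLE

We run this, and save the output to "inner_pywrap.c":

#+BEGIN_EXAMPLE
python3 genpywrap.py > inner_pywrap.c
#+END_EXAMPLE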

We build this into a python module:

#+BEGIN_EXAMPLE

COMPILE=(`python3 -c "

import sysconfig

conf = sysconfig.get_config_vars()

print('{} {} {} -I{}'.format(*[conf[x] for x in ('CC',

                                                 'CFLAGS',

                                                 'CCSHARED',

                                                 'INCLUDEPY')]))"`)

LINK=(`python3 -c "

import sysconfig

conf = sysconfig.get_config_vars()

print('{} {} {}'.format(*[conf[x] for x in ('BLDSHARED',

                                            'BLDLIBRARY',

                                            'LDFLAGS')]))"`)

EXT_SUFFIX=`python3 -c "

import sysconfig

print(sysconfig.get_config_vars('EXT_SUFFIX')[0])"`

${COMPILE[@]} -c -o inner_pywrap.o inner_pywrap.c

${LINK[@]} -o innerlib$EXT_SUFFIX inner_pywrap.o

#+END_EXAMPLE

Here we used the build commands directly. This could be done with

setuptools/distutils instead; it's a normal extension module. And now we can

compute broadcasted inner products from a python script "tst.py":

#+BEGIN_EXAMPLE

import numpy as np

import innerlib

print(innerlib.inner( np.arange(4, dtype=float),

                      np.arange(8, dtype=float).reshape( 2,4)))

#+END_EXAMPLE

Running it to compute inner([0,1,2,3],[0,1,2,3]) and inner([0,1,2,3],[4,5,6,7]):

#+BEGIN_EXAMPLE

$ python3 tst.py

[14. 38.]

#+END_EXAMPLE
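
As a cross-check, the same two inner products can be reproduced in plain numpy, without the generated module. This is only a reference sketch: 'inner_reference' is a hypothetical helper name, not part of numpysane_pywrap, and it relies on numpy's own broadcasting instead of the generated C loop.

```python
import numpy as np

def inner_reference(a, b):
    """Broadcasted inner product over the last dimension (reference only)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape[-1] != b.shape[-1]:
        raise ValueError("trailing dimension 'n' must match")
    # Multiply elementwise (numpy broadcasts the leading dimensions),
    # then sum over the trailing dimension 'n'
    return np.sum(a * b, axis=-1)

result = inner_reference(np.arange(4, dtype=float),
                         np.arange(8, dtype=float).reshape(2, 4))
print(result)   # [14. 38.]
```

The values agree with the output of "tst.py" above: one inner product per row of the second argument.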

* DESCRIPTION

This module provides routines to python-wrap existing C code by generating C

sources that define the wrapper python extension module.

To create the wrappers we

1. Instantiate the numpysane_pywrap.module class
2. Call module.function() for each wrapper function we want to add to this
   module
3. Call module.write() to write the C sources defining this module to standard
   output

The sources can then be built and executed normally, as any other python

extension module. The resulting functions are called as one would expect:

#+BEGIN_EXAMPLE

output                  = f_one_output      (input0, input1, ...)

(output0, output1, ...) = f_multiple_outputs(input0, input1, ...)

#+END_EXAMPLE

depending on whether we declared a single output, or multiple outputs (see

below). It is also possible to pre-allocate the output array(s), and call the

functions like this (see below):

#+BEGIN_EXAMPLE

output = np.zeros(...)

f_one_output      (input0, input1, ..., out = output)

output0 = np.zeros(...)

output1 = np.zeros(...)

f_multiple_outputs(input0, input1, ..., out = (output0, output1))

#+END_EXAMPLE
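
The pre-allocated calling convention can be tried out without building anything, using a plain-numpy stand-in. This is a sketch under assumptions: 'inner_reference' is a hypothetical function that only mimics the 'out' keyword, and it says nothing about how the generated wrappers handle 'out' internally.

```python
import numpy as np

def inner_reference(a, b, out=None):
    """Stand-in for a wrapped function with one output (not generated code)."""
    # np.sum() writes into 'out' when it is given, and returns it
    return np.sum(np.asarray(a) * np.asarray(b), axis=-1, out=out)

a = np.arange(4, dtype=float)
b = np.arange(8, dtype=float).reshape(2, 4)

# The output holds one scalar per broadcasted slice, so its shape is just the
# leading (broadcast) dimensions: (2,)
output = np.zeros((2,), dtype=float)
returned = inner_reference(a, b, out=output)
print(output)   # [14. 38.]
```

The pre-allocated array must already have the full broadcasted shape and the right type; here that is a float array of shape (2,).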

Each wrapped function is broadcasting-aware. The normal numpy broadcasting rules

(as described in 'broadcast_define' and on the numpy website:

http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) apply. In

summary:

- Dimensions are aligned at the end of the shape list, and must match the
  prototype
- Extra dimensions left over at the front must be consistent for all the
  input arguments, meaning:
  - All dimensions of length != 1 must match
  - Dimensions of length 1 match corresponding dimensions of any length in
    other arrays
  - Missing leading dimensions are implicitly set to length 1
- The output(s) have a shape where
  - The trailing dimensions match the prototype
  - The leading dimensions come from the extra dimensions in the inputs
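
The rules above can be written down as a small pure-python shape calculator. This is an illustrative sketch only: 'broadcast_output_shape' is a hypothetical name, it handles only named prototype dimensions such as 'n', and it is not the code the generator emits.

```python
def broadcast_output_shape(prototype_input, prototype_output, *input_shapes):
    """Return the output shape implied by the broadcasting rules.

    prototype_input  : one tuple of named dims per input, e.g. (('n',), ('n',))
    prototype_output : tuple of named dims for the output, e.g. ()
    input_shapes     : the actual shape tuple of each input
    """
    named = {}    # length bound to each named prototype dimension
    extras = []   # leading (broadcast) dimensions of each input
    for proto, shape in zip(prototype_input, input_shapes):
        n = len(proto)
        if len(shape) < n:
            raise ValueError("input has too few dimensions for its prototype")
        # Trailing dimensions must match the prototype
        for name, length in zip(proto, shape[len(shape) - n:]):
            if named.setdefault(name, length) != length:
                raise ValueError("dimension %r is inconsistent" % name)
        extras.append(shape[:len(shape) - n])

    # Align the leading dimensions at the end; missing ones act as length 1
    ndim = max(len(e) for e in extras)
    leading = []
    for i in range(ndim):
        lengths = {e[len(e) - ndim + i] for e in extras
                   if len(e) - ndim + i >= 0}
        lengths.discard(1)            # length 1 matches any length
        if len(lengths) > 1:
            raise ValueError("leading dimensions do not broadcast")
        leading.append(lengths.pop() if lengths else 1)

    return tuple(leading) + tuple(named[name] for name in prototype_output)

print(broadcast_output_shape((('n',), ('n',)), (), (4,), (2, 4)))       # (2,)
print(broadcast_output_shape((('n',), ('n',)), (), (3, 1, 4), (2, 4)))  # (3, 2)
```

With the inner-product prototype, shapes (5,) and (3,) raise a ValueError in this sketch, because the named dimension 'n' cannot be 5 and 3 at once.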

When we create a wrapper function, we only define how to compute a single

broadcasted slice. If the generated function is called with higher-dimensional

inputs, this slice code will be called multiple times. This broadcast loop is

produced by the numpysane_pywrap generator automatically. The generated code

also

- parses the python arguments

- generates python return values

- validates the inputs (and any pre-allocated outputs) to make sure the given

  shapes and types all match the declared shapes and types. For instance,

  computing an inner product of a 5-vector and a 3-vector is illegal

- creates the output arrays as necessary

This code-generator module does NOT produce any code to implicitly make copies

of the input. If the inputs fail validation (unknown types given, contiguity

checks failed, etc) then an exception is raised. Copying the input is

potentially slow, so we require the user to do that, if necessary.
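
If a wrapped function rejects an input, the caller can make the copy explicitly before the call. The sketch below uses only standard numpy calls; 'prepare_input' is a hypothetical helper, and whether a particular wrapper actually requires contiguous data or a specific type depends on how it was declared.

```python
import numpy as np

def prepare_input(x, dtype=np.float64):
    """Return x as a C-contiguous array of the given dtype.

    np.ascontiguousarray() copies only when it has to: an array that is
    already contiguous and already has the right dtype is not copied.
    """
    return np.ascontiguousarray(x, dtype=dtype)

a = np.arange(8, dtype=np.float64)[::2]   # strided view: not contiguous
b = np.arange(4, dtype=np.int32)          # contiguous, but the wrong type

a_ok = prepare_input(a)                   # copied into contiguous memory
b_ok = prepare_input(b)                   # converted to float64
c_ok = prepare_input(a_ok)                # nothing to fix: no copy is made

print(a.flags['C_CONTIGUOUS'], a_ok.flags['C_CONTIGUOUS'])   # False True
print(b_ok.dtype)                                            # float64
```

Doing this at the call site keeps the cost of the copy visible, which is the point of not copying implicitly.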

** Explicated example

In the synopsis we declared the wrapper module like this:

#+BEGIN_EXAMPLE

m = npsp.module( name      = "innerlib",

                 docstring = "An inner product module in C")

#+END_EXAMPLE

This produces a module named "innerlib". Note that the python importer will look

for this module in a file called "innerlib$EXT_SUFFIX" where EXT_SUFFIX comes

from the python configuration. This is normal behavior for python extension

modules.

A module can contain many wrapper functions. Each one is added by calling

'm.function()'. We did this:

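#+BEGIN_EXAMPLE
m.function( "inner",
            "Inner product pywrapped with numpysane_pywrap",

            args_input       = ('a', 'b'),
            prototype_input  = (('n',), ('n',)),
            prototype_output = (),

            Ccode_slice_eval = \
                {np.float64:
                 r"""
                   double* out = (double*)data_slice__output;
                   const int N = dims_slice__a[0];

                   *out = 0.0;

                   for(int i=0; i<N; i++)
                     *out += *(const double*)(data_slice__a +
                                              i*strides_slice__a[0]) *
                             *(const double*)(data_slice__b +
                                              i*strides_slice__b[0]);
                   return true;""" })
#+END_EXAMPLE

[...]

With two extra, non-broadcasted arguments added to this function, 'scale' (a
double, defaulting to 1) and 'scale_string' (a string parsed with atof()), each
slice result is multiplied by both scale factors:

#+BEGIN_EXAMPLE
>>> print(innerlib.inner( np.arange(4, dtype=float),
                          np.arange(8, dtype=float).reshape( 2,4),
                          scale_string = "1.0"))
[14. 38.]

>>> print(innerlib.inner( np.arange(4, dtype=float),
                          np.arange(8, dtype=float).reshape( 2,4),
                          scale        = 2.0,
                          scale_string = "10.0"))
[280. 760.]
#+END_EXAMPLE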

** Precomputing a cookie outside the slice computation

Sometimes it is useful to generate some resource once, before any of the
broadcasted slices are evaluated. The slice evaluation code can then make use
of this resource. Examples: allocating memory, opening files. This is supported
using a 'cookie'. We define a structure that contains data that will be
available to all the generated functions. This structure is initialized at the
beginning, used by the slice computation functions, and then cleaned up at the
end. This is most easily described with an example. The scaled inner product
demonstrated immediately above has an inefficiency: we compute
'atof(scale_string)' once for every slice, even though the string does not
change. We should compute the atof() ONCE, and use the resulting value each
time. And we can:

#+BEGIN_EXAMPLE
m.function( "inner",
            "Inner product pywrapped with numpysane_pywrap",

            args_input       = ('a', 'b'),
            prototype_input  = (('n',), ('n',)),
            prototype_output = (),
            extra_args = (("double",      "scale",          "1",    "d"),
                          ("const char*", "scale_string",   "NULL", "s")),
            Ccode_cookie_struct = r"""
              double scale; /* from BOTH scale arguments: "scale", "scale_string" */
            """,
            Ccode_validate = r"""
                if(scale_string == NULL)
                {
                    PyErr_Format(PyExc_RuntimeError,
                        "The 'scale_string' argument is required" );
                    return false;
                }
                cookie->scale = *scale * (scale_string ? atof(scale_string) : 1.0);
                return true; """,
            Ccode_slice_eval = \
                {np.float64:
                 r"""
                   double* out = (double*)data_slice__output;
                   const int N = dims_slice__a[0];

                   *out = 0.0;

                   for(int i=0; i<N; i++)
                     *out += *(const double*)(data_slice__a +
                                              i*strides_slice__a[0]) *
                             *(const double*)(data_slice__b +
                                              i*strides_slice__b[0]);
                   *out *= cookie->scale;

                   return true;""" },

            # Cleanup, such as free() or close() goes here
            Ccode_cookie_cleanup = ''
)
#+END_EXAMPLE

We defined a cookie structure that contains one element: 'double scale'. We

compute the scale factor (from BOTH of the extra arguments) before any of the

slices are evaluated: in the validation function. Then we apply the

already-computed scale with each slice. Both the validation and slice

computation functions have the whole cookie structure available in '*cookie'. It

is expected that the validation function will write something to the cookie, and

the slice functions will read it, but this is not enforced: this structure is

not const, and both functions can do whatever they like.

If the cookie initialization did something that must be cleaned up (like a

malloc() for instance), the cleanup code can be specified in the

'Ccode_cookie_cleanup' argument to function(). Note: this cleanup code is ALWAYS

executed, even if there were errors that raise an exception, EVEN if we haven't

initialized the cookie yet. When the cookie object is first initialized, it is

filled with 0, so the cleanup code can detect whether the cookie has been

initialized or not:

#+BEGIN_EXAMPLE

m.function( ...

            Ccode_cookie_struct = r"""

              ...

              bool initialized;

            """,

            Ccode_validate = r"""

              ...

              cookie->initialized = true;

              return true;

            """,

            Ccode_cookie_cleanup = r"""

              if(cookie->initialized) cleanup();

            """ )

#+END_EXAMPLE
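
The order of operations described in this section (validate once, evaluate each slice, always clean up) can be mimicked in pure python. This is only an illustration of the control flow: 'run_broadcast' and the dict standing in for the cookie structure are hypothetical, and the real generated code is C.

```python
def run_broadcast(validate, slice_eval, cleanup, slices):
    """Mimic the cookie lifecycle: validate once, loop, always clean up."""
    cookie = {"initialized": False}   # stands in for the zero-filled struct
    try:
        if not validate(cookie):      # runs once, before any slice
            raise RuntimeError("validation failed")
        return [slice_eval(cookie, s) for s in slices]
    finally:
        cleanup(cookie)               # runs always, even after an error

log = []

def validate(cookie):
    cookie["scale"] = 2.0 * float("10.0")   # computed once, not per slice
    cookie["initialized"] = True
    return True

def slice_eval(cookie, s):
    a, b = s
    return cookie["scale"] * sum(x * y for x, y in zip(a, b))

def cleanup(cookie):
    # The cookie starts out "zeroed", so we can tell if it was initialized
    log.append("cleanup" if cookie["initialized"] else "nothing to clean up")

a = [0.0, 1.0, 2.0, 3.0]
out = run_broadcast(validate, slice_eval, cleanup,
                    [(a, [0.0, 1.0, 2.0, 3.0]), (a, [4.0, 5.0, 6.0, 7.0])])
print(out)   # [280.0, 760.0]
print(log)   # ['cleanup']
```

The 'finally' clause plays the role of 'Ccode_cookie_cleanup': it runs whether or not validation succeeded, which is why the cleanup code has to check the 'initialized' flag.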

** Examples

For some sample usage, see the wrapper-generator used in the test suite:

https://github.com/dkogan/numpysane/blob/master/test/genpywrap.py

** Planned functionality

Currently, each broadcasted slice is computed sequentially. But since the slices
are inherently independent, this is a natural place to add parallelism, and
implementing this with something like OpenMP should be straightforward. I'll get
around to doing this eventually, but in the meantime, patches are welcome.

* COMPATIBILITY

Python 2 and Python 3 should both be supported. Please report a bug if either

one doesn't work.

* REPOSITORY

https://github.com/dkogan/numpysane

* AUTHOR

Dima Kogan 

* LICENSE AND COPYRIGHT

Copyright 2016-2020 Dima Kogan.

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU Lesser General Public License (any version) as published by
the Free Software Foundation.

See https://www.gnu.org/licenses/lgpl.html