Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.
Awesome Lists | Featured Topics | Projects
https://github.com/pllua/pllua

Re-implementation of pllua, embedded Lua for postgresql
https://github.com/pllua/pllua
Last synced: 4 months ago
JSON representation
Re-implementation of pllua, embedded Lua for postgresql
Host: GitHub
URL: https://github.com/pllua/pllua
Owner: pllua
License: mit
Fork: true (RhodiumToad/pllua-ng)
Created: 2018-03-09T07:28:00.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2023-07-29T19:29:17.000Z (over 1 year ago)
Last Synced: 2024-08-01T03:19:07.102Z (7 months ago)
Language: C
Size: 818 KB
Stars: 145
Watchers: 13
Forks: 16
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project

README

        
pllua [![Build Status](https://github.com/pllua/pllua/workflows/Build/badge.svg)](https://github.com/pllua/pllua/actions?query=workflow%3ABuild)

=====

Embeds Lua into PostgreSQL as a procedural language module.

Please report any remaining bugs or missing functionality on github.

Currently it should build against (recent point releases of) pg

versions 9.5 to 16. It is known that this module will never work on pg

versions before 9.5 (we rely critically on memory context callbacks,

which were introduced in that version).

Lua 5.4 (only 5.4.2 onwards, excluding 5.4.5), Lua 5.3, and LuaJIT

2.1beta (with COMPAT52) are fully supported at this time. (Lua 5.4.0

and 5.4.1 are checked for at runtime and rejected, because they would

otherwise require runtime stack limit calculations; 5.4.5 is checked

for at compile and run time and rejected because of an incompatible

ABI/API change.)

Documentation is being migrated from this README to a more comprehensive

document:

[Documentation](https://pllua.github.io/pllua/)

COMPATIBILITY WITH 0.x (previous pllua-ng releases)

---------------------------------------------------

The server.* namespace is removed, to facilitate the compatibility

option described below.

server.error, server.debug, etc. are now available via spi.error,

spi.debug, or can be obtained from the 'pllua.elog' package via

require().  If you want, you can do:

        pllua.on_common_init='server = require "pllua.elog"'

to restore the previous behavior.

COMPATIBILITY WITH 1.x (old pllua)

----------------------------------

This version permits a degree of compatibility with pllua 1.x via

the following setting:

        pllua.on_common_init='require "pllua.compat"'

This creates the server.* namespace with functions that match the 1.x

calling conventions (e.g. passing args as tables). It also creates

global functions fromstring(), subtransaction(), debug(), log(),

info(), notice(), warning(), and setshared().

The following incompatibilities remain:

 + coroutine.yield() with no result values in 1.x ended execution of

   the SRF; in this version it returns a NULL row and continues.

 + The global error() is not modified, so using it to throw a table

   has the same effect it would have in plain Lua. In 1.x, a metatable

   was added to the thrown value in this case.

 + The pgfunc library is not supported at all.

 + The "readonly" parameter to server.execute and friends is ignored.

   All queries in a stable function are readonly, and all queries in

   a volatile function are read-write.

 + Returning multi-dimensional arrays by doing a simple return of a

   Lua table is no longer supported.

The pllua.compat module is implemented in pure Lua (inside the sandbox

in trusted mode), see src/compat.lua for the implementation.

Please report any incompatibilities discovered.

CHANGES

-------

*NOTE:* the name of the module is now just "pllua", and its extension

packaging is split into two according to usual practice for pl modules

(in this case "pllua" for the trusted language and "plluau" for the

untrusted language).

(Compared to the old pllua project:)

Some names and locations have been changed.

The old pllua.init table is gone. Instead we support three init

strings (superuser-only): pllua.on_init, pllua.on_trusted_init,

pllua.on_untrusted_init.

Note that the on_init string can be run in the postmaster process, by

including pllua in shared_preload_libraries. Accordingly, on_init

cannot do any database access, and the only function directly

available from this module is print(), which in this environment will

output to the server log as LOG: messages.

The on_init string can now access the trusted.allow() functionality,

but only by doing an explicit require 'pllua.trusted'. e.g.

      local t = require 'pllua.trusted'

      t.allow{ "lpeg", "re" }

SPI functionality is now in global table spi and has different calling

conventions:

      spi.execute("query text", arg, arg, ...)

      spi.execute_count("query text", maxrows, arg, arg, ...)

      spi.prepare("query text", {argtypes}, [{options}])

        - returns a statement object:

          s:execute(arg, arg, ...)  - returns a result table

          s:execute_count(maxrows, arg, arg, ...)  - returns a result table

          s:rows(arg, arg, ...) - returns iterator

          s:numargs() - returns integer

          s:argtype(argnum) - returns typeinfo

      spi.rows("query text", args...)

        - returns iterator

Execution now returns a table with no number keys (#t == 0) in the

event of no matching rows, whereas the old version returned nil. The

result is also currently a plain table, not an object.

spi.prepare takes an options table with these possible values:

      scroll = true or false

      no_scroll = true

      fast_start = true

      custom_plan = true

      generic_plan = true

      fetch_count = integer

Note that "scroll" and "no_scroll" are independent options to the

planner, but we treat { scroll = false } as if it were { no_scroll = true }

because not doing so would be too confusing. The fetch_count value is

used only by the :rows iterator, to determine how much prefetch to use;

the default is 50. (Smaller values might be desirable for fetching very

large rows, or a value of 1 disables prefetch entirely.)

Cursors work:

      spi.findcursor("name")   - find already-open portal by name

      spi.newcursor(["name"])  - find existing cursor or create new one

      s:getcursor(args)   - get cursor from statement (can't specify name)

      c:open(stmt,args)   - open a cursor

      c:open(query,args)  - open a cursor

      c:isopen()          - is it open

      c:name()

      c:fetch([n, [dir]])  - fetch n rows in dir (default: forward 1)

      c:move([n, [dir]])

There can only be one cursor object for a given open portal - doing a

findcursor on an existing cursor will always return the same object.

(But note that this matching is by portal, not name - if a cursor was

closed and reopened with the same name, findcursor will return a

different object for the new cursor.) If a cursor is closed by

external code (or transaction end), then the :isopen() state will be

automatically updated (this happens when the portal is actually

dropped). Cursor options are set on the statement object.

Refcursor parameters and results are transparently converted to and

from SPI cursor objects.

:save on a statement is now a no-op - all statements seen by lua code

have been passed through SPI_keepplan and are managed by Lua garbage

collection. (It was never safe to do otherwise.)

(SPI interface is particularly subject to change - in particular to

something more compatible with client-side database APIs)

print() is still a global function to print an informational message,

but other error levels such as debug, notice are installed as

spi.debug(), spi.warning() etc. spi.elog('error', ...) is equivalent

to spi.error(...) and so on.

spi.error() and friends can take optional args:

      spi.error('message')

      spi.error('sqlstate', 'message')

      spi.error('sqlstate', 'message', 'detail')

      spi.error('sqlstate', 'message', 'detail', 'hint')

      spi.error({ sqlstate = ?,

                  message = ?,

                  detail = ?,

                  hint = ?,

                  table = ?,

                  column = ?, ...})

Sqlstates can be given either as 5-character strings or as the string

names used in plpgsql: spi.error('invalid_argument', 'message')

Subtransactions are implemented via pcall() and xpcall(), which now

run the called function in a subtransaction. In the case of xpcall,

the subtransaction is ended *before* running the error function, which

therefore runs in the outer subtransaction. This does mean that while

Lua errors in the error function cause recursion and are eventually

caught by the xpcall, if an error function causes a PG error then the

xpcall will eventually rethrow that to its own caller. (This is

subject to change if I decide it was a bad idea.)

e.g.

      local ok,err = pcall(function() --[[ do stuff in subxact ]] end)

      if not ok then print("subxact failed with error",err) end

Currently there's also an lpcall function which does NOT create

subtransactions, but which will catch only Lua errors and not PG

errors (which are immediately rethrown). It's not clear yet how useful

this is; it saves the (possibly significant) subxact overhead, but it

can be quite unpredictable whether any given error will manifest as a

Lua error or a PG error.

The readonly-global-table and setshared() hacks are omitted. As the

trusted language now creates an entirely separate lua_State for each

calling userid, anything the user does in the global environment can

only affect themselves.

Type handling is all different. The global fromstring() is replaced by

the pgtype package/function:

      pgtype(d)

        -- if d is a pg datum value, returns an object representing its

           type

      pgtype(d,n)

        -- if d is a datum, as above; if not, returns the object

           describing the type of argument N of the function, or the

           return type of the function if N==0

      pgtype['typename']

      pgtype.typename

      pgtype(nil, 'typename')

        -- parse 'typename' as an SQL type and return the object for it

      pgtype.array.typename

      pgtype.array['typename']

        -- return the type for "typename[]" if it exists

The object representing a type can then be called as a constructor for

datum objects of that type:

      pgtype['mytablename'](col1,col2,col3)

      pgtype['mytablename']({ col1 = val1, col2 = val2, col3 = val3})

      pgtype.numeric(1234)

      pgtype.date('2017-12-01')

      pgtype.array.integer(1,2,3,4)

      pgtype.array.integer({1,2,3,4}, 4)        -- dimension mandatory

      pgtype.array.integer({{1,2},{3,4}},2,2)   -- dimensions mandatory

      pgtype.numrange(1,2)        -- range type constructor

      pgtype.numrange(1,2,'[]')   -- range type constructor

or the :fromstring method can be used:

      pgtype.date:fromstring('string')

In turn, datum objects of composite type can be indexed by column

number or name:

      row.foo  -- value of column "foo"

      row[3]   -- note this is attnum=3, which might not be the third

                  column if columns have been dropped

Arrays can be indexed normally as a[1] or a[3][6] etc. By default

array indexes in PG start at 1, but values starting at other indexes

can be constructed. One-dimensional arrays (but not higher dimensions)

can be extended by adding elements with indexes outside the current

bounds; ranges of unassigned elements between assigned ones contain

NULL.

tostring() works on any datum and returns its string representation.

pairs() works on a composite datum (and actually returns the attnum as a

third result):

      for colname,value,attnum in pairs(row) do ...

The result is always in column order.

ipairs() should NOT be used on a composite datum since it will stop at

a null value or dropped column.

Arrays, composite types, and jsonb values support a mapping operation

controlled by a configuration table:

      rowval{ map = function(colname,value,attno,row) ... return value end,

              null = (any value, default nil),

              discard = (boolean, default false)

            }

      arrayval{ map = function(elem,array,i,j,k...) ... return elem end,

                null = (any value, default nil),

                discard = (boolean, default false)

              }

      jsonbval{ map = function(key,val,...) ... return key,val end,

                null = (any value, default nil),

                discard = (boolean, default false),

                pg_numeric = (boolean, default false)

              }

The result in all cases is returned as a Lua table, not a datum,

unless the "discard" option was given as true, in which case no

result at all is returned.

The map function for arrays is passed as many indexes as the original

array dimension.

The map function for jsonb values is passed the path leading up to the

current key (not including the key) as separate additional parameters.

The key is an integer if the current container is an array, a string

if the container is an object, and nil if this is a single top-level

scalar value (which I believe is not strictly allowed in the json

spec, but pg allows it). The key/val returned by the function are used

to store the result, but do not affect the path values passed to any

other function call. If discard is not specified, then the function is

also called for completed containers (in which case val will be a

table). If pg_numeric is not true, then numeric values are converted

to Lua numbers, otherwise they remain as Datum values of numeric type

(for which see below).

Substitution of null values happens BEFORE the mapping function is

called; if that's not what you want, then do the substitution yourself

before returning the result. (If the mapping function itself returns a

Lua nil, then the entry will be omitted from the result.)

As a convenience shorthand, these work:

      d(nvl)   -> d{null = nvl}

      d(func)  -> d{map = func}

      d()      -> d{}

Jsonb supports an inverse mapping operation for construction of json

values from lua data:

      pgtype.jsonb(value,

                   { map = function(val) ... return val end,

                     null = (any value, default nil),

                     empty_object = (boolean, default false)

                     array_thresh = (integer, default 1000)

                     array_frac = (integer, default 1000)

                   }

"value" can be composed of any combination of (where "collection"

means a value which is either a table or possesses a __pairs

metamethod):

 + Empty collections, which will convert to empty json arrays unless

   empty_object=true in which case they become empty objects

 + Collections with only integer keys >= 1, which will convert to json

   arrays (with lua index 1 becoming json index 0) unless either more

   than array_thresh initial null values would have to be inserted, or

   the total size of the array would be more than array_frac times the

   number of table keys.

 + Collections with keys which can be stringified: strings or numbers, or

   tables or userdata with __tostring methods, will convert to json

   objects.

 + Values which compare raw-equal to the "null" parameter are converted

   to json nulls

 + Values of type nil, boolean, number, string are converted to

   corresponding json values

 + Datum values of type pgtype.numeric convert to json numbers

 + Datum values of other types convert to json in the same way as they

   do in SQL; in particular, jsonb and json values are included

   directly, and values with casts to jsonb have those casts

   respected

 + Values of other types that possess a __tostring metamethod are

   converted to strings

Unlike the other mapping functions, the map function for this

operation is called only for values (including collections), not keys,

and is not passed any path information.

Range types support the following pseudo-columns (immutable):

      r.lower

      r.upper

      r.lower_inc

      r.upper_inc

      r.lower_inf

      r.upper_inf

      r.isempty

Function arguments are converted to simple Lua values in the case of:

 + integers, floats  -- passed as Lua numbers

 + text, varchar, char, json (not jsonb), xml, cstring, name -- all passed

   as strings (with the padding preserved in the case of char(n))

 + enums  -- passed as the text label

 + bytea  -- passed as a string without any escaping or conversion

 + boolean  -- passed as boolean

 

 + nulls of any type  -- passed as nil

 + refcursor values are converted to or from SPI cursor objects

   (whether or not they correspond to open portals)

 + domains over any of the above are treated as the base types

Other values are kept as datum objects.

The trusted language is implemented differently - rather than removing

functions and packages, the trusted language evaluates all

user-supplied code (everything but the init strings) in a separate

environment table which contains only whitelisted content. A mini

version of the package library is installed in the sandbox

environment, allowing package.preload and package.searchers to work

(the user can install their own function into package.searchers to

load modules from database queries if they so wish).

See the main documentation for details on making additional modules

available to the trusted language.

A set-returning function isn't considered to end until it either

returns or throws an error; yielding with no results is considered the

same as yielding with explicit nils. (Old version killed the thread in

that scenario.) A set-returning function that returns on the first

call with no result is treated as returning 0 rows, but if the first

call returns values, those are treated as the (only) result row.

Trigger functions no longer have a global "trigger" object, but rather

are compiled with the following definition:

      function(trigger,old,new,...) --[[ body here ]] end

"trigger" is now a userdata, not a table, but can be indexed as

before.  Trigger functions may assign a row to trigger.row, or modify

fields of trigger.row or trigger.new, or may return a row or table; if

they do none of these and return nothing, they're treated as returning

trigger.row unchanged. Note that returning nil or assigning row=nil to

suppress the triggered operation is in general a bad idea; if you need

to prevent an action, then throw an error instead.

An interface to pg's "numeric" type (decimal arithmetic) is provided;

see the main documentation for details.

Polymorphic and variadic functions are fully supported, including

VARIADIC "any". VARIADIC of non-"any" type is passed as an array as

usual.

Interpreters are shut down on backend exit, meaning that finalizers

will be run for all objects at this time (including user-defined ones).

Currently, SPI functionality is disabled during exit.

AUTHOR

------

Andrew Gierth, aka RhodiumToad

The author acknowledges the work of Luis Carvalho and other contributors

to the original pllua project (of which this is a ground-up redesign).

License: MIT license