https://github.com/klmitch/turnstile
A distributed rate limiting WSGI middleware.
- Host: GitHub
- URL: https://github.com/klmitch/turnstile
- Owner: klmitch
- License: apache-2.0
- Created: 2012-02-10T05:16:36.000Z (almost 13 years ago)
- Default Branch: master
- Last Pushed: 2013-09-20T23:21:02.000Z (about 11 years ago)
- Last Synced: 2024-11-07T05:52:25.390Z (about 1 month ago)
- Language: Python
- Homepage:
- Size: 851 KB
- Stars: 104
- Watchers: 5
- Forks: 6
- Open Issues: 2
Metadata Files:
- Readme: README.rst
- License: LICENSE
==============================================
Turnstile Distributed Rate-Limiting Middleware
==============================================

Turnstile is a piece of WSGI middleware that performs true distributed
rate-limiting. System administrators can run an API on multiple
nodes, then place this middleware in the pipeline prior to the
application. Turnstile uses a Redis database to track the rate at
which users are hitting the API, and can then apply configured rate
limits, even if each request was made against a different API node.

Installing Turnstile
====================

Turnstile can be easily installed like many Python packages, using
`PIP`_::

    pip install turnstile

You can install the dependencies required by Turnstile by issuing the
following command from within your Turnstile source directory::

    pip install -r .requires

If you would like to run the tests, you can install the additional
test dependencies in the same way::

    pip install -r .test-requires

Then, to run the test suite, use::

    nosetests -v

Alternatively, it is possible to run the full test suite using a
virtual environment using the tox tool; this is the recommended way
for developers to run the test suite. Four environments are defined:
"py26" and "py27" run the tests under Python 2.6 and Python 2.7,
respectively; "pep8" runs the pep8 style compliance tool (which should
only be done by developers); and "cover" runs the test suite under the
default Python installation, but with coverage enabled. The coverage
report generated by the "cover" environment is summarized in the HTML
files present in the "cov_html" subdirectory. An example tox
invocation::

    tox -e py27,pep8

Adding and Configuring Turnstile
================================

Turnstile is intended for use with PasteDeploy-style configuration
files. It is a filter, and should be placed in an appropriate place
in the WSGI pipeline such that the limit classes used with Turnstile
can access the information necessary to make rate-limiting decisions.
(With the ``turnstile.limits:Limit`` class provided by Turnstile, no
additional information is required, as that class does not
differentiate between users of your application.)

The filter section of the PasteDeploy configuration file will also
need to contain enough information to allow Turnstile to access the
Redis database. Other options may be configured from here as well,
such as the ``enable`` configuration variable. The simplest example
of a Turnstile configuration would be::

    [filter:turnstile]
    use = egg:turnstile#turnstile
    redis.host =

The following are the recognized configuration options:

compactor.compactor_key
Specifies the sorted set that the compactor daemon uses for
communication of buckets that need to be compacted. (See below for
more information about the purpose of the compactor daemon.) This
option defaults to "compactor".

compactor.compactor_lock
When multiple compactor daemons are being run, it is necessary to
serialize their access to the sorted set specified by
``compactor.compactor_key``. This option specifies a Redis key
containing the lock, and it defaults to "compactor_lock".

compactor.compactor_timeout
If a compactor daemon (or its host) crashes while holding the lock,
the lock will eventually time out, to allow other compactor daemons
to run. This option specifies the timeout in seconds, and defaults
to 30.

compactor.max_age
The bucket processing logic adds special "summarize" records to the
bucket representation, to signal to other Turnstile instances that a
request to summarize the bucket has been submitted. These records
must age for a minimum amount of time, to ensure that all Turnstile
instances have seen them, before the compactor daemon can run on the
bucket. However, if the summarize request to the compactor daemon
is lost, there must be a timeout, to ensure that a new request to
summarize a given bucket may be submitted. This option specifies a
maximum age for a "summarize" record, in seconds, and defaults to
600.

compactor.max_updates
The bucket processing logic adds special "summarize" records to the
bucket representation, to signal to other Turnstile instances that a
request to summarize the bucket has been submitted. These requests
are generated when the number of update records in the bucket
representation exceeds the value specified by this option. This
option must be specified to enable the compaction logic; a good value
would be 30.

compactor.min_age
The bucket processing logic adds special "summarize" records to the
bucket representation, to signal to other Turnstile instances that a
request to summarize the bucket has been submitted. These records
must age for a minimum amount of time, to ensure that all Turnstile
instances have seen them, before the compactor daemon can run on the
bucket. This option specifies the minimum age for a "summarize"
record, in seconds, and defaults to 30.

compactor.sleep
The compactor daemon reads bucket keys from a sorted set in the
Redis database. If no keys are present, it will read from the
sorted set again, in a loop. To ensure that the compactor daemon
does not consume too much CPU time, after each read that returns no
bucket to compact, it will sleep for the number of seconds defined
by this option. The default is 5.

config
Allows specification of an alternate configuration file. This can
be used to generate a single file which can be shared by WSGI
servers using the Turnstile middleware and the various provided
tools. This can also allow for separation of code-related options,
such as the ``enable`` option, from pure configuration, such as the
``redis.host`` option. The configuration file is an INI-formatted
file, with section names corresponding to the first segment of the
configuration option name. That is, the ``redis.host`` option would
be set as follows::

    [redis]
    host =

Configuration options which have no prefix are grouped under the
``[turnstile]`` section of the file, as follows::

    [turnstile]
    status = 404 Not Found

Note that specifying the ``config`` option in the ``[turnstile]``
section will have no effect; it is not possible to cause another
configuration file to be included in this way.

control.channel
Specifies the channel that the control daemon listens on. (See
below for more information about the purpose of the control daemon.)
This option defaults to "control".

control.errors_channel
Specifies the channel that the control daemon (see below) reports
errors to. This option defaults to "errors".

control.errors_key
Specifies the key of a set in the Redis database in which errors
will be stored. This option defaults to "errors".

control.limits_key
The key under which the limits are stored in the database. See the
section on tools for more information on how to load and dump the
limits stored in the Redis database. This option defaults to
"limits".

control.node_name
The name of the node. If provided, this option allows the
specification of a recognizable name for the node. Currently, this
node name is only reported when issuing a "ping" command to the
control daemon (see below), and may be used to verify that all hosts
responded to the ping.

control.reload_spread
When limits are changed in the database, a command is sent to the
control daemon (see below) to cause the limits to be reloaded. As
having all nodes hit the Redis database simultaneously may overload
the database, this option, if set, allows the reload to be spread
out randomly within a configured interval. This option should be
set to the size of the desired interval, in seconds. If not set,
limits will be reloaded immediately by all nodes.

control.remote
If set to "on", "yes", "true", or "1", Turnstile will connect to a
remote control daemon (see the ``remote_daemon`` tool described
below). This enables Turnstile to be compatible with WSGI servers
which use multiple worker processes. Note that the configuration
values ``control.remote.authkey``, ``control.remote.host``, and
``control.remote.port`` are required.

control.remote.authkey
Set to an authentication key, for use when ``control.remote`` is
enabled. Must be the value used by the invocation of
``remote_daemon``.

control.remote.host
Set to a host name or IP address, for use when ``control.remote`` is
enabled. Must be the value used by the invocation of
``remote_daemon``.

control.remote.port
Set to a port number, for use when ``control.remote`` is enabled.
Must be the value used by the invocation of ``remote_daemon``.

control.shard_hint
Can be used to set a sharding hint which will be provided to the
listening thread of the control daemon (see below). This hint is
not used by the default Redis ``Connection`` class.

enable
Contains a list of ``turnstile.preprocessor`` and
``turnstile.postprocessor`` entrypoint names. Each name is resolved
into a preprocessor and postprocessor function (missing entrypoints
are ignored) and installed, as with the ``preprocess`` and
``postprocess`` configuration options. Note that the postprocessors
will be in the reverse ordering of the list contained in this
option. See the section on entrypoints for more information.

Note that, if ``enable`` is used, ``preprocess`` and ``postprocess``
will be ignored.

formatter
In previous versions of Turnstile, the only way to change the way
the delay response was generated was to subclass
``turnstile.middleware.TurnstileMiddleware`` and override the
``format_delay()`` method; this subclass could then be used by
specifying it as the value of the ``turnstile`` option. This
version now allows the formatter to be explicitly specified, using
this option.

Searches for the formatter in the ``turnstile.formatter`` entrypoint
group; see the section on entrypoints for more information.

postprocess
Contains a list of postprocessor functions. During each request,
each postprocessor will be called in turn, with the middleware
object (from which can be obtained the database handle, as well as
the configuration) and the request environment as arguments. Note
that any exceptions thrown by the postprocessors will not be caught,
and request processing will be halted; this will likely result in a
500 error being returned to the user. Postprocessors are only run
after processing all limits; most applications will not need to
install a postprocessor.

Searches for the postprocessor in the ``turnstile.postprocessor``
entrypoint group; see the section on entrypoints for more
information.

Note that, if ``enable`` is used, this option will be ignored.

preprocess
Contains a list of preprocessor functions. During each request,
each preprocessor will be called in turn, with the middleware object
(from which can be obtained the database handle, as well as the
configuration) and the request environment as arguments. Note that
any exceptions thrown by the preprocessors will not be caught, and
request processing will be halted; this will likely result in a 500
error being returned to the user. Preprocessors are run before
processing the limits.

Searches for the preprocessor in the ``turnstile.preprocessor``
entrypoint group; see the section on entrypoints for more
information.

Note that, if ``enable`` is used, this option will be ignored.
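
As a rough illustration of the ``preprocess`` and ``postprocess``
options described above, the sketch below shows a pair of functions
with the documented ``(middleware, environ)`` calling convention. The
``example.*`` environment keys, the ``X-User-Id`` header, and the
``FakeMiddleware`` stand-in are hypothetical and are not part of
Turnstile; they exist only so that the sketch runs on its own.

```python
# Hypothetical preprocessor/postprocessor pair.  Only the
# (middleware, environ) signature comes from the text above; the
# environ keys and the header name are made up for illustration.


def tag_user(middleware, environ):
    """Preprocessor: copy a user id header into the WSGI environment,
    where a limit class could later read it."""
    environ['example.user'] = environ.get('HTTP_X_USER_ID', 'anonymous')


def mark_passed(middleware, environ):
    """Postprocessor: record that the request was not rate-limited."""
    environ['example.passed'] = True


class FakeMiddleware(object):
    """Stand-in for the real middleware object (an assumption)."""
    conf = {}


environ = {'HTTP_X_USER_ID': 'alice'}
for proc in (tag_user, mark_passed):
    proc(FakeMiddleware(), environ)
print(environ['example.user'], environ['example.passed'])
```

Because exceptions raised by these functions are not caught, a real
preprocessor would probably want to fall back to a default, as
``tag_user`` does, rather than raise on missing data.
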
redis.connection_pool
Identifies the connection pool class to use. If not provided,
defaults to ``redis.ConnectionPool``. This may be used to allow
client-side sharding of the Redis database.

Searches for the connection pool class in the
``turnstile.connection_pool`` entrypoint group; see the section on
entrypoints for more information.

redis.connection_pool.connection_class
Identifies the connection class to use. If not provided, the
appropriate ``redis.Connection`` subclass for the configured
connection is used (``redis.Connection`` if ``redis.host`` is
specified, else ``redis.UnixDomainSocketConnection``).

Searches for the connection class in the
``turnstile.connection_class`` entrypoint group; see the section on
entrypoints for more information.

redis.connection_pool.max_connections
Allows specification of the maximum number of connections to the
Redis database. Optional.

redis.connection_pool.parser_class
Identifies the parser class to use. Optional. This is an advanced
feature of the ``redis`` package used by Turnstile.

Searches for the parser class in the ``turnstile.parser_class``
entrypoint group; see the section on entrypoints for more
information.

redis.connection_pool.*
Any other configuration value provided in the
``redis.connection_pool.`` hierarchy will be passed as keyword
arguments to the configured connection pool class. Note that the
values will be passed as strings.

redis.db
Identifies the specific sub-database of the Redis database to be
used by Turnstile. If not provided, defaults to 0.

redis.host
Identifies the host name or IP address of the Redis database to
connect to. Either ``redis.host`` or ``redis.unix_socket_path``
must be provided.

redis.password
If the Redis database has been configured to use a password, this
option allows that password to be specified.

redis.port
Identifies the port the Redis database is listening on. If not
provided, defaults to 6379.

redis.redis_client
Identifies a ``redis.StrictRedis`` subclass or analog, which will be
used as the client library for communicating with the Redis
database. This allows alternate clients which support clustering or
sharding to be used by Turnstile.

Searches for the client class in the ``turnstile.redis_client``
entrypoint group; see the section on entrypoints for more
information.

redis.socket_timeout
If provided, specifies an integer socket timeout for the Redis
database connection.

redis.unix_socket_path
Names the UNIX socket on the local host for the local Redis database
to connect to. Either ``redis.host`` or ``redis.unix_socket_path``
must be provided.

status
Contains the status code to return if rate limiting is tripped.
This defaults to "413 Request Entity Too Large". Note that this
value must start with the 3-digit HTTP code, followed by a space and
the text corresponding to that status code. Also note that,
regardless of the status code, Turnstile will include the
``Retry-After`` header in the response. (The value of the
``Retry-After`` header will be the integer number of seconds until
the request can be retried.)

turnstile
If set, identifies an alternate class to use for the Turnstile
middleware. This can be used in conjunction with subclassing
``turnstile.middleware:TurnstileMiddleware``, which may be done to
override how over-limit conditions are formatted.

Searches for the middleware class in the ``turnstile.middleware``
entrypoint group; see the section on entrypoints for more
information.

This option is deprecated. To override the delay formatting
function, use the ``formatter`` option.

Other configuration values are available to the preprocessors, the
postprocessors, the delay formatters, and the
``turnstile.limits:Limit`` subclasses, but extreme care should be
taken that such configurations remain in sync across the entire
cluster.

Entrypoints
===========

Turnstile takes many options which allow functions or classes to be
specified, as indicated above. All of these options expect their
values to be given in one of two forms. The first form, which was the
only valid format for older versions of Turnstile, is the
"module:name" format. However, Turnstile now has support for the
``pkg_resources`` "entrypoint" abstraction, which allows packages to
define a set of entrypoints. Entrypoints are organized into groups,
all having a similar interface; and each entrypoint has a given name.
To use a function or class which has a declared entrypoint, simply use
the name of that entrypoint. (Note that names are prohibited from
containing colons, to distinguish between the two formats.)

Turnstile recognizes the entrypoint groups described below.
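
The distinction between the two value formats can be sketched in a
few lines of Python. This is only an illustration of the rule stated
above (a colon means "module:name", anything else is an entrypoint
name); it is not Turnstile's own loader, and the function names are
hypothetical. The entrypoint lookup itself is left as a comment, since
it depends on which packages are installed.

```python
import importlib


def split_spec(value):
    """Classify an option value by the presence of a colon."""
    if ':' in value:
        module, _, name = value.partition(':')
        return ('module', module, name)
    # An entrypoint name would be looked up within its group, for
    # example with pkg_resources.iter_entry_points(group, name).
    return ('entrypoint', value)


def load_module_spec(value):
    """Import the object named by a "module:name" value."""
    module, _, name = value.partition(':')
    return getattr(importlib.import_module(module), name)


print(split_spec('os.path:join'))
print(split_spec('my_formatter'))
print(load_module_spec('os.path:join'))
```

The standard-library ``os.path:join`` is used here purely as a value
that is guaranteed to be importable; a real configuration would name
a formatter, limit class, or similar object.
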
turnstile.command
The control daemon accepts commands from remote callers. One of
these commands is the "reload" command, which causes Turnstile to
reload the limits configuration from the Redis database. A second
built-in command is the "ping" command, which can be used to ensure
all Turnstile instances are receiving command messages. It is
possible to create additional commands by associating the command
string with a function under this entrypoint group. The function
has the following signature::

    def func(daemon, *args):
        pass

The first argument will be the actual control daemon (which could be
either a ``turnstile.control.ControlDaemon`` or a
``turnstile.remote.RemoteControlDaemon``); the remaining arguments
are the arguments passed to the command. See the
``turnstile_command`` tool for a way to submit arbitrary commands of
this form.

turnstile.connection_class
The default Redis database client uses either a
``redis.UnixDomainSocketConnection`` or a ``redis.Connection``
object to maintain the connection to the Redis database. The
``redis.connection_pool.connection_class`` configuration value
allows this default to be overridden. Alternate classes will be
searched for in this entrypoint group, if there is no colon (":")
present in the configuration value. See the documentation for
``redis.Connection`` for details on this interface.

turnstile.connection_pool
The default Redis database client maintains connections in a pool,
maintained as a ``redis.ConnectionPool`` object. The
``redis.connection_pool`` configuration value allows this default to
be overridden. Alternate classes will be searched for in this
entrypoint group, if there is no colon (":") present in the
configuration value. See the documentation for
``redis.ConnectionPool`` for details on this interface.

turnstile.formatter
When the rate limiting logic determines that the request is
rate-limited, Turnstile generates a response indicating that the
REST client should try again after a certain delay. This response
can be formatted in any desired way by using the ``formatter``
configuration option to specify an alternate function, which will be
searched for under this entrypoint group. The formatter function
has the following signature::

    def formatter(status, delay, limit, bucket, environ, start_response):
        pass

The ``status`` is the configured status code for this Turnstile
instance. The ``delay`` is a float value, specifying the length of
the required delay in seconds. The ``limit`` and ``bucket`` values
specify the actual underlying ``turnstile.limits.Limit`` and
``turnstile.limits.Bucket`` subclasses associated with that delay;
alternate formatters can use the ``turnstile.limits.Limit.format()``
method to obtain a status and result entity specific for that limit.
Finally, ``environ`` and ``start_response`` come from the WSGI
pipeline; additional Turnstile configuration values can be retrieved
from the ``turnstile.conf`` key in ``environ``.

turnstile.limit
The ``setup_limits`` tool reads the limits configuration from an XML
file. In that file, each limit has an associated limit class,
specified by the "class" attribute of the ``<limit>`` element. When
dumped using the ``dump_limits`` tool, this attribute will always be
a "module:class" pair, but ``setup_limits`` recognizes short names,
which will be searched for in this entrypoint group. See the
documentation for ``turnstile.limits.Limit`` for details on this
interface.

turnstile.middleware
Older versions of Turnstile allowed the formatter to be configured
by subclassing ``turnstile.middleware.TurnstileMiddleware`` and
overriding the ``format_delay()`` method. Although this is now
deprecated, it is still possible, using the ``turnstile`` option in
the configuration, to specify a subclass of ``TurnstileMiddleware``
that ``turnstile.middleware.turnstile_filter()`` should use. When
no colon (":") is present in the ``turnstile`` configuration value,
this is the entrypoint group that will be searched. See the
documentation for ``TurnstileMiddleware`` for details on this
interface.

turnstile.parser_class
The default Redis database client uses either a
``redis.connection.PythonParser`` or a
``redis.connection.HiredisParser`` object to parse the data stream
from the Redis database. The ``redis.connection_pool.parser_class``
configuration value allows this default to be overridden. Alternate
classes will be searched for in this entrypoint group, if there is
no colon (":") present in the configuration value. See the
documentation for ``redis.connection.PythonParser`` for details on
this interface.

turnstile.postprocessor
Postprocessors run immediately after searching all the limits and
verifying that the request should not be rate-limited. (They will
not be run if the request is rate-limited.) They can be specified
using either the ``postprocess`` or ``enable`` configuration
options. The postprocessor function has the following signature::

    def proc(middleware, environ):
        pass

The first argument is the actual middleware object, from which the
configuration can be retrieved; the second argument is the WSGI
environment.

turnstile.preprocessor
Preprocessors run immediately before searching all the limits. They
can be specified using either the ``preprocess`` or ``enable``
configuration options. The preprocessor function has the following
signature::

    def proc(middleware, environ):
        pass

The first argument is the actual middleware object, from which the
configuration can be retrieved; the second argument is the WSGI
environment.

turnstile.redis_client
By default, Turnstile uses a ``redis.StrictRedis`` object to
communicate with the Redis database. The ``redis.redis_client``
configuration value allows this default to be overridden. Alternate
classes will be searched for in this entrypoint group, if there is
no colon (":") present in the configuration value. See the
documentation for ``redis.StrictRedis`` for details on this
interface.

The Control Daemon
==================

Turnstile stores the limits configuration in the Redis database, in
addition to the ephemeral information used to check and enforce the
rate limits. This makes it possible to change the limits dynamically
from a single, central location. In order to facilitate such changes,
each Turnstile instance uses an eventlet thread to run a "control
daemon." The control daemon uses the publish/subscribe support
provided by Redis to listen for commands, of which two are currently
recognized: ping and reload.

Some WSGI servers cannot use Turnstile in this mode, due to using
multiple processes (typically through use of the "multiprocessing"
Python module). In these circumstances, the control daemon may be
started in its own process (see the ``remote_daemon`` tool). Enabling
this requires that the ``control.remote`` configuration option be
turned on, and values provided for ``control.remote.authkey``,
``control.remote.host``, and ``control.remote.port``. See the
documentation for these options for more information.

It is possible to configure the listening thread of the control daemon
to use alternate configuration for connecting to the Redis database.
The defaults will be drawn from the ``[redis]`` section of the
configuration, but by specifying ``redis.*`` options in the
``[control]`` section of the configuration, specific values may be
overridden.

The Ping Command
----------------

The "ping" command is the simplest of the control daemon commands. In
its simplest form, the message "ping:<channel>" is written to the
control channel, which will cause all running Turnstile instances to
return the message "pong" to the specified channel. If the
``control.node_name`` configuration option has been set, this node
name will be included in the response, as "pong:<node name>".
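
The basic exchange described so far can be sketched as plain string
handling. The helper names are hypothetical, and actually publishing
the message to the control channel (for example with the ``publish()``
method of a ``redis`` client) is left out so that the sketch runs
without a Redis server.

```python
def build_ping(reply_channel):
    """Build the simplest ping message, naming the channel that the
    pongs should be sent to."""
    return 'ping:%s' % reply_channel


def parse_pong(message):
    """Return the node name carried by a pong, or None for a bare
    "pong" from a node with no configured name."""
    parts = message.split(':')
    if parts[0] != 'pong':
        raise ValueError('not a pong: %r' % (message,))
    if len(parts) > 1 and parts[1]:
        return parts[1]
    return None


print(build_ping('replies'))
print(parse_pong('pong'))
print(parse_pong('pong:node1'))
```
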
Finally, additional data (such as a timestamp) can be included in the
"ping" command, as in the message "ping:<channel>:<data>"; this
data will be appended to the response, i.e., "pong:<node name>:<data>".
This could be used to verify that all nodes are responding and not too
heavily loaded.

(Note that if ``control.node_name`` is not specified, the response to
a "ping" command containing additional data such as a timestamp will
be "pong::<data>".)

Pings can be sent, and the resulting pongs received, with the
``turnstile_command`` tool; see the section on tools for more
information.

The Reload Command
------------------

The "reload" command is the real reason for the existence of the
control daemon. This command causes the current set of limits to be
reloaded from the database and presented to the middleware for
enforcement.

The simplest form of the reload command is simply, "reload". If the
``control.reload_spread`` configuration option was set, the reload
will be scheduled for some time within the configured time interval;
otherwise, it will be performed immediately.

The next simplest form of the reload command is "reload:immediate".
This causes an immediate reload of the limits, overriding any
configured time spread.

The final form of the reload command is "reload:spread:<interval>",
where the "<interval>" specifies a time interval, in seconds, over
which to spread reloading of the limits. This specified interval is
used in preference to that specified by ``control.reload_spread``, if
set.

Note that the ``setup_limits`` tool automatically initiates a reload
once the limits are updated in the database. See the section on tools
for more information.

The Compactor Daemon
====================

This version of Turnstile includes scalability enhancements which
change how bucket data is stored in the Redis database. This
eliminates the need for transactions--enabling various Redis
clustering tools to be used--but at the cost of increased storage for
the bucket data. Buckets are now stored as lists of records; each
request processed by Turnstile results in the addition of an "update"
record to the bucket representation. Then, to determine whether the
request should be rate-limited, the bucket is reconstructed by
applying all of the updates.

To prevent this list of records from growing without bound, the rate
limiting logic includes a mechanism for triggering the compaction of a
bucket--many of these update records are compacted into a single
"bucket" record. This is triggered by setting a non-zero value for
the ``compactor.max_updates`` configuration option. When the number
of update records exceeds this threshold, a signal will be sent to the
compactor daemon, which performs the actual compaction algorithm.

The compaction logic works by adding special "summarize" records to
the bucket representation and placing the bucket's key into a special
sorted set. The compactor daemon allows these entries in the sorted
set to age for a given period of time (under control of
``compactor.min_age``). Although no new summarize records will be
added to the bucket representation if one is already present, there is
the potential for multiple Turnstile instances to add one
simultaneously; this aging allows all Turnstile instances to see that
a summarize request is in progress.

Once a summarize request has aged sufficiently, the compactor daemon
will perform the compaction and insert the resulting bucket back into
the list representation. It then eliminates the now-extraneous update
records.

If a summarize request is lost, due to a compactor daemon (or its
host) crashing, the summarize records in the bucket representation
have a maximum age as well; once the record exceeds its maximum age, a
new summarize request will be generated.

Turnstile Tools
===============

The limits are stored in the Redis database using a sorted set, and
they are encoded using Msgpack. (Although the Msgpack format is not
human-readable, it is very space and time efficient, which is why it
was chosen for this application.) This makes manual management of the
limits configuration more difficult, and so Turnstile ships with two
tools to make management of the rate limiting configuration easier. A
third tool starts up a remote control daemon, for use when Turnstile
is used with applications that run multiple processes, such as the
``nova-api`` component of OpenStack.

The ``compactor_daemon`` Tool
-----------------------------

The ``compactor_daemon`` tool may be used to start a compactor daemon
process. This tool requires the name of an INI-style configuration
file; see the section on configuring the tools below for more
information.

A usage summary for ``compactor_daemon``::

    usage: compactor_daemon [-h] [--log-config LOGGING] [--debug] config

    Run the compactor daemon.

    positional arguments:
      config                Name of the configuration file.

    optional arguments:
      -h, --help            show this help message and exit
      --log-config LOGGING, -l LOGGING
                            Specify a logging configuration file.
      --debug, -d           Run the tool in debug mode.

The ``dump_limits`` Tool
------------------------

The ``dump_limits`` tool may be used to dump the current limits in the
database into an XML representation. This tool requires the name of
an INI-style configuration file; see the section on configuring the
tools below for more information.

A usage summary for ``dump_limits``::

    usage: dump_limits [-h] [--debug] config limits_file

    Dump the current limits from the Redis database.

    positional arguments:
      config       Name of the configuration file, for connecting to the Redis
                   database.
      limits_file  Name of the XML file that the limits will be dumped to.

    optional arguments:
      -h, --help   show this help message and exit
      --debug, -d  Run the tool in debug mode.

The ``remote_daemon`` Tool
--------------------------

The ``remote_daemon`` tool may be used to start a separate control
daemon process. This tool requires the name of an INI-style
configuration file; see the section on configuring the tools below for
more information. Note that, in addition to the required Redis
configuration values, configuration values for the
``control.remote.authkey``, ``control.remote.host``, and
``control.remote.port`` options must be provided.

A usage summary for ``remote_daemon``::

    usage: remote_daemon [-h] [--log-config LOGGING] [--debug] config

    Run the external control daemon.

    positional arguments:
      config                Name of the configuration file.

    optional arguments:
      -h, --help            show this help message and exit
      --log-config LOGGING, -l LOGGING
                            Specify a logging configuration file.
      --debug, -d           Run the tool in debug mode.

The ``setup_limits`` Tool
-------------------------

The ``setup_limits`` tool may be used to read an XML file (such as
that produced by ``dump_limits``) and load the rate limiting
configuration into the Redis database. This tool requires the name of
an INI-style configuration file; see the section on configuring the
tools below for more information.

A usage summary for ``setup_limits``::

    usage: setup_limits [-h] [--debug] [--dryrun] [--noreload]
                        [--reload-immediate] [--reload-spread SECS]
                        config limits_file

    Set up or update limits in the Redis database.

    positional arguments:
      config                Name of the configuration file, for connecting to the
                            Redis database.
      limits_file           Name of the XML file describing the limits to
                            configure.

    optional arguments:
      -h, --help            show this help message and exit
      --debug, -d           Run the tool in debug mode.
      --dryrun, --dry_run, --dry-run, -n
                            Perform a dry run; inhibits loading data into the
                            database.
      --noreload, -R        Inhibit issuing a reload command.
      --reload-immediate, -r
                            Cause all nodes to immediately reload the limits
                            configuration.
      --reload-spread SECS, -s SECS
                            Cause all nodes to reload the limits configuration
                            over the specified number of seconds.

The ``turnstile_command`` Tool
------------------------------

The ``turnstile_command`` tool may be used to send arbitrary commands
to all running control daemons. This tool requires the name of an
INI-style configuration file; see the section on configuring the tools
below for more information.

A usage summary for ``turnstile_command``::

    usage: turnstile_command [-h] [--listen CHANNEL] [--debug]
                             config command [arguments [arguments ...]]

    Issue a command to all running control daemons.

    positional arguments:
      config                Name of the configuration file.
      command               The command to execute. Note that 'ping' is handled
                            specially; in particular, the --listen parameter is
                            implied.
      arguments             The arguments to pass for the command. Note that the
                            colon character (':') cannot be used.

    optional arguments:
      -h, --help            show this help message and exit
      --listen CHANNEL, -l CHANNEL
                            A channel to listen on for the command responses. Use
                            C-c (or your systems keyboard interrupt sequence) to
                            stop waiting for responses.
      --debug, -d           Run the tool in debug mode.

Configuring the Tools
---------------------

All of the tools require an INI-style configuration file, which
specifies how to connect to the Redis database. This file should
contain the section "[redis]" and should be populated with the same
"redis.*" options as the PasteDeploy configuration file, minus the
"redis." prefix. For example::

    [redis]
    host =

Each "redis.*" option recognized by the Turnstile middleware is
understood by the tools.

Additional options may be provided, such as the control channel,
limits key, and the ``compactor_daemon`` and ``remote_daemon``
options. The configuration file should be compatible with the
alternate configuration file described under the ``config``
configuration option for the Turnstile middleware.

Rate Limit XML
--------------

The XML file used for expressing rate limit configuration is
relatively straightforward, or at least as straightforward as XML can
be. The top-level element is ``<limits>``; this should contain a
sequence of ``<limit>`` elements, each containing a number of
``<attr>`` elements. The specific attributes available for any given
limit class depend on the exact class, but that information is
documented in the ``attrs`` attribute of the limit class. (This
information is suitable for introspection.)

The ``<limit>`` element has one XML attribute which must be specified:
the ``class`` attribute, which must identify the desired limit class.
This value must be specified either as a "module:class" string, or a
single name corresponding to a "turnstile.limit" entrypoint group.
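The "module:class" form can be resolved with the standard library
alone. The following is a minimal sketch of that idea, not Turnstile's
own loader; the helper name ``load_class`` is hypothetical, and lookup
of entrypoint names is deliberately left out.

```python
import importlib


def load_class(spec):
    """Resolve a "module:class" string to the object it names.

    Hypothetical helper for illustration only; Turnstile's own lookup
    code may differ, and entrypoint names are not handled here.
    """
    module_name, sep, attr_name = spec.partition(':')
    if not sep or not module_name or not attr_name:
        raise ValueError('expected "module:class", got %r' % (spec,))

    # Import the module, then fetch the named attribute from it.
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# A standard library class is used as a stand-in target so that the
# sketch runs without Turnstile installed.
ordered_dict_class = load_class('collections:OrderedDict')
```

With Turnstile installed, passing ``'turnstile.limits:Limit'`` to such a
helper would be expected to return the basic limit class.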
The ``<attr>`` element also has a single XML attribute which must be
set: ``name``, which identifies the name of the Limit attribute. The
contents of the ``<attr>`` element identify the value for the named
attribute.

Some limit attributes are lists; for these attributes, the ``<attr>``
element must contain one or more ``<value>`` elements, each of whose
contents identifies a single item in the attribute list. Other limit
attributes are dictionaries; for these attributes, again the
``<attr>`` element must contain one or more ``<value>`` elements, but
now those ``<value>`` elements must have the XML attribute ``key`` set
to the dictionary key corresponding to that value.

As an example, consider the following limits configuration::

    <limits>
      <limit class="turnstile.limits:Limit">
        <attr name="requirements">
          <value key="pageid">[0-9]+</value>
        </attr>
        <attr name="unit">second</attr>
        <attr name="uri">/page/{pageid}</attr>
        <attr name="value">10</attr>
        <attr name="verbs">
          <value>GET</value>
        </attr>
      </limit>
    </limits>

In this example, GET access to ``/page/{pageid}`` is rate-limited to
10 per second. The ``requirements`` attribute may be used to specify
regular expressions to tune the matching of URI components; in this
case, the ``{pageid}`` value must be composed of 1 or more digits.
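A limits file in this format can be walked with Python's standard
``xml.etree.ElementTree`` module. The sketch below is illustrative
only and is not the code used by ``setup_limits``. It assumes the
element names ``limits``, ``limit``, ``attr``, and ``value``, and the
attribute names ``unit``, ``uri``, ``value``, and ``verbs``. It also
guesses whether an attribute is a list or a dictionary from the
presence of ``key``, whereas the real tool can consult the ``attrs``
attribute of the limit class. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample input following the structure described in this section.  The
# element and attribute names are assumptions of this sketch.
LIMITS_XML = """
<limits>
  <limit class="turnstile.limits:Limit">
    <attr name="requirements">
      <value key="pageid">[0-9]+</value>
    </attr>
    <attr name="unit">second</attr>
    <attr name="uri">/page/{pageid}</attr>
    <attr name="value">10</attr>
    <attr name="verbs">
      <value>GET</value>
    </attr>
  </limit>
</limits>
"""


def parse_attr(attr):
    """Convert one <attr> element to a string, list, or dict."""
    values = attr.findall('value')
    if not values:
        # Scalar attribute: the element text is the value.
        return (attr.text or '').strip()
    if all('key' in v.attrib for v in values):
        # Dictionary attribute: every <value> carries a key.
        return {v.get('key'): (v.text or '').strip() for v in values}
    # List attribute: one item per <value>.
    return [(v.text or '').strip() for v in values]


def parse_limits(xml_text):
    """Return a list of (class_name, attributes) tuples."""
    root = ET.fromstring(xml_text.strip())
    limits = []
    for limit in root.findall('limit'):
        attrs = {a.get('name'): parse_attr(a) for a in limit.findall('attr')}
        limits.append((limit.get('class'), attrs))
    return limits


parsed = parse_limits(LIMITS_XML)
```

All values come back as strings (``'10'`` rather than ``10``); converting
them to the types a limit class expects is left out of this sketch.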
The limit class used is the basic ``turnstile.limits:Limit`` limit
class.

Custom Limit Classes
====================

All limit classes must descend from ``turnstile.limits:Limit``. This
admittedly un-Pythonic requirement has a number of advantages,
including the specific machinery which allows limits to be stored into
the Redis database. Most limit classes only need to worry about the
``attrs`` class attribute and the ``filter()`` method, although the
``route()`` and ``format()`` methods may also be hooked. For more
information about these methods, see the docstrings provided for their
default implementations in ``turnstile.limits:Limit``.

Accessing the Turnstile Configuration
=====================================

The Turnstile configuration is available to preprocessors and to the
Limit classes. For preprocessors, it is available directly from the
middleware object (the first passed parameter) via the ``conf``
attribute. (The database handle is also available via the ``db``
attribute, should access to the database be required.) For the
``filter()`` method of the Limit classes, the configuration is
available in the request environment under the ``turnstile.conf`` key.

The Turnstile configuration is represented as a
``turnstile.config:Config`` object. Configuration keys that do not
contain a "." are available as attributes of this object; for example,
to obtain the configured status value, assuming the Turnstile
configuration is available in the ``conf`` variable, the correct code
would be::

    status = conf.status

For those configuration keys which do contain a ".", the part of the
name to the left of the first "." becomes a dictionary key, and the
remainder of the name will be a second key. For example, to access
the value of the ``redis.connection_pool.connection_class`` variable,
the correct code would be::

    connection_class = conf['redis']['connection_pool.connection_class']

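The two access rules can be illustrated with a small stand-in class.
This is only a sketch of the behavior described above, not the real
``turnstile.config:Config`` implementation; the class name
``SketchConfig`` and the sample option values are hypothetical.

```python
class SketchConfig(object):
    """Hypothetical stand-in mimicking the access rules described above."""

    def __init__(self, options):
        self._top = {}       # keys without a "."
        self._sections = {}  # keys with a ".", split at the first "."
        for key, value in options.items():
            section, sep, rest = key.partition('.')
            if sep:
                self._sections.setdefault(section, {})[rest] = value
            else:
                self._top[key] = value

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__['_top'][name]
        except KeyError:
            raise AttributeError(name)

    def __getitem__(self, section):
        return self._sections[section]


# Sample values for illustration; all configuration values are strings.
conf = SketchConfig({
    'status': '413 Request Entity Too Large',
    'redis.host': 'localhost',
    'redis.connection_pool.connection_class': 'example:Connection',
})

status = conf.status
connection_class = conf['redis']['connection_pool.connection_class']
```

Note that only the first "." splits the key, so the second-level key
``connection_pool.connection_class`` keeps its own ".".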
All values in the configuration are stored as strings. Configuration
values do not need to be pre-declared in any way; Turnstile ignores
(but maintains) configuration values that it does not use, making
these values available for use by preprocessors and Limit subclasses.

For convenience, the ``turnstile.config:Config`` class offers a static
method ``to_bool()`` which can convert a string value to a boolean
value. The strings "t", "true", "on", "y", and "yes" are all
recognized as a boolean ``True`` value, as are numeric strings which
evaluate to non-zero values. The strings "f", "false", "off", "n",
and "no" are all recognized as a boolean ``False`` value, as are
numeric strings which evaluate to zero values. Any other string value
will cause ``to_bool()`` to raise a ``ValueError``, unless the
``do_raise`` argument is given as ``False``, in which case
``to_bool()`` will return a boolean ``False`` value.

Determining User Buckets
========================

Some applications need to be able to inform the user of the next time
they are able to make a call against a given URI, often as a part of
listing the limits applying to that user. This entails access to the
bucket data for that user. Under previous versions of Turnstile, this
could only be accomplished by using the Redis "KEYS" command, which is
most definitely not scalable. A new feature in Turnstile allows
preprocessors to add the name of a sorted set in the WSGI environment
variable ``turnstile.bucket_set``; if this environment variable is set
when a limit is processed, it will store the bucket key that was used
into the named sorted set. The score used for this will be the
expiration time for the bucket, which can be used to eliminate entries
for buckets that have expired from the database.

Applications that have this requirement should implement both a
preprocessor and a postprocessor; the preprocessor should set
``turnstile.bucket_set`` to an appropriate value, and the
postprocessor should trim off the outdated entries from the named
sorted set and load the buckets, performing whatever processing is
necessary to make the data available to the application.

Backwards Compatibility and Interoperability
============================================

This version of Turnstile includes several enhancements, such as the
addition of postprocessors and the ``enable`` configuration value.
For the vast majority of these enhancements, backwards compatibility
has been preserved; if you see an issue caused by lack of backwards
compatibility, please log it as a bug.

There are, however, several features that have been deprecated in
previous versions of Turnstile which are now removed; these are listed
below:

* The special treatment of the ``[connection]`` section of the
  configuration is removed; users should use the options in the
  ``[redis]`` and ``[control]`` sections.

* The ``turnstile.config`` variable in the WSGI environment is
  removed; users should use the ``turnstile.conf`` variable instead.

* The ``config`` property of the middleware object is removed; users
  should use the ``conf`` attribute instead.

* The ``import_class()`` function of ``turnstile.utils`` is removed;
  users should use the ``find_entrypoint()`` function instead.

* The ``TurnstileRedis`` class of ``turnstile.database`` is removed,
  along with its ``safe_update()``, ``limit_update()``, and
  ``command()`` methods. The latter two have been replaced by
  ``limit_update()`` and ``command()`` functions declared in the
  ``turnstile.database`` module. There is no replacement for
  ``safe_update()``.

The following features have been deprecated and will be removed in
future versions of Turnstile:

* Overriding the ``TurnstileMiddleware`` class with the ``turnstile``
  configuration option is deprecated; users should use the
  ``formatter`` option to override delay formatting.

* The ``decode()`` method of ``Limit`` classes is deprecated. Use the
  ``BucketKey`` class in ``turnstile.limits`` to decode bucket keys.

* Except for the ``setup_limits`` tool's XML input file, the
  specification of functions and classes using "module:function" or
  "module:class" syntax is deprecated; Turnstile is moving to a
  ``pkg_resources`` entrypoint-based approach. See the section on
  entrypoints above for more information.

Interoperability with Older Versions of Turnstile
-------------------------------------------------

This version of Turnstile is not completely interoperable with older
versions of Turnstile. Care has been taken to ensure that both new
and old instances of Turnstile can run against the same database;
however, the old versions cannot load bucket data from new versions
and vice versa. Thus, users should only be running both versions
during a transitional period; avoid running both versions for an
extended period of time.

The bucket storage format has changed; the new format enhances
Turnstile's scalability by eliminating the use of transactions when
storing bucket data. To allow for a phased transition to a new
version of Turnstile, the bucket keys have also changed. The result
of this is that rate-limits are applied to users hitting the new
version of Turnstile independently of those applied to users hitting
the old version. This means that a user may be able to make twice as
many requests as permitted by the rate limits. An expedited
transition to the new version of Turnstile will address this problem.

.. _PIP: http://www.pip-installer.org/en/latest/index.html