{"id":14978565,"url":"https://github.com/lyokha/ngx-export-tools-extra","last_synced_at":"2025-10-28T11:30:23.485Z","repository":{"id":41186733,"uuid":"182112079","full_name":"lyokha/ngx-export-tools-extra","owner":"lyokha","description":"More extra tools for Nginx haskell module","archived":false,"fork":false,"pushed_at":"2024-09-11T21:53:45.000Z","size":451,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-02-01T14:14:01.542Z","etag":null,"topics":["aggregate","haskell","nginx","service","stats","web"],"latest_commit_sha":null,"homepage":null,"language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lyokha.png","metadata":{"files":{"readme":"README.md","changelog":"Changelog.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-04-18T15:23:30.000Z","updated_at":"2024-09-10T08:04:13.000Z","dependencies_parsed_at":"2022-09-03T02:20:56.433Z","dependency_job_id":"5752706e-591d-4e34-a063-8eee65d4bb89","html_url":"https://github.com/lyokha/ngx-export-tools-extra","commit_stats":{"total_commits":297,"total_committers":1,"mean_commits":297.0,"dds":0.0,"last_synced_commit":"44def9b839fa97ff99560cfdb4d992eadb828568"},"previous_names":[],"tags_count":34,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lyokha%2Fngx-export-tools-extra","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lyokha%2Fngx-export-tools-extra/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lyokha%2Fngx-export-tools
-extra/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lyokha%2Fngx-export-tools-extra/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lyokha","download_url":"https://codeload.github.com/lyokha/ngx-export-tools-extra/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":238638177,"owners_count":19505555,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aggregate","haskell","nginx","service","stats","web"],"created_at":"2024-09-24T13:57:55.203Z","updated_at":"2025-10-28T11:30:23.476Z","avatar_url":"https://github.com/lyokha.png","language":"Haskell","funding_links":[],"categories":[],"sub_categories":[],"readme":"More extra tools for Nginx Haskell module\n=========================================\n\n\u003c!--[![Build Status](https://travis-ci.com/lyokha/ngx-export-tools-extra.svg?branch=master)](https://travis-ci.com/lyokha/ngx-export-tools-extra)--\u003e\n[![Build Status](https://github.com/lyokha/ngx-export-tools-extra/workflows/CI/badge.svg)](https://github.com/lyokha/ngx-export-tools-extra/actions?query=workflow%3ACI)\n[![Hackage](https://img.shields.io/hackage/v/ngx-export-tools-extra.svg?label=hackage%20%7C%20ngx-export-tools-extra\u0026logo=haskell\u0026logoColor=%239580D1)](https://hackage.haskell.org/package/ngx-export-tools-extra)\n\nThis package contains a collection of Haskell modules with more extra tools for\n[*Nginx Haskell module*](https://github.com/lyokha/nginx-haskell-module).\nDetailed documentation on each module's exported 
functions and data can be found\nat [*the Hackage page*](http://hackage.haskell.org/package/ngx-export-tools-extra).\n\n#### Table of contents\n\n- [Module NgxExport.Tools.Aggregate](#module-ngxexporttoolsaggregate)\n- [Module NgxExport.Tools.EDE](#module-ngxexporttoolsede)\n- [Module NgxExport.Tools.PCRE](#module-ngxexporttoolspcre)\n- [Module NgxExport.Tools.Prometheus](#module-ngxexporttoolsprometheus)\n- [Module NgxExport.Tools.Resolve](#module-ngxexporttoolsresolve)\n- [Module NgxExport.Tools.ServiceHookAdaptor](#module-ngxexporttoolsservicehookadaptor)\n- [Module NgxExport.Tools.Subrequest](#module-ngxexporttoolssubrequest)\n- [Building and installation](#building-and-installation)\n\n#### Module *NgxExport.Tools.Aggregate*\n\nAn aggregate service collects custom typed data reported by worker processes\nand sends this via HTTP when requested. This is an *ignitionService* in terms\nof module *NgxExport.Tools*, which means that it starts upon the startup of\nthe worker process and runs until termination of the worker. 
Internally, an\naggregate service starts an HTTP server implemented via the [*Snap\nframework*](http://snapframework.com/), which serves incoming requests from\nworker processes (collecting data) as well as from the Nginx server's\nclients (reporting collected data for administration purpose).\n\n##### An example\n\n###### File *test_tools_extra_aggregate.hs*\n\n```haskell\n{-# LANGUAGE TemplateHaskell, DeriveGeneric, TypeApplications #-}\n{-# LANGUAGE OverloadedStrings, BangPatterns #-}\n\nmodule TestToolsExtraAggregate where\n\nimport           NgxExport\nimport           NgxExport.Tools\nimport           NgxExport.Tools.Aggregate\n\nimport           Data.ByteString (ByteString)\nimport qualified Data.ByteString.Lazy.Char8 as C8L\nimport           Data.Aeson\nimport           Data.Maybe\nimport           Data.IORef\nimport           System.IO.Unsafe\nimport           GHC.Generics\n\ndata Stats = Stats { bytesSent :: Int\n                   , requests :: Int\n                   , meanBytesSent :: Int\n                   } deriving Generic\ninstance FromJSON Stats\ninstance ToJSON Stats\n\nstats :: IORef Stats\nstats = unsafePerformIO $ newIORef $ Stats 0 0 0\n{-# NOINLINE stats #-}\n\nupdateStats :: ByteString -\u003e IO C8L.ByteString\nupdateStats s = voidHandler $ do\n    let cbs = readFromByteString @Int s\n    modifyIORef' stats $ \\(Stats bs rs _) -\u003e\n        let !nbs = bs + fromMaybe 0 cbs\n            !nrs = rs + 1\n            !nmbs = nbs `div` nrs\n        in Stats nbs nrs nmbs\nngxExportIOYY 'updateStats\n\nreportStats :: Int -\u003e Bool -\u003e IO C8L.ByteString\nreportStats = deferredService $ \\port -\u003e voidHandler $ do\n    s \u003c- readIORef stats\n    reportAggregate port (Just s) \"stats\"\nngxExportSimpleServiceTyped 'reportStats ''Int $\n    PersistentService $ Just $ Sec 5\n\nngxExportAggregateService \"stats\" ''Stats\n```\n\nHere, on the bottom line, aggregate service *stats* is declared. 
It expects\nreports from worker processes in JSON format with data of type *Stats* which\nincludes the number of bytes sent so far, the number of client requests, and\nthe mean value of bytes sent per request. Its own configuration\n(a TCP port and the *purge interval*) shall be defined in the Nginx\nconfiguration file. The reports from worker processes are sent from a\n*deferredService* *reportStats* every 5 seconds: it merely reads data\ncollected in a global IORef *stats* and then sends this to the aggregate\nservice using *reportAggregate*. Handler *updateStats* updates the *stats*\non every run. It accepts a *ByteString* from Nginx, then converts it to an\n*Int* value and interprets this as the number of bytes sent in the current\nrequest. It also increments the number of requests and calculates the mean\nvalue of bytes sent in all requests to this worker so far. Notice that all\nthe parts of *stats* are evaluated *strictly*, which is important!\n\n###### File *nginx.conf*\n\n```nginx\nuser                    nobody;\nworker_processes        2;\n\nevents {\n    worker_connections  1024;\n}\n\nhttp {\n    default_type        application/octet-stream;\n    sendfile            on;\n\n    haskell load /var/lib/nginx/test_tools_extra_aggregate.so;\n\n    haskell_run_service simpleService_aggregate_stats $hs_stats\n            'AggregateServerConf { asPort = 8100, asPurgeInterval = Min 5 }';\n\n    haskell_service_var_in_shm stats 32k /tmp $hs_stats;\n\n    haskell_run_service simpleService_reportStats $hs_reportStats 8100;\n\n    server {\n        listen       8010;\n        server_name  main;\n        error_log    /tmp/nginx-test-haskell-error.log;\n        access_log   /tmp/nginx-test-haskell-access.log;\n\n        haskell_run updateStats !$hs_updateStats $bytes_sent;\n\n        location / {\n            echo Ok;\n        }\n    }\n\n    server {\n        listen       8020;\n        server_name  stat;\n\n        location / {\n            allow 127.0.0.1;\n  
          deny all;\n            proxy_pass http://127.0.0.1:8100/get/stats;\n        }\n    }\n}\n```\n\nThe aggregate service *stats* must be referred to from the Nginx configuration\nfile with prefix __*simpleService_aggregate\u0026#95;*__. Its configuration is typed,\nthe type is *AggregateServerConf*. Though its only constructor\n*AggregateServerConf* is not exported from this module, the service is still\nconfigurable from an Nginx configuration. Here, the aggregate service listens\non TCP port *8100*, and its *purge interval* is 5 minutes. Notice that an\naggregate service must be *shared* (here, variable *hs_stats* is declared as\nshared with Nginx directive *haskell_service_var_in_shm*), otherwise it won't\neven start because the internal HTTP servers on each worker process won't be\nable to bind to the same TCP port. Inside the upper *server* clause, handler\n*updateStats* runs on every client request. This handler always returns an\nempty string in variable *hs_updateStats* because it is only needed for the side\neffect of updating the *stats*. However, since Nginx variable handlers are\n*lazy*, evaluation of *hs_updateStats* must be forced somehow. To achieve this,\nwe used the *strict annotation* (the *bang* symbol) in directive *haskell_run*\nthat enforces strict evaluation in a late request processing phase, when the\nvalue of variable *bytes_sent* has already been calculated.\n\nData collected by the aggregate service can be obtained in a request to the\nvirtual server listening on TCP port *8020*. 
It simply proxies requests to\nthe internal aggregate server with URL */get/__stats__* where __*stats*__\ncorresponds to the *name* of the aggregate service.\n\n###### A simple test\n\nSince *reportStats* is a deferred service, we won't get useful data during the\nfirst 5 seconds after Nginx starts.\n\n```ShellSession\n$ curl -s 'http://127.0.0.1:8020/' | jq\n[\n  \"1858-11-17T00:00:00Z\",\n  {}\n]\n```\n\nHowever, later we should get some useful data.\n\n```ShellSession\n$ curl -s 'http://127.0.0.1:8020/' | jq\n[\n  \"2021-12-08T09:56:18.118132083Z\",\n  {\n    \"21651\": [\n      \"2021-12-08T09:56:18.12155413Z\",\n      {\n        \"meanBytesSent\": 0,\n        \"requests\": 0,\n        \"bytesSent\": 0\n      }\n    ],\n    \"21652\": [\n      \"2021-12-08T09:56:18.118132083Z\",\n      {\n        \"meanBytesSent\": 0,\n        \"requests\": 0,\n        \"bytesSent\": 0\n      }\n    ]\n  }\n]\n```\n\nHere we have collected stats from the two Nginx worker processes with *PIDs*\n*21651* and *21652*. The timestamps show when the stats were last updated.\nThe topmost timestamp shows the time of the latest *purge* event. The\ndata itself contains only zeros because we have made no requests to the main\nserver so far. 
Let's run 100 simultaneous requests and look at the stats (they\nshould update within 5 seconds at worst).\n\n```ShellSession\n$ for i in {1..100} ; do curl 'http://127.0.0.1:8010/' \u0026 done\n```\n\nWait 5 seconds...\n\n```ShellSession\n$ curl -s 'http://127.0.0.1:8020/' | jq\n[\n  \"2021-12-08T09:56:18.118132083Z\",\n  {\n    \"21651\": [\n      \"2021-12-08T09:56:48.159263993Z\",\n      {\n        \"meanBytesSent\": 183,\n        \"requests\": 84,\n        \"bytesSent\": 15372\n      }\n    ],\n    \"21652\": [\n      \"2021-12-08T09:56:48.136934713Z\",\n      {\n        \"meanBytesSent\": 183,\n        \"requests\": 16,\n        \"bytesSent\": 2928\n      }\n    ]\n  }\n]\n```\n\n---\n\nService *simpleService_aggregate_stats* was implemented using the\n*Snap framework*. Basically, a native Nginx implementation is not easy\nbecause the service must listen on a single (not duplicated) file descriptor\nwhich is not the case when Nginx spawns more than one worker process.\nRunning *simpleService_aggregate_stats* as a shared service is an elegant\nsolution as shared services guarantee that they occupy only one worker at a\ntime. However, *nginx-haskell-module* provides directive *single_listener*\nwhich can be used to apply the required restriction in a custom Nginx virtual\nserver. This directive requires that the virtual server listens with option\n*reuseport* and is only available on Linux with socket option\n*SO_ATTACH_REUSEPORT_CBPF*.\n\nExporter *ngxExportAggregateService* exports additional handlers to build a\nnative Nginx-based aggregate service. 
Let's replace service\n*simpleService_aggregate_stats* from the previous example with such a native\nNginx-based aggregate service using *single_listener* and listening on port\n*8100*.\n\n```nginx\nuser                    nobody;\nworker_processes        2;\n\nevents {\n    worker_connections  1024;\n}\n\nhttp {\n    default_type        application/octet-stream;\n    sendfile            on;\n\n    haskell load /var/lib/nginx/test_tools_extra_aggregate.so;\n\n    haskell_run_service simpleService_reportStats $hs_reportStats 8100;\n\n    haskell_var_empty_on_error $hs_stats;\n\n    server {\n        listen       8010;\n        server_name  main;\n        error_log    /tmp/nginx-test-haskell-error.log;\n        access_log   /tmp/nginx-test-haskell-access.log;\n\n        haskell_run updateStats !$hs_updateStats $bytes_sent;\n\n        location / {\n            echo Ok;\n        }\n    }\n\n    server {\n        listen       8020;\n        server_name  stat;\n\n        location / {\n            allow 127.0.0.1;\n            deny all;\n            proxy_pass http://127.0.0.1:8100/get/stats;\n        }\n    }\n\n    server {\n        listen          8100 reuseport;\n        server_name     stats;\n\n        single_listener on;\n\n        location /put/stats {\n            haskell_run_async_on_request_body receiveAggregate_stats\n                    $hs_stats \"Min 1\";\n\n            if ($hs_stats = '') {\n                return 400;\n            }\n\n            return 200;\n        }\n\n        location /get/stats {\n            haskell_async_content sendAggregate_stats noarg;\n        }\n    }\n}\n```\n\nHandler *receiveAggregate_stats* accepts a time interval corresponding to the\nvalue of *asPurgeInterval* from service *simpleService_aggregate_stats*. 
If\nthe value is not readable (say, *noarg*) then it is defaulted to *Min 5*.\n\nNotice that the stats server must listen on address *127.0.0.1* because\nservice *simpleService_reportStats* reports stats to this address.\n\n#### Module *NgxExport.Tools.EDE*\n\nThis module allows for complex parsing of JSON objects with [*EDE templating\nlanguage*](http://hackage.haskell.org/package/ede/docs/Text-EDE.html). In\nterms of module *NgxExport.Tools*, it exports a *single-shot* service\n*compileEDETemplates* to configure a list of templates parameterized by\na simple key, and two variable handlers *renderEDETemplate* and\n*renderEDETemplateFromFreeValue* for parsing JSON objects and\nsubstitution of extracted data into provided EDE templates. The former\nhandler is *asynchronous* and suitable for parsing JSON objects POSTed in a\nrequest body, while the latter is *synchronous* and can parse JSON objects\ncontained in Nginx variables.\n\n##### An example\n\n###### File *test_tools_extra_ede.hs*\n\n```haskell\n{-# LANGUAGE TemplateHaskell #-}\n\nmodule TestToolsExtraEDE where\n\nimport           NgxExport\nimport           NgxExport.Tools.EDE ()\n\nimport           Data.ByteString (ByteString)\nimport qualified Data.ByteString.Lazy as L\nimport qualified Network.HTTP.Types.URI as URI\n\nurlDecode :: ByteString -\u003e L.ByteString\nurlDecode = L.fromStrict . URI.urlDecode False\n\nngxExportYY 'urlDecode\n```\n\nWe are going to use *urlDecode* to decode JSON  values contained in HTTP\ncookies. Notice that we are not using any Haskell declarations from module\n*NgxExport.Tools.EDE* while still need to import this to access the three\nhandlers from the Nginx configuration. 
This situation is quite valid though\nnot usual to *ghc*, and to make it keep silence, an explicit empty import\nlist was added at the end of the import stanza.\n\n###### File *nginx.conf*\n\n```nginx\nuser                    nobody;\nworker_processes        2;\n\nevents {\n    worker_connections  1024;\n}\n\nhttp {\n    default_type        application/octet-stream;\n    sendfile            on;\n\n    haskell load /var/lib/nginx/test_tools_extra_ede.so;\n\n    haskell_run_service simpleService_compileEDETemplates $hs_EDETemplates\n            '(\"/var/lib/nginx/EDE\",\n              [(\"user\",\n                \"{{user.id}}/{{user.ops|b64}}/{{resources.path|uenc}}\")])';\n\n    server {\n        listen       8010;\n        server_name  main;\n        error_log    /tmp/nginx-test-haskell-error.log;\n        access_log   /tmp/nginx-test-haskell-access.log;\n\n        location / {\n            haskell_run_async_on_request_body renderEDETemplate $hs_user user;\n            rewrite ^ /internal/user/$hs_user last;\n        }\n\n        location ~ ^/internal/user/(EDE\\ ERROR:.*) {\n            internal;\n            echo_status 404;\n            echo \"Bad input: $1\";\n        }\n\n        location ~ ^/internal/user/([^/]+)/([^/]+)/([^/]+)$ {\n            internal;\n            echo \"User id: $1, options: $2, path: $3\";\n        }\n\n        location ~ ^/internal/user/(.*) {\n            internal;\n            echo_status 404;\n            echo \"Unexpected input: $1\";\n        }\n\n        location /cookie {\n            haskell_run urlDecode $hs_cookie_user $cookie_user;\n            haskell_run renderEDETemplateFromFreeValue $hs_user_from_cookie\n                    user|$hs_cookie_user;\n            rewrite ^ /internal/user/$hs_user_from_cookie last;\n        }\n    }\n}\n```\n\nThere is an EDE template declared by the argument of service\n*simpleService_compileEDETemplates*. 
The template will be accessed later\nin the asynchronous body handler *renderEDETemplate* with key *user*.\nPath */var/lib/nginx/EDE* can be used in the templates to *include* more\nrules from files located inside it, but we do not actually use this here.\n\nThe rule inside template *user* says: given a JSON object,\n\n* print object *id* inside a top object *user*,\n* print *slash*,\n* print object *ops* inside the top object *user* filtered by function *b64*,\n* print *slash*,\n* print object *path* inside a top object *resources* filtered by function\n  *uenc*.\n\nFunctions *b64* and *uenc* are *polymorphic filters* in terms of EDE language.\nThere are many filters shipped with EDE, but *b64* and *uenc* were defined in\nthis module.\n\n* *b64* encodes an Aeson *Value* using *base64url* encoding,\n* *uenc* encodes an Aeson *Value* using *URL encoding* rules.\n\nSo, basically, we used *renderEDETemplate* to decompose POSTed JSON objects\nand then *rewrite* requests to other locations: the target URL path is built\nby substituting the extracted and then encoded fields stored in variable\n*hs_user*. Handler *renderEDETemplateFromFreeValue* in location\n*/cookie* does the same but reads JSON objects from HTTP cookie *user*.\n\n###### A simple test\n\n```ShellSession\n$ curl -d '{\"user\": {\"id\" : \"user1\", \"ops\": [\"op1\", \"op2\"]}, \"resources\": {\"path\": \"/opt/users\"}}' 'http://localhost:8010/'\nUser id: user1, options: WyJvcDEiLCJvcDIiXQ==, path: %2Fopt%2Fusers\n```\n\nLet's try to send a broken (in some sense) input value.\n\n```ShellSession\n$ curl -d '{\"user\": {\"id\" : \"user1\", \"ops\": [\"op1\", \"op2\"]}, \"resources\": {\"p\": \"/opt/users\"}}' 'http://localhost:8010/'\nBad input: EDE ERROR: Text.EDE.parse:1:32 error: variable resources.path doesn't exist.\n```\n\nNow we get a response with HTTP status *404* and a comprehensive description of\nwhat went wrong. 
To not mess rewrite logic and error responses, variable\n*hs_user* can be listed inside directive *haskell_var_empty_on_error* in the\nNginx configuration.\n\n```nginx\n    haskell_var_empty_on_error $hs_user;\n```\n\nNow the variable will always be empty on errors, while the errors will still\nbe logged by Nginx in the error log.\n\nLet's read user data encoded in HTTP cookie *user*.\n\n```ShellSession\n$ curl -b 'user=%7B%22user%22%3A%20%7B%22id%22%20%3A%20%22user1%22%2C%20%22ops%22%3A%20%5B%22op1%22%2C%20%22op2%22%5D%7D%2C%20%22resources%22%3A%20%7B%22path%22%3A%20%22%2Fopt%2Fusers%22%7D%7D' 'http://localhost:8010/cookie'\nUser id: user1, options: WyJvcDEiLCJvcDIiXQ==, path: %2Fopt%2Fusers\n```\n\n#### Module *NgxExport.Tools.PCRE*\n\nThis module provides a simple handler *matchRegex* to match a value\nagainst a PCRE regex preliminary declared and compiled in\n*configuration service* *simpleService_declareRegexes* (which is an\n*ignitionService* in terms of module *NgxExport.Tools*) and the corresponding\n*service update hook* (in terms of module *NgxExport*) *compileRegexes*\nat the start of the service.\n\n##### An example\n\n###### File *test_tools_extra_pcre.hs*\n\n```haskell\nmodule TestToolsExtraPCRE where\n\nimport NgxExport.Tools.PCRE ()\n```\n\nThe file does not contain any significant declarations as we are going to use\nonly the exporters of the handlers.\n\n###### File *nginx.conf*\n\n```nginx\nuser                    nobody;\nworker_processes        2;\n\nevents {\n    worker_connections  1024;\n}\n\nhttp {\n    default_type        application/octet-stream;\n    sendfile            on;\n\n    haskell load /var/lib/nginx/test_tools_extra_pcre.so;\n\n    haskell_run_service simpleService_declareRegexes $hs_regexes\n            '[(\"userArea\", \"(?:\\\\\\\\|)(\\\\\\\\d+)$\", \"\")\n             ,(\"keyValue\", \"(k\\\\\\\\w+)(\\\\\\\\|)(v\\\\\\\\w+)\", \"i\")\n             ]';\n\n    haskell_var_empty_on_error $hs_kv;\n\n    server {\n        listen   
    8010;\n        server_name  main;\n        error_log    /tmp/nginx-test-haskell-error.log;\n        access_log   /tmp/nginx-test-haskell-access.log;\n\n        location / {\n            haskell_run matchRegex $hs_user_area 'userArea|$arg_user';\n            rewrite ^ /internal/user/area/$hs_user_area last;\n        }\n\n        location ~ ^/internal/user/area/(PCRE\\ ERROR:.*) {\n            internal;\n            echo_status 404;\n            echo \"Bad input: $1\";\n        }\n\n        location = /internal/user/area/ {\n            internal;\n            echo_status 404;\n            echo \"No user area attached\";\n        }\n\n        location ~ ^/internal/user/area/(.+) {\n            internal;\n            echo \"User area: $1\";\n        }\n    }\n}\n```\n\nIn this example, we expect requests with argument *user* which should\nsupposedly be tagged with an *area* code containing digits only. The *user*\nvalue should match against regex *userArea* declared alongside with another\nregex *keyValue* (the latter has an option *i* which corresponds to\n*caseless*; the regex compiler has also support for options *s* and *m* which\ncorrespond to *dotall* and *multiline* respectively). Notice that regex\ndeclarations require 4-fold backslashes as they are getting shrunk while\ninterpreted sequentially by the Nginx configuration interpreter and then by\nthe Haskell compiler too.\n\nHandler *matchRegex* finds the named regex *userArea* from the beginning of\nits argument: the second part of the argument is delimited by a *bar* symbol\nand contains the value to match against. 
If the regex contains captures, then\nthe matched value shall correspond to the contents of the first capture (in\ncase of *userArea*, this is the area code), otherwise it must correspond to\nthe whole matched value.\n\n###### A simple test\n\n```ShellSession\n$ curl 'http://localhost:8010/'\nNo user area attached\n$ curl 'http://localhost:8010/?user=peter|98'\nUser area: 98\n$ curl 'http://localhost:8010/?user=peter|98i'\nNo user area attached\n```\n\n---\n\nThere are handlers to make substitutions using PCRE regexes. An\n*ignitionService* *simpleService_mapSubs* declares named *plain*\nsubstitutions which are made at run time by handlers *subRegex* and\n*gsubRegex*. Functions *subRegexWith* and *gsubRegexWith* make it\npossible to write custom *functional* substitutions.\n\nLet's extend our example by adding the ability to erase the captured area code.\nWe are also going to implement a *functional* substitution to swap the keys and\nthe values matched in the *keyValue* regex.\n\n###### File *test_tools_extra_pcre.hs*\n\n```haskell\n{-# LANGUAGE TemplateHaskell, LambdaCase #-}\n\nmodule TestToolsExtraPCRE where\n\nimport           NgxExport\nimport           NgxExport.Tools.PCRE\n\nimport           Data.ByteString (ByteString)\nimport qualified Data.ByteString as B\nimport qualified Data.ByteString.Lazy as L\n\ngsubSwapAround :: ByteString -\u003e IO L.ByteString\ngsubSwapAround = gsubRegexWith $ const $ \\case\n    a : d : b : _ -\u003e B.concat [b, d, a]\n    _ -\u003e B.empty\n\nngxExportIOYY 'gsubSwapAround\n```\n\nFunctional substitution handler *gsubSwapAround* expects a regular expression\nwith at least 3 capture groups to swap the contents of the first and the\nthird groups around. 
We are going to apply this handler against regex\n*keyValue*.\n\n###### File *nginx.conf*: erase area code and swap keys and values\n\n```nginx\n    haskell_run_service simpleService_mapSubs $hs_subs\n            '[(\"erase\", \"\")]';\n\n    haskell_var_empty_on_error $hs_kv;\n```\n\n```nginx\n        location /erase/area {\n            haskell_run subRegex $hs_user_no_area 'userArea|erase|$arg_user';\n            rewrite ^ /internal/user/noarea/$hs_user_no_area last;\n        }\n\n        location ~ ^/internal/user/noarea/(PCRE\\ ERROR:.*) {\n            internal;\n            echo_status 404;\n            echo \"Bad input: $1\";\n        }\n\n        location ~ ^/internal/user/noarea/(.*) {\n            internal;\n            echo \"User without area: $1\";\n        }\n\n        location /swap {\n            haskell_run gsubSwapAround $hs_kv 'keyValue|$arg_kv';\n            echo \"Swap $arg_kv = $hs_kv\";\n        }\n```\n\nService *simpleService_mapSubs* declares a list of named *plain*\nsubstitutions. In this example, it declares only one substitution *erase*\nwhich substitutes an empty string, i.e. *erases* the matched text. Notice\nthat the argument of handler *subRegex* requires three parts delimited by\n*bar* symbols: the named regex, the named substitution, and the value to\nmatch against.\n\n###### A simple test\n\n```ShellSession\n$ curl 'http://localhost:8010/erase/area?user=peter|98'\nUser without area: peter\n$ curl 'http://localhost:8010/swap?kv=kid|v0012a'\nSwap kid|v0012a = v0012a|kid\n```\n\n#### Module *NgxExport.Tools.Prometheus*\n\nThis module converts custom counters from\n[nginx-custom-counters-module](https://github.com/lyokha/nginx-custom-counters-module)\nto Prometheus metrics. 
For this, it exposes four exporters:\n*prometheusConf* which is an *ignitionService* in terms of module\n*NgxExport.Tools*, *toPrometheusMetrics* to convert *custom counters* to\nPrometheus metrics, *prometheusMetrics* which is a content handler aiming\nto return Prometheus metrics to the client, and a handy utility\n*scale1000* to convert small floating point numbers to integers by\nmultiplying them by *1000* (which fits well for dealing with request\ndurations).\n\nThe module makes use of a few custom data types which are not exported while\nstill needed when writing Nginx configurations. In the following example they\nare used in configurations of *simpleService_prometheusConf* and\n*toPrometheusMetrics*.\n\n##### An example\n\n###### File *test_tools_extra_prometheus.hs*\n\n```haskell\nmodule TestToolsExtraPrometheus where\n\nimport NgxExport.Tools.Prometheus ()\n```\n\nThe file does not contain any significant declarations as we are going to use\nonly the exporters.\n\n###### File *nginx.conf*\n\n```nginx\nuser                    nobody;\nworker_processes        2;\n\nevents {\n    worker_connections  1024;\n}\n\nhttp {\n    default_type        application/octet-stream;\n    sendfile            on;\n\n    map $status $inc_cnt_4xx {\n        default         0;\n        '~^4(?:\\d){2}'  1;\n    }\n\n    map $status $inc_cnt_5xx {\n        default         0;\n        '~^5(?:\\d){2}'  1;\n    }\n\n    map_to_range_index $hs_request_time $request_time_bucket\n        0.005\n        0.01\n        0.05\n        0.1\n        0.5\n        1.0\n        5.0\n        10.0\n        30.0\n        60.0;\n\n    map_to_range_index $hs_bytes_sent $bytes_sent_bucket\n        0\n        10\n        100\n        1000\n        10000;\n\n    haskell load /var/lib/nginx/test_tools_extra_prometheus.so;\n\n    haskell_run_service simpleService_prometheusConf $hs_prometheus_conf\n            'PrometheusConf\n                { pcMetrics = fromList\n                    [(\"cnt_4xx\", 
\"Number of responses with 4xx status\")\n                    ,(\"cnt_5xx\", \"Number of responses with 5xx status\")\n                    ,(\"cnt_stub_status_active\", \"Active requests\")\n                    ,(\"cnt_uptime\", \"Nginx master uptime\")\n                    ,(\"cnt_uptime_reload\", \"Nginx master uptime after reload\")\n                    ,(\"hst_request_time\", \"Request duration\")\n                    ]\n                , pcGauges = fromList\n                    [\"cnt_stub_status_active\"]\n                , pcScale1000 = fromList\n                    [\"hst_request_time_sum\"]\n                }';\n\n    haskell_var_empty_on_error $hs_prom_metrics;\n\n    counters_survive_reload on;\n\n    server {\n        listen       8010;\n        server_name  main;\n        error_log    /tmp/nginx-test-haskell-error.log;\n        access_log   /tmp/nginx-test-haskell-access.log;\n\n        counter $cnt_4xx inc $inc_cnt_4xx;\n        counter $cnt_5xx inc $inc_cnt_5xx;\n\n        # cache $request_time and $bytes_sent\n        haskell_run ! $hs_request_time $request_time;\n        haskell_run ! 
$hs_bytes_sent $bytes_sent;\n\n        histogram $hst_request_time 11 $request_time_bucket;\n        haskell_run scale1000 $hs_request_time_scaled $hs_request_time;\n        counter $hst_request_time_sum inc $hs_request_time_scaled;\n\n        histogram $hst_bytes_sent 6 $bytes_sent_bucket;\n        counter $hst_bytes_sent_sum inc $hs_bytes_sent;\n\n        location / {\n            echo_sleep 0.5;\n            echo Ok;\n        }\n\n        location /1 {\n            echo_sleep 1.0;\n            echo Ok;\n        }\n\n        location /404 {\n            return 404;\n        }\n    }\n\n    server {\n        listen       8020;\n        server_name  stats;\n\n        location / {\n            haskell_run toPrometheusMetrics $hs_prom_metrics\n                    '[\"main\"\n                     ,$cnt_collection\n                     ,$cnt_histograms\n                     ,{\"cnt_stub_status_active\": $cnt_stub_status_active\n                      ,\"cnt_uptime\": $cnt_uptime\n                      ,\"cnt_uptime_reload\": $cnt_uptime_reload\n                      }\n                     ]';\n\n            if ($hs_prom_metrics = '') {\n                return 503;\n            }\n\n            default_type \"text/plain; version=0.0.4; charset=utf-8\";\n\n            echo -n $hs_prom_metrics;\n        }\n\n        location /counters {\n            default_type application/json;\n            echo $cnt_collection;\n        }\n\n        location /histograms {\n            default_type application/json;\n            echo $cnt_histograms;\n        }\n\n        location /uptime {\n            echo \"Uptime (after reload): $cnt_uptime ($cnt_uptime_reload)\";\n        }\n    }\n}\n```\n\nType *PrometheusConf* contains fields *pcMetrics*, *pcGauges*, and\n*pcScale1000*. Field *pcMetrics* is a map from metrics names to help\nmessages: this can be used to bind small descriptions to the metrics as\n*nginx-custom-counters-module* does not provide such functionality. 
Setting\ndescriptions for counters is optional. Field *pcGauges* lists counters that\nmust be regarded as gauges: the number of currently active requests is\nobviously a gauge. Field *pcScale1000* contains a list of counters that were\nscaled with *scale1000* and must be converted back.\n\nHandler *toPrometheusMetrics* expects 4 fields: the name of the\n*counter set identifier* \u0026mdash; in our example there is only one counter\nset *main*, predefined variables *cnt_collection* and *cnt_histograms* from\n*nginx-custom-counters-module*, and a list of additional counters \u0026mdash; in\nour example there are three additional counters *cnt_stub_status_active*,\n*cnt_uptime*, and *cnt_uptime_reload* which are also defined in\n*nginx-custom-counters-module*.\n\nTo complete a histogram description in Prometheus, the *sum* value must be\nprovided. Histogram sums are not supported in *nginx-custom-counters-module*,\nand therefore they must be declared in separate counters. In this example\nthere are two histograms collecting request durations and the number of sent\nbytes, and accordingly, there are two sum counters: *hst_request_time_sum*\nand *hst_bytes_sent_sum*. As request durations may be as short as milliseconds\nwhile being expressed in seconds, they must be scaled with *scale1000*.\n\nTo further ensure histogram validity, it is important to have the last bucket\nin a histogram labeled as *\"+Inf\"*.
This is achieved automatically when\nthe number of range boundaries in directive *map_to_range_index* is less by\none than the number in the corresponding histogram declaration: in this\nexample, the map for *request_time_bucket* has 10 range boundaries while\nhistogram *hst_request_time* has 11 buckets, the map for *bytes_sent_bucket*\nhas 5 range boundaries while histogram *hst_bytes_sent* has 6 buckets.\n\nNotice that the variable handler *toPrometheusMetrics* and directive *echo*\nin location */* can be replaced with a single content handler\n*prometheusMetrics* like in the following block.\n\n```nginx\n        location / {\n            haskell_async_content prometheusMetrics\n                    '[\"main\"\n                     ,$cnt_collection\n                     ,$cnt_histograms\n                     ,{\"cnt_stub_status_active\": $cnt_stub_status_active\n                      ,\"cnt_uptime\": $cnt_uptime\n                      ,\"cnt_uptime_reload\": $cnt_uptime_reload\n                      }\n                     ]';\n        }\n```\n\n###### A simple test\n\nLet's look at the metrics right after starting Nginx.\n\n```ShellSession\n$ curl -s 'http://localhost:8020/'\n# HELP cnt_4xx Number of responses with 4xx status\n# TYPE cnt_4xx counter\ncnt_4xx 0.0\n# HELP cnt_5xx Number of responses with 5xx status\n# TYPE cnt_5xx counter\ncnt_5xx 0.0\n# HELP cnt_stub_status_active Active requests\n# TYPE cnt_stub_status_active gauge\ncnt_stub_status_active 1.0\n# HELP cnt_uptime Nginx master uptime\n# TYPE cnt_uptime counter\ncnt_uptime 8.0\n# HELP cnt_uptime_reload Nginx master uptime after reload\n# TYPE cnt_uptime_reload counter\ncnt_uptime_reload 8.0\n# HELP hst_bytes_sent\n# TYPE hst_bytes_sent histogram\nhst_bytes_sent_bucket{le=\"0\"} 0\nhst_bytes_sent_bucket{le=\"10\"} 0\nhst_bytes_sent_bucket{le=\"100\"} 0\nhst_bytes_sent_bucket{le=\"1000\"} 0\nhst_bytes_sent_bucket{le=\"10000\"} 0\nhst_bytes_sent_bucket{le=\"+Inf\"} 0\nhst_bytes_sent_count 
0\nhst_bytes_sent_sum 0.0\n# HELP hst_bytes_sent_err\n# TYPE hst_bytes_sent_err counter\nhst_bytes_sent_err 0.0\n# HELP hst_request_time Request duration\n# TYPE hst_request_time histogram\nhst_request_time_bucket{le=\"0.005\"} 0\nhst_request_time_bucket{le=\"0.01\"} 0\nhst_request_time_bucket{le=\"0.05\"} 0\nhst_request_time_bucket{le=\"0.1\"} 0\nhst_request_time_bucket{le=\"0.5\"} 0\nhst_request_time_bucket{le=\"1.0\"} 0\nhst_request_time_bucket{le=\"5.0\"} 0\nhst_request_time_bucket{le=\"10.0\"} 0\nhst_request_time_bucket{le=\"30.0\"} 0\nhst_request_time_bucket{le=\"60.0\"} 0\nhst_request_time_bucket{le=\"+Inf\"} 0\nhst_request_time_count 0\nhst_request_time_sum 0.0\n# HELP hst_request_time_err\n# TYPE hst_request_time_err counter\nhst_request_time_err 0.0\n```\n\nRun some requests and look at the metrics again.\n\n```ShellSession\n$ for i in {1..20} ; do curl -D- 'http://localhost:8010/' \u0026 done\n  ...\n$ for i in {1..30} ; do curl -D- 'http://localhost:8010/1' \u0026 done\n  ...\n$ curl 'http://127.0.0.1:8010/404'\n  ...\n```\n\n```ShellSession\n$ curl -s 'http://localhost:8020/'\n# HELP cnt_4xx Number of responses with 4xx status\n# TYPE cnt_4xx counter\ncnt_4xx 1.0\n# HELP cnt_5xx Number of responses with 5xx status\n# TYPE cnt_5xx counter\ncnt_5xx 0.0\n# HELP cnt_stub_status_active Active requests\n# TYPE cnt_stub_status_active gauge\ncnt_stub_status_active 1.0\n# HELP cnt_uptime Nginx master uptime\n# TYPE cnt_uptime counter\ncnt_uptime 371.0\n# HELP cnt_uptime_reload Nginx master uptime after reload\n# TYPE cnt_uptime_reload counter\ncnt_uptime_reload 371.0\n# HELP hst_bytes_sent\n# TYPE hst_bytes_sent histogram\nhst_bytes_sent_bucket{le=\"0\"} 0\nhst_bytes_sent_bucket{le=\"10\"} 0\nhst_bytes_sent_bucket{le=\"100\"} 0\nhst_bytes_sent_bucket{le=\"1000\"} 51\nhst_bytes_sent_bucket{le=\"10000\"} 51\nhst_bytes_sent_bucket{le=\"+Inf\"} 51\nhst_bytes_sent_count 51\nhst_bytes_sent_sum 9458.0\n# HELP hst_bytes_sent_err\n# TYPE hst_bytes_sent_err 
counter\nhst_bytes_sent_err 0.0\n# HELP hst_request_time Request duration\n# TYPE hst_request_time histogram\nhst_request_time_bucket{le=\"0.005\"} 1\nhst_request_time_bucket{le=\"0.01\"} 1\nhst_request_time_bucket{le=\"0.05\"} 1\nhst_request_time_bucket{le=\"0.1\"} 1\nhst_request_time_bucket{le=\"0.5\"} 13\nhst_request_time_bucket{le=\"1.0\"} 44\nhst_request_time_bucket{le=\"5.0\"} 51\nhst_request_time_bucket{le=\"10.0\"} 51\nhst_request_time_bucket{le=\"30.0\"} 51\nhst_request_time_bucket{le=\"60.0\"} 51\nhst_request_time_bucket{le=\"+Inf\"} 51\nhst_request_time_count 51\nhst_request_time_sum 40.006\n# HELP hst_request_time_err\n# TYPE hst_request_time_err counter\nhst_request_time_err 0.0\n```\n\n---\n\nModule *NgxExport.Tools.Prometheus* has limited support for extracting data from\nlists of values. Normally, variables from Nginx upstream module such as\n*upstream_status*, *upstream_response_time* and others contain lists of values\nseparated by commas and semicolons. With handler *statusLayout*, numbers of\n*2xx*, *3xx*, *4xx* and *5xx* responses from backends can be collected in a\ncomma-separated list. 
Handlers *cumulativeValue* and *cumulativeFPValue* can be\nused to count cumulative integer and floating point numbers from lists of\nvalues.\n\nLet's add checking upstream statuses and cumulative response times from all\nservers in an upstream into the original file *nginx.conf* from the previous\nexample.\n\n###### File *nginx.conf*: checking upstream statuses and response times\n\n```nginx\n    upstream backends {\n        server 127.0.0.1:8030 max_fails=0;\n        server 127.0.0.1:8040 max_fails=0;\n    }\n```\n\n```nginx\n    server {\n        listen       8030;\n        server_name  backend1;\n\n        location / {\n            echo_sleep 0.5;\n            echo_status 404;\n            echo \"Backend1 Ok\";\n        }\n    }\n\n    server {\n        listen       8040;\n        server_name  backend2;\n\n        location / {\n            echo_status 504;\n            echo \"Backend2 Ok\";\n        }\n    }\n```\n\nHere we added upstream *backends* with two virtual servers that will play\nthe role of backends. 
One of them will wait for half a second and return\nHTTP status *404*, while the other will return HTTP status *504* immediately.\nBoth servers are tagged with *max_fails=0* to prevent blacklisting them.\n\nWe also have to add counters and mappings.\n\n```nginx\n    map $hs_upstream_status $inc_cnt_u_4xx {\n        default                               0;\n        '~^(?:(?:\\d+),){2}(?P\u003cm_status\u003e\\d+)'  $m_status;\n    }\n\n    map $hs_upstream_status $inc_cnt_u_5xx {\n        default                               0;\n        '~^(?:(?:\\d+),){3}(?P\u003cm_status\u003e\\d+)'  $m_status;\n    }\n\n    map_to_range_index $hs_u_response_time $u_response_time_bucket\n        0.005\n        0.01\n        0.05\n        0.1\n        0.5\n        1.0\n        5.0\n        10.0\n        30.0\n        60.0;\n```\n\n```nginx\n        haskell_run statusLayout $hs_upstream_status $upstream_status;\n        counter $cnt_u_4xx inc $inc_cnt_u_4xx;\n        counter $cnt_u_5xx inc $inc_cnt_u_5xx;\n\n        # cache $upstream_response_time\n        haskell_run ! $hs_u_response_times $upstream_response_time;\n\n        histogram $hst_u_response_time 11 $u_response_time_bucket;\n        histogram $hst_u_response_time undo;\n        haskell_run cumulativeFPValue $hs_u_response_time $hs_u_response_times;\n        haskell_run scale1000 $hs_u_response_time_scaled $hs_u_response_time;\n```\n\nNotice that histogram *hst_u_response_time* was disabled on this level to\nnot count visiting unrelated locations (i.e. 
*/*, */1*, and */404*): the\nhistogram will be re-enabled later in locations related to proxying requests.\nThe sum counter will also be declared inside the proxying locations and take\nthe value of *hs_u_response_time_scaled* as the input value.\n\nSo many new variables require a bigger hash table to store them.\n\n```nginx\n    variables_hash_max_size 4096;\n```\n\nAnd finally, we have to update counters declarations in\n*simpleService_prometheusConf* and add location */backends* in the main\nserver.\n\n```nginx\n    haskell_run_service simpleService_prometheusConf $hs_prometheus_conf\n            'PrometheusConf\n                { pcMetrics = fromList\n                    [(\"cnt_4xx\", \"Number of responses with 4xx status\")\n                    ,(\"cnt_5xx\", \"Number of responses with 5xx status\")\n                    ,(\"cnt_u_4xx\"\n                     ,\"Number of responses from upstreams with 4xx status\")\n                    ,(\"cnt_u_5xx\"\n                     ,\"Number of responses from upstreams with 5xx status\")\n                    ,(\"cnt_stub_status_active\", \"Active requests\")\n                    ,(\"cnt_uptime\", \"Nginx master uptime\")\n                    ,(\"cnt_uptime_reload\", \"Nginx master uptime after reload\")\n                    ,(\"hst_request_time\", \"Request duration\")\n                    ,(\"hst_u_response_time\"\n                     ,\"Response time from all servers in a single upstream\")\n                    ]\n                , pcGauges = fromList\n                    [\"cnt_stub_status_active\"]\n                , pcScale1000 = fromList\n                    [\"hst_request_time_sum\"\n                    ,\"hst_u_response_time_sum\"\n                    ]\n                }';\n```\n\n```nginx\n        location /backends {\n            histogram $hst_u_response_time reuse;\n            counter $hst_u_response_time_sum inc $hs_u_response_time_scaled;\n            error_page 404 @status404;\n            
proxy_intercept_errors on;\n            proxy_pass http://backends;\n        }\n\n        location @status404 {\n            histogram $hst_u_response_time reuse;\n            counter $hst_u_response_time_sum inc $hs_u_response_time_scaled;\n            echo_sleep 0.2;\n            echo \"Caught 404\";\n        }\n```\n\nWe are going to additionally increase response time by *0.2* seconds when a\nbackend server responds with HTTP status *404*, and this is why location\n*@status404* was added.\n\n###### A simple test\n\nLet's restart Nginx and run a simple test.\n\n```ShellSession\n$ for i in {1..20} ; do curl -D- 'http://localhost:8010/backends' \u0026 done\n  ...\n```\n\n```ShellSession\n$ curl -s 'http://127.0.0.1:8020/'\n# HELP cnt_4xx Number of responses with 4xx status\n# TYPE cnt_4xx counter\ncnt_4xx 11.0\n# HELP cnt_5xx Number of responses with 5xx status\n# TYPE cnt_5xx counter\ncnt_5xx 9.0\n# HELP cnt_stub_status_active Active requests\n# TYPE cnt_stub_status_active gauge\ncnt_stub_status_active 1.0\n# HELP cnt_u_4xx Number of responses from upstreams with 4xx status\n# TYPE cnt_u_4xx counter\ncnt_u_4xx 11.0\n# HELP cnt_u_5xx Number of responses from upstreams with 5xx status\n# TYPE cnt_u_5xx counter\ncnt_u_5xx 9.0\n# HELP cnt_uptime Nginx master uptime\n# TYPE cnt_uptime counter\ncnt_uptime 63.0\n# HELP cnt_uptime_reload Nginx master uptime after reload\n# TYPE cnt_uptime_reload counter\ncnt_uptime_reload 63.0\n# HELP hst_bytes_sent\n# TYPE hst_bytes_sent histogram\nhst_bytes_sent_bucket{le=\"0\"} 0\nhst_bytes_sent_bucket{le=\"10\"} 0\nhst_bytes_sent_bucket{le=\"100\"} 0\nhst_bytes_sent_bucket{le=\"1000\"} 20\nhst_bytes_sent_bucket{le=\"10000\"} 20\nhst_bytes_sent_bucket{le=\"+Inf\"} 20\nhst_bytes_sent_count 20\nhst_bytes_sent_sum 4032.0\n# HELP hst_bytes_sent_err\n# TYPE hst_bytes_sent_err counter\nhst_bytes_sent_err 0.0\n# HELP hst_request_time Request duration\n# TYPE hst_request_time histogram\nhst_request_time_bucket{le=\"0.005\"} 
9\nhst_request_time_bucket{le=\"0.01\"} 9\nhst_request_time_bucket{le=\"0.05\"} 9\nhst_request_time_bucket{le=\"0.1\"} 9\nhst_request_time_bucket{le=\"0.5\"} 9\nhst_request_time_bucket{le=\"1.0\"} 20\nhst_request_time_bucket{le=\"5.0\"} 20\nhst_request_time_bucket{le=\"10.0\"} 20\nhst_request_time_bucket{le=\"30.0\"} 20\nhst_request_time_bucket{le=\"60.0\"} 20\nhst_request_time_bucket{le=\"+Inf\"} 20\nhst_request_time_count 20\nhst_request_time_sum 7.721\n# HELP hst_request_time_err\n# TYPE hst_request_time_err counter\nhst_request_time_err 0.0\n# HELP hst_u_response_time Response time from all servers in a single upstream\n# TYPE hst_u_response_time histogram\nhst_u_response_time_bucket{le=\"0.005\"} 9\nhst_u_response_time_bucket{le=\"0.01\"} 9\nhst_u_response_time_bucket{le=\"0.05\"} 9\nhst_u_response_time_bucket{le=\"0.1\"} 9\nhst_u_response_time_bucket{le=\"0.5\"} 13\nhst_u_response_time_bucket{le=\"1.0\"} 20\nhst_u_response_time_bucket{le=\"5.0\"} 20\nhst_u_response_time_bucket{le=\"10.0\"} 20\nhst_u_response_time_bucket{le=\"30.0\"} 20\nhst_u_response_time_bucket{le=\"60.0\"} 20\nhst_u_response_time_bucket{le=\"+Inf\"} 20\nhst_u_response_time_count 20\nhst_u_response_time_sum 5.519\n# HELP hst_u_response_time_err\n# TYPE hst_u_response_time_err counter\nhst_u_response_time_err 0.0\n```\n\nCounters look good. 
The numbers of visits to the two backend servers are almost equal (11\nand 9). The sum of cumulative response times from backends is approximately 5\nseconds, while the sum of all request durations is approximately 7 seconds.\nThe difference of about 2 seconds corresponds to 11 visits to location\n*@status404* with the sleep time of *0.2* seconds that was added there.\n\n---\n\nIn the previous examples we used many counters which served similar purposes.\nFor example, counters *cnt_4xx*, *cnt_5xx*, *cnt_u_4xx*, and *cnt_u_5xx*\ncounted response statuses in different conditions: the first two\ncounters counted *4xx* and *5xx* response statuses sent to clients, while the\nlast two counters counted *4xx* and *5xx* response statuses received from the\nupstream. They could naturally be shown as a single compound counter\nparameterized by the range of values and the origin. We also had two\nhistograms *hst_request_time* and *hst_u_response_time* which could likewise be\ncombined into a single entity parameterized by the scope (the time of the whole\nrequest versus the time spent in the upstream).\n\nFortunately, Prometheus provides a mechanism for such custom\nparameterization: *labels* in metrics.
This module supports the\nparameterization with labels by expecting special *annotations* attached to\nthe names of the counters.\n\nLet's parameterize the status counters and the request times as it was\nproposed at the beginning of this section.\n\n###### File *nginx.conf*: changes related to counters annotations\n\n```nginx\n    haskell_run_service simpleService_prometheusConf $hs_prometheus_conf\n            'PrometheusConf\n                { pcMetrics = fromList\n                    [(\"cnt_status\", \"Number of responses with given status\")\n                    ,(\"cnt_stub_status_active\", \"Active requests\")\n                    ,(\"cnt_uptime\", \"Nginx master uptime\")\n                    ,(\"cnt_uptime_reload\", \"Nginx master uptime after reload\")\n                    ,(\"hst_request_time\", \"Request duration\")\n                    ]\n                , pcGauges = fromList\n                    [\"cnt_stub_status_active\"]\n                , pcScale1000 = fromList\n                    [\"hst_request_time@scope=(total)_sum\"\n                    ,\"hst_request_time@scope=(in_upstreams)_sum\"\n                    ]\n                }';\n```\n\n```nginx\n        counter $cnt_status@value=(4xx),from=(response) inc $inc_cnt_4xx;\n        counter $cnt_status@value=(5xx),from=(response) inc $inc_cnt_5xx;\n\n        haskell_run statusLayout $hs_upstream_status $upstream_status;\n        counter $cnt_status@value=(4xx),from=(upstream) inc $inc_cnt_u_4xx;\n        counter $cnt_status@value=(5xx),from=(upstream) inc $inc_cnt_u_5xx;\n\n        # cache $request_time and $bytes_sent\n        haskell_run ! $hs_request_time $request_time;\n        haskell_run ! 
$hs_bytes_sent $bytes_sent;\n\n        histogram $hst_request_time@scope=(total) 11 $request_time_bucket;\n        haskell_run scale1000 $hs_request_time_scaled $hs_request_time;\n        counter $hst_request_time@scope=(total)_sum inc $hs_request_time_scaled;\n\n        histogram $hst_bytes_sent 6 $bytes_sent_bucket;\n        counter $hst_bytes_sent_sum inc $hs_bytes_sent;\n\n        # cache $upstream_response_time\n        haskell_run ! $hs_u_response_times $upstream_response_time;\n\n        histogram $hst_request_time@scope=(in_upstreams) 11\n                $u_response_time_bucket;\n        histogram $hst_request_time@scope=(in_upstreams) undo;\n        haskell_run cumulativeFPValue $hs_u_response_time $hs_u_response_times;\n        haskell_run scale1000 $hs_u_response_time_scaled $hs_u_response_time;\n\n        location / {\n            echo_sleep 0.5;\n            echo Ok;\n        }\n\n        location /1 {\n            echo_sleep 1.0;\n            echo Ok;\n        }\n\n        location /404 {\n            return 404;\n        }\n\n        location /backends {\n            histogram $hst_request_time@scope=(in_upstreams) reuse;\n            counter $hst_request_time@scope=(in_upstreams)_sum inc\n                    $hs_u_response_time_scaled;\n            error_page 404 @status404;\n            proxy_intercept_errors on;\n            proxy_pass http://backends;\n        }\n\n        location @status404 {\n            histogram $hst_request_time@scope=(in_upstreams) reuse;\n            counter $hst_request_time@scope=(in_upstreams)_sum inc\n                    $hs_u_response_time_scaled;\n            echo_sleep 0.2;\n            echo \"Caught 404\";\n        }\n```\n\nNotice that the 4 status counters were combined into a compound counter\n*cnt_status* whose name was annotated by a tail starting with *@*. This\nannotation gets put in the list of labels of the Prometheus metrics with\nsymbols *(* and *)* replaced by *\"* without any further validation. 
The\nrequest time histograms and the corresponding sum counters were annotated in\na similar way. Annotations in histogram sum counters must be put between the\nbase name of the counter and the suffix *_sum*.\n\n###### A simple test\n\n```ShellSession\n$ curl 'http://127.0.0.1:8010/404'\n  ...\n$ for i in {1..20} ; do curl -D- 'http://localhost:8010/backends' \u0026 done\n  ...\n```\n\n```ShellSession\n$ curl -s 'http://localhost:8020/'\n# HELP cnt_status Number of responses with given status\n# TYPE cnt_status counter\ncnt_status{value=\"4xx\",from=\"response\"} 11.0\ncnt_status{value=\"4xx\",from=\"upstream\"} 10.0\ncnt_status{value=\"5xx\",from=\"response\"} 10.0\ncnt_status{value=\"5xx\",from=\"upstream\"} 10.0\n# HELP cnt_stub_status_active Active requests\n# TYPE cnt_stub_status_active gauge\ncnt_stub_status_active 1.0\n# HELP cnt_uptime Nginx master uptime\n# TYPE cnt_uptime counter\ncnt_uptime 70.0\n# HELP cnt_uptime_reload Nginx master uptime after reload\n# TYPE cnt_uptime_reload counter\ncnt_uptime_reload 70.0\n# HELP hst_bytes_sent\n# TYPE hst_bytes_sent histogram\nhst_bytes_sent_bucket{le=\"0\"} 0\nhst_bytes_sent_bucket{le=\"10\"} 0\nhst_bytes_sent_bucket{le=\"100\"} 0\nhst_bytes_sent_bucket{le=\"1000\"} 21\nhst_bytes_sent_bucket{le=\"10000\"} 21\nhst_bytes_sent_bucket{le=\"+Inf\"} 21\nhst_bytes_sent_count 21\nhst_bytes_sent_sum 4348.0\n# HELP hst_bytes_sent_err\n# TYPE hst_bytes_sent_err counter\nhst_bytes_sent_err 0.0\n# HELP hst_request_time Request duration\n# TYPE hst_request_time histogram\nhst_request_time_bucket{le=\"0.005\",scope=\"in_upstreams\"} 10\nhst_request_time_bucket{le=\"0.01\",scope=\"in_upstreams\"} 10\nhst_request_time_bucket{le=\"0.05\",scope=\"in_upstreams\"} 10\nhst_request_time_bucket{le=\"0.1\",scope=\"in_upstreams\"} 10\nhst_request_time_bucket{le=\"0.5\",scope=\"in_upstreams\"} 14\nhst_request_time_bucket{le=\"1.0\",scope=\"in_upstreams\"} 20\nhst_request_time_bucket{le=\"5.0\",scope=\"in_upstreams\"} 
20\nhst_request_time_bucket{le=\"10.0\",scope=\"in_upstreams\"} 20\nhst_request_time_bucket{le=\"30.0\",scope=\"in_upstreams\"} 20\nhst_request_time_bucket{le=\"60.0\",scope=\"in_upstreams\"} 20\nhst_request_time_bucket{le=\"+Inf\",scope=\"in_upstreams\"} 20\nhst_request_time_count{scope=\"in_upstreams\"} 20\nhst_request_time_sum{scope=\"in_upstreams\"} 5.012\nhst_request_time_bucket{le=\"0.005\",scope=\"total\"} 11\nhst_request_time_bucket{le=\"0.01\",scope=\"total\"} 11\nhst_request_time_bucket{le=\"0.05\",scope=\"total\"} 11\nhst_request_time_bucket{le=\"0.1\",scope=\"total\"} 11\nhst_request_time_bucket{le=\"0.5\",scope=\"total\"} 11\nhst_request_time_bucket{le=\"1.0\",scope=\"total\"} 21\nhst_request_time_bucket{le=\"5.0\",scope=\"total\"} 21\nhst_request_time_bucket{le=\"10.0\",scope=\"total\"} 21\nhst_request_time_bucket{le=\"30.0\",scope=\"total\"} 21\nhst_request_time_bucket{le=\"60.0\",scope=\"total\"} 21\nhst_request_time_bucket{le=\"+Inf\",scope=\"total\"} 21\nhst_request_time_count{scope=\"total\"} 21\nhst_request_time_sum{scope=\"total\"} 7.02\n# HELP hst_request_time_err\n# TYPE hst_request_time_err counter\nhst_request_time_err{scope=\"in_upstreams\"} 0.0\nhst_request_time_err{scope=\"total\"} 0.0\n```\n\n#### Module *NgxExport.Tools.Resolve*\n\nWith Nginx module\n[nginx-upconf-module](https://github.com/lyokha/nginx-haskell-module/tree/master/examples/dynamicUpstreams),\nit is possible to update servers inside upstreams dynamically. The module\nrequires an agent to update a bound variable with upstreams layout and also\nsignal that the variable has been altered. This module is such an agent. It\nupdates the variable with the upstreams layout in service\n*collectUpstreams* and signals about this in service callback\n*signalUpconf*. Collecting upstreams encompasses DNS queries of *A* and\n*SRV* records. The queries are configured independently for each managed\nupstream. 
With both *A* and *SRV* queries, the module allows configuration\nof complex hierarchies of priorities given that compound upstream containers\nnamed *upstrands* are in use (they are implemented in\n[nginx-combined-upstreams-module](https://github.com/lyokha/nginx-combined-upstreams-module)).\n\nAdditionally, the module exports a number of functions and data types which\nimplement service *collectUpstreams*.\n\n##### An example\n\nIn the following example, we are going to extract IP addresses from an *SRV*\nrecord for *_http._tcp.mycompany.com* to inhabit upstream *utest*.\n\n###### File *test_tools_extra_resolve.hs*\n\n```haskell\nmodule TestToolsExtraResolve where\n\nimport NgxExport.Tools.Resolve ()\n```\n\nThe file does not contain any significant declarations as we are going to use\nonly the exporters.\n\n###### File *nginx.conf*\n\n```nginx\nuser                    nginx;\nworker_processes        4;\n\nevents {\n    worker_connections  1024;\n}\n\nerror_log               /tmp/nginx-test-upconf-error.log notice;\n\nhttp {\n    default_type        application/octet-stream;\n    sendfile            on;\n\n    error_log           /tmp/nginx-test-upconf-error.log notice;\n    access_log          /tmp/nginx-test-upconf-access.log;\n\n    upstream utest {\n        zone utest 64k;\n        upconf_round_robin;\n        server localhost:9000;\n    }\n\n    haskell load /var/lib/nginx/test_tools_extra_resolve.so;\n\n    haskell_run_service simpleService_collectUpstreams $hs_upstreams\n        'Conf { upstreams =\n                    [UData { uQuery =\n                                 QuerySRV\n                                     (Name \"_http._tcp.mycompany.com\")\n                                         (SinglePriority \"utest\")\n                           , uMaxFails = 1\n                           , uFailTimeout = 10\n                           }\n                    ]\n              , maxWait = Sec 300\n              , waitOnException = Sec 2\n              , 
responseTimeout = Unset\n              }';\n\n    haskell_service_var_ignore_empty $hs_upstreams;\n    haskell_service_var_in_shm upstreams 64k /tmp $hs_upstreams;\n\n    haskell_service_var_update_callback simpleService_signalUpconf $hs_upstreams\n        '[\"http://127.0.0.1:8010/upconf\"]';\n\n    server {\n        listen          localhost:8010;\n        server_name     main;\n\n        location /upconf {\n            upconf $hs_upstreams;\n\n            allow 127.0.0.1;\n            deny  all;\n        }\n\n        location /upstreams {\n            default_type application/json;\n            echo $hs_upstreams;\n\n            allow 127.0.0.1;\n            deny  all;\n        }\n\n        location / {\n            proxy_pass http://utest;\n        }\n    }\n\n    server {\n        listen          localhost:9000;\n        server_name     backend9000;\n\n        location / {\n            echo_status 503;\n            echo \"Not configured\";\n        }\n    }\n}\n```\n\nAt the start of Nginx, upstream *utest* contains a statically declared server\nwhich reports *Not configured*, but as soon as service *collectUpstreams*\ncollects servers for the upstream in variable *\\$hs_upstreams*, and then\nthe *upconf* module gets notified about this via callback *signalUpconf*, the\nupstream gets inhabited by the collected servers. Notice that *signalUpconf*\naccepts a *list* of URLs, which means that it can broadcast collected servers\nto multiple *upconf* endpoints listening on this or other hosts.\n\nThe upstream contents will be re-checked within the time interval of\n*(1 or waitOnException, maxWait)*. In particular, if an exception happens\nduring the collection of the servers, then the service will restart in\n*waitOnException*.
If there were no exceptions and the smallest value of\n*TTL* calculated from all collected servers does not exceed the value of\n*maxWait*, then the service will restart after this *TTL* value.\n\nExcessively long response times may also cause exceptions during the\ncollection of the servers. The timeout is defined by the value of\n*responseTimeout*. In our example, the timeout is not set.\n\nNotice that we used *QuerySRV* and *SinglePriority \"utest\"*. The latter\nmeans that all collected servers will inhabit upstream *utest* regardless of\ntheir priority values. To distribute collected servers among a number of\nupstreams, we can use *PriorityList*.\n\n###### File *nginx.conf*: collect upstreams with *PriorityList*\n\n```nginx\n    haskell_run_service simpleService_collectUpstreams $hs_upstreams\n        'Conf { upstreams =\n                    [UData { uQuery =\n                                 QuerySRV\n                                     (Name \"_http._tcp.mycompany.com\")\n                                         (PriorityList [\"utest\", \"utest1\"])\n                           , uMaxFails = 1\n                           , uFailTimeout = 10\n                           }\n                    ]\n              , maxWait = Sec 300\n              , waitOnException = Sec 2\n              , responseTimeout = Unset\n              }';\n```\n\nWith this configuration, servers with the highest priority will inhabit\nupstream *utest*, while servers with lower priority will inhabit upstream\n*utest1*.
Upstream *utest1* must also be managed by the *upconf* module.\nGenerally, let the number of upstreams in the priority list be $N$ and\nthe number of distinct server priorities collected in the response\nbe $M$. If $N$ is less than $M$, then servers of the remaining $M - N$\nlowest priorities won't be used in the upstreams at all. Otherwise, if\n$N$ is greater than $M$, then the remaining $N - M$ upstreams at the end of\nthe priority list will contain the same servers of the lowest priority.\n\nUpstreams in the priority list can be put inside of an *upstrand* to form the\nmain and the backup layers of servers.\n\n###### File *nginx.conf*: upstrand *utest*\n\n```nginx\n    upstream utest1 {\n        zone utest1 64k;\n        upconf_round_robin;\n        server localhost:9000;\n    }\n\n    upstrand utest {\n        upstream utest;\n        upstream utest1;\n        order per_request;\n        next_upstream_statuses error timeout 5xx;\n        next_upstream_timeout 60s;\n    }\n```\n\n###### File *nginx.conf*: location */upstrand*\n\n```nginx\n        location /upstrand {\n            proxy_pass http://$upstrand_utest;\n        }\n```\n\n#### Module *NgxExport.Tools.ServiceHookAdaptor*\n\nThis module exports a *simple service* (in terms of module *NgxExport.Tools*)\n*simpleService_hookAdaptor* which sleeps forever. Its sole purpose is to\nserve *service hooks* for changing global data in all the worker processes at\nrun-time.
A single service hook adaptor can serve any number of service hooks\nwith any type of global data.\n\n##### An example\n\n###### File *test_tools_extra_servicehookadaptor.hs*\n\n```haskell\n{-# LANGUAGE TemplateHaskell, OverloadedStrings #-}\n\nmodule TestToolsExtraServiceHookAdaptor where\n\nimport           NgxExport\nimport           NgxExport.Tools.ServiceHookAdaptor ()\n\nimport           Data.ByteString (ByteString)\nimport qualified Data.ByteString as B\nimport qualified Data.ByteString.Lazy as L\nimport           Data.IORef\nimport           Control.Monad\nimport           Control.Exception\nimport           System.IO.Unsafe\n\ndata SecretWordUnset = SecretWordUnset\n\ninstance Exception SecretWordUnset\ninstance Show SecretWordUnset where\n    show = const \"unset\"\n\nsecretWord :: IORef ByteString\nsecretWord = unsafePerformIO $ newIORef \"\"\n{-# NOINLINE secretWord #-}\n\ntestSecretWord :: ByteString -\u003e IO L.ByteString\ntestSecretWord v = do\n    s \u003c- readIORef secretWord\n    when (B.null s) $ throwIO SecretWordUnset\n    return $ if v == s\n                 then \"success\"\n                 else \"\"\nngxExportIOYY 'testSecretWord\n\nchangeSecretWord :: ByteString -\u003e IO L.ByteString\nchangeSecretWord s = do\n    writeIORef secretWord s\n    return \"The secret word was changed\"\nngxExportServiceHook 'changeSecretWord\n```\n\nHere we are going to maintain a *secret word* of type *ByteString* in\nrun-time. When a worker process starts, the word is empty. The word can be\nchanged in run-time by triggering service hook *changeSecretWord*. 
Client\nrequests are managed differently depending on their knowledge of the secret,\nwhich is tested in handler *testSecretWord*.\n\n###### File *nginx.conf*\n\n```nginx\nuser                    nobody;\nworker_processes        2;\n\nevents {\n    worker_connections  1024;\n}\n\nerror_log               /tmp/nginx-test-haskell-error.log info;\n\nhttp {\n    default_type        application/octet-stream;\n    sendfile            on;\n    error_log           /tmp/nginx-test-haskell-error.log info;\n    access_log          /tmp/nginx-test-haskell-access.log;\n\n    haskell load /var/lib/nginx/test_tools_extra_servicehookadaptor.so;\n\n    haskell_run_service simpleService_hookAdaptor $hs_hook_adaptor noarg;\n\n    haskell_service_hooks_zone hooks 32k;\n\n    server {\n        listen       8010;\n        server_name  main;\n\n        location / {\n            haskell_run testSecretWord $hs_secret_word $arg_s;\n\n            if ($hs_secret_word = unset) {\n                echo_status 503;\n                echo \"Try later! The service is not ready!\";\n                break;\n            }\n\n            if ($hs_secret_word = success) {\n                echo_status 200;\n                echo \"Congrats! You know the secret word!\";\n                break;\n            }\n\n            echo_status 404;\n            echo \"Hmm, you do not know a secret!\";\n        }\n\n        location /change_sw {\n            allow 127.0.0.1;\n            deny all;\n\n            haskell_service_hook changeSecretWord $hs_hook_adaptor $arg_s;\n        }\n    }\n}\n```\n\nNotice that service *simpleService_hookAdaptor* is not shared. This is not\nessential, however, because shared services must work as well.\n\n###### A simple test\n\nRight after Nginx starts, the secret word service is not yet ready.\n\n```ShellSession\n$ curl 'http://127.0.0.1:8010/'\nTry later!
The service is not ready!\n```\n\nLet's change the secret word,\n\n```ShellSession\n$ curl 'http://127.0.0.1:8010/change_sw?s=secret'\n```\n\nand try again.\n\n```ShellSession\n$ curl 'http://127.0.0.1:8010/'\nHmm, you do not know a secret!\n$ curl 'http://127.0.0.1:8010/?s=try1'\nHmm, you do not know a secret!\n$ curl 'http://127.0.0.1:8010/?s=secret'\nCongrats! You know the secret word!\n```\n\nChange the secret word again.\n\n```ShellSession\n$ curl 'http://127.0.0.1:8010/change_sw?s=secret1'\n$ curl 'http://127.0.0.1:8010/?s=secret'\nHmm, you do not know a secret!\n$ curl 'http://127.0.0.1:8010/?s=secret1'\nCongrats! You know the secret word!\n```\n\nWhat if a worker process quits for some reason or crashes? Let's try!\n\n```ShellSession\n# ps -ef | grep nginx | grep worker\nnobody     13869   13868  0 15:43 ?        00:00:00 nginx: worker process\nnobody     13870   13868  0 15:43 ?        00:00:00 nginx: worker process\n# kill -QUIT 13869 13870\n# ps -ef | grep nginx | grep worker\nnobody     14223   13868  4 15:56 ?        00:00:00 nginx: worker process\nnobody     14224   13868  4 15:56 ?        00:00:00 nginx: worker process\n```\n\n```ShellSession\n$ curl 'http://127.0.0.1:8010/?s=secret1'\nCongrats! You know the secret word!\n```\n\nOur secret is still intact! This is because service hooks manage new worker\nprocesses as well as those that were running when a hook was triggered.\n\nNote, however, that the order in which service hooks are executed in a restarted\nworker process is not well-defined, which means that hooks that affect the\nsame data should be avoided. 
For example, we could declare another service\nhook to reset the secret word.\n\n###### File *test_tools_extra_servicehookadaptor.hs*: reset the secret word\n\n```haskell\nresetSecretWord :: ByteString -\u003e IO L.ByteString\nresetSecretWord = const $ do\n    writeIORef secretWord \"\"\n    return \"The secret word was reset\"\nngxExportServiceHook 'resetSecretWord\n```\n\n###### File *nginx.conf*: new location */reset_sw* in server *main*\n\n```nginx\n        location /reset_sw {\n            allow 127.0.0.1;\n            deny all;\n\n            haskell_service_hook resetSecretWord $hs_hook_adaptor;\n        }\n```\n\nBoth *changeSecretWord* and *resetSecretWord* alter the *secretWord* storage.\nThe order of their execution in a restarted worker process may differ from\nthe order in which they were triggered before the new worker started, and therefore the\nstate of *secretWord* in the new worker may differ from what it was before the restart.\n\nTo fix this issue in the example, get rid of hook *resetSecretWord* and use\ndirective *rewrite* to process the reset request in location */change_sw*.\n\n```nginx\n        location /reset_sw {\n            allow 127.0.0.1;\n            deny all;\n\n            rewrite ^ /change_sw last;\n        }\n```\n\nYou may also want to change the hook message in *changeSecretWord* to\nproperly log the reset case.\n\n```haskell\nchangeSecretWord :: ByteString -\u003e IO L.ByteString\nchangeSecretWord s = do\n    writeIORef secretWord s\n    return $ \"The secret word was \" `L.append` if B.null s\n                                                   then \"reset\"\n                                                   else \"changed\"\n```\n\n#### Module *NgxExport.Tools.Subrequest*\n\nUsing asynchronous variable handlers and services together with the HTTP\nclient from *Network.HTTP.Client* allows making HTTP subrequests easily.\nThis module provides such functionality by exporting asynchronous variable\nhandlers *makeSubrequest* and *makeSubrequestWithRead*, and 
functions\n*makeSubrequest* and *makeSubrequestWithRead* to build custom handlers.\n\n##### An example\n\n###### File *test_tools_extra_subrequest.hs*\n\n```haskell\n{-# LANGUAGE TemplateHaskell #-}\n\nmodule TestToolsExtraSubrequest where\n\nimport           NgxExport\nimport           NgxExport.Tools\nimport           NgxExport.Tools.Subrequest\n\nimport           Data.ByteString (ByteString)\nimport qualified Data.ByteString.Lazy as L\n\nmakeRequest :: ByteString -\u003e NgxExportService\nmakeRequest = const . makeSubrequest\n\nngxExportSimpleService 'makeRequest $ PersistentService $ Just $ Sec 10\n```\n\nHandler *makeRequest* will be used in a *periodical* service which will\nretrieve data from a specified URI every 10 seconds.\n\n###### File *nginx.conf*\n\n```nginx\nuser                    nobody;\nworker_processes        2;\n\nevents {\n    worker_connections  1024;\n}\n\nhttp {\n    default_type        application/octet-stream;\n    sendfile            on;\n\n    upstream backend {\n        server 127.0.0.1:8020;\n    }\n\n    haskell load /var/lib/nginx/test_tools_extra_subrequest.so;\n\n    haskell_run_service simpleService_makeRequest $hs_service_httpbin\n            '{\"uri\": \"https://httpbin.org\"}';\n\n    haskell_var_empty_on_error $hs_subrequest;\n\n    server {\n        listen       8010;\n        server_name  main;\n        error_log    /tmp/nginx-test-haskell-error.log;\n        access_log   /tmp/nginx-test-haskell-access.log;\n\n        location / {\n            haskell_run_async makeSubrequest $hs_subrequest\n                    '{\"uri\": \"http://127.0.0.1:8010/proxy\"\n                     ,\"headers\": [[\"Custom-Header\", \"$arg_a\"]]\n                     }';\n\n            if ($hs_subrequest = '') {\n                echo_status 404;\n                echo \"Failed to perform subrequest\";\n                break;\n            }\n\n            echo -n $hs_subrequest;\n        }\n\n        location ~ ^/proxy(.*) {\n            allow 
127.0.0.1;\n            deny all;\n            proxy_pass http://backend$1;\n        }\n\n        location /httpbin {\n            echo $hs_service_httpbin;\n        }\n    }\n\n    server {\n        listen       8020;\n        server_name  backend;\n\n        location / {\n            set $custom_header $http_custom_header;\n            echo \"In backend, Custom-Header is '$custom_header'\";\n        }\n    }\n}\n```\n\nConfigurations of subrequests are defined via JSON objects which contain URI\nand other relevant data such as HTTP method, request body and headers. In\nthis configuration we are running a periodical service which gets contents of\n*httpbin.org* every 10 seconds, and doing a subrequest to a virtual server\n*backend* on every request to location */*. In this subrequest, an HTTP\nheader *Custom-Header* is sent to the backend with value equal to the value\nof argument *a* from the client request's URI.\n\nIt is worth noting that making HTTP subrequests to the own Nginx service\n(e.g. via *127.0.0.1*) allows for leveraging well-known advantages of Nginx\nsuch as load-balancing via upstreams as it is happening in this example.\n\n###### A simple test\n\n```ShellSession\n$ curl -s 'http://localhost:8010/httpbin' | head\n\u003c!DOCTYPE html\u003e\n\u003chtml lang=\"en\"\u003e\n\n\u003chead\u003e\n    \u003cmeta charset=\"UTF-8\"\u003e\n    \u003ctitle\u003ehttpbin.org\u003c/title\u003e\n    \u003clink href=\"https://fonts.googleapis.com/css?family=Open+Sans:400,700|Source+Code+Pro:300,600|Titillium+Web:400,600,700\"\n        rel=\"stylesheet\"\u003e\n    \u003clink rel=\"stylesheet\" type=\"text/css\" href=\"/flasgger_static/swagger-ui.css\"\u003e\n    \u003clink rel=\"icon\" type=\"image/png\" href=\"/static/favicon.ico\" sizes=\"64x64 32x32 16x16\" /\u003e\n```\n\n```ShellSession\n$ curl 'http://localhost:8010/?a=Value'\nIn backend, Custom-Header is 'Value'\n```\n\nLet's do a nasty thing. 
By injecting a double quote into the argument *a* we\ncan break JSON parsing.\n\n```ShellSession\n$ curl -D- 'http://localhost:8010/?a=Value\"'\nHTTP/1.1 404 Not Found\nServer: nginx/1.17.9\nDate: Mon, 30 Mar 2020 14:42:42 GMT\nContent-Type: application/octet-stream\nTransfer-Encoding: chunked\nConnection: keep-alive\n\nFailed to perform subrequest\n```\n\n---\n\nMaking HTTP subrequests to the Nginx service itself via the loopback interface\n(e.g. via *127.0.0.1*) has the disadvantages of being neither very fast\n(compared with various types of local data communication channels) nor very\nsecure. Unix domain sockets are a better alternative in this sense. This\nmodule supports them via configuration service\n*simpleService_configureUDS*, where the path to the socket can be set, and\nby setting field *manager* to value *uds* in the subrequest configuration.\n\nTo extend the previous example to use Unix domain sockets, the\nfollowing declarations should be added.\n\n###### File *nginx.conf*: configuring the Unix domain socket\n\n```nginx\n    haskell_run_service simpleService_configureUDS $hs_service_uds\n            'UDSConf {udsPath = \"/tmp/backend.sock\"}';\n```\n\n*UDSConf* is an opaque type containing only one field *udsPath* with the path\nto the socket.\n\n###### File *nginx.conf*: new location */uds* in server *main*\n\n```nginx\n        location /uds {\n            haskell_run_async makeSubrequest $hs_subrequest\n                    '{\"uri\": \"http://backend_proxy/\"\n                     ,\"headers\": [[\"Custom-Header\", \"$arg_a\"]]\n                     ,\"manager\": \"uds\"\n                     }';\n\n            if ($hs_subrequest = '') {\n                echo_status 404;\n                echo \"Failed to perform subrequest\";\n                break;\n            }\n\n            echo -n $hs_subrequest;\n        }\n```\n\n###### File *nginx.conf*: new virtual server *backend_proxy*\n\n```nginx\n    server {\n        listen       
unix:/tmp/backend.sock;\n        server_name  backend_proxy;\n\n        location / {\n            proxy_pass http://backend;\n        }\n    }\n```\n\nThe server listens on the Unix domain socket with the path configured in\nservice *simpleService_configureUDS*.\n\n###### A simple test\n\n```ShellSession\n$ curl 'http://localhost:8010/uds?a=Value'\nIn backend, Custom-Header is 'Value'\n```\n\n---\n\nTo serve subrequests, a custom HTTP manager can be implemented and then\nregistered in a custom service handler with *registerCustomManager*. To\nenable this manager in a subrequest configuration, use field *manager*\nwith the key that was bound to the manager in *registerCustomManager*.\n\nFor example, let's implement a custom UDS manager which will serve\nconnections via Unix Domain Sockets as in the previous section.\n\n###### File *test_tools_extra_subrequest_custom_manager.hs*\n\n```haskell\n{-# LANGUAGE TemplateHaskell, OverloadedStrings #-}\n\nmodule TestToolsExtraSubrequestCustomManager where\n\nimport           NgxExport.Tools\nimport           NgxExport.Tools.Subrequest\n\nimport           Data.ByteString (ByteString)\nimport qualified Data.ByteString.Lazy as L\n\nimport           Network.HTTP.Client hiding (path)\nimport qualified Network.Socket as S\nimport qualified Network.Socket.ByteString as SB\nimport qualified Data.ByteString.Char8 as C8\n\nconfigureUdsManager :: ByteString -\u003e NgxExportService\nconfigureUdsManager = ignitionService $ \\path -\u003e voidHandler $ do\n    man \u003c- newManager defaultManagerSettings\n               { managerRawConnection = return $ openUDS path }\n    registerCustomManager \"myuds\" man\n    where openUDS path _ _ _ = do\n              s \u003c- S.socket S.AF_UNIX S.Stream S.defaultProtocol\n              S.connect s (S.SockAddrUnix $ C8.unpack path)\n              makeConnection (SB.recv s 4096) (SB.sendAll s) (S.close s)\n\nngxExportSimpleService 'configureUdsManager SingleShotService\n```\n\n###### File 
*nginx.conf*: configuring the custom manager\n\n```nginx\n    haskell_run_service simpleService_configureUdsManager $hs_service_myuds\n            '/tmp/myuds.sock';\n```\n\n###### File *nginx.conf*: new location */myuds* in server *main*\n\n```nginx\n        location /myuds {\n            haskell_run_async makeSubrequest $hs_subrequest\n                    '{\"uri\": \"http://backend_proxy_myuds\"\n                     ,\"headers\": [[\"Custom-Header\", \"$arg_a\"]]\n                     ,\"manager\": \"myuds\"\n                     }';\n\n            if ($hs_subrequest = '') {\n                echo_status 404;\n                echo \"Failed to perform subrequest\";\n                break;\n            }\n\n            echo -n $hs_subrequest;\n        }\n```\n\n###### File *nginx.conf*: new virtual server *backend_proxy_myuds*\n\n```nginx\n    server {\n        listen       unix:/tmp/myuds.sock;\n        server_name  backend_proxy_myuds;\n\n        location / {\n            proxy_pass http://backend;\n        }\n    }\n```\n\n---\n\nHandlers *makeSubrequest* and *makeSubrequestWithRead* return the response body\nof subrequests, skipping the response status and headers. To retrieve the full\ndata from a response, use the asynchronous variable handlers\n*makeSubrequestFull* and *makeSubrequestFullWithRead*,\nor the functions of the same names.\n\nUnlike the simple body handlers, there is no point in using the corresponding\nvariables directly because they hold binary-encoded values. Instead, the response\nstatus, headers, and body must be extracted using handlers\n*extractStatusFromFullResponse*, *extractHeaderFromFullResponse*,\nand *extractBodyFromFullResponse*, which are based on functions of the\nsame name. 
Handler *extractExceptionFromFullResponse* and the\ncorresponding function can be used to extract the error message if an\nexception has happened while making the subrequest: the value is empty if\nthere was no exception.\n\nLet's extend our example with these handlers.\n\nFile *test_tools_extra_subrequest.hs* does not have any changes as we are\ngoing to use exported handlers only.\n\n###### File *nginx.conf*: new location */full* in server *main*\n\n```nginx\n        location /full {\n            haskell_run_async makeSubrequestFull $hs_subrequest\n                    '{\"uri\": \"http://127.0.0.1:$arg_p/proxy\"\n                     ,\"headers\": [[\"Custom-Header\", \"$arg_a\"]]\n                     }';\n\n            haskell_run extractStatusFromFullResponse $hs_subrequest_status\n                    $hs_subrequest;\n\n            haskell_run extractHeaderFromFullResponse $hs_subrequest_header\n                    subrequest-header|$hs_subrequest;\n\n            haskell_run extractBodyFromFullResponse $hs_subrequest_body\n                    $hs_subrequest;\n\n            if ($hs_subrequest_status = 400) {\n                echo_status 400;\n                echo \"Bad request\";\n                break;\n            }\n\n            if ($hs_subrequest_status = 500) {\n                echo_status 500;\n                echo \"Internal server error while making subrequest\";\n                break;\n            }\n\n            if ($hs_subrequest_status = 502) {\n                echo_status 502;\n                echo \"Backend unavailable\";\n                break;\n            }\n\n            if ($hs_subrequest_status != 200) {\n                echo_status 404;\n                echo \"Subrequest status: $hs_subrequest_status\";\n                break;\n            }\n\n            echo    \"Subrequest status: $hs_subrequest_status\";\n            echo    \"Subrequest-Header: $hs_subrequest_header\";\n            echo -n \"Subrequest body: 
$hs_subrequest_body\";\n        }\n```\n\nNow we can recognize HTTP response statuses of subrequests and handle them\ndifferently. We also can read a response header *Subrequest-Header*.\n\n###### File *nginx.conf*: new response header *Subrequest-Header* in location */* of server *backend*\n\n```nginx\n            add_header Subrequest-Header \"This is response from subrequest\";\n```\n\n###### A simple test\n\n```ShellSession\n$ curl -D- 'http://localhost:8010/full?a=Value\"'\nHTTP/1.1 400 Bad Request\nServer: nginx/1.17.9\nDate: Sat, 04 Apr 2020 12:44:36 GMT\nContent-Type: application/octet-stream\nTransfer-Encoding: chunked\nConnection: keep-alive\n\nBad request\n```\n\nGood. Now we see that injecting a double quote into a JSON field makes a bad\nrequest.\n\n```ShellSession\n$ curl -D- 'http://localhost:8010/full?a=Value'\nHTTP/1.1 500 Internal Server Error\nServer: nginx/1.17.9\nDate: Sat, 04 Apr 2020 12:47:11 GMT\nContent-Type: application/octet-stream\nTransfer-Encoding: chunked\nConnection: keep-alive\n\nInternal server error while making subrequest\n```\n\nThis is also good. Now we are going to define port of the backend server via\nargument *arg_p*. 
Skipping this makes the URI unparsable\n(*http://127.0.0.1:/*), which leads to the error.\n\n```ShellSession\n$ curl -D- 'http://localhost:8010/full?a=Value\u0026p=8020'\nHTTP/1.1 200 OK\nServer: nginx/1.17.9\nDate: Sat, 04 Apr 2020 12:52:03 GMT\nContent-Type: application/octet-stream\nTransfer-Encoding: chunked\nConnection: keep-alive\n\nSubrequest status: 200\nSubrequest-Header: This is response from subrequest\nSubrequest body: In backend, Custom-Header is 'Value'\n```\n\nFinally, we are getting a good response with all the response data decoded\ncorrectly.\n\nLet's try another port.\n\n```ShellSession\n$ curl -D- 'http://localhost:8010/full?a=Value\u0026p=8021'\nHTTP/1.1 502 Bad Gateway\nServer: nginx/1.17.9\nDate: Sat, 04 Apr 2020 12:56:02 GMT\nContent-Type: application/octet-stream\nTransfer-Encoding: chunked\nConnection: keep-alive\n\nBackend unavailable\n```\n\nGood. There is no server listening on port 8021.\n\n---\n\nData encoded in the full response can be translated to *ContentHandlerResult*\nand forwarded downstream to the client in directive *haskell_content*.\nHandlers *fromFullResponse* and *fromFullResponseWithException*\nperform such a translation. Not all response headers may be\nforwarded downstream, so the handlers delete response headers with\nnames listed in set *notForwardableResponseHeaders* as well as all headers\nwith names starting with *X-Accel-* before sending the response to the\nclient. 
The set of not forwardable response headers can be customized in\nfunction *contentFromFullResponse*.\n\nLet's forward responses in location */full* when argument *proxy* in the\nclient request's URI is equal to *yes*.\n\n###### File *nginx.conf*: forward responses from location */full*\n\n```nginx\n            set $proxy_with_exception $arg_proxy$arg_exc;\n\n            if ($proxy_with_exception = yesyes) {\n                haskell_content fromFullResponseWithException $hs_subrequest;\n                break;\n            }\n\n            if ($arg_proxy = yes) {\n                haskell_content fromFullResponse $hs_subrequest;\n                break;\n            }\n```\n\n###### A simple test\n\n```ShellSession\n$ curl -D- 'http://localhost:8010/full?a=Value\u0026p=8020\u0026proxy=yes'\nHTTP/1.1 200 OK\nServer: nginx/1.17.9\nDate: Fri, 24 Jul 2020 13:14:33 GMT\nContent-Type: application/octet-stream\nContent-Length: 37\nConnection: keep-alive\nSubrequest-Header: This is response from subrequest\n\nIn backend, Custom-Header is 'Value'\n```\n\nNow let's get an error message in the response after feeding a wrong port\nvalue.\n\n```ShellSession\n$ curl -D- 'http://localhost:8010/full?a=Value\u0026p=8021\u0026proxy=yes\u0026exc=yes'\nHTTP/1.1 502 Bad Gateway\nServer: nginx/1.19.4\nDate: Mon, 14 Dec 2020 08:24:22 GMT\nContent-Length: 593\nConnection: keep-alive\n\nHttpExceptionRequest Request {\n  host                 = \"127.0.0.1\"\n  port                 = 8021\n  secure               = False\n  requestHeaders       = [(\"Custom-Header\",\"Value\")]\n  path                 = \"/proxy\"\n  queryString          = \"\"\n  method               = \"GET\"\n  proxy                = Nothing\n  rawBody              = False\n  redirectCount        = 10\n  responseTimeout      = ResponseTimeoutDefault\n  requestVersion       = HTTP/1.1\n  proxySecureMode      = ProxySecureWithConnect\n}\n (ConnectionFailure Network.Socket.connect: \u003csocket: 31\u003e: does not exist 
(Connection refused))\n```\n\n---\n\nA bridged HTTP subrequest streams the response body from the *source* end of\nthe *bridge* to the *sink* end. Both source and sink are subrequests\nconfigured with the familiar type *SubrequestConf*. They comprise another\nopaque type *BridgeConf*. The bridge abstraction is useful when some data is\ngoing to be copied from some source to some destination.\n\nA bridge can be configured using handlers *makeBridgedSubrequest*,\n*makeBridgedSubrequestWithRead*, *makeBridgedSubrequestFull*, and\n*makeBridgedSubrequestFullWithRead* derived from the functions with the\nsame names.\n\nLet's extend our example with bridged subrequests.\n\n###### File *test_tools_extra_subrequest.hs*: auxiliary read body handler\n\n```haskell\nreqBody :: L.ByteString -\u003e ByteString -\u003e IO L.ByteString\nreqBody = const . return\n\nngxExportAsyncOnReqBody 'reqBody\n```\n\nIn this example, we are going to collect the request body at the sink end\nwith an auxiliary handler *reqBody*.\n\n###### File *nginx.conf*: upstream *sink*\n\n```nginx\n    upstream sink {\n        server 127.0.0.1:8030;\n    }\n```\n\n###### File *nginx.conf*: new location */bridge* in server *main*\n\n```nginx\n        location /bridge {\n            haskell_run_async makeBridgedSubrequestFull $hs_subrequest\n                    '{\"source\":\n                        {\"uri\": \"http://127.0.0.1:$arg_p/proxy/bridge\"\n                        ,\"headers\": [[\"Custom-Header\", \"$arg_a\"]]\n                        }\n                     ,\"sink\":\n                        {\"uri\": \"http://sink_proxy/echo\"\n                        ,\"manager\": \"uds\"\n                        }\n                     }';\n\n            if ($arg_exc = yes) {\n                haskell_content fromFullResponseWithException $hs_subrequest;\n                break;\n            }\n\n            haskell_content fromFullResponse $hs_subrequest;\n        }\n```\n\n###### File *nginx.conf*: new location 
*/bridge* in server *backend*\n\n```nginx\n        location /bridge {\n            set $custom_header $http_custom_header;\n            add_header Subrequest-Header \"This is response from subrequest\";\n            echo \"The response may come in chunks!\";\n            echo \"In backend, Custom-Header is '$custom_header'\";\n        }\n```\n\n###### File *nginx.conf*: new servers *sink_proxy* and *sink*\n\n```nginx\n    server {\n        listen       unix:/tmp/backend.sock;\n        server_name  sink_proxy;\n\n        location / {\n            proxy_pass http://sink;\n        }\n    }\n\n    server {\n        listen       8030;\n        server_name  sink;\n\n        location /echo {\n            haskell_run_async_on_request_body reqBody $hs_rb noarg;\n            add_header Bridge-Header\n                    \"This response was bridged from subrequest\";\n            echo \"Here is the bridged response:\";\n            echo -n $hs_rb;\n        }\n    }\n```\n\nUpon receiving a request with URI */bridge* at the main server, we are going\nto connect to the *source* with the same URI at the server with port equal to\nargument *arg_p*, and then stream its response body to a *sink* with URI\n*/echo* via proxy server *sink_proxy*. Using an internal Nginx proxy server\nfor the sink end of the bridge is necessary if the sink end does not\nrecognize chunked HTTP requests! Note also that *method* of the sink\nsubrequest is always *POST* independently of whether or not and how exactly\nit was specified.\n\nThe source end puts into the bridge channel its response headers except those\nlisted in *notForwardableResponseHeaders* and those with names starting with\n*X-Accel-*. 
The request headers listed in the sink configuration are also\nsent: their values override the values of the headers of the same names sent\nin the response from the source end of the bridge.\n\nBridged HTTP subrequests have transactional semantics: any error that occurs at\neither end of a bridge makes the whole subrequest fail. Responses from the\nsource end of a bridge with *non-2xx* status codes are regarded as failures.\n\nIn this example, after receiving all streamed data, the sink collects the\nrequest body in variable *hs_rb* and merely sends it back as a response to\nthe original bridged subrequest. This response is then decoded with\nhandlers *fromFullResponse* or *fromFullResponseWithException* and finally\nreturned in the response to the client.\n\n###### A simple test\n\n```ShellSession\n$ curl -D- 'http://localhost:8010/bridge?a=Value\u0026p=8010\u0026exc=yes'\nHTTP/1.1 200 OK\nServer: nginx/1.19.4\nDate: Tue, 19 Oct 2021 13:12:46 GMT\nContent-Type: application/octet-stream\nContent-Length: 100\nConnection: keep-alive\nBridge-Header: This response was bridged from subrequest\n\nHere is the bridged response:\nThe response may come in chunks!\nIn backend, Custom-Header is 'Value'\n```\n\nA negative case.\n\n```ShellSession\n$ curl -D- 'http://localhost:8010/bridge?a=Value\u0026p=8021\u0026exc=yes'\nHTTP/1.1 502 Bad Gateway\nServer: nginx/1.19.4\nDate: Tue, 19 Oct 2021 13:16:18 GMT\nContent-Length: 600\nConnection: keep-alive\n\nHttpExceptionRequest Request {\n  host                 = \"127.0.0.1\"\n  port                 = 8021\n  secure               = False\n  requestHeaders       = [(\"Custom-Header\",\"Value\")]\n  path                 = \"/proxy/bridge\"\n  queryString          = \"\"\n  method               = \"GET\"\n  proxy                = Nothing\n  rawBody              = False\n  redirectCount        = 10\n  responseTimeout      = ResponseTimeoutDefault\n  requestVersion       = HTTP/1.1\n  proxySecureMode      = ProxySecureWithConnect\n}\n 
(ConnectionFailure Network.Socket.connect: \u003csocket: 32\u003e: does not exist (Connection refused))\n```\n\n#### Building and installation\n\n##### Build and install with cabal v1-commands\n\n###### Configure and build\n\n```ShellSession\n$ cabal v1-configure\n$ cabal v1-build\n```\n\n###### Install\n\n```ShellSession\n$ cabal v1-install\n```\n\nThe module is also available on\n[*hackage.haskell.org*](http://hackage.haskell.org/package/ngx-export-tools-extra),\nso you can simply install it from there with\n\n```ShellSession\n$ cabal v1-install ngx-export-tools-extra\n```\n\n##### Build as a dependency in a Nix-style local build aka v2-build\n\n```ShellSession\n$ cabal build\n```\n\nNote that module *NgxExport.Tools.PCRE* depends on the old PCRE library which\nmay be missing in basic sets of installed packages in modern Linux\ndistributions. If the PCRE module is not needed, you may simply skip building\nit by putting line\n\n```Cabal Config\nconstraints: ngx-export-tools-extra -pcre\n```\n\ninside *cabal.project*. Otherwise, the old PCRE library must be installed. In\n*Ubuntu 24.04* this can be done with command\n\n```ShellSession\n$ sudo apt-get install libpcre3-dev\n```\n\n##### Building custom libraries\n\nSee details in\n\n- [test/Aggregate/README.md](test/Aggregate/README.md),\n- [test/EDE/README.md](test/EDE/README.md),\n- [test/PCRE/README.md](test/PCRE/README.md),\n- [test/Prometheus/README.md](test/Prometheus/README.md),\n- [test/Resolve/README.md](test/Resolve/README.md),\n- [test/ServiceHookAdaptor/README.md](test/ServiceHookAdaptor/README.md),\n- [test/Subrequest/README.md](test/Subrequest/README.md).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flyokha%2Fngx-export-tools-extra","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flyokha%2Fngx-export-tools-extra","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flyokha%2Fngx-export-tools-extra/lists"}