{"id":13636483,"url":"https://github.com/openresty/stapxx","last_synced_at":"2025-04-19T08:32:18.294Z","repository":{"id":1021283,"uuid":"12009188","full_name":"openresty/stapxx","owner":"openresty","description":"Simple macro language extentions to systemtap","archived":false,"fork":false,"pushed_at":"2022-05-09T03:19:23.000Z","size":327,"stargazers_count":680,"open_issues_count":20,"forks_count":200,"subscribers_count":68,"default_branch":"master","last_synced_at":"2024-02-14T20:35:27.428Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Perl","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/openresty.png","metadata":{"files":{"readme":"README.markdown","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-08-09T19:42:39.000Z","updated_at":"2024-01-13T10:00:42.000Z","dependencies_parsed_at":"2022-08-06T10:01:23.423Z","dependency_job_id":null,"html_url":"https://github.com/openresty/stapxx","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openresty%2Fstapxx","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openresty%2Fstapxx/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openresty%2Fstapxx/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openresty%2Fstapxx/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/openresty","download_url":"https://codeload.github.com/openresty/stapxx/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249650193,"owners_co
unt":21305977,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T00:01:01.791Z","updated_at":"2025-04-19T08:32:17.954Z","avatar_url":"https://github.com/openresty.png","language":"Perl","funding_links":[],"categories":["Libraries","Use case","Perl"],"sub_categories":["OpenResty"],"readme":"NAME\n====\n\nstap++ - Simple macro language extensions to systemtap\n\nTable of Contents\n=================\n\n* [NAME](#name)\n* [Synopsis](#synopsis)\n* [Description](#description)\n* [Features](#features)\n    * [Standard Macro Variables](#standard-macro-variables)\n        * [$^exec_path](#exec_path)\n        * [$^libNAME_path](#libname_path)\n        * [$^arg_NAME](#arg_name)\n        * [Default values](#default-values)\n    * [User-defined Macro Variables](#user-defined-macro-variables)\n    * [Tapset Modules](#tapset-modules)\n    * [Shorthands](#shorthands)\n        * [@pfunc(FUNCTION)](#pfuncfunction)\n* [Samples](#samples)\n    * [ngx-rps](#ngx-rps)\n    * [ngx-req-latency-distr](#ngx-req-latency-distr)\n    * [ctx-switches](#ctx-switches)\n    * [ngx-lj-gc](#ngx-lj-gc)\n    * [lj-gc](#lj-gc)\n    * [ngx-lj-gc-objs](#ngx-lj-gc-objs)\n    * [lj-gc-objs](#lj-gc-objs)\n    * [ngx-lj-vm-states](#ngx-lj-vm-states)\n    * [lj-vm-states](#lj-vm-states)\n    * [ngx-lj-trace-exits](#ngx-lj-trace-exits)\n    * [ngx-lj-lua-bt](#ngx-lj-lua-bt)\n    * [lj-lua-bt](#lj-lua-bt)\n    * [ngx-lj-lua-stacks](#ngx-lj-lua-stacks)\n    * [lj-lua-stacks](#lj-lua-stacks)\n    * [epoll-et-lt](#epoll-et-lt)\n    * [epoll-loop-blocking-distr](#epoll-loop-blocking-distr)\n    * 
[sample-bt-leaks](#sample-bt-leaks)\n    * [ngx-lua-shdict-writes](#ngx-lua-shdict-writes)\n    * [ngx-lua-shdict-info](#ngx-lua-shdict-info)\n    * [ngx-single-req-latency](#ngx-single-req-latency)\n    * [ngx-rewrite-latency-distr](#ngx-rewrite-latency-distr)\n    * [ngx-lua-exec-time](#ngx-lua-exec-time)\n    * [ngx-lua-tcp-recv-time](#ngx-lua-tcp-recv-time)\n    * [ngx-lua-tcp-total-recv-time](#ngx-lua-tcp-total-recv-time)\n    * [ngx-lua-udp-recv-time](#ngx-lua-udp-recv-time)\n    * [ngx-lua-udp-total-recv-time](#ngx-lua-udp-total-recv-time)\n    * [ngx-orig-resp-body-len](#ngx-orig-resp-body-len)\n    * [zlib-deflate-chunk-size](#zlib-deflate-chunk-size)\n    * [lj-str-tab](#lj-str-tab)\n    * [func-latency-distr](#func-latency-distr)\n    * [ngx-count-conns](#ngx-count-conns)\n    * [ngx-lua-count-timers](#ngx-lua-count-timers)\n    * [cpu-hogs](#cpu-hogs)\n    * [cpu-robbers](#cpu-robbers)\n    * [ngx-pcre-dist](#ngx-pcre-dist)\n    * [ngx-pcre-top](#ngx-pcre-top)\n    * [vfs-page-cache-misses](#vfs-page-cache-misses)\n    * [openssl-handshake-diagnosis](#openssl-handshake-diagnosis)\n* [Installation](#installation)\n* [Author](#author)\n* [Copyright and License](#copyright-and-license)\n* [See Also](#see-also)\n\nStatus\n======\n\n**IMPORTANT!!! This project is no longer maintained and our focus has been shifted to a much better dynamic tracing platform named [OpenResty XRay](https://openresty.com/en/xray/). 
Existing users of the tools here are recommended to switch too.**\n\nThe stap++ language is now superseded by the [Y lang](https://doc.openresty.com/en/ylang/) as part of the [OpenResty XRay](https://openresty.com/en/xray/) platform.\n\nSynopsis\n========\n\n```bash\n    $ stap++ -I ./tapset -x 12345 --arg limit=10 samples/ngx-upstream-post-conn.sxx\n    $ stap++ -e 'probe begin { println(\"hello\") exit() }'\n```\n\nDescription\n===========\n\nThis interpreter adds some simple macro language extensions to the systemtap scripting language.\n\nEfforts have been made to ensure that this macro language expansion does\nnot affect the source line numbers, so that the line numbers reported by `stap` exactly match those in the original `.sxx` source files.\n\nFeatures\n========\n\n[Back to TOC](#table-of-contents)\n\nStandard Macro Variables\n------------------------\n\n[Back to TOC](#table-of-contents)\n\n### $^exec_path\n\nThe variable `$^exec_path` always evaluates to the path to the executable file\nfor the pid specified by the `-x` or `--master` option.\n\nHere is an example:\n\n```stap\n    probe process(\"$^exec_path\").function(\"blah\") { ... }\n```\n\nWhen you specify the `--exec PATH` option on the command line,\nthis PATH is always used regardless of the presence of the `-x` or `--master` option.\n\n[Back to TOC](#table-of-contents)\n\n### $^libNAME_path\n\nThis variable expands to the absolute path of the DSO library file specified by a pattern.\n\n`stap++` automatically scans all the loaded DSO files in the running process (if the `-x PID` option is specified) to find a match. 
If it fails to find a match, this variable will take the value of `$^exec_path`, that is, assuming the library is statically linked.\n\nBelow is an example for tracing a user-land function in the libpcre library:\n\n```stap\n    probe process(\"$^libpcre_path\").function(\"pcre_exec\")\n    {\n        println(\"pcre_exec called\")\n        print_ubacktrace()\n    }\n```\n\n[Back to TOC](#table-of-contents)\n\n### $^arg_NAME\n\nThis variable can evaluate to the value of a specified command-line argument. For example, `$^arg_limit` is evaluated to the value of the command line argument `limit` specified like this:\n\n    stap++ --arg limit=1000\n\nYou can dump out all the available arguments in the stap++ script by specifying the --args option, for example:\n\n    $ stap++ --args foo.sxx\n    --arg method=VALUE (default: )\n    --arg time=VALUE (default: 60)\n\n[Back to TOC](#table-of-contents)\n\n### Default values\n\nIt's possible to specify a default value for a macro variable by means of the `default` trait, as in\n\n    foreach (key in stats- limit $^arg_limit :default(1000)) {\n        ...\n    }\n\nwhere `$^arg_limit` takes the default value 1000 when the user does not specify the `--arg limit=N` command-line option while invoking `stap++`.\n\n[Back to TOC](#table-of-contents)\n\nUser-defined Macro Variables\n----------------------------\n\nIt's possible to bind a `@cast()` or `@var()` expression to a user-defined macro variable of the form `$*NAME`. 
Here is an example,\n\n    sock = sockfd_lookup(fd)\n    $*sock := @cast(sock, \"socket\", \"kernel\")\n\n    printf(\", sock-\u003estate:%d\", $*sock-\u003estate)\n    state = $*sock-\u003esk-\u003e__sk_common-\u003eskc_state\n    printf(\", sock-\u003esk-\u003esk_state:%d (%s)\\n\", state, tcp_sockstate_str(state))\n\nNote that we used the `:=` operator to bind a `@cast()` or `@var()` expression to user variable `$*sock`, and later we reference it whenever we need that `@cast()` or `@var()` expression.\n\nThe scope of user variables is always limited to the current `.sxx` source file.\n\n[Back to TOC](#table-of-contents)\n\nTapset Modules\n--------------\n\nOne can use the stap++ language to define new tapset module files and later use the `@use` directive to load the module in a main stap++ program file.\n\nFor example, we can have a module file located at `./tapset/kernel/socket.sxx`:\n\n    // module kernel.socket\n    function socketfd_lookup(fd)\n    {\n        ...\n    }\n\nAnd then in a stap++ script file, `foo.sxx`, we can import this library like this\n\n    @use kernel.socket\n\nand in `foo.sxx`, we are now free to call the `socketfd_lookup` function defined in the `kernel.socket` module.\n\nFinally, we should invoke the `stap++` interpreter like this:\n\n    stap++ -I ./tapset foo.sxx ...\n\nNote the `-I ./tapset` option that specifies the search path for the stap++ tapset modules. 
The default module search paths are `.` and `\u003cbin-dir\u003e/tapset`, where `\u003cbin-dir\u003e` is the directory containing `stap++`.\n\nUnlike `stap`, only the used tapset modules are processed so as to reduce startup time.\n\nOne can `@use` multiple modules like this\n\n    @use kernel.socket\n    @use nginx.upstream\n\nor equivalently,\n\n    @use kernel.socket, nginx.upstream\n\nAll of the macro variables described above can be used in the tapset module files.\n\n[Back to TOC](#table-of-contents)\n\nShorthands\n----------\n\n[Back to TOC](#table-of-contents)\n\n### @pfunc(FUNCTION)\n\nThis is equivalent to `process(\"$^exec_path\").function(\"FUNCTION\")`.\n\nFor example,\n\n    probe @pfunc(ngx_http_upstream_finalize_request),\n          @pfunc(ngx_http_upstream_send_request)\n    {\n        ...\n    }\n\nis equivalent to\n\n    probe process(\"$^exec_path\").function(\"ngx_http_upstream_finalize_request\"),\n          process(\"$^exec_path\").function(\"ngx_http_upstream_send_request\")\n    {\n        ...\n    }\n\n[Back to TOC](#table-of-contents)\n\nSamples\n=======\n\n**IMPORTANT!!! The sample tools below are no longer maintained and our focus has been\nshifted to a much better dynamic tracing platform named\n[OpenResty XRay](https://openresty.com/en/xray). 
Existing users of the tools here are recommended to switch too.**\n\n[Back to TOC](#table-of-contents)\n\nngx-rps\n-------\n\nCalculate the current number of requests per second handled by the Nginx\nworker process specified by its pid:\n\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming one nginx worker process has the pid 19647.\n    $ ./samples/ngx-rps.sxx -x 19647\n    WARNING: Tracing process 19647.\n    Hit Ctrl-C to end.\n    [1376939543] 300 req/sec\n    [1376939544] 235 req/sec\n    [1376939545] 235 req/sec\n    [1376939546] 166 req/sec\n    [1376939547] 238 req/sec\n    [1376939548] 234 req/sec\n    ^C\n\nThe numbers in the leading square brackets are the current timestamp (seconds since the Epoch).\n\nBehind the scene, the Nginx main requests' completion events are traced.\n\n[Back to TOC](#table-of-contents)\n\nngx-req-latency-distr\n---------------------\n\nCalculates the distribution of the Nginx request latencies (excluding the request header reading time) in any specified\nNginx worker process at real time:\n\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    $ ./samples/ngx-req-latency-distr.sxx -x 28078\n    WARNING: Start tracing process 28078 (/path/to/some/program)...\n    ^C\n    Distribution of the main request latencies (in microseconds)\n    (min/avg/max: 92/242181/42808832)\n        value |-------------------------------------------------- count\n           16 |                                                      0\n           32 |                                                      0\n           64 |                                                      8\n          128 |                                                      1\n          256 |                                                      3\n          512 |@@@@@                                               274\n         1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  2474\n         2048 
|@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                     1547\n         4096 |@@@@@@@@@@@@@@@@@@@                                 952\n         8192 |@@@@@@@@@@                                          500\n        16384 |@@@@@@@                                             359\n        32768 |@@@@@@@@                                            414\n        65536 |@@@@@@@@@@@@                                        644\n       131072 |@@@@@@@@@@@@@@@@@                                   851\n       262144 |@@@@@@@@@@@@                                        614\n       524288 |@@@@@@                                              334\n      1048576 |@@                                                  147\n      2097152 |                                                     46\n      4194304 |                                                     24\n      8388608 |@                                                    64\n     16777216 |                                                      1\n     33554432 |                                                      1\n     67108864 |                                                      0\n    134217728 |                                                      0\n\nOne can also filter out requests by a specified request method name via the `--arg method=METHOD` option. 
For instance,\n\n    $ ./samples/ngx-req-latency-distr.sxx -x 5447 --arg method=POST --arg time=60\n    Start tracing process 5447 (/opt/nginx/sbin/nginx)...\n    Please wait for 60 seconds...\n    (Tracing only POST request methods)\n\n    Distribution of the main request latencies (in microseconds) for 52 samples:\n    (min/avg/max: 1167/8373/28281)\n    value |-------------------------------------------------- count\n      256 |                                                    0\n      512 |                                                    0\n     1024 |@@                                                  2\n     2048 |@@@@@@@@                                            8\n     4096 |@@@@@@@@@@@@@@@@@@@@@@@                            23\n     8192 |@@@@@@@@@@@@@@                                     14\n    16384 |@@@@@                                               5\n    32768 |                                                    0\n    65536 |                                                    0\n\nWe can also see from the example above that we can limit the sampling period by specifying the `--arg time=SECONDS` option.\n\n[Back to TOC](#table-of-contents)\n\nctx-switches\n------------\n\nCalculates the CPU context switching rate (number/second) in any specified user process at real time:\n\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming the target process pid is 6254:\n    $ ./samples/ctx-switches.sxx -x 6254\n    WARNING: Tracing process 6254 (/path/to/some/process).\n    Hit Ctrl-C to end.\n    [1379631372] 13741 cs/sec\n    [1379631373] 13330 cs/sec\n    [1379631374] 14263 cs/sec\n    [1379631375] 14424 cs/sec\n    [1379631376] 14591 cs/sec\n    [1379631377] 11108 cs/sec\n    [1379631378] 12620 cs/sec\n    [1379631379] 12519 cs/sec\n    [1379631380] 13479 cs/sec\n    [1379631381] 14614 cs/sec\n    [1379631382] 14721 cs/sec\n    [1379631383] 13408 cs/sec\n    [1379631384] 14682 cs/sec\n\nBoth switch-in and 
switch-out are counted in this tool.\n\nA high context switching rate usually means higher overhead in the system. Ideally,\nwe should keep the context switching rate low.\n\n[Back to TOC](#table-of-contents)\n\nngx-lj-gc\n---------\n\nThis tool has been renamed to [lj-gc](#lj-gc) because it is no longer specific to Nginx.\n\n[Back to TOC](#table-of-contents)\n\nlj-gc\n-----\n\nThis tool analyses the LuaJIT 2.0/2.1 GC in the specified \"luajit\" utility program's process or in the specified Nginx worker process running the [ngx_lua](http://wiki.nginx.org/HttpLuaModule) module.\n\nOther custom C processes with LuaJIT embedded can also be analyzed by this tool as long as the target C program saves the main Lua VM state (lua_State) pointer in a global C variable named `globalL`, just as in the standard `luajit` command-line utility program.\n\nFor now, it just prints out the total memory currently allocated in the LuaJIT GC. For example,\n\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming the nginx worker process's pid is 4771:\n    $ ./samples/lj-gc.sxx -x 4771\n    Start tracing 4771 (/opt/nginx/sbin/nginx)\n    Total GC count: 258618 bytes\n\n[Back to TOC](#table-of-contents)\n\nngx-lj-gc-objs\n--------------\n\nThis tool has been renamed to [lj-gc-objs](#lj-gc-objs) because it is no longer specific to Nginx.\n\n[Back to TOC](#table-of-contents)\n\nlj-gc-objs\n----------\n\nThis tool dumps the GC objects' memory usage stats in any specified `luajit` utility program's process or any specified running Nginx worker process,\nbroken down by GC object type.\n\nOther custom C processes with LuaJIT embedded can also be analyzed by this tool as long as the target C program saves the main Lua VM state (lua_State) pointer in a global C variable named `globalL`, just as in the standard `luajit` command-line utility program.\n\nThis tool reveals exactly how the memory is distributed among all Lua value types, which is useful for 
optimizing Lua code's memory usage and debugging memory leak issues in the Lua programs.\n\nFor now, both LuaJIT 2.0 and LuaJIT 2.1 are supported.\n\nHere is an example.\n\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming the nginx worker pid is 5686:\n    $ ./samples/lj-gc-objs.sxx -x 5686\n    Start tracing 5686 (/opt/nginx/sbin/nginx)\n\n    main machine code area size: 65536 bytes\n    C callback machine code size: 4096 bytes\n    GC total size: 922648 bytes\n    GC state: sweep\n\n    4713 table objects: max=6176, avg=90, min=32, sum=428600 (in bytes)\n    3341 string objects: max=2965, avg=47, min=18, sum=159305 (in bytes)\n    677 function objects: max=144, avg=29, min=20, sum=20224 (in bytes)\n    563 userdata objects: max=8895, avg=82, min=24, sum=46698 (in bytes)\n    306 proto objects: max=34571, avg=541, min=78, sum=165557 (in bytes)\n    287 upvalue objects: max=24, avg=24, min=24, sum=6888 (in bytes)\n    102 trace objects: max=928, avg=337, min=160, sum=34468 (in bytes)\n    8 cdata objects: max=24, avg=17, min=16, sum=136 (in bytes)\n    7 thread objects: max=1648, avg=1493, min=568, sum=10456 (in bytes)\n    JIT state size: 6920 bytes\n    global state tmpbuf size: 2948 bytes\n    C type state size: 2520 bytes\n\n    My GC walker detected for total 922648 bytes.\n    5782 microseconds elapsed in the probe handler.\n\nFor LuaJIT instances with big memory usage, you need to increase the `MAXACTION` threshold, as in\n\n    $ ./samples/lj-gc-objs.sxx -x 14378 -D MAXACTION=200000\n    Start tracing 14378 (/opt/nginx/sbin/nginx)\n\n    main machine code area size: 65536 bytes\n    C callback machine code size: 4096 bytes\n    GC total size: 9683407 bytes\n    GC state: pause\n\n    27948 table objects: max=131112, avg=106, min=32, sum=2983944 (in bytes)\n    22343 string objects: max=1421562, avg=198, min=18, sum=4432482 (in bytes)\n    12168 userdata objects: max=8916, avg=50, min=27, sum=619223 (in 
bytes)\n    2837 function objects: max=148, avg=27, min=20, sum=78264 (in bytes)\n    1200 upvalue objects: max=24, avg=24, min=24, sum=28800 (in bytes)\n    650 proto objects: max=3860, avg=313, min=74, sum=203902 (in bytes)\n    349 thread objects: max=1648, avg=774, min=424, sum=270464 (in bytes)\n    202 trace objects: max=1560, avg=375, min=160, sum=75832 (in bytes)\n    9 cdata objects: max=36, avg=17, min=12, sum=156 (in bytes)\n    JIT state size: 7696 bytes\n    global state tmpbuf size: 710772 bytes\n    C type state size: 4568 bytes\n\n    My GC walker detected for total 9683407 bytes.\n    45008 microseconds elapsed in the probe handler.\n\nThe \"objects\" are Lua values that actually participate in garbage\ncollection (GC).\n\nPrimitive Lua values like numbers, booleans, nils, and light user data\ndo not participate in GC, so we shall never see them listed in the\noutput of the `lj-gc-objs` tool.\n\nAnother interesting exception is empty Lua strings, they are specially\nhandled by LuaJIT and they never appear in the output either.\n\nThe following types of objects participate in GC and thus considered\n\"GC objects\" in LuaJIT 2.0:\n\n* string: Lua strings\n* upvalue: Lua Upvalues\n* thread: Lua threads (i.e., Lua coroutines)\n* proto: Lua function prototypes\n* function: Lua functions (Lua closures) and C functions\n* cdata: cdata created by the FFI API in Lua.\n* table: Lua tables\n* userdata: Lua user data\n* trace: JIT compiled Lua code paths\n\nNote that for the space calculated for aggregate objects like \"table\" and\n\"function\", only the size of their backbones is calculated. For\nexample, for a Lua table, we do not follow the references to its value\nobjects and key objects (but we do include the size of the \"key\" and\n\"value\" reference pointers themselves). Similarly, for a Lua function\nobject, we do not follow its upvalues either. 
Therefore, we can safely\nadd up the sizes of all the GC objects to obtain the total size\nallocated by the LuaJIT GC without counting any object twice.\n\nIt's worth mentioning that the LuaJIT GC may also allocate some space\nfor some components in the VM itself (on demand):\n\n* the JIT compiler state (i.e., the \"JIT state size\" item in the tool's output).\n* the state for FFI C types (i.e., the \"C type state size\" item in the output).\n* global state temporary buffer (i.e., the \"global state tmpbuf size\" item).\n\nThese allocations may not happen for trivial examples. For example, if\nthe JIT compiler is disabled, we won't see nonzero \"JIT state size\".\nSimilarly, if we don't use FFI in our Lua code at all, we won't see\nnonzero \"C type state size\" either.\n\nAs in any language with a proper GC, unused Lua objects will be\nautomatically freed by the LuaJIT GC. A Lua object becomes\nunused once it is no longer referenced from any \"GC root objects\",\neither directly or indirectly.\n\nIt is usually fine for Lua objects to stay a little longer than needed. It is also\nnormal to see the GC count (or the memory allocated by the GC) going\nup and down during the lifetime of the process.\n\nThat is how a typical incremental GC works. The LuaJIT GC cycle\nconsists of several different phases, and all these phases are divided\ninto many small pieces that interleave with normal Lua code execution.\nFor example, when the GC cycle is at the \"sweep-string\" phase,\nnon-string GC objects will not be freed at all until the GC cycle later\nenters the \"sweep\" phase.\n\nThis tool is good at finding out the largest GC objects\nin your already bloated Nginx worker processes. It is not really designed for debugging\nindividual requests due to the non-determinism of the GC at the micro\nlevel. You should load your nginx workers with tools like ab and\nweighttp or just trace workers in production, so as to make your nginx worker eat up a _lot_ of memory. 
The\nmore, the merrier. After that, run `lj-gc-objs` on your largest\n`luajit` process or `nginx` worker process.\n\nTo debug individual requests, you *can* force the LuaJIT GC to free all the unused objects at once by calling\n\n    collectgarbage()\n\nThis forces the LuaJIT (or standard Lua interpreter) GC to run a\ncomplete collection cycle immediately. See\nhttp://www.lua.org/manual/5.1/manual.html#pdf-collectgarbage for more\ndetails. But this is usually very expensive to call and it is strongly discouraged for production use.\n\n[Back to TOC](#table-of-contents)\n\nngx-lj-vm-states\n----------------\n\nThis tool has been renamed to [lj-vm-states](#lj-vm-states) because it is no longer specific to Nginx.\n\n[Back to TOC](#table-of-contents)\n\nlj-vm-states\n----------------\n\nThis tool samples the LuaJIT VM states in the specified `luajit` utility program or the specified nginx worker process (running the ngx_lua module) via the kernel's timer hooks.\n\nOther custom C processes with LuaJIT embedded can also be analyzed by this tool as long as the target C program saves the main Lua VM state (lua_State) pointer in a global C variable named `globalL`, just as in the standard `luajit` command-line utility program.\n\nIt shows how the CPU time is distributed among interpreted Lua code, (JIT) compiled Lua code, the garbage collector, and so on.\n\nThe following LuaJIT VM states are analyzed:\n\n* Interpreted\n\n    Running interpreted Lua code\n* Compiled\n\n    Running already compiled Lua code (this includes running C functions *compiled* by the `FFI` API)\n* C Code (by interpreted Lua)\n\n    Running C functions of the form `lua_CFunction` or called by the `FFI` API in the Lua interpreter.\n* Garbage Collector\n\n    Doing GC work in the garbage collector (for both compiled code and interpreted code).\n* Trace exiting\n\n    Exiting compiled Lua code and falling back to the Lua interpreter.\n* JIT Compiler\n\n    Compiling Lua code to native code in the Lua JIT 
compiler.\n\nFor now, this tool only supports LuaJIT v2.1.\n\nBelow are some examples:\n\n```bash\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n$ lj-vm-states.sxx -x 24405 --arg time=30\nStart tracing 24405 (/opt/nginx/sbin/nginx)\nPlease wait for 30 seconds...\n\nObserved 457 Lua-running samples and ignored 0 unrelated samples.\nC Code (by interpreted Lua): 78% (361 samples)\nInterpreted: 14% (65 samples)\nGarbage Collector: 4% (21 samples)\nCompiled: 1% (5 samples)\nJIT Compiler: 1% (5 samples)\n```\n\nIn this example, we can see that most of the CPU time is spent running interpreted Lua code and the C functions it calls, which leaves a lot of room for future speedup by JIT compiling more hot Lua code paths.\n\n```bash\n$ lj-vm-states.sxx -x 9087\nStart tracing 9087 (/opt/nginx/sbin/nginx)\nHit Ctrl-C to end.\n^C\nObserved 2082 Lua-running samples and ignored 0 unrelated samples.\nCompiled: 97% (2024 samples)\nGarbage Collector: 2% (56 samples)\nTrace exiting: 0% (1 samples)\nInterpreted: 0% (1 samples)\n```\n\nThis example shows that almost all the CPU time is spent on compiled Lua code, which is a very good sign.\n\nIf you see \"read faults\" errors, they are usually normal and you can just try specifying the `--skip-badvars` option to ignore them.\n\n[Back to TOC](#table-of-contents)\n\nngx-lj-trace-exits\n------------------\n\nThis tool analyzes the number of LuaJIT \"trace exits\" in each request handled by Nginx (with the [ngx_lua](https://github.com/chaoslawful/lua-nginx-module) module enabled).\n\nA \"trace exit\" is an event in which the LuaJIT VM exits a compiled trace (or a single unit of compiled Lua code path) and falls back to LuaJIT's Lua interpreter.\n\nBasically we hope every request has some trace exits, but not too many! 
Too many trace exits per request usually means highly fragmented compiled Lua code paths and relatively high overhead in synchronizing the VM state when leaving the compiled code and entering interpreted mode.\n\nConsider the following example,\n\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming the nginx worker process has the pid 24391:\n    $ ngx-lj-trace-exits.sxx -x 24391 --arg time=20\n    Start tracing process 24391 (/opt/nginx/sbin/nginx)...\n    Please wait for 20 seconds...\n\n    5 out of 2251 requests used compiled traces generated by LuaJIT.\n    Distribution of LuaJIT trace exit times per request for 5 sample(s):\n    (min/avg/max: 1/1/2)\n    value |-------------------------------------------------- count\n        0 |                                                   0\n        1 |@@@@                                               4\n        2 |@                                                  1\n        4 |                                                   0\n        8 |                                                   0\n\nWe can see that only 5 out of 2251 requests use compiled Lua code, which is a bad sign of slowness if most of the requests go through some Lua code.\n\nAlso, in the example above, we used the `--arg time=20` option to let\nthe tool automatically quit after 20 seconds of sampling.\n\nNow take a look at the following example,\n\n    $ ngx-lj-trace-exits.sxx -x 12718 --arg time=20\n    Start tracing process 12718 (/opt/nginx/sbin/nginx)...\n    Hit Ctrl-C to end.\n    ^C\n    18022 out of 18023 requests used compiled traces generated by LuaJIT.\n    Distribution of LuaJIT trace exit times per request for 18022 sample(s):\n    (min/avg/max: 4/4/6)\n    value |-------------------------------------------------- count\n        1 |                                                       0\n        2 |                                                       0\n        4 
|@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  18022\n        8 |                                                       0\n       16 |                                                       0\n\nwe can see that almost all the requests used compiled traces, which is great! And we can see that almost all the requests have 4 \"trace exits\" in their lifetime.\n\n[Back to TOC](#table-of-contents)\n\nngx-lj-lua-bt\n-------------\n\nThis tool has been renamed to [lj-lua-bt](#lj-lua-bt) because it is no longer specific to Nginx.\n\n[Back to TOC](#table-of-contents)\n\nlj-lua-bt\n-------------\n\nThis tool dumps out the current Lua backtrace in the running LuaJIT 2.1 VM in the specified \"luajit\" utility program or the specified nginx worker process\n(with the [ngx_lua](https://github.com/chaoslawful/lua-nginx-module) module enabled).\n\nOther custom C processes with LuaJIT embedded can also be analyzed by this tool as long as the target C program saves the main Lua VM state (lua_State) pointer in a global C variable named `globalL`, just as in the standard `luajit` command-line utility program.\n\nThis tool uses the kernel timer hook to preempt into the target nginx process. So even if the Lua code is in a tight loop or the C code called by some Lua code is spinning, we can get a proper Lua backtrace. 
But when the process is simply blocking on some system calls without consuming any CPU time at all, then this tool will just hang and keep waiting.\n\nThis tool is best for locating very hot or even infinite loops in some Lua code or C code called by some Lua code.\n\nBoth interpreted and compiled Lua code are supported (remember that LuaJIT ships with both a fast interpreter and an awesome JIT compiler?).\n\nBelow is an example,\n\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming the nginx worker process pid is 22544:\n    $ ./samples/lj-lua-bt.sxx --skip-badvars -x 22544\n    Start tracing 22544 (/opt/nginx/sbin/nginx)\n    @/home/agentzh/git/cf/nginx-waf/gen/lua/waf-core.lua:1283\n    /waf/rulesets/lua/modsecurity_setup.lua:66\n    @/home/agentzh/git/cf/nginx-waf/gen/lua/waf.lua:289\n    builtin#21\n    @/home/agentzh/git/cf/nginx-waf/gen/lua/waf.lua:309\n    access_by_lua:3\n\nThe file paths here identify the Lua source files that define the Lua functions (frames), and the number after the colon is the line number of the source line currently being executed.\n\nFor every Lua function frame, the line number of the Lua source line currently being executed will be printed out wherever possible. Failing that, the first line of the Lua function will be printed instead.\n\nFunction frames prefixed by `C:` are C function frames, and the text after `C:` is the C function name. Such C functions are only those called via the `lua_CFunction` mechanism, not LuaJIT FFI.\n\nFunction frames prefixed by `T:` are the Lua functions from which the (compiled) traces start. Traces generated by the JIT compiler could \"encompass more than one function\nand may correspond to multiple Lua stack levels\". But for simplicity, we only fetch their starting Lua function as a first-order approximation for now. 
This approximation should be good enough for most of the common cases.\n\nFunction frames like `builtin#82` are Lua builtin functions, and the number `82` here is the ID of the builtin function (or \"fast function\"). You can get the actual function name from the ID by running the [ljff.lua](https://github.com/agentzh/nginx-devel-utils/blob/master/ljff.lua) script using exactly the same LuaJIT, for example,\n\n    $ luajit-2.1.0-alpha /path/to/nginx-devel-utils/ljff.lua 82\n    FastFunc string.dump\n\nThis means the ID 82 corresponds to the \"string.dump\" builtin. The IDs of builtins might differ among versions of LuaJIT, so ensure you are using the same LuaJIT to run this Lua script.\n\n[Back to TOC](#table-of-contents)\n\nngx-lj-lua-stacks\n-----------------\n\n**WARNING!!!** This tool has many bugs and is obsoleted by [OpenResty XRay](https://openresty.com/en/xray/). See also the blog post [\"Introduction to Lua-Land CPU Flame Graphs\"](https://blog.openresty.com/en/lua-cpu-flame-graph/).\n\nThis tool has been renamed to [lj-lua-stacks](#lj-lua-stacks) because it is no longer specific to Nginx.\n\n[Back to TOC](#table-of-contents)\n\nlj-lua-stacks\n-------------\n\n**WARNING!!!** This tool has many bugs and is obsoleted by [OpenResty XRay](https://openresty.com/en/xray/). See also the blog post [\"Introduction to Lua-Land CPU Flame Graphs\"](https://blog.openresty.com/en/lua-cpu-flame-graph/).\n\nThis tool samples Lua backtraces in the running LuaJIT 2.1 VM of the specified `luajit` process or `nginx` worker process (with the [ngx_lua](https://github.com/chaoslawful/lua-nginx-module) module). 
The timer hook API of the Linux kernel is used for relatively even sampling according to the CPU time usage.\n\nIt aggregates identical Lua backtraces during the sampling process so the final output data will not be very big.\n\nOther custom C processes with LuaJIT embedded can also be analyzed by this tool as long as the target C program saves the main Lua VM state (lua_State) pointer in a global C variable named `globalL`, just as in the standard `luajit` command-line utility program.\n\nBelow are some examples:\n\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming the nginx worker process pid is 6949:\n    $ ./samples/lj-lua-stacks.sxx --skip-badvars -x 6949 \u003e a.bt\n    Start tracing 6949 (/opt/nginx/sbin/nginx)\n    Hit Ctrl-C to end.\n    ^C\n\nBy default, the tool will just keep sampling until you hit `Ctrl-C`. You can also specify the `--arg time=N` option to let the tool exit automatically after `N` seconds. For example,\n\n    $ ./samples/lj-lua-stacks.sxx --arg time=5 \\\n                --skip-badvars -x 6949 \u003e a.bt\n    Start tracing 6949 (/opt/nginx/sbin/nginx)\n    Please wait for 5 seconds\n\nFor the sample commands above, the tool's output data will then be in the file `a.bt`. 
The output format is\n\n    \u003cbacktrace-1\u003e\n    \\t\u003ccount-1\u003e\n    \u003cbacktrace-2\u003e\n    \\t\u003ccount-2\u003e\n    \u003cbacktrace-3\u003e\n    \\t\u003ccount-3\u003e\n    ...\n\nBelow is an example:\n\n    C:ngx_http_lua_blah\n    @/opt/gen/lua/waf-core.lua:770\n    @/opt/gen/lua/waf-core.lua:1283\n    /opt/gen/lua/waf-attacks.lua:38\n    builtin#21\n    @/opt/gen/lua/waf.lua:301\n    access_by_lua:1\n            97\n    T:@/opt/gen/lua/waf-core.lua:770\n    @/opt/gen/lua/waf-core.lua:770\n    @/opt/gen/lua/waf-core.lua:1283\n    /opt/gen/xss_attacks.lua:15\n    @/opt/gen/lua/waf.lua:174\n    builtin#21\n    @/opt/gen/lua/waf.lua:301\n    access_by_lua:1\n            52\n\nEach backtrace above follows the same format as in the output of the [lj-lua-bt](#lj-lua-bt) tool. The only difference is that the line numbers are always for the first source lines of the Lua functions.\n\nThis output can be consumed by the [FlameGraph tools](https://github.com/brendangregg/FlameGraph) (created by Brendan Gregg) to generate Lua-land Flame Graphs for performance analysis. 
For example,\n\n    $ stackcollapse-stap.pl a.bt \u003e a.cbt\n\n    $ flamegraph.pl --encoding=\"ISO-8859-1\" \\\n                  --title=\"Lua-land on-CPU flamegraph\" \\\n                  a.cbt \u003e a.svg\n\nThe resulting `a.svg` file is an interactive SVG graph that can be displayed in any decent web browser, like Google Chrome:\n\n    $ chrome a.svg\n\nYou can get much better Lua-land Flame Graphs by filtering the output of this `lj-lua-stacks` tool with the [fix-lua-bt](https://github.com/agentzh/nginx-systemtap-toolkit#fix-lua-bt) tool in my [Nginx Systemtap Toolkit](https://github.com/agentzh/nginx-systemtap-toolkit):\n\n    $ fix-lua-bt a.bt \u003e a2.bt\n\nThen feed the `a2.bt` file instead to Brendan Gregg's `stackcollapse-stap.pl` and `flamegraph.pl` tools to generate the graph.\n\nBelow are some real-world sample Flame Graphs on the Lua land:\n\n* http://agentzh.org/misc/flamegraph/lua-on-cpu-local-waf-jitted-only.svg\n* http://agentzh.org/misc/flamegraph/lua-on-cpu-local-waf-interp-only.svg\n\nBy default, this tool will sample backtraces of both interpreted Lua code and compiled Lua code at the same time (remember that LuaJIT comes with both a fast interpreter and an awesome JIT compiler?).\n\nYou can choose to sample interpreted Lua code only by specifying the `--arg nojit=1` option to ignore backtrace samples for (JIT) compiled Lua code. For instance,\n\n\n    $ ./samples/lj-lua-stacks.sxx --arg nojit=1 \\\n            --arg time=5 --skip-badvars -x $pid \u003e a.bt\n\nSimilarly, you can choose to see samples for JIT compiled Lua code only by specifying the `--arg nointerp=1` option to ignore samples for interpreted code. 
For example,\n\n    $ ./samples/lj-lua-stacks.sxx --arg nointerp=1 \\\n            --arg time=5 --skip-badvars -x $pid \u003e a.bt\n\nWhen you're sampling interpreted Lua backtraces, you might see some warnings like below when this tool is running:\n\n    WARNING: user string copy fault -14 at 00000000f9b56dac\n    WARNING: user string copy fault -14 at 00000000fffb4196\n    WARNING: user string copy fault -14 at 0000000004014212\n\nThese warnings are normal because our timer callback might run in an arbitrary context of the LuaJIT VM (thanks to the kernel's preemption feature), which means sometimes the VM might be in an inconsistent state. These read/copy faults will not affect the target (nginx) processes at all because systemtap temporarily disables page faults when reading memory from the userspace. We need not worry much about the accuracy of the sampling results unless there are too many such warnings (hundreds or even thousands), because statistical sampling tolerates a small number of bad samples.\n\nWhen sampling compiled Lua code's backtraces, we should not see these \"copy fault\" warnings at all because when compiled Lua code is running, the VM state is almost always consistent (because the running traces do not synchronize the VM state until they exit).\n\nBy default, this tool computes the line numbers at which the corresponding Lua functions are defined (that is, the first line of the Lua function definition). If you want the accurate position of the Lua source line being executed in the resulting backtraces, you can specify the `--arg detailed=1` command-line option.\n\nThis tool is superior to LuaJIT's builtin low-overhead profiler in that\n\n1. this tool never flushes the existing compiled traces but LuaJIT's builtin profiler always does upon profiler startups and exits,\n1. 
this tool does not affect the LuaJIT VM's interpreter dispatcher nor JIT compiler while LuaJIT's builtin profiler requires a special running mode (and cooperation) for both the interpreter and JIT compiler.\n1. this tool can construct backtraces for JIT compiled code directly while LuaJIT's builtin profiler requires falling back to the interpreter to evaluate the current backtraces.\n1. this tool can be turned on and off more easily on any running (nginx) processes specified by pid. And when this tool is not running, there is strictly zero overhead in the target process.\n1. bugs in this tool have no impact on the target process, even when reading from bad memory addresses.\n\nThe overhead imposed on the target process is usually small. For example, the throughput (req/sec) limit of an nginx worker process serving the simplest \"hello world\" requests drops by only 10% (only when this tool is running), as measured by `ab -k -c2 -n100000` when using Linux kernel 3.6.10 and systemtap 2.5. The impact on full-fledged production processes is usually even smaller, for instance, only a 5% drop in the throughput limit is observed in a production-level Lua CDN application.\n\nSpecial thanks go to Mike Pall for providing technical support on the\nLuaJIT VM internals.\n\nSee also\n\n1. The [lj-vm-states](#lj-vm-states) tool for sampling the LuaJIT VM states.\n1. Brendan Gregg's article \"[Flame Graphs](http://dtrace.org/blogs/brendan/2011/12/16/flame-graphs/)\".\n\n[Back to TOC](#table-of-contents)\n\nepoll-et-lt\n-----------\n\nThis tool checks how many epoll_ctl syscalls use the epoll edge-trigger (ET) mode and how many use the epoll level-trigger (LT) mode. The statistics are gathered and printed out every second. 
This tool is not specific to Nginx.\n\nHere is an example:\n\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    $ ./samples/epoll-et-lt.sxx -x 5728\n    Tracing epoll_ctl in user process 5728 (/opt/nginx/sbin/nginx)...\n    Hit Ctrl-C to end.\n    51 ET, 0 LT.\n    384 ET, 0 LT.\n    388 ET, 0 LT.\n    390 ET, 0 LT.\n    389 ET, 0 LT.\n    394 ET, 0 LT.\n    394 ET, 0 LT.\n    396 ET, 0 LT.\n    395 ET, 0 LT.\n    384 ET, 0 LT.\n    394 ET, 0 LT.\n    396 ET, 0 LT.\n    397 ET, 0 LT.\n    ^C\n\nWe can see that Nginx is using epoll ET exclusively :)\n\nBelow is another example for KyotoTycoon servers:\n\n    $ epoll-et-lt.sxx -x 4011\n    Tracing epoll_ctl in user process 4011 (/usr/local/bin/ktserver)...\n    Hit Ctrl-C to end.\n    0 ET, 0 LT.\n    0 ET, 3 LT.\n    0 ET, 0 LT.\n    0 ET, 2 LT.\n    0 ET, 0 LT.\n    0 ET, 0 LT.\n    0 ET, 4 LT.\n    0 ET, 3 LT.\n    0 ET, 0 LT.\n    0 ET, 5 LT.\n    0 ET, 1 LT.\n    0 ET, 1 LT.\n    0 ET, 1 LT.\n    0 ET, 0 LT.\n    ^C\n\nWe can see that the `ktserver` process is using epoll LT all the time :)\n\n[Back to TOC](#table-of-contents)\n\nepoll-loop-blocking-distr\n-------------------------\n\nThis tool can sample any specified user process for the specified time (by default, 5 seconds) and print out the distribution\nof the latency between successive epoll_wait syscalls.\n\nEssentially, it can give you a picture about the blocking latency involved in\nthe process's epoll-based event loop (if there is one).\n\nHere is an example for analyzing a massively blocked Nginx worker processes in production (by really bad disk IO):\n\n    $ ./samples/epoll-loop-blocking-distr.sxx -x 19647 --arg time=60\n    Start tracing 19647...\n    Please wait for 60 seconds.\n    Distribution of epoll loop blocking latencies (in milliseconds)\n    max/avg/min: 1097/0/0\n    value |-------------------------------------------------- count\n        0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  18471\n   
     1 |@@@@@@@@                                            3273\n        2 |@                                                    473\n        4 |                                                     119\n        8 |                                                      67\n       16 |                                                      51\n       32 |                                                      35\n       64 |                                                      20\n      128 |                                                      23\n      256 |                                                       9\n      512 |                                                       2\n     1024 |                                                       2\n     2048 |                                                       0\n     4096 |                                                       0\n\nNote the long tail from 4ms ~ 1.1sec.\n\nTo further analyse exactly what is blocking the epoll loop, you\ncan use the off-CPU and on-CPU flame graph tools:\n\nhttps://github.com/agentzh/nginx-systemtap-toolkit#sample-bt\n\nhttps://github.com/agentzh/nginx-systemtap-toolkit#sample-bt-off-cpu\n\n[Back to TOC](#table-of-contents)\n\nsample-bt-leaks\n---------------\n\nThis tool can sample backtraces for memory allocations based on glibc's builtins (`malloc`, `calloc`, `realloc`) that have not been freed (via `free`) in the sampling time period.\n\nHere is an example:\n\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming the target process has the pid 16795:\n    $./samples/sample-bt-leaks.sxx -x 16795 --arg time=5 \\\n            -D STP_NO_OVERLOAD -D MAXMAPENTRIES=10000 \u003e a.bt\n    WARNING: Start tracing 16795 (/opt/nginx/sbin/nginx)...\n    WARNING: Wait for 5 sec to complete.\n\n    $ export PATH=/path/to/FlameGraph:$PATH\n    $ stackcollapse-stap.pl a.bt \u003e a.cbt\n    $ flamegraph.pl --countname=bytes \\\n            
--title=\"Memory Leak Flame Graph\" a.cbt \u003e a.svg\n\nYou can now open the \"Memory Leak Flame Graph\" file, `a.svg`, in your favorite web browser.\n\nThe tools `stackcollapse-stap.pl` and `flamegraph.pl` are from the FlameGraph repository by Brendan Gregg:\n\nhttps://github.com/brendangregg/FlameGraph\n\nBelow is a sample \"Memory Leak Flame Graph\" for a real memory leak bug\nin older versions of the Nginx core:\n\nhttp://agentzh.org/misc/flamegraph/nginx-leaks-2013-10-08.svg\n\nFor more details about this bug, see http://forum.nginx.org/read.php?2,241478,241478\n\nThis tool is general and not specific to Nginx.\n\nThis tool requires the `uretprobes` feature in the kernel. If you are using an old kernel with the utrace patch applied, then you should be good. If you are using a mainline kernel, then you need at least 3.10.x.\n\nThis tool has relatively high overhead, especially for processes without (clever) custom allocators (but still *way* faster than Valgrind memcheck). So be careful when\nusing this tool in production. 
Only use this tool in production when you really have a leak.\n\nPlease note that the `realloc` function in some builds of glibc may not have correct argument values, so you *may* see false positives on code paths doing `realloc`.\n\nIf you are seeing the systemtap error\n\n```\nERROR: Array overflow, check MAXMAPENTRIES near identifier ...\n```\n\nthen you should increase the number in the `-D MAXMAPENTRIES=10000` command-line option.\n\nIf you are seeing the error\n\n```\nERROR: probe overhead exceeded threshold\n```\n\nthen you should specify the `-D STP_NO_OVERLOAD` command-line option.\n\nYou can find more details on Memory Leak Flame Graphs in Brendan Gregg's blog post:\n\nhttp://dtrace.org/blogs/brendan/2013/08/16/memory-leak-growth-flame-graphs/\n\nAnd general information about Flame Graphs here:\n\nhttp://dtrace.org/blogs/brendan/2011/12/16/flame-graphs/\n\n[Back to TOC](#table-of-contents)\n\nngx-lua-shdict-writes\n---------------------\n\nThis tool can be used to trace write operations (`set`/`add`/`replace`/`safe_set`) to\n[ngx_lua](https://github.com/chaoslawful/lua-nginx-module)'s shared dictionary zones in any running Nginx worker process\nin real time.\n\nBelow is an example:\n\n```bash\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming one nginx worker process has the pid 28723.\n    $ ./samples/ngx-lua-shdict-writes.sxx -x 28723\n    WARNING: Tracing process 28723 (/opt/nginx/sbin/nginx).\n    Hit Ctrl-C to end.\n    [1383177226] add key=visitor::127.3.20.88 value_len=-1 dict=locks\n    [1383177226] set key=visitor::127.3.20.88 value_len=18 dict=visitor_cache\n    [1383177226] set key=visitor::127.3.20.88 value_len=-1 dict=locks\n    [1383177226] add key=vzone::mybaz.net:127.26.29.8 value_len=-1 dict=locks\n    [1383177226] set key=vzone::mybaz.net:127.26.29.8 value_len=11 dict=zone_cache\n    [1383177226] set key=vzone::mybaz.net:174.96.24.8 value_len=-1 dict=locks\n    [1383177226] set 
key=::BIC:127.0.0.2:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36:- value_len=4 dict=visitor_cache\n    [1383177226] add key=visitor::127.0.0.1 value_len=-1 dict=locks\n    [1383177226] set key=visitor::127.0.0.1 value_len=18 dict=visitor_cache\n    [1383177226] set key=visitor::127.0.0.1 value_len=-1 dict=locks\n    [1383177226] add key=vzone::foobar.com:127.0.0.1 value_len=-1 dict=locks\n    [1383177226] set key=vzone::foobar.com:127.0.0.1 value_len=11 dict=zone_cache\n    ^C\n```\n\nThe `--arg dict=NAME` option can be used to filter writes to a particular shared dictionary zone:\n\n```bash\n    # assuming one nginx worker process has the pid 28723.\n    $ ./samples/ngx-lua-shdict-writes.sxx -x 28723 --arg dict=cpage_cache\n    WARNING: Tracing process 28723 (/opt/nginx/sbin/nginx).\n    Hit Ctrl-C to end.\n    [1383177035] set key=cpage::/cpage/cf-error/1000s/838156:4891573 value_len=7 dict=cpage_cache\n    [1383177035] set key=cpage::/cpage/block/ip-ban/171116:748534 value_len=1407861 dict=cpage_cache\n    [1383177036] set key=cpage::/cpage/block/ip-ban/171116:748534 value_len=1407861 dict=cpage_cache\n    [1383177040] set key=cpage::/cpage/cf-error/1000s/281904:1355323 value_len=309229 dict=cpage_cache\n    [1383177042] set key=cpage::/cpage/cf-error/1000s/405903:2067154 value_len=7 dict=cpage_cache\n    [1383177043] set key=cpage::/cpage/block/iuam-basic/841468:4916374 value_len=7 dict=cpage_cache\n    [1383177043] set key=cpage::/cpage/block/ip-ban/171116:748534 value_len=1407861 dict=cpage_cache\n    [1383177046] set key=cpage::/cpage/block/ip-ban/171116:748534 value_len=1407861 dict=cpage_cache\n    [1383177047] set key=cpage::/cpage/block/basic-sec-captcha/291111:4428016 value_len=7 dict=cpage_cache\n    ^C\n```\n\n[Back to TOC](#table-of-contents)\n\nngx-lua-shdict-info\n-------------------\n\nThis tool can be used to analyze the shared dictionary zone specified by the `--arg dict=name` option\nin 
the specified running nginx process.\nThis tool is very similar to [ngx-shm](https://github.com/openresty/nginx-systemtap-toolkit#ngx-shm),\nbut additionally reports the `rbtree black height`.\n\nBelow is an example:\n\n```bash\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming one nginx worker process has the pid 59141.\n    $ ./samples/ngx-lua-shdict-info.sxx -x 59141 --arg dict=session01\n    Start tracing 59141 (/usr/local/openresty-debug/nginx/sbin/nginx)\n    shm zone \"session01\"\n        owner: ngx_http_lua_shdict\n        total size: 102400 KB\n        free pages: 76792 KB (19198 pages, 1 blocks)\n        rbtree black height: 11\n\n    26 microseconds elapsed in the probe handler.\n```\n\n[Back to TOC](#table-of-contents)\n\nngx-single-req-latency\n----------------------\n\nAnalyzes the detailed latency breakdown of an individual request served by an Nginx server instance. This tool can measure the time spent in the major Nginx request processing phases (like `rewrite` phase, `access` phase, and `content` phase). It will also measure the Nginx upstream modules' latency on upstream connect() and so on.\n\nBy default it just tracks the first request it sees. 
For example,\n\n```bash\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming an nginx worker process's pid is 27327\n    $ ./samples/ngx-single-req-latency.sxx -x 27327\n    Start tracing process 27327 (/opt/nginx/sbin/nginx)...\n\n    POST /api_json\n        total: 143596us, accept() ~ header-read: 43048us, rewrite: 8us, pre-access: 7us, access: 6us, content: 100507us\n        upstream: connect=29us, time-to-first-byte=99157us, read=103us\n\n    $ ./samples/ngx-single-req-latency.sxx -x 27327\n    Start tracing process 27327 (/opt/nginx/sbin/nginx)...\n\n    GET /robots.txt\n        total: 61198us, accept() ~ header-read: 33410us, rewrite: 7us, pre-access: 7us, access: 5us, content: 27750us\n        upstream: connect=30us, time-to-first-byte=18955us, read=96us\n```\n\nHere we use `-x \u003cpid\u003e` to specify an nginx worker process. We can also specify `--master \u003cmaster-pid\u003e` to monitor all the Nginx worker processes under the specified master process.\n\nThe `--arg header=HEADER` option can be used to select the request by a specific request header, for instance,\n\n```bash\n    # assuming the nginx master process pid is 7088, and the request\n    # being analyzed has the \"Test\" request header with non-empty header value:\n    $ ./samples/ngx-single-req-latency.sxx --arg header=Test --master 7088\n    Start tracing process 7089 7090 (/opt/nginx/sbin/nginx)...\n\n    GET /proxy/get\n        total: 2941us, accept() ~ header-read: 354us, rewrite: 100us, pre-access: 26us, access: 23us, content: 2356us\n            upstream: connect=424us, time-to-first-byte=1059us, read=357us\n```\n\n[Back to TOC](#table-of-contents)\n\nngx-rewrite-latency-distr\n-------------------------\n\nMeasures the rewrite-phase latency for each Nginx request (including subrequests) in the specified Nginx worker process and outputs the distribution of the latencies.\n\nBy default, you hit Ctrl-C to end the sampling 
process:\n\n```bash\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming that the Nginx worker process to be analyzed is\n    # of the pid 10972:\n    $ ./samples/ngx-rewrite-latency-distr.sxx -x 10972\n    Start tracing process 10972 (/opt/nginx/sbin/nginx)...\n    Hit Ctrl-C to end.\n    ^C\n    Distribution of the rewrite phase latencies (in microseconds) for 478 samples:\n    (min/avg/max: 19407/20465/21717)\n    value |-------------------------------------------------- count\n     4096 |                                                     0\n     8192 |                                                     0\n    16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@    478\n    32768 |                                                     0\n    65536 |                                                     0\n```\n\nYou can also specify the `--arg time=SECONDS` option to let the tool quit automatically after the specified time (in seconds). For example,\n\n```bash\n    $ ./samples/ngx-rewrite-latency-distr.sxx -x 12004 --arg time=3\n    Start tracing process 12004 (/opt/nginx/sbin/nginx)...\n    Please wait for 3 seconds...\n\n    Distribution of the rewrite phase latencies (in microseconds) for 46 samples:\n    (min/avg/max: 0/19533/21009)\n    value |-------------------------------------------------- count\n        0 |@@                                                  2\n        1 |                                                    0\n        2 |                                                    0\n          ~\n     4096 |                                                    0\n     8192 |                                                    0\n    16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@       44\n    32768 |                                                    0\n    65536 |                                                    0\n```\n\n[Back to 
TOC](#table-of-contents)\n\nngx-lua-exec-time\n-----------------\n\nThis tool measures the pure Lua code execution time (excluding any nonblocking IO time) accumulated in every request served by the specified Nginx worker in real time and outputs the distribution of the latencies.\n\nThe time for the nginx output filters and boilerplate Lua thread initializations is excluded.\n\nNote that both TCP sockets and stream-typed unix domain sockets are supported.\n\nBelow is an example:\n\n```bash\n    # making the ./stap++ tool visible in PATH:\n    $ export PATH=$PWD:$PATH\n\n    # assuming one nginx worker process has the pid 13501.\n    $ ./samples/ngx-lua-exec-time.sxx -x 13501 --arg time=60\n    Start tracing process 13501 (/opt/nginx/sbin/nginx)...\n    Please wait for 60 seconds...\n\n    Distribution of Lua code pure execution time (accumulated in each request, in microseconds) for 1605 samples:\n    (min/avg/max: 92/618/2841)\n    value |-------------------------------------------------- count\n       16 |                                                     0\n       32 |                                                     0\n       64 |                                                     1\n      128 |                                                    13\n      256 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@    715\n      512 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  737\n     1024 |@@@@@@@@                                           130\n     2048 |                                                     9\n     4096 |                                                     0\n     8192 |                                                     0\n```\n\nThe overhead imposed on the target process is usually small. 
For example, the throughput (req/sec) limit of an nginx worker process running a full-fledged Lua CDN application drops by only 14% (only when this tool is running), as measured by `ab -k -c2 -n100000` when using Linux kernel 3.6.10 and systemtap 2.5.\n\n[Back to TOC](#table-of-contents)\n\nngx-lua-tcp-recv-time\n---------------------\n\nThis tool measures the latency involved in individual [receive](https://github.com/chaoslawful/lua-nginx-module#tcpsockreceive) method calls or readers returned from the [receiveuntil](https://github.com/chaoslawful/lua-nginx-module#tcpsockreceiveuntil) method calls on [ngx_lua](https://github.com/chaoslawful/lua-nginx-module#readme) module's [ngx.socket.tcp](https://github.com/chaoslawful/lua-nginx-module#ngxsockettcp) objects in a specified Nginx worker process and outputs the distribution of the latencies.\n\nBelow is an example:\n\n```bash\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming one nginx worker process has the pid 14464.\n$ ./samples/ngx-lua-tcp-recv-time.sxx -x 14464 --arg time=60\nStart tracing process 14464 (/opt/nginx/sbin/nginx)...\nPlease wait for 60 seconds...\n\nDistribution of the ngx_lua ngx.socket.tcp receive latencies (in microseconds) for 3356 samples:\n(min/avg/max: 1/74/3099)\nvalue |-------------------------------------------------- count\n    0 |                                                      0\n    1 |                                                     12\n    2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  2194\n    4 |@@                                                  107\n    8 |                                                      7\n   16 |                                                      3\n   32 |                                                     38\n   64 |@@@@@@@@@@@@@@                                      631\n  128 |@@@                                                 140\n  256 |@@                                                  
106\n  512 |@                                                    82\n 1024 |                                                     25\n 2048 |                                                     11\n 4096 |                                                      0\n 8192 |                                                      0\n```\n\nSee also the [ngx-lua-tcp-total-recv-time](#ngx-lua-tcp-total-recv-time) tool.\n\n[Back to TOC](#table-of-contents)\n\nngx-lua-tcp-total-recv-time\n---------------------------\n\nSimilar to the [ngx-lua-tcp-recv-time](#ngx-lua-tcp-recv-time) tool, but accumulate the latencies by every request served by the specified Nginx process.\n\nThis tool is useful to see how much the TCP/streaming cosocket reads contribute to the total request latency.\n\nBut note that, however, latency of cosocket reads in different ngx_lua \"light threads\" within the same request will be simply added up, so in case of multiple \"light threads\" reading at the same time will lead to larger total latency than the actual case.\n\nRequests without TCP/stream cosocket reads will simply get skipped.\n\nBelow is an example,\n\n```bash\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming one nginx worker process has the pid 14464.\n$ ./samples/ngx-lua-tcp-total-recv-time.sxx  -x 14464 --arg time=60\nStart tracing process 14464 (/opt/nginx/sbin/nginx)...\nPlease wait for 60 seconds...\n\nDistribution of the ngx_lua ngx.socket.tcp receive latencies (accumulated in each request, in microseconds) for 649 samples:\n(min/avg/max: 8/621/58875)\n value |-------------------------------------------------- count\n     2 |                                                     0\n     4 |                                                     0\n     8 |                                                     2\n    16 |                                                     0\n    32 |                                                     2\n    64 |@@@@@            
                                   40\n   128 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@    329\n   256 |@@@@@@@@@@@@@@                                      98\n   512 |@@@@@@@@@@@@@@@                                    109\n  1024 |@@@@@@@                                             51\n  2048 |@                                                   13\n  4096 |                                                     2\n  8192 |                                                     1\n 16384 |                                                     0\n 32768 |                                                     2\n 65536 |                                                     0\n131072 |                                                     0\n```\n\n[Back to TOC](#table-of-contents)\n\nngx-lua-udp-recv-time\n---------------------\n\nThis tool measures the latency involved in individual [receive](https://github.com/chaoslawful/lua-nginx-module#udpsockreceive) method calls on [ngx_lua](https://github.com/chaoslawful/lua-nginx-module#readme) module's [ngx.socket.udp](https://github.com/chaoslawful/lua-nginx-module#ngxsocketudp) objects in a specified Nginx worker process and outputs the distribution of the latencies.\n\nThis tool analyzes both UDP and unix domain cosockets.\n\nBelow is an example:\n\n```bash\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming one nginx worker process has the pid 29906.\n$ ./samples/ngx-lua-udp-recv-time.sxx -x 29906 --arg time=60\nStart tracing process 29906 (/opt/nginx/sbin/nginx)...\nPlease wait for 60 seconds...\n\nDistribution of the ngx_lua ngx.socket.udp receive latencies (in microseconds) for 114 samples:\n(min/avg/max: 433/12468/940008)\n  value |-------------------------------------------------- count\n     64 |                                                    0\n    128 |                                                    0\n    256 |@@@@                                                9\n    512 
|@@@@@@@@@@@@@@@                                    30\n   1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@                        54\n   2048 |@@@@@@@@                                           16\n   4096 |@                                                   2\n   8192 |                                                    1\n  16384 |                                                    0\n  32768 |                                                    0\n  65536 |                                                    0\n 131072 |                                                    0\n 262144 |                                                    1\n 524288 |                                                    1\n1048576 |                                                    0\n2097152 |                                                    0\n```\n\n[Back to TOC](#table-of-contents)\n\nngx-lua-udp-total-recv-time\n---------------------------\n\nSimilar to the [ngx-lua-udp-recv-time](#ngx-lua-udp-recv-time) tool, but accumulate the latencies by every request served by the specified Nginx process.\n\nThis tool is useful to see how much the UDP/unix-domain cosocket reads contribute to the total request latency.\n\nBut note that, however, latency of cosocket reads in different ngx_lua \"light threads\" within the same request will be simply added up, so in case of multiple \"light threads\" reading at the same time will lead to larger total latency than the actual case.\n\nRequests without UDP/unix-domain cosocket reads will simply get skipped.\n\nBelow is an example,\n\n```bash\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming one nginx worker process has the pid 14464.\n$ ./samples/ngx-lua-udp-total-recv-time.sxx -x 14464 --arg time=60\nStart tracing process 14464 (/opt/nginx/sbin/nginx)...\nPlease wait for 60 seconds...\n\nDistribution of the ngx_lua ngx.socket.udp receive latencies (accumulated in each request, in microseconds) for 101 samples:\n(min/avg/max: 
477/60711/2991309)\n  value |-------------------------------------------------- count\n     64 |                                                    0\n    128 |                                                    0\n    256 |@                                                   2\n    512 |@@@@@@@@@@@@@                                      27\n   1024 |@@@@@@@@@@@@@@@@@@@@@@@@@                          51\n   2048 |@@@@@@                                             12\n   4096 |@                                                   3\n   8192 |                                                    0\n  16384 |                                                    0\n  32768 |                                                    1\n  65536 |                                                    0\n 131072 |                                                    1\n 262144 |                                                    0\n 524288 |@                                                   2\n1048576 |                                                    1\n2097152 |                                                    1\n4194304 |                                                    0\n8388608 |                                                    0\n```\n\n[Back to TOC](#table-of-contents)\n\nngx-orig-resp-body-len\n----------------------\n\nThis tool analyzes the original response body size (before going through the gzip filter module or other custom filter modules) served by Nginx and dumps out the distribution.\n\nBelow is an example:\n\n```bash\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming one nginx worker process has the pid 3781.\n$ ./samples/ngx-orig-resp-body-len.sxx -x 3781 --arg time=30\nStart tracing process 3781 (/opt/nginx/sbin/nginx)...\nPlease wait for 30 seconds...\n\nDistribution of original response body sizes (in bytes) for 1634 samples:\n(min/avg/max: 0/55006/6345380)\n   value |-------------------------------------------------- count\n  
     0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@              225\n       1 |                                                     5\n       2 |                                                     5\n       4 |                                                     3\n       8 |                                                     4\n      16 |@                                                   11\n      32 |@@@@@@                                              37\n      64 |@@@@                                                26\n     128 |@@@@@@@@@                                           56\n     256 |@@@@@@@@                                            48\n     512 |@@@@@@@@@@@@@                                       82\n    1024 |@@@@@@@@@@@@@@@@@                                  107\n    2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@               216\n    4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@        258\n    8192 |@@@@@@@@@@@@@@@@@@@@                               121\n   16384 |@@@@@@@@@@@@@@@@@@@@@                              127\n   32768 |@@@@@@@@@@@@@@@                                     94\n   65536 |@@@@@@@@@@@@@@@                                     91\n  131072 |@@@@@@@@@                                           56\n  262144 |@@@@@                                               31\n  524288 |@@@                                                 18\n 1048576 |@                                                    8\n 2097152 |                                                     3\n 4194304 |                                                     2\n 8388608 |                                                     0\n16777216 |                                                     0\n```\n\n[Back to TOC](#table-of-contents)\n\nzlib-deflate-chunk-size\n-----------------------\n\nThis tool records the data chunk size fed into zlib `deflate()` and `deflateEnd()` calls in the user process specified by pid.\n\nBelow is an example:\n\n```bash\n# making the ./stap++ tool 
visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming one nginx worker process has the pid 3781.\n$ ./samples/zlib-deflate-chunk-size.sxx -x 3781 --arg time=30\nStart tracing process 3781 (/opt/nginx/sbin/nginx)...\nPlease wait for 30 seconds...\n\nDistribution of zlib deflate chunk sizes (in bytes) for 3975 samples:\n(min/avg/max: 0/2744/8192)\nvalue |-------------------------------------------------- count\n    0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  988\n    1 |@@@                                                 61\n    2 |@@                                                  40\n    4 |@@                                                  41\n    8 |@                                                   25\n   16 |@@                                                  41\n   32 |@                                                   30\n   64 |@@                                                  48\n  128 |@@@                                                 64\n  256 |@@@                                                 76\n  512 |@@@@@@@@@@                                         201\n 1024 |@@@@@@@@@@@@@@@@@@@@@@@@@                          511\n 2048 |@@@@@@@@@@@@@@@@@@@@@@@@                           487\n 4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@      905\n 8192 |@@@@@@@@@@@@@@@@@@@@@@                             457\n16384 |                                                     0\n32768 |                                                     0\n```\n\n[Back to TOC](#table-of-contents)\n\nlj-str-tab\n----------\n\nAnalyzes the structure and various statistics of the global Lua string hash table in the LuaJIT v2.1 VM.\n\n[Back to TOC](#table-of-contents)\n\nfunc-latency-distr\n------------------\n\nCalculates the latency distribution of any function-like probes which support both entry and return probes.\n\n```\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming the target process has the pid 3781.\n\n$ 
./samples/func-latency-distr.sxx -x 3781 --arg func='process(\"$^libluajit_path\").function(\"lj_str_new\")'\nStart tracing 18356 (/opt/nginx/sbin/nginx)\nHit Ctrl-C to end.\n^C\nDistribution of lj_str_new latencies (in nanoseconds) for 1202 samples\nmax/avg/min: 27463/3309/2779\nvalue |-------------------------------------------------- count\n  512 |                                                      0\n 1024 |                                                      0\n 2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   1160\n 4096 |@                                                    35\n 8192 |                                                      5\n16384 |                                                      2\n32768 |                                                      0\n65536 |                                                      0\n```\n\n```\n$ func-latency-distr.sxx -x 27603 --arg func=vfs.write --arg time=10\nStart tracing 27603 (/opt/nginx/sbin/nginx)\nPlease wait for 10 seconds...\nDistribution of vfs_write latencies (in nanoseconds) for 1201 samples\nmax/avg/min: 545751/28313/1761\n  value |-------------------------------------------------- count\n    256 |                                                     0\n    512 |                                                     0\n   1024 |                                                     2\n   2048 |                                                     0\n   4096 |                                                     9\n   8192 |@@@@@@@@@@@@@                                      194\n  16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  695\n  32768 |@@@@@@@@@@@@@@@@@@                                 254\n  65536 |@@                                                  41\n 131072 |                                                     5\n 262144 |                                                     0\n 524288 |                                                     1\n1048576 |                         
                            0\n2097152 |                                                     0\n```\n\n[Back to TOC](#table-of-contents)\n\nngx-count-conns\n---------------\n\nCount the number of used nginx connections and open files in the specified nginx worker process.\n\n```\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming the target process has the pid 6259.\n$ ./samples/ngx-count-conns.sxx -x 6259\nStart tracing 6259 (/opt/nginx/sbin/nginx)...\n\n====== CONNECTIONS ======\nMax connections: 1024\nFree connections: 1022\nUsed connections: 2\n\n====== FILES ======\nMax files: 1024\nOpen normal files: 2\n\n# try another worker process\n$ ngx-count-conns.sxx -x 32743\nStart tracing 32743 (/opt/nginx/sbin/nginx)...\n\n====== CONNECTIONS ======\nMax connections: 32768\nFree connections: 27674\nUsed connections: 5094\n\n====== FILES ======\nMax files: 131072\nOpen normal files: 2\n```\n\n[Back to TOC](#table-of-contents)\n\nngx-lua-count-timers\n--------------------\n\nOutputs the current timer statistics in the nginx worker process specified.\n\n```\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming the target process has the pid 13982.\n$ ./samples/ngx-lua-count-timers.sxx -x 13982\nStart tracing 13982 (/opt/nginx/sbin/nginx)...\nRunning timers: 0\nMax running timers: 256\nPending timers: 1\nMax pending timers: 1024\n```\n\n[Back to TOC](#table-of-contents)\n\ncpu-hogs\n--------\n\nThis tool can measure how CPU time is distributed across all the running processes (and kernel threads), showing the biggest hogs.\n\nFor example, on a quite idle system, you can see a lot of \"swapper\" task shown on the list:\n\n```\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n$ ./samples/cpu-hogs.sxx\nTracing the whole system...\nHit Ctrl-C to end.\n^C\nnginx-frontend 9%\nnginx-backend 4%\nswapper/15 3%\nswapper/23 3%\nswapper/19 3%\nswapper/12 3%\nswapper/0 3%\nswapper/13 3%\nswapper/21 
3%\nswapper/16 3%\nswapper/8 3%\nswapper/5 3%\npdns 3%\n...\n```\n\nwhile on a relatively busy system, we may get something like this:\n\n```\n$ cpu-hogs.sxx --arg time=30\nTracing the whole system...\nPlease wait for 30 seconds...\n\nnginx-frontend 52%\nnginx-backend 15%\nnginx-sslgate 6%\nnginx-waf 5%\nktserver 2%\npdns 2%\nswapper/0 1%\nktfs 1%\nsyslog-ng 1%\nfancy-scheduler 1%\nrh-reader 0%\n```\n\nPlease note that 52% here, for example, means 52% of the CPU time actually *used*, not 52% of the total CPU capacity.\n\n[Back to TOC](#table-of-contents)\n\ncpu-robbers\n-----------\n\nDisplays how frequently other processes (including user processes and kernel threads) preempt\nthe target process specified by its PID (that is, steal CPU time from the target process).\nOnly the biggest robbers are output.\n\n```\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming the target process's pid is 13443\n$ ./samples/cpu-robbers.sxx -x 13443\nStart tracing process 13443 (/opt/nginx/sbin/nginx)...\nHit Ctrl-C to end.\n^C\n#1 pdns: 23% (32 samples)\n#2 ksoftirqd/17: 20% (28 samples)\n#3 unbound: 8% (11 samples)\n#4 nginx-sslgateway: 7% (10 samples)\n#5 nginx-backend: 6% (9 samples)\n#6 kworker/17:1: 5% (7 samples)\n#7 kworker/17:1H: 3% (5 samples)\n#8 rcuos/4: 2% (4 samples)\n#9 rcuos/23: 2% (3 samples)\n```\n\nThis tool is very general and can be used on any process; the target does *not*\nneed to be an nginx process at all.\n\n[Back to TOC](#table-of-contents)\n\nngx-pcre-dist\n--------\nThis tool can analyse the PCRE regex execution time distribution.\n\nIt requires uretprobes support in the Linux kernel.\n\nAlso, you need to ensure that debug symbols are enabled in your Nginx build, PCRE build, and LuaJIT build. 
For example, if you build PCRE from source with your Nginx or OpenResty by specifying the `--with-pcre=PATH` option, then you should also specify the `--with-pcre-opt=-g` option at the same time.\n\nYou can analyze the distribution of the lengths of the subject strings being matched in individual runs. Note that time is given in microseconds (us), i.e., 1e-6 seconds.\n\n```\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming the target process has the pid 6440.\n$ ./samples/ngx-pcre-dist.sxx -x 6440\nStart tracing 6440 (/opt/nginx/sbin/nginx)\nHit Ctrl-C to end.\n^C\nLogarithmic histogram for data length distribution (byte) for 195 samples:\n(min/avg/max: 7/65/90)\nLogarithmic histogram for data length distribution:\nvalue |-------------------------------------------------- count\n    1 |                                                     0\n    2 |                                                     0\n    4 |@@@@@@                                              18\n    8 |@@@@@@@                                             22\n   16 |@@@@@@@@                                            24\n   32 |                                                     0\n   64 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@        131\n  128 |                                                     0\n  256 |                                                     0\n```\n\nBelow is an example that analyzes the PCRE regex execution time distribution for a given Nginx worker process. 
The `--arg exec_time=1` option is used here.\n\n```\n# assuming the target process has the pid 6440.\n$ ./samples/ngx-pcre-dist.sxx -x 6440 --arg exec_time=1\nStart tracing 6440 (/opt/nginx/sbin/nginx)\nHit Ctrl-C to end.\n^C\nLogarithmic histogram for pcre_exec running time distribution (us) for 135 sample:\n(min/avg/max: 11/18/164)\nvalue |-------------------------------------------------- count\n    2 |                                                    0\n    4 |                                                    0\n    8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   97\n   16 |@@@@@@@@@@@                                        23\n   32 |@@@@@@                                             12\n   64 |@                                                   2\n  128 |                                                    1\n  256 |                                                    0\n  512 |                                                    0\n```\nUsing the `--arg utime=1` option can increase timing accuracy, but it may not be supported on some platforms.\nThis tool supports both the ngx_lua classic API and the lua-resty-core API.\n\n[Back to TOC](#table-of-contents)\n\nngx-pcre-top\n--------\nThis tool can analyze the worst or accumulated PCRE execution time of the individual regex matches using the ngx_lua module's [ngx.re API](http://wiki.nginx.org/HttpLuaModule#ngx.re.match).\n\nHere is an example that analyzes the longest total running time, using accumulated regex execution time:\n\n```\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming the target process has the pid 6440.\n$ ./samples/ngx-pcre-top.sxx --skip-badvars -x 6440\nStart tracing 6440 (/opt/nginx/sbin/nginx)...\n Hit Ctrl-C to end.\n^C\nTop N regexes with longest total running time:\n1. pattern \".\": 241038us (total data size: 330110)\n2. pattern \"elloA\": 188107us (total data size: 15365120)\n3. pattern \"b\": 28016us (total data size: 36012)\n4. 
pattern \"ello\": 26241us (total data size: 15005)\n5. pattern \"a\": 26180us (total data size: 36012o)\n```\n\nBelow is an example that analyzes the worst execution time of the individual regex matches. The `--arg worst_time=1` option is used here.\n\n```\n# assuming the target process has the pid 6440.\n$ ./samples/ngx-pcre-top.sxx --skip-badvars -x 6440 --arg worst_time=1\nStart tracing 6440 (/opt/nginx/sbin/nginx)\nHit Ctrl-C to end.\n^C\nTop N regexes with worst running time:\n1. pattern \"elloA\": 125us (data size: 5120)\n2. pattern \".\": 76us (data size: 10)\n3. pattern \"a\": 64us (data size: 12)\n4. pattern \"b\": 29us (data size: 12)\n5. pattern \"ello\": 26us (data size: 5)\n```\nNote that the time values given above are just for individual runs and are not accumulated.\nUsing `--arg utime=1` option can increase the accuracy of time, but it may not be supported on some platforms.\nThis tool supports both the ngx_lua classic API and the lua-resty-core API.\n\n[Back to TOC](#table-of-contents)\n\nvfs-page-cache-misses\n---------------------\nThis tool can analyze the page cache misses:\n\n1. operation miss rate: operation num that page cache missed / total operation num.\n2. 
size miss rate: size not from page cache / total size read.\n\nThis tool can be used to analyze multi-threaded processes (not only nginx processes),\nbut it reports wrong results when the process performs operations that trigger `vfs.read` without reading data from disk, such as pipe/FIFO reads.\n\nThe `--arg inode=$inode` option restricts the analysis to the specified inode, where `$inode` is an inode number that can be obtained with `ls -i /path/to/file`.\n\nHere is an example:\n\n```\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n# assuming the target process has the pid 73485.\n$ ./samples/vfs-page-cache-misses.sxx -x 73485 --arg inode=7905\nTracing 73485...\nHit Ctrl-C to end.\n\n311 vfs.read operations, 86 missed operations, operation miss rate: 27%\ntotal read 29425 KB, 3350 pages added (page size: 4096B), size miss rate: 45%\n186 vfs.read operations, 49 missed operations, operation miss rate: 26%\ntotal read 20373 KB, 1792 pages added (page size: 4096B), size miss rate: 35%\n```\n\n[Back to TOC](#table-of-contents)\n\nopenssl-handshake-diagnosis\n---------------------------\nThis tool can analyze the ciphers used in SSL handshakes by peeking at OpenSSL's\n`SSL_do_handshake`. It provides the following information:\n\n* whether AES-NI is used in the handshake\n* the usage of different ciphers\n\nNote that the cipher usage is counted per SSL session. 
Therefore we can\nconfirm whether the SSL sessions are reused by comparing the cipher usage.\n\nHere is an example:\n\n```\n# making the ./stap++ tool visible in PATH:\n$ export PATH=$PWD:$PATH\n\n$ ./samples/openssl-handshake-diagnosis.sxx -x $(pidof nginx) --arg time=10\nFound exact match for libcrypto: /lib/x86_64-linux-gnu/libcrypto.so.1.0.0\nFound exact match for libssl: /lib/x86_64-linux-gnu/libssl.so.1.0.0\nStart tracing 331 (/usr/local/openresty/nginx/sbin/nginx)...\nPlease wait for 10 seconds...\nOpenSSL handshake disgnosis:\nAES-NI:\n    on\ncipher usage:\n    AES128-GCM-SHA256                   90\n    ECDHE-RSA-AES128-GCM-SHA256         72\n    ECDHE-RSA-AES256-GCM-SHA384         71\n```\n\n[Back to TOC](#table-of-contents)\n\nInstallation\n============\n\nYou need a recent enough Linux kernel (like 3.5+) *or* an older kernel with the utrace patch applied (for Linux distributions in the RedHat family, like RHEL, CentOS, and Fedora, the utrace patch should be included in their older kernels by default).\n\nYou also need to install systemtap first. It is recommended to build it from the latest source. 
See this document for detailed instructions: http://openresty.org/#BuildSystemtap\n\n[Back to TOC](#table-of-contents)\n\nAuthor\n======\n\nYichun Zhang (agentzh), \u003cagentzh@gmail.com\u003e, OpenResty Inc.\n\n[Back to TOC](#table-of-contents)\n\nCopyright and License\n=====================\n\nThis module is licensed under the BSD license.\n\nCopyright (C) 2013-2017, by Yichun \"agentzh\" Zhang (章亦春) \u003cagentzh@gmail.com\u003e, OpenResty Inc.\n\nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:\n\n* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.\n\n* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\n[Back to TOC](#table-of-contents)\n\nSee Also\n========\n* SystemTap Wiki Home: http://sourceware.org/systemtap/wiki\n* Nginx Systemtap Toolkit: https://github.com/agentzh/nginx-systemtap-toolkit\n\n[Back to TOC](#table-of-contents)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenresty%2Fstapxx","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopenresty%2Fstapxx","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenresty%2Fstapxx/lists"}