{"id":19341733,"url":"https://github.com/stephane-martin/reaper","last_synced_at":"2025-10-28T15:44:43.306Z","repository":{"id":92024448,"uuid":"165416294","full_name":"stephane-martin/reaper","owner":"stephane-martin","description":"Receive access logs from web server and push to a message queue","archived":false,"fork":false,"pushed_at":"2019-11-30T14:10:13.000Z","size":6028,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-06T11:44:51.368Z","etag":null,"topics":["access-logs","apache-logging","nginx-logs","nsqd"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stephane-martin.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-01-12T17:33:37.000Z","updated_at":"2020-02-19T23:34:15.000Z","dependencies_parsed_at":"2024-06-19T14:40:03.020Z","dependency_job_id":"7e9a38bf-5676-417d-8857-d08774ccc319","html_url":"https://github.com/stephane-martin/reaper","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stephane-martin%2Freaper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stephane-martin%2Freaper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stephane-martin%2Freaper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stephane-martin%2Freaper/manifests","owner_u
rl":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stephane-martin","download_url":"https://codeload.github.com/stephane-martin/reaper/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240449352,"owners_count":19803120,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["access-logs","apache-logging","nginx-logs","nsqd"],"created_at":"2024-11-10T03:32:17.590Z","updated_at":"2025-10-28T15:44:38.263Z","avatar_url":"https://github.com/stephane-martin.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"======\nreaper\n======\n\n.. contents::\n   :depth: 3\n..\n\n.. 
section-numbering::\n\n``reaper`` is a simple tool to collect access logs from web servers and\npublish the logs to an external message queue.\n\n::\n\n                                                              ,,,,,          ,,,,,         \n                                                            ,,,,,,,,,     ,,,,,,,,,,       \n                                                           ,,,,,,,,,,,,  ,,,,,,,,,,,,   \n                                                          ,,,,,,,,,,,,,,,,,,,,,,,,,,,,    \n                            ##                           ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, \n                         ####                           ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,    \n                       #####                            ,,,@@@@@*,,,,,,,,,,,,,,,@@@@@,,,  \n                     ######                            ,,,,,#@@@@@\u0026,,,,,,,,,,/@@@@@@,,,,     \n                   #######                             ,,,,,,,@@@@@@,,,,,,,,@@@@@@,,,,,,,    \n                #########                              ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,    \n               ##########                              ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,   \n             ###########                               ,,,,,,,,,,,,,,,,(/,,,,,,,,,,,,,,,,    \n            ###########                                ,,,,,,@,/@@,,@,,@@,,@,,@@*,@,,,,,,    \n           ############                                 ,,,,,,@@@@@@\u0026@@@@@@@@@@@@@,,,,,,     \n         #############                                   ,,,,,,,@@@@@@@@@@@@@@@@,,,,,,,      \n        ##############                                      ,,,,@,@@@@@@@@@@@@,@,,,,        \n       ##############                                          ,,,,@@,@@@@,@@,,*,           \n       ##############                                           .,,,*,,@@,,/,,,             \n      ###############                                             ,,,,,%#,,,,,             \n     ###############                                         
      ,,,,,,,,,,               \n     ###############                                               .,,,,,,,,                \n    ################                                                                       \n    ################                                                                        \n    ################                                                                       \n    ################                                            GIVE ME YOUR LOGS           \n    ################                                                                        \n     ##################                                                                      \n      ##################                                                                     \n\nFeatures\n========\n\n-  Collect logs over TCP/UDP syslog\n-  Syslog RFC3164 or RFC5424\n-  Collect logs on stdin\n-  Parse log formats: JSON, key/values, common, combined\n-  Stream access logs with websocket\n-  Download logs with HTTP\n-  Filter out unwanted log lines (predicate in JavaScript)\n-  Can write collected logs to stdout, stderr, file\n-  Can write collected logs to databases: PostgreSQL/TimescaleDB,\n   Elasticsearch\n-  Can write collected logs to message brokers: RabbitMQ, nsqd, STOMP\n   enabled message broker\n-  Can write collected logs to a distributed log: Kafka\n-  Can write collected logs to a redis list\n-  Can forward collected logs to another reaper instance\n-  Should work on any \\*NIX\n\nProject status\n==============\n\nAlpha. Version 0.1.0.\n\nreaper is functional and can be used in simple environments. 
But it lacks\nproper test cases and performance testing in busy environments.\n\nGetting Started\n===============\n\nInstall\n-------\n\n-  Binary releases\n\n   https://github.com/stephane-martin/reaper/releases\n\n   Just copy the binary to a directory in your PATH.\n\n-  Compile from source\n\n   ``git clone https://github.com/stephane-martin/reaper`` in an\n   appropriate folder (GOPATH…)\n\n   ``make debug`` or ``make release``\n\nConfigure\n---------\n\nCurrently reaper does not use a configuration file. Arguments are passed\non the command line or with environment variables.\n\nInline help\n-----------\n\n``reaper --help``\n\n``reaper (command) --help``\n\nUse reaper\n----------\n\nListen for access log entries\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nTCP syslog\n^^^^^^^^^^\n\nStart reaper with ``--tcp 127.0.0.1:1514``. Here 127.0.0.1 is the listen\naddress and 1514 is the port.\n\nUDP syslog\n^^^^^^^^^^\n\nStart reaper with ``--udp 127.0.0.1:1514``.\n\nThis can be used with nginx or caddy. In nginx.conf:\n\n::\n\n   access_log syslog:server=127.0.0.1:1514,facility=daemon,tag=nginxaccess,severity=info jrich;\n\nSyslog protocol\n^^^^^^^^^^^^^^^\n\nBy default the syslog protocol is assumed to be RFC3164. Use the global\nflag ``--rfc5424`` to switch to RFC5424.\n\nstdin\n^^^^^\n\nStart reaper with ``--stdin``.\n\nThis can be used with Apache. For example, in the Apache configuration:\n\n::\n\n   CustomLog \"||/path/to/reaper --format combined --stdin\" combined\n\nConfigure access logs format\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nreaper needs to know the format in which the web server writes access\nlog entries. 
Use the ``--format`` flag.\n\nJSON\n^^^^\n\n``reaper --udp 127.0.0.1:1514 --format json``\n\nExample nginx configuration:\n\n::\n\n   log_format jrich escape=json\n       '{'\n           '\"timestamp\":\"$time_iso8601\",'\n           '\"method\":\"$request_method\",'\n           '\"scheme\":\"$scheme\",'\n           '\"host\":\"$host\",'\n           '\"server\":\"$server_name\",'\n           '\"uri\":\"$uri\",'\n           '\"duration\":$request_time,'\n           '\"length\":$request_length,'\n           '\"status\":$status,'\n           '\"sent\":$bytes_sent,'\n           '\"agent\":\"$http_user_agent\",'\n           '\"remoteaddr\":\"$remote_addr\",'\n           '\"remoteuser\":\"$remote_user\"'\n       '}';\n\n   access_log syslog:server=127.0.0.1:1514,facility=daemon,tag=nginxaccess,severity=info jrich;\n\nKey/values\n^^^^^^^^^^\n\n``reaper --udp 127.0.0.1:1514 --format kv``\n\nExample nginx configuration:\n\n::\n\n   log_format rich\n       'remote_addr=\"$remote_addr\" remote_user=\"$remote_user\" time=\"$time_iso8601\" length=$request_length'\n       ' host=\"$host\" request=\"$request_uri\" uri=\"$uri\" status=$status bytes_sent=$bytes_sent agent=\"$http_user_agent\"'\n       ' duration=$request_time upstream_duration=$upstream_response_time method=\"$request_method\" scheme=\"$scheme\"'\n       ' server=\"$server_name\"';\n\ncommon log format\n^^^^^^^^^^^^^^^^^\n\n``reaper --udp 127.0.0.1:1514 --format common``\n\ncombined log format\n^^^^^^^^^^^^^^^^^^^\n\n``reaper --udp 127.0.0.1:1514 --format combined``\n\nFilter access logs\n~~~~~~~~~~~~~~~~~~\n\nThe ``--filterout EXPR`` global flag can be set to specify a filter.\n\nEXPR is a JavaScript expression that can use the log entry fields. If\nthe EXPR is True, the entry is filtered out. Multiple ``--filterout`` flags\ncan be used. 
In that case, an entry is filtered out if any of the\nexpressions is True.\n\nExample:\n\n``reaper --udp 127.0.0.1:1514 --format json --filterout 'host==\"example.org\"' stdout``\n\nLog entries for requests to http://example.org will be filtered out.\n\nPlease note that filtering is not free from a performance point of view.\nIt uses an embedded Javascript engine.\n\nForward access logs\n~~~~~~~~~~~~~~~~~~~\n\nreaper can forward access logs to various destinations. The type of the\ndestination is selected through a command on reaper command line, after\nthe previous global flags.\n\nWhen the destination is not reachable, log entries are buffered in the\nembedded nsqd instance. When the destination is reachable again,\nbuffered entries will be forwarded. So you do not need to start the\ndestination before reaper.\n\nEach destination has specific flags to configure it.\n\nstdout, stderr\n^^^^^^^^^^^^^^\n\n-  ``reaper --udp 127.0.0.1 stdout``\n-  ``reaper --udp 127.0.0.1 stderr``\n\nfile\n^^^^\n\n-  ``reaper --udp 127.0.0.1 file --filename /tmp/access.log`` =\u003e write\n   log entries to /tmp/access.log\n-  ``reaper --udp 127.0.0.1 file --gzip --filename /tmp/access.log.gz``\n   =\u003e write compressed log entries to /tmp/access.log.gz\n\nRabbitMQ\n^^^^^^^^\n\nForward logs to a RabbitMQ exchange.\n\n``reaper --udp 127.0.0.1 rabbitmq --uri \"amqp://guest:guest@localhost:5672/\" --exchange exname --routing-key key --type direct``\n\nThis will forward entries to a RabbitMQ broker, located at\nlocalhost:5672, using guest/guest as credentials, to the / virtual host,\nin the direct exchange exname, and with “key” as a routing key.\n\nSTOMP\n^^^^^\n\n``./reaper_debug --udp 127.0.0.1:1514 stomp --login user --passcode password --host virtualhost --destination /queue/reaper --addr 192.168.1.2:61613``\n\nElasticsearch\n^^^^^^^^^^^^^\n\nForward logs to an Elasticsearch server.\n\n``reaper --udp 127.0.0.1 elasticsearch --url http://127.0.0.1:9200 --index 
indexname``\n\nRedis\n^^^^^\n\nForward logs to Redis, using a redis list (think LPOP, RPUSH).\n\n``reaper --udp 127.0.0.1 redis --addr 127.0.0.1:6379 --listname thelistkey --database 6 --password pass``\n\nKafka\n^^^^^\n\n``reaper --udp 127.0.0.1 kafka --broker 192.168.1.2:9092 --broker 192.168.1.3:9092 --broker 192.168.1.4:9092 --topic topicname``\n\nPostgreSQL/TimescaleDB\n^^^^^^^^^^^^^^^^^^^^^^\n\nFirst you need to create a table in PostgreSQL that is consistent with\nthe log format.\n\nFor example:\n\n::\n\n   +------------+--------------------------+-------------------+\n   | Column     | Type                     | Modifiers         | \n   |------------+--------------------------+-------------------+\n   | timestamp  | timestamp with time zone |  not null         |\n   | method     | text                     |  default ''::text |\n   | scheme     | text                     |  default ''::text |\n   | host       | text                     |  default ''::text |\n   | server     | text                     |  default ''::text |\n   | uri        | text                     |  default ''::text |\n   | duration   | double precision         |  default 0        |\n   | length     | integer                  |  default 0        |\n   | status     | integer                  |  default 0        |\n   | sent       | integer                  |  default 0        |\n   | agent      | text                     |  default ''::text |\n   | remoteaddr | text                     |  default ''::text |\n   | remoteuser | text                     |  default ''::text |\n   +------------+--------------------------+-------------------+\n\n   Indexes:\n       \"reaper_duration_timestamp_idx\" btree (duration, \"timestamp\" DESC)\n       \"reaper_host_timestamp_idx\" btree (host, \"timestamp\" DESC)\n       \"reaper_length_timestamp_idx\" btree (length, \"timestamp\" DESC)\n       \"reaper_method_timestamp_idx\" btree (method, \"timestamp\" DESC)\n       \"reaper_remoteaddr_timestamp_idx\" 
btree (remoteaddr, \"timestamp\" DESC)\n       \"reaper_scheme_timestamp_idx\" btree (scheme, \"timestamp\" DESC)\n       \"reaper_sent_timestamp_idx\" btree (sent, \"timestamp\" DESC)\n       \"reaper_server_timestamp_idx\" btree (server, \"timestamp\" DESC)\n       \"reaper_timestamp_idx\" btree (\"timestamp\" DESC)\n\nThen:\n\n::\n\n   reaper --udp 127.0.0.1:1514 pgsql \\\n       --uri \"postgres://user:password@127.0.0.1/dbname\" \\\n       --table tablename \\\n       --fields \"timestamp,method,scheme,host,server,uri,duration,length,status,sent,agent,remoteaddr,remoteuser\"\n\nExternal nsqd\n^^^^^^^^^^^^^\n\n``reaper --udp 127.0.0.1:1514 nsq --addr 192.168.1.2:4150 --topic topicname --json``\n\nForward to another reaper instance\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nOn machine A 192.168.1.2 (with web server):\n\n``reaper --udp 127.0.0.1:1514 nsq --addr 192.168.1.3:4150 --topic embedded``\n\nOn machine B 192.168.1.3:\n\n``reaper --nsqd-address 192.168.1.3 --nsqd-tcp-port 4150 pgsql ...``\n\nHTTP API\n~~~~~~~~\n\nIf started with ``--http-address``, reaper exposes an HTTP API.\n\nEndpoints:\n\n-  /status =\u003e just returns a 200 HTTP status code.\n\n-  /metrics =\u003e prometheus metrics (with the embedded nsqd metrics).\n\n-  POST /download/:clientid?wait=3000\u0026size=1000 =\u003e creates a channel of\n   access log entries and downloads entries.\n\n   size is the number of entries to be returned. wait is the number of\n   milliseconds to wait.\n\n   After the first POST call, an nsq channel is created. All received\n   entries will be copied to this channel. 
Each successive POST call\n   will return different entries.\n\n-  DELETE /download/:clientid =\u003e deletes a previously created channel.\n\nWebsocket API\n~~~~~~~~~~~~~\n\nIf started with ``--websocket-address``, reaper exposes a websocket\nendpoint.\n\n-  /stream: stream received entries to the websocket client.\n\nLogging\n~~~~~~~\n\nBy default reaper's own logs are written to stderr.\n\nThe logging level can be set with ``--loglevel`` [debug, info, warn,\nerror, crit].\n\nAlternatively reaper can use syslog with ``--syslog``.\n\nDesign\n======\n\nreaper embeds an nsqd service (https://nsq.io). When access log entries\nare received on TCP, UDP or stdin, they are first stored in the embedded\nnsqd. Thus, reaper only deletes an access log entry when it has been\nreliably sent to the configured destination.\n\nForwarding to the destination is done asynchronously to achieve good\nperformance.\n\nChangelog\n=========\n\nhttps://github.com/stephane-martin/reaper/blob/master/CHANGELOG.md\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstephane-martin%2Freaper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstephane-martin%2Freaper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstephane-martin%2Freaper/lists"}