{"id":28501932,"url":"https://github.com/fluent/fluent-plugin-grok-parser","last_synced_at":"2025-07-05T02:31:42.704Z","repository":{"id":18204080,"uuid":"21337099","full_name":"fluent/fluent-plugin-grok-parser","owner":"fluent","description":"Fluentd's Grok parser","archived":false,"fork":false,"pushed_at":"2023-09-12T01:23:42.000Z","size":220,"stargazers_count":109,"open_issues_count":12,"forks_count":32,"subscribers_count":21,"default_branch":"master","last_synced_at":"2025-06-08T16:08:24.264Z","etag":null,"topics":["fluentd","fluentd-plugin","grok-parser"],"latest_commit_sha":null,"homepage":null,"language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fluent.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2014-06-30T02:14:25.000Z","updated_at":"2025-03-26T14:37:45.000Z","dependencies_parsed_at":"2024-06-18T19:57:27.042Z","dependency_job_id":"ce1ffdf5-92be-4528-b6d8-907aabc179c7","html_url":"https://github.com/fluent/fluent-plugin-grok-parser","commit_stats":{"total_commits":209,"total_committers":8,"mean_commits":26.125,"dds":"0.26794258373205737","last_synced_commit":"0b4ac165f0db571524df193750b020459735a25b"},"previous_names":[],"tags_count":25,"template":false,"template_full_name":null,"purl":"pkg:github/fluent/fluent-plugin-grok-parser","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fluent%2Ffluent-plugin-grok-parser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fluent%2Ffluent-plugin-grok-parser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/fluent%2Ffluent-plugin-grok-parser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fluent%2Ffluent-plugin-grok-parser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fluent","download_url":"https://codeload.github.com/fluent/fluent-plugin-grok-parser/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fluent%2Ffluent-plugin-grok-parser/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263671744,"owners_count":23494026,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fluentd","fluentd-plugin","grok-parser"],"created_at":"2025-06-08T16:08:30.015Z","updated_at":"2025-07-05T02:31:42.692Z","avatar_url":"https://github.com/fluent.png","language":"Ruby","readme":"# Grok Parser for Fluentd\n\n![Testing on Ubuntu](https://github.com/fluent/fluent-plugin-grok-parser/workflows/Testing%20on%20Ubuntu/badge.svg?branch=master)\n![Testing on macOS](https://github.com/fluent/fluent-plugin-grok-parser/workflows/Testing%20on%20macOS/badge.svg?branch=master)\n\nThis is a Fluentd plugin to enable Logstash's Grok-like parsing logic.\n\n## Requirements\n\n| fluent-plugin-grok-parser | fluentd    | ruby   |\n|---------------------------|------------|--------|\n| \u003e= 2.0.0                  | \u003e= v0.14.0 | \u003e= 2.1 |\n| \u003c 2.0.0                   | \u003e= v0.12.0 | \u003e= 1.9 |\n\n\n## What's Grok?\n\nGrok is a macro to simplify and reuse regexes, originally developed by [Jordan 
Sissel](http://github.com/jordansissel).\n\nThis is a partial implementation of Grok's grammar that should meet most needs.\n\n## How It Works\n\nYou can use it wherever you used the `format` parameter to parse texts. In the following example, it\nextracts the first IP address that matches in the log.\n\n```aconf\n\u003csource\u003e\n  @type tail\n  path /path/to/log\n  tag grokked_log\n  \u003cparse\u003e\n    @type grok\n    grok_pattern %{IP:ip_address}\n  \u003c/parse\u003e\n\u003c/source\u003e\n```\n\n**If you want to try multiple grok patterns and use the first matched one**, you can use the following syntax:\n\n```aconf\n\u003csource\u003e\n  @type tail\n  path /path/to/log\n  tag grokked_log\n  \u003cparse\u003e\n    @type grok\n    \u003cgrok\u003e\n      pattern %{HTTPD_COMBINEDLOG}\n      time_format \"%d/%b/%Y:%H:%M:%S %z\"\n    \u003c/grok\u003e\n    \u003cgrok\u003e\n      pattern %{IP:ip_address}\n    \u003c/grok\u003e\n    \u003cgrok\u003e\n      pattern %{GREEDYDATA:message}\n    \u003c/grok\u003e\n  \u003c/parse\u003e\n\u003c/source\u003e\n```\n\n### Multiline support\n\nYou can parse multiline text.\n\n```aconf\n\u003csource\u003e\n  @type tail\n  path /path/to/log\n  tag grokked_log\n  \u003cparse\u003e\n    @type multiline_grok\n    grok_pattern %{IP:ip_address}%{GREEDYDATA:message}\n    multiline_start_regexp /^[^\\s]/\n  \u003c/parse\u003e\n\u003c/source\u003e\n```\n\nYou can use multiple grok patterns to parse your data.\n\n```aconf\n\u003csource\u003e\n  @type tail\n  path /path/to/log\n  tag grokked_log\n  \u003cparse\u003e\n    @type multiline_grok\n    \u003cgrok\u003e\n      pattern Started %{WORD:verb} \"%{URIPATH:pathinfo}\" for %{IP:ip} at %{TIMESTAMP_ISO8601:timestamp}\\nProcessing by %{WORD:controller}#%{WORD:action} as %{WORD:format}%{DATA:message}Completed %{NUMBER:response} %{WORD} in %{NUMBER:elapsed} (%{DATA:elapsed_details})\n    \u003c/grok\u003e\n  \u003c/parse\u003e\n\u003c/source\u003e\n```\n\nFluentd 
accumulates data in the buffer forever to parse complete data when no pattern matches.\n\nYou can use this parser without `multiline_start_regexp` when you know your data structure perfectly.\n\n## Configurations\n\n* See also: [Config: Parse Section - Fluentd](https://docs.fluentd.org/configuration/parse-section)\n\n* **time_format** (string) (optional): The format of the time field.\n* **grok_pattern** (string) (optional): The pattern of grok. You cannot specify multiple grok patterns with this.\n* **custom_pattern_path** (string) (optional): Path to the file that includes custom grok patterns\n* **grok_failure_key** (string) (optional): The key that stores the grok failure reason.\n* **grok_name_key** (string) (optional): The key name to store grok section's name\n* **multiline_start_regexp** (string) (optional): The regexp to match beginning of multiline. This is only for \"multiline_grok\".\n* **grok_pattern_series** (enum) (optional): Specify grok pattern series set.\n  * Default value: `legacy`.\n\n### \\\u003cgrok\\\u003e section (optional) (multiple)\n\n* **name** (string) (optional): The name of this grok section\n* **pattern** (string) (required): The pattern of grok\n* **keep_time_key** (bool) (optional): If true, keep time field in the record.\n* **time_key** (string) (optional): Specify time field for event time. If the event doesn't have this field, current time is used.\n  * Default value: `time`.\n* **time_format** (string) (optional): Process value using specified format. This is available only when time_type is string\n* **timezone** (string) (optional): Use specified timezone. 
One can parse/format the time value in the specified timezone.\n\n\n## Examples\n\n### Using grok\\_failure\\_key\n\n```aconf\n\u003csource\u003e\n  @type dummy\n  @label @dummy\n  dummy [\n    { \"message1\": \"no grok pattern matched!\", \"prog\": \"foo\" },\n    { \"message1\": \"/\", \"prog\": \"bar\" }\n  ]\n  tag dummy.log\n\u003c/source\u003e\n\n\u003clabel @dummy\u003e\n  \u003cfilter\u003e\n    @type parser\n    key_name message1\n    reserve_data true\n    reserve_time true\n    \u003cparse\u003e\n      @type grok\n      grok_failure_key grokfailure\n      \u003cgrok\u003e\n        pattern %{PATH:path}\n      \u003c/grok\u003e\n    \u003c/parse\u003e\n  \u003c/filter\u003e\n  \u003cmatch dummy.log\u003e\n    @type stdout\n  \u003c/match\u003e\n\u003c/label\u003e\n```\n\nThis generates the following events:\n\n```\n2016-11-28 13:07:08.009131727 +0900 dummy.log: {\"message1\":\"no grok pattern matched!\",\"prog\":\"foo\",\"message\":\"no grok pattern matched!\",\"grokfailure\":\"No grok pattern matched\"}\n2016-11-28 13:07:09.010400923 +0900 dummy.log: {\"message1\":\"/\",\"prog\":\"bar\",\"path\":\"/\"}\n```\n\n### Using grok\\_name\\_key\n\n```aconf\n\u003csource\u003e\n  @type tail\n  path /path/to/log\n  tag grokked_log\n  \u003cparse\u003e\n    @type grok\n    grok_name_key grok_name\n    grok_failure_key grokfailure\n    \u003cgrok\u003e\n      name apache_log\n      pattern %{HTTPD_COMBINEDLOG}\n      time_format \"%d/%b/%Y:%H:%M:%S %z\"\n    \u003c/grok\u003e\n    \u003cgrok\u003e\n      name ip_address\n      pattern %{IP:ip_address}\n    \u003c/grok\u003e\n    \u003cgrok\u003e\n      name rest_message\n      pattern %{GREEDYDATA:message}\n    \u003c/grok\u003e\n  \u003c/parse\u003e\n\u003c/source\u003e\n```\n\nThis adds keys as follows:\n\n* Add `grok_name: \"apache_log\"` if the record matches `HTTPD_COMBINEDLOG`\n* Add `grok_name: \"ip_address\"` if the record matches `IP`\n* Add `grok_name: \"rest_message\"` if the record matches 
`GREEDYDATA`\n\nAdd `grokfailure` key to the record if the record does not match any grok pattern.\nSee also test code for more details.\n\n## How to parse time value using specific timezone\n\n```aconf\n\u003csource\u003e\n  @type tail\n  path /path/to/log\n  tag grokked_log\n  \u003cparse\u003e\n    @type grok\n    \u003cgrok\u003e\n      name mylog-without-timezone\n      pattern %{DATESTAMP:time} %{GREEDYDATA:message}\n      timezone Asia/Tokyo\n    \u003c/grok\u003e\n  \u003c/parse\u003e\n\u003c/source\u003e\n```\n\nThis will parse the `time` value as \"Asia/Tokyo\" timezone.\n\nSee [Config: Parse Section - Fluentd](https://docs.fluentd.org/configuration/parse-section) for more details about timezone.\n\n## How to write Grok patterns\n\nGrok patterns look like `%{PATTERN_NAME:name}` where \":name\" is optional. If \"name\" is provided, then it\nbecomes a named capture. So, for example, if you have the grok pattern\n\n```\n%{IP} %{HOST:host}\n```\n\nit matches\n\n```\n127.0.0.1 foo.example\n```\n\nbut only extracts \"foo.example\" as {\"host\": \"foo.example\"}.\n\nPlease see `patterns/*` for the patterns that are supported out of the box.\n\n## How to add your own Grok pattern\n\nYou can add your own Grok patterns by creating your own Grok file and telling the plugin to read it.\nThis is what the `custom_pattern_path` parameter is for.\n\n```aconf\n\u003csource\u003e\n  @type tail\n  path /path/to/log\n  \u003cparse\u003e\n    @type grok\n    grok_pattern %{MY_SUPER_PATTERN}\n    custom_pattern_path /path/to/my_pattern\n  \u003c/parse\u003e\n\u003c/source\u003e\n```\n\n`custom_pattern_path` can be either a directory or a file. If it's a directory, it reads all the files in it.\n\n## FAQs\n\n### 1. How can I convert types of the matched patterns like Logstash's Grok?\n\nAlthough every parsed field has type `string` by default, you can specify other types. 
This is useful when filtering particular fields numerically or storing data with sensible type information.\n\nThe syntax is\n\n```\ngrok_pattern %{GROK_PATTERN:NAME:TYPE}...\n```\n\ne.g.,\n\n```\ngrok_pattern %{INT:foo:integer}\n```\n\nUnspecified fields are parsed as the default `string` type.\n\nThe supported types are listed below:\n\n* `string`\n* `bool`\n* `integer` (\"int\" would NOT work!)\n* `float`\n* `time`\n* `array`\n\nFor the `time` and `array` types, there is an optional 4th field after the type name. For the \"time\" type, you can specify a time format like you would in `time_format`.\n\nFor the \"array\" type, this optional field specifies the delimiter (the default is \",\"). For example, if a field called \"item\\_ids\" contains the value \"3,4,5\", `types item_ids:array` parses it as [\"3\", \"4\", \"5\"]. Alternatively, if the value is \"Adam|Alice|Bob\", `types item_ids:array:|` parses it as [\"Adam\", \"Alice\", \"Bob\"].\n\nHere is a sample config using the Grok parser with `in_tail` and the `types` parameter:\n\n```aconf\n\u003csource\u003e\n  @type tail\n  path /path/to/log\n  format grok\n  grok_pattern %{INT:user_id:integer} paid %{NUMBER:paid_amount:float}\n  tag payment\n\u003c/source\u003e\n```\n\n## Notice\n\nIf you want to use this plugin with Fluentd v0.12.x or earlier, you can use this plugin version v1.x.\n\nSee also: [Plugin Management | Fluentd](https://docs.fluentd.org/deployment/plugin-management)\n\n## License\n\nApache 2.0 License\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffluent%2Ffluent-plugin-grok-parser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffluent%2Ffluent-plugin-grok-parser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffluent%2Ffluent-plugin-grok-parser/lists"}