{"id":36708248,"url":"https://github.com/streamthoughts/kafka-connect-transform-grok","last_synced_at":"2026-01-12T11:45:00.438Z","repository":{"id":54609868,"uuid":"323657430","full_name":"streamthoughts/kafka-connect-transform-grok","owner":"streamthoughts","description":"Grok Expression Transform for Kafka Connect.","archived":false,"fork":false,"pushed_at":"2021-02-08T15:29:56.000Z","size":76,"stargazers_count":16,"open_issues_count":2,"forks_count":5,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-19T18:06:30.310Z","etag":null,"topics":["apache-kafka","grok-parser","grok-patterns","kafka-connect"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/streamthoughts.png","metadata":{"files":{"readme":"README.adoc","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-12-22T15:02:50.000Z","updated_at":"2023-06-12T06:05:22.000Z","dependencies_parsed_at":"2022-08-13T21:30:35.049Z","dependency_job_id":null,"html_url":"https://github.com/streamthoughts/kafka-connect-transform-grok","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/streamthoughts/kafka-connect-transform-grok","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streamthoughts%2Fkafka-connect-transform-grok","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streamthoughts%2Fkafka-connect-transform-grok/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streamthoughts%2Fkafka-connect-transform-grok/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stre
amthoughts%2Fkafka-connect-transform-grok/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/streamthoughts","download_url":"https://codeload.github.com/streamthoughts/kafka-connect-transform-grok/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streamthoughts%2Fkafka-connect-transform-grok/sbom","scorecard":{"id":855085,"data":{"date":"2025-08-11","repo":{"name":"github.com/streamthoughts/kafka-connect-transform-grok","commit":"1c6af013077fde5e534f42a9985a59ec57178b7e"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Code-Review","score":0,"reason":"Found 1/13 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source 
repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/maven.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven.yml:18: update your workflow using https://app.stepsecurity.io/secureworkflow/streamthoughts/kafka-connect-transform-grok/maven.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven.yml:20: update your workflow using https://app.stepsecurity.io/secureworkflow/streamthoughts/kafka-connect-transform-grok/maven.yml/main?enable=pin","Info:   0 out of   2 GitHub-owned GitHubAction dependencies 
pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":8,"reason":"2 out of the last 2 releases have a total of 2 signed artifacts.","details":["Info: signed release artifact: streamthoughts-kafka-connect-transform-grok-1.1.0.zip.asc: https://github.com/streamthoughts/kafka-connect-transform-grok/releases/tag/v1.1.0","Info: signed release artifact: streamthoughts-kafka-connect-transform-grok-1.0.0.zip.asc: https://github.com/streamthoughts/kafka-connect-transform-grok/releases/tag/v1.0.0","Warn: release artifact v1.1.0 does not have provenance: https://api.github.com/repos/streamthoughts/kafka-connect-transform-grok/releases/35672167","Warn: 
release artifact v1.0.0 does not have provenance: https://api.github.com/repos/streamthoughts/kafka-connect-transform-grok/releases/35626967"],"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'main'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 1 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Vulnerabilities","score":0,"reason":"11 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: GHSA-2qrg-x229-3v8q","Warn: Project is vulnerable to: GHSA-65fg-84f6-3jq3","Warn: Project is vulnerable to: GHSA-f7vh-qwp3-x37m","Warn: Project is vulnerable to: GHSA-fp5r-v3w9-4333","Warn: Project is vulnerable to: GHSA-w9p3-5cr8-m3jj","Warn: Project is vulnerable to: GHSA-2x2g-32r7-p4x8","Warn: Project is vulnerable to: GHSA-3j6g-hxx5-3q26","Warn: Project is vulnerable to: GHSA-55g7-9cwv-5qfv","Warn: Project is vulnerable to: GHSA-fjpj-2g6w-x25r","Warn: Project is vulnerable to: GHSA-pqr6-cmr2-h8hf","Warn: Project is vulnerable to: GHSA-qcwq-55hx-v3vh"],"documentation":{"short":"Determines if the project has open, known unfixed 
vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}}]},"last_synced_at":"2025-08-23T23:41:44.240Z","repository_id":54609868,"created_at":"2025-08-23T23:41:44.241Z","updated_at":"2025-08-23T23:41:44.241Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28338972,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-12T10:58:46.209Z","status":"ssl_error","status_checked_at":"2026-01-12T10:58:42.742Z","response_time":98,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-kafka","grok-parser","grok-patterns","kafka-connect"],"created_at":"2026-01-12T11:45:00.362Z","updated_at":"2026-01-12T11:45:00.427Z","avatar_url":"https://github.com/streamthoughts.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"= Kafka Connect Grok Transformation\n\nimage:https://img.shields.io/badge/License-Apache%202.0-blue.svg[https://github.com/streamthoughts/kafka-connect-transform-grok/blob/master/LICENSE]\nimage:https://img.shields.io/github/issues-raw/streamthoughts/kafka-connect-transform-grok[GitHub issues]\nimage:https://img.shields.io/github/stars/streamthoughts/kafka-connect-transform-grok?style=social[GitHub Repo stars]\n\n== Description\n\nThe Apache Kafka® SMT 
`io.streamthoughts.kafka.connect.transform.Grok` parses unstructured text from the entire record key or value into a `Struct` using a Grok expression.\n\n== Installation\n\nThe Grok SMT can be installed using the https://docs.confluent.io/current/confluent-hub/client.html[Confluent Hub Client] as follows:\n\n1. Download the distribution ZIP file for the latest available version.\n+\n[source, bash]\n----\n$ export VERSION=1.1.0\n$ export GITHUB_HUB_REPO=https://github.com/streamthoughts/kafka-connect-transform-grok\n$ curl -sSLO $GITHUB_HUB_REPO/releases/download/v$VERSION/streamthoughts-kafka-connect-transform-grok-$VERSION.zip\n----\n+\n2. Use the `confluent-hub` CLI to install it.\n+\n[source, bash]\n----\n$ confluent-hub install streamthoughts-kafka-connect-transform-grok-$VERSION.zip\n----\n\nAlternatively, you can extract the ZIP file into one of the directories listed in the `plugin.path` worker configuration property.\n\n== Grok Basics\n\nThe syntax for a grok pattern is `%{SYNTAX:SEMANTIC}` or `%{SYNTAX:SEMANTIC:TYPE}`.\n\nThe `SYNTAX` is the name of the pattern that should match the input text value.\n\nThe `SEMANTIC` is the name of the data field that will contain the matched text.\n\nThe `TYPE` is the target type to which the data field must be converted.\n\nSupported types are: ::\n* `SHORT`\n* `INTEGER`\n* `LONG`\n* `FLOAT`\n* `DOUBLE`\n* `BOOLEAN`\n* `STRING`\n\nThe Kafka Connect Grok Transformer ships with many reusable grok patterns. 
See the complete list of https://github.com/streamthoughts/kafka-connect-transform-grok/tree/main/src/main/resources/patterns[patterns].\n\n== Debugging Grok Expression\nYou can build and debug your patterns using online tools such as http://grokdebug.herokuapp.com/[Grok Debug] and http://grokconstructor.appspot.com/[Grok Constructor].\n\n== Regular Expressions\nGrok sits on top of regular expressions, so any regular expression is valid in grok as well.\n\nThe Grok SMT uses the https://github.com/jruby/joni[Joni] regular expression library, which is the Java port of the https://github.com/kkos/oniguruma[Oniguruma] regexp library used by the http://www.elasticsearch.org/overview/[Elastic stack] (i.e. Logstash).\n\n== Custom Patterns\n\nSometimes, the patterns provided by the Grok SMT will not be sufficient to match your data.\nIn that case, you have two options for defining custom patterns.\n\nOption #1::\nYou can use the https://github.com/kkos/oniguruma[Oniguruma] syntax for _named capture_, which allows you to match a piece of text and capture it as a field:\n\n[source]\n----\n(?\u003cfield_name\u003ethe regex pattern)\n----\n\nFor example, if you need to capture the parts of an email address, you can use the following pattern:\n[source]\n----\n(?\u003cEMAILADDRESS\u003e(?\u003cEMAILLOCALPART\u003e^[A-Z0-9._%+-]+)(?\u003cHOSTNAME\u003e@[A-Z0-9.-]+\\.[A-Z]{2,6})$)\n----\n\nConfiguration::\n[source, properties]\n----\ntransforms=Grok\ntransforms.Grok.type=io.streamthoughts.kafka.connect.transform.Grok$Value\ntransforms.Grok.pattern=(?\u003cEMAILADDRESS\u003e(?\u003cEMAILLOCALPART\u003e[A-Za-z0-9._%+-]+)@(?\u003cHOSTNAME\u003e[A-Za-z0-9.-]+\\\\.[A-Za-z]{2,6}))\n----\n\n_Note: The pattern `EMAILADDRESS` is already provided by the Grok SMT._\n\nOption #2::\n\nYou can create a custom patterns file that will be loaded the first time the Grok SMT is used:\n\nFor example, to define the pattern needed to parse NGINX access logs:::\n* Create a directory (e.g. 
`grok-patterns`) containing a file named `nginx`.\n* Then, write the pattern you need in that file as: `\u003cthe pattern name\u003e\u003ca space\u003e\u003cthe regexp for that pattern\u003e`.\n\n[source, bash]\n----\n$ mkdir ./grok-patterns\n$ cat \u003c\u003cEOF \u003e ./grok-patterns/nginx\nNGINX_ACCESS %{IPORHOST:remote_addr} - %{USERNAME:remote_user} \\[%{HTTPDATE:time_local}\\] \\\"%{DATA:request}\\\" %{INT:status} %{NUMBER:bytes_sent} \\\"%{DATA:http_referer}\\\" \\\"%{DATA:http_user_agent}\\\"\nEOF\n----\n\nConfiguration::\n[source, properties]\n----\ntransforms=Grok\ntransforms.Grok.type=io.streamthoughts.kafka.connect.transform.Grok$Value\ntransforms.Grok.pattern=%{NGINX_ACCESS}\ntransforms.Grok.patternsDir=/tmp/grok-patterns\n----\n\n== Grok Configuration\n\n[%header,format=csv]\n|===\nProperty,Description,Type,Importance,Default\n`breakOnFirstPattern`, If true then break on the first successful match; otherwise the transformation tries all configured grok patterns, `boolean`, , `true`\n`pattern`, The grok expression used to match the input and extract named captures (i.e. data fields), `string`, High, -\n`patterns.\u003cid\u003e`, An ordered list of grok expressions used to match the input and extract named captures (i.e. data fields), `string`, High, -\n`patternDefinitions`, Custom pattern definitions, `list`, Low, -\n`patternsDir`, List of user-defined pattern directories, `list`, Low, -\n`namedCapturesOnly`, If true then only store named captures from grok, `boolean`, Medium, `true`\n|===\n\n== 💡 Contributions\n\nAny feedback, bug reports and PRs are greatly appreciated!\n\n* Source Code: https://github.com/streamthoughts/kafka-connect-transform-grok[https://github.com/streamthoughts/kafka-connect-transform-grok]\n* Issue Tracker: https://github.com/streamthoughts/kafka-connect-transform-grok/issues[https://github.com/streamthoughts/kafka-connect-transform-grok/issues]\n* Releases: 
https://github.com/streamthoughts/kafka-connect-transform-grok/releases[https://github.com/streamthoughts/kafka-connect-transform-grok/releases]\n\n== About\n\nOriginally, most of the source code used by the Apache Kafka® SMT `io.streamthoughts.kafka.connect.transform.Grok` was developed within the https://github.com/streamthoughts/kafka-connect-file-pulse[Kafka Connect File Pulse] connector plugin.\n\n== Licence\n\nCopyright 2020-2021 StreamThoughts.\n\nLicensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at\n\nhttp://www.apache.org/licenses/LICENSE-2.0[http://www.apache.org/licenses/LICENSE-2.0]\n\nUnless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstreamthoughts%2Fkafka-connect-transform-grok","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstreamthoughts%2Fkafka-connect-transform-grok","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstreamthoughts%2Fkafka-connect-transform-grok/lists"}