{"id":19788337,"url":"https://github.com/graylog2/graylog-s3-lambda","last_synced_at":"2025-05-01T00:30:59.311Z","repository":{"id":45127784,"uuid":"208928543","full_name":"Graylog2/graylog-s3-lambda","owner":"Graylog2","description":"An AWS Lambda function that reads logs from S3 and sends them to Graylog","archived":false,"fork":false,"pushed_at":"2023-06-20T16:36:03.000Z","size":490,"stargazers_count":12,"open_issues_count":12,"forks_count":6,"subscribers_count":22,"default_branch":"master","last_synced_at":"2024-04-15T00:39:06.513Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Graylog2.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-09-17T01:07:19.000Z","updated_at":"2023-10-12T16:10:12.000Z","dependencies_parsed_at":"2022-09-10T08:10:45.046Z","dependency_job_id":null,"html_url":"https://github.com/Graylog2/graylog-s3-lambda","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graylog2%2Fgraylog-s3-lambda","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graylog2%2Fgraylog-s3-lambda/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graylog2%2Fgraylog-s3-lambda/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Graylog2%2Fgraylog-s3-lambda/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Graylog2","download_url":"https://codeload.github.com/Graylog2/graylog-s3-lambda/tar.gz/refs/heads/master","
host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224229139,"owners_count":17277137,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T06:26:55.408Z","updated_at":"2024-11-12T06:26:56.080Z","avatar_url":"https://github.com/Graylog2.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Graylog S3 Lambda\nAn AWS Lambda function that reads log messages from AWS S3 and sends them to the Graylog GELF (TCP) input.\n\n## Overview\n\nThe Graylog S3 Lambda function reads log files written to an S3 bucket and sends them to a Graylog cluster where a GELF \n(TCP or UDP) input is running. The function triggers automatically each time a new file is written to S3. With each \nfunction execution, each line in the file is streamed and processed by the Lambda function, then sent to the specified \nGraylog node or cluster. Each line is considered a single message. Several log formats are supported. The `text/plain` \nCONTENT_TYPE can be used in combination with Graylog [Pipelines](https://docs.graylog.org/en/3.1/pages/pipelines/pipelines.html) \nfor any log formats that are not directly supported.\n\n## Installation\n\n### Step 1: Create base Lambda function and policy\n\nNavigate to the Lambda service page in the AWS web console. Create a new Lambda function, specify a function \nname of your choice, and choose the Java 8 runtime. Create or specify an execution role with the following permissions. 
\nYou can also further restrict the Resource permissions as desired for your specific setup.\n\n```\n{\n    \"Version\": \"2012-10-17\",\n    \"Statement\": [\n        {\n            \"Sid\": \"Policy\",\n            \"Effect\": \"Allow\",\n            \"Action\": [\n                \"logs:CreateLogGroup\",\n                \"s3:GetObject\",\n                \"logs:CreateLogStream\",\n                \"logs:PutLogEvents\"\n            ],\n            \"Resource\": [\n                \"arn:aws:logs:your-region:your-account-number:*\",\n                \"arn:aws:s3:::s3-bucket-name/*\"\n            ]\n        }\n    ]\n}\n```\n\nNOTE: If your Graylog cluster is running in a VPC, you may need to add the AWSLambdaVPCAccessExecutionRole managed \npolicy to the execution role to allow the Lambda function to route traffic to the VPC.\n\nOnce the function is created, upload the function code graylog-s3-lambda.jar located in the Preparation task section. \nSpecify the following method for the Handler: `org.graylog.integrations.s3.GraylogS3Function::handleRequest`\n\n### Step 2: Specify configuration\n\nSpecify the following environment variables to configure the Lambda function for your Graylog cluster:\n\n* `GRAYLOG_HOST`: *(required)* The hostname or IP address of the Graylog host or load balancer.\n* `GRAYLOG_PORT`: *(optional - defaults to `12201`)* The Graylog service port.\n* `CONTENT_TYPE`: *(optional - defaults to `text/plain`)* The type of log messages to read. Messages will be parsed according to their content type. Supported values: `application/json`, `text/plain`, and `application/x.cloudflare.log`\n* `COMPRESSION_TYPE`: *(optional - defaults to `none`)* The compression type. Supported values: `none`, `gzip`\n* `CONNECT_TIMEOUT`: *(optional - defaults to `10000`)* The number of milliseconds to wait for the connection to be established.\n* `LOG_LEVEL`: *(optional - defaults to `INFO`)* The level of detail to include in the CloudWatch logs generated from the Lambda function. 
Supported values are OFF, ERROR, WARN, INFO, DEBUG, TRACE, and ALL. Increase the logging level to help with troubleshooting. See this page for more information.\n* `RECONNECT_DELAY`: *(optional - defaults to `10000`)* The number of milliseconds to wait between reconnection attempts.\n* `TCP_KEEP_ALIVE`: *(optional - defaults to `true`)* Enable TCP Keep Alive.\n* `TCP_NO_DELAY`: *(optional - defaults to `true`)* Enable TCP No Delay.\n* `TCP_QUEUE_SIZE`: *(optional - defaults to `512`)* The queue size for messages that have yet to be sent.\n* `TCP_MAX_IN_FLIGHT_SENDS`: *(optional - defaults to `512`)* The maximum number of messages that can be in flight at one time.\n* `PROTOCOL_TYPE`: *(optional - defaults to `tcp`)* The type of protocol. Supported values: `tcp`, `udp`\n* `SHUTDOWN_FLUSH_TIMEOUT_MS`: *(optional - defaults to `100`)* The number of milliseconds to wait for all messages to finish flushing/sending after message processing is complete.\n* `SHUTDOWN_FLUSH_RETRIES`: *(optional - defaults to `600`)* The number of times to retry the `SHUTDOWN_FLUSH_TIMEOUT_MS` wait. Increase this value if not all messages are sent by the time the Lambda function exits (only if the maximum Lambda function [timeout](https://docs.aws.amazon.com/lambda/latest/dg/resource-model.html) has not been reached).\n* `CLOUDFLARE_LOGPUSH_MESSAGE_FIELDS`: *(optional - defaults to all fields in Cloudflare log JSON)* The fields to parse from the message. Specify as a comma-separated list of field names.\n* `CLOUDFLARE_LOGPUSH_MESSAGE_SUMMARY_FIELDS`: *(optional - defaults to `ClientRequestHost, ClientRequestPath, OriginIP, ClientSrcPort, EdgeServerIP, EdgeResponseBytes`)* The fields to include in the message summary that appears above the parsed fields at the top of each message in Graylog. Specify as a comma-separated list of field names.\n\nNote: \nAll log messages are sent over TCP by default. TLS encryption between the Lambda function and Graylog is not currently \nsupported. 
We recommend taking appropriate measures to secure the log messages in transit (such as placing the Lambda \nfunction within a secure VPC subnet where the Graylog node or cluster is running).\n\n![Environment Variables](images/environment-variables.png)\n\n### Step 3: Create S3 trigger\n\nCreate an AWS S3 Trigger for the Lambda function so that the function can process each log file that is \nwritten. Specify the same S3 bucket that you did in the Preparation step and make sure the All object create \nevents option is selected. You can also apply any other desired file filters here.\n\n![Add S3 Trigger](images/add-s3-trigger.png)\n\nIf your Graylog cluster is located within a VPC, you will need to configure your Lambda function to access resources in a VPC.\n\n#### Fully configured function\nOnce the function is fully set up, it should look like the following image.\n\n![Fully Setup Function](images/fully-setup-function.png)\n\n### Step 4: Create GELF (TCP) input\n\nCreate a GELF (TCP) input on a Graylog node. The input can be created globally if load balancing is desired. Note that \nthe port number should match the one specified in the configuration.\n\n![GELF Input](images/gelf-input.png) \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgraylog2%2Fgraylog-s3-lambda","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgraylog2%2Fgraylog-s3-lambda","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgraylog2%2Fgraylog-s3-lambda/lists"}