{"id":18573152,"url":"https://github.com/localstack-samples/sample-cdk-kinesis-firehose-redshift","last_synced_at":"2026-02-02T14:02:29.613Z","repository":{"id":227743912,"uuid":"772084882","full_name":"localstack-samples/sample-cdk-kinesis-firehose-redshift","owner":"localstack-samples","description":"LocalStack sample CDK app deploying a Kinesis Event Stream to Data Firehose to Redshift data pipeline, including sample producer and consumer","archived":false,"fork":false,"pushed_at":"2024-12-04T07:28:12.000Z","size":248,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":15,"default_branch":"main","last_synced_at":"2024-12-04T08:25:18.684Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/localstack-samples.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-14T14:04:20.000Z","updated_at":"2024-12-04T07:28:16.000Z","dependencies_parsed_at":"2024-12-04T08:23:04.029Z","dependency_job_id":"c9d9e741-7332-4ca0-ac51-919c700d1369","html_url":"https://github.com/localstack-samples/sample-cdk-kinesis-firehose-redshift","commit_stats":null,"previous_names":["localstack-samples/sample-cdk-kinesis-firehose-redshift"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/localstack-samples%2Fsample-cdk-kinesis-firehose-redshift","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/localstack-sample
s%2Fsample-cdk-kinesis-firehose-redshift/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/localstack-samples%2Fsample-cdk-kinesis-firehose-redshift/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/localstack-samples%2Fsample-cdk-kinesis-firehose-redshift/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/localstack-samples","download_url":"https://codeload.github.com/localstack-samples/sample-cdk-kinesis-firehose-redshift/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":231367209,"owners_count":18365861,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T23:08:13.852Z","updated_at":"2026-02-02T14:02:24.579Z","avatar_url":"https://github.com/localstack-samples.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CDK deployment of a Kinesis Event Stream to Data Firehose to Redshift data pipeline\nLocalStack sample CDK app deploying a Kinesis Event Stream to Data Firehose to Redshift data pipeline, including sample producer and consumer\n\n| Key          | Value                                                                                                |\n| ------------ | ---------------------------------------------------------------------------------------------------- |\n| Environment  | \u003cimg 
src=\"https://img.shields.io/badge/LocalStack-deploys-4D29B4.svg?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAEAAAABACAYAAACqaXHeAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAAKgAAACoABZrFArwAAABl0RVh0U29mdHdhcmUAd3d3Lmlua3NjYXBlLm9yZ5vuPBoAAALbSURBVHic7ZpNaxNRFIafczNTGIq0G2M7pXWRlRv3Lusf8AMFEQT3guDWhX9BcC/uFAr1B4igLgSF4EYDtsuQ3M5GYrTaj3Tmui2SpMnM3PlK3m1uzjnPw8xw50MoaNrttl+r1e4CNRv1jTG/+v3+c8dG8TSilHoAPLZVX0RYWlraUbYaJI2IuLZ7KKUWCisgq8wF5D1A3rF+EQyCYPHo6Ghh3BrP8wb1en3f9izDYlVAp9O5EkXRB8dxxl7QBoNBpLW+7fv+a5vzDIvVU0BELhpjJrmaK2NMw+YsIxunUaTZbLrdbveZ1vpmGvWyTOJToNlsuqurq1vAdWPMeSDzwzhJEh0Bp+FTmifzxBZQBXiIKaAq8BBDQJXgYUoBVYOHKQRUER4mFFBVeJhAQJXh4QwBVYeHMQJmAR5GCJgVeBgiYJbg4T8BswYPp+4GW63WwvLy8hZwLcd5TudvBj3+OFBIeA4PD596nvc1iiIrD21qtdr+ysrKR8cY42itCwUP0Gg0+sC27T5qb2/vMunB/0ipTmZxfN//orW+BCwmrGV6vd63BP9P2j9WxGbxbrd7B3g14fLfwFsROUlzBmNM33XdR6Meuxfp5eg54IYxJvXCx8fHL4F3w36blTdDI4/0WREwMnMBeQ+Qd+YC8h4g78wF5D1A3rEqwBiT6q4ubpRSI+ewuhP0PO/NwcHBExHJZZ8PICI/e73ep7z6zzNPwWP1djhuOp3OfRG5kLROFEXv19fXP49bU6TbYQDa7XZDRF6kUUtEtoFb49YUbh/gOM7YbwqnyG4URQ/PWlQ4ASllNwzDzY2NDX3WwioKmBgeqidgKnioloCp4aE6AmLBQzUExIaH8gtIBA/lFrCTFB7KK2AnDMOrSeGhnAJSg4fyCUgVHsolIHV4KI8AK/BQDgHW4KH4AqzCQwEfiIRheKKUAvjuuu7m2tpakPdMmcYYI1rre0EQ1LPo9w82qyNziMdZ3AAAAABJRU5ErkJggg==\"\u003e \u003cimg src=\"https://img.shields.io/badge/AWS-deploys-F29100.svg?logo=amazon\"\u003e                                                                                  |\n| Services     | Kinesis Data Stream, Firehose, S3, Redshift                                                          |\n| Integrations | CDK                                                                                                  |\n| Categories   | BigData                                                                                  |\n| Level        | Intermediate                                                                                         |\n| GitHub       | [Repository 
link](https://github.com/localstack-samples/sample-cdk-kinesis-firehose-redshift)        |\n\n\n![architecture diagram showing the pipeline including producer, kinesis stream, data firehose, s3 bucket, redshift and consumer](architecture-diagram.png)\n\n# Prerequisites\n\n## Required Software\n- Python 3.11\n- node \u003e16\n- Docker\n- AWS CLI\n- AWS CDK\n- LocalStack CLI\n\n\u003cdetails\u003e\n  \u003csummary\u003eif you are on Mac:\u003c/summary\u003e\n\n    1. install python@3.11\n        \n        ```bash\n        brew install pyenv\n        pyenv install 3.11.0\n        ```\n\n    2. install nvm and node \u003e= 16\n    \n        ```bash\n        brew install nvm\n        nvm install 20\n        nvm use 20\n        ```\n    3. install docker\n\n        ```bash\n        brew install docker\n        ```\n\n    4. install aws cli, cdk\n\n        ```bash\n        brew install awscli\n        npm install -g aws-cdk\n        ```\n\n    5. install localstack-cli and cdklocal\n        \n        ```bash\n        brew install localstack/tap/localstack-cli\n        npm install -g aws-cdk-local\n        ```\n\u003c/details\u003e\n\n\n## Setup development environment\nClone the repository and navigate to the project directory.\n    \n    ```bash\n    git clone git@github.com:localstack-samples/sample-cdk-kinesis-firehose-redshift.git\n    cd sample-cdk-kinesis-firehose-redshift\n    ```\n\nCopy `.env.example` to `.env` and set the environment variables based on your target environment.\nYou can use the sample user, password, and names, or set your own.\n\n\n\nCreate a virtualenv using python@3.11 and install all the development dependencies there:\n\n```bash\npyenv local 3.11.0\npython -m venv .venv\nsource .venv/bin/activate\npip install -r requirements-dev.txt\n```    \n\n\n# Deployment\n- Configure the AWS CLI\n- Set the environment variables in the .env file based on .env.example\n\n## Deploy the CDK stack manually\nAgainst AWS\n\n- unset the .env variable \"AWS_ENDPOINT_URL\"\nby uncommenting the line in the `.env` file and reloading it.\nIf you run the debugger, you will also need to uncomment the line in `.vscode/launch.json`\n  \n```bash\ncdk synth\ncdk bootstrap\ncdk deploy KinesisFirehoseRedshiftStack1\npython -m utils.prepare_redshift\ncdk deploy KinesisFirehoseRedshiftStack2\n```\n\nAgainst LocalStack\n\n- set the .env variable \"AWS_ENDPOINT_URL\" to \"http://localhost:4566\"\n\n```bash\nlocalstack start\ncdklocal synth\ncdklocal bootstrap\ncdklocal deploy KinesisFirehoseRedshiftStack1\npython -m utils.prepare_redshift\ncdklocal deploy KinesisFirehoseRedshiftStack2\n```\n\n## Deploy the CDK stack using the Makefile\nAgainst AWS\n\n- unset the .env variable \"AWS_ENDPOINT_URL\" \nby uncommenting the line in the `.env` file and reloading it.\nIf you run the debugger, you will also need to uncomment the line in `.vscode/launch.json`\n\n```bash\nmake deploy-aws\n```\n\nAgainst LocalStack\n\n- set the .env variable \"AWS_ENDPOINT_URL\" to \"http://localhost:4566\"\n\n```bash\nlocalstack start\nmake deploy-localstack\n```\n\n# Testing\n\n## Run the tests either against AWS or LocalStack\n```bash\nmake test\n```\n\nThis will run a pytest defined in `tests/test_cdk.py`, put sample data into the Kinesis stream and check if the data is being ingested into the Redshift table.\nIf you are running the tests against LocalStack, you need to restart the LocalStack container for consecutive runs, since the Redshift table is not being cleaned up after the tests.\nThe same is true for the AWS deployment: you can manually clean up the Redshift table after the tests, or re-deploy the stack.\n\n## GitHub Actions CI tests\nThe GitHub Actions workflow defined in `.github/workflows/main.yaml` will install the required dependencies, start a LocalStack container, deploy the infrastructure against LocalStack, and run the test.\nTo set up the workflow, you need to create an environment and set the variables and secrets from your `.env` file.\nThe workflow will run on every push to the main branch.\n\n# Interact with the deployed resources\n\n## Start sample Kinesis producer\nSet the endpoint URL and port according to your target.\n```bash\nmake start-producer\n```\nThis will run the producer defined in `utils/producer.py` in the background and start sending new data to the Kinesis stream every 10 seconds.\n\n## Read data from Redshift\nOpen the Jupyter Notebook (the simplest way, if you are on VS Code, is to use the extension: https://code.visualstudio.com/docs/datascience/jupyter-notebooks) and run the cells to read data from Redshift.\nAs new data from the mock Kinesis producer is being sent to the Kinesis stream, the data will be automatically ingested into the Redshift table.\nYou can re-run the cells in the Jupyter Notebook to see the data being updated in real time.\n\n# Contributing\nWe appreciate your interest in contributing to our project and are always looking for new ways to improve the developer experience.\nWe welcome feedback, bug reports, and even feature ideas from the community. Please refer to the contributing file for more details on how to get started.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flocalstack-samples%2Fsample-cdk-kinesis-firehose-redshift","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flocalstack-samples%2Fsample-cdk-kinesis-firehose-redshift","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flocalstack-samples%2Fsample-cdk-kinesis-firehose-redshift/lists"}