{"id":21588138,"url":"https://github.com/aws-solutions/clickstream-analytics-on-aws","last_synced_at":"2025-08-23T02:38:12.989Z","repository":{"id":176500228,"uuid":"645683853","full_name":"aws-solutions/clickstream-analytics-on-aws","owner":"aws-solutions","description":"Clickstream Analytics on AWS source code","archived":false,"fork":false,"pushed_at":"2025-06-25T19:34:03.000Z","size":75900,"stargazers_count":81,"open_issues_count":3,"forks_count":28,"subscribers_count":15,"default_branch":"main","last_synced_at":"2025-06-25T20:31:29.429Z","etag":null,"topics":["aws","aws-amplify","aws-cdk","aws-clickstream-solution","aws-emr-serverless","aws-kinesis-stream","aws-msk","aws-quicksight","aws-redshift","aws-solutions","clickstream","data-analysis","web-analysis","web-analytics"],"latest_commit_sha":null,"homepage":"https://aws.amazon.com/solutions/implementations/clickstream-analytics-on-aws/","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aws-solutions.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-05-26T07:48:38.000Z","updated_at":"2025-06-18T10:23:47.000Z","dependencies_parsed_at":"2023-09-24T13:42:23.035Z","dependency_job_id":"f6114899-3845-4869-8af0-865fe24b2b3c","html_url":"https://github.com/aws-solutions/clickstream-analytics-on-aws","commit_stats":null,"previous_names":["awslabs/clickstream-analytics-on-aws","aws-solutions/clickstream-analytics-on-aws"],"tags_count":47,"template":false,"template_full_name":null,"purl":"pkg:g
ithub/aws-solutions/clickstream-analytics-on-aws","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws-solutions%2Fclickstream-analytics-on-aws","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws-solutions%2Fclickstream-analytics-on-aws/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws-solutions%2Fclickstream-analytics-on-aws/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws-solutions%2Fclickstream-analytics-on-aws/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aws-solutions","download_url":"https://codeload.github.com/aws-solutions/clickstream-analytics-on-aws/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws-solutions%2Fclickstream-analytics-on-aws/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271732968,"owners_count":24811440,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-23T02:00:09.327Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","aws-amplify","aws-cdk","aws-clickstream-solution","aws-emr-serverless","aws-kinesis-stream","aws-msk","aws-quicksight","aws-redshift","aws-solutions","clickstream","data-analysis","web-analysis","web-analytics"],"created_at":"2024-11-24T16:00:44.246Z","updated_at":"202
5-08-23T02:38:12.968Z","avatar_url":"https://github.com/aws-solutions.png","language":"TypeScript","funding_links":[],"categories":["TypeScript"],"sub_categories":[],"readme":"# Clickstream Analytics on AWS\n\nAn end-to-end solution to collect, ingest, analyze, and visualize clickstream data inside your web and mobile applications.\n\n## Solution Overview\n\nThis solution collects, ingests, analyzes, and visualizes clickstream events from your websites and mobile applications. Clickstream data is critical for online business analytics use cases, such as user behavior analysis, customer data platform, and marketing analysis. This data derives insights into the patterns of user interactions on a website or application, helping businesses understand user navigation, preferences, and engagement levels to drive product innovation and optimize marketing investments.\n\nWith this solution, you can quickly configure and deploy a data pipeline that fits your business and technical needs. It provides purpose-built software development kits (SDKs) that automatically collect common events and easy-to-use APIs to report custom events, enabling you to easily send your customers’ clickstream data to the data pipeline in your AWS account. The solution also offers pre-assembled dashboards that visualize key metrics about user lifecycle, including acquisition, engagement, activity, and retention, and adds visibility into user devices and geographies. You can combine user behavior data with business backend data to create a comprehensive data platform and generate insights that drive business growth.\n\n## Architecture Overview\n\n![architecture diagram](./docs/images/architecture/01-architecture-end-to-end.png)\n\n1. Amazon CloudFront distributes the frontend web UI assets hosted in the Amazon S3 bucket, and the backend APIs hosted with Amazon API Gateway and AWS Lambda.\n2. The Amazon Cognito user pool or OpenID Connect (OIDC) is used for authentication.\n3. 
The web UI console uses Amazon DynamoDB to store persistent data.\n4. AWS Step Functions, AWS CloudFormation, AWS Lambda, and Amazon EventBridge orchestrate the lifecycle management of data pipelines.\n5. The data pipeline is provisioned in the Region specified by the system operator. It consists of Application Load Balancer (ALB), Amazon ECS, Amazon Managed Streaming for Apache Kafka (Amazon MSK), Amazon Kinesis Data Streams, Amazon S3, Amazon EMR Serverless, Amazon Redshift, and Amazon QuickSight.\n\nFor more information, refer to [the doc][doc-arch].\n\n## SDKs\n\nClickstream Analytics on AWS provides client-side SDKs that make it easier to report events to the data pipeline created by the solution. Currently, the solution supports the following platforms:\n\n- [Android][android-sdk]\n- [Swift][swift-sdk]\n- [Web][web-sdk]\n- [Flutter][flutter-sdk]\n- [React Native][react-native-sdk]\n- [WeChat Mini Program][wechat-sdk]\n- [HTTP API][http-api]\n\nSee [this repo][sdk-samples] for different kinds of SDK samples.\n\n## Deployment\n\n### Using AWS CloudFormation template\n\nFollow the [implementation guide][doc-deployment] to deploy the solution using the AWS CloudFormation template.\n\n### Using AWS CDK\n\n#### Preparations\n\n- Make sure you have an AWS account\n- Configure [AWS CLI credentials][configure-aws-cli]\n- Install Node.js LTS version 20.12.0 or later\n- Install Docker Engine\n- Install pnpm: `npm install -g pnpm@9.15.3`\n- Install the dependencies of the solution by running `pnpm install \u0026\u0026 pnpm projen \u0026\u0026 pnpm nx build @aws/clickstream-base-lib`\n- Bootstrap the CDK toolkit stack in your AWS environment by running `npx cdk bootstrap` (required only the first time you deploy via [AWS CDK][aws-cdk])\n\n#### Deploy the web console\n\n```shell\n# deploy the web console of the solution\nnpx cdk deploy cloudfront-s3-control-plane-stack-global --parameters Email=\u003cyour email\u003e --require-approval 
never\n```\n\n#### Deploy pipeline stacks\n\n```shell\n# deploy the ingestion server with s3 sink\n# 1. check stack name in src/main.ts for other stacks\n# 2. check the stack for required CloudFormation parameters\nnpx cdk deploy ingestion-server-s3-stack --parameters ...\n```\n\n#### Deploy local code for updating existing stacks created by the web console\n\n```shell\n# update the existing data modeling Redshift stack Clickstream-DataModelingRedshift-xxx\nbash e2e-deploy.sh -n modelRedshiftStackName -s Clickstream-DataModelingRedshift-xxx\n# update the existing web console\nbash e2e-deploy.sh -n standardControlPlaneStackName -s \u003cstack name of existing web console\u003e -c\n```\n\n## Test\n\n```shell\npnpm test\n```\n\n## Local development for web console\n\n- Step 1: Deploy the solution control plane (this creates the DynamoDB tables, state machine, and other resources).\n- Step 2: Open the **Amazon Cognito** console, select the corresponding **User pool**, click the **App integration** tab, select the application details in the **App client list**, edit **Hosted UI**, and add the URL `http://localhost:3000/signin` to **Allowed callback URLs**.\n- Step 3: Go to the folder `src/control-plane/local`\n\n```shell\ncd src/control-plane/local\n```\n\n```shell\n# run the backend server locally\nbash start.sh -s backend\n```\n\n```shell\n# run the frontend server locally\nbash start.sh -s frontend\n```\n\n## Local build spark ETL jar\n\n- Step 1: Build ETL common\n\n```shell\ncd src/data-pipeline/etl-common \n./gradlew clean build install\n\n```\n\n- Step 2: Build the Spark ETL jar\n\n```shell\ncd src/data-pipeline/spark-etl\n\n# build with unit tests\n./gradlew clean build \n\n# or only build jar and skip all unit tests \n./gradlew clean build -x test -x :coverageCheck\n\n# check the jar file\nls -l ./build/libs/spark-etl-*.jar\n\n```\n\n## Security\n\nSee [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.\n\n## License\n\nThis project is licensed under the Apache-2.0 
License.\n\n## File Structure\n\nUpon successfully cloning the repository into your local development environment but prior to running the initialization script, you will see the following file structure in your editor:\n\n```\n├── CHANGELOG.md                       [Change log file]\n├── CODE_OF_CONDUCT.md                 [Code of conduct file]\n├── CONTRIBUTING.md                    [Contribution guide]\n├── LICENSE                            [LICENSE for this solution]\n├── NOTICE.txt                         [Notice for 3rd-party libraries]\n├── README.md                          [Read me file]\n├── buildspec.yml\n├── cdk.json\n├── codescan-prebuild-custom.sh\n├── deployment                         [shell scripts for packaging distribution assets]\n│   ├── build-open-source-dist.sh\n│   ├── build-s3-dist-1.sh\n│   ├── build-s3-dist.sh\n│   ├── cdk-solution-helper\n│   ├── post-build-1\n│   ├── run-all-test.sh\n│   ├── solution_config\n│   ├── test\n│   ├── test-build-dist.sh\n│   └── test-deploy-tag-images.sh\n├── docs                               [document]\n│   ├── en\n│   ├── index.html\n│   ├── mkdocs.base.yml\n│   ├── mkdocs.en.yml\n│   ├── mkdocs.zh.yml\n│   ├── site\n│   ├── test-deploy-mkdocs.sh\n│   └── zh\n├── examples                           [example code]\n│   ├── custom-plugins\n│   └── standalone-data-generator\n├── frontend                           [frontend source code]\n│   ├── README.md\n│   ├── build\n│   ├── config\n│   ├── esbuild.ts\n│   ├── node_modules\n│   ├── package.json\n│   ├── public\n│   ├── scripts\n│   ├── src\n│   ├── tsconfig.json\n├── package.json\n├── sonar-project.properties\n├── src                                [all backend source code]\n│   ├── alb-control-plane-stack.ts\n│   ├── analytics\n│   ├── base-lib\n│   ├── cloudfront-control-plane-stack.ts\n│   ├── common\n│   ├── control-plane\n│   ├── data-analytics-redshift-stack.ts\n│   ├── data-modeling-athena-stack.ts\n│   ├── data-pipeline\n│   ├── 
data-pipeline-stack.ts\n│   ├── data-reporting-quicksight-stack.ts\n│   ├── ingestion-server\n│   ├── ingestion-server-stack.ts\n│   ├── kafka-s3-connector-stack.ts\n│   ├── main.ts\n│   ├── metrics\n│   ├── metrics-stack.ts\n│   └── reporting\n├── test                               [test code]\n│   ├── analytics\n│   ├── common\n│   ├── constants.ts\n│   ├── control-plane\n│   ├── data-pipeline\n│   ├── ingestion-server\n│   ├── jestEnv.js\n│   ├── metrics\n│   ├── reporting\n│   ├── rules.ts\n│   └── utils.ts\n├── tsconfig.dev.json\n├── tsconfig.json\n```\n\n[android-sdk]: https://github.com/aws-solutions/clickstream-analytics-on-aws-android-sdk\n[swift-sdk]: https://github.com/aws-solutions/clickstream-analytics-on-aws-swift-sdk\n[flutter-sdk]: https://github.com/aws-solutions/clickstream-analytics-on-aws-flutter-sdk\n[react-native-sdk]: https://github.com/aws-solutions/clickstream-analytics-on-aws-react-native-sdk\n[web-sdk]: https://github.com/aws-solutions/clickstream-analytics-on-aws-web-sdk\n[wechat-sdk]: https://github.com/awslabs/clickstream-wechat\n[http-api]: https://aws-solutions.github.io/clickstream-analytics-on-aws/en/latest/sdk-manual/http-api/\n[configure-aws-cli]: https://docs.aws.amazon.com/zh_cn/cli/latest/userguide/cli-chap-configure.html\n[aws-cdk]: https://aws.amazon.com/cdk/\n[doc-arch]: https://docs.aws.amazon.com/solutions/latest/clickstream-analytics-on-aws/architecture-overview.html\n[doc-deployment]: https://docs.aws.amazon.com/solutions/latest/clickstream-analytics-on-aws/deployment.html\n[sdk-samples]: https://github.com/aws-samples/clickstream-sdk-samples\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faws-solutions%2Fclickstream-analytics-on-aws","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faws-solutions%2Fclickstream-analytics-on-aws","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faws-solutions%2Fclickstream-analytics-on-aws/lists"}