{"id":19024372,"url":"https://github.com/aws-solutions/maintaining-personalized-experiences-with-machine-learning","last_synced_at":"2025-04-15T17:24:47.206Z","repository":{"id":41141161,"uuid":"409427545","full_name":"aws-solutions/maintaining-personalized-experiences-with-machine-learning","owner":"aws-solutions","description":"The Maintaining Personalized Experiences with Machine Learning solution provides an automated pipeline to maintain resources in Amazon Personalize. This pipeline allows you to keep up to date with your user’s most recent activity while sustaining and improving the relevance of recommendations","archived":false,"fork":false,"pushed_at":"2024-10-24T19:05:50.000Z","size":1118,"stargazers_count":26,"open_issues_count":3,"forks_count":17,"subscribers_count":16,"default_branch":"main","last_synced_at":"2025-03-28T23:04:46.088Z","etag":null,"topics":["amazon-personalize","artificial-intelligence","machine-learning"],"latest_commit_sha":null,"homepage":"https://aws.amazon.com/solutions/implementations/maintaining-personalized-experiences-with-ml","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aws-solutions.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-09-23T02:57:44.000Z","updated_at":"2025-03-10T22:38:07.000Z","dependencies_parsed_at":"2023-10-13T10:39:46.441Z","dependency_job_id":"fa4b1edc-a898-439e-8b59-0a822db77792","html_url":"https://github.com/aws-solutions/maintaining-personalized-experiences-with-machine-learning","comm
it_stats":null,"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws-solutions%2Fmaintaining-personalized-experiences-with-machine-learning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws-solutions%2Fmaintaining-personalized-experiences-with-machine-learning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws-solutions%2Fmaintaining-personalized-experiences-with-machine-learning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws-solutions%2Fmaintaining-personalized-experiences-with-machine-learning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aws-solutions","download_url":"https://codeload.github.com/aws-solutions/maintaining-personalized-experiences-with-machine-learning/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249117234,"owners_count":21215353,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amazon-personalize","artificial-intelligence","machine-learning"],"created_at":"2024-11-08T20:36:33.274Z","updated_at":"2025-04-15T17:24:47.183Z","avatar_url":"https://github.com/aws-solutions.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Maintaining Personalized Experiences with Machine Learning\n\nThe Maintaining Personalized Experiences with Machine Learning solution provides a mechanism to automate much of the\nworkflow around 
Amazon Personalize. This includes dataset group creation, dataset creation and import, solution\ncreation, solution version creation, campaign creation and batch inference job creation.\n\nScheduled rules can be configured for setting up import jobs, solution version retraining (with campaign update) and\nbatch inference job creation.\n\n## Table of Contents\n\n- [Architecture for the AWS MLOps for Amazon Personalize Solution](#architecture)\n- [AWS CDK Constructs](#aws-cdk-constructs)\n- [Deployment](#deployment)\n- [Creating a custom build](#creating-a-custom-build)\n- [Collection of operational metrics](#collection-of-operational-metrics)\n\n## Architecture\n\nThe following diagram shows the architecture of the solution:\n\n![architecture](source/images/solution-architecture.png)\n\nThe AWS CloudFormation template deploys the resources required to automate your Amazon Personalize usage and deployments.\nThe template includes the following components:\n\n1. An Amazon S3 bucket used to store personalization data and configuration files.\n2. An AWS Lambda function triggered when a new or updated personalization configuration is uploaded to the personalization data bucket.\n3. An AWS Step Functions workflow to manage all of the resources of an Amazon Personalize dataset group (including datasets, schemas, event tracker, filters, solutions, campaigns, and batch inference jobs).\n4. Amazon CloudWatch metrics for Amazon Personalize, added for each newly trained solution version to help you evaluate the performance of a model over time.\n5. An Amazon Simple Notification Service (SNS) topic and subscription to notify an administrator via email when the maintenance workflow has completed.\n6. Amazon DynamoDB is used to track the scheduled events configured for Amazon Personalize to fully or partially retrain solutions, (re)import datasets and perform batch inference jobs.\n7. 
An AWS Step Functions workflow tracks the currently running scheduled events and invokes Step Functions workflows to perform solution maintenance (creating new solution versions, updating campaigns), import updated datasets, and perform batch inference.\n8. A set of maintenance AWS Step Functions workflows is provided to:\n   1. Create new dataset import jobs on schedule\n   2. Perform solution FULL retraining on schedule (and update associated campaigns)\n   3. Perform solution UPDATE retraining on schedule (and update associated campaigns)\n   4. Create batch inference jobs\n9. An Amazon EventBridge event bus, where resource status notification updates are posted throughout the AWS Step\n   Functions workflow.\n10. A command line interface (CLI) lets existing resources be imported and allows schedules to be established for\n    resources that already exist in Amazon Personalize.\n\n\u003e **Note**: From v1.0.0, AWS CloudFormation template resources are created by the [AWS CDK](https://aws.amazon.com/cdk/)\n\u003e and [AWS Solutions Constructs](https://aws.amazon.com/solutions/constructs/).\n\n### AWS CDK Constructs\n\n[AWS CDK Solutions Constructs](https://aws.amazon.com/solutions/constructs/) make it easier to consistently create\nwell-architected applications. All AWS Solutions Constructs are reviewed by AWS and use best practices established by\nthe AWS Well-Architected Framework. This solution uses the following AWS CDK Solutions Constructs:\n\n- [aws-lambda-sns](https://docs.aws.amazon.com/solutions/latest/constructs/aws-lambda-sns.html)\n\n## Deployment\n\nYou can launch this solution with one click from [AWS Solutions Implementations](https://aws.amazon.com/solutions/implementations/maintaining-personalized-experiences-with-ml).\n\nTo customize the solution, or to contribute to the solution, see [Creating a custom build](#creating-a-custom-build).\n\n## Configuration\n\nThis solution uses **parameter files**. 
The parameter file contains all the necessary information to create and maintain\nyour resources in Amazon Personalize.\n\nThe file can contain the following sections\n\n- `datasetGroup`\n- `datasets`\n- `solutions` (can contain `campaigns` and `batchInferenceJobs`)\n- `eventTracker`\n- `filters`\n\n\u003cdetails\u003e\n\u003csummary\u003eSee a sample of the parameter file\u003c/summary\u003e\n\n```json\n{\n\t\"datasetGroup\": {\n\t\t\"serviceConfig\": {\n\t\t\t\"name\": \"dataset-group-name-1\"\n\t\t},\n\t\t\"workflowConfig\": {\n\t\t\t\"schedules\": {\n\t\t\t\t\"import\": \"cron(0 */6 * * ? *)\"\n\t\t\t}\n\t\t}\n\t},\n\t\"datasets\": {\n\t\t\"users\": {\n\t\t\t\"dataset\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"users-data\"\n\t\t\t\t}\n\t\t\t},\n\t\t\t\"schema\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"users-schema\",\n\t\t\t\t\t\"schema\": {\n\t\t\t\t\t\t\"type\": \"record\",\n\t\t\t\t\t\t\"name\": \"users\",\n\t\t\t\t\t\t\"namespace\": \"com.amazonaws.personalize.schema\",\n\t\t\t\t\t\t\"fields\": [\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"USER_ID\",\n\t\t\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"AGE\",\n\t\t\t\t\t\t\t\t\"type\": \"int\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"GENDER\",\n\t\t\t\t\t\t\t\t\"type\": \"string\",\n\t\t\t\t\t\t\t\t\"categorical\": true\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t]\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t},\n\t\t\"interactions\": {\n\t\t\t\"dataset\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"interactions-data\"\n\t\t\t\t}\n\t\t\t},\n\t\t\t\"schema\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"interactions-schema\",\n\t\t\t\t\t\"schema\": {\n\t\t\t\t\t\t\"type\": \"record\",\n\t\t\t\t\t\t\"name\": \"interactions\",\n\t\t\t\t\t\t\"namespace\": \"com.amazonaws.personalize.schema\",\n\t\t\t\t\t\t\"fields\": [\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"ITEM_ID\",\n\t\t\t\t\t\t\t\t\"type\": 
\"string\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"USER_ID\",\n\t\t\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"TIMESTAMP\",\n\t\t\t\t\t\t\t\t\"type\": \"long\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"EVENT_TYPE\",\n\t\t\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"EVENT_VALUE\",\n\t\t\t\t\t\t\t\t\"type\": \"float\"\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t]\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t},\n\t\"solutions\": [\n\t\t{\n\t\t\t\"serviceConfig\": {\n\t\t\t\t\"name\": \"sims-solution\",\n\t\t\t\t\"recipeArn\": \"arn:aws:personalize:::recipe/aws-sims\"\n\t\t\t},\n\t\t\t\"workflowConfig\": {\n\t\t\t\t\"schedules\": {\n\t\t\t\t\t\"full\": \"cron(0 0 ? * 1 *)\"\n\t\t\t\t}\n\t\t\t}\n\t\t},\n\t\t{\n\t\t\t\"serviceConfig\": {\n\t\t\t\t\"name\": \"popularity-count-solution\",\n\t\t\t\t\"recipeArn\": \"arn:aws:personalize:::recipe/aws-popularity-count\"\n\t\t\t},\n\t\t\t\"workflowConfig\": {\n\t\t\t\t\"schedules\": {\n\t\t\t\t\t\"full\": \"cron(0 1 ? * 1 *)\"\n\t\t\t\t}\n\t\t\t}\n\t\t},\n\t\t{\n\t\t\t\"serviceConfig\": {\n\t\t\t\t\"name\": \"user-personalization-solution\",\n\t\t\t\t\"recipeArn\": \"arn:aws:personalize:::recipe/aws-user-personalization\"\n\t\t\t},\n\t\t\t\"workflowConfig\": {\n\t\t\t\t\"schedules\": {\n\t\t\t\t\t\"full\": \"cron(0 2 ? * 1 *)\"\n\t\t\t\t}\n\t\t\t},\n\t\t\t\"campaigns\": [\n\t\t\t\t{\n\t\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\t\"name\": \"user-personalization-campaign\",\n\t\t\t\t\t\t\"minProvisionedTPS\": 1\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t],\n\t\t\t\"batchInferenceJobs\": [\n\t\t\t\t{\n\t\t\t\t\t\"serviceConfig\": {},\n\t\t\t\t\t\"workflowConfig\": {\n\t\t\t\t\t\t\"schedule\": \"cron(0 3 * * ? 
*)\"\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t]\n\t\t}\n\t],\n\t\"eventTracker\": {\n\t\t\"serviceConfig\": {\n\t\t\t\"name\": \"dataset-group-name-event-tracker\"\n\t\t}\n\t},\n\t\"filters\": [\n\t\t{\n\t\t\t\"serviceConfig\": {\n\t\t\t\t\"name\": \"clicked-or-streamed\",\n\t\t\t\t\"filterExpression\": \"INCLUDE ItemID WHERE Interactions.EVENT_TYPE in (\\\"click\\\", \\\"stream\\\")\"\n\t\t\t}\n\t\t},\n\t\t{\n\t\t\t\"serviceConfig\": {\n\t\t\t\t\"name\": \"interacted\",\n\t\t\t\t\"filterExpression\": \"INCLUDE ItemID WHERE Interactions.EVENT_TYPE in (\\\"*\\\")\"\n\t\t\t}\n\t\t}\n\t]\n}\n```\n\n\u003c/details\u003e\n\nThis solution allows you to manage multiple dataset groups through the use of multiple parameter files. All .json files\ndiscovered under the `train/` prefix will trigger the workflow; however, the following structure is recommended:\n\n```\ntrain/\n│\n├── \u003cdataset_group_1\u003e/ (option 1 - single csv files for data import)\n│   ├── config.json\n│   ├── interactions.csv\n│   ├── items.csv (optional)\n│   └── users.csv (optional)\n│\n└── \u003cdataset_group_2\u003e/ (option 2 - multiple csv files for data import)\n    ├── config.json\n    ├── interactions/\n    │   ├── \u003cinteractions_part_1\u003e.csv\n    │   ├── \u003cinteractions_part_2\u003e.csv\n    │   └── \u003cinteractions_part_n\u003e.csv\n    ├── users/ (optional)\n    │   ├── \u003cusers_part_1\u003e.csv\n    │   ├── \u003cusers_part_2\u003e.csv\n    │   └── \u003cusers_part_n\u003e.csv\n    └── items/ (optional)\n        ├── \u003citems_part_1\u003e.csv\n        ├── \u003citems_part_2\u003e.csv\n        └── \u003citems_part_n\u003e.csv\n```\n\nIf batch inference jobs are required, [batch inference job configuration files](https://docs.aws.amazon.com/personalize/latest/dg/recommendations-batch.html#batch-data-upload)\nmust also be uploaded to the following location:\n\n```\nbatch/\n│\n└── \u003cdataset_group_name\u003e/\n    └── \u003csolution_name\u003e/\n        └── 
job_config.json\n```\n\nBatch inference output will be produced at the following location:\n\n```\nbatch/\n│\n└── \u003cdataset_group_name\u003e/\n    └── \u003csolution_name\u003e/\n        └── \u003csolution_name_YYYY_MM_DD_HH_MM_SS\u003e/\n            ├── _CHECK\n            └── job_config.json.out\n```\n\nNote: Using `performAutoML` is not recommended because this feature will be deprecated in the future. Please take the time to select the most appropriate recipe for your use case. If this parameter is present in the configuration, the workflow logs an error and continues to build the solution without it. For details, refer to the [FAQs](https://github.com/aws-samples/amazon-personalize-samples/blob/master/PersonalizeCheatSheet2.0.md) and the [Amazon Personalize Developer Guide](https://docs.aws.amazon.com/personalize/latest/dg/API_CreateSolution.html#personalize-CreateSolution-request-performAutoML).\n\n## Configuration with Tags\n\nYou can also optionally supply tags in your configurations:\n\n```json\n{\n\t\"datasetGroup\": {\n\t\t\"serviceConfig\": {\n\t\t\t\"name\": \"dataset-group-name-2\",\n\t\t\t\"tags\": [\n\t\t\t\t{\n\t\t\t\t\t\"tagKey\": \"dataset-group-key\",\n\t\t\t\t\t\"tagValue\": \"dataset-group-value\"\n\t\t\t\t}\n\t\t\t]\n\t\t}\n\t},\n\t\"datasets\": {\n\t\t\"serviceConfig\": {\n\t\t\t\"importMode\": \"FULL\",\n\t\t\t\"tags\": [\n\t\t\t\t{\n\t\t\t\t\t\"tagKey\": \"datasets-key\",\n\t\t\t\t\t\"tagValue\": \"datasets-value\"\n\t\t\t\t}\n\t\t\t]\n\t\t},\n\t\t\"interactions\": {\n\t\t\t\"dataset\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"interactions-data\",\n\t\t\t\t\t\"tags\": [\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\"tagKey\": \"interactions-dataset-key\",\n\t\t\t\t\t\t\t\"tagValue\": \"interactions-dataset-value\"\n\t\t\t\t\t\t}\n\t\t\t\t\t]\n\t\t\t\t}\n\t\t\t},\n\t\t\t\"schema\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"interactions-schema\",\n\t\t\t\t\t\"schema\": {\n\t\t\t\t\t\t\"type\": \"record\",\n\t\t\t\t\t\t\"name\": 
\"Interactions\",\n\t\t\t\t\t\t\"namespace\": \"com.amazonaws.personalize.schema\",\n\t\t\t\t\t\t\"fields\": [\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"USER_ID\",\n\t\t\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"ITEM_ID\",\n\t\t\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"TIMESTAMP\",\n\t\t\t\t\t\t\t\t\"type\": \"long\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"EVENT_TYPE\",\n\t\t\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t],\n\t\t\t\t\t\t\"version\": \"1.0\"\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t},\n\t\t\"items\": {\n\t\t\t\"dataset\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"items-data\",\n\t\t\t\t\t\"tags\": [\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\"tagKey\": \"items-dataset-key\",\n\t\t\t\t\t\t\t\"tagValue\": \"items-dataset-value\"\n\t\t\t\t\t\t}\n\t\t\t\t\t]\n\t\t\t\t}\n\t\t\t},\n\t\t\t\"schema\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"items-schema\",\n\t\t\t\t\t\"schema\": {\n\t\t\t\t\t\t\"type\": \"record\",\n\t\t\t\t\t\t\"name\": \"Items\",\n\t\t\t\t\t\t\"namespace\": \"com.amazonaws.personalize.schema\",\n\t\t\t\t\t\t\"fields\": [\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"ITEM_ID\",\n\t\t\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"GENRES\",\n\t\t\t\t\t\t\t\t\"type\": \"string\",\n\t\t\t\t\t\t\t\t\"categorical\": true\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"YEAR\",\n\t\t\t\t\t\t\t\t\"type\": \"int\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"CREATION_TIMESTAMP\",\n\t\t\t\t\t\t\t\t\"type\": \"long\"\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t],\n\t\t\t\t\t\t\"version\": \"1.0\"\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t},\n\t\t\"users\": {\n\t\t\t\"dataset\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"users-data\",\n\t\t\t\t\t\"tags\": [\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\"tagKey\": 
\"users-dataset-key\",\n\t\t\t\t\t\t\t\"tagValue\": \"users-dataset-value\"\n\t\t\t\t\t\t}\n\t\t\t\t\t]\n\t\t\t\t}\n\t\t\t},\n\t\t\t\"schema\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"users-schema\",\n\t\t\t\t\t\"schema\": {\n\t\t\t\t\t\t\"type\": \"record\",\n\t\t\t\t\t\t\"name\": \"Users\",\n\t\t\t\t\t\t\"namespace\": \"com.amazonaws.personalize.schema\",\n\t\t\t\t\t\t\"fields\": [\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"USER_ID\",\n\t\t\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"GENDER\",\n\t\t\t\t\t\t\t\t\"type\": \"string\",\n\t\t\t\t\t\t\t\t\"categorical\": true\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t],\n\t\t\t\t\t\t\"version\": \"1.0\"\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t},\n\t\"eventTracker\": {\n\t\t\"serviceConfig\": {\n\t\t\t\"name\": \"event-tracker-name\",\n\t\t\t\"tags\": [\n\t\t\t\t{\n\t\t\t\t\t\"tagKey\": \"event-tracker-key\",\n\t\t\t\t\t\"tagValue\": \"event-tracker-value\"\n\t\t\t\t}\n\t\t\t]\n\t\t}\n\t},\n\t\"solutions\": [\n\t\t{\n\t\t\t\"serviceConfig\": {\n\t\t\t\t\"name\": \"solution-recommender-user-personalization\",\n\t\t\t\t\"recipeArn\": \"arn:aws:personalize:::recipe/aws-user-personalization\",\n\t\t\t\t\"performHPO\": true,\n\t\t\t\t\"tags\": [\n\t\t\t\t\t{\n\t\t\t\t\t\t\"tagKey\": \"solution-key\",\n\t\t\t\t\t\t\"tagValue\": \"solution-value\"\n\t\t\t\t\t}\n\t\t\t\t],\n\t\t\t\t\"solutionVersion\": {\n\t\t\t\t\t\"name\": \"solutionV1\",\n\t\t\t\t\t\"trainingMode\": \"FULL\",\n\t\t\t\t\t\"tags\": [\n\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\"tagKey\": \"solution-version-key\",\n\t\t\t\t\t\t\t\"tagValue\": \"solution-version-value\"\n\t\t\t\t\t\t}\n\t\t\t\t\t]\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t]\n}\n```\n\nNote: You cannot tag already created resources through the configuration. Only \"FULL\" `importMode` for datasets is currently supported.\n\nTags can also be root-level tags and they apply to all components which do not have tags specified. 
For example, for the datasetGroup `dataset-group-name-3` specified below, `tagKey` \"project\" and `tagValue` \"user-personalization\" applies to `datasetGroup`, `interactions` dataset and its import, the `eventTracker`, and the `solutionVersion`, but the dataset-import gets the specified values for tagKey and tagValue as \"datasets-key\" and \"datasets-value\" respectively (applies for dataset imports of users, interactions and items datasets) and solution `solution-user-personalization` gets \"solution-key\" and \"solution-value\":\n\n```json\n{\n\t\"tags\": [\n\t\t{\n\t\t\t\"tagKey\": \"project\",\n\t\t\t\"tagValue\": \"user-personalization\"\n\t\t}\n\t],\n\t\"datasetGroup\": {\n\t\t\"serviceConfig\": {\n\t\t\t\"name\": \"dataset-group-name-3\"\n\t\t}\n\t},\n\t\"datasets\": {\n\t\t\"serviceConfig\": {\n\t\t\t\"importMode\": \"FULL\",\n\t\t\t\"tags\": [\n\t\t\t\t{\n\t\t\t\t\t\"tagKey\": \"datasets-key\",\n\t\t\t\t\t\"tagValue\": \"datasets-value\"\n\t\t\t\t}\n\t\t\t]\n\t\t},\n\t\t\"interactions\": {\n\t\t\t\"dataset\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"interactions-data\"\n\t\t\t\t}\n\t\t\t},\n\t\t\t\"schema\": {\n\t\t\t\t\"serviceConfig\": {\n\t\t\t\t\t\"name\": \"interactions-schema\",\n\t\t\t\t\t\"schema\": {\n\t\t\t\t\t\t\"type\": \"record\",\n\t\t\t\t\t\t\"name\": \"Interactions\",\n\t\t\t\t\t\t\"namespace\": \"com.amazonaws.personalize.schema\",\n\t\t\t\t\t\t\"fields\": [\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"USER_ID\",\n\t\t\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"ITEM_ID\",\n\t\t\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"TIMESTAMP\",\n\t\t\t\t\t\t\t\t\"type\": \"long\"\n\t\t\t\t\t\t\t},\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\t\"name\": \"EVENT_TYPE\",\n\t\t\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\t],\n\t\t\t\t\t\t\"version\": \"1.0\"\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t},\n\t\"eventTracker\": 
{\n\t\t\"serviceConfig\": {\n\t\t\t\"name\": \"event-tracker-name\"\n\t\t}\n\t},\n\t\"solutions\": [\n\t\t{\n\t\t\t\"serviceConfig\": {\n\t\t\t\t\"name\": \"solution-user-personalization\",\n\t\t\t\t\"recipeArn\": \"arn:aws:personalize:::recipe/aws-user-personalization\",\n\t\t\t\t\"performHPO\": true,\n\t\t\t\t\"tags\": [\n\t\t\t\t\t{\n\t\t\t\t\t\t\"tagKey\": \"solution-key\",\n\t\t\t\t\t\t\"tagValue\": \"solution-value\"\n\t\t\t\t\t}\n\t\t\t\t]\n\t\t\t}\n\t\t}\n\t]\n}\n```\n\nSome key points:\n\n1. Solution version tags can be specified in the `solutionVersion` field inside the solution's `serviceConfig` field (see the `dataset-group-name-2` example). Specifying `solutionVersion` is optional.\n2. Root-level tags apply to all components which do not have explicit tags specified (see the notes around the `dataset-group-name-3` JSON example).\n3. [`tags`](https://docs.aws.amazon.com/personalize/latest/dg/tagging-resources.html) are optional fields.\n\n## Training Mode\n\nTraining mode can be set to 'FULL' or 'UPDATE' through the `solutionVersion` field inside the `solution` specification.\n\nThe purpose of trainingMode=\"UPDATE\" is to process new items added to the items dataset (via PutItems or a bulk upload) as well as impression data for new interactions added to the interactions dataset since the last FULL/UPDATE training. The UPDATE mode only brings in new items and impression data and does not retrain the model. Therefore, if there are no dataset updates since the last FULL/UPDATE training, you might get an error saying \"There should be updates to at least one dataset after last active solution version with training mode set to FULL\".\n\nWith User-Personalization, Amazon Personalize automatically updates the latest model (solution version) every two hours behind the scenes to include new data. There is no cost for automatic updates. 
The solution version must be deployed with an [Amazon Personalize campaign](https://docs.aws.amazon.com/personalize/latest/dg/campaigns.html) for updates to occur. Your campaign automatically uses the updated solution version. No new solution version is created when an auto update completes and no new model metrics are generated. This is because no full retraining occurs. If you create a new solution version, Amazon Personalize will not automatically update older solution versions, even if you have deployed them in a campaign. Updates also do not occur if you have deleted your dataset.\n\nIf every two hours is not frequent enough, you can manually create a solution version with trainingMode set to UPDATE to include those new items in recommendations. Amazon Personalize automatically updates only your latest fully trained solution version, so the manually updated solution version won't be automatically updated in the future. Also note that if you create a solution version with UPDATE, you will be charged for the server hours to perform the update.\n\nFor more information about automatic updates, see the [Amazon Personalize Developer Guide](https://docs.aws.amazon.com/personalize/latest/dg/native-recipe-new-item-USER_PERSONALIZATION.html#automatic-updates).\n\n```json\n...\n\"solutions\": [\n    {\n      \"serviceConfig\": {\n        \"name\": \"affinity_item\",\n        \"recipeArn\": \"arn:aws:personalize:::recipe/aws-item-affinity\",\n        \"solutionVersion\": {\n          \"trainingMode\": \"UPDATE\",\n          \"tags\": [{\"tagKey\": \"project\", \"tagValue\": \"item-affinity\"}]\n        }\n      },\n      ...\n    }\n]\n...\n\n```\n\n## Creating a custom build\n\nTo customize the solution, follow the steps below:\n\n### Prerequisites\n\nThe following procedures assume that all the OS-level configuration has been completed. 
They are:\n\n- [AWS Command Line Interface](https://aws.amazon.com/cli/)\n- [Python](https://www.python.org/) 3.11 or newer\n- [Node.js](https://nodejs.org/en/) 16.x or newer\n- [AWS CDK](https://aws.amazon.com/cdk/) 2.88.0 or newer\n- [Amazon Corretto OpenJDK](https://docs.aws.amazon.com/corretto/) 17.0.4.1\n\n\u003e **Please ensure you test the templates before updating any production deployments.**\n\n### 1. Download or clone this repo\n\n```bash\ngit clone https://github.com/aws-solutions/maintaining-personalized-experiences-with-machine-learning\n```\n\n### 2. Create a Python virtual environment for development\n\n```bash\npython -m virtualenv .venv\nsource ./.venv/bin/activate\ncd ./source\npip install -r requirements-dev.txt\n```\n\n### 3. After introducing changes, run the unit tests to make sure the customizations don't break existing functionality\n\n```bash\npytest --cov\n```\n\n### 4. Build the solution for deployment\n\n#### Using AWS CDK (recommended)\n\nPackaging and deploying the solution with the AWS CDK allows for the most flexibility in development.\n\n```bash\ncd ./source/infrastructure\n\n# set environment variables required by the solution\nexport BUCKET_NAME=\"my-bucket-name\"\n\n# bootstrap CDK (required once - deploys a CDK bootstrap CloudFormation stack for assets)\ncdk bootstrap --cloudformation-execution-policies arn:aws:iam::aws:policy/AdministratorAccess\n\n# build the solution\ncdk synth\n\n# build and deploy the solution\ncdk deploy\n```\n\n#### Using the solution build tools\n\nIt is highly recommended to use the AWS CDK to deploy this solution (using the instructions above). 
Although the AWS CDK is used to\ndevelop the solution, you can package the solution for release as a CloudFormation template with the `build-s3-cdk-dist`\nbuild tool:\n\n```bash\ncd ./deployment\n\nexport DIST_BUCKET_PREFIX=my-bucket-name\nexport SOLUTION_NAME=my-solution-name\nexport VERSION=my-version\nexport REGION_NAME=my-region\n\nbuild-s3-cdk-dist deploy \\\n  --source-bucket-name $DIST_BUCKET_PREFIX \\\n  --solution-name $SOLUTION_NAME \\\n  --version_code $VERSION \\\n  --cdk-app-path ../source/infrastructure/deploy.py \\\n  --cdk-app-entrypoint deploy:build_app \\\n  --region $REGION_NAME \\\n  --sync\n```\n\n**Parameter Details**\n\n- `$DIST_BUCKET_PREFIX` - The S3 bucket name prefix. A randomized value is recommended. You will need to create an\n  S3 bucket where the name is `\u003cDIST_BUCKET_PREFIX\u003e-\u003cREGION_NAME\u003e`. The solution's CloudFormation template will expect the\n  source code to be located in the bucket matching that name.\n- `$SOLUTION_NAME` - The name of this solution (example: personalize-solution-customization)\n- `$VERSION` - The version number to use (example: v0.0.1)\n- `$REGION_NAME` - The region name to use (example: us-east-1)\n\nThis will result in all global assets being pushed to the `DIST_BUCKET_PREFIX` bucket, and all regional assets being pushed to\n`DIST_BUCKET_PREFIX-\u003cREGION_NAME\u003e`. If your `REGION_NAME` is us-east-1, and the `DIST_BUCKET_PREFIX` is\n`my-bucket-name`, ensure that both `my-bucket-name` and `my-bucket-name-us-east-1` exist and are owned by you.\n\nAfter running the command, you can deploy the template:\n\n- Get the link of the `SOLUTION_NAME.template` uploaded to your Amazon S3 bucket.\n- Deploy the solution to your account by launching a new AWS CloudFormation stack using the link of the template above.\n\n\u003e **Note:** `build-s3-cdk-dist` will use your currently configured `AWS_REGION` and `AWS_PROFILE`. 
To set your defaults,\n\u003e install the [AWS Command Line Interface](https://aws.amazon.com/cli/) and run `aws configure`.\n\n## Collection of operational metrics\n\n This solution collects anonymized operational metrics to help AWS improve the quality and features of the solution. For more information, including how to disable this capability, please see the [implementation guide](https://docs.aws.amazon.com/solutions/latest/maintaining-personalized-experiences-with-ml/reference.html).\n\n---\n\nCopyright Amazon.com, Inc. or its affiliates. All Rights Reserved.\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faws-solutions%2Fmaintaining-personalized-experiences-with-machine-learning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faws-solutions%2Fmaintaining-personalized-experiences-with-machine-learning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faws-solutions%2Fmaintaining-personalized-experiences-with-machine-learning/lists"}