{"id":27627203,"url":"https://github.com/treasure-data/uid2","last_synced_at":"2025-04-23T13:53:31.250Z","repository":{"id":242972031,"uuid":"805652156","full_name":"treasure-data/uid2","owner":"treasure-data","description":"Trade Desk UID2 Integration","archived":false,"fork":false,"pushed_at":"2025-03-14T15:32:57.000Z","size":76,"stargazers_count":0,"open_issues_count":1,"forks_count":1,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-03-14T16:34:31.769Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/treasure-data.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-25T04:50:14.000Z","updated_at":"2025-01-27T16:46:08.000Z","dependencies_parsed_at":"2024-06-06T00:54:31.321Z","dependency_job_id":"243d7659-dda6-4e72-b17b-85a894fabbbc","html_url":"https://github.com/treasure-data/uid2","commit_stats":null,"previous_names":["treasure-data/uid2"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/treasure-data%2Fuid2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/treasure-data%2Fuid2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/treasure-data%2Fuid2/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/treasure-data%2Fuid2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/treasure-data","download_url":"https://codeload.github.com/trea
sure-data/uid2/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250444199,"owners_count":21431601,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-23T13:53:30.703Z","updated_at":"2025-04-23T13:53:31.243Z","avatar_url":"https://github.com/treasure-data.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# UID2 Converter\n\nThis workflow is designed to take DII (Directly Identifying Information) such as Emails and Phone Numbers and turn them into UID2s following the UID2 Framework. \n\n### Why Use UID2 \n\n- **Establish identity without 3rd party cookies** and retain precise targeting with a persistent identifier (e.g. phone number, email) \n- **Measure and optimize**  digital ad performance across the scale of your 1st party data \n    * Follow the full customer journey and track which interactions lead to conversion\n    * Effectively cap frequencies based on user interactions\n    * Based on 1st party audience insights, strategically improve targeting strategies to drive positive outcomes and improve ROAS\n\n## High Level Workflow \nThe following steps provide a high-level outline of the workflow intended for organizations that collect user data and push it to DSPs—for example, advertisers, identity graph providers, and third-party data providers.\n\nThe following steps are an example of how an advertiser can integrate with UID2:\n\n1. Treasure Data sends a user’s directly identifying information (DII) to the UID2 Operator.\n\n2. 
The UID2 Operator generates and returns a raw UID2 and salt bucket ID.\n\n3. Treasure Data stores the UID2 and salt bucket ID and sends the UID2-based first-party and third-party audience segments to the DSP.\n\nThe following process occurs in the background:\n\n1. Treasure Data monitors the UID2 Operator for rotated salt buckets and updates UID2s as needed.\n\n\n## Concepts\n### UID2 Definition\n\nhttps://unifiedid.com/docs/intro\n\u003eUID2 is a framework that enables deterministic identity for advertising opportunities on the open internet for many participants across the advertising ecosystem. The UID2 framework enables logged-in experiences from publisher websites, mobile apps, and Connected TV (CTV) apps to monetize through programmatic workflows. Built as an open-source, standalone solution with its own unique namespace, the framework offers the user transparency and privacy controls designed to meet local market requirements.\n\nSee the Additional Resources section at the end of this page for reference diagrams and links.\n\n### DII\nDII is Directly Identifying Information, currently Email Addresses and Phone Numbers. \n\n### UID2\nThe unencrypted alphanumeric identifier created from a user’s DII (email address or phone number). A UID2 is the actual value that DSPs, data providers, and advertisers will store, but this value should never enter the bid stream.\n\nThis value is created by a UID2 Operator (see below) by adding a secret salt to the DII and then passing that value through a hashing function.\n\n### UID2 Operator\nOrganizations that operate the infrastructure required to generate and manage UID2s and UID2 tokens.\n\nOperators will receive salts and encryption keys from the UID2 Administrator. They will also operate an API that participants can call to receive UID2s or UID2 tokens.\n\n### DII Normalization\nDII (email addresses \u0026 phone numbers) will be normalized by the UID2 Operator. 
\n\n\u003eIf intending to use hashed email and phones, they must be normalized before hashing to the expected format per Normalization Standards. All DII will be mapped by a UID2 Service Operator as long as it is in the expected format as detailed below. Note that the UID2 Service Operator will map any email address or phone number as long as it is in the expected format, the email/phone does not need to be an actual live or working DII.\n\n\u003e **This workflow has only been tested with unhashed emails/phones.**\n\n\u003e - Email addresses must be normalized per Email Normalization Standard.\n\u003e - Phone numbers must be in E.164 format.\n\u003e - Timestamps must be in ISO 8601 format.\n\n### Salt Bucket Rotation\nUID2 values are kept encrypted within UID2 Service Operator systems, and the Salt values are rotated on average once per year on a fairly even basis. This means that roughly 1/365th of UID2 Salt Buckets are rotated per day. In other words, if a customer has 80M DII ↔︎ UID2 mappings, then on average about 219,178 of those mappings (80,000,000 / 365) will have their Salt Bucket rotated every day. When a Salt Bucket is rotated, the associated UID2 values are no longer valid and must be re-mapped.\n\nThe design and implementation of this workflow includes Salt Bucket Rotation handling every time the WF runs:\n\n- The UID2 Mapping records include a column for the Salt Bucket ID of the corresponding UID2.\n- The WF uses the Bucket API [/v2/identity/buckets] to check for all Salt Buckets rotated since the last WF run.\n- The WF re-maps all UID2 values associated with any stale Salt Buckets.\n\n## Background \n\n### Source DII Data Collection\nThe WF can map DII from any source database/tables in the TD instance, including Unification and/or Audience databases. 
The source tables are configured in the td_uid2_src_lst variable in the config/td_uid2_src_lst.yaml file. Any number of tables can be configured for collection; details are listed in the Configuration section below.\n\nNew DII sources can be added at any time and will be included in the next WF run.\n\nIf an existing DII source is removed, it will no longer be collected going forward, but the existing DII ↔︎ UID2 mappings from that source will NOT be deleted by the WF. They can be manually selected and deleted using the ttd_uid2_ids.src_db and ttd_uid2_ids.src_tbl columns.\n\n\n## Workflow Installation\n\n### Install \nTo install this workflow, download it to your local machine and then use the [TD Toolbelt](https://api-docs.treasuredata.com/en/tools/cli/quickstart/#td-toolbelt-quickstart) to upload the folder as a Project to Treasure Workflows by running the `td wf push \u003cproject_name\u003e` command inside the workflow root folder. \n\n\n\n### Where to Run \nThis code should be run immediately after ID Unification, and the Output Table `ttd_uid2_ids` should be added as a behavior table to the Parent Segment. \n\n### Setup\nThe UID2 Converter Workflow setup is grouped into two main sections that need to be configured before running the workflow:\n\n1. Secrets: Secret keys that need to be configured in the Project-level “Secrets” tab\n2. Configuration: Two YML files are used to configure 1) top-level parameters and 2) TD Tables and columns containing DII for Export to the UID2 Service Operator\n\n### Secrets\n| Secrets | README |\n| ------ | ------ |\n| pytd.apikey | TD Master API Key for pytd SDK to query \u0026 update tables. A UID2-specific service-account API Key with limited access to UID2-related databases \u0026 tables is recommended (principle of least privilege). 
\u003cbr\u003e*NOTE: This must be in pure TDI API Key format \"nnnnn/xxxxxxxxxxxxxxxxxxxxxxxxx\" (without quotes), NOT HTTP Authorization Header format \"TD1 nnnnn/xxxxxxxxxxxxxxxxxxxxxxxxx\"* |\n| ttd.apikey | UID2 API Key, provided by UID2 Operator as `{UID2 Integ Keys \u003e api_key}` |\n| ttd.clientsecret | UID2 API Secret, provided by UID2 Operator as `{UID2 Integ Keys \u003e v2_secret}` |\n\n### Configuration\n\n#### Main WF Config `ttd_uid2.dig`  \n\n```\n_export: \n  # pytd SDK Config\n  pytd:\n    # TD API Server Endpoint\n    apiserver: https://api.treasuredata.com/\n  # API Config\n  ttd:\n    # Max parallel allowed by UID2 Operator is 10 (ten); probably will never change\n    parallel_max: 10\n    # UID2 Environment, e.g. Integration, Production, etc.\n    environment: prod.uidapi.com\n    # UID2 Mapping API Endpoint; probably will never change\n    url_map: /v2/identity/map\n    # Salt-Bucket Rotation API Endpoint; probably will never change\n    url_buckets: /v2/identity/buckets\n  # TD Integration Database\n  td_uid2_env:\n    # The TD Database to store UID2 Mappings and WF Metadata\n    #   Will be created if it does not exist\n    #   All required tables will also be created if they do not exist\n    #   *NO* initial setup required, simply configure the desired database name\n    #     and the WF will install and configure itself accordingly\n    db: MY_DATABASE_NAME\n  # TD Source Tables for DII/UID2 Mapping\n  #   Many source tables may be configured\n  #   Configure in include file for manageability\n  #   Configuration design specifications listed below\n  !include : config/td_uid2_src_lst.yaml\n```\n#### Source Tables `config/td_uid2_src_lst.yaml`\n```\n# List of TD Source Tables\n#   3 Tables shown here for example, many tables may be included\n#   Tables can be sourced from any TD database, \n#     including Unification and/or Audience databases\ntd_uid2_src_lst:\n    # Database Name\n  - src_db: stage_db\n    # Table Name\n    src_tbl: email_send\n    # ID 
Column if available\n    #   If no ID column, use the literal term \"null\" (w/o quotes)\n    src_id_col: td_id\n    # The name of the Source DII column, can be any name\n    src_dii_col: email_address\n    # The type of DII, either \"EMAIL\" or \"PHONE\"; Case-Sensitive, w/o quotes\n    src_dii_typ: EMAIL\n\n    # Database Name\n  - src_db: unification_db\n    # Table Name\n    src_tbl: ecommerce_orders\n    # ID Column if available\n    #   If no ID column, use the literal term \"null\" (w/o quotes)\n    src_id_col: td_id\n    # The name of the Source DII column, can be any name\n    src_dii_col: primary_email\n    # The type of DII, either \"EMAIL\" or \"PHONE\"; Case-Sensitive, w/o quotes\n    src_dii_typ: EMAIL\n\n    # Database Name\n  - src_db: audience_db\n    # Table Name\n    src_tbl: sms_contacts\n    # ID Column if available\n    #   If no ID column, use the literal term \"null\" (w/o quotes)\n    src_id_col: td_id\n    # The name of the Source DII column, can be any name\n    src_dii_col: phone_number\n    # The type of DII, either \"EMAIL\" or \"PHONE\"; Case-Sensitive, w/o quotes\n    src_dii_typ: PHONE\n```\n\n## Output Tables\n\n#### `ttd_uid2_ids` – Transactional Table – Main UID2 Table \n| **COLUMN**       | **TYPE** | **DESCRIPTION**|\n| ---------------- | -------- | ----------------------------- |\n| `time`           | INTEGER  | Unixtime of record `INSERT` |\n| `src_db`         | VARCHAR  | The source database of the DII value |\n| `src_tbl`        | VARCHAR  | The source table of the DII value |\n| `src_id_col`     | VARCHAR  | The ID column for the source table of the DII value |\n| `src_id`         | VARCHAR  | The ID value of the record in the source table of the DII value |\n| `src_col`        | VARCHAR  | The Source column in the source table of the DII value |\n| `src_typ`        | VARCHAR  | The type of DII, one of `{EMAIL, PHONE}`|\n| `src_data`       | VARCHAR  | The source DII value |\n| `advertising_id` | VARCHAR  | The UID2 value 
(Defined as `advertising_id` in the Operator Service APIs) |\n| `bucket_id`      | VARCHAR  | The Salt Bucket ID |\n| `is_current`     | INTEGER  | Does the UID2 (`advertising_id` column) contain a current UID2 value from a non-expired Salt Bucket? \u003cbr\u003e *   `0` (zero) – NO – Indicates that the `ttd_uid2_ids` record is either new, or that the Salt Bucket has expired. In either case, a new UID2 must be fetched from the UID2 Operator. \u003cbr\u003e *   `1` (one) – YES – Indicates that the `ttd_uid2_ids` record has a current UID2 in the `advertising_id` column; a new UID2 does _NOT_ need to be fetched from the UID2 Operator. \u003cbr\u003e The `is_current` state is managed during each WF run and should have the value `1` (one) for all records at the completion of every successful WF run. If any records have the value `0` (zero) after the WF run has completed, something failed. The two primary causes of DII ↔︎ UID2 mapping failure are: \u003cbr\u003e * The DII format is not correct and therefore cannot be mapped by the UID2 Service Operator. For example, the email `myname@mysite` is not a valid email format (the domain is missing a TLD extension) and cannot be mapped by the Operator. Phone numbers must be in valid [E.164](https://en.wikipedia.org/wiki/E.164) format. Note that the Operator _will_ map any email address or phone number as long as it is in the expected format; the email/phone does not need to be an actual live or working DII. 
\u003cbr\u003e * The TD UID2 Mapping Workflow failed for any reason |\n\n#### `ttd_uid2_ids_archive` – Transactional Table – Main UID2 Table\n**Same schema as the `ttd_uid2_ids` table, except that `is_current` always has the value `-1` to indicate archived records.**\n\n| **COLUMN**       | **TYPE** | **DESCRIPTION**    |\n| ---------------- | -------- | -------------------------------------- |\n| `time`           | INTEGER  | Unixtime of record `INSERT` |\n| `src_db`         | VARCHAR  | The source database of the DII value  |\n| `src_tbl`        | VARCHAR  | The source table of the DII value   |\n| `src_id_col`     | VARCHAR  | The ID column for the source table of the DII value  |\n| `src_id`         | VARCHAR  | The ID value of the record in the source table of the DII value |\n| `src_col`        | VARCHAR  | The Source column in the source table of the DII value |\n| `src_typ`        | VARCHAR  | The type of DII, one of `{EMAIL, PHONE}`  |\n| `src_data`       | VARCHAR  | The source DII value   |\n| `advertising_id` | VARCHAR  | The TTD UID2 value (Defined as `advertising_id` in the Operator Service APIs)     |\n| `bucket_id`      | VARCHAR  | The Salt Bucket ID   |\n| `is_current`     | INTEGER  | Always has the value `-1` (negative-one) to indicate archived records.   |\n\n#### `ttd_bucket_resp` – Staging Table – For UID Salt Bucket Rotation API Responses\n**Important** – This table is also used to calculate the since_timestamp for the Salt Bucket rotation API. 
Even though this table is classified as a staging table, the records should NEVER be manually deleted as they are required for the subsequent WF run to accurately calculate the since_timestamp.\n\nIf the records in this table are ever accidentally deleted, then it is recommended to re-map UID2s for ALL records in the ttd_uid2_ids table.\n\n| **COLUMN**     | **TYPE** | **DESCRIPTION**  |\n| -------------- | -------- | ------------ |\n| `time`   | INTEGER  | Unixtime of record `INSERT`  |\n| `bucket_id`    | VARCHAR  | The UID Salt Bucket ID   |\n| `last_updated` | VARCHAR  | Timestamp in ISO 8601 format of when this Salt Bucket was last updated by Operator (not used by this WF, for analysis purposes). |\n\n#### `ttd_uid2_rqst` – Staging Table – For UID Map API Requests\n| **COLUMN**      | **TYPE** | **DESCRIPTION** |\n| --------------- | -------- |-------------- |\n| `time`          | INTEGER  | Unixtime of record `INSERT` |\n| `rnk_num`       | LONG     | The sequence number of this UID2 Service Operator API batch request |\n| `ttd_uid2_rqst` | VARCHAR  | The actual JSON payload for the UID2 Service Operator API batch request. It is logical and valid JSON, stored as `VARCHAR` for simplicity and convenience. It is stored as unencrypted plain text; the TD Python client script manages all security and encryption/decryption internally. |\n\n#### `ttd_uid2_resp` – Staging Table – For UID2 Map API Responses\n| **COLUMN**       | **TYPE** | **DESCRIPTION** |\n| ---------------- | -------- |------ |\n| `time`           | INTEGER  | Unixtime of record `INSERT`  |\n| `rnk_num`        | LONG     | The sequence number of this UID2 Service Operator API batch request (not used by this WF, for analysis purposes). 
|\n| `identifier`     | VARCHAR  | The DII value, either an Email or Phone |\n| `advertising_id` | VARCHAR  | The UID2 value (Defined as `advertising_id` in the Operator Service APIs) |\n| `bucket_id`      | VARCHAR  | The Salt Bucket ID   |\n\n## Additional Resources\n\n\n\u003cimg style=\"float: right;\" src=\"https://unifiedid.com/assets/images/UID2AdvertiserAndThirdPartyDataProviderWorkflow-2ac59ad79bedaa9d265f6e4d4f99efa5.svg\"\u003e\n\n\u003cimg style=\"float: right;\" src=\"https://unifiedid.com/assets/images/advertiser-flow-mermaid-d3b67f69ab9afe0241a56fbd3bbf6389.png\"\u003e\n\n\nUID2 Mapping \u0026 Salt Bucket Rotation\n\n- https://unifiedid.com/docs/intro\n- https://unifiedid.com/docs/guides/advertiser-dataprovider-guide\n- https://unifiedid.com/docs/endpoints/post-identity-map\n- https://unifiedid.com/docs/getting-started/gs-normalization-encoding#phone-number-normalization\n- https://unifiedid.com/docs/endpoints/post-identity-buckets\n\nTTD Export Integration\n\n- https://partner.thetradedesk.com/v3/portal/data/doc/post-data-advertiser-external\n- https://partner.thetradedesk.com/v3/portal/data/doc/DataEnvironments\n- https://partner.thetradedesk.com/v3/portal/data/doc/DataAuthentication\n\nTreasure Data TTD Export Integration\n\n- https://docs.treasuredata.com/display/public/INT/The+Trade+Desk+Export+Integration\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftreasure-data%2Fuid2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftreasure-data%2Fuid2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftreasure-data%2Fuid2/lists"}