{"id":18733554,"url":"https://github.com/insightsengineering/presidio-action","last_synced_at":"2025-04-12T18:31:49.179Z","repository":{"id":42470044,"uuid":"434201293","full_name":"insightsengineering/presidio-action","owner":"insightsengineering","description":"Github Action that analyze Text for PII Entities with Microsoft Presidio framework.","archived":false,"fork":false,"pushed_at":"2025-03-20T14:24:24.000Z","size":39,"stargazers_count":2,"open_issues_count":0,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-20T15:34:01.898Z","etag":null,"topics":["actions","pii","presidio","python"],"latest_commit_sha":null,"homepage":"https://github.com/marketplace/actions/presidio-action","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/insightsengineering.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"custom":["https://pharmaverse.org"]}},"created_at":"2021-12-02T11:53:38.000Z","updated_at":"2025-03-20T14:24:27.000Z","dependencies_parsed_at":"2024-11-07T15:10:31.687Z","dependency_job_id":"a41abf49-bdd7-49ff-9212-01fec8261693","html_url":"https://github.com/insightsengineering/presidio-action","commit_stats":{"total_commits":10,"total_committers":4,"mean_commits":2.5,"dds":0.4,"last_synced_commit":"a919f271e66efe03f9e9ee4d20e2f0cd1ab44dca"},"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/insightsengineering%2Fpresidio-action","tags_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/repositories/insightsengineering%2Fpresidio-action/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/insightsengineering%2Fpresidio-action/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/insightsengineering%2Fpresidio-action/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/insightsengineering","download_url":"https://codeload.github.com/insightsengineering/presidio-action/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248613544,"owners_count":21133531,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["actions","pii","presidio","python"],"created_at":"2024-11-07T15:10:22.334Z","updated_at":"2025-04-12T18:31:48.841Z","avatar_url":"https://github.com/insightsengineering.png","language":null,"funding_links":["https://pharmaverse.org"],"categories":[],"sub_categories":[],"readme":"# Presidio Action\n\nGithub Action that analyzes text for PII entities with [Microsoft's Presidio framework](https://microsoft.github.io/presidio/).\n\n## Author\n\nInsights Engineering\n\n## Inputs\n\n* `path`:\n\n    _Description_: Path to verify\n\n    _Required_: `false`\n\n    _Default_: \".\"\n\n* `configuration-file`:\n\n    _Description_: Path to custom configuration file\n\n    _Required_: `false`\n\n    _Default_: \"default\"\n\n* `configuration-data`:\n\n    _Description_: Configuration data as an inline YAML configuration\n\n    _Required_: `false`\n\n    _Default_: \"\"\n\n* `output`:\n\n    _Description_: 
Format of output\n\n    _Required_: `false`\n\n    _Default_: \"auto\"\n\n* `publish`:\n\n    _Description_: Publish result as a PR comment\n\n    _Required_: `false`\n\n    _Default_: \"true\"\n\n* `upload`:\n\n    _Description_: Upload results as an artifact\n\n    _Required_: `false`\n\n    _Default_: \"true\"\n\n* `presidio-cli-version`:\n\n    _Description_: Presidio CLI version\n\n    _Required_: `false`\n\n    _Default_: \"latest\"\n\n* `lang-models`:\n\n    _Description_: List of additional language models to install\n\n    _Required_: `false`\n\n    _Default_: \"\"\n\n* `only-changed-files`:\n\n    _Description_: Only run checks for changed files\n\n    _Required_: `false`\n\n    _Default_: `false`\n\n## Outputs\n\nThe output format depends on the `output` parameter.\n\nThe default format is `auto`.\n\nAvailable formats:\n\n* standard - standard output format\n\n```shell\ntests/conftest.py\n  34:58     0.85     PERSON\n  37:33     0.85     PERSON\n```\n\n* github - similar to the diff function in GitHub\n\n```shell\n::group::tests/conftest.py\n::0.85 file=tests/conftest.py,line=34,col=58::34:58 [PERSON]\n::0.85 file=tests/conftest.py,line=37,col=33::37:33 [PERSON]\n::endgroup::\n```\n\n* colored - standard output format but with colors\n\n* parsable - easy to parse automatically\n\n```shell\n{\"entity_type\": \"PERSON\", \"start\": 57, \"end\": 62, \"score\": 0.85, \"analysis_explanation\": null}\n{\"entity_type\": \"PERSON\", \"start\": 32, \"end\": 37, \"score\": 0.85, \"analysis_explanation\": null}\n```\n\n* auto - default format, switches automatically between these two modes:\n  * github, if run on GitHub (environment variables `GITHUB_ACTIONS` and `GITHUB_WORKFLOW` are set)\n  * colored, otherwise\n\n## How it works\n\nPresidio action uses [presidio-cli](https://pypi.org/project/presidio-cli/),\nwhich is based on presidio-analyzer from the [Microsoft Presidio framework](https://github.com/microsoft/presidio),\nto check code for undesirable types of data such as 
'EMAIL_ADDRESS' or 'PHONE_NUMBER' inside an application's code.\n\nFor more information, see the full [list of supported entities](https://microsoft.github.io/presidio/supported_entities/).\n\n## Usage\n\nExample usage:\n\n```yaml\n---\nname: Presidio check\n\non:\n  push:\n    branches:\n      - main\n  pull_request:\n    branches:\n      - main\n\njobs:\n  presidio-action:\n    runs-on: ubuntu-latest\n    name: Presidio check\n\n    steps:\n      - name: Checkout Code\n        uses: actions/checkout@v3\n        with:\n          # fetch-depth: 0 is needed if you set `only-changed-files` to true\n          # and configure this check to run on push events\n          fetch-depth: 0\n\n      - name: Produce the presidio report\n        uses: insightsengineering/presidio-action@v1\n        # all parameters below are optional\n        with:\n          # path to project.\n          # if the project does not have a specific 'my-project' path,\n          # the default value is '.' (the current folder)\n          path: \"my-project\"\n          # configuration-file - path to a file with a specific configuration,\n          # or one of the predefined files:\n          #   - default - `conf/default.yaml` file from action repository, checks the default list of entities\n          #                and ignores the content of the `.git` folder\n          #   - limited - `conf/limited.yaml` file from action repository, checks only PERSON, EMAIL_ADDRESS and CREDIT_CARD\n          #                and ignores the `.git` folder and *.cfg files\n          configuration-file: \"my-project/conf/my-presidio-config.yaml\"\n          # configuration-data - configuration content in raw YAML format.\n          # Lets you prepare your own configuration without adding a file to the project;\n          # any value in this field blocks usage of the configuration file\n          configuration-data: |\n            entities:\n              - PERSON\n            threshold: 0.9\n          # output - specify one of 
the output formats\n          output: \"parsable\"\n          # only-changed-files - only run the check for files that were changed\n          # NOTE: You must set fetch-depth: 0 in the actions/checkout@v3 step\n          # for push events while this parameter is set to true\n          only-changed-files: true\n\n```\n\nExample of a comment added to the PR:\n\n![Screenshot with PR comment example](example.png)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finsightsengineering%2Fpresidio-action","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finsightsengineering%2Fpresidio-action","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finsightsengineering%2Fpresidio-action/lists"}