{"id":23125197,"url":"https://github.com/statisticsnorway/dapla-dlp-pseudo-func","last_synced_at":"2025-08-17T03:32:49.451Z","repository":{"id":54204011,"uuid":"229216795","full_name":"statisticsnorway/dapla-dlp-pseudo-func","owner":"statisticsnorway","description":"Data pseudonymization functions used by Statistics Norway (SSB)","archived":false,"fork":false,"pushed_at":"2024-04-09T06:19:49.000Z","size":256,"stargazers_count":7,"open_issues_count":0,"forks_count":0,"subscribers_count":14,"default_branch":"master","last_synced_at":"2024-04-16T02:08:32.339Z","etag":null,"topics":["dapla","dlp","pseudo","statistikktjenester"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/statisticsnorway.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2019-12-20T07:50:44.000Z","updated_at":"2024-02-29T18:41:35.000Z","dependencies_parsed_at":"2024-02-06T15:27:47.768Z","dependency_job_id":"6c9be2a3-2db6-4569-86ce-c2f983e3beca","html_url":"https://github.com/statisticsnorway/dapla-dlp-pseudo-func","commit_stats":null,"previous_names":[],"tags_count":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/statisticsnorway%2Fdapla-dlp-pseudo-func","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/statisticsnorway%2Fdapla-dlp-pseudo-func/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/statisticsnorway%2Fdapla-dlp-pseudo-func/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/statisticsnorway%2Fdapla-dlp-pseudo-func/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/statisticsnorway","download_url":"https://codeload.github.com/statisticsnorway/dapla-dlp-pseudo-func/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":230085319,"owners_count":18170425,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dapla","dlp","pseudo","statistikktjenester"],"created_at":"2024-12-17T08:12:52.899Z","updated_at":"2024-12-17T08:12:53.505Z","avatar_url":"https://github.com/statisticsnorway.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Dapla Pseudonymization Functions\n\n\u003e Data pseudonymization functions used by Statistics Norway (SSB)\n\n[![Build Status](https://drone.prod-bip-ci.ssb.no/api/badges/statisticsnorway/dapla-dlp-pseudo-func/status.svg)](https://drone.prod-bip-ci.ssb.no/statisticsnorway/dapla-dlp-pseudo-func)\n\nPseudonymization is a data management and de-identification procedure by which personally identifiable information fields within a data record are replaced by one or more artificial identifiers, or pseudonyms. A single pseudonym for each replaced field or collection of replaced fields makes the data record less identifiable while remaining suitable for data analysis and data processing.\n\nThis lib contains functions that can be used to implement data pseudonymization.\n\nIt is important to note that pseudonymization is not the same as anonymization. 
While pseudonymization targets directly identifying elements, the real information might still be identifiable, e.g. by using inherent information such as correlation between data elements. Thus, sensitive data that has been pseudonymized using the functions in this library should still be regarded as sensitive.\n\nThe library is currently in a \"pre-alpha\" stage. We're experimenting with the architecture related to how and when pseudonymization is being applied in our data management platform. Also, there are currently only a few simplistic functions in this library. Breaking changes should be expected.\n\n\n## Installation\n\nMaven coordinates:\n```\n\u003cdependency\u003e\n  \u003cgroupId\u003eno.ssb.dapla.dlp.pseudo.func\u003c/groupId\u003e\n  \u003cartifactId\u003edapla-dlp-pseudo-func\u003c/artifactId\u003e\n  \u003cversion\u003e0.1.0-SNAPSHOT\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n## Examples\n\n### Pseudo function config\n\nThe following exemplifies how a pseudo function config can be described in JSON:\n```\n[\n\t{\n\t\t\"name\": \"fpe-alphanumeric-1\",\n\t\t\"impl\": \"no.ssb.dapla.dlp.pseudo.func.FpeFunc\",\n\t\t\"keyId\": \"411f2af1-7588-4c7f-95e4-1c15d82ef202\",\n\t\t\"alphabet\": \"alphanumeric+whitespace\"\n\t},\n\t{\n\t\t\"name\": \"fpe-digits-1\",\n\t\t\"impl\": \"no.ssb.dapla.dlp.pseudo.func.FpeFunc\",\n\t\t\"keyId\": \"411f2af1-7588-4c7f-95e4-1c15d82ef202\",\n\t\t\"alphabet\": \"digits\"\n\t},\n\t{\t\n\t\t\"name\": \"fpe-custom-1\",\n\t\t\"impl\": \"no.ssb.dapla.dlp.pseudo.func.FpeFunc\",\n\t\t\"keyId\": \"411f2af1-7588-4c7f-95e4-1c15d82ef202\",\n\t\t\"alphabet\": \"abcdefghij123_ \"\n\t}\n] \n```\n\nThe `name` and `impl` properties are mandatory for all functions.\nOther properties depend on the implementation. In the above,\nthe `FpeFunc` function requires the additional properties `keyId` and `alphabet`\nto be set. 
\n\n### Initialize function registry\n```java\nString configJson = ...\nPseudoFuncRegistry registry = new PseudoFuncRegistry();\nregistry.init(configJson);\n```\n\n### Invoke pseudonymization function\n```java\nPseudoFunc func = registry.get(\"fpe-alphanumeric-1\");\nPseudoFuncInput input = PseudoFuncInput.of(\"Ken sent me\");\nPseudoFuncOutput output = func.apply(input); // '2y RazFwxQM'\n```\n\n### Restore to original value (\"depseudonymize\")\n```java\nPseudoFunc func = registry.get(\"fpe-alphanumeric-1\");\nPseudoFuncInput input = PseudoFuncInput.of(\"2y RazFwxQM\");\nPseudoFuncOutput output = func.restore(input); // 'Ken sent me'\n```\n\n### Explicit instantiation of PseudoFunc\n```java\nPseudoFuncConfig config = new PseudoFuncConfig(Map.of(\n    PseudoFuncConfig.Param.FUNC_NAME, \"myfunc-1\",\n    PseudoFuncConfig.Param.FUNC_IMPL, MyFunc.class.getName(),\n    // ... paramName = value\n));\nPseudoFunc func = PseudoFuncFactory.create(config);\n```\n\nFor more usage examples, have a look at the [tests](https://github.com/statisticsnorway/dapla-dlp-pseudo-func/tree/master/tests).\n\n\n## Development\n\nFrom the CLI, run `make help` to see common development commands.\n\n```\nbuild-mvn                      Build the project and install to your local maven repo\ntest                           Run tests\nrelease-dryrun                 Simulate a release in order to detect any issues\nrelease                        Release a new version. Update POMs and tag the new version in git.\n```\n\nE.g. to run tests, execute `make test`.\n\nIf you're on Windows, you might need to install make first. Using [Chocolatey](https://chocolatey.org/), you can do `choco install make`.\n\n\n## Contributing\n1. Fork it (https://github.com/statisticsnorway/dapla-dlp-pseudo-func/fork)\n2. Create your feature branch (`git checkout -b feature/foo-bar`)\n3. Commit your changes (`git commit -am 'Add some foo bar'`)\n4. Push to the branch (`git push origin feature/foo-bar`)\n5. 
Create a new Pull Request\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstatisticsnorway%2Fdapla-dlp-pseudo-func","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstatisticsnorway%2Fdapla-dlp-pseudo-func","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstatisticsnorway%2Fdapla-dlp-pseudo-func/lists"}