{"id":18966591,"url":"https://github.com/eellak/gsoc2019-anonymization","last_synced_at":"2025-04-19T14:21:40.091Z","repository":{"id":50186706,"uuid":"185354927","full_name":"eellak/gsoc2019-anonymization","owner":"eellak","description":"Anonymisation of Sensitive Data in Public Documents","archived":false,"fork":false,"pushed_at":"2022-12-08T06:04:07.000Z","size":2409,"stargazers_count":8,"open_issues_count":7,"forks_count":5,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-03-29T08:32:56.989Z","etag":null,"topics":["anonymisation","anonymizer","anonymizer-service","django","gdpr","greek-language","libreoffice-extension","web-gui"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/eellak.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-05-07T08:18:02.000Z","updated_at":"2025-02-02T13:38:59.000Z","dependencies_parsed_at":"2023-01-24T15:00:20.190Z","dependency_job_id":null,"html_url":"https://github.com/eellak/gsoc2019-anonymization","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eellak%2Fgsoc2019-anonymization","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eellak%2Fgsoc2019-anonymization/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eellak%2Fgsoc2019-anonymization/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eellak%2Fgsoc2019-anonymization/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners
/eellak","download_url":"https://codeload.github.com/eellak/gsoc2019-anonymization/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249213735,"owners_count":21231096,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anonymisation","anonymizer","anonymizer-service","django","gdpr","greek-language","libreoffice-extension","web-gui"],"created_at":"2024-11-08T14:37:50.626Z","updated_at":"2025-04-16T07:33:12.589Z","avatar_url":"https://github.com/eellak.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Google Summer Of Code 2019 :sunny:\n\n### Anonymisation Through Data Encryption of Sensitive Data in ODT and Text Files in Greek Language\n\n## Problem Statement\nOver the past year, governments around the world have attached great importance to the anonymisation of information. The GDPR defines pseudonymization as the processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of additional information. Although the GDPR has been in force since 2018, no reliable infrastructure exists in Greece to encrypt sensitive documents. It is therefore necessary to develop a product specifically for users of the Greek language that can safely and promptly anonymize their data so that it complies with the GDPR.\n\n## Abstract\nI propose the creation of a LibreOffice extension as well as a web GUI that will anonymize information in any given legal document. 
All sensitive information should be easy to anonymize with this open-source tool. \n\nFor the anonymizer itself, I suggest the following requirements. First, given any document, the anonymizer should encrypt every Greek entity in the file that belongs to a standard token vocabulary set. The user will be able to add specific arguments for entities to be anonymized (in addition to the standard ones) and will be given the option of an additional encryption. I believe that the LibreOffice extension and the web GUI should both be user-friendly, so customizable technologies should be used.\n\n\n## Wiki \nExtended documentation is available in the [wiki pages](https://github.com/eellak/gsoc2019-anonymization/wiki)\nto keep the service understandable and maintainable.\n\n\n## Technologies used\n\n#### Anonymizer Service\n The anonymizer service uses the following libraries: [argparse](https://docs.python.org/3/library/argparse.html), [json](https://docs.python.org/3/library/json.html), [termcolor](https://pypi.org/project/termcolor/).\n#### Web GUI\n The web GUI uses the following libraries: [django](https://www.djangoproject.com/), [bootstrap](https://getbootstrap.com/), [requests](https://pypi.org/project/requests/), [crispy-forms](https://django-crispy-forms.readthedocs.io/en/latest/install.html), [django-form-utils](https://pypi.org/project/django-form-utils/).\n#### LibreOffice Extension\nThe LibreOffice extension uses the following libraries: [uno](https://wiki.openoffice.org/wiki/Uno), [json](https://docs.python.org/3/library/json.html), [pynput](https://pypi.org/project/pynput/).\n\n## Future work \n- Improving the user interface.\n\n- Extending the web GUI so that it can be hosted in a VM and serve multiple clients at the same time.\n\n- Creating an API.\n\n- Using machine learning techniques to identify sensitive information in text.\n\n- Resolving any open issues.\n\n\nFor more information you can visit [future 
work](https://github.com/eellak/gsoc2019-anonymization/wiki/Future-Work) in the wiki pages.\n\n## Final Report Gist\nYou can find the final report [here](https://gist.github.com/DimitrisKatsiros/cf6ad8e338a545a74306e0a52d2bfe26).\n\n## Contributors\n- Google Summer of Code participant: Dimitrios Katsiros\n\n- Mentor: Kostas Papadimas\n\n- Mentor: Panos Louridas\n\n- Mentor: Iraklis Varlamis\n\n- Organization: [GFOSS](https://gfoss.eu/)\n \n## License\nThis project was developed as open source as part of the Google Summer of Code program and is released under the MIT license. For more information see [LICENSE](https://github.com/eellak/gsoc2019-anonymization/blob/master/LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feellak%2Fgsoc2019-anonymization","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feellak%2Fgsoc2019-anonymization","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feellak%2Fgsoc2019-anonymization/lists"}