{"id":16015566,"url":"https://github.com/radekbednarik/bq_anonymization_public","last_synced_at":"2025-10-12T22:37:31.839Z","repository":{"id":104922217,"uuid":"240472894","full_name":"radekBednarik/bq_anonymization_public","owner":"radekBednarik","description":"Public version of BQ anonymization project.","archived":false,"fork":false,"pushed_at":"2020-02-14T09:35:09.000Z","size":12,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-10-12T22:37:30.883Z","etag":null,"topics":["allure","allure-reporting","behave-framework","bq-api","bq-data","openpyxl","pandas","pyhamcrest"],"latest_commit_sha":null,"homepage":null,"language":"Gherkin","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/radekBednarik.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-02-14T09:31:53.000Z","updated_at":"2020-08-11T04:13:52.000Z","dependencies_parsed_at":"2023-05-27T03:30:22.989Z","dependency_job_id":null,"html_url":"https://github.com/radekBednarik/bq_anonymization_public","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/radekBednarik/bq_anonymization_public","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radekBednarik%2Fbq_anonymization_public","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radekBednarik%2Fbq_anonymization_public/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radekBednarik%2Fbq_anonymization_public/releases","m
anifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radekBednarik%2Fbq_anonymization_public/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/radekBednarik","download_url":"https://codeload.github.com/radekBednarik/bq_anonymization_public/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radekBednarik%2Fbq_anonymization_public/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279013276,"owners_count":26085250,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-12T02:00:06.719Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["allure","allure-reporting","behave-framework","bq-api","bq-data","openpyxl","pandas","pyhamcrest"],"created_at":"2024-10-08T15:40:42.801Z","updated_at":"2025-10-12T22:37:31.820Z","avatar_url":"https://github.com/radekBednarik.png","language":"Gherkin","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003eBig Query Anonymization Test Tool\u003c/h1\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n[![Status](https://img.shields.io/badge/status-active-success.svg)]()\n\n\u003c/div\u003e\n\n---\n\n\u003cp align=\"center\"\u003e PUBLIC VERSION: Testing solution for BQ GDPR anonymization use case.\n    \u003cbr\u003e \n\u003c/p\u003e\n\n## 📝 Table of Contents\n\n- 
[About](#about)\n- [Getting Started](#getting_started)\n- [Running the tests](#tests)\n- [Authors](#authors)\n- [Acknowledgments](#acknowledgement)\n\n## 🧐 About \u003ca name = \"about\"\u003e\u003c/a\u003e\n\nIMPORTANT: This is a public version of the project. Feature files and SQL templates were anonymized. An API connection to BigQuery is also not possible. The rest of the codebase is intact.\n\nThis project implements a testing solution using the python-behave framework to test whether ID fields in BQ datasets' tables were anonymized successfully.\n\n## 🏁 Getting Started \u003ca name = \"getting_started\"\u003e\u003c/a\u003e\n\nThese instructions will get you a copy of the project up and running on your local machine for development and testing purposes.\n\n### Prerequisites\n\nWhat you need to install the software and how to install it.\n\n- Python 3.6+ with these external packages:\n  - behave\n  - allure-behave\n  - pandas\n  - openpyxl\n  - tqdm\n  - pyhamcrest\n  - google\n  - google-cloud-bigquery\n  - protobuf\n- linux (Ubuntu)/Win10 OS\n- allure reporting tool\n  - on Win10 install using scoop\n  - on Ubuntu/linux install using linuxbrew\n- access to tested BQ data project\n- access to BQ API, have it set up and have proper roles\n- access to this repository\n\n### Get familiar with the documentation of the external tools used to really understand what is going on\n\n- \u003ca href=\"https://behave.readthedocs.io/en/latest/index.html\"\u003ebehave framework\u003c/a\u003e\n- \u003ca href=\"http://allure.qatools.ru/\"\u003eallure reporting tool\u003c/a\u003e\n- \u003ca href=\"https://github.com/hamcrest/PyHamcrest\"\u003epyhamcrest\u003c/a\u003e\n- \u003ca href=\"https://github.com/tqdm/tqdm\"\u003etqdm\u003c/a\u003e\n- \u003ca href=\"https://pandas.pydata.org/pandas-docs/stable/index.html\"\u003epandas\u003c/a\u003e\n- \u003ca href=\"https://openpyxl.readthedocs.io/en/stable/\"\u003eopenpyxl\u003c/a\u003e\n- \u003ca 
href=\"https://pypi.org/project/google-cloud-bigquery/\"\u003egoogle-cloud-bigquery\u003c/a\u003e\n\nThe google and protobuf packages had to be added to the setup.py file to ensure that the BQ API library package works properly.\n\n### Installing\n\n1. Install Python (refer to the documentation for how to do that on your OS)\n2. Open your command line tool of choice and go to the directory where you want to clone the project from GitHub\n3. Clone this repo\n4. Run \"python3 setup.py install\" if on Ubuntu, or \"py setup.py install\" if on Win10. On Win10, the \"pandas\" package will not be installed; you will have to install it manually. See the comment in the setup.py file for the link. Download the package and run the command _pip install [path to package]/packagefile_\n\n## 🔧 Running the tests \u003ca name = \"tests\"\u003e\u003c/a\u003e\n\n1. In the console, go to the root folder of the project\n2. Run the command _\"behave -f allure_behave.formatter:AllureFormatter -f pretty -o allure-results ./test/features\"_ if on Ubuntu, or _\"behave -f allure_behave.formatter:AllureFormatter -f pretty -o allure-results .\\test\\features\\\"_ if on Win10\n3. Wait until the tests are finished\n4. Failed tests have their BQ data saved in an XLSX file with a timestamped name in the _./reports_ folder.\n5. You can also display an interactive HTML report. To do this, run the _\"allure serve\"_ command in your console and the report will open in your default browser. It should be Firefox or Chrome.\n\n### Pseudo-random feature file test running\n\nAll datasets are divided into 5 feature files, with a few exceptions. 
It is possible to run them either as specified above or, if needed, to apply a pseudo-random selection of the feature file.\n\nTo do that, run the _\"python3 (or py on windows) manage.py -r\"_ command in the console.\n\nThis will pick one of the tags stored in the list in the _\"functions.py\"_ file and then run the behave test framework as usual, but only the feature file tagged with this tag will actually be run.\n\nThis process can be repeated as long as there are tags that have not been picked yet. Once all tags are \"exhausted\", a ValueError exception is caught, and you have to manually clear the \"config.json\" file.\n\nTo do that, use the utility _\"py manage.py -c\"_.\n\nYou can also run the utility with both parameters at once, so that next time the pseudo-random function will be able to choose from the full set of tags again. In this case, run the command like this: _\"py manage.py -r -c\"_.\n\n### Manage.py utility\n\nBecause the behave command coupled with the allure reporting tool can be quite long, you can use the manage.py utility to make work easier and faster. It covers these scenarios:\n\n- _py manage.py -r_ will run one randomly picked feature file from all tagged feature files. This feature file will not be run again until config.json is cleared.\n- _py manage.py -c_ will clear the config.json file, which stores the tags of feature files that were already randomly run.\n- _py manage.py -b_ will run all feature files, like the command _\"behave -f allure_behave.formatter:AllureFormatter -f pretty -o allure-results .\\test\\features\\\"_ would do.\n- _py manage.py -t \"@tag1\" -t \"@tag2\" etc..._ will run all feature files, or just some of their scenarios, tagged with the provided tags. 
Take care to wrap each tag in \" \"!\n- _py manage.py -h_ is always available by default and will display all available commands with short descriptions.\n\n## ✍️ Authors \u003ca name = \"authors\"\u003e\u003c/a\u003e\n\n- [@bednaJedna](https://github.com/bednaJedna) - Idea \u0026 work\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fradekbednarik%2Fbq_anonymization_public","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fradekbednarik%2Fbq_anonymization_public","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fradekbednarik%2Fbq_anonymization_public/lists"}