{"id":18686742,"url":"https://github.com/datenanfragen/android-data-safety-label-analysis","last_synced_at":"2026-04-28T21:33:42.166Z","repository":{"id":94148133,"uuid":"537754636","full_name":"datenanfragen/android-data-safety-label-analysis","owner":"datenanfragen","description":"Source code for reproducing our analysis on the data safety labels on Android","archived":false,"fork":false,"pushed_at":"2022-09-19T14:44:00.000Z","size":5817,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-12-28T00:43:18.147Z","etag":null,"topics":["android","data-protection","data-safety-labels","google-play","privacy","privacy-labels","privacy-research"],"latest_commit_sha":null,"homepage":"https://www.datarequests.org/blog/android-data-safety-labels-analysis/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/datenanfragen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-17T09:07:23.000Z","updated_at":"2023-10-27T02:29:40.000Z","dependencies_parsed_at":"2023-07-30T08:02:11.058Z","dependency_job_id":null,"html_url":"https://github.com/datenanfragen/android-data-safety-label-analysis","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datenanfragen%2Fandroid-data-safety-label-analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datenanfragen%2Fandroid-data-safety-label-analysis/tags","rel
eases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datenanfragen%2Fandroid-data-safety-label-analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datenanfragen%2Fandroid-data-safety-label-analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/datenanfragen","download_url":"https://codeload.github.com/datenanfragen/android-data-safety-label-analysis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239541849,"owners_count":19656102,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["android","data-protection","data-safety-labels","google-play","privacy","privacy-labels","privacy-research"],"created_at":"2024-11-07T10:28:48.428Z","updated_at":"2025-11-08T00:30:32.239Z","avatar_url":"https://github.com/datenanfragen.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Worrying confessions: A look at data safety labels on Android\n\n\u003e Source code for reproducing our [analysis on the data safety labels on Android](https://www.datarequests.org/blog/android-data-safety-labels-analysis/).\n\n![Stylized photo with a blue tint of food containers, above that the text: “Analysis: Data safety labels on Android”](https://www.datarequests.org/blog/android-data-safety-labels-analysis/analysis-data-safety-labels-on-android.jpg)\n\nWe [analyzed the new data safety section on the Google Play Store](https://www.datarequests.org/blog/android-data-safety-labels-analysis/) and found popular 
apps admitting to collecting and sharing highly sensitive data for advertising and tracking. More than one quarter of apps transmitted tracking data not declared in their data safety label. \n\nThis repository contains the source code for reproducing this analysis. It includes scripts for downloading data safety labels and APKs of Android apps, automated traffic collection of apps, and generating statistics and graphs of the results. \n\n## Steps for running an analysis\n\nThe following steps are necessary for running the analysis:\n\n1. Install and set up [`googleplay`](https://github.com/89z/googleplay) (see below).\n2. Set up the emulator for traffic collection (see below).\n3. Create a PostgreSQL database according to `schema.sql` and copy `.env.sample` to `.env`, filling in the correct values.\n4. Install the Node dependencies using `yarn`. Create a Python venv and install the dependencies from `requirements.txt`.\n5. Download data safety labels: `npx tsm src/fetch.ts`\n6. Download APKs: `./src/download.sh \u003cpath_to_googleplay_binary\u003e \u003capp_list\u003e \u003cout_dir\u003e`\n7. Run apps and collect traffic: `npx tsm src/traffic.ts --appsDir \u003cdir_with_apks\u003e --avdName \u003cavd_name\u003e --avdSnapshotName \u003cavd_snapshot_name\u003e`\n8. Compile statistics on labels: `npx tsm src/data.ts \u003clabel_date\u003e`\n9. Compile statistics on recorded traffic: `npx tsm src/traffic-analysis.ts \u003clabel_date\u003e`\n10. Generate graphs using `src/graphs.ipynb`.\n\n## Setup for downloading APKs\n\nWe’re using [`googleplay`](https://github.com/89z/googleplay) to download APKs of Android apps. You need to compile that and log in:\n\n```sh\ngit clone https://github.com/89z/googleplay.git\ncd googleplay/cmd/googleplay\ngo build\n\n./googleplay -email \u003cemail\u003e -password \u003capp_password\u003e\n./googleplay -device -p 1 # armeabi-v7a\n```\n\n## Setup for traffic collection\n\nWe’re using an emulator running Android 11 for the traffic collection. 
The emulator needs to be set up to accept mitmproxy’s root CA, minimize unrelated background traffic, and [Frida](https://github.com/frida/frida) needs to be installed so we can use [objection](https://github.com/sensepost/objection) to bypass certificate pinning. Then, we create a snapshot that the emulator is reset to after each app.\n\nYou’ll need the [Android command line tools](https://developer.android.com/studio/command-line/). You’ll also need to install [mitmproxy](https://mitmproxy.org/), Frida, and objection. Then, you can create and prepare the emulator like this:\n\n```sh\n# Fetch image.\nsdkmanager \"system-images;android-30;google_apis;x86_64\"\n# Create AVD.\navdmanager create avd --abi google_apis/x86_64 --device \"pixel_2\" --force --name \"dsl\" --package \"system-images;android-30;google_apis;x86_64\"\n\n# Start emu for the first time.\nemulator -avd \"dsl\" -no-audio -no-boot-anim -writable-system -http-proxy 127.0.0.1:8080\n\n# --- Installing our CA cert. ---\n\n# Yields \u003chash\u003e.\nopenssl x509 -inform PEM -subject_hash_old -in ~/.mitmproxy/mitmproxy-ca-cert.pem | head -1\ncp ~/.mitmproxy/mitmproxy-ca-cert.pem \u003chash\u003e.0\n\nadb root\nadb shell avbctl disable-verification\nadb disable-verity\nadb reboot\nadb root\nadb remount\n\nadb push \u003chash\u003e.0 /system/etc/security/cacerts/\nadb shell chmod 644 /system/etc/security/cacerts/\u003chash\u003e.0\nadb reboot\nadb root\n\n# Disable captive portal.\nadb shell 'settings put global captive_portal_detection_enabled 0'\nadb shell 'settings put global captive_portal_server localhost'\nadb shell 'settings put global captive_portal_mode 0'\n\n# Uninstall unnecessary Google apps to avoid their background traffic.\nadb shell 'pm uninstall --user 0 com.android.chrome'\nadb shell 'pm uninstall --user 0 com.google.android.apps.docs'\nadb shell 'pm uninstall --user 0 com.google.android.apps.maps'\nadb shell 'pm uninstall --user 0 com.google.android.apps.messaging'\nadb shell 'pm 
uninstall --user 0 com.google.android.apps.photos'\nadb shell 'pm uninstall --user 0 com.google.android.apps.pixelmigrate'\nadb shell 'pm uninstall --user 0 com.google.android.apps.wellbeing'\nadb shell 'pm uninstall --user 0 com.google.android.apps.youtube.music'\nadb shell 'pm uninstall --user 0 com.google.android.gm'\nadb shell 'pm uninstall --user 0 com.google.android.googlequicksearchbox'\nadb shell 'pm uninstall --user 0 com.google.android.videos'\nadb shell 'pm uninstall --user 0 com.google.android.youtube'\nadb shell 'pm uninstall --user 0 com.google.mainline.telemetry'\n\n# Set up Frida.\nadb shell getprop ro.product.cpu.abi # should be x86_64\nwget https://github.com/frida/frida/releases/download/15.1.12/frida-server-15.1.12-android-x86_64.xz\n7z x frida-server-15.1.12-android-x86_64.xz\n\nadb push frida-server-15.1.12-android-x86_64 /data/local/tmp/frida-server\nadb shell chmod 777 /data/local/tmp/frida-server\n\nadb shell \"nohup /data/local/tmp/frida-server \u003e/dev/null 2\u003e\u00261 \u0026\"\nfrida-ps -U | grep frida # should have `frida-server`\n\n# Set up honey data.\n\nadb emu avd snapshot save dsl-honey-data\n\n# Stop the emulator by pressing the X button (no shutdown).\n```\n\n## License\n\nThis code is licensed under the MIT license, see the [`LICENSE`](/LICENSE) file for details.\n\nThe [data set of our analysis](https://doi.org/10.5281/zenodo.7088557) is also available.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatenanfragen%2Fandroid-data-safety-label-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatenanfragen%2Fandroid-data-safety-label-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatenanfragen%2Fandroid-data-safety-label-analysis/lists"}