{"id":13717540,"url":"https://github.com/quarkslab/aosp_dataset","last_synced_at":"2026-02-03T10:38:40.409Z","repository":{"id":43928911,"uuid":"459081649","full_name":"quarkslab/aosp_dataset","owner":"quarkslab","description":"Large Commit Precise Vulnerability Dataset based on AOSP CVE","archived":false,"fork":false,"pushed_at":"2023-05-12T18:57:06.000Z","size":421,"stargazers_count":65,"open_issues_count":0,"forks_count":6,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-09-05T04:40:08.693Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/quarkslab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-02-14T08:47:37.000Z","updated_at":"2025-05-30T13:47:33.000Z","dependencies_parsed_at":"2024-01-17T09:21:08.100Z","dependency_job_id":"0aed8c3b-073c-4488-afaf-38cd172a8fec","html_url":"https://github.com/quarkslab/aosp_dataset","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/quarkslab/aosp_dataset","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quarkslab%2Faosp_dataset","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quarkslab%2Faosp_dataset/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quarkslab%2Faosp_dataset/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quarkslab%2Faosp_dataset/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/quarkslab","download_url":"https://codeload.github.com/quarkslab/aosp_dataset/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quarkslab%2Faosp_dataset/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29041867,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-03T10:09:22.136Z","status":"ssl_error","status_checked_at":"2026-02-03T10:09:16.814Z","response_time":96,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T00:01:23.761Z","updated_at":"2026-02-03T10:38:40.388Z","avatar_url":"https://github.com/quarkslab.png","language":"Python","funding_links":[],"categories":["Dependency intelligence"],"sub_categories":["Vulnerability information exchange"],"readme":"# Building a Commit-level Dataset of Real-world Vulnerabilities\n\nThis repository is the companion for the paper Building a Commit-level Dataset\nof Real-world Vulnerabilities published in CODASPY 2022.\n\n\n## Installation\n\nYou can install the code of the helpers by cloning this repository and using `pip install .` (the trailing dot is important).\n\n\nNote that the code targets `Python 3.10+` and has only been tested on `Debian`.\n\n## Commit level vulnerabilities\n\nThis dataset contains 1900 
vulnerabilities, each described at the fix-commit level, for\nthe Android Open Source Project. The information is available in the\n[`cves/`](cves/) directory, with one file per vulnerability. The schema used is described\nin this [file](schemas/AospCve.schema.json) and a helper Python class is\navailable [here](src/aosp_dataset/aospcve.py).\n\n### Usage\n\n```python\nimport json\nimport aosp_dataset\n\ncve_2012_6701 = aosp_dataset.AospCve(\n    **json.load(open(\"cves/CVE-2012-6701.json\"))\n)\n```\n\n## Compiled dataset\n\nFor some vulnerabilities, the pre-compiled binaries are also available. The\nlinks are found in the precompiled [directory](precompiled/links.json) and the\nschema is available [here](schemas/precompiled.schema.json). A helper class is\nalso available [here](src/compiled.py).\n\nThe complete dataset is about 120 GB and covers about 700 precompiled CVEs.\n\n### Compiled CVE Layout\n\nFor each tuple (CVE-ID, commit), the files are organized in four directories, one per architecture, each with two\nsubdirectories: `vuln` holds the binaries built before the fix commit was\napplied, and `fix` holds those built after it.\n\nEach binary name is prefixed with its hash to prevent name collisions. 
Some directories may be empty if the build was incomplete.\n\n\n### Usage of the helper class\n\n```python\nimport json\nimport aosp_dataset\n\nprecompiled_dataset = aosp_dataset.PrecompiledDataset(\n    **json.load(open(\"precompiled/links.json\"))\n)\n```\n\n### Compilation information\n\nEvery CVE was compiled with the default build options of AOSP.\n\n\n### Example\n```console\n$ tree 017838585617f0e492ede866750fcfb8ed77830b\n017838585617f0e492ede866750fcfb8ed77830b\n├── arm\n│   ├── fix\n│   │   ├── 4d7eb89a9529df46d1ae0b1039594705c4a4b6725e09268d97ae85dc467b15c0_libnfc_nci_jni.so\n│   │   ├── afe25ad8735bf7e78ba45d39387dd4316c20f885c2883732a1c8894b0a2f85dc_libnfc_nci_jni.so\n│   │   └── files.json\n│   └── vuln\n│       ├── 92aaa2a42659571150c4331ac22b89d5390ea1bf3956e37f7d8fec5a9d740908_libnfc_nci_jni.so\n│       ├── d15122acd2474f7a24674a5ea732276396a31c402852f493f1a0d7558853f840_libnfc_nci_jni.so\n│       └── files.json\n├── arm64\n│   ├── fix\n│   │   ├── 2980aa7fc0735dd0cc6ba946ba3632e787d3eae557cc4f86375323fd55869636_libnfc_nci_jni.so\n│   │   ├── f921f272f78c3e99a6196d4e99eff2a1abfc2aff00a315bde681ca1fb22ff859_libnfc_nci_jni.so\n│   │   └── files.json\n│   └── vuln\n│       ├── 63143671c7616c85ad1215f53e60c4da978f9c78b09e7cca6f347edab39ce7ff_libnfc_nci_jni.so\n│       ├── 8e4ce03aaef50adab4b1e1d168fe6ca6eafd5b59a3b61d4ada6abf79ade473d3_libnfc_nci_jni.so\n│       └── files.json\n├── functions.json\n├── x64\n│   ├── fix\n│   │   ├── 1e9ea23155ce81bddcd2179dd5ceaea748c18fa20e8b901f4836c62095301e24_libnfc_nci_jni.so\n│   │   ├── c2e8f6ef9fb081b2b817fcfc36176cf333992987ee17b7ff49d2861e053550a8_libnfc_nci_jni.so\n│   │   └── files.json\n│   └── vuln\n│       ├── 30611971a2dab970a1ed6e3a8da88ac61b45c59aedfc90432488c0b9532326a4_libnfc_nci_jni.so\n│       ├── 440cfd9abd9feda87be89c7ed97aad776ac884ce8ee723235e0d5f9b200d4c1c_libnfc_nci_jni.so\n│       └── files.json\n└── x86\n    ├── fix\n    │   ├── 
48f844d1e6909da7ff0a2ae7e4e03c81e8b2d854048c86e9dc10e2af528ee6ab_libnfc_nci_jni.so\n    │   ├── e7d7162d279595bfba5f724fa2ad7941d6223b4c1ede6ba1228831193477a70f_libnfc_nci_jni.so\n    │   └── files.json\n    └── vuln\n        ├── 182f7ce7eab20238d6bb83d57227212f1b964c512c70761f26117e22b9ca753b_libnfc_nci_jni.so\n        ├── f76e2bff59ae3bd91b41a23285b4402284b563647ce84f553ea328709470f3a4_libnfc_nci_jni.so\n        └── files.json\n\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquarkslab%2Faosp_dataset","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquarkslab%2Faosp_dataset","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquarkslab%2Faosp_dataset/lists"}