{"id":13734590,"url":"https://github.com/FFRI/PackerDetectionToolEvaluation","last_synced_at":"2025-05-08T10:32:10.407Z","repository":{"id":75343717,"uuid":"307269153","full_name":"FFRI/PackerDetectionToolEvaluation","owner":"FFRI","description":"Evaluation of packer type estimation/detection tools","archived":false,"fork":false,"pushed_at":"2021-03-24T06:21:48.000Z","size":354,"stargazers_count":12,"open_issues_count":0,"forks_count":5,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-15T03:28:01.453Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FFRI.png","metadata":{"files":{"readme":"README.md","changelog":"change_dataset_labels.py","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2020-10-26T05:22:32.000Z","updated_at":"2025-01-09T20:28:30.000Z","dependencies_parsed_at":"2024-01-06T10:15:13.859Z","dependency_job_id":"5278663d-8cb7-41e9-b10a-dd30d506d3de","html_url":"https://github.com/FFRI/PackerDetectionToolEvaluation","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FFRI%2FPackerDetectionToolEvaluation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FFRI%2FPackerDetectionToolEvaluation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FFRI%2FPackerDetectionToolEvaluation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FFRI%2FPackerDetectionToolEvaluation/manifests","owner_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/owners/FFRI","download_url":"https://codeload.github.com/FFRI/PackerDetectionToolEvaluation/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253045855,"owners_count":21845785,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T03:00:57.766Z","updated_at":"2025-05-08T10:32:10.058Z","avatar_url":"https://github.com/FFRI.png","language":"Python","funding_links":[],"categories":[":books: Literature"],"sub_categories":["Documentation"],"readme":"# Evaluation of packer type estimation/detection tools\r\n\r\nWe evaluated two packer type estimation/detection tools ([pypeid](https://github.com/FFRI/pypeid) and [Detect It Easy (DIE)](https://github.com/horsicq/Detect-It-Easy)) to fix [this issue](https://github.com/FFRI/ffridataset-scripts/issues/1).\r\n\r\n## Summary\r\n\r\nDIE detects packed binaries and estimates the packer type with higher precision than pypeid. However, the detection coverage (TPR) of DIE is slightly lower than that of pypeid. 
See [results](#results) for more details.\r\n\r\n## Dataset used for evaluation\r\n\r\nWe use two datasets for evaluating packer type estimation/detection tools.\r\n\r\n- [PackingData](https://github.com/chesvectain/PackingData)\r\n- [RCE\\_Lab](https://github.com/apuromafo/RCE_Lab)\r\n\r\n### PackingData\r\n\r\nThis dataset contains both packed and normal (i.e., non-packed) binaries, which were used in the paper titled [\"All-in-One Framework for Detection, Unpacking, and Verification for Malware Analysis.\"](https://www.hindawi.com/journals/scn/2019/5278137/) Since it contains both packed and normal binaries, we use it to evaluate the performance of both packer type estimation and packer detection.\r\n\r\n**Specification**\r\n\r\n- It contains 458 normal binaries.\r\n- It contains 2469 packed binaries.\r\n    - These binaries were created by packing 130 PE files with each of the following 19 packers (only 129 PE files for JDpack):\r\n        - ASPack, BeRoEXEPacker, FSG, JDpack, MEW, MPRESS, Molebox, NSPack, Neolite, PECompact, Petite, Packman, RLPack, UPX, WinUpack, Yoda’s Crypter, Yoda’s Protector, eXpressor, exe32pack\r\n\r\n**Notes about PackingData dataset (2021/03/11)**\r\n\r\nAfter publishing the [first evaluation result](https://github.com/FFRI/PackerDetectionToolEvaluation/tree/ae0f653ade67e5e0d9d0d7d996dd9816e09a1a3c), we noticed that the PackingData dataset contains some mislabeled samples.\r\n(For example, `PackingData/Notpacked/avs_check_x86.exe` is [a UPX-packed binary](https://www.virustotal.com/gui/file/2fd27a3f6c9644b8105c7934d0f41fe10b056e327491df37750d634336f4b2db/details), but is labeled as `NotPacked`.)\r\n\r\nWe therefore changed the labels of these samples to make the evaluation more precise.\r\nTo fix the labels of the mislabeled samples, run the [change\\_dataset\\_labels.py](./change_dataset_labels.py) script.\r\n\r\nThe TPRs and FPRs differ slightly from the previous results, but the [conclusion does not change](#summary).\r\n\r\n### RCE\\_Lab\r\n\r\nThis dataset contains 
binaries packed by a variety of packers. We only use the binaries in `tuts4you/Unpack*` for evaluation. Since this dataset does not contain normal binaries, we mainly use it for evaluating the performance of packer type estimation.\r\n\r\n## Results\r\n\r\n### PackingData\r\n\r\nThe following table compares the packer type estimation performance of pypeid and DIE. DIE estimates the packer type more accurately than pypeid.\r\n\r\n|     | pypeid | DIE   |\r\n| --- | -----: | ----: |\r\n| Accuracy | 73.2%  | **84.9%** |\r\n\r\nThe following table compares the packer detection performance of pypeid and DIE. The FPR of DIE is far lower than that of pypeid, while the TPRs are nearly the same.\r\n\r\n|     | pypeid | DIE   |\r\n| --- | -----: | ----: |\r\n| TPR | 94.5%  | 93.5% |\r\n| FPR | **54.8%**  | **0.7%** |\r\n\r\n### RCE\\_Lab\r\n\r\nThe following table compares the packer type estimation performance of pypeid and DIE. DIE also estimates the packer type more accurately on this dataset.\r\n\r\n|      | pypeid | DIE   |\r\n| ---- | -----: | ----: |\r\n| Accuracy  | 65.1%  | **69.0%** |\r\n\r\nThe following table compares the packer detection performance of pypeid and DIE. We do not show the FPR because this dataset does not contain normal binaries. 
The packer detection performance of DIE is slightly lower than that of pypeid.\r\n\r\n|     | pypeid |  DIE  |\r\n| --- | -----: | ----: |\r\n| TPR | 88.2%  | 83.1% |\r\n\r\n## How to reproduce the results?\r\n\r\n### Tested platform\r\n\r\n- Ubuntu 20.04 LTS on WSL on Windows 10 version 1909\r\n\r\n### Requirements\r\n\r\n- Python 3.6\r\n- [Poetry](https://python-poetry.org/)\r\n\r\n### Prepare dataset\r\n\r\n```\r\n$ git clone --depth=1 https://github.com/chesvectain/PackingData.git dataset/PackingData\r\n$ git clone --depth=1 https://github.com/apuromafo/RCE_Lab.git\r\n$ mkdir dataset/UnpackMe\r\n$ mv RCE_Lab/tuts4you/Unpack* dataset/UnpackMe\r\n$ python change_dataset_labels.py\r\n```\r\n\r\n### Resolve dependencies\r\n\r\n```\r\n$ sudo apt install unrar # To resolve rarfile's dependencies manually\r\n$ poetry shell\r\n$ poetry update\r\n```\r\n\r\n### Scan with pypeid\r\n\r\n```\r\n$ python peid_packer_scan.py\r\n$ python peid_packer_scan_statistics.py\r\nPackingData\r\n- PackingData.json\r\n  - Total: 2476\r\n    - Scan-failed samples: 0\r\n    - Samples scanned: 2476\r\n       - Purely detected as packed: 129\r\n       - Excessively detected as packed (containing true label): 1810\r\n       - Purely detected as non-packed: 137\r\n       - Excessively detected as packed (not containing true label): 400\r\n- Notpacked.json\r\n  - Total: 451\r\n    - Scan-failed samples: 0\r\n    - Samples scanned: 451\r\n       - Purely detected as packed: 0\r\n       - Excessively detected as packed (containing true label): 0\r\n       - Purely detected as non-packed: 204\r\n       - Excessively detected as packed (not containing true label): 247\r\nCategorical Accuracy:  0.7321489579774513\r\nTPR:  0.9446688206785138\r\nFPR:  0.5476718403547672\r\n...\r\n```\r\n\r\n### Scan with DIE\r\n\r\n```\r\n$ wget https://github.com/horsicq/DIE-engine/releases/download/3.00/die_lin64_portable_3.00.tar.gz\r\n$ mkdir die_lin64_portable_3.00\r\n$ tar -zxvf die_lin64_portable_3.00.tar.gz -C 
die_lin64_portable_3.00\r\n$ python die_packer_scan.py\r\n$ python die_packer_scan_statistics.py\r\nPackingData\r\n- PackingData.json\r\n  - Total:  2476\r\n    - Scan-failed samples: 0\r\n    - Samples scanned: 2476\r\n       - Purely detected as packed: 2037\r\n       - Excessively detected as packed (containing true label): 146\r\n       - Purely detected as non-packed: 161\r\n       - Excessively detected as packed (not containing true label): 132\r\n- Notpacked.json\r\n  - Total:  451\r\n    - Scan-failed samples: 0\r\n    - Samples scanned: 451\r\n       - Purely detected as packed: 0\r\n       - Excessively detected as packed (containing true label): 0\r\n       - Purely detected as non-packed: 448\r\n       - Excessively detected as packed (not containing true label): 3\r\nCategorical Accuracy:  0.8489921421250427\r\nTPR:  0.9349757673667205\r\nFPR:  0.0066518847006651885\r\n...\r\n```\r\n\r\n## Scan results format\r\n\r\nThe scan results are written as JSON arrays. Each element of an array has the following form.\r\n\r\n```\r\n{\r\n  \"path\": Location of the target executable file at the time of judgment,\r\n  \"name\": The name of the target executable file,\r\n  \"scan\": Judgment result,\r\n  \"detectable\": Success or failure of packer type judgment,\r\n  \"feature\": [\r\n    Label of the target executable file\r\n  ]\r\n}\r\n```\r\n\r\n## Author\r\n\r\nTsubasa Kuwabara. © FFRI Security, Inc. 2020\r\n\r\n## License\r\n\r\n[Apache version 2.0](./LICENSE)\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFFRI%2FPackerDetectionToolEvaluation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FFFRI%2FPackerDetectionToolEvaluation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFFRI%2FPackerDetectionToolEvaluation/lists"}