{"id":26654132,"url":"https://github.com/malice-plugins/pdf","last_synced_at":"2025-04-11T07:18:36.363Z","repository":{"id":52076496,"uuid":"48825336","full_name":"malice-plugins/pdf","owner":"malice-plugins","description":"Malice PDF Plugin","archived":false,"fork":false,"pushed_at":"2019-01-07T15:01:16.000Z","size":520,"stargazers_count":16,"open_issues_count":1,"forks_count":11,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-29T22:09:58.523Z","etag":null,"topics":["docker","malice","malice-plugin","malware","malware-analysis","malware-analyzer","pdf","pdf-analyzer","pdf-malware","pdf-parsing","pdfid","peepdf","plugin"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/malice-plugins.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-12-31T00:33:10.000Z","updated_at":"2024-09-09T21:10:55.000Z","dependencies_parsed_at":"2022-09-06T07:43:47.579Z","dependency_job_id":null,"html_url":"https://github.com/malice-plugins/pdf","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/malice-plugins%2Fpdf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/malice-plugins%2Fpdf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/malice-plugins%2Fpdf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/malice-plugins%2Fpdf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/malice-plugins","download_url":"https://codel
oad.github.com/malice-plugins/pdf/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248358603,"owners_count":21090405,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","malice","malice-plugin","malware","malware-analysis","malware-analyzer","pdf","pdf-analyzer","pdf-malware","pdf-parsing","pdfid","peepdf","plugin"],"created_at":"2025-03-25T04:57:25.337Z","updated_at":"2025-04-11T07:18:36.328Z","avatar_url":"https://github.com/malice-plugins.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![pdf logo](https://github.com/malice-plugins/pdf/blob/master/docs/pdf.png)\n\n# malice-pdf\n\n[![Circle CI](https://circleci.com/gh/malice-plugins/pdf.png?style=shield)](https://circleci.com/gh/malice-plugins/pdf) [![License](http://img.shields.io/:license-mit-blue.svg)](http://doge.mit-license.org) [![Docker Stars](https://img.shields.io/docker/stars/malice/pdf.svg)](https://hub.docker.com/r/malice/pdf/) [![Docker Pulls](https://img.shields.io/docker/pulls/malice/pdf.svg)](https://hub.docker.com/r/malice/pdf/) [![Docker Image](https://img.shields.io/badge/docker%20image-58.9MB-blue.svg)](https://hub.docker.com/r/malice/pdf/)\n\nMalice PDF Plugin\n\n\u003e This repository contains a **Dockerfile** of **malice/pdf**. 
It runs [PDFiD](https://blog.didierstevens.com/programs/pdf-tools/) and [pdf-parser.py](https://blog.didierstevens.com/programs/pdf-tools/) on samples and will extract and _(eventually)_ submit extracted files as children back to malice for analysis.\n\n---\n\n### Dependencies\n\n- [malice/alpine](https://hub.docker.com/r/malice/alpine/)\n\n## Installation\n\n1. Install [Docker](https://www.docker.io/).\n2. Download [trusted build](https://hub.docker.com/r/malice/pdf/) from public [DockerHub](https://hub.docker.com): `docker pull malice/pdf`\n\n## Usage\n\n```bash\n$ docker run --rm -v /path/to/malware:/malware malice/pdf --help\n\nUsage: pdfscan [OPTIONS] COMMAND [ARGS]...\n\n  Malice PDF Plugin\n\n  Author: blacktop \u003chttps://github.com/blacktop\u003e\n\nOptions:\n  --version   print the version\n  -h, --help  Show this message and exit.\n\nCommands:\n  scan  scan a file\n  web   start web service\n```\n\n### Scanning\n\n```bash\n$ docker run --rm -v /path/to/malware:/malware malice/pdf scan --help\n\nUsage: pdfscan.py scan [OPTIONS] FILE_PATH\n\n  Malice PDF Plugin.\n\nOptions:\n  -v, --verbose            verbose output\n  -t, --table              output as Markdown table\n  -x, --proxy PROXY        proxy settings for Malice webhook endpoint\n                           [$MALICE_PROXY]\n  -c, --callback ENDPOINT  POST results back to Malice webhook\n                           [$MALICE_ENDPOINT]\n  --elasticsearch HOST     elasticsearch address for Malice to store results\n                           [$MALICE_ELASTICSEARCH_URL]\n  --timeout SECS           malice plugin timeout (default: 10)\n                           [$MALICE_TIMEOUT]\n  --extract PATH           where to extract the embedded objects to\n  -h, --help               Show this message and exit.\n```\n\nThis will output to stdout and POST to malice results API webhook endpoint.\n\n## Sample Output\n\n### [JSON](https://github.com/malice-plugins/pdf/blob/master/docs/results.json)\n\n```json\n{\n  
\"pdf\": {\n    \"streams\": {},\n    \"peepdf\": {},\n    \"pdfid\": {\n      \"heuristics\": {\n        \"embeddedfile\": {\n          \"reason\": \"`/EmbeddedFile` flag(s) detected\",\n          \"score\": 0.9\n        },\n        \"nameobfuscation\": {\n          \"reason\": \"no hex encoded flags detected\",\n          \"score\": 0\n        },\n        \"suspicious\": {},\n        \"triage\": {\n          \"reason\": \"sample is likely malicious and requires further analysis\",\n          \"score\": 1\n        }\n      },\n      \"countChatAfterLastEof\": \"0\",\n      \"errorMessage\": \"\",\n      \"dates\": {\n        \"date\": []\n      },\n      \"nonStreamEntropy\": \"4.896895\",\n      \"header\": \"%PDF-1.1\",\n      \"version\": \"0.2.4\",\n      \"entropy\": \"\",\n      \"totalEntropy\": \"7.873045\",\n      \"isPdf\": \"True\",\n      \"keywords\": {\n        \"keyword\": [\n          {\n            \"count\": 9,\n            \"hexcodecount\": 0,\n            \"name\": \"obj\"\n          },\n          {\n            \"count\": 9,\n            \"hexcodecount\": 0,\n            \"name\": \"endobj\"\n          },\n          {\n            \"count\": 2,\n            \"hexcodecount\": 0,\n            \"name\": \"stream\"\n          },\n          {\n            \"count\": 2,\n            \"hexcodecount\": 0,\n            \"name\": \"endstream\"\n          },\n          {\n            \"count\": 1,\n            \"hexcodecount\": 0,\n            \"name\": \"xref\"\n          },\n          {\n            \"count\": 1,\n            \"hexcodecount\": 0,\n            \"name\": \"trailer\"\n          },\n          {\n            \"count\": 1,\n            \"hexcodecount\": 0,\n            \"name\": \"startxref\"\n          },\n          {\n            \"count\": 1,\n            \"hexcodecount\": 0,\n            \"name\": \"/Page\"\n          },\n          ...SNIP...\n          {\n            \"count\": 0,\n            \"hexcodecount\": 0,\n            \"name\": 
\"/Colors \u003e 2^24\"\n          }\n        ]\n      },\n      \"countEof\": \"1\",\n      \"streamEntropy\": \"7.970107\",\n      \"errorOccured\": \"False\"\n    }\n  }\n}\n```\n\n### [Markdown](https://github.com/malice-plugins/pdf/blob/master/docs/SAMPLE.md)\n\n---\n\n### pdf\n\n#### [PDFiD]\n\n- **PDF Header:** `%PDF-1.1`\n- **Total Entropy:** `7.873045`\n- **Entropy In Streams:** `7.970107`\n- **Entropy Out Streams:** `4.896895`\n- **Count %% EOF:** `1`\n- **Data After EOF:** `0`\n\n| Keyword        | Count |\n| -------------- | ----- |\n| obj            | 9     |\n| endobj         | 9     |\n| stream         | 2     |\n| endstream      | 2     |\n| xref           | 1     |\n| trailer        | 1     |\n| startxref      | 1     |\n| /Page          | 1     |\n| /Encrypt       | 0     |\n| /ObjStm        | 0     |\n| /JS            | 1     |\n| /JavaScript    | 1     |\n| /AA            | 0     |\n| /OpenAction    | 1     |\n| /AcroForm      | 0     |\n| /JBIG2Decode   | 0     |\n| /RichMedia     | 0     |\n| /Launch        | 0     |\n| /EmbeddedFile  | 1     |\n| /XFA           | 0     |\n| /Colors \u003e 2^24 | 0     |\n\n##### Embedded File\n\n\u003e **Score:** `50`\n\n- `/EmbeddedFile` flag(s) detected\n\n##### Triage\n\n\u003e **Score:** `150`\n\n- `/JS`: indicating javascript is present in the file.\n- `/JavaScript`: indicating javascript is present in the file.\n- `/OpenAction`: indicating automatic action to be performed when the page/document is viewed.\n\n##### Suspicious Properties\n\n\u003e **Score:** `50`\n\n- Page count of 1\n\n#### [pdf-parser]\n\n##### Stats\n\n- `Comment: 3`\n- `XREF: 1`\n- `Trailer: 1`\n- `StartXref: 1`\n- `Indirect object: 9`\n- `1: 5`\n- `/Action 1: 9`\n- `/Catalog 1: 1`\n- `/EmbeddedFile 1: 8`\n- `/Filespec 1: 7`\n- `/Font 1: 6`\n- `/Outlines 1: 2`\n- `/Page 1: 4`\n- `/Pages 1: 3`\n\n##### TAGS\n\n**file_name:**\n\n- `eicar-dropper.doc`\n\n**pestudio_blacklist_string:**\n\n- `JavaScript`\n\n##### Embedded Files\n\n| Object 
| Sha256                                                           |\n| ------ | ---------------------------------------------------------------- |\n| 8      | eb0ae2d1cd318dc1adb970352e84361f9b194ff14f45b0186e4ed6696900394a |\n\n##### Carved Content\n\n**EmbeddedFile:**\n\n```\ns\u003c\u003c++\u003c\u003c            /Names [(eicar-dropper.doc) 7 0 R]    /OpenAction 9 0 R\n```\n\n**OpenAction:**\n\n```\n\u003c\u003c\n /Type /Action\n /S /JavaScript\n /JS (this.exportDataObject({ cName: \"eicar-dropper.doc\", nLaunch: 2 });)\n\u003e\u003e\n```\n\n**JS:**\n\n```javascript\n(this.exportDataObject({ cName: \"eicar-dropper.doc\", nLaunch: 2 })    ; )\n```\n\n---\n\n## Documentation\n\n- [To write results to ElasticSearch](https://github.com/malice-plugins/pdf/blob/master/docs/elasticsearch.md)\n- [To create a PDF scan micro-service](https://github.com/malice-plugins/pdf/blob/master/docs/web.md)\n- [To post results to a webhook](https://github.com/malice-plugins/pdf/blob/master/docs/callback.md)\n\n## Issues\n\nFind a bug? Want more features? Find something missing in the documentation? Let me know! 
Please don't hesitate to [file an issue](https://github.com/malice-plugins/pdf/issues/new).\n\n## CHANGELOG\n\nSee [`CHANGELOG.md`](https://github.com/malice-plugins/pdf/blob/master/CHANGELOG.md)\n\n## Contributing\n\n[See all contributors on GitHub](https://github.com/malice-plugins/pdf/graphs/contributors).\n\nPlease update the [CHANGELOG.md](https://github.com/malice-plugins/pdf/blob/master/CHANGELOG.md).\n\n## Credits\n\nHeavily (if not entirely) influenced by CSE-CST's [alsvc_pdfid](https://bitbucket.org/cse-assemblyline/alsvc_pdfid) and [alsvc_peepdf](https://bitbucket.org/cse-assemblyline/alsvc_peepdf).\n\n## TODO\n\n- [x] add PDFiD\n- [x] add pdf-parser for streams\n- [ ] ~~add peepdf for JS~~\n- [ ] add uwsgi to serve webserver (maybe nginx?)\n- [ ] float PDFiD errors up like I do with pdf-parser _(handles errors when file is not a PDF)_\n- [ ] check if PDF is too big (max size 3000000 ??)\n- [ ] add smart timeout to avoid DoS samples\n- [ ] use https://github.com/unidoc/unidoc instead?? I miss you golang, I miss you soooo hard :tired_face:\n\n## License\n\nMIT Copyright (c) 2016-2018 **blacktop**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmalice-plugins%2Fpdf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmalice-plugins%2Fpdf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmalice-plugins%2Fpdf/lists"}