{"id":22433729,"url":"https://github.com/nlitsme/pypdfcrack","last_synced_at":"2025-08-02T20:33:07.149Z","repository":{"id":71128537,"uuid":"73072027","full_name":"nlitsme/pyPdfCrack","owner":"nlitsme","description":"Investigation in PDF encryption","archived":false,"fork":false,"pushed_at":"2023-08-22T20:06:46.000Z","size":35,"stargazers_count":17,"open_issues_count":0,"forks_count":7,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-08-01T13:53:59.316Z","etag":null,"topics":["file-format","pdf-encryption","pdf-parser","reverse-engineering"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nlitsme.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2016-11-07T11:31:42.000Z","updated_at":"2025-04-21T15:49:09.000Z","dependencies_parsed_at":"2025-08-01T13:31:04.847Z","dependency_job_id":"04326b23-a182-4e81-9113-fde5ce2a02f9","html_url":"https://github.com/nlitsme/pyPdfCrack","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nlitsme/pyPdfCrack","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlitsme%2FpyPdfCrack","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlitsme%2FpyPdfCrack/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlitsme%2FpyPdfCrack/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlitsme%2FpyPdfCrack/manifests","owner_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/owners/nlitsme","download_url":"https://codeload.github.com/nlitsme/pyPdfCrack/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlitsme%2FpyPdfCrack/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":268448362,"owners_count":24252019,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-02T02:00:12.353Z","response_time":74,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["file-format","pdf-encryption","pdf-parser","reverse-engineering"],"created_at":"2024-12-05T22:15:45.300Z","updated_at":"2025-08-02T20:33:07.139Z","avatar_url":"https://github.com/nlitsme.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Pdf Encryption\n==============\n\nUsing sample data from:\n * https://github.com/itext/itext7/tree/develop/kernel/src/test/resources/com/itextpdf/kernel/pdf/PdfEncryptionTest\n\nExample code demonstrating how PDF encryption with certificates works.\n\n    python3 decryptpdf.py encryptedWithCertificateAes128.pdf test.p12 kspass\n\nWill output the following decrypted data:\n\n    priv b'a2d0faa52fee8453fe20150505050505'\n    cert b'37373532393731340808080808080808'\n    seed = b'cd532522a3f5943f66479ab8b0aa32d0609a0e18'\n    mkey= b'b1ce00167f01fc074d49d6fe18069b942ed9c428'\n    b'Author' b'Alexander 
Chingarev\\r\\r\\r\\r\\r\\r\\r\\r\\r\\r\\r\\r\\r'\n    b'CreationDate' b\"D:20160809103103-03'00'\\t\\t\\t\\t\\t\\t\\t\\t\\t\"\n    b'Creator' b'iText 6\\t\\t\\t\\t\\t\\t\\t\\t\\t'\n    b'ModDate' b\"D:20160809103103-03'00'\\t\\t\\t\\t\\t\\t\\t\\t\\t\"\n    b'Producer' b'iText\\xae 7.0.1-SNAPSHOT \\xa92000-2016 iText Group NV (AGPL-version)\\x02\\x02'\n\nThe repeating sequences at the end of each decrypted text are the PKCS7 padding.\n\n\nPdf Certificate Encryption\n==========================\n\nThe certificate encryption works as follows:\n\nTo be able to read a pdf, you need the corresponding certificate, for which you need a password.\n\nThe certificate contains two encrypted parts, both encrypted with the same password.\n\n    +---------------------------------------------+\n    |  Certificate.p12                            |\n    +--------+-----+-------+----------------------+\n    | rc2-40 |salt2| count | encrypted ownercert  |\n    +--------+-----+-------+----------------------+\n    | 3des-96|salt3| count | encrypted privatekey |\n    +--------+-----+-------+----------------------+\n\n\nThe first is the certificate owner information, basically an X.509 certificate.\nThis part is encrypted using 40-bit RC2; the key can easily be brute forced, in less than a day's work on a modern laptop.\nThe second part contains the private RSA key for the owner certificate.\nThis part is encrypted using Triple-DES with a 192-bit key.\nBoth the 40-bit RC2 key and the 192-bit 3DES key are derived from the certificate password by repeatedly taking the SHA1 of the\nsalted password string.\n\n\n                                                         \u003cencrypted ownercert\u003e\n                                                                  |\n    +----------+                                                  v\n    | password |----+---\u003e[genkey, salt2, 1]------\u003e \u003ckey2\u003e ----\u003e [rc2] ----\u003e \u003ccertificate\u003e\n    +----------+    |                         
                    |\n                    +---\u003e[genkey, salt2, 2]------\u003e \u003civ2\u003e ---------+\n                    |\n                    +---\u003e[genkey, salt3, 2]------\u003e \u003civ3\u003e ---------+\n                    |                                             |\n                    +---\u003e[genkey, salt3, 1]------\u003e \u003ckey3\u003e ----\u003e [3des] ----\u003e \u003cprivatekey\u003e\n                                                                  ^\n                                                                  |\n                                                         \u003cencrypted privatekey\u003e\n\n\nSo, by first cracking the 40-bit RC2 key, and then using this as the target for a dictionary attack,\none might be able to crack the encrypted RSA private key.\n\nNote that both RC2 and 3DES are used in CBC mode, with an IV calculated by the same salted key generation algorithm.\nWhen cracking the RC2 key, you don't actually need the IV: skip the first block and use the second block as your\nbrute force target. 
The IV for the second block will be the first cipher block.\n\n\n                                         +------------------------+\n    \u003cprivatekey\u003e                         | Recipients(from pdf)   |\n          |                              +------------------------+\n          |    +-------------------------| rsa-encrypted(rc2-key) |\n          v    V                         +------------------------+\n        [rsa decrypt]                    | rc2-encrypted(seed)    |\n               |                         +------------------------+\n        \u003cpkcs1.5 padded key\u003e                         |\n               |                                     v\n               +-------------------------------[rc2 decrypt]\n                                                     |\n                                                     v\n                                                   \u003cseed\u003e\n                                                     |\n                                                     v\n                                                   [sha1]\n                                                     |\n                                                     v\n                                                   \u003cmasterkey\u003e\n\n\nNow that we have decrypted the certificate, we can use the `Recipients` string from the `Encrypt` dictionary in the PDF.\nThis contains the PKCS#7 encoded decryption seed.\nFirst there is the RSA pubkey-encrypted 128-bit RC2 key, which decrypts the seed + permissions string.\n\nThe `seed` is used to calculate a `mkey` by taking the sha1 of the seed, the recipient, and perms.\nThe `mkey` is used to calculate a per-object key by taking the md5 of the mkey and the object id.\nThe objectkey is used to decrypt each object using AES.\n\n    +------------+-+-+-+-+-+---+---+---+---+\n    | masterkey  | oid |gen| s | A | l | T |  ---\u003e [md5] ---\u003e \u003cobjectkey\u003e\n    +------------+-+-+-+-+-+---+---+---+---+    
                  |\n                                                                  v\n                                              \u003cciphertext\u003e ---\u003e [AES] --\u003e \u003cplaintext\u003e\n\n\n\nCracking a PKCS12 certificate\n=============================\n\nHexdump of the first couple of encrypted and decrypted blocks of the 40-bit RC2 encrypted certificate in test.p12:\n\n    2408c5d658c61cad 43036642daa47c52 9023df09863bd662 28c700fee6c81f86 ac5d3c4debda9148 0ac6535dd9574d06  \u003c\u003c cipher text\n    3082____3082____ 060b2a864886f70d 010c0a0103a082__ __3082____060a2a 864886f70d010916 01a082____0482__  \u003c\u003c plain text\n\nThe length dependent parts have been replaced with '\\_\\_\\_\\_'.\nAs you can see the 2nd block contains part of the 'pkcs12' OID representation,\nand will be the same for all PKCS12 encoded data.\n\nSo we can use this as the known plaintext + ciphertext pair target for the brute force search.\nSince the cipher is used in CBC mode, we have to XOR the plaintext with the 1st cipher block\n\n    2408c5d658c61cad XOR 060b2a864886f70d  = 2203ef501040eba0\n\nAnd then do a full 40-bit search for the key:\n\n    openssl_rc2_crack -e 43036642daa47c52 -p 2203ef501040eba0 -v\n\nRunning on a single core, this will take about 5 days.\nUsing the `-f` and `-t` parameters you can run several instances searching different parts\nof the keyspace.\n\n    ./openssl_rc2_crack -e 43036642daa47c52 -p 2203ef501040eba0 -f 0x6896000000 -t 0x6896300000 -v\n    FOUND key: 689629ff1a\n\nNote that this is the key in reverse byte order.\n\nNow we can search for the passphrase using the following command line.\n\n    cat wordlist.txt | ./openssl_pass_crack -s 68e8f778efe0db98453532b7ede8e0c09830ec81  -k 1aff299668 -i 1 -n 1024\n    FOUND key: kspass\n\nInstead of `cat wordlist.txt` you can use JohnTheRipper for password generation:\n\n    john --wordlist=dict.txt --rules --stdout | ./openssl_pass_crack ....\n\n\nPdf Parser\n==========\n\nAs a side 
project I created a simple PDF parser, which outputs the parsing stack plus a list of objects.\n\n    python pdfparser.py encryptedWithCertificateAes128.pdf\n\nproducing the following output (not yet with pretty indentation):\n\n    [PdfComment: ascii:'PDF-1.7', PdfComment: hex:'e2e3cfd3', \n     PdfOperator: xref,\n         PdfNumber: 0, PdfNumber: 9,\n            PdfNumber: 0000000000, PdfNumber: 65535, PdfOperator: f,\n            PdfNumber: 0000000294, PdfNumber: 00000, PdfOperator: n,\n            PdfNumber: 0000000873, PdfNumber: 00000, PdfOperator: n,\n            PdfNumber: 0000000354, PdfNumber: 00000, PdfOperator: n,\n            PdfNumber: 0000000161, PdfNumber: 00000, PdfOperator: n,\n            PdfNumber: 0000000015, PdfNumber: 00000, PdfOperator: n,\n            PdfNumber: 0000000785, PdfNumber: 00000, PdfOperator: n,\n            PdfNumber: 0000000924, PdfNumber: 00000, PdfOperator: n,\n            PdfNumber: 0000003847, PdfNumber: 00000, PdfOperator: n,\n     PdfOperator: trailer, PdfDictionary: [\n         PdfName: Encrypt, PdfReference: 00008.0, \n         PdfName: ID, PdfArray: [PdfHexdata: eadf305edab34545d11859b727274d71, PdfHexdata: eadf305edab34545d11859b727274d71], \n         PdfName: Info, PdfReference: 00003.0, \n         PdfName: Root, PdfReference: 00001.0, \n         PdfName: Size, PdfNumber: 9],\n     PdfComment: ascii:'iText-7.0.1-SNAPSHOT',\n     PdfOperator: startxref, PdfNumber: 4893]\n\n    00001: PdfObject: [PdfDictionary: [\n         PdfName: Metadata, PdfReference: 00007.0, \n         PdfName: Pages, PdfReference: 00002.0, \n         PdfName: Type, PdfName: Catalog]]\n    00002: PdfObject: [PdfDictionary: [\n         PdfName: Count, PdfNumber: 1, \n         PdfName: Kids, PdfArray: [PdfReference: 00004.0], \n         PdfName: Type, PdfName: Pages]]\n    00003: PdfObject: [PdfDictionary: [\n         PdfName: Author, PdfString: 
hex:'5b97d367c7310b9d80761c86e66fa2c71dab01fb150e6fa4c55cb4bf80fac19cda79513966f9d6c1e938080fc8c87800', \n         PdfName: CreationDate, PdfString: hex:'5b97d367c7310b9d80761c86e66fa2c710cf138a9608a559cf8aa4e14fe9f97ee6e7ef89029fb1f7399dd9e64b0d7cab', \n         PdfName: Creator, PdfString: hex:'5b97d367c7310b9d80761c86e66fa2c7ee23708d7b3f6973b01ff75dae7e0fca', \n         PdfName: ModDate, PdfString: hex:'5b97d367c7310b9d80761c86e66fa2c710cf138a9608a559cf8aa4e14fe9f97ee6e7ef89029fb1f7399dd9e64b0d7cab', \n         PdfName: Producer, PdfString: hex:'5b97d367c7310b9d80761c86e66fa2c7ba3a4b524f422411b737667f5c50c98f8e864b96999ed6247270483364dd492a28c0a6f50da37bfe8e0ad618f419d4e77f31179bf7502fd8606af1b81e271ae3']]\n    00004: PdfObject: [PdfDictionary: [\n         PdfName: Contents, PdfReference: 00005.0, \n         PdfName: MediaBox, PdfArray: [PdfNumber: 0, PdfNumber: 0, PdfNumber: 595, PdfNumber: 842], \n         PdfName: Parent, PdfReference: 00002.0, \n         PdfName: Resources, PdfDictionary: [\n             PdfName: Font, PdfDictionary: [\n                 PdfName: F1, PdfReference: 00006.0]], \n         PdfName: TrimBox, PdfArray: [PdfNumber: 0, PdfNumber: 0, PdfNumber: 595, PdfNumber: 842], \n         PdfName: Type, PdfName: Page]]\n    00005: PdfObject: [PdfStream: PdfDictionary: [\n         PdfName: Filter, PdfName: FlateDecode, \n         PdfName: Length, PdfNumber: 80]]\n    00006: PdfObject: [PdfDictionary: [\n         PdfName: BaseFont, PdfName: Helvetica, \n         PdfName: Encoding, PdfName: WinAnsiEncoding, \n         PdfName: Subtype, PdfName: Type1, \n         PdfName: Type, PdfName: Font]]\n    00007: PdfObject: [PdfStream: PdfDictionary: [\n         PdfName: Length, PdfNumber: 2848, \n         PdfName: Subtype, PdfName: XML, \n         PdfName: Type, PdfName: Metadata]]\n    00008: PdfObject: [PdfDictionary: [\n         PdfName: CF, PdfDictionary: [\n             PdfName: DefaultCryptFilter, PdfDictionary: [\n                 PdfName: CFM, 
PdfName: AESV2, \n                 PdfName: Recipients, PdfArray: [PdfString: hex:'308201f706092a'...]]], \n         PdfName: Filter, PdfName: Adobe.PubSec, \n         PdfName: Length, PdfNumber: 128, \n         PdfName: R, PdfNumber: 4, \n         PdfName: StmF, PdfName: DefaultCryptFilter, \n         PdfName: StrF, PdfName: DefaultCryptFilter, \n         PdfName: SubFilter, PdfName: adbe.pkcs7.s5, \n         PdfName: V, PdfNumber: 4]]\n\n\n\nCopyright (c) 2016 Willem Hengeveld \u003citsme@xs4all.nl\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnlitsme%2Fpypdfcrack","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnlitsme%2Fpypdfcrack","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnlitsme%2Fpypdfcrack/lists"}