{"id":13648132,"url":"https://github.com/desgeeko/pdfsyntax","last_synced_at":"2025-08-21T14:14:30.713Z","repository":{"id":50192753,"uuid":"387035771","full_name":"desgeeko/pdfsyntax","owner":"desgeeko","description":"A Python library to inspect and modify the internal structure of a PDF file","archived":false,"fork":false,"pushed_at":"2025-08-03T21:18:37.000Z","size":1036,"stargazers_count":995,"open_issues_count":3,"forks_count":30,"subscribers_count":18,"default_branch":"main","last_synced_at":"2025-08-03T23:26:58.609Z","etag":null,"topics":["api","browse","cli","inspection","library","parser","pdf","pdfsyntax","python","read","syntax","transformation","write"],"latest_commit_sha":null,"homepage":"https://pdfsyntax.dev","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/desgeeko.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-07-17T21:00:03.000Z","updated_at":"2025-08-03T21:18:41.000Z","dependencies_parsed_at":"2025-04-22T07:37:10.990Z","dependency_job_id":"557a7324-1344-4866-a43d-718339282835","html_url":"https://github.com/desgeeko/pdfsyntax","commit_stats":{"total_commits":49,"total_committers":1,"mean_commits":49.0,"dds":0.0,"last_synced_commit":"11236d15d9bca6f0af519f3a2dba707c308eb4c0"},"previous_names":[],"tags_count":15,"template":false,"template_full_name":null,"purl":"pkg:github/desgeeko/pdfsyntax","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/desgeeko%2Fpdfsyntax","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/desgeeko%2Fpdfsyn
tax/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/desgeeko%2Fpdfsyntax/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/desgeeko%2Fpdfsyntax/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/desgeeko","download_url":"https://codeload.github.com/desgeeko/pdfsyntax/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/desgeeko%2Fpdfsyntax/sbom","scorecard":{"id":336044,"data":{"date":"2025-08-11","repo":{"name":"github.com/desgeeko/pdfsyntax","commit":"59f213cb13c1642bf6757e8855fc98c7609a1cd5"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":5.6,"checks":[{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":10,"reason":"GitHub workflow tokens follow principle of least privilege","details":["Info: topLevel 'contents' permission set to 'read': .github/workflows/python-publish.yml:16","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Maintained","score":10,"reason":"18 
commit(s) and 1 issue activity found in the last 90 days -- score normalized to 10","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/python-publish.yml:62: update your workflow using https://app.stepsecurity.io/secureworkflow/desgeeko/pdfsyntax/python-publish.yml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/python-publish.yml:68: update your workflow using https://app.stepsecurity.io/secureworkflow/desgeeko/pdfsyntax/python-publish.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/python-publish.yml:23: update your workflow using https://app.stepsecurity.io/secureworkflow/desgeeko/pdfsyntax/python-publish.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/python-publish.yml:25: update your workflow using https://app.stepsecurity.io/secureworkflow/desgeeko/pdfsyntax/python-publish.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: 
.github/workflows/python-publish.yml:36: update your workflow using https://app.stepsecurity.io/secureworkflow/desgeeko/pdfsyntax/python-publish.yml/main?enable=pin","Warn: pipCommand not pinned by hash: .github/workflows/python-publish.yml:32","Info:   0 out of   4 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of   1 third-party GitHubAction dependencies pinned","Info:   0 out of   1 pipCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses 
fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Packaging","score":10,"reason":"packaging workflow detected","details":["Info: Project packages its releases by way of GitHub Actions.: .github/workflows/python-publish.yml:41"],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'main'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release 
artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}}]},"last_synced_at":"2025-08-18T04:45:11.991Z","repository_id":50192753,"created_at":"2025-08-18T04:45:11.991Z","updated_at":"2025-08-18T04:45:11.991Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271493232,"owners_count":24769117,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-21T02:00:08.990Z","response_time":74,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","browse","cli","inspection","library","parser","pdf","pdfsyntax","python","read","syntax","transformation","write"],"created_at":"2024-08-02T01:03:59.621Z","updated_at":"2025-08-21T14:14:30.704Z","avatar_url":"https://github.com/desgeeko.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"PDFSyntax\n=========\n\n*A Python library to inspect and transform the internal structure of PDF files*\n\n## Introduction\n\nThe project is focused on chapter 7 (\"Syntax\") of the Portable Document Format (PDF) Specification. 
It implements all the detailed document structure management down to the byte level for inspection and transformation use cases (access to metadata, rotation,...).\n\n- Internal functions are being exposed as an API toolkit for PDF read/write operations,\n- Some specific functions are additionally exposed as a command line interface for use in a terminal or a  browser.\n\nPDFSyntax is lightweight (no dependencies) and written from scratch in pure Python, with a focus on simplicity and immutability.\n\nIt favors non-destructive edits allowed by the PDF Specification: by default incremental updates are added at the end of the original file (you may rewind or squash all revisions into a single one).\n\n## Project status\n\nWORK IN PROGRESS! This is BETA quality software. The API may change anytime.\nNext on TO-DO list:\n- Cut \u0026 append pages\n- Lossless compression\n- More filters\n- Improve text extraction\n- Augment text extraction with layout detection\n\n## Installation\n\nYou can install from PyPI:\n\n    pip install pdfsyntax\n\n## CLI overview\n\nPlease refer to the [CLI README](docs/cli.md) for details.\n\nThe general form of the CLI usage is:\n\n    pdfsyntax COMMAND FILE\n\nOr this longer form if you installed from source:\n\n    python3 -m pdfsyntax COMMAND FILE\n\nYou can get quick insights on a PDF file with these commands:\n- `overview` outputs text data about the structure and the metadata.\n- `disasm` outputs a dump of the file structure on the terminal.\n- `text` outputs extracted text spatially, as if it was a kind of scan.\n- `fonts` outputs list of fonts used.\n- `browse` outputs static html data that lets you browse the internal structure of the PDF file: the PDF source is pretty-printed and augmented with hyperlinks.\n\n## API overview\n\nPlease refer to the [API README](docs/api.md) for details.\n\nPDFSyntax is mostly made of simple functions. 
Example:\n\n```Python\n\u003e\u003e\u003e from pdfsyntax import readfile, metadata\n\u003e\u003e\u003e doc = readfile(\"samples/simple_text_string.pdf\")\n\u003e\u003e\u003e metadata(doc) #returns a Python dict whose keys are 'Title', 'Author', etc...\n```\n\nThe Doc object is probably the only dedicated class you will need to handle. It is a black box that stores all the internal states of a document:\n- content that is cached/memoized from an original file,\n- modifications that add/modify/delete content and that are tracked as incremental updates.\n\n```Python\n\u003e\u003e\u003e doc\n\u003cPDF Doc in revision 1 with 0 modified object(s)\u003e\n```\n\nThis object exposes the same metadata function as a method, so you can get the same result with:\n\n```Python\n\u003e\u003e\u003e doc.metadata() #returns a Python dict whose keys are 'Title', 'Author', etc...\n```\n\nLow-level functions like `get_object` or `update_object` allow you to directly access and manipulate the inner objects of the document structure.\nYou may also use higher-level functions like `rotate`:\n\n```Python\n\u003e\u003e\u003e from pdfsyntax import rotate, writefile\n\u003e\u003e\u003e doc180 = rotate(doc, 180) #rotate pages by 180°\n```\n\nThe original object is unchanged and a new object is created with an incremental update (revision 2) that encloses the ongoing orientation modification:\n\n```Python\n\u003e\u003e\u003e doc180\n\u003cPDF Doc in revision 1 with 1 modified object(s)\u003e\n```\n\nYou can then write the modified PDF to disk. Note that the resulting file contains a new section appended to the original content. You may cut this section to revert the change.\n\n```Python\n\u003e\u003e\u003e writefile(doc180, \"rotated_doc.pdf\")\n```\n\n\n## Open-Source, not Open-Contribution yet\n\nPDFSyntax is [MIT licensed](LICENCE) but is currently closed to contributions.\n\u003e Personal note: this is a pet project of mine and my time is limited. 
First I need to focus on my roadmap (new features and refactoring) and then I will happily accept contributions when everything is a little more stabilised. \n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdesgeeko%2Fpdfsyntax","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdesgeeko%2Fpdfsyntax","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdesgeeko%2Fpdfsyntax/lists"}