{"id":13489591,"url":"https://github.com/apache/pdfbox","last_synced_at":"2025-04-29T18:49:30.265Z","repository":{"id":674588,"uuid":"318103","full_name":"apache/pdfbox","owner":"apache","description":"Mirror of Apache PDFBox","archived":false,"fork":false,"pushed_at":"2025-04-25T14:19:41.000Z","size":110731,"stargazers_count":2806,"open_issues_count":31,"forks_count":891,"subscribers_count":90,"default_branch":"trunk","last_synced_at":"2025-04-25T15:28:44.453Z","etag":null,"topics":["content","java","library","pdfbox"],"latest_commit_sha":null,"homepage":"http://pdfbox.apache.org/","language":"Java","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apache.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2009-09-26T08:00:19.000Z","updated_at":"2025-04-25T14:19:45.000Z","dependencies_parsed_at":"2023-10-29T11:26:25.182Z","dependency_job_id":"5bfc2c70-b902-47d4-a82e-1c3fa9f1e0cf","html_url":"https://github.com/apache/pdfbox","commit_stats":{"total_commits":11996,"total_committers":20,"mean_commits":599.8,"dds":"0.38737912637545846","last_synced_commit":"a8359d2bed9213298b84fc2e7691ac426d6ef7a9"},"previous_names":[],"tags_count":77,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fpdfbox","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fpdfbox/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fpdfbox/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/apache%2Fpdfbox/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apache","download_url":"https://codeload.github.com/apache/pdfbox/tar.gz/refs/heads/trunk","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251564106,"owners_count":21609844,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["content","java","library","pdfbox"],"created_at":"2024-07-31T19:00:31.514Z","updated_at":"2025-04-29T18:49:30.240Z","avatar_url":"https://github.com/apache.png","language":"Java","funding_links":[],"categories":["Java","工具","Resources"],"sub_categories":["PDF"],"readme":"\u003c!---\n  Licensed to the Apache Software Foundation (ASF) under one or more\n  contributor license agreements.  See the NOTICE file distributed with\n  this work for additional information regarding copyright ownership.\n  The ASF licenses this file to You under the Apache License, Version 2.0\n  (the \"License\"); you may not use this file except in compliance with\n  the License.  
You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n  Unless required by applicable law or agreed to in writing, software\n  distributed under the License is distributed on an \"AS IS\" BASIS,\n  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n  See the License for the specific language governing permissions and\n  limitations under the License.\n---\u003e\n\n[![codeql java](https://github.com/apache/pdfbox/actions/workflows/codeql-analysis.yml/badge.svg)](https://github.com/apache/pdfbox/actions/workflows/codeql-analysis.yml/badge.svg)\n \nApache PDFBox\n===================================================\n\nThe [Apache PDFBox](https://pdfbox.apache.org/) library is an open source Java tool for working with PDF \ndocuments. This project allows creation of new PDF documents, manipulation \nof existing documents and the ability to extract content from documents.\nPDFBox also includes several command line utilities. PDFBox is published\nunder the Apache License, Version 2.0.\n\nPDFBox is a project of the [Apache Software Foundation](https://www.apache.org/).\n\nBinary Downloads\n----------------\n\nYou can download binary versions for releases currently under development or older\nreleases from our [Download Page](https://pdfbox.apache.org/download.cgi).\n\nBuild\n-----\n\nYou need Java 11 (or higher) and [Maven 3](https://maven.apache.org/) to\nbuild PDFBox. The recommended build command is:\n\n    mvn clean install\n\nThe default build will compile the Java sources and package the binary\nclasses into jar packages. See the Maven documentation for all the\nother available build options.\n\nContribute\n----------\n\nThere are various ways to help us improve PDFBox. 
\n\n- look at the [Issue Tracker](https://issues.apache.org/jira/browse/PDFBOX) to help us fix bugs.\n- answer questions on our [Users Mailing List](https://pdfbox.apache.org/mailinglists.html \"Subscribe to Mailing List\").\n- help us enhance the [Examples](https://svn.apache.org/repos/asf/pdfbox/trunk/examples/)\n- help us enhance the [PDFBox Documentation](https://gitbox.apache.org/repos/asf/pdfbox-docs),\nalso available on [GitHub](https://github.com/apache/pdfbox-docs). \n\nSupport\n-------\n\n**Please follow the guidelines at our [Support Page](https://pdfbox.apache.org/support.html).**\n\nIf you have questions about how to use PDFBox, do ask on the\n[Users Mailing List](https://pdfbox.apache.org/mailinglists.html \"Subscribe to Mailing List\").\nThis will get you help from the entire community.\n\nThe PDFBox examples and the test code in the sources will also provide additional information.\n\nAnd there are additional resources available on sites such as\n[Stack Overflow](https://stackoverflow.com/search?q=pdfbox \"Stack Overflow\").\n\nIf you are sure you have found a bug then please report the issue in our \n[Issue Tracker](https://issues.apache.org/jira/browse/PDFBOX). \n\nKnown Limitations and Problems\n------------------------------\n\nSee the [Issue Tracker](https://issues.apache.org/jira/browse/PDFBOX) for\nthe full list of known issues and requested features. Some of the more\ncommon issues are:\n\n1. You get text like \"G38G43G36G51G5\" instead of what you expect when you are\n   extracting text. This is because the characters are a meaningless internal\n   encoding that points to glyphs that are embedded in the PDF document. The\n   only way to access the text is to use OCR. This may be a future\n   enhancement.\n\n2. You get an error message like `java.io.IOException: Can't handle font width`.\n   This MIGHT be due to the fact that you don't have the\n   **org/apache/pdfbox/resources** directory in your classpath. 
The easiest\n   solution is to include the **apache-pdfbox-x.x.x.jar** in your classpath.\n\n3. You get text that has the correct characters, but in the wrong\n   order.  This might be because you have not enabled sorting.  The text\n   in PDF files is stored in chunks and the chunks do not need to be stored \n   in the order that they are displayed on a page.  By default, PDFBox does \n   not sort the text.\n\nLicense (see also [LICENSE.txt](https://github.com/apache/pdfbox/blob/trunk/LICENSE.txt))\n------------------------------\n\nCollective work: Copyright 2015 The Apache Software Foundation.\n\nLicensed to the Apache Software Foundation (ASF) under one or more\ncontributor license agreements.  See the NOTICE file distributed with\nthis work for additional information regarding copyright ownership.\nThe ASF licenses this file to You under the Apache License, Version 2.0\n(the \"License\"); you may not use this file except in compliance with\nthe License.  You may obtain a copy of the License at\n\n     https://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n\nExport control\n--------------\n\nThis distribution includes cryptographic software.  The country in  which\nyou currently reside may have restrictions on the import,  possession, use,\nand/or re-export to another country, of encryption software.  BEFORE using\nany encryption software, please  check your country's laws, regulations and\npolicies concerning the import, possession, or use, and re-export of\nencryption software, to  see if this is permitted.  See\n\u003chttps://www.wassenaar.org/\u003e for more information.\n\nThe U.S. 
Government Department of Commerce, Bureau of Industry and\nSecurity (BIS), has classified this software as Export Commodity Control\nNumber (ECCN) 5D002.C.1, which includes information security software using\nor performing cryptographic functions with asymmetric algorithms.  The form\nand manner of this Apache Software Foundation distribution makes it eligible\nfor export under the License Exception ENC Technology Software Unrestricted\n(TSU) exception (see the BIS Export Administration Regulations, Section\n740.13) for both object code and source code.\n\nThe following provides more details on the included cryptographic software:\n\n**Apache PDFBox uses the Java Cryptography Architecture (JCA) and the\nBouncy Castle libraries for handling encryption in PDF documents.**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fpdfbox","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapache%2Fpdfbox","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fpdfbox/lists"}