{"id":25020750,"url":"https://github.com/marioszocs/pdf-splitter","last_synced_at":"2025-07-04T06:07:57.986Z","repository":{"id":158978956,"uuid":"365547968","full_name":"marioszocs/pdf-splitter","owner":"marioszocs","description":"Split PDF files by size, by page, and extract email addresses","archived":false,"fork":false,"pushed_at":"2025-01-28T19:38:42.000Z","size":13,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-30T10:29:45.654Z","etag":null,"topics":["itextpdf","java","pdf","pdfbox","pdfextraction","pdfsplitter"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/marioszocs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-05-08T15:25:52.000Z","updated_at":"2025-01-28T19:38:46.000Z","dependencies_parsed_at":"2025-03-30T10:36:24.180Z","dependency_job_id":null,"html_url":"https://github.com/marioszocs/pdf-splitter","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/marioszocs/pdf-splitter","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marioszocs%2Fpdf-splitter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marioszocs%2Fpdf-splitter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marioszocs%2Fpdf-splitter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marioszocs%2Fpdf-splitter/manifests","owner_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/marioszocs","download_url":"https://codeload.github.com/marioszocs/pdf-splitter/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marioszocs%2Fpdf-splitter/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263457194,"owners_count":23469293,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["itextpdf","java","pdf","pdfbox","pdfextraction","pdfsplitter"],"created_at":"2025-02-05T12:17:30.570Z","updated_at":"2025-07-04T06:07:57.957Z","avatar_url":"https://github.com/marioszocs.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PDF Splitter\n\nThe **PDF Splitter** is a desktop application built in Java for splitting PDF files by size, by pages, and extracting email addresses from PDF documents. This project utilizes the **PDFBox** and **iTextPDF** libraries to perform these operations effectively.\n\n---\n\n## Features\n- **Split PDFs by size**: Break large PDF files into smaller chunks of a specified size.\n- **Split PDFs by pages**: Divide a PDF into multiple parts after a given number of pages.\n- **Extract email addresses**: Retrieve and save all email addresses found in a PDF document to a `.txt` file.\n\n---\n\n## How It Works\n### Main Operations:\n1. **Split PDF After Specific Pages**:\n   - Select the number of pages after which the PDF should be split.\n   - The resulting PDFs will be saved in the output folder.\n\n2. 
**Split PDF by Specific Size**:\n   - Specify the maximum allowable size for each split PDF in kilobytes.\n   - The application will create multiple PDFs, ensuring each part adheres to the size limit.\n\n3. **Extract Email Addresses**:\n   - Scans the text within PDF files for valid email addresses.\n   - Extracted emails are saved in a `.txt` file for easy access.\n\n---\n\n## Requirements\n- **Java 8** or higher.\n- **Maven** for dependency management.\n\n---\n\n## Installation\n1. Clone the repository:\n   ```bash\n   git clone https://github.com/marioszocs/pdf-splitter.git\n   cd pdf-splitter\n   ```\n2. Build the project:\n   ```bash\n   mvn clean install\n   ```\n3. Run the application:\n   ```bash\n   java -cp target/pdfsplitting-0.0.1-SNAPSHOT.jar com.pdfsplitting.Main\n   ```\n\n---\n\n## Example Screenshots\n\n### Split PDF by Size\n![PDF Split by Size](https://user-images.githubusercontent.com/11271085/117545322-89f8e200-b025-11eb-8727-b70582fe6807.PNG)\n\n### Input Selection\n![Input Example](https://user-images.githubusercontent.com/11271085/117545354-b6acf980-b025-11eb-809d-2d05f261eee2.PNG)\n\n### Output Example\n![Output Example](https://user-images.githubusercontent.com/11271085/117545364-c0cef800-b025-11eb-845a-c9dc683ef457.PNG)\n\n---\n\n## Libraries Used\n- **[Apache PDFBox](https://pdfbox.apache.org/)**: For handling PDF documents.\n- **[iTextPDF](https://itextpdf.com/)**: For advanced PDF processing.\n\n---\n\n## Project Structure\n```\nmarioszocs-pdf-splitter/\n├── pom.xml                  # Maven build configuration\n├── README.md                # Project documentation\n├── src/main/java/com/pdfsplitting/\n│   ├── Main.java            # Entry point of the application\n│   ├── PDFFileOperations.java # Interface for PDF operations\n│   ├── PDFFileOperationsImp.java # Implementation of PDF operations\n│   └── PdfUtilities.java    # Utility methods for PDF handling\n└── .gitignore               # Ignored 
files\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarioszocs%2Fpdf-splitter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmarioszocs%2Fpdf-splitter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarioszocs%2Fpdf-splitter/lists"}