{"id":29796427,"url":"https://github.com/euler16/pdftools","last_synced_at":"2025-09-10T21:37:03.800Z","repository":{"id":303718050,"uuid":"1016443905","full_name":"euler16/pdfTools","owner":"euler16","description":null,"archived":false,"fork":false,"pushed_at":"2025-07-09T03:45:03.000Z","size":20,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-09T04:26:36.683Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/euler16.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-09T03:05:36.000Z","updated_at":"2025-07-09T03:45:06.000Z","dependencies_parsed_at":"2025-07-09T04:26:50.897Z","dependency_job_id":"574927c4-e069-48ed-8ecf-2337a3f71cdc","html_url":"https://github.com/euler16/pdfTools","commit_stats":null,"previous_names":["euler16/pdftools"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/euler16/pdfTools","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euler16%2FpdfTools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euler16%2FpdfTools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euler16%2FpdfTools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euler16%2FpdfTools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/euler16","download_url":"https://codeload.github.co
m/euler16/pdfTools/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euler16%2FpdfTools/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267464522,"owners_count":24091505,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-28T02:00:09.689Z","response_time":68,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-28T05:08:55.231Z","updated_at":"2025-07-28T05:08:56.268Z","avatar_url":"https://github.com/euler16.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PDF Tools Suite\n\n```\n██████╗ ██████╗ ███████╗    ████████╗ ██████╗  ██████╗ ██╗     ███████╗\n██╔══██╗██╔══██╗██╔════╝    ╚══██╔══╝██╔═══██╗██╔═══██╗██║     ██╔════╝\n██████╔╝██║  ██║█████╗         ██║   ██║   ██║██║   ██║██║     ███████╗\n██╔═══╝ ██║  ██║██╔══╝         ██║   ██║   ██║██║   ██║██║     ╚════██║\n██║     ██████╔╝██║            ██║   ╚██████╔╝╚██████╔╝███████╗███████║\n╚═╝     ╚═════╝ ╚═╝            ╚═╝    ╚═════╝  ╚═════╝ ╚══════╝╚══════╝\n                                                                        \n   ███████╗██╗   ██╗██╗████████╗███████╗                               \n   ██╔════╝██║   ██║██║╚══██╔══╝██╔════╝                               \n   ███████╗██║   ██║██║   ██║   █████╗                                 \n   ╚════██║██║   ██║██║   ██║   ██╔══╝                 
                \n   ███████║╚██████╔╝██║   ██║   ███████╗                               \n   ╚══════╝ ╚═════╝ ╚═╝   ╚═╝   ╚══════╝                               \n```\n\nA comprehensive collection of Python tools for PDF manipulation, including compression, splitting, and merging operations.\n\n## 📋 Overview\n\nThis suite provides three powerful PDF utilities:\n\n- **PDF Compressor**: Reduce PDF file sizes using Ghostscript with various quality presets\n- **PDF Splitter**: Split PDFs into individual pages or extract specific page ranges\n- **PDF Merger**: Combine multiple PDFs and images into a single PDF file\n\n## 🛠️ Tools Included\n\n### 1. PDF Compressor (`pdf_compressor/`)\nCompress PDF files to reduce their size while maintaining quality control.\n\n**Features:**\n- Multiple compression levels (screen, ebook, printer, prepress, default)\n- File size comparison with reduction percentage\n- Batch processing support\n- Human-readable file size display\n\n### 2. PDF Splitter (`pdf_splitter/`)\nSplit PDF files into separate pages or extract specific page ranges.\n\n**Features:**\n- One-file-per-page splitting\n- Custom page range extraction\n- Flexible range syntax (e.g., 1-3,5,7-9)\n- Overwrite protection with optional force mode\n\n### 3. PDF Merger (`pdf_merger/`)\nMerge multiple PDF files and images into a single PDF document.\n\n**Features:**\n- Supports PDF, JPG, JPEG, PNG, TIF, and TIFF files\n- Lexicographic file ordering\n- High-resolution image conversion (300 DPI)\n- Memory-efficient processing\n\n## 📦 Installation\n\n### Prerequisites\n\n1. **Python 3.7+** is required\n2. 
**Ghostscript** (for PDF compression):\n   ```bash\n   # macOS\n   brew install ghostscript\n   \n   # Ubuntu/Debian\n   sudo apt-get install ghostscript\n   \n   # Windows\n   # Download from https://www.ghostscript.com/download/gsdnld.html\n   ```\n\n### Dependencies\n\nInstall the required Python packages:\n\n```bash\npip install pypdf PyPDF2 Pillow\n```\n\nOr create a virtual environment:\n\n```bash\npython -m venv venv\nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate\npip install pypdf PyPDF2 Pillow\n```\n\n## 🚀 Usage\n\n### PDF Compressor\n\n```bash\ncd pdf_compressor/\n\n# Basic compression (default: ebook quality)\npython compress_pdf2.py input.pdf\n\n# Specify compression level\npython compress_pdf2.py input.pdf -c screen\n\n# Custom output path\npython compress_pdf2.py input.pdf -o compressed_output.pdf\n\n# Force overwrite existing file\npython compress_pdf2.py input.pdf -f\n\n# View all options\npython compress_pdf2.py -h\n```\n\n**Compression Levels:**\n- `screen`: Screen-view-only quality, 72 dpi images\n- `ebook`: Medium quality, 150 dpi images (used when `-c` is omitted)\n- `printer`: High quality, 300 dpi images\n- `prepress`: High quality preserving color, 300 dpi images\n- `default`: General-purpose output suited to a wide range of uses, possibly at the cost of a larger file\n\n### PDF Splitter\n\n```bash\ncd pdf_splitter/\n\n# Split into individual pages\npython split_pdf.py document.pdf output_directory/\n\n# Extract specific page ranges\npython split_pdf.py document.pdf output_directory/ --ranges 1-3,5,7-9\n\n# Overwrite existing files\npython split_pdf.py document.pdf output_directory/ --overwrite\n\n# View all options\npython split_pdf.py -h\n```\n\n**Range Syntax:**\n- `1-3`: Pages 1 through 3\n- `5`: Single page 5\n- `7-9`: Pages 7 through 9\n- `1-3,5,7-9`: Multiple ranges combined\n\n### PDF Merger\n\n```bash\ncd pdf_merger/\n\n# Merge all files in a directory\npython merge_to_pdf.py input_directory/ merged_output.pdf\n\n# Use default output name (merged.pdf)\npython merge_to_pdf.py input_directory/\n\n# 
View all options\npython merge_to_pdf.py -h\n```\n\n**Supported File Types:**\n- PDF files (`.pdf`)\n- Image files (`.jpg`, `.jpeg`, `.png`, `.tif`, `.tiff`)\n\n## 📁 Project Structure\n\n```\npdf-tools-suite/\n├── README.md\n├── pdf_compressor/\n│   ├── compress_pdf2.py\n│   └── [sample PDFs]\n├── pdf_merger/\n│   ├── merge_to_pdf.py\n│   ├── input_dir/\n│   └── [output PDFs]\n└── pdf_splitter/\n    ├── split_pdf.py\n    ├── splits/\n    └── [sample PDFs]\n```\n\n## 💡 Examples\n\n### Example 1: Compress a large PDF\n```bash\ncd pdf_compressor/\npython compress_pdf2.py large_document.pdf -c ebook\n```\n\n### Example 2: Extract specific pages\n```bash\ncd pdf_splitter/\npython split_pdf.py report.pdf extracted_pages/ --ranges 1-5,10,15-20\n```\n\n### Example 3: Merge images and PDFs\n```bash\ncd pdf_merger/\n# Place your PDFs and images in input_dir/\npython merge_to_pdf.py input_dir/ final_document.pdf\n```\n\n## 🔧 Troubleshooting\n\n### Common Issues\n\n1. **Ghostscript not found**: Make sure Ghostscript is installed and in your PATH\n2. **Permission errors**: Check file permissions and write access to output directories\n3. **Memory issues**: For large files, ensure sufficient system memory\n4. **Corrupted PDFs**: Some PDFs may need repair before processing\n\n### Error Messages\n\n- `FileNotFoundError`: Input file doesn't exist or path is incorrect\n- `FileExistsError`: Output file already exists (use `-f` or `--overwrite`)\n- `subprocess.CalledProcessError`: Ghostscript execution failed\n\n## 🤝 Contributing\n\nContributions are welcome! 
Please feel free to submit a Pull Request.\n\n## 📞 Support\n\nFor issues and questions, please open an issue in the project repository.\n\n---\n\n**Happy PDF Processing!** 🎉\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feuler16%2Fpdftools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feuler16%2Fpdftools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feuler16%2Fpdftools/lists"}