https://github.com/bjornmelin/pdfusion
A lightweight Python utility for effortlessly merging multiple PDF files into a single document.
- Host: GitHub
- URL: https://github.com/bjornmelin/pdfusion
- Owner: BjornMelin
- License: mit
- Created: 2024-11-23T18:00:03.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-11-24T00:40:22.000Z (11 months ago)
- Last Synced: 2025-02-01T13:42:27.493Z (8 months ago)
- Topics: automation, batch-processing, cli, command-line-tool, document-management, document-processing, file-management, pdf, pdf-manipulation, pdf-merger, pdf-tools, pypdf2, python, python-library, utilities
- Language: Python
- Homepage:
- Size: 40 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# PDFusion
A lightweight Python utility for effortlessly merging multiple PDF files into a single document.

## Table of Contents
- [Description](#description)
  - [Key Features](#key-features)
- [Repository Structure](#repository-structure)
- [Installation](#installation)
  - [For Users](#for-users)
  - [For Developers](#for-developers)
- [Usage](#usage)
  - [Command Line Interface](#command-line-interface)
  - [Python API](#python-api)
- [Development](#development)
  - [Running Tests](#running-tests)
- [Contributing](#contributing)
- [Author](#author)
- [License](#license)
- [Star History](#star-history)
- [Acknowledgments](#acknowledgments)

## Description
PDFusion is a simple yet powerful command-line tool that makes it easy to combine multiple PDF files into a single document while preserving the original quality. Perfect for combining reports, consolidating documentation, or organizing digital paperwork.
### Key Features
- Merge all PDFs in a directory with a single command
- Automatic alphabetical ordering of files
- Timestamp-based output naming option
- Both CLI and Python API support
- Clear progress feedback and error handling
- Maintains original PDF quality
- Detailed logging of the merge process
- Type hints with full mypy support
- Comprehensive test coverage (>90%)
- Performance benchmarks included
- Custom exception handling
- Supports Python 3.11+

## Repository Structure
```mermaid
graph TD
A[pdfusion/] --> B[pdfusion/]
A --> C[tests/]
A --> D[examples/]
A --> E[Documentation]
B --> B1[__init__.py]
B --> B2[exceptions.py]
B --> B3[logging.py]
B --> B4[pdfusion.py]
B --> B5[py.typed]
C --> C1[__init__.py]
C --> C2[conftest.py]
C --> C3[test files]
D --> D1[basic_usage.py]
E --> E1[README.md]
E --> E2[LICENSE]
E --> E3[CONTRIBUTING.md]
E --> E4[Configuration Files]
```

## Installation
### For Users
```bash
pip install pdfusion
```

### For Developers
```mermaid
graph LR
A[Clone Repository] --> B[Create Virtual Environment]
B --> C[Activate Environment]
C --> D[Install Dependencies]
D --> E[Ready to Develop!]
```

1. Clone the repository:
   ```bash
   git clone https://github.com/BjornMelin/pdfusion.git
   cd pdfusion
   ```

2. Create a virtual environment:
   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: .\venv\Scripts\activate
   ```

   > **Note:** You can also use `virtualenv` instead of `venv`. See the [Virtual Environment Setup Guide](docs/virtualenv-setup.md) for more details.

3. Install development dependencies:
   ```bash
   pip install -r requirements-dev.txt
   ```

## Usage
### Quick Start Guide
1. **Install PDFusion**
   ```bash
   pip install pdfusion
   ```

2. **Prepare Your PDFs**
   - Create a directory with your PDF files
   - Example structure:

   ```plaintext
   my_pdfs/
   ├── document1.pdf
   ├── document2.pdf
   └── document3.pdf
   ```

3. **Run PDFusion** from the command line or from Python, as described in the sections below.
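
Before running the merge, it can help to preview which files a directory contains and the order an alphabetical sort would give them. The sketch below is a standalone helper, not part of the PDFusion API: the name `list_pdfs`, the case-insensitive `.pdf` match, and the case-insensitive sort are all assumptions made for illustration, and PDFusion's own matching and ordering rules may differ.

```python
from pathlib import Path


def list_pdfs(directory: str) -> list[Path]:
    """Return the PDF files directly inside `directory`, sorted by name.

    Hypothetical helper for previewing a merge; it does not call PDFusion.
    """
    folder = Path(directory)
    if not folder.is_dir():
        raise NotADirectoryError(f"Not a directory: {folder}")
    # Match the .pdf extension case-insensitively and skip subdirectories.
    pdfs = [
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() == ".pdf"
    ]
    # Sort on the lowercased file name so the order is the same on every OS.
    return sorted(pdfs, key=lambda path: path.name.lower())
```

Calling `list_pdfs("my_pdfs")` on the example directory above and printing each `path.name` gives a quick check that no file is missing or misnamed before the merge is run.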
### Command Line Interface
```mermaid
graph LR
A[Input Directory] --> B[PDFusion CLI]
B --> C[Processing]
C --> D[Merged PDF]
style B fill:#f9f,stroke:#333,stroke-width:4px
```

```bash
# Basic usage
pdfusion /path/to/pdfs -o merged.pdf

# With verbose output
pdfusion /path/to/pdfs -v

# Auto timestamp filename
pdfusion /path/to/pdfs
```

#### CLI Options
- `-o, --output`: Output filename (optional)
- `-v, --verbose`: Enable verbose output
- `--version`: Show version number
- `-h, --help`: Show help message

### Python API
```python
from pdfusion import merge_pdfs

# Example 1: Basic usage
result = merge_pdfs(
    input_dir="/path/to/pdfs",
    output_file="merged.pdf"
)
print(f"Merged {result.files_merged} files into {result.output_path}")

# Example 2: With verbose output and auto timestamp
result = merge_pdfs(
    input_dir="/path/to/pdfs",
    verbose=True
)
print(f"Total pages in merged PDF: {result.total_pages}")

# Example 3: Full options
result = merge_pdfs(
    input_dir="/path/to/pdfs",
    output_file="merged.pdf",
    verbose=True,
    sort_files=True,     # Sort files alphabetically
    add_bookmarks=True   # Add bookmarks for each merged PDF
)
```

### Example Project Structure
Create a simple script `merge_my_pdfs.py`:
```python
from pdfusion import merge_pdfs
import logging

# Set up logging (optional)
logging.basicConfig(level=logging.INFO)

# Merge PDFs
try:
    result = merge_pdfs(
        input_dir="./my_pdfs",
        output_file="merged_document.pdf",
        verbose=True
    )
    print(f"Successfully merged {result.files_merged} files!")
    print(f"Output saved to: {result.output_path}")
    print(f"Total pages: {result.total_pages}")
except Exception as e:
    print(f"Error merging PDFs: {e}")
```

Run your script:
```bash
python merge_my_pdfs.py
```

### Output Format
The `merge_pdfs` function returns a result object with the following attributes:
- `files_merged`: Number of files merged
- `output_path`: Path to the merged PDF
- `total_pages`: Total number of pages in the merged PDF
- `processing_time`: Time taken to merge the PDFs

## Development
### Running Tests
```bash
# Run all tests
pytest

# Run with coverage report
pytest --cov=pdfusion

# Run performance benchmarks
pytest tests/test_pdfusion.py -v -m benchmark

# Run specific test file
pytest tests/test_pdfusion.py -v
```

## Contributing
```mermaid
graph LR
A[Fork Repository] --> B[Create Feature Branch]
B --> C[Make Changes]
C --> D[Commit Changes]
D --> E[Push to Branch]
E --> F[Open Pull Request]
style F fill:#f96,stroke:#333,stroke-width:4px
```

1. Fork the repository
2. Create your feature branch (`git checkout -b feat/version/AmazingFeature`)
3. Commit your changes (`git commit -m 'type(scope): Add some AmazingFeature'`)
4. Push to the branch (`git push origin feat/version/AmazingFeature`)
5. Open a Pull Request (`feat(scope): Add some AmazingFeature`)

## Author
### Bjorn Melin
AWS-certified Solutions Architect and Developer with expertise in cloud architecture and modern development practices. Connect with me on:
- [GitHub](https://github.com/BjornMelin)
- [LinkedIn](https://www.linkedin.com/in/bjorn-melin/)

Project Link: [https://github.com/BjornMelin/pdfusion](https://github.com/BjornMelin/pdfusion)
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Star History
[Star history chart](https://star-history.com/#bjornmelin/pdfusion&Date)
## Acknowledgments
- [Python](https://www.python.org/)
- [pypdf2](https://pypdf.readthedocs.io/en/stable/)
- [GitHub Badges](https://shields.io/)

Built with Python 3.11 + pypdf2 by Bjorn Melin