https://github.com/almogtavor/ppt-to-pdf
- Host: GitHub
- URL: https://github.com/almogtavor/ppt-to-pdf
- Owner: almogtavor
- Created: 2025-04-06T23:01:49.000Z (7 months ago)
- Default Branch: master
- Last Pushed: 2025-05-15T05:06:05.000Z (5 months ago)
- Last Synced: 2025-05-25T08:03:59.964Z (5 months ago)
- Language: Python
- Homepage: https://ppt-to-pdf.streamlit.app/
- Size: 43.9 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
# PPT/PDF to Multi-Slide PDF Converter
A tool to convert PowerPoint presentations and PDFs into multi-slide PDFs with customizable layouts.
## Features
- Convert PowerPoint (.ppt, .pptx) and PDF files to multi-slide PDFs
- Customize layout with adjustable slides per row, gaps, and margins
- Combine multiple files into a single PDF
- Option to start each PDF's slides on a new page
- Support for right-to-left (RTL) layout for languages like Arabic and Hebrew
- Smart filtering of progressive build slides (optional)
- Modern web interface with drag-and-drop support
- Layout page showing the order of PDFs in the combined output
## Requirements
- Python 3.7+
- Poppler (for PDF conversion)
- Ghostscript (gswin64c) (for PDF processing)
- Tesseract OCR (only required if using the OCR option)
### Installing Dependencies
1. Install Python packages:
```bash
pip install -r requirements.txt
```
2. Install Poppler:
   - Windows: Download from [poppler releases](https://github.com/oschwartz10612/poppler-windows/releases/)
   - Extract to a location (e.g., `C:\Program Files\poppler-23.11.0`)
   - Add the bin folder to your PATH (e.g., `C:\Program Files\poppler-23.11.0\Library\bin`)
3. Install Ghostscript:
   - Download from [Ghostscript releases](https://github.com/ArtifexSoftware/ghostpdl-downloads/releases)
   - Run the installer
   - Make sure to check "Add to PATH" during installation
   - Restart your terminal/PowerShell window for the PATH changes to take effect
4. Install Tesseract OCR (only needed if using the OCR option):
   - Visit [Tesseract OCR GitHub](https://github.com/UB-Mannheim/tesseract/wiki)
   - Download the installer for your system (64-bit Windows: `tesseract-ocr-w64-setup-v5.3.3.20231005.exe`)
   - Run the installer and follow these steps:
     - Accept the license agreement
     - Choose "Install for all users" if you have admin rights
     - Make sure to check the box that says "Add to PATH" during installation
     - Complete the installation
   - Restart your terminal/PowerShell window for the PATH changes to take effect
## Usage
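Before running either interface, it can help to confirm that the external tools from the installation steps are actually reachable on your PATH. The following is a minimal sketch, not part of the project: the executable names are assumptions (`pdftoppm` ships with Poppler, `gswin64c` is the Ghostscript binary named in the requirements, `tesseract` is the Tesseract CLI), and the tool itself may call different binaries.

```python
import shutil

# Assumed executable names; adjust them if your installation differs
# (for example, Ghostscript is usually `gs` on Linux and macOS).
REQUIRED_TOOLS = {
    "Poppler": "pdftoppm",
    "Ghostscript": "gswin64c",
    "Tesseract OCR (optional)": "tesseract",
}


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of dependencies whose executable is not on PATH."""
    return [name for name, exe in tools.items() if which(exe) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools were found on PATH.")
```

If a tool is reported as missing right after installation, restarting the terminal (as noted in the installation steps) is usually the first thing to try.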
### Web Interface
1. Start the Streamlit app:
```bash
streamlit run app.py
```
2. Open your browser and go to `http://localhost:8501`
3. Use the web interface to:
   - Upload files
   - Adjust layout settings
   - Choose output options
   - Download the converted PDF
### Command Line
```bash
# Optional: create a virtual environment (Windows py launcher, Python 3.9)
py -3.9 -m venv venv

# Convert a single file
python main.py input.pdf output.pdf --slides_per_row 3

# Convert all files in a directory
python main.py input_folder output_folder --slides_per_row 3

# Combine multiple files into one PDF (each PDF's slides start on a new page by default)
python main.py input_folder combined.pdf --single_file --slides_per_row 3

# Combine files without starting each PDF's slides on a new page
python main.py input_folder combined.pdf --single_file --slides_per_row 3 --no_new_page
```
## Options
- `--slides_per_row`: Number of slides per row (default: 2)
- `--gap`: Space between slides in points (default: 10)
- `--margin`: Margin on sides and bottom in points (default: 20)
- `--top_margin`: Margin at the top in points (default: 0)
- `--single_file`: Combine all slides into a single PDF
- `--no_new_page`: Disable starting each PDF's slides on a new page (only applies when `--single_file` is used)
- `--rtl`: Enable right-to-left layout for RTL languages
- `--filter-progressive`: Remove slides that are just progressive builds of the next slide (disabled by default)
- `--ocr`: Add searchable text layer to the PDF (enabled by default)
## Notes
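As a rough illustration of how `--slides_per_row`, `--gap`, and `--margin` interact, the sketch below computes the width left for each slide and where each slide would start horizontally. It is an assumption about the arithmetic, not the project's actual code: it supposes the margin applies to both sides of the page, the gap applies between adjacent slides, and `--rtl` simply reverses the column order. The function names and the 595-point page width (roughly A4) are hypothetical.

```python
def slide_width(page_width, slides_per_row=2, gap=10, margin=20):
    """Width in points available to each slide in a row (assumed formula)."""
    usable = page_width - 2 * margin - (slides_per_row - 1) * gap
    if usable <= 0:
        raise ValueError("margins and gaps leave no room for slides")
    return usable / slides_per_row


def slide_x_positions(page_width, slides_per_row=2, gap=10, margin=20, rtl=False):
    """Left edge in points of each slide, listed in reading order."""
    width = slide_width(page_width, slides_per_row, gap, margin)
    xs = [margin + col * (width + gap) for col in range(slides_per_row)]
    # Assumption: RTL places the first slide in the rightmost column.
    return xs[::-1] if rtl else xs


if __name__ == "__main__":
    print(slide_width(595))                  # 272.5
    print(slide_x_positions(595))            # [20.0, 302.5]
    print(slide_x_positions(595, rtl=True))  # [302.5, 20.0]
```

Under these assumptions, raising the slides per row from 2 to 3 on the same page shrinks each slide from 272.5 points wide to about 178 points, which is why larger values produce smaller slides.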
- When combining files into a single PDF, each PDF's slides will start on a new page by default
- The layout page at the beginning of the combined PDF shows the order of PDFs
- For best results, use similar-sized slides
- Adjust the margins and gaps to optimize the layout
- The "Slides per Row" setting affects the size of each slide
## License
MIT License