https://github.com/2talltyler/powerpoint-extractor
A simple script to extract images, text, and presenter notes from a folder full of PowerPoint files.
powerpoint python-pptx
- Host: GitHub
- URL: https://github.com/2talltyler/powerpoint-extractor
- Owner: 2TallTyler
- License: gpl-3.0
- Created: 2023-07-27T16:49:23.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-08-01T18:50:41.000Z (almost 2 years ago)
- Last Synced: 2025-04-01T02:53:38.510Z (about 2 months ago)
- Topics: powerpoint, python-pptx
- Language: Python
- Homepage:
- Size: 21.5 KB
- Stars: 9
- Watchers: 1
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# PowerPoint Extractor
A simple script to extract images, text, and presenter notes from a folder full of PowerPoint files. It uses the `python-pptx` library.

## Instructions
This script requires [python-pptx](https://pypi.org/project/python-pptx/). If you don't have it already, install it with:

```
pip install python-pptx
```

1. Clone the repository onto your local drive.
2. Copy PowerPoint files into the [input](/input) folder.
3. Run `extract.py`.

## Output
* Text will be saved to a new `text.csv` file in the root folder. This has one row per slide, with columns for the presentation name, slide number, all the text on the slide, and any presenter notes.
* Images will be saved to a new `images` folder, named sequentially with the name of the presentation.

## Development
This is a quick and dirty script I wrote for a specific project. I welcome PRs to clean up code, add features, etc.
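
For anyone planning a PR, the behavior described under Output can be sketched roughly as below. This is not the code in `extract.py`, only a minimal illustration built on `python-pptx`. The function names (`slide_rows`, `save_images`, `write_rows`, `main`), the CSV header names, and the `<presentation>_<n>.<ext>` image naming are all assumptions made for the example, and pictures nested inside grouped shapes are not handled.

```python
import csv
from pathlib import Path


def slide_rows(pptx_path):
    """Yield (presentation, slide number, text, notes) for each slide."""
    # Imported lazily so write_rows() can be used without python-pptx installed.
    from pptx import Presentation

    name = Path(pptx_path).stem
    prs = Presentation(str(pptx_path))
    for number, slide in enumerate(prs.slides, start=1):
        # Join the text of every shape on the slide that carries a text frame.
        text = "\n".join(
            shape.text_frame.text for shape in slide.shapes if shape.has_text_frame
        )
        notes = ""
        if slide.has_notes_slide and slide.notes_slide.notes_text_frame is not None:
            notes = slide.notes_slide.notes_text_frame.text
        yield name, number, text, notes


def save_images(pptx_path, out_dir):
    """Write each top-level picture to out_dir and return how many were saved."""
    from pptx import Presentation
    from pptx.enum.shapes import MSO_SHAPE_TYPE

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    name = Path(pptx_path).stem
    count = 0
    for slide in Presentation(str(pptx_path)).slides:
        for shape in slide.shapes:
            if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
                count += 1
                # Hypothetical naming scheme: <presentation>_<n>.<ext>
                target = out_dir / f"{name}_{count}.{shape.image.ext}"
                target.write_bytes(shape.image.blob)
    return count


def write_rows(rows, csv_path):
    """Write one CSV row per slide; the header names are assumed, not the script's."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["presentation", "slide", "text", "notes"])
        writer.writerows(rows)


def main(input_dir="input", image_dir="images", csv_path="text.csv"):
    """Process every .pptx file in input_dir."""
    rows = []
    for deck in sorted(Path(input_dir).glob("*.pptx")):
        rows.extend(slide_rows(deck))
        save_images(deck, image_dir)
    write_rows(rows, csv_path)
```

Calling `main()` from the repository root would read the `input` folder and produce `text.csv` and `images`, mirroring the layout described above. Keeping the CSV writing separate from the `python-pptx` calls makes the output format easy to test on its own.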