https://github.com/veralvx/epubparser
Parses epub files (wrapper around ebooklib package). Extract chapter titles and their corresponding texts. Can also extract the cover image.
epub text-processing
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/veralvx/epubparser
- Owner: veralvx
- License: mit
- Created: 2025-02-23T18:54:54.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-03-06T21:35:33.000Z (3 months ago)
- Last Synced: 2025-04-16T01:57:36.542Z (about 1 month ago)
- Topics: epub, text-processing
- Language: Python
- Homepage:
- Size: 20.5 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# EpubParser
Parses EPUB files (a wrapper around the `ebooklib` package). Extracts chapter titles and their corresponding text, and can also extract the cover image.
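
For a rough idea of what pulling a chapter title and its text out of an EPUB chapter involves, here is a minimal standard-library sketch. It is not epubparser's code and does not use `ebooklib`. The `ChapterTextExtractor` class and the `title_and_text` function are hypothetical names, and treating the first `h1`-`h3` heading as the chapter title is an assumption made for illustration.

```python
from html.parser import HTMLParser


class ChapterTextExtractor(HTMLParser):
    """Hypothetical helper: first heading becomes the title, the rest is text."""

    HEADINGS = {"h1", "h2", "h3"}       # assumption: the title lives in one of these
    SKIPPED = {"head", "script", "style"}
    BLOCKS = {"p", "div", "li", "br"}   # a line break is emitted when these close

    def __init__(self):
        super().__init__()
        self.title = None
        self.parts = []
        self._heading_parts = []
        self._in_heading = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.HEADINGS and self.title is None:
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.HEADINGS and self._in_heading:
            self._in_heading = False
            # Collapse runs of whitespace inside the heading.
            self.title = " ".join("".join(self._heading_parts).split())
        elif tag in self.BLOCKS:
            self.parts.append("\n")

    def handle_data(self, data):
        if self._skip_depth:
            return
        if self._in_heading:
            self._heading_parts.append(data)
        else:
            self.parts.append(data)


def title_and_text(xhtml):
    """Return (title, text) for one chapter's XHTML string."""
    parser = ChapterTextExtractor()
    parser.feed(xhtml)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return parser.title, "\n".join(line for line in lines if line)
```

Calling `title_and_text(chapter_xhtml)` on the markup of a single chapter returns the heading and the paragraph text, one paragraph per line.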
## Installation
You can install **epubparser** via pip:
```bash
pip install epubparser
```

## Usage
```
epubparser input.epub output.txt
```

You can pass the following optional arguments:
`--skip-toc`
Skip chapters whose titles match common Table of Contents variants.

`--skip-license`
Skip chapters whose titles match common License variants.

The two arguments above may not be perfect, since the matching depends on regular expressions and on the book's language.

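
To show why regex-based title matching is sensitive to language, here is a minimal sketch of how such a filter might work. The patterns and the `should_skip` function are assumptions for illustration only; the expressions epubparser actually uses may differ.

```python
import re

# Assumed English-only patterns; not taken from epubparser's source.
TOC_PATTERN = re.compile(r"^\s*(table\s+of\s+contents|contents|toc)\s*$", re.IGNORECASE)
LICENSE_PATTERN = re.compile(r"\blicen[cs]e\b", re.IGNORECASE)


def should_skip(title, skip_toc=False, skip_license=False):
    """Return True if a chapter title matches an enabled skip pattern."""
    if skip_toc and TOC_PATTERN.match(title):
        return True
    if skip_license and LICENSE_PATTERN.search(title):
        return True
    return False
```

With these patterns, `should_skip("Table of Contents", skip_toc=True)` is true, while a German title such as "Inhaltsverzeichnis" is not matched, which is the kind of gap the caveat above describes.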
`--extract-cover`
Extracts the cover image to the `covers` directory. If this argument is passed, the output file must be specified as `None`.

## License
This project is licensed under the MIT License.