https://github.com/veralvx/epubparser

Parses epub files (wrapper around ebooklib package). Extract chapter titles and their corresponding texts. Can also extract the cover image.

Host: GitHub
URL: https://github.com/veralvx/epubparser
Owner: veralvx
License: mit
Created: 2025-02-23T18:54:54.000Z (5 months ago)
Default Branch: main
Last Pushed: 2025-03-06T21:35:33.000Z (4 months ago)
Last Synced: 2025-04-16T01:57:36.542Z (3 months ago)
Topics: epub, text-processing
Language: Python
Homepage:
Size: 20.5 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE

# EpubParser

Parses EPUB files (a wrapper around the `ebooklib` package). It extracts chapter titles and their corresponding text, and can also extract the cover image.
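To give a rough idea of what a wrapper like this involves, here is a minimal sketch that reads the XHTML documents of an EPUB with `ebooklib` and pulls a title and plain text out of each one. It is not epubparser's actual implementation: the names `ChapterTextParser`, `html_to_chapter`, and `read_chapters` are hypothetical, and taking the first `h1`/`h2`/`h3` heading as the chapter title is an assumption made for illustration.

```python
from html.parser import HTMLParser


class ChapterTextParser(HTMLParser):
    """Collects the first heading and the visible text of one XHTML document."""

    HEADING_TAGS = {"h1", "h2", "h3"}
    SKIPPED_TAGS = {"head", "script", "style"}

    def __init__(self):
        super().__init__()
        self.title = None
        self.chunks = []
        self._heading_parts = None  # a list while inside the first heading
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in self.HEADING_TAGS and self.title is None:
            self._heading_parts = []

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.HEADING_TAGS and self._heading_parts is not None:
            self.title = " ".join(self._heading_parts)
            self._heading_parts = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._heading_parts is not None:
            self._heading_parts.append(text)  # heading text becomes the title
        else:
            self.chunks.append(text)  # everything else is body text


def html_to_chapter(html):
    """Return (title, text) for one XHTML document given as a string."""
    parser = ChapterTextParser()
    parser.feed(html)
    parser.close()
    return parser.title, "\n".join(parser.chunks)


def read_chapters(epub_path):
    """Yield (title, text) for each document item in the EPUB."""
    # Imported lazily so html_to_chapter stays usable without ebooklib.
    import ebooklib
    from ebooklib import epub

    book = epub.read_epub(epub_path)
    for item in book.get_items_of_type(ebooklib.ITEM_DOCUMENT):
        html = item.get_content().decode("utf-8", errors="replace")
        title, text = html_to_chapter(html)
        # Fall back to the item's file name when no heading was found.
        yield title or item.get_name(), text
```

Real books vary widely in how they mark up chapter headings, so a heuristic like this one will miss some titles.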

## Installation

You can install **epubparser** via pip:

```bash
pip install epubparser
```

## Usage

```bash
epubparser input.epub output.txt
```

The command accepts the following optional arguments:

`--skip-toc`
Skip chapters whose titles match common Table of Contents variants.

`--skip-license`
Skip chapters whose titles match common License variants.

The two arguments above may not catch every case, since they depend on regular expressions and on the book's language.
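The sketch below illustrates why title-based skipping is language-dependent. The patterns and the `filter_chapters` helper are hypothetical and written only for this example; the regular expressions epubparser actually uses may differ.

```python
import re

# Hypothetical English-only patterns, for illustration only.
TOC_PATTERN = re.compile(
    r"^\s*(table\s+of\s+contents|contents|toc)\s*$", re.IGNORECASE
)
LICENSE_PATTERN = re.compile(
    r"^\s*(license|licence|copyright(\s+notice)?)\s*$", re.IGNORECASE
)


def filter_chapters(chapters, skip_toc=False, skip_license=False):
    """Return the (title, text) pairs whose titles are not skipped."""
    kept = []
    for title, text in chapters:
        if skip_toc and TOC_PATTERN.match(title):
            continue
        if skip_license and LICENSE_PATTERN.match(title):
            continue
        kept.append((title, text))
    return kept


chapters = [
    ("Table of Contents", "..."),
    ("Chapter 1", "It was a dark night."),
    ("LICENSE", "..."),
    ("Inhaltsverzeichnis", "..."),  # German ToC title, not matched above
]

kept_titles = [
    title
    for title, _ in filter_chapters(chapters, skip_toc=True, skip_license=True)
]
```

With these patterns the English "Table of Contents" and "LICENSE" entries are dropped, while the German table of contents slips through, which is the kind of miss the note above warns about.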

`--extract-cover`
Extracts the cover image to the `covers` directory. If this argument is passed, the output file must be specified as `None`.
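If you want to drive the command from a Python script, the rules above can be sketched as a small helper that builds the argument list. The helper names `build_command` and `run_epubparser` are hypothetical. Placing the flags after the two positional arguments and passing the literal string `None` as the output are assumptions based on the usage line and the `--extract-cover` note; whether `--extract-cover` can be combined with the skip flags is not stated here.

```python
import subprocess


def build_command(input_path, output_path=None, *, skip_toc=False,
                  skip_license=False, extract_cover=False):
    """Build the argument list for one epubparser invocation."""
    if extract_cover:
        if output_path is not None:
            raise ValueError("output_path must be omitted with extract_cover")
        # The output file must be given as None when extracting the cover.
        output_arg = "None"
    elif output_path is None:
        raise ValueError("output_path is required unless extract_cover is set")
    else:
        output_arg = str(output_path)

    command = ["epubparser", str(input_path), output_arg]
    if skip_toc:
        command.append("--skip-toc")
    if skip_license:
        command.append("--skip-license")
    if extract_cover:
        command.append("--extract-cover")
    return command


def run_epubparser(*args, **kwargs):
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_command(*args, **kwargs), check=True)
```

For example, `build_command("book.epub", extract_cover=True)` yields the equivalent of `epubparser book.epub None --extract-cover`. `run_epubparser` requires the `epubparser` command to be installed and on your `PATH`.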

## License
This project is licensed under the MIT License.
