{"id":50376957,"url":"https://github.com/twn39/epub-rs","last_synced_at":"2026-05-30T10:01:51.365Z","repository":{"id":350603767,"uuid":"1207562789","full_name":"twn39/epub-rs","owner":"twn39","description":"epub-rs is an industrial-grade, highly performant EPUB 2/3 processing engine for Rust.","archived":false,"fork":false,"pushed_at":"2026-05-06T01:15:50.000Z","size":34559,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-06T03:10:58.949Z","etag":null,"topics":["ebook","epub","epub-reader","epub3","rust"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/twn39.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-11T05:04:43.000Z","updated_at":"2026-05-06T01:15:50.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/twn39/epub-rs","commit_stats":null,"previous_names":["twn39/epub-rs"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/twn39/epub-rs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twn39%2Fepub-rs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twn39%2Fepub-rs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twn39%2Fepub-rs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twn39%2Fepub-rs/manife
sts","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/twn39","download_url":"https://codeload.github.com/twn39/epub-rs/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twn39%2Fepub-rs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33687722,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-30T02:00:06.278Z","response_time":92,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ebook","epub","epub-reader","epub3","rust"],"created_at":"2026-05-30T10:01:50.511Z","updated_at":"2026-05-30T10:01:51.353Z","avatar_url":"https://github.com/twn39.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# epub-rs\n\n**epub-rs** is an industrial-grade, highly performant EPUB 2/3 processing engine for Rust. \n\nIt provides an end-to-end toolchain to **parse, process, deobfuscate, and generate** electronic books. 
Designed for heavy workloads and commercial reading apps, it avoids deep DOM tree memory overheads by utilizing blazing-fast stream processors ([`lol_html`](https://github.com/cloudflare/lol-html)).\n\n## Features\n\n### 🌐 WebAssembly (WASM) Support\n* **Browser-Native EPUB Engine**: Compile the entire parsing and generation engine to `wasm32-unknown-unknown` to run directly in the browser or Node.js.\n* **Zero-FS Architecture**: Parse binary `Uint8Array` EPUB buffers completely in memory without requiring a virtual file system.\n* **JS-Interop FFI**: Full `wasm-bindgen` FFI bindings (`EpubParser`, `EpubGenerator`, `compare_cfi`, `decrypt_font`) with `serde` integration for passing complex metadata and multi-level TOC JSON seamlessly between JS and Rust.\n\n### 📖 Robust Parsing\n* **Multiple Renditions**: Support for fetching and parsing multiple `.opf` rootfiles from a single EPUB container.\n* **Storage Agnostic**: `EpubProvider` trait allows extracting resources from traditional `.epub` ZIP files or exploded local directories without memory bloat.\n* **Smart Cover API**: Heuristic 4-tier fallback extraction algorithm to securely find the book cover.\n* **TOC \u0026 Navigation**: Reverse parses modern EPUB 3 `nav.xhtml` and legacy EPUB 2 `toc.ncx` into nested tree structures.\n* **Font Deobfuscation**: Transparent stream-decryption of commercially obfuscated `.ttf`/`.otf` fonts (supports IDPF and Adobe algorithms via `META-INF/encryption.xml`).\n\n### ⚙️ Content Processing \u0026 CFI\n* **Stream-Based HTML Rewriting**: Inject custom CSS themes (e.g. 
Dark Mode) into `\u003chead\u003e` or rewrite asset links (`\u003cimg\u003e`, `\u003ca\u003e`) with near-zero latency.\n* **Canonical Fragment Identifier (CFI)**: Full specification support for Point and Range EPUB CFI (`epubcfi(/6/4!/4/2:5)`).\n* **DOM CFI Injection**: Injects exact `data-cfi` paths into every DOM element to bridge frontend web-reader interactions (Highlighting \u0026 Bookmarks).\n* **Full-text Search to CFI**: Search raw HTML using Regex and return the exact CFI ranges pointing to the match.\n* **Synthetic Positions**: Generates virtual reading progress markers across the entire book for unified pagination.\n* **Semantic Extractor**: Extracts TTS (Text-To-Speech) and A11Y friendly structural streams (`ContentElement`), preserving language and block boundaries.\n\n### ✍️ Intelligent Generation (Builder)\n* **Strict EPUB 2 / 3 Generation**: Conditional compilation isolating legacy `NCX` from modern `NAV`, generating compliant `content.opf`.\n* **Streaming Large Files**: Stream massive assets (videos/images) directly into the EPUB ZIP pipe without loading them into memory (`add_resource_stream`).\n* **Rich Metadata \u0026 Layouts**: Full Dublin Core property refinements (authors vs. translators), Pre-Paginated Fixed-Layout (FXL), and Page Spreads (Comics/Manga).\n* **Automatic Property Inference**: Automatically detects `\u003cscript\u003e`, `\u003csvg\u003e`, and `\u003cmath\u003e` to inject EPUB 3 required properties.\n* **Landmarks \u0026 Page-Lists**: Build comprehensive guide mappings for academic and textbook parity.\n\n---\n\n## Installation\n\nAdd this to your `Cargo.toml`:\n\n```toml\n[dependencies]\nepub-rs = \"0.1.0\"\n```\n\n## Quick Start\n\n### 1. 
Read an EPUB \u0026 Extract Text\n```rust\nuse epub_rs::parser::EpubArchive;\nuse std::fs::File;\n\nfn main() -\u003e Result\u003c(), Box\u003cdyn std::error::Error\u003e\u003e {\n    let file = File::open(\"book.epub\")?;\n    let mut archive = EpubArchive::new(file)?;\n    \n    // Parse OPF and metadata\n    let book = archive.parse()?;\n    println!(\"Title: {:?}\", book.metadata.title);\n    \n    // Read the first chapter from the spine\n    let first_chapter_id = \u0026book.spine[0].idref;\n    let html_bytes = archive.get_resource_by_id(\u0026book, first_chapter_id)?;\n    \n    // Extract plain text for search indexing\n    let plain_text = epub_rs::processor::extract_text(\u0026html_bytes)?;\n    println!(\"Content: {}\", plain_text);\n    \n    Ok(())\n}\n```\n\n### 2. Generate a Compliant EPUB 3 Book\n```rust\nuse epub_rs::generator::{EpubBuilder, TocEntry};\nuse epub_rs::model::{Creator, EpubVersion, Metadata};\nuse std::fs::File;\n\nfn main() {\n    let metadata = Metadata {\n        title: Some(\"My Awesome Book\".to_string()),\n        creators: vec![Creator::new(\"Rustacean\")],\n        language: Some(\"en\".to_string()),\n        ..Default::default()\n    };\n\n    let builder = EpubBuilder::new()\n        .version(EpubVersion::V30)\n        .metadata(metadata)\n        // Auto-inject built-in typography and dark mode CSS\n        .theme(epub_rs::generator::Theme::Modern) \n        // Add nested table of contents\n        .set_toc(vec![TocEntry::new(\"Chapter 1\", \"text/ch1.xhtml\")])\n        // Generate book and HTML\n        .add_chapter(\"ch1\", \"text/ch1.xhtml\", b\"\u003ch1\u003eHello\u003c/h1\u003e\u003cp\u003eWorld!\u003c/p\u003e\".to_vec());\n\n    let mut file = File::create(\"output.epub\").unwrap();\n    builder.generate(\u0026mut file).expect(\"Failed to generate EPUB\");\n}\n```\n\n### 3. 
Build a Web Reader (CFI Injection)\nPass HTML directly to the browser with exact book-location identifiers, removing the need for complex frontend calculation.\n\n```rust\nlet chapter_html_with_cfi = archive.get_chapter_with_cfi(\u0026book, \"chapter_1_id\")?;\n// output: \u003cp data-cfi=\"epubcfi(/6/4!/4/2)\"\u003e...\u003c/p\u003e\n```\n\n### 4. Semantic TTS Extraction\nIdeal for Text-To-Speech, extracts language-tagged block structures instead of flat strings.\n\n```rust\nlet elements = archive.get_semantic_content(\u0026book, \"chapter_1_id\")?;\nfor el in elements {\n    println!(\"Read {} (in {:?}): {}\", el.tag_name, el.language, el.text);\n    // e.g., Read p (in Some(\"fr\")): Bonjour!\n}\n```\n\n## Performance\nBuilt on Cloudflare's `lol_html` and `zip-rs`, `epub-rs` processes DOMs in a single pass without allocating heavy AST trees.\n\n* **~20 µs**: Open ZIP, parse OPF, setup Domain Models (10 chapters).\n* **~140 µs**: Build, assemble, and compress a full EPUB to memory.\n* **~30 µs**: Find 50 regex text matches and reverse-map them to exact CFI ranges.\n\n*(Benchmarks executed on Apple Silicon M-series via `cargo bench`)*\n\n## License\nMIT License\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftwn39%2Fepub-rs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftwn39%2Fepub-rs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftwn39%2Fepub-rs/lists"}