Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/coolwanglu/pdf2htmlEX

Convert PDF to HTML without losing text or format.
https://github.com/coolwanglu/pdf2htmlEX

Last synced: about 2 months ago
JSON representation

Convert PDF to HTML without losing text or format.

Awesome Lists containing this project

README

        

pdf2htmlEX is no longer under active development. New maintainers are [wanted](http://pdf2htmlex.blogspot.ch/2016/12/looking-for-new-maintainer.html).

#![](http://coolwanglu.github.io/pdf2htmlEX/images/pdf2htmlEX-64x64.png) pdf2htmlEX

>一图胜千言
A beautiful demo is worth a thousand words

- **Bible de Genève, 1564** (fonts and typography): [HTML](http://coolwanglu.github.io/pdf2htmlEX/demo/geneve.html) / [PDF](https://github.com/raphink/geneve_1564/releases/download/2015-07-08_01/geneve_1564.pdf)
- **Cheat Sheet** (math formulas): [HTML](http://coolwanglu.github.io/pdf2htmlEX/demo/cheat.html) / [PDF](http://www.tug.org/texshowcase/cheat.pdf)
- **Scientific Paper** (text and figures): [HTML](http://coolwanglu.github.io/pdf2htmlEX/demo/demo.html) / [PDF](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.148.349&rep=rep1&type=pdf)
- **Full Circle Magazine** (read while downloading): [HTML](http://coolwanglu.github.io/pdf2htmlEX/demo/issue65_en.html) / [PDF](http://dl.fullcirclemagazine.org/issue65_en.pdf)
- **Git Manual** (CJK support): [HTML](http://coolwanglu.github.io/pdf2htmlEX/demo/chn.html) / [PDF](http://files.cnblogs.com/phphuaibei/git%E6%90%AD%E5%BB%BA.pdf)

pdf2htmlEX renders PDF files in HTML, utilizing modern Web technologies.
Academic papers with lots of formulas and figures? Magazines with complicated layouts? No problem!

pdf2htmlEX is also an [online publishing tool](http://coolwanglu.github.io/pdf2htmlEX/doc/tb108wang.html) which is flexible for many different use cases.

Learn more about [who](https://github.com/coolwanglu/pdf2htmlEX/wiki/Use-Cases) and [why](https://github.com/coolwanglu/pdf2htmlEX/wiki/Introduction) should use pdf2htmlEX.

### Features

* Native HTML text with precise font and location.
* Flexible output: all-in-one HTML or on demand page loading (needs JavaScript).
* Moderate file size, sometimes even smaller than PDF.
* Supporting links, outlines (bookmarks), printing, SVG background, Type 3 fonts and [more...](https://github.com/coolwanglu/pdf2htmlEX/wiki/Feature-List)

[Compare to others](https://github.com/coolwanglu/pdf2htmlEX/wiki/Comparison)

### Portals

* [:house:Wiki Home](https://github.com/coolwanglu/pdf2htmlEX/wiki)
* [Download](https://github.com/coolwanglu/pdf2htmlEX/wiki/Download) & [Building](https://github.com/coolwanglu/pdf2htmlEX/wiki/Building)
* [Quick Start](https://github.com/coolwanglu/pdf2htmlEX/wiki/Quick-Start)
* [Report Issues / Ask for Help](https://github.com/coolwanglu/pdf2htmlEX/blob/master/CONTRIBUTING.md#guidance)
* [:question:FAQ](https://github.com/coolwanglu/pdf2htmlEX/wiki/FAQ)
* [:envelope:Mailing List](https://groups.google.com/forum/#!forum/pdf2htmlex)
* [:mahjong:中文邮件列表](https://groups.google.com/forum/#!forum/pdf2htmlex-cn)

### LICENSE

pdf2htmlEX, as a whole package, is licensed under GPLv3+.
Some resource files are released with relaxed licenses, read `LICENSE` for more details.

### Acknowledgements

pdf2htmlEX is made possible thanks to the following projects:

* [poppler](http://poppler.freedesktop.org/)
* [Fontforge](http://fontforge.org/)

pdf2htmlEX is inspired by the following projects:

* pdftohtml from poppler
* MuPDF
* PDF.js
* Crocodoc
* Google Doc

#### Special Thanks

* Hongliang Tian
* Wanmin Liu