https://github.com/luochen1990/nodejs-easy-pdf-parser
A lightweight, promise-style, functional wrapper of pdf2json for extracting text from PDF files easily.
- Host: GitHub
- URL: https://github.com/luochen1990/nodejs-easy-pdf-parser
- Owner: luochen1990
- License: apache-2.0
- Created: 2018-07-13T14:00:55.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2021-12-03T08:02:25.000Z (over 4 years ago)
- Last Synced: 2025-10-11T11:42:13.597Z (9 months ago)
- Language: JavaScript
- Homepage:
- Size: 17.6 KB
- Stars: 4
- Watchers: 1
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Easy PDF Parser
===============
A lightweight, promise-style, functional wrapper of [pdf2json](https://github.com/modesty/pdf2json).
Command Line Tool
-----------------
```
npm install -g easy-pdf-parser
pdf2text test.pdf > test.txt
```
Usage Demo
----------
Install:
```
npm install easy-pdf-parser
```
Extract plain text from a PDF:
```
const {parsePdf, extractPlainText} = require('easy-pdf-parser');

// Parse the file, then flatten the parsed result into plain text.
parsePdf('./test.pdf').then(extractPlainText).then(data => {
    console.log(data);
});
```
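The same pipeline can also be written with `async`/`await`, which makes error handling more explicit. The following is a minimal sketch, not part of the library: `pdfToText` is a hypothetical helper, and it assumes only what the example above shows, namely that `parsePdf` returns a promise and that `extractPlainText` accepts its result. The optional `lib` parameter is likewise an illustrative addition so that the helper can be exercised with a stub.

```javascript
// Hypothetical helper (not part of easy-pdf-parser).
// `lib` is optional; when omitted, the real package is loaded lazily.
async function pdfToText(path, lib) {
    const { parsePdf, extractPlainText } = lib || require('easy-pdf-parser');
    try {
        const parsed = await parsePdf(path);
        return extractPlainText(parsed);
    } catch (err) {
        // Attach the file path so the failure is easier to trace.
        throw new Error(`Failed to extract text from ${path}: ${err.message}`);
    }
}
```

Calling `pdfToText('./test.pdf').then(console.log).catch(console.error)` would then behave like the promise chain above, with a more descriptive error on failure.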
Extract simply structured text from a PDF:
```
const {parsePdf, extractText} = require('easy-pdf-parser');

// Parse the file, then reduce the parsed result to a simple text structure.
parsePdf('./test.pdf').then(extractText).then(data => {
    console.log(JSON.stringify(data, null, 2));
});
```
Get the full structured parsing result:
```
const {parsePdf} = require('easy-pdf-parser');

// Print the complete parsed structure without any post-processing.
parsePdf('./test.pdf').then(data => {
    console.log(JSON.stringify(data, null, 2));
});
```
More documentation on the structure of the parsed result can be found [here](https://github.com/modesty/pdf2json#output-format-reference).
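
As a rough illustration of working with that structure, the sketch below walks a hand-written object shaped like the pdf2json output. It rests on several assumptions: that pages live in a `Pages` array (some pdf2json versions nest it under `formImage`), that each page has a `Texts` array, and that each text entry holds runs in `R` whose `T` field is URI-encoded text. The exact shape depends on the pdf2json version in use, so check the reference above first. `textsByPage` and `decodeRun` are hypothetical helpers, not part of either library.

```javascript
// Hand-written sample mimicking the assumed pdf2json layout (not real parser output).
const sample = {
    Pages: [
        {
            Texts: [
                { x: 1, y: 1, R: [{ T: 'Hello%20World' }] },
                { x: 1, y: 2, R: [{ T: 'second%20line' }] },
            ],
        },
    ],
};

// Decode one text run, falling back to the raw string if it is not valid URI encoding.
function decodeRun(run) {
    try {
        return decodeURIComponent(run.T);
    } catch (e) {
        return run.T;
    }
}

// Hypothetical helper: return one array of decoded strings per page.
function textsByPage(parsed) {
    // Accept both a top-level `Pages` array and one nested under `formImage`.
    const pages = parsed.Pages || (parsed.formImage && parsed.formImage.Pages) || [];
    return pages.map(page =>
        (page.Texts || []).map(text => (text.R || []).map(decodeRun).join(''))
    );
}

console.log(textsByPage(sample));
```

In real use, the object passed to `textsByPage` would be the value that `parsePdf` resolves to, provided it follows the layout assumed here.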