https://github.com/axflow/pdf-ts
PDF text extraction in TypeScript
- Host: GitHub
- URL: https://github.com/axflow/pdf-ts
- Owner: axflow
- License: MIT
- Created: 2023-08-06T23:44:28.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-08-08T02:26:08.000Z (almost 3 years ago)
- Last Synced: 2025-03-30T21:51:14.326Z (over 1 year ago)
- Language: TypeScript
- Size: 198 KB
- Stars: 35
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# pdf-ts
pdf-ts is a TypeScript library for PDF text extraction. It wraps [Mozilla's PDF.js](https://mozilla.github.io/pdf.js) behind a simple API for text extraction.
```shell
npm i pdf-ts
```
## Examples
Extract text from a PDF.
```ts
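import {promises as fs} from 'node:fs';
import {pdfToText} from 'pdf-ts';

// Read the PDF into a Buffer, then extract all of its text as one string.
const pdf = await fs.readFile('./path/to/file.pdf');
const text = await pdfToText(pdf);
console.log(text);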
```
Extract a list of pages from a PDF.
```ts
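import {promises as fs} from 'node:fs';
import {pdfToPages} from 'pdf-ts';

// Read the PDF into a Buffer, then extract its text one page at a time.
const pdf = await fs.readFile('./path/to/file.pdf');
const pages = await pdfToPages(pdf);
console.log(pages); // [{page: 1, text: '...'}, {page: 2, text: '...'}, ...]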
```
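Per-page output is useful when you need to know where in a document something appears. The sketch below is one possible way to search the pages for a term. The `Page` type is an assumption based on the shape shown in the comment above, `findPagesContaining` is a hypothetical helper that is not part of pdf-ts, and the sample data stands in for a real `pdfToPages` result.

```ts
// Assumed page shape, mirroring the example output of pdfToPages above.
type Page = {page: number; text: string};

// Hypothetical helper (not part of pdf-ts): return the page numbers whose
// text contains `term`, ignoring case.
function findPagesContaining(pages: Page[], term: string): number[] {
  const needle = term.toLowerCase();
  return pages
    .filter(({text}) => text.toLowerCase().includes(needle))
    .map(({page}) => page);
}

// Stand-in data; in practice this would come from `await pdfToPages(pdf)`.
const samplePages: Page[] = [
  {page: 1, text: 'Introduction to PDF parsing'},
  {page: 2, text: 'Text extraction with PDF.js'},
  {page: 3, text: 'Appendix'},
];

console.log(findPagesContaining(samplePages, 'pdf')); // [ 1, 2 ]
```

Keeping the helper separate from the extraction call means it can be tested without a PDF on disk.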
## License
[MIT](LICENSE)