https://github.com/vishwagauravin/pdf-parser-client-side

## PDF Parser Client Side

A lightweight, easy-to-use package for extracting text from PDF files on the client side, with no server dependency.

## How to Install

Install the package with npm or yarn:

```sh
npm i pdf-parser-client-side
```

or

```sh
yarn add pdf-parser-client-side
```

Then import the package:

```js
import extractTextFromPDF from "pdf-parser-client-side";
```

#### `variant` Parameter

The `variant` parameter selects which cleanup is applied to the extracted text. Each variant uses a regular expression to strip a different set of characters; all other characters are retained.

| `variant` Value | Description | Regular Expression | Retained Characters |
| --- | --- | --- | --- |
| `clean` | Removes all non-ASCII characters and any spaces that follow them. | `/[^\x00-\x7F]+\ *(?:[^\x00-\x7F]\| )*/g` | ASCII characters only |
| `alphanumeric` | Retains only alphanumeric characters (letters and numbers). | `/[^a-zA-Z0-9]+/g` | A-Z, a-z, 0-9 |
| `alphanumericwithspace` | Retains alphanumeric characters and spaces. | `/[^a-zA-Z0-9 ]+/g` | A-Z, a-z, 0-9, space |
| `alphanumericwithspaceandpunctuation` | Retains alphanumeric characters, spaces, and basic punctuation marks (`.` `,` `!` `?`). | `/[^a-zA-Z0-9 .,!?]+/g` | A-Z, a-z, 0-9, space, `.` `,` `!` `?` |
| `alphanumericwithspaceandpunctuationandnewline` | Retains alphanumeric characters, spaces, basic punctuation marks (`.` `,` `!` `?`), and newlines. | `/[^a-zA-Z0-9 .,!?]+/g` | A-Z, a-z, 0-9, space, `.` `,` `!` `?` |
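
To get a feel for what each pattern strips, the sketch below applies four of the expressions from the table to a sample string. It is only an illustration: `applyVariant` and `VARIANT_PATTERNS` are hypothetical names that are not part of the package, the `clean` pattern is read here as `/[^\x00-\x7F]+\ *(?:[^\x00-\x7F]| )*/g`, and deleting each match (replacing it with an empty string) is an assumption about how the package uses the patterns.

```javascript
// Patterns taken from the table above. How the package applies them
// internally is an assumption; here every match is simply deleted.
const VARIANT_PATTERNS = {
  clean: /[^\x00-\x7F]+\ *(?:[^\x00-\x7F]| )*/g,
  alphanumeric: /[^a-zA-Z0-9]+/g,
  alphanumericwithspace: /[^a-zA-Z0-9 ]+/g,
  alphanumericwithspaceandpunctuation: /[^a-zA-Z0-9 .,!?]+/g,
};

// Hypothetical helper, not part of the package API.
function applyVariant(text, variant) {
  const pattern = VARIANT_PATTERNS[variant];
  if (!pattern) return text; // unknown variant: leave the text untouched
  return text.replace(pattern, "");
}

const sample = "Total: 42€ due, OK?";

console.log(applyVariant(sample, "alphanumeric"));
// → "Total42dueOK"
console.log(applyVariant(sample, "alphanumericwithspace"));
// → "Total 42 due OK"
console.log(applyVariant(sample, "alphanumericwithspaceandpunctuation"));
// → "Total 42 due, OK?"
console.log(applyVariant(sample, "clean"));
// → "Total: 42due, OK?"
```

Note that under this reading `clean` also swallows the space after the `€`, which matches the description "any spaces that follow them".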

#### Example Usage

JavaScript

```jsx
import React from "react";
import extractTextFromPDF from "pdf-parser-client-side";

export default function Test() {
  // Extract text from the selected PDF using the given variant.
  const handleFileChange = async (e, variant) => {
    const file = e.target.files?.[0];
    if (file) {
      try {
        const text = await extractTextFromPDF(file, variant);
        console.log("Extracted Text:", text);
      } catch (error) {
        console.error("Error extracting text from PDF:", error);
      }
    }
  };

  return (
    <div>
      <input
        type="file"
        onChange={(e) => handleFileChange(e, "clean")}
      />
    </div>
  );
}
```

TypeScript

```tsx
import React from "react";
import extractTextFromPDF, { Variant } from "pdf-parser-client-side";

export default function Test() {
  // Extract text from the selected PDF using the given variant.
  const handleFileChange = async (
    e: React.ChangeEvent<HTMLInputElement>,
    variant: Variant
  ) => {
    const file = e.target.files?.[0];
    if (file) {
      try {
        const text = await extractTextFromPDF(file, variant);
        console.log("Extracted Text:", text);
      } catch (error) {
        console.error("Error extracting text from PDF:", error);
      }
    }
  };

  return (
    <div>
      <input
        type="file"
        onChange={(e) => handleFileChange(e, "clean")}
      />
    </div>
  );
}
```
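
Plain JavaScript callers do not get the compile-time check that the `Variant` type gives TypeScript users, so it can help to validate the variant name before extraction. The following is a minimal sketch under stated assumptions: `safeExtract` and `KNOWN_VARIANTS` are hypothetical names that are not part of the package, and the extractor is passed in as an argument (normally the package's default export) so the sketch can run outside a browser with a stand-in.

```javascript
// The five variant names listed in the `variant` table.
const KNOWN_VARIANTS = [
  "clean",
  "alphanumeric",
  "alphanumericwithspace",
  "alphanumericwithspaceandpunctuation",
  "alphanumericwithspaceandpunctuationandnewline",
];

// Hypothetical wrapper, not part of the package. `extract` is expected to be
// the package's default export or any function with the same call shape.
async function safeExtract(extract, file, variant) {
  if (!file) {
    throw new TypeError("No file selected");
  }
  if (!KNOWN_VARIANTS.includes(variant)) {
    throw new RangeError(`Unknown variant: ${variant}`);
  }
  return extract(file, variant);
}

// Stand-in extractor so the sketch runs without the package or a browser.
const fakeExtract = async (file, variant) =>
  `${file.name} parsed as ${variant}`;

safeExtract(fakeExtract, { name: "report.pdf" }, "clean").then(console.log);
```

In a real page, `extract` would be `extractTextFromPDF` and `file` would come from `e.target.files?.[0]`, as in the examples above.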

## Contributing

Feel free to contribute!

1. Fork the repository
2. Make changes
3. Submit a pull request

### [> with 💛 by Vishwa Gaurav](https://itsvg.in)