{"id":18469279,"url":"https://github.com/vishwagauravin/pdf-parser-client-side","last_synced_at":"2025-04-08T10:32:42.353Z","repository":{"id":200176775,"uuid":"705304629","full_name":"VishwaGauravIn/pdf-parser-client-side","owner":"VishwaGauravIn","description":"A lightweight easy to use package to parse text from PDF files on client side without any server dependency.","archived":false,"fork":false,"pushed_at":"2024-06-16T16:23:12.000Z","size":27,"stargazers_count":10,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-23T11:05:02.131Z","etag":null,"topics":["client-side","pdf","pdf-parser","pdf-reader","pdfjs"],"latest_commit_sha":null,"homepage":"https://www.npmjs.com/package/pdf-parser-client-side","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/VishwaGauravIn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-10-15T16:39:06.000Z","updated_at":"2025-03-22T01:10:40.000Z","dependencies_parsed_at":null,"dependency_job_id":"9801c8aa-e647-4b89-b871-d1ccfd2844f0","html_url":"https://github.com/VishwaGauravIn/pdf-parser-client-side","commit_stats":null,"previous_names":["vishwagauravin/pdf-parser-client-side"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VishwaGauravIn%2Fpdf-parser-client-side","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VishwaGauravIn%2Fpdf-parser-client-side/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VishwaGauravIn%2Fpdf-parser-client-sid
e/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VishwaGauravIn%2Fpdf-parser-client-side/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/VishwaGauravIn","download_url":"https://codeload.github.com/VishwaGauravIn/pdf-parser-client-side/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247824222,"owners_count":21002233,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["client-side","pdf","pdf-parser","pdf-reader","pdfjs"],"created_at":"2024-11-06T10:09:33.155Z","updated_at":"2025-04-08T10:32:40.905Z","avatar_url":"https://github.com/VishwaGauravIn.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n \u003ch1\u003e \u003cimg src=\"https://github.com/VishwaGauravIn/pdf-parser-client-side/assets/81325730/fb8c8369-e2c9-473f-8493-542fafdbfecc\" width=\"80px\"\u003e\u003cbr/\u003ePDF Parser Client Side\u003c/h1\u003e\n \u003ca href=\"https://itsvg.in\" target=\"_blank\"\u003e\u003cimg src=\"https://img.shields.io/badge/Creator-Vishwa%20Gaurav-blue\"/\u003e\u003c/a\u003e \n \u003cimg src=\"https://img.shields.io/npm/v/pdf-parser-client-side?label=%20\"/\u003e\n \u003cimg src=\"https://img.shields.io/npm/dt/pdf-parser-client-side\"\u003e\n \u003cimg src=\"https://img.shields.io/snyk/vulnerabilities/github/VishwaGauravIn/pdf-parser-client-side\"/\u003e\n \u003cimg src=\"https://img.shields.io/badge/License-MIT-brightgreen\"/\u003e\n \u003cimg 
src=\"https://img.shields.io/github/languages/code-size/VishwaGauravIn/pdf-parser-client-side?logo=github\"\u003e\n\u003c/div\u003e\n\n## PDF Parser Client Side\n\nA lightweight, easy-to-use package for parsing text from PDF files on the client side, with no server dependency.\n\n## How to Install\n\nInstall the package with npm or yarn:\n\n```sh\nnpm i pdf-parser-client-side\n```\n\nor\n\n```sh\nyarn add pdf-parser-client-side\n```\n\nImport the package:\n\n```js\nimport extractTextFromPDF from \"pdf-parser-client-side\";\n```\n\n#### `variant` Parameter\n\nThe `variant` parameter specifies which character filtering is applied to the `extractedText`. Depending on its value, different types of characters are removed or retained.\n\n| `variant` Value                                 | Description                                                                            | Regular Expression                 | Retained Characters        |\n| ----------------------------------------------- | -------------------------------------------------------------------------------------- | ---------------------------------- | -------------------------- |\n| `clean`                                         | Removes all non-ASCII characters and any spaces that follow them.                      | `/[^\\x00-\\x7F]+\\ *(?:[^\\x00-\\x7F]\\| )*/g` | ASCII characters only      |\n| `alphanumeric`                                  | Retains only alphanumeric characters (letters and numbers).                            | `/[^a-zA-Z0-9]+/g`                 | A-Z, a-z, 0-9              |\n| `alphanumericwithspace`                         | Retains alphanumeric characters and spaces.                                            
| `/[^a-zA-Z0-9 ]+/g`                | A-Z, a-z, 0-9, space       |\n| `alphanumericwithspaceandpunctuation`           | Retains alphanumeric characters, spaces, and basic punctuation marks (.,!?).           | `/[^a-zA-Z0-9 .,!?]+/g`            | A-Z, a-z, 0-9, space, .,!? |\n| `alphanumericwithspaceandpunctuationandnewline` | Retains alphanumeric characters, spaces, basic punctuation marks (.,!?), and newlines. | `/[^a-zA-Z0-9 .,!?]+/g`            | A-Z, a-z, 0-9, space, .,!? |\n\n#### Example Usage\n\nJavaScript\n\n```jsx\nimport React from \"react\";\nimport extractTextFromPDF from \"pdf-parser-client-side\";\n\nexport default function Test() {\n  const handleFileChange = async (e, variant) =\u003e {\n    const file = e.target.files?.[0];\n    if (file) {\n      try {\n        const text = await extractTextFromPDF(file, variant);\n        console.log(\"Extracted Text:\", text);\n      } catch (error) {\n        console.error(\"Error extracting text from PDF:\", error);\n      }\n    }\n  };\n\n  return (\n    \u003cdiv\u003e\n      \u003cinput\n        type=\"file\"\n        name=\"\"\n        id=\"file-selector\"\n        accept=\".pdf\"\n        onChange={(e) =\u003e handleFileChange(e, \"clean\")}\n      /\u003e\n    \u003c/div\u003e\n  );\n}\n```\n\nTypeScript\n\n```tsx\nimport React from \"react\";\nimport extractTextFromPDF, { Variant } from \"pdf-parser-client-side\";\n\nexport default function Test() {\n  const handleFileChange = async (\n    e: React.ChangeEvent\u003cHTMLInputElement\u003e,\n    variant: Variant\n  ) =\u003e {\n    const file = e.target.files?.[0];\n    if (file) {\n      try {\n        const text = await extractTextFromPDF(file, variant);\n        console.log(\"Extracted Text:\", text);\n      } catch (error) {\n        console.error(\"Error extracting text from PDF:\", error);\n      }\n    }\n  };\n\n  return (\n    \u003cdiv\u003e\n      \u003cinput\n        type=\"file\"\n        name=\"\"\n        id=\"file-selector\"\n        
accept=\".pdf\"\n        onChange={(e) =\u003e handleFileChange(e, \"clean\")}\n      /\u003e\n    \u003c/div\u003e\n  );\n}\n```\n\n## Contributing\n\nFeel free to contribute!\n\n1. Fork the repository\n2. Make changes\n3. Submit a pull request\n\n### [\u003c/\u003e with 💛 by Vishwa Gaurav](https://itsvg.in)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvishwagauravin%2Fpdf-parser-client-side","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvishwagauravin%2Fpdf-parser-client-side","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvishwagauravin%2Fpdf-parser-client-side/lists"}