{"id":37083069,"url":"https://github.com/hex0cter/mercy-reader","last_synced_at":"2026-01-14T10:03:06.440Z","repository":{"id":57440931,"uuid":"367612056","full_name":"hex0cter/mercy-reader","owner":"hex0cter","description":"A Python library to extract clean(er), readable text from web pages via Mercury Web Parser.","archived":false,"fork":true,"pushed_at":"2021-05-16T20:51:47.000Z","size":43,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-09-08T22:42:04.612Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"zyocum/reader","license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hex0cter.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-05-15T11:18:48.000Z","updated_at":"2023-08-05T06:41:06.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/hex0cter/mercy-reader","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/hex0cter/mercy-reader","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hex0cter%2Fmercy-reader","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hex0cter%2Fmercy-reader/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hex0cter%2Fmercy-reader/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hex0cter%2Fmercy-reader/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hex0cter","download_url":"https://codeload.github.com/hex0cter/mercy-reader/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hex0cter%2Fmercy-reader/sbom","scorecard":{"id":462710,"data":{"date":"2025-08-11","repo":{"name":"github.com/hex0cter/mercy-reader","commit":"6b014f338b2f66a99c43d37410e9fbbf25d61324"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":1.7,"checks":[{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":0,"reason":"11 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: PYSEC-2022-42986 / GHSA-43fp-rhv2-5gv8","Warn: Project is vulnerable to: PYSEC-2023-135 / GHSA-xqr8-7jwr-rhp7","Warn: Project is vulnerable to: PYSEC-2024-60 / GHSA-jjg7-2v4v-x38h","Warn: Project is vulnerable to: GHSA-9hjg-9r4m-mvj7","Warn: Project is vulnerable to: GHSA-9wx4-h78v-vm56","Warn: Project is vulnerable to: PYSEC-2023-74 / GHSA-j8r2-6x86-q33q","Warn: Project is vulnerable to: GHSA-34jh-p97f-mpxf","Warn: Project is vulnerable to: PYSEC-2023-212 / GHSA-g4mx-q9vg-27p4","Warn: Project is vulnerable to: GHSA-pq67-6m6q-mj2v","Warn: Project is vulnerable to: PYSEC-2021-108 / GHSA-q2q7-5pp4-w6pg","Warn: Project is vulnerable to: PYSEC-2023-192 / GHSA-v845-jxx5-vc9f"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}}]},"last_synced_at":"2025-08-19T11:34:40.596Z","repository_id":57440931,"created_at":"2025-08-19T11:34:40.596Z","updated_at":"2025-08-19T11:34:40.596Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28416512,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T08:38:59.149Z","status":"ssl_error","status_checked_at":"2026-01-14T08:38:43.588Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-14T10:03:05.741Z","updated_at":"2026-01-14T10:03:06.435Z","avatar_url":"https://github.com/hex0cter.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# mercy-reader\nA python library to extract clean(er), readable text from web pages, inspired by [zyocum's reader](https://github.com/zyocum/reader).\n\n## Prerequisite\nPlease install [mercury-parser](https://github.com/postlight/mercury-parser) beforehand.\n```\n# Install Mercury globally\nyarn global add @postlight/mercury-parser\n#   or\nnpm -g install @postlight/mercury-parser\n```\n\n## Install\n\nInstall it as a Python dependency:\n\n```\npip install mercy-reader\n```\n\n## Usage\n\n```python\nfrom mercy_reader import reader\nfrom os import path\n\ntest_data_path = path.join(path.dirname(__file__), \"data.json\")\nobj = reader.main(\n        reader.load(test_data_path),\n        80,\n    )\nprint(reader.Format.formatter['md'](obj))\n\n```\n\n### Supported formats:\n* md\n* json\n* txt\n\n## Examples\n\n### Mercury Web Parser JSON\n\nThe library takes Mercury Web Parser's JSON results as its input. Below is an example:\n```json\n{\n  \"title\": \"Mercury Goes Open Source! — Postlight — Digital product studio\",\n  \"author\": \"Adam Pash\",\n  \"date_published\": \"2019-02-06T14:36:45.000Z\",\n  \"dek\": null,\n  \"lead_image_url\": \"https://postlight.com/wp-content/uploads/2019/02/mercury-open-source-social-card-e1550670446269.png\",\n  \"content\": {\n    \"html\": \"\u003cdiv class=\\\"body__content\\\"\u003e \u003cp\u003eIt\u0026#x2019;s my pleasure to announce that today, Postlight is open-sourcing the \u003ca href=\\\"https://mercury.postlight.com/web-parser/\\\"\u003eMercury Web Parser\u003c/a\u003e.\u003c/p\u003e\\n\u003cp\u003eWritten in JavaScript and running on both Node and in the browser, Mercury Parser is the engine that powers the Mercury Parser API, \u003ca href=\\\"https://mercury.postlight.com/amp-converter/\\\"\u003eMercury AMP Converter\u003c/a\u003e, \u003ca href=\\\"https://mercury.postlight.com/reader/\\\"\u003eMercury Reader\u003c/a\u003e, and \u003ca href=\\\"https://postlight.com/trackchanges/the-secret-engines-of-the-internet\\\"\u003eeven more third-party software and services.\u003c/a\u003e\u003c/p\u003e\\n\u003cp\u003eMercury Parser allows for better reading experiences, easier content migration, and endless opportunities for remixing the web, by making semantic sense out of any article. Mercury Parser sees web pages the same way you do: It sees titles, content, authors, and lead images, and makes all of that extracted data easily available to your software, which, unfortunately, sees only a sea of HTML markup, where page navigation, advertising, and the like are indistinguishable from content.\u003c/p\u003e\\n\u003cp\u003eGet \u003ca href=\\\"https://github.com/postlight/mercury-parser\\\"\u003eMercury Parser\u003c/a\u003e for use in your projects on GitHub:\u003c/p\u003e\\n\u003cblockquote class=\\\"embedly-card\\\"\u003e \u003cp\u003e\u0026#x1F4DC; Extracting content from the chaos of the web. Contribute to postlight/mercury-parser development by creating an account on GitHub.\u003c/p\u003e\\n\u003c/blockquote\u003e \u003ch3\u003eTry Mercury Parser\u003c/h3\u003e\\n\u003cp\u003eWanna see Mercury Parser in action in your own command line? First install it:\u003c/p\u003e\\n\u003cpre\u003e$ yarn global add @postlight/mercury-parser\u003c/pre\u003e\\n\u003cp\u003eThen parse an article and check out the results:\u003c/p\u003e\\n\u003cpre\u003e$ mercury-parser https://postlight.com/trackchanges/mercury-goes-open-source\u003c/pre\u003e\\n\u003cp\u003eNow, as an open-source project \u0026#x2014; and with your help \u0026#x2014; we hope to make the Mercury Parser even better. Say, for example, Mercury\u0026#x2019;s done a less-than-perfect job parsing an article from your favorite web site. You can \u003ca href=\\\"https://github.com/postlight/mercury-parser/blob/master/src/extractors/custom/README.md\\\"\u003ewrite and submit a custom site parser\u003c/a\u003e guaranteed to get it right quickly, every time. We\u0026#x2019;re excited about \u003ca href=\\\"https://github.com/postlight/mercury-parser/blob/master/CONTRIBUTING.md\\\"\u003eall sorts of ways\u003c/a\u003e the Mercury community will contribute to this project.\u003c/p\u003e\\n\u003ch3\u003eWhat about the API?\u003c/h3\u003e\\n\u003cp\u003eOver time, we will deprecate the Mercury Parser API. We\u0026#x2019;ll do it slowly, with lots of warning and advance email notifications, and \u003ca href=\\\"https://github.com/postlight/mercury-parser-api\\\"\u003edrop-in replacement code\u003c/a\u003e. We\u0026#x2019;ve committed to creating an easy path for people who want to use Mercury in any way they see fit, using open source, well-documented code that can be easily rolled into any other service or API. We want to put our energy there, making a more tractable web together\u0026#x2014;not behind a private, hosted API.\u003c/p\u003e\\n\u003cp\u003eIndeed, one of the main drivers for this choice was API users asking us to open source Mercury\u0026#x2014;and asking how they could help improve it.\u003c/p\u003e\\n\u003cp\u003eToday we\u0026#x2019;ve done exactly that. You can use Mercury Parser directly in any JavaScript project, whether on Node or in your browser, starting today, with no API required. If you\u0026#x2019;d like to chat about the Mercury Parser or need some help getting started, join the community in the \u003ca href=\\\"https://gitter.im/postlight/mercury\\\"\u003eMercury Gitter channel\u003c/a\u003e.\u003c/p\u003e\\n\u003cp\u003e\u003cem\u003e\u003ca href=\\\"https://postlight.com/trackchanges/authors/adam-pash\\\"\u003eAdam Pash\u003c/a\u003e is a Director of Engineering at Postlight. Want help making sense of big messy data? Get in touch: \u003ca href=\\\"https://postlight.com/cdn-cgi/l/email-protection#1a727f7676755a6a75696e76737d726e34797577\\\"\u003e\u003cspan class=\\\"__cf_email__\\\"\u003e[email\u0026#xA0;protected]\u003c/span\u003e\u003c/a\u003e.\u003c/em\u003e\u003c/p\u003e \u003c/div\u003e\",\n    \"markdown\": \"It's my pleasure to announce that today, Postlight is open-sourcing the [Mercury Web Parser](https://mercury.postlight.com/web-parser/).\\n\\nWritten in JavaScript and running on both Node and in the browser, Mercury Parser is the engine that powers the Mercury Parser API, [Mercury AMP Converter](https://mercury.postlight.com/amp-converter/), [Mercury Reader](https://mercury.postlight.com/reader/), and [even more third-party software and services.](https://postlight.com/trackchanges/the-secret-engines-of-the-internet)\\n\\nMercury Parser allows for better reading experiences, easier content migration, and endless opportunities for remixing the web, by making semantic sense out of any article. Mercury Parser sees web pages the same way you do: It sees titles, content, authors, and lead images, and makes all of that extracted data easily available to your software, which, unfortunately, sees only a sea of HTML markup, where page navigation, advertising, and the like are indistinguishable from content.\\n\\nGet [Mercury Parser](https://github.com/postlight/mercury-parser) for use in your projects on GitHub:\\n\\n\u003e 📜 Extracting content from the chaos of the web. Contribute to postlight/mercury-parser development by creating an account on GitHub.\\n\\n### Try Mercury Parser\\n\\nWanna see Mercury Parser in action in your own command line? First install it:\\n    \\n    \\n    $ yarn global add @postlight/mercury-parser\\n\\nThen parse an article and check out the results:\\n    \\n    \\n    $ mercury-parser https://postlight.com/trackchanges/mercury-goes-open-source\\n\\nNow, as an open-source project -- and with your help -- we hope to make the Mercury Parser even better. Say, for example, Mercury's done a less-than-perfect job parsing an article from your favorite web site. You can [write and submit a custom site parser](https://github.com/postlight/mercury-parser/blob/master/src/extractors/custom/README.md) guaranteed to get it right quickly, every time. We're excited about [all sorts of ways](https://github.com/postlight/mercury-parser/blob/master/CONTRIBUTING.md) the Mercury community will contribute to this project.\\n\\n### What about the API?\\n\\nOver time, we will deprecate the Mercury Parser API. We'll do it slowly, with lots of warning and advance email notifications, and [drop-in replacement code](https://github.com/postlight/mercury-parser-api). We've committed to creating an easy path for people who want to use Mercury in any way they see fit, using open source, well-documented code that can be easily rolled into any other service or API. We want to put our energy there, making a more tractable web together--not behind a private, hosted API.\\n\\nIndeed, one of the main drivers for this choice was API users asking us to open source Mercury--and asking how they could help improve it.\\n\\nToday we've done exactly that. You can use Mercury Parser directly in any JavaScript project, whether on Node or in your browser, starting today, with no API required. If you'd like to chat about the Mercury Parser or need some help getting started, join the community in the [Mercury Gitter channel](https://gitter.im/postlight/mercury).\\n\\n_[Adam Pash](https://postlight.com/trackchanges/authors/adam-pash) is a Director of Engineering at Postlight. Want help making sense of big messy data? Get in touch: [ [email protected]](https://postlight.com/cdn-cgi/l/email-protection#1a727f7676755a6a75696e76737d726e34797577)._\\n\",\n    \"text\": \"It's my pleasure to announce that today, Postlight is open-sourcing the Mercury Web Parser.\\n\\nWritten in JavaScript and running on both Node and in the browser, Mercury Parser is the engine that powers the Mercury Parser API, Mercury AMP Converter, Mercury Reader, and even more third-party software and services.\\n\\nMercury Parser allows for better reading experiences, easier content migration, and endless opportunities for remixing the web, by making semantic sense out of any article. Mercury Parser sees web pages the same way you do: It sees titles, content, authors, and lead images, and makes all of that extracted data easily available to your software, which, unfortunately, sees only a sea of HTML markup, where page navigation, advertising, and the like are indistinguishable from content.\\n\\nGet Mercury Parser for use in your projects on GitHub:\\n\\n\u003e 📜 Extracting content from the chaos of the web. Contribute to postlight/mercury-parser development by creating an account on GitHub.\\n\\n### Try Mercury Parser\\n\\nWanna see Mercury Parser in action in your own command line? First install it:\\n    \\n    \\n    $ yarn global add @postlight/mercury-parser\\n\\nThen parse an article and check out the results:\\n    \\n    \\n    $ mercury-parser https://postlight.com/trackchanges/mercury-goes-open-source\\n\\nNow, as an open-source project -- and with your help -- we hope to make the Mercury Parser even better. Say, for example, Mercury's done a less-than-perfect job parsing an article from your favorite web site. You can write and submit a custom site parser guaranteed to get it right quickly, every time. We're excited about all sorts of ways the Mercury community will contribute to this project.\\n\\n### What about the API?\\n\\nOver time, we will deprecate the Mercury Parser API. We'll do it slowly, with lots of warning and advance email notifications, and drop-in replacement code. We've committed to creating an easy path for people who want to use Mercury in any way they see fit, using open source, well-documented code that can be easily rolled into any other service or API. We want to put our energy there, making a more tractable web together--not behind a private, hosted API.\\n\\nIndeed, one of the main drivers for this choice was API users asking us to open source Mercury--and asking how they could help improve it.\\n\\nToday we've done exactly that. You can use Mercury Parser directly in any JavaScript project, whether on Node or in your browser, starting today, with no API required. If you'd like to chat about the Mercury Parser or need some help getting started, join the community in the Mercury Gitter channel.\\n\\nAdam Pash is a Director of Engineering at Postlight. Want help making sense of big messy data? Get in touch: [email protected].\\n\"\n  },\n  \"next_page_url\": null,\n  \"url\": \"https://postlight.com/trackchanges/mercury-goes-open-source\",\n  \"domain\": \"postlight.com\",\n  \"excerpt\": \"It’s my pleasure to announce that today, Postlight is open-sourcing the Mercury Web Parser. Written in JavaScript and running on both Node and in the ...\",\n  \"word_count\": 436,\n  \"direction\": \"ltr\",\n  \"total_pages\": 1,\n  \"rendered_pages\": 1\n}\n```\n\n### HTML output\n```html\n\u003cdiv class=\"body__content\"\u003e \u003cp\u003eIt\u0026#x2019;s my pleasure to announce that today, Postlight is open-sourcing the \u003ca href=\"https://mercury.postlight.com/web-parser/\"\u003eMercury Web Parser\u003c/a\u003e.\u003c/p\u003e\n\u003cp\u003eWritten in JavaScript and running on both Node and in the browser, Mercury Parser is the engine that powers the Mercury Parser API, \u003ca href=\"https://mercury.postlight.com/amp-converter/\"\u003eMercury AMP Converter\u003c/a\u003e, \u003ca href=\"https://mercury.postlight.com/reader/\"\u003eMercury Reader\u003c/a\u003e, and \u003ca href=\"https://postlight.com/trackchanges/the-secret-engines-of-the-internet\"\u003eeven more third-party software and services.\u003c/a\u003e\u003c/p\u003e\n\u003cp\u003eMercury Parser allows for better reading experiences, easier content migration, and endless opportunities for remixing the web, by making semantic sense out of any article. Mercury Parser sees web pages the same way you do: It sees titles, content, authors, and lead images, and makes all of that extracted data easily available to your software, which, unfortunately, sees only a sea of HTML markup, where page navigation, advertising, and the like are indistinguishable from content.\u003c/p\u003e\n\u003cp\u003eGet \u003ca href=\"https://github.com/postlight/mercury-parser\"\u003eMercury Parser\u003c/a\u003e for use in your projects on GitHub:\u003c/p\u003e\n\u003cblockquote class=\"embedly-card\"\u003e \u003cp\u003e\u0026#x1F4DC; Extracting content from the chaos of the web. Contribute to postlight/mercury-parser development by creating an account on GitHub.\u003c/p\u003e\n\u003c/blockquote\u003e \u003ch3\u003eTry Mercury Parser\u003c/h3\u003e\n\u003cp\u003eWanna see Mercury Parser in action in your own command line? First install it:\u003c/p\u003e\n\u003cpre\u003e$ yarn global add @postlight/mercury-parser\u003c/pre\u003e\n\u003cp\u003eThen parse an article and check out the results:\u003c/p\u003e\n\u003cpre\u003e$ mercury-parser https://postlight.com/trackchanges/mercury-goes-open-source\u003c/pre\u003e\n\u003cp\u003eNow, as an open-source project \u0026#x2014; and with your help \u0026#x2014; we hope to make the Mercury Parser even better. Say, for example, Mercury\u0026#x2019;s done a less-than-perfect job parsing an article from your favorite web site. You can \u003ca href=\"https://github.com/postlight/mercury-parser/blob/master/src/extractors/custom/README.md\"\u003ewrite and submit a custom site parser\u003c/a\u003e guaranteed to get it right quickly, every time. We\u0026#x2019;re excited about \u003ca href=\"https://github.com/postlight/mercury-parser/blob/master/CONTRIBUTING.md\"\u003eall sorts of ways\u003c/a\u003e the Mercury community will contribute to this project.\u003c/p\u003e\n\u003ch3\u003eWhat about the API?\u003c/h3\u003e\n\u003cp\u003eOver time, we will deprecate the Mercury Parser API. We\u0026#x2019;ll do it slowly, with lots of warning and advance email notifications, and \u003ca href=\"https://github.com/postlight/mercury-parser-api\"\u003edrop-in replacement code\u003c/a\u003e. We\u0026#x2019;ve committed to creating an easy path for people who want to use Mercury in any way they see fit, using open source, well-documented code that can be easily rolled into any other service or API. We want to put our energy there, making a more tractable web together\u0026#x2014;not behind a private, hosted API.\u003c/p\u003e\n\u003cp\u003eIndeed, one of the main drivers for this choice was API users asking us to open source Mercury\u0026#x2014;and asking how they could help improve it.\u003c/p\u003e\n\u003cp\u003eToday we\u0026#x2019;ve done exactly that. You can use Mercury Parser directly in any JavaScript project, whether on Node or in your browser, starting today, with no API required. If you\u0026#x2019;d like to chat about the Mercury Parser or need some help getting started, join the community in the \u003ca href=\"https://gitter.im/postlight/mercury\"\u003eMercury Gitter channel\u003c/a\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u003ca href=\"https://postlight.com/trackchanges/authors/adam-pash\"\u003eAdam Pash\u003c/a\u003e is a Director of Engineering at Postlight. Want help making sense of big messy data? Get in touch: \u003ca href=\"https://postlight.com/cdn-cgi/l/email-protection#4d25282121220d3d223e3921242a2539632e2220\"\u003e\u003cspan class=\"__cf_email__\"\u003e[email\u0026#xA0;protected]\u003c/span\u003e\u003c/a\u003e.\u003c/em\u003e\u003c/p\u003e \u003c/div\u003e\n```\n\n### Markdown output\n```markdown\ndate: 2019-02-06 14:36:45  \nauthor(s): Adam Pash  \n\n# [Mercury Goes Open Source! — Postlight — Digital product studio](https://postlight.com/trackchanges/mercury-goes-open-source)\n\nIt's my pleasure to announce that today, Postlight is open-sourcing the [Mercury Web Parser](https://mercury.postlight.com/web-parser/).\n\nWritten in JavaScript and running on both Node and in the browser, Mercury Parser is the engine that powers the Mercury Parser API, [Mercury AMP Converter](https://mercury.postlight.com/amp-converter/), [Mercury Reader](https://mercury.postlight.com/reader/), and [even more third-party software and services.](https://postlight.com/trackchanges/the-secret-engines-of-the-internet)\n\nMercury Parser allows for better reading experiences, easier content migration, and endless opportunities for remixing the web, by making semantic sense out of any article. Mercury Parser sees web pages the same way you do: It sees titles, content, authors, and lead images, and makes all of that extracted data easily available to your software, which, unfortunately, sees only a sea of HTML markup, where page navigation, advertising, and the like are indistinguishable from content.\n\nGet [Mercury Parser](https://github.com/postlight/mercury-parser) for use in your projects on GitHub:\n\n\u003e 📜 Extracting content from the chaos of the web. Contribute to postlight/mercury-parser development by creating an account on GitHub.\n\n### Try Mercury Parser\n\nWanna see Mercury Parser in action in your own command line? First install it:\n    \n    \n    $ yarn global add @postlight/mercury-parser\n\nThen parse an article and check out the results:\n    \n    \n    $ mercury-parser https://postlight.com/trackchanges/mercury-goes-open-source\n\nNow, as an open-source project -- and with your help -- we hope to make the Mercury Parser even better. Say, for example, Mercury's done a less-than-perfect job parsing an article from your favorite web site. You can [write and submit a custom site parser](https://github.com/postlight/mercury-parser/blob/master/src/extractors/custom/README.md) guaranteed to get it right quickly, every time. We're excited about [all sorts of ways](https://github.com/postlight/mercury-parser/blob/master/CONTRIBUTING.md) the Mercury community will contribute to this project.\n\n### What about the API?\n\nOver time, we will deprecate the Mercury Parser API. We'll do it slowly, with lots of warning and advance email notifications, and [drop-in replacement code](https://github.com/postlight/mercury-parser-api). We've committed to creating an easy path for people who want to use Mercury in any way they see fit, using open source, well-documented code that can be easily rolled into any other service or API. We want to put our energy there, making a more tractable web together--not behind a private, hosted API.\n\nIndeed, one of the main drivers for this choice was API users asking us to open source Mercury--and asking how they could help improve it.\n\nToday we've done exactly that. You can use Mercury Parser directly in any JavaScript project, whether on Node or in your browser, starting today, with no API required. If you'd like to chat about the Mercury Parser or need some help getting started, join the community in the [Mercury Gitter channel](https://gitter.im/postlight/mercury).\n\n_[Adam Pash](https://postlight.com/trackchanges/authors/adam-pash) is a Director of Engineering at Postlight. Want help making sense of big messy data? Get in touch: [ [email protected]](https://postlight.com/cdn-cgi/l/email-protection#86eee3eaeae9c6f6e9f5f2eaefe1eef2a8e5e9eb)._\n\n```\n### Plain-text output\n```text\nurl: https://postlight.com/trackchanges/mercury-goes-open-source\ndate: 2019-02-06 14:36:45\nauthor(s): Adam Pash\n\nMercury Goes Open Source! — Postlight — Digital product studio\n\nIt's my pleasure to announce that today, Postlight is open-sourcing the Mercury Web Parser.\n\nWritten in JavaScript and running on both Node and in the browser, Mercury Parser is the engine that powers the Mercury Parser API, Mercury AMP Converter, Mercury Reader, and even more third-party software and services.\n\nMercury Parser allows for better reading experiences, easier content migration, and endless opportunities for remixing the web, by making semantic sense out of any article. Mercury Parser sees web pages the same way you do: It sees titles, content, authors, and lead images, and makes all of that extracted data easily available to your software, which, unfortunately, sees only a sea of HTML markup, where page navigation, advertising, and the like are indistinguishable from content.\n\nGet Mercury Parser for use in your projects on GitHub:\n\n\u003e 📜 Extracting content from the chaos of the web. Contribute to postlight/mercury-parser development by creating an account on GitHub.\n\n### Try Mercury Parser\n\nWanna see Mercury Parser in action in your own command line? First install it:\n    \n    \n    $ yarn global add @postlight/mercury-parser\n\nThen parse an article and check out the results:\n    \n    \n    $ mercury-parser https://postlight.com/trackchanges/mercury-goes-open-source\n\nNow, as an open-source project -- and with your help -- we hope to make the Mercury Parser even better. Say, for example, Mercury's done a less-than-perfect job parsing an article from your favorite web site. You can write and submit a custom site parser guaranteed to get it right quickly, every time. We're excited about all sorts of ways the Mercury community will contribute to this project.\n\n### What about the API?\n\nOver time, we will deprecate the Mercury Parser API. We'll do it slowly, with lots of warning and advance email notifications, and drop-in replacement code. We've committed to creating an easy path for people who want to use Mercury in any way they see fit, using open source, well-documented code that can be easily rolled into any other service or API. We want to put our energy there, making a more tractable web together--not behind a private, hosted API.\n\nIndeed, one of the main drivers for this choice was API users asking us to open source Mercury--and asking how they could help improve it.\n\nToday we've done exactly that. You can use Mercury Parser directly in any JavaScript project, whether on Node or in your browser, starting today, with no API required. If you'd like to chat about the Mercury Parser or need some help getting started, join the community in the Mercury Gitter channel.\n\nAdam Pash is a Director of Engineering at Postlight. Want help making sense of big messy data? Get in touch: [email protected].\n\n```\n\n### Run the test\n```bash\npython setup.py pytest --addopts -s\n```\n\n## References\n* [mercury-parser](https://github.com/postlight/mercury-parser)\n* [zyocum's reader](https://github.com/zyocum/reader)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhex0cter%2Fmercy-reader","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhex0cter%2Fmercy-reader","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhex0cter%2Fmercy-reader/lists"}