{"id":13405269,"url":"https://github.com/thephpleague/html-to-markdown","last_synced_at":"2025-05-15T00:05:23.499Z","repository":{"id":1077508,"uuid":"2620408","full_name":"thephpleague/html-to-markdown","owner":"thephpleague","description":"Convert HTML to Markdown with PHP","archived":false,"fork":false,"pushed_at":"2025-02-07T05:56:53.000Z","size":432,"stargazers_count":1813,"open_issues_count":17,"forks_count":208,"subscribers_count":43,"default_branch":"master","last_synced_at":"2025-05-07T23:39:27.410Z","etag":null,"topics":["commonmark","converter","hacktoberfest","html","markdown","php","phpleague"],"latest_commit_sha":null,"homepage":"","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thephpleague.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":".github/SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"colinodell","tidelift":"packagist/league/html-to-markdown","custom":["https://www.colinodell.com/sponsor","https://www.paypal.me/colinpodell/10.00"]}},"created_at":"2011-10-21T13:38:37.000Z","updated_at":"2025-05-05T18:26:07.000Z","dependencies_parsed_at":"2025-05-04T21:31:36.184Z","dependency_job_id":null,"html_url":"https://github.com/thephpleague/html-to-markdown","commit_stats":{"total_commits":342,"total_committers":54,"mean_commits":6.333333333333333,"dds":0.5526315789473684,"last_synced_commit":"2185d5e26fc8b4ca5cb721c6e153c02894eb9cfe"},"previous_names":[],"tags_count":39,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thephpl
eague%2Fhtml-to-markdown","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thephpleague%2Fhtml-to-markdown/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thephpleague%2Fhtml-to-markdown/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thephpleague%2Fhtml-to-markdown/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thephpleague","download_url":"https://codeload.github.com/thephpleague/html-to-markdown/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254249198,"owners_count":22039029,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["commonmark","converter","hacktoberfest","html","markdown","php","phpleague"],"created_at":"2024-07-30T19:01:58.174Z","updated_at":"2025-05-15T00:05:21.025Z","avatar_url":"https://github.com/thephpleague.png","language":"PHP","funding_links":["https://github.com/sponsors/colinodell","https://tidelift.com/funding/github/packagist/league/html-to-markdown","https://www.colinodell.com/sponsor","https://www.paypal.me/colinpodell/10.00"],"categories":["PHP","目录","Table of Contents","Tools","标记( Markup )","类库","标记 Markup"],"sub_categories":["标记和CSS Markup and CSS","Markup and CSS","Converters","Markup","Markdown","Globalization"],"readme":"HTML To Markdown for PHP\n========================\n\n[![Latest Version](https://img.shields.io/packagist/v/league/html-to-markdown.svg?style=flat-square)](https://packagist.org/packages/league/html-to-markdown)\n[![Software 
License](http://img.shields.io/badge/license-MIT-brightgreen.svg?style=flat-square)](LICENSE)\n[![Build Status](https://img.shields.io/github/workflow/status/thephpleague/html-to-markdown/Tests/master.svg?style=flat-square)](https://github.com/thephpleague/html-to-markdown/actions?query=workflow%3ATests+branch%3Amaster)\n[![Coverage Status](https://img.shields.io/scrutinizer/coverage/g/thephpleague/html-to-markdown.svg?style=flat-square)](https://scrutinizer-ci.com/g/thephpleague/html-to-markdown/code-structure)\n[![Quality Score](https://img.shields.io/scrutinizer/g/thephpleague/html-to-markdown.svg?style=flat-square)](https://scrutinizer-ci.com/g/thephpleague/html-to-markdown)\n[![Total Downloads](https://img.shields.io/packagist/dt/league/html-to-markdown.svg?style=flat-square)](https://packagist.org/packages/league/html-to-markdown)\n\nLibrary which converts HTML to [Markdown](http://daringfireball.net/projects/markdown/) for your sanity and convenience.\n\n\n**Requires**: PHP 7.2+\n\n**Lead Developer**: [@colinodell](http://twitter.com/colinodell)\n\n**Original Author**: [@nickcernis](http://twitter.com/nickcernis)\n\n\n### Why convert HTML to Markdown?\n\n*\"What alchemy is this?\"* you mutter. *\"I can see why you'd convert [Markdown to HTML](https://github.com/thephpleague/commonmark),\"* you continue, already labouring the question somewhat, *\"but why go the other way?\"*\n\nTypically you would convert HTML to Markdown if:\n\n1. You have an existing HTML document that needs to be edited by people with good taste.\n2. You want to store new content in HTML format but edit it as Markdown.\n3. You want to convert HTML email to plain text email.\n4. You know a guy who's been converting HTML to Markdown for years, and now he can speak Elvish. You'd quite like to be able to speak Elvish.\n5. 
You just really like Markdown.\n\n### How to use it\n\nRequire the library by issuing this command:\n\n```bash\ncomposer require league/html-to-markdown\n```\n\nAdd `require 'vendor/autoload.php';` to the top of your script.\n\nNext, create a new HtmlConverter instance, passing in your valid HTML code to its `convert()` function:\n\n```php\nuse League\\HTMLToMarkdown\\HtmlConverter;\n\n$converter = new HtmlConverter();\n\n$html = \"\u003ch3\u003eQuick, to the Batpoles!\u003c/h3\u003e\";\n$markdown = $converter-\u003econvert($html);\n```\n\nThe `$markdown` variable now contains the Markdown version of your HTML as a string:\n\n```php\necho $markdown; // ==\u003e ### Quick, to the Batpoles!\n```\n\nThe included `demo` directory contains an HTML-\u003eMarkdown conversion form to try out.\n\n### Conversion options\n\n\u003e [!CAUTION]  \n\u003e By default, this library preserves HTML tags without Markdown equivalents, like `\u003cspan\u003e`, `\u003cdiv\u003e`, `\u003ciframe\u003e`, `\u003cscript\u003e`, etc. 
If you will be parsing untrusted input from users, **please consider setting the `strip_tags` and/or `remove_nodes` options** documented below, and also using a library (like [HTML Purifier](https://github.com/ezyang/htmlpurifier)) to provide additional HTML filtering.\n\nTo strip HTML tags that don't have a Markdown equivalent while preserving the content inside them, set `strip_tags` to true, like this:\n\n```php\n$converter = new HtmlConverter(array('strip_tags' =\u003e true));\n\n$html = '\u003cspan\u003eTurnips!\u003c/span\u003e';\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"Turnips!\"\n```\n\nOr more explicitly, like this:\n\n```php\n$converter = new HtmlConverter();\n$converter-\u003egetConfig()-\u003esetOption('strip_tags', true);\n\n$html = '\u003cspan\u003eTurnips!\u003c/span\u003e';\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"Turnips!\"\n```\n\nNote that only the tags themselves are stripped, not the content they hold.\n\nTo strip tags and their content, pass a space-separated list of tags in `remove_nodes`, like this:\n\n```php\n$converter = new HtmlConverter(array('remove_nodes' =\u003e 'span div'));\n\n$html = '\u003cspan\u003eTurnips!\u003c/span\u003e\u003cdiv\u003eMonkeys!\u003c/div\u003e';\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"\"\n```\n\nBy default, all comments are stripped from the content. To preserve them, use the `preserve_comments` option, like this:\n\n```php\n$converter = new HtmlConverter(array('preserve_comments' =\u003e true));\n\n$html = '\u003cspan\u003eTurnips!\u003c/span\u003e\u003c!-- Monkeys! --\u003e';\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"Turnips!\u003c!-- Monkeys! 
--\u003e\"\n```\n\nTo preserve only specific comments, set `preserve_comments` to an array of strings, like this:\n\n```php\n$converter = new HtmlConverter(array('preserve_comments' =\u003e array('Eggs!')));\n\n$html = '\u003cspan\u003eTurnips!\u003c/span\u003e\u003c!-- Monkeys! --\u003e\u003c!-- Eggs! --\u003e';\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"Turnips!\u003c!-- Eggs! --\u003e\"\n```\n\nBy default, placeholder links are preserved. To strip the placeholder links, use the `strip_placeholder_links` option, like this:\n\n```php\n$converter = new HtmlConverter(array('strip_placeholder_links' =\u003e true));\n\n$html = '\u003ca\u003eGithub\u003c/a\u003e';\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"Github\"\n```\n\n### Style options\n\nBy default, bold tags are converted using the asterisk syntax, and italic tags are converted using the underscore syntax. Change these by using the `bold_style` and `italic_style` options.\n\n```php\n$converter = new HtmlConverter();\n$converter-\u003egetConfig()-\u003esetOption('italic_style', '*');\n$converter-\u003egetConfig()-\u003esetOption('bold_style', '__');\n\n$html = '\u003cem\u003eItalic\u003c/em\u003e and a \u003cstrong\u003ebold\u003c/strong\u003e';\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"*Italic* and a __bold__\"\n```\n\n### Line break options\n\nBy default, `br` tags are converted to two spaces followed by a newline character as per [traditional Markdown](https://daringfireball.net/projects/markdown/syntax#p). 
Set `hard_break` to `true` to omit the two spaces, as per GitHub Flavored Markdown (GFM).\n\n```php\n$converter = new HtmlConverter();\n$html = '\u003cp\u003etest\u003cbr\u003eline break\u003c/p\u003e';\n\n$converter-\u003egetConfig()-\u003esetOption('hard_break', true);\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"test\\nline break\"\n\n$converter-\u003egetConfig()-\u003esetOption('hard_break', false); // default\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"test  \\nline break\"\n```\n\n### Autolinking options\n\nBy default, `a` tags are converted to the simplest possible link syntax, i.e. if no text or title is available, then the `\u003curl\u003e` syntax will be used rather than the full `[url](url)` syntax. Set `use_autolinks` to `false` to change this behavior to always use the full link syntax.\n\n```php\n$converter = new HtmlConverter();\n$html = '\u003cp\u003e\u003ca href=\"https://thephpleague.com\"\u003ehttps://thephpleague.com\u003c/a\u003e\u003c/p\u003e';\n\n$converter-\u003egetConfig()-\u003esetOption('use_autolinks', true); // default\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"\u003chttps://thephpleague.com\u003e\"\n\n$converter-\u003egetConfig()-\u003esetOption('use_autolinks', false);\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"[https://thephpleague.com](https://thephpleague.com)\"\n```\n\n### Passing custom Environment object\n\nYou can pass a custom `Environment` object to customize, for example, 
which converters should be used.\n\n```php\nuse League\\HTMLToMarkdown\\Environment;\nuse League\\HTMLToMarkdown\\HtmlConverter;\nuse League\\HTMLToMarkdown\\Converter\\HeaderConverter;\n\n$environment = new Environment(array(\n    // your configuration here\n));\n$environment-\u003eaddConverter(new HeaderConverter()); // optionally - add converter manually\n\n$converter = new HtmlConverter($environment);\n\n$html = '\u003ch3\u003eHeader\u003c/h3\u003e\n\u003cimg src=\"\" /\u003e\n';\n$markdown = $converter-\u003econvert($html); // $markdown now contains \"### Header\" and \"\u003cimg src=\"\" /\u003e\"\n```\n\n### Table support\n\nSupport for Markdown tables is not enabled by default because it is not part of the original Markdown syntax. To use tables, add the converter explicitly:\n\n```php\nuse League\\HTMLToMarkdown\\HtmlConverter;\nuse League\\HTMLToMarkdown\\Converter\\TableConverter;\n\n$converter = new HtmlConverter();\n$converter-\u003egetEnvironment()-\u003eaddConverter(new TableConverter());\n\n$html = \"\u003ctable\u003e\u003ctr\u003e\u003cth\u003eA\u003c/th\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd\u003ea\u003c/td\u003e\u003c/tr\u003e\u003c/table\u003e\";\n$markdown = $converter-\u003econvert($html);\n```\n\n### Limitations\n\n- Markdown Extra, MultiMarkdown and other variants aren't supported – just Markdown.\n\n### Style notes\n\n- Setext (underlined) headers are the default for H1 and H2. If you prefer the ATX style for H1 and H2 (# Header 1 and ## Header 2), set `header_style` to 'atx' in the options array when you instantiate the object:\n\n    `$converter = new HtmlConverter(array('header_style'=\u003e'atx'));`\n\n     Headers of H3 priority and lower always use ATX style.\n\n- Links and images are referenced inline. 
Footnote references (where image src and anchor href attributes are listed in the footnotes) are not used.\n- Blockquotes aren't line wrapped – it makes the converted Markdown easier to edit.\n\n### Dependencies\n\nHTML To Markdown requires PHP's [xml](http://www.php.net/manual/en/xml.installation.php), [libxml](http://www.php.net/manual/en/libxml.installation.php), and [dom](http://www.php.net/manual/en/dom.installation.php) extensions, all of which are enabled by default on most distributions.\n\nErrors such as \"Fatal error: Class 'DOMDocument' not found\" on distributions such as CentOS that disable PHP's xml extension can be resolved by installing php-xml.\n\n### Contributors\n\nMany thanks to all [contributors](https://github.com/thephpleague/html-to-markdown/graphs/contributors) so far. Further improvements and feature suggestions are very welcome.\n\n### How it works\n\nHTML To Markdown creates a DOMDocument from the supplied HTML, walks through the tree, and converts each node to a text node containing the equivalent Markdown, starting from the most deeply nested node and working outwards towards the root node.\n\n### To-do\n\n- Support for nested lists and lists inside blockquotes.\n- Offer an option to preserve tags as HTML if they contain attributes that can't be represented with Markdown (e.g. 
`style`).\n\n### Trying to convert Markdown to HTML?\n\nUse one of these great libraries:\n\n - [league/commonmark](https://github.com/thephpleague/commonmark) (recommended)\n - [cebe/markdown](https://github.com/cebe/markdown)\n - [PHP Markdown](https://michelf.ca/projects/php-markdown/)\n - [Parsedown](https://github.com/erusev/parsedown)\n\nNo guarantees about the Elvish, though.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthephpleague%2Fhtml-to-markdown","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthephpleague%2Fhtml-to-markdown","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthephpleague%2Fhtml-to-markdown/lists"}