{"id":35233954,"url":"https://github.com/chubes4/html-to-blocks-converter","last_synced_at":"2026-05-10T00:00:17.109Z","repository":{"id":326424165,"uuid":"1104853135","full_name":"chubes4/html-to-blocks-converter","owner":"chubes4","description":"WordPress plugin that converts raw HTML to Gutenberg blocks, inspired by Gutenberg's client-side rawHandler","archived":false,"fork":false,"pushed_at":"2026-04-25T18:51:29.000Z","size":72,"stargazers_count":15,"open_issues_count":0,"forks_count":3,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-25T19:05:19.964Z","etag":null,"topics":["gutenberg","wordpress"],"latest_commit_sha":null,"homepage":"https://chubes.net","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chubes4.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-26T19:30:54.000Z","updated_at":"2026-04-25T18:51:33.000Z","dependencies_parsed_at":"2026-04-25T19:01:34.260Z","dependency_job_id":null,"html_url":"https://github.com/chubes4/html-to-blocks-converter","commit_stats":null,"previous_names":["chubes4/html-to-blocks-converter"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/chubes4/html-to-blocks-converter","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chubes4%2Fhtml-to-blocks-converter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chubes4%2Fhtml-to-blocks-converter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chubes4%2Fhtml-to-blocks-converter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chubes4%2Fhtml-to-blocks-converter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chubes4","download_url":"https://codeload.github.com/chubes4/html-to-blocks-converter/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chubes4%2Fhtml-to-blocks-converter/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32454170,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-29T22:27:22.272Z","status":"online","status_checked_at":"2026-04-30T02:00:05.929Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gutenberg","wordpress"],"created_at":"2025-12-30T03:25:38.995Z","updated_at":"2026-05-10T00:00:17.092Z","avatar_url":"https://github.com/chubes4.png","language":"PHP","funding_links":[],"categories":[],"sub_categories":[],"readme":"# HTML to Blocks Converter\n\nA WordPress plugin **and Composer package** that converts raw HTML to Gutenberg block arrays using WordPress Core's HTML API.\n\nIt works in two modes:\n\n- **Plugin mode:** activate the plugin and it automatically converts raw HTML to blocks on `wp_insert_post()` and REST editor reads for public REST-enabled post types.\n- **Package mode:** `composer require chubes4/html-to-blocks-converter` and load WordPress. Composer autoload registers the same conversion library and automatic hooks through the version registry. Consumers can also call `html_to_blocks_raw_handler()` directly.\n\n## Description\n\nThis plugin provides server-side HTML-to-blocks conversion using WordPress Core's HTML API (`WP_HTML_Processor`) for spec-compliant HTML5 parsing. Inspired by Gutenberg's client-side `rawHandler` function from [`packages/blocks/src/api/raw-handling`](https://github.com/WordPress/gutenberg/tree/trunk/packages/blocks/src/api/raw-handling), it enables programmatic content creation with proper block structure, and ensures the block editor sees proper blocks for supported post types even when `post_content` contains raw HTML.\n\n### Use Cases\n\n- Migrating legacy content to Gutenberg blocks\n- Importing content from external sources via REST API\n- Programmatically creating posts with block-based content\n- Converting HTML from headless CMS or content pipelines\n\n## Supported Block Transforms\n\nThe plugin converts high-confidence static HTML patterns to their corresponding Gutenberg blocks:\n\n| HTML signal | Block type |\n|-------------|------------|\n| `\u003ch1\u003e` - `\u003ch6\u003e` | `core/heading` |\n| `\u003cp\u003e` and plain text | `core/paragraph` |\n| `\u003cul\u003e`, `\u003col\u003e` | `core/list` with `core/list-item` children |\n| `\u003cblockquote\u003e` | `core/quote` |\n| `\u003cblockquote class=\"wp-block-pullquote\"\u003e` | `core/pullquote` |\n| `\u003cfigure\u003e\u003cimg\u003e`, `\u003cimg\u003e` | `core/image` |\n| gallery-like wrappers with multiple images | `core/gallery` with `core/image` children |\n| `\u003cvideo\u003e` / `\u003caudio\u003e` with a source | `core/video` / `core/audio` |\n| recognized provider `\u003ciframe\u003e` embeds | `core/embed` |\n| downloadable file anchors | `core/file` |\n| media-text wrappers | `core/media-text` |\n| WordPress button anchors | `core/buttons` with `core/button` children |\n| `\u003cdetails\u003e` | `core/details` |\n| `\u003cpre class=\"wp-block-verse\"\u003e` | `core/verse` |\n| `\u003cpre\u003e\u003ccode\u003e` | `core/code` |\n| `\u003cpre\u003e` | `core/preformatted` |\n| `\u003chr\u003e` | `core/separator` |\n| `\u003ctable\u003e` | `core/table` |\n| WordPress shortcodes | `core/shortcode` |\n| high-confidence semantic/layout wrappers | `core/group`, `core/columns`, `core/column`, `core/cover`, `core/spacer` |\n\nNested lists and blockquotes with multiple paragraphs are fully supported.\n\nFor the source-of-truth status of supported transforms, observed fallbacks,\nfuture candidates, and context-required block families, see the\n[Core Block Coverage Matrix](docs/core-block-coverage.md).\n\nFor Site Editor and block theme boundaries, including which block families\nshould not be inferred from raw HTML alone, see\n[Site Editor Boundary](docs/site-editor-boundary.md).\n\nFor the supported subset h2bc intentionally keeps aligned with Gutenberg's\n`rawHandler`, see [Gutenberg rawHandler Parity](docs/gutenberg-rawhandler-parity.md).\n\nUnsupported top-level elements are preserved as `core/html` instead of guessed.\nWhen that fallback is used, h2bc fires `html_to_blocks_unsupported_html_fallback`\nwith the unsupported HTML fragment, fallback context, and generated block so\nproduction pipelines can log, warn, or fail on unexpected fallback usage.\n\n## Installation\n\n1. Download the plugin zip file\n2. Navigate to Plugins \u003e Add New \u003e Upload Plugin\n3. Upload the zip file and activate\n\nOr clone directly to your plugins directory:\n\n```bash\ncd wp-content/plugins\ngit clone https://github.com/chubes4/html-to-blocks-converter.git\n```\n\nOr install as a Composer package:\n\n```bash\ncomposer require chubes4/html-to-blocks-converter\n```\n\nComposer autoloads `library.php`, which registers the conversion library\nthrough an Action-Scheduler-style version registry. The winning library version\nloads the raw handler and the automatic write/read hooks so bundled consumers get\nthe same HTML → blocks behavior as the standalone plugin.\n\nWhen h2bc is bundled through php-scoper, callbacks registered with WordPress hook\nAPIs must resolve inside the scoped namespace. Build hook callback strings from\n`__NAMESPACE__` so the same source works as the standalone plugin and as a scoped\ndependency.\n\n## Usage\n\nThe plugin hooks into `wp_insert_post_data` and automatically converts HTML content to blocks for supported post types. No configuration required for public REST-enabled post types.\n\n### Programmatic Usage\n\n```php\n// Content will be automatically converted to blocks\nwp_insert_post([\n    'post_title'   =\u003e 'My Post',\n    'post_content' =\u003e '\u003ch1\u003eHello World\u003c/h1\u003e\u003cp\u003eThis is my content.\u003c/p\u003e',\n    'post_status'  =\u003e 'publish',\n    'post_type'    =\u003e 'post',\n]);\n```\n\n### REST API Usage\n\n```bash\ncurl -X POST https://yoursite.com/wp-json/wp/v2/posts \\\n  -H \"Authorization: Bearer YOUR_TOKEN\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"title\": \"My Post\",\n    \"content\": \"\u003ch1\u003eHello World\u003c/h1\u003e\u003cp\u003eThis is my content.\u003c/p\u003e\",\n    \"status\": \"publish\"\n  }'\n```\n\n### Direct Conversion\n\n```php\n$html = '\u003ch1\u003eTitle\u003c/h1\u003e\u003cp\u003eParagraph with \u003cstrong\u003ebold\u003c/strong\u003e text.\u003c/p\u003e';\n$blocks = html_to_blocks_raw_handler(['HTML' =\u003e $html]);\n$block_content = serialize_blocks($blocks);\n```\n\n## REST API Read Path (v0.4.0+)\n\nThe plugin also converts HTML to blocks when the block editor loads a post via the REST API. When `context=edit` is requested, any post with HTML in `content.raw` (no `\u003c!-- wp:` block markup) is automatically converted to proper block markup before the editor sees it.\n\nThis means the block editor always shows proper blocks — even when `post_content` was written as raw HTML by a migration script, an external API, or another plugin. No \"Convert to blocks\" prompt.\n\nThe REST filters are registered at `init` priority 20 to ensure all custom post types are available.\n\n### Package Mode\n\nWhen loaded by Composer inside WordPress, the version registry loads both the\nconversion API and the automatic hooks. Consumers that only need direct\nconversion can call the raw handler without going through the hooks:\n\n```php\n// Available after Composer autoload runs.\n$blocks = html_to_blocks_raw_handler([\n    'HTML' =\u003e '\u003ch1\u003eHello\u003c/h1\u003e\u003cp\u003eWorld\u003c/p\u003e',\n]);\n```\n\nPackage consumers can call the raw handler directly for adapter pipelines, while\nh2bc still registers its normal hooks for plain HTML write/read paths.\n\n## Filters\n\n### `html_to_blocks_supported_post_types`\n\nModify which post types support automatic HTML-to-blocks conversion.\n\n```php\nadd_filter('html_to_blocks_supported_post_types', function($post_types) {\n    $post_types[] = 'custom_post_type';\n    return $post_types;\n});\n```\n\nDefault: all public REST-enabled post types via `get_post_types(['show_in_rest' =\u003e true, 'public' =\u003e true])`\n\n### `html_to_blocks_unsupported_html_fallback`\n\nObserve unsupported or intentionally ambiguous fragments that are preserved as\n`core/html` instead of guessed.\n\n```php\nadd_action('html_to_blocks_unsupported_html_fallback', function($html, $context, $block) {\n    error_log('h2bc fallback: ' . ($context['reason'] ?? 'unknown'));\n}, 10, 3);\n```\n\n### `html_to_blocks_loaded`\n\nRuns after the version registry initializes the winning h2bc copy. Receives the\nloaded version string.\n\n## Architecture\n\nThe plugin uses WordPress Core's HTML API for parsing:\n\n- **HTML Element Adapter** - DOM-like interface over `WP_HTML_Processor` for familiar traversal methods\n- **Transform Registry** - PHP port of block transforms from `packages/block-library/src/*/transforms.js`\n- **Block Factory** - Creates block arrays compatible with `serialize_blocks()`\n- **Raw Handler** - Main conversion pipeline using `WP_HTML_Processor::create_fragment()`\n- **Attribute Parser** - Extracts block attributes from HTML using WordPress HTML API\n\n### Dual-mode loading\n\n`library.php` is the package entry point. It registers the local copy's version\nand initializer with `HTML_To_Blocks_Versions`. On `plugins_loaded:1`, the\nregistry initializes the highest registered version exactly once. This lets\nmultiple plugins bundle the package while the standalone plugin is also active;\neveryone gets the newest loaded conversion library and no duplicate class/function\ndefinitions.\n\n`html-to-blocks-converter.php` is the plugin shell. It performs the standalone\nplugin's WordPress/PHP guard checks, then loads `library.php`. Composer consumers\nskip the plugin shell but still load the raw handler and automatic hooks through\nthe library initializer.\n\n### Why WordPress HTML API?\n\n- HTML5 spec-compliant parsing that matches browser behavior\n- Proper UTF-8 character encoding handling\n- Correct handling of implied/virtual tags\n- WordPress Core maintained and security hardened\n- Future-proof as the API continues to improve\n\n## Requirements\n\n- WordPress 6.4+ (required for `WP_HTML_Processor`)\n- PHP 7.4+\n\n## License\n\nGPL v2 or later\n\n## Credits\n\nDirectly inspired by the [Gutenberg](https://github.com/WordPress/gutenberg) project's client-side raw handling implementation.\n\n## Author\n\n[Chris Huber](https://chubes.net)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchubes4%2Fhtml-to-blocks-converter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchubes4%2Fhtml-to-blocks-converter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchubes4%2Fhtml-to-blocks-converter/lists"}