https://github.com/chubes4/html-to-blocks-converter
WordPress plugin that converts raw HTML to Gutenberg blocks, inspired by Gutenberg's client-side rawHandler
- Host: GitHub
- URL: https://github.com/chubes4/html-to-blocks-converter
- Owner: chubes4
- Created: 2025-11-26T19:30:54.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2026-04-25T18:51:29.000Z (about 2 months ago)
- Last Synced: 2026-04-25T19:05:19.964Z (about 2 months ago)
- Topics: gutenberg, wordpress
- Language: PHP
- Homepage: https://chubes.net
- Size: 70.3 KB
- Stars: 15
- Watchers: 1
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
# HTML to Blocks Converter
A WordPress plugin **and Composer package** that converts raw HTML to Gutenberg block arrays using WordPress Core's HTML API.
It works in two modes:
- **Plugin mode:** activate the plugin and it automatically converts raw HTML to blocks on `wp_insert_post()` and REST editor reads for public REST-enabled post types.
- **Package mode:** `composer require chubes4/html-to-blocks-converter` and load WordPress. Composer autoload registers the same conversion library and automatic hooks through the version registry. Consumers can also call `html_to_blocks_raw_handler()` directly.
## Description
This plugin provides server-side HTML-to-blocks conversion using WordPress Core's HTML API (`WP_HTML_Processor`) for spec-compliant HTML5 parsing. Inspired by Gutenberg's client-side `rawHandler` function from [`packages/blocks/src/api/raw-handling`](https://github.com/WordPress/gutenberg/tree/trunk/packages/blocks/src/api/raw-handling), it enables programmatic content creation with proper block structure, and ensures the block editor sees proper blocks for supported post types even when `post_content` contains raw HTML.
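
To make "block arrays" concrete, here is a sketch of the parsed-block shape WordPress core works with (the keys that `parse_blocks()` returns and `serialize_blocks()` accepts). The attribute values and markup are illustrative assumptions; the arrays h2bc actually emits for the same HTML may carry different attributes or class names.

```php
// Illustrative only: the values below are assumptions, not captured h2bc output.
// The keys are WordPress core's parsed-block shape.
$example_blocks = [
    [
        'blockName'    => 'core/heading',
        'attrs'        => [ 'level' => 2 ],
        'innerBlocks'  => [],
        'innerHTML'    => '<h2 class="wp-block-heading">Hello World</h2>',
        'innerContent' => [ '<h2 class="wp-block-heading">Hello World</h2>' ],
    ],
    [
        'blockName'    => 'core/paragraph',
        'attrs'        => [],
        'innerBlocks'  => [],
        'innerHTML'    => '<p>This is my content.</p>',
        'innerContent' => [ '<p>This is my content.</p>' ],
    ],
];

// Inside WordPress, core's serialize_blocks() turns such an array into
// comment-delimited block markup suitable for post_content.
if ( function_exists( 'serialize_blocks' ) ) {
    $post_content = serialize_blocks( $example_blocks );
}
```

Nested structures such as lists would place their child blocks in `innerBlocks` rather than at the top level.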
### Use Cases
- Migrating legacy content to Gutenberg blocks
- Importing content from external sources via REST API
- Programmatically creating posts with block-based content
- Converting HTML from headless CMS or content pipelines
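
For the migration case, one possible approach is to re-save posts that still hold raw HTML so the automatic write hook can convert them. This is a minimal sketch, not part of h2bc: the `h2bc_example_*` helpers are hypothetical, and it assumes plugin mode also converts content on updates, since `wp_update_post()` routes through `wp_insert_post()`.

```php
/**
 * True when content has no block delimiters, i.e. it is still raw HTML.
 * Uses the same substring that core's has_blocks() looks for.
 */
function h2bc_example_is_raw_html( string $content ): bool {
    return '' !== trim( $content ) && false === strpos( $content, '<!-- wp:' );
}

/**
 * Re-save raw-HTML posts so the write hook gets a chance to convert them.
 * Hypothetical helper; returns the IDs it re-saved.
 */
function h2bc_example_resave_legacy_posts( int $limit = 50 ): array {
    $resaved = [];
    $posts   = get_posts( [
        'post_type'      => 'post',
        'post_status'    => 'any',
        'posts_per_page' => $limit,
    ] );

    foreach ( $posts as $post ) {
        if ( ! h2bc_example_is_raw_html( $post->post_content ) ) {
            continue; // Already block content; leave it alone.
        }

        // Passing only the ID re-submits the stored fields unchanged.
        $result = wp_update_post( [ 'ID' => $post->ID ], true );
        if ( ! is_wp_error( $result ) ) {
            $resaved[] = $post->ID;
        }
    }

    return $resaved;
}
```

Re-saving creates revisions and updates modified dates, so a real migration would likely run in batches on a staging copy first.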
## Supported Block Transforms
The plugin converts high-confidence static HTML patterns to their corresponding Gutenberg blocks:
| HTML signal | Block type |
|-------------|------------|
| `<h1>` - `<h6>` | `core/heading` |
| `<p>` and plain text | `core/paragraph` |
| `<ul>`, `<ol>` | `core/list` with `core/list-item` children |
| `<blockquote>` | `core/quote` |
| pullquote markup | `core/pullquote` |
| `<img>`, `<figure>` | `core/image` |
| gallery-like wrappers with multiple images | `core/gallery` with `core/image` children |
| `<video>` / `<audio>` with a source | `core/video` / `core/audio` |
| recognized provider `<iframe>` embeds | `core/embed` |
| downloadable file anchors | `core/file` |
| media-text wrappers | `core/media-text` |
| WordPress button anchors | `core/buttons` with `core/button` children |
| `<details>` | `core/details` |
| verse markup | `core/verse` |
| `<pre><code>` | `core/code` |
| `<pre>` | `core/preformatted` |
| `<table>` | `core/table` |
| `<hr>` | `core/separator` |
| WordPress shortcodes | `core/shortcode` |
| high-confidence semantic/layout wrappers | `core/group`, `core/columns`, `core/column`, `core/cover`, `core/spacer` |

Nested lists and blockquotes with multiple paragraphs are fully supported.

For the source-of-truth status of supported transforms, observed fallbacks,
future candidates, and context-required block families, see the
[Core Block Coverage Matrix](docs/core-block-coverage.md).

For Site Editor and block theme boundaries, including which block families
should not be inferred from raw HTML alone, see
[Site Editor Boundary](docs/site-editor-boundary.md).

For the supported subset h2bc intentionally keeps aligned with Gutenberg's
`rawHandler`, see [Gutenberg rawHandler Parity](docs/gutenberg-rawhandler-parity.md).

Unsupported top-level elements are preserved as `core/html` instead of guessed.
When that fallback is used, h2bc fires `html_to_blocks_unsupported_html_fallback`
with the unsupported HTML fragment, fallback context, and generated block so
production pipelines can log, warn, or fail on unexpected fallback usage.

## Installation
1. Download the plugin zip file
2. Navigate to Plugins > Add New > Upload Plugin
3. Upload the zip file and activate

Or clone directly to your plugins directory:
```bash
cd wp-content/plugins
git clone https://github.com/chubes4/html-to-blocks-converter.git
```

Or install as a Composer package:
```bash
composer require chubes4/html-to-blocks-converter
```

Composer autoloads `library.php`, which registers the conversion library
through an Action-Scheduler-style version registry. The winning library version
loads the raw handler and the automatic write/read hooks so bundled consumers get
the same HTML → blocks behavior as the standalone plugin.

When h2bc is bundled through php-scoper, callbacks registered with WordPress hook
APIs must resolve inside the scoped namespace. Build hook callback strings from
`__NAMESPACE__` so the same source works as the standalone plugin and as a scoped
dependency.

## Usage
The plugin hooks into `wp_insert_post_data` and automatically converts HTML content to blocks for supported post types. No configuration required for public REST-enabled post types.
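
Because conversion runs automatically on save, a pipeline may want to notice when content fell back to `core/html`. The following is a minimal sketch of a listener for the `html_to_blocks_unsupported_html_fallback` action. The `h2bc_example_*` names are hypothetical, and the argument order (fragment, context, block) is an assumption taken from the order in which the fallback description lists them. The callback string is built from `__NAMESPACE__`, following the php-scoper guidance in the installation notes.

```php
/**
 * Build a callback string that resolves both unscoped (empty namespace)
 * and inside a namespace added by a scoping tool. Hypothetical helper.
 */
function h2bc_example_callback( string $function_name ): string {
    return '' === __NAMESPACE__ ? $function_name : __NAMESPACE__ . '\\' . $function_name;
}

/**
 * Summarize a fallback event in one log line.
 */
function h2bc_example_describe_fallback( string $html, $block ): string {
    $block_name = ( is_array( $block ) && isset( $block['blockName'] ) )
        ? $block['blockName']
        : 'unknown';
    $snippet    = substr( trim( $html ), 0, 80 ); // Keep log lines short.

    return sprintf( '[h2bc] fallback to %s: %s', $block_name, $snippet );
}

/**
 * Action callback. Assumed argument order: fragment, context, block.
 */
function h2bc_example_on_fallback( $html, $context = null, $block = null ): void {
    error_log( h2bc_example_describe_fallback( (string) $html, $block ) );
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_action' ) ) {
    add_action(
        'html_to_blocks_unsupported_html_fallback',
        h2bc_example_callback( 'h2bc_example_on_fallback' ),
        10,
        3
    );
}
```

A stricter pipeline could throw from the callback instead of logging, so that unexpected fallbacks fail the import.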
### Programmatic Usage
```php
// Content will be automatically converted to blocks
wp_insert_post([
    'post_title'   => 'My Post',
    'post_content' => '<h1>Hello World</h1><p>This is my content.</p>',
    'post_status'  => 'publish',
    'post_type'    => 'post',
]);
```

### REST API Usage
```bash
curl -X POST https://yoursite.com/wp-json/wp/v2/posts \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "My Post",
    "content": "<h1>Hello World</h1><p>This is my content.</p>",
    "status": "publish"
  }'
```

### Direct Conversion
```php
$html = '<h1>Title</h1><p>Paragraph with <strong>bold</strong> text.</p>';
$blocks = html_to_blocks_raw_handler(['HTML' => $html]);
$block_content = serialize_blocks($blocks);
```

## REST API Read Path (v0.4.0+)
The plugin also converts HTML to blocks when the block editor loads a post via the REST API. When `context=edit` is requested, any post with HTML in `content.raw` (no `