https://github.com/hyvor/phrosemirror

Prosemirror in PHP
https://github.com/hyvor/phrosemirror
php prosemirror
Last synced: 6 months ago
JSON representation
Prosemirror in PHP
Host: GitHub
URL: https://github.com/hyvor/phrosemirror
Owner: hyvor
License: mit
Created: 2022-10-31T11:59:05.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2024-03-28T16:28:01.000Z (over 1 year ago)
Last Synced: 2025-04-16T02:12:28.011Z (6 months ago)
Topics: php, prosemirror
Language: PHP
Homepage:
Size: 115 KB
Stars: 5
Watchers: 1
Forks: 0
Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project

README

          Phrosemirror is a PHP library to work with Prosemirror (or TipTap) JSON content in an easy and type-safe way.

Here is what this library can do:

* Convert Prosemirror JSON into a Document with typed Nodes, Marks, and Attributes

* Analyze and change Documents

* Convert a Document to HTML

* Convert a Document to Text

* Parse HTML to a Document

* `content` and `group` for more strict schema conformity

## Installation

```

composer require hyvor/phrosemirror

```

## 1. Schema

This library is unopinionated, which means there is no default schema. To start, you have to start with defining your schema that is similar to your front-end Prosemirror configurations. 

> You can find an example schema in the `/example` directory in this repo, which is similar to `prosemirror-schema-basic` package's schema.

```php

use Hyvor\Phrosemirror\Types\Schema;

$schema = new Schema(

    [

        new Doc,

        new Text,

        new Paragraph,

        new Blockquote,

        new Image

    ],

    [

        new Strong,

        new Italic,  

    ]

);

```

In the `Schema` constructor, first argument is an array of **Nodes Types** and the second one is an array of **Marks Types**.

### Node Types

A basic node type looks like this:

```php

use Hyvor\Phrosemirror\Types\NodeType;

class Doc extends NodeType

{

    public string $name = 'doc';

    public ?string $content = 'block+';

}

```

They can contain `content` and `group` properties. If `content` is not set, no content is allowed in this node. See [Content & Grouping](#content--grouping) below for more information on how these properties work.

Here is another example of a Node Type:

```php

class Paragraph extends NodeType

{

    public string $name = 'paragraph';

    public ?string $content = 'inline*';

    public string $group = 'block';

}

```

### Mark Types

A basic mark type looks like this:

```php

use Hyvor\Phrosemirror\Types\MarkType;

class Strong extends MarkType

{

    public string $name = 'strong';

}

```

### Attributes (Attrs)

One main goal of this library is to achieve type-safety. Therefore, attributes are defined in a typed class.

```php

use Hyvor\Phrosemirror\Types\AttrsType;

class ImageAttrs extends AttrsType

{

    public string $src;

    public ?string $alt;

    

}

```

> By defining explicit types, we are sure that `src` attribute of the Image is always a string. `alt` can be a string or null.

You can also define default values for attributes, which will be used if they are not present in the JSON document.

```php

class ImageAttrs extends AttrsType

{

    public string $src = 'https://hyvor.com/placeholder.png';

}

```

Then, in the Node Type or Mark Type, you have to mention the Attrs class.

```php

use Hyvor\Phrosemirror\Types\NodeType;

class Image extends NodeType

{

    // ...

    public string $attrs = ImageAttrs::class;

}

```

## 2. Document

Once the Schema is ready, we can start working with Documents.

```php

use Hyvor\Phrosemirror\Types\Schema;

use Hyvor\Phrosemirror\Document\Document;

$schema = new Schema($nodes, $marks);

$json = '{}'; // <- this is the JSON from the front-end

$document = Document::fromJson($schema, $json);

```

`$json` can be a JSON string, a PHP array, or a PHP object. If the given JSON is valid, `$document` will be an instance of `Hyvor\Phrosemirror\Document\Document`. If not, an error will be thrown. See Error Handling below.

### Node

A `Document` is just a `Node` with the `doc` type. These are the properties of a Node.

```php

namespace Hyvor\Phrosemirror\Document;

use Hyvor\Phrosemirror\Types\NodeType;

use Hyvor\Phrosemirror\Types\AttrsType;

class Node

{

    public NodeType $type;

    public AttrsType $attrs;

    public Fragment $content;

    public Mark[] $marks;

}

```

`NodeType $type` is the type of the node, which you defined in the schema

`AttrsType $attrs` is the attributes of the Node. This will be an object of the class you defined in Node Type attrs. For example, as in the above example of Node Types, if the node is `Image`, `$attrs` will be an object of `ImageAttrs`.

`Fragment $content` is a collection of children Nodes.

`Mark[] $marks` is an array of Marks assigned to this node.

> `TextNode` is a special `Node` that represents the `text` node type in Prosemirror. It has the `string $text` property in addition to the above properties. Also, `$marks` only makes sense in the context of `TextNode`.

#### Checking Node Type

Use `isOfType()` to check if a `Node` is of a particular `NodeType` defined in your schema.

```php

$json = ['type' => 'paragraph'];

$node = Node::fromJson($schema, $json);

$node->isOfType(Paragraph::class); // true

$node->isOfType(Image::class); // false

$node->isOfType([Paragraph::class, Image::class]); // true

```

#### Accessing Attributes

Use the `attr()` method to access an attribute of the Node.

```php

$json = ['type' => 'image', 'attrs' => ['src' => 'image.png']];

$image = Node::fromJson($schema, $json);

// html-escaped (safe to use in HTML output)

$src = $image->attr('src');

// not html-escaped

$src = $image->attr('src', escape: false);

```

#### Traversing Through Nested Nodes

You can traverse through nested nodes using the `traverse()` method with a callback. Here is an example that traverse through all nodes and finds all image nodes.

```php

$document = Document::fromJson($schema, $json);

$images = [];

$document->traverse(function(Node $node) use(&$images) {

    if ($node->isOfType(Image::class)) {

        $images[] = $node;

    }

})

```

> `traverse()` traverses through `TextNode`s too!

#### Traversing Through Direct Children

Use `foreach` with `$node->content`.

```php

foreach ($node->content as $child) {

    if ($child->isOfType(Image::class)) {

        echo "I found an image!";

    }

}

```

#### Finding Nodes

Earlier, we used `traverse()` to find nodes, but there is the `getNodes()` method to make it easier. It searches through the all nested nodes and returns `Node[]` of matched nodes.

```php

// images

$node->getNodes(Image::class);

// all nodes (including TextNodes)

$node->getNodes();

// nodes of multiple types

$node->getNodes([Paragraph::class, Blockquote::class]);

// images (only direct children)

$node->getNodes(Image::class, false);

```

#### Finding Marks

Similar to `getNodes()` you can use `getMarks()` to find marks within the current node. It searches all nested nodes and returns `Mark[]` of matched marks.

```php

// links

$node->getMarks(Link::class);

// all marks

$node->getMarks();

// multiple types

$node->getMarks([Strong::class, Italic::class]);

// without nesting (marks of the current node only)

$node->getMarks(Link::class, false);

```

#### JSON Serialize

You can serialize a Node/Document back to JSON.

```php

$node->toJson(); // JSON string

$node->toArray(); // PHP array

```

### Mark

```php

namespace Hyvor\Phrosemirror\Document;

use Hyvor\Phrosemirror\Types\MarkType;

use Hyvor\Phrosemirror\Types\AttrsType;

class Mark

{

    public MarkType $type;

    public AttrsType $attrs;  

}

```

`$type` and `$attrs` are analogous to those of Node's.

`Mark` has `isOfType()`, `attr()`, `toArray()`, and `toJson()`, which works similar to `Node`'s methods.

```php

$mark = Mark::fromJson(['type' => 'link', 'attrs' => ['src' => 'https://hyvor.com']);

$mark->isOfType(Strong::class); // false

$mark->attr('src'); // https://hyvor.com

```

### Fragment

`$node->content` is a `Fragment`. It contains an array of children nodes. You can think of it just as an array, but with helper methods that makes things easier.

```php

$fragment = $node->content();

// READ

$fragment->first(); // Node | null

$fragment->last(); // Node | null

$fragment->nth(2); // Node | null

$fragment->count(); // int

// get all Nodes in the Fragment as an array

$fragment->all(); // Node[]

// loop through each node

$fragment->each(fn (Node $node) => false);

// WRITE (Be careful, these methods changes the document)

$fragment->addNodeToStart($node);

$fragment->addNodeToEnd($node);

$fragment->addNode($node); // same as addNodeToEnd

$fragment->setNodes($nodes);

$fragment->map(fn (Node $node) => $node); // update nodes in a callback

```

## 3. HTML

Next, let's convert your document to HTML. To do this, you have to define the `toHtml()` method in Node Types and Mark Types.

```php

use Hyvor\Phrosemirror\Document\Node;

use Hyvor\Phrosemirror\Types\NodeType;

class Paragraph extends NodeType

{

    public function toHtml(Node $node, string $children) : string

    {

        return "
$children";

    }

}

```

`toHtml()` should return the HTML string of the node, placing the `$children` string in it.

Here is another example using the attributes of that Node.

```php

use Hyvor\Phrosemirror\Document\Node;

use Hyvor\Phrosemirror\Types\NodeType;

class Image extends NodeType

{

    public function toHtml(Node $node, string $children) : string

    {

        $src = $node->attr('src');

        return "$children";

    }

}

```

> Do not directly use `$node->attrs->src` as the raw attributes are not HTML-escaped. Always use `$node->attr()` or `$node->attrs->get()`

### HTML: Document -> HTML

Use the `toHtml()` method to serialize a document (or any node) to HTML. 

```php

$document = Document::fromJson($schema, $json);

$html = $document->toHtml();

```

## Parsing HTML

The `HtmlParser` class is responsible for parsing HTML to a Document. It takes the Schema and some parsing rules to parse the HTML.

```php

parse($html);

```

However, in most cases, you only need one rule set to parse from multiple HTML inputs. Therefore, you can directly define rules in the Schema (in the `fromHtml()` method of Nodes and Marks).

```php

use Hyvor\Phrosemirror\Types\NodeType;

use Hyvor\Phrosemirror\Converters\HtmlParser\ParserRule;

use Hyvor\Phrosemirror\Document\Node;

class Paragraph extends NodeType

{

    public string $name = 'paragraph';

    public ?string $content = 'inline*';

    public string $group = 'block';

    public function toHtml(Node $node, string $children): string

    {

        return "
$children";

    }

    public function fromHtml(): array

    {

        return [

            new ParserRule(tag: 'p'),

        ];

    }

}

```

The `fromHtml()` method should return `ParserRule[]`. Here, the `node` property is not required as it is the same as the Node Type's name.

Then, use the `fromSchema()` method to create the parser.

```php

$parser = HtmlParser::fromSchema($schema);

$doc = $parser->parse($html);

```

### Parsing HTML Attributes to Node Attributes

Use the `getAttrs()` method to parse attributes from the HTML element.

```php

use DOMElement;

class Image extends NodeType

{

    public string $name = 'image';

    public string $attrs = ImageAttrs::class;

    

    public function fromHtml() : array

    {

    

        return [

            new ParserRule(

                tag: 'img', 

                getAttrs: fn (DOMElement $element) => ImageAttrs::fromArray([

                    'src' => $element->getAttribute('src'),

                    'alt' => $element->getAttribute('alt'),

                ])

            )

        ];

    

    }

}

```

The `getAttrs()` callback should return one of the following:

- `false` to ignore the element

- `null` if the attributes are not found

- `AttrsType` if the attributes are found

### Content & Grouping

Defining `content` and `group` properties in Node Types is important for parsing HTML. 

For example, let's say we have a `blockquote` node with `content` set to `block+`. This means that the `blockquote` node can only contain block nodes. Therefore, the following HTML does not conform to the schema.

```html

Hello World

```

This is when `content` and `group` properties come in handy. Because, we know that the `blockquote` node can only contain block nodes such as paragraphs, the HTML parser will automatically wrap the text in a paragraph when parsing. The resulting HTML would be:

```html

Hello World

```

This logic is handled by `Sanitizer` class. Simply, it does the following to ensure `content` and `group` conformity:

* Tries to wrap nodes

* Tries to promote children

* Tries to connect inline nodes

* If all fails, it will remove the node

`content` expressions support everything the Prosemirror front-end library supports. Here are some examples:

* `paragraph`

* `paragraph|heading`

* `paragraph?`

* `paragraph*`

* `paragraph+`

* `paragraph{1,3}`

* `paragraph{1,}`

* `paragraph heading` (subsequent nodes)

* `paragraph (heading | code_block)+`

* `block+` (using groups)

**Note**: This sanitization process is only run when parsing a document from HTML. It is not run when parsing a document from JSON, because we expect the JSON (usually from your front-end) to be valid. However, you can still run the sanitization process as follows if needed:

```php

$doc = Document::fromJson($schema, $json);

$sanitizedDoc = Sanitizer::sanitize($schema, $doc);

```

### Disabling Sanitization

You can disable `content` sanitization when parsing HTML by setting `sanitize: false`.

```php

$parser = HtmlParser::fromSchema($schema);

$doc = $parser->parse($html, sanitize: false);

```

> ⚠️ **Warning:** Disabling sanitization can result in invalid documents.

## Error Handling

This library is strict, and it expects correct input from the front-end. It can throw the following exceptions:

* `InvalidJsonException` - on invalid JSON

* `InvalidAttributeTypeException` - on invalid attribute type

Both exceptions extend the `PhrosemirrorException` class. Therefore, the best practise would be catching it when building the document.

```php

use Hyvor\Phrosemirror\Exception\PhrosemirrorException;

try {

    $document = Document::fromJson($schema, $json);

} catch (PhrosemirrorException $e) {

    // invalid document

}

```

If the front-end (JS) and back-end (PHP) schema matches, the only way an exception can happen is when the Prosemirror JSON is altered. Therefore, it is a good practise to stop processing here.

## Who uses this Library?

* [Hyvor Talk](https://talk.hyvor.com)

* [Hyvor Blogs](https://blogs.hyvor.com)

* Add yours with a PR
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/hyvor/phrosemirror

Awesome Lists containing this project

README