https://github.com/longbridgeapp/html-pipeline
HTML processing filters and utilities in Go version
- Host: GitHub
- URL: https://github.com/longbridgeapp/html-pipeline
- Owner: longbridgeapp
- License: mit
- Created: 2020-02-17T04:47:56.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2021-11-19T11:57:32.000Z (about 3 years ago)
- Last Synced: 2024-05-22T22:08:31.890Z (7 months ago)
- Topics: go, html, html-processor, markdown, pipeline, text-processor
- Language: Go
- Homepage:
- Size: 91.8 KB
- Stars: 20
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
- awesome-crystal - html-pipeline - HTML processing filters and utilities (Markdown/Text Processors)
README
# HTML Pipeline for Go
[![Go](https://github.com/longbridgeapp/html-pipeline/actions/workflows/go.yml/badge.svg)](https://github.com/longbridgeapp/html-pipeline/actions/workflows/go.yml)
This is the Go version of [html-pipeline](https://github.com/jch/html-pipeline).
## Other versions
- [html-pipeline](https://github.com/jch/html-pipeline) - Ruby
- [html-pipeline.cr](https://github.com/huacnlee/html-pipeline.cr) - Crystal

## Usage
```go
package main

import (
	"fmt"

	"github.com/PuerkitoBio/goquery"
	pipeline "github.com/longbridgeapp/html-pipeline"
)

// ImageMaxWidthFilter is a custom filter example.
// It sets a max-width style on every img element in the document.
type ImageMaxWidthFilter struct{}

func (f ImageMaxWidthFilter) Call(doc *goquery.Document) (err error) {
	doc.Find("img").Each(func(i int, node *goquery.Selection) {
		node.SetAttr("style", `max-width: 100%`)
	})

	return
}

func main() {
	// Filters run in the order they are listed.
	pipe := pipeline.NewPipeline([]pipeline.Filter{
		pipeline.MarkdownFilter{},
		pipeline.SanitizationFilter{},
		ImageMaxWidthFilter{},
		pipeline.MentionFilter{
			Prefix: "#",
			// Format receives the matched name without its prefix.
			Format: func(name string) string {
				return fmt.Sprintf(`#%s`, name)
			},
		},
		pipeline.MentionFilter{
			Prefix: "@",
			Format: func(name string) string {
				return fmt.Sprintf(`@%s`, name)
			},
		},
	})

	markdown := `# Hello world

![](javascript:alert) [Click me](javascript:alert)

This is #html-pipeline example, @huacnlee created.`
	// The error is ignored here for brevity.
	out, _ := pipe.Call(markdown)
	fmt.Println(out)

	/*
		Visible text of the resulting HTML:

		Hello world
		Click me
		This is #html-pipeline example, @huacnlee created.
	*/
}
```

https://play.golang.org/p/zB0T7KczdB4
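The usage example passes its filters as an ordered slice, so the order in which they are listed affects the result. The following is a minimal, standard-library-only sketch of that chaining idea. It is not the library's implementation: `Filter` here is a hypothetical string-based interface (the real filters receive a `*goquery.Document`), and `TrimFilter`, `ExclaimFilter`, and `RunFilters` are names invented for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Filter is a hypothetical, string-based stand-in for a pipeline stage.
// The real html-pipeline filters operate on a *goquery.Document instead.
type Filter interface {
	Call(in string) (string, error)
}

// TrimFilter is an illustrative filter that strips surrounding whitespace.
type TrimFilter struct{}

func (TrimFilter) Call(in string) (string, error) { return strings.TrimSpace(in), nil }

// ExclaimFilter is an illustrative filter that appends "!".
type ExclaimFilter struct{}

func (ExclaimFilter) Call(in string) (string, error) { return in + "!", nil }

// RunFilters applies each filter in order, feeding every output into the
// next filter and stopping at the first error.
func RunFilters(filters []Filter, in string) (string, error) {
	out := in
	for _, f := range filters {
		var err error
		out, err = f.Call(out)
		if err != nil {
			return "", err
		}
	}
	return out, nil
}

func main() {
	out, err := RunFilters([]Filter{TrimFilter{}, ExclaimFilter{}}, "  hello  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // hello!
}
```

Swapping the two filters in this sketch yields `hello  !` instead, which is the same reason a sanitizing step is usually placed after the step that generates the HTML.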
## Use for Plain Text case
Sometimes you may want to use html-pipeline to process plain text.
For example:
- Match mentions, and then send notifications.
- Convert mentions, hashtags, or other text into another format.

In HTML mode, however, the pipeline escapes some characters (`"`, `'`, `&`), which is not wanted for plain text.
For this case, the `NewPlainPipeline` function creates a plain-mode pipeline that performs no escaping.

> NOTE: For security, this pipeline removes all HTML tags (anything matching `<.+?>`).
```go
package main

import (
	"fmt"

	pipeline "github.com/longbridgeapp/html-pipeline"
)

func main() {
	pipe := pipeline.NewPlainPipeline([]pipeline.Filter{
		pipeline.MentionFilter{
			Prefix: "#",
			Format: func(name string) string {
				return fmt.Sprintf(`[hashtag name="%s"]%s[/hashtag]`, name, name)
			},
		},
		pipeline.MentionFilter{
			Prefix: "@",
			Format: func(name string) string {
				return fmt.Sprintf(`[mention name="%s"]@%s[/mention]`, name, name)
			},
		},
	})

	text := `"Hello" & 'world' this danger is #html-pipeline created by @huacnlee.`
	out, _ := pipe.Call(text)
	fmt.Println(out)
	// "Hello" & 'world' this danger is [hashtag name="html-pipeline"]html-pipeline[/hashtag] created by [mention name="huacnlee"]@huacnlee[/mention].
}
```

https://play.golang.org/p/vxKZU9jJi3u
## Built-in filters
- [SanitizationFilter](https://github.com/longbridgeapp/html-pipeline/blob/master/sanitization_filter.go) - Uses the [bluemonday](https://github.com/microcosm-cc/bluemonday) default UGCPolicy to sanitize HTML.
- [MarkdownFilter](https://github.com/longbridgeapp/html-pipeline/blob/master/markdown_filter.go) - Uses [blackfriday](https://github.com/russross/blackfriday) to convert Markdown to HTML.
- [MentionFilter](https://github.com/longbridgeapp/html-pipeline/blob/master/mention_filter.go) - Match Mention or HashTag like Twitter.
- [HTMLEscapeFilter](https://github.com/longbridgeapp/html-pipeline/blob/master/html_escape_filter.go) - HTML Escape for plain text.
- [SimpleFormatFilter](https://github.com/longbridgeapp/html-pipeline/blob/master/simple_format_filter.go) - Formats plain text by converting `\n\n` into paragraphs, like Rails [simple_format](https://api.rubyonrails.org/classes/ActionView/Helpers/TextHelper.html#method-i-simple_format).
- [AutoCorrectFilter](https://github.com/longbridgeapp/html-pipeline/blob/master/auto_correct_filter.go) - Use [AutoCorrect](https://github.com/longbridgeapp/autocorrect) to automatically add spaces between CJK and English words.
- [ImageProxyFilter](https://github.com/longbridgeapp/html-pipeline/blob/master/image_proxy_filter.go) - _DEPRECATED_ A filter that matches every `img` and replaces its `src` with a proxy URL for [imageproxy](https://github.com/longbridgeapp/imageproxy).
- [ImageURLFilter](https://github.com/longbridgeapp/html-pipeline/blob/master/image_url_filter.go) - A filter that matches `img` elements and rewrites their URLs according to rules (for example [imageproxy](https://github.com/willnorris/imageproxy), banned URLs, thumbnail versions ...).
- [ExternalLinkFilter](https://github.com/longbridgeapp/html-pipeline/blob/master/external_link_filter.go) - A filter that matches external links and adds `rel="nofollow"` and `target="_blank"`.

## License
MIT License