https://github.com/go-andiamo/splitter
Go package for splitting strings (enclosing bracket and quotes aware)
- Host: GitHub
- URL: https://github.com/go-andiamo/splitter
- Owner: go-andiamo
- License: apache-2.0
- Created: 2022-10-25T17:54:33.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-10-30T19:16:42.000Z (over 3 years ago)
- Last Synced: 2025-03-29T22:12:00.889Z (about 1 year ago)
- Topics: go, golang, split, splitter, splitting, string
- Language: Go
- Homepage:
- Size: 57.6 KB
- Stars: 7
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Splitter
[GoDoc](https://pkg.go.dev/github.com/go-andiamo/splitter)
[Releases](https://github.com/go-andiamo/splitter/releases)
[Codecov](https://codecov.io/gh/go-andiamo/splitter)
[Go Report Card](https://goreportcard.com/report/github.com/go-andiamo/splitter)
## Overview
Go package for splitting strings, aware of enclosing brackets and quotes.

The problem with the standard library's `strings.Split` is that it does not take into account that the string being split may
contain enclosing brackets and/or quotes, inside which the separator should not be treated as a separator.
Take for example a string representing a slice of comma separated strings...
```go
str := `"aaa","bbb","this, for sanity, should not be split"`
```
running `strings.Split` on that...
```go
package main

import "strings"

func main() {
	str := `"aaa","bbb","this, for sanity, should not be split"`
	parts := strings.Split(str, `,`)
	println(len(parts)) // 5: the commas inside the quotes are split on too
}
```
would yield 5 ([try on go-playground](https://go.dev/play/p/bEnwjc-gfQS)) - instead of the desired 3
However, with splitter, the result would be different...
```go
package main

import "github.com/go-andiamo/splitter"

func main() {
	// split on commas, but never inside double quotes
	commaSplitter, _ := splitter.NewSplitter(',', splitter.DoubleQuotes)
	str := `"aaa","bbb","this, for sanity, should not be split"`
	parts, _ := commaSplitter.Split(str)
	println(len(parts))
}
```
which yields the desired 3! [try on go-playground](https://go.dev/play/p/lIae-RjzSe6)
Note: The variadic arguments after the separator are the 'enclosures' (e.g. quotes, brackets, etc.) to be taken
into consideration.

While splitting, any enclosures specified are also checked for balancing.
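
To make the idea concrete, here is a minimal standard-library sketch of quote-aware splitting with a balance check. It is an illustration only and not the package's implementation: `splitOutsideQuotes` is a hypothetical helper, it handles a single quote rune, and it has no escaping support.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// splitOutsideQuotes is a hypothetical helper (not part of the splitter package).
// It splits s on sep, ignoring any sep found between a pair of quote runes,
// and returns an error if a quote is opened but never closed.
func splitOutsideQuotes(s string, sep, quote rune) ([]string, error) {
	var parts []string
	inQuote := false
	start := 0 // byte offset where the current part begins
	for i, r := range s {
		switch {
		case r == quote:
			inQuote = !inQuote
		case r == sep && !inQuote:
			parts = append(parts, s[start:i])
			start = i + utf8.RuneLen(sep)
		}
	}
	if inQuote {
		return nil, errors.New("unbalanced quote")
	}
	return append(parts, s[start:]), nil
}

func main() {
	parts, err := splitOutsideQuotes(`"aaa","bbb","this, for sanity, should not be split"`, ',', '"')
	fmt.Println(len(parts), err)

	// an unclosed quote is reported instead of silently producing parts
	_, err = splitOutsideQuotes(`"aaa","bbb`, ',', '"')
	fmt.Println(err)
}
```

The sketch returns three parts for the first input and an error for the second, where the last quote is never closed.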
## Installation
To install Splitter, use `go get`:

    go get github.com/go-andiamo/splitter

To update Splitter to the latest version, run:

    go get -u github.com/go-andiamo/splitter
## Enclosures
Enclosures tell the splitter which start/end sequences the separator must be ignored within. An enclosure is one of two types: quotes or brackets.

Quote enclosures differ from bracket enclosures only in how their optional escaping works:
* Quote enclosures can be:
  * escaped by an escape prefix - e.g. a quote enclosure starting with `"` and ending with `"`, where `\"` is not seen as the end
  * escaped by doubling - e.g. a quote enclosure starting with `'` and ending with `'`, where any doubled `''` is not seen as the end
* Bracket enclosures can only be:
  * escaped by an escape prefix - e.g. a bracket enclosure starting with `(` and ending with `)` with the escape set to `\`:
    * `\(` is not seen as a start
    * `\)` is not seen as an end

Note that brackets are ignored inside quotes - but quotes can exist within brackets. And when splitting, separators found within any specified quote or bracket enclosure are not considered.
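
The two quote escaping styles can be sketched with the standard library alone. The helper below is hypothetical (`closingQuote` is not part of the package) and only shows how a scanner might decide which quote really ends the enclosure under each style.

```go
package main

import "fmt"

// closingQuote is a hypothetical helper (not part of the splitter package).
// rs must begin with the opening quote. It returns the index of the quote
// that closes the enclosure, or -1 if there is none.
// If escape is non-zero, a quote preceded by escape does not close.
// If doubled is true, two consecutive quotes do not close.
func closingQuote(rs []rune, quote, escape rune, doubled bool) int {
	for i := 1; i < len(rs); i++ {
		switch {
		case escape != 0 && rs[i] == escape:
			i++ // skip the rune protected by the escape prefix
		case rs[i] == quote:
			if doubled && i+1 < len(rs) && rs[i+1] == quote {
				i++ // a doubled quote is content, keep scanning
				continue
			}
			return i
		}
	}
	return -1
}

func main() {
	// escape prefix style: \" does not end the enclosure
	fmt.Println(closingQuote([]rune(`"say \"hi\"" rest`), '"', '\\', false)) // 11
	// doubled style: '' does not end the enclosure
	fmt.Println(closingQuote([]rune(`'don''t' rest`), '\'', 0, true)) // 7
	// no closing quote at all
	fmt.Println(closingQuote([]rune(`"never closed`), '"', 0, false)) // -1
}
```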
The Splitter provides many pre-defined enclosures:

| Var Name | Type | Start - End | Escaped end |
|----------|------|-------------|-------------|
| `DoubleQuotes` | Quote | `" "` | none |
| `DoubleQuotesBackSlashEscaped` | Quote | `" "` | `\"` |
| `DoubleQuotesDoubleEscaped` | Quote | `" "` | `""` |
| `SingleQuotes` | Quote | `' '` | none |
| `SingleQuotesBackSlashEscaped` | Quote | `' '` | `\'` |
| `SingleQuotesDoubleEscaped` | Quote | `' '` | `''` |
| `SingleInvertedQuotes` | Quote | `` ` ` `` | none |
| `SingleInvertedQuotesBackSlashEscaped` | Quote | `` ` ` `` | `` \` `` |
| `SingleInvertedQuotesDoubleEscaped` | Quote | `` ` ` `` | ``` `` ``` |
| `SinglePointingAngleQuotes` | Quote | `‹ ›` | none |
| `SinglePointingAngleQuotesBackSlashEscaped` | Quote | `‹ ›` | `\›` |
| `DoublePointingAngleQuotes` | Quote | `« »` | none |
| `LeftRightDoubleDoubleQuotes` | Quote | `“ ”` | none |
| `LeftRightDoubleSingleQuotes` | Quote | `‘ ’` | none |
| `LeftRightDoublePrimeQuotes` | Quote | `〝 〞` | none |
| `SingleLowHigh9Quotes` | Quote | `‚ ‛` | none |
| `DoubleLowHigh9Quotes` | Quote | `„ ‟` | none |
| `Parenthesis` | Brackets | `( )` | none |
| `CurlyBrackets` | Brackets | `{ }` | none |
| `SquareBrackets` | Brackets | `[ ]` | none |
| `LtGtAngleBrackets` | Brackets | `< >` | none |
| `LeftRightPointingAngleBrackets` | Brackets | `〈 〉` | none |
| `SubscriptParenthesis` | Brackets | `₍ ₎` | none |
| `SuperscriptParenthesis` | Brackets | `⁽ ⁾` | none |
| `SmallParenthesis` | Brackets | `﹙ ﹚` | none |
| `SmallCurlyBrackets` | Brackets | `﹛ ﹜` | none |
| `DoubleParenthesis` | Brackets | `⸨ ⸩` | none |
| `MathWhiteSquareBrackets` | Brackets | `⟦ ⟧` | none |
| `MathAngleBrackets` | Brackets | `⟨ ⟩` | none |
| `MathDoubleAngleBrackets` | Brackets | `⟪ ⟫` | none |
| `MathWhiteTortoiseShellBrackets` | Brackets | `⟬ ⟭` | none |
| `MathFlattenedParenthesis` | Brackets | `⟮ ⟯` | none |
| `OrnateParenthesis` | Brackets | `﴾ ﴿` | none |
| `AngleBrackets` | Brackets | `〈 〉` | none |
| `DoubleAngleBrackets` | Brackets | `《 》` | none |
| `FullWidthParenthesis` | Brackets | `（ ）` | none |
| `FullWidthSquareBrackets` | Brackets | `［ ］` | none |
| `FullWidthCurlyBrackets` | Brackets | `｛ ｝` | none |
| `SubstitutionBrackets` | Brackets | `⸂ ⸃` | none |
| `SubstitutionQuotes` | Quote | `⸂ ⸃` | none |
| `DottedSubstitutionBrackets` | Brackets | `⸄ ⸅` | none |
| `DottedSubstitutionQuotes` | Quote | `⸄ ⸅` | none |
| `TranspositionBrackets` | Brackets | `⸉ ⸊` | none |
| `TranspositionQuotes` | Quote | `⸉ ⸊` | none |
| `RaisedOmissionBrackets` | Brackets | `⸌ ⸍` | none |
| `RaisedOmissionQuotes` | Quote | `⸌ ⸍` | none |
| `LowParaphraseBrackets` | Brackets | `⸜ ⸝` | none |
| `LowParaphraseQuotes` | Quote | `⸜ ⸝` | none |
| `SquareWithQuillBrackets` | Brackets | `⁅ ⁆` | none |
| `WhiteParenthesis` | Brackets | `⦅ ⦆` | none |
| `WhiteCurlyBrackets` | Brackets | `⦃ ⦄` | none |
| `WhiteSquareBrackets` | Brackets | `〚 〛` | none |
| `WhiteLenticularBrackets` | Brackets | `〖 〗` | none |
| `WhiteTortoiseShellBrackets` | Brackets | `〘 〙` | none |
| `FullWidthWhiteParenthesis` | Brackets | `⦅ ⦆` | none |
| `BlackTortoiseShellBrackets` | Brackets | `⦗ ⦘` | none |
| `BlackLenticularBrackets` | Brackets | `【 】` | none |
| `PointingCurvedAngleBrackets` | Brackets | `⧼ ⧽` | none |
| `TortoiseShellBrackets` | Brackets | `〔 〕` | none |
| `SmallTortoiseShellBrackets` | Brackets | `﹝ ﹞` | none |
| `ZNotationImageBrackets` | Brackets | `⦇ ⦈` | none |
| `ZNotationBindingBrackets` | Brackets | `⦉ ⦊` | none |
| `MediumOrnamentalParenthesis` | Brackets | `❨ ❩` | none |
| `LightOrnamentalTortoiseShellBrackets` | Brackets | `❲ ❳` | none |
| `MediumOrnamentalFlattenedParenthesis` | Brackets | `❪ ❫` | none |
| `MediumOrnamentalPointingAngleBrackets` | Brackets | `❬ ❭` | none |
| `MediumOrnamentalCurlyBrackets` | Brackets | `❴ ❵` | none |
| `HeavyOrnamentalPointingAngleQuotes` | Quote | `❮ ❯` | none |
| `HeavyOrnamentalPointingAngleBrackets` | Brackets | `❰ ❱` | none |

_Note: To make any of the above enclosures escapable, use the `MakeEscapable()` or `MustMakeEscapable()` functions._
### Quote enclosures with escaping
Quotes within quotes can be handled by using an enclosure that specifies how the escaping works. For example, the following uses `\` (backslash) prefixed escaping...
```go
package main

import "github.com/go-andiamo/splitter"

func main() {
	// \" inside a double-quoted part does not end the quotes
	commaSplitter, _ := splitter.NewSplitter(',', splitter.DoubleQuotesBackSlashEscaped)
	str := `"aaa","bbb","this, for sanity, \"should\" not be split"`
	parts, _ := commaSplitter.Split(str)
	println(len(parts))
}
```
[try on go-playground](https://go.dev/play/p/wgJ68hXBp1n)
Or with double escaping...
```go
package main

import "github.com/go-andiamo/splitter"

func main() {
	// "" inside a double-quoted part does not end the quotes
	commaSplitter, _ := splitter.NewSplitter(',', splitter.DoubleQuotesDoubleEscaped)
	str := `"aaa","bbb","this, for sanity, """"should,,,,"" not be split"`
	parts, _ := commaSplitter.Split(str)
	println(len(parts))
}
```
[try on go-playground](https://go.dev/play/p/3BpayDZyaA7)
#### Not separating when separator encountered in quotes or brackets...
```go
package main

import (
	"fmt"

	"github.com/go-andiamo/splitter"
)

func main() {
	// enclosures inside which the separator is ignored
	encs := []*splitter.Enclosure{
		splitter.Parenthesis, splitter.SquareBrackets, splitter.CurlyBrackets,
		splitter.DoubleQuotesDoubleEscaped, splitter.SingleQuotesDoubleEscaped,
	}
	commaSplitter, _ := splitter.NewSplitter(',', encs...)
	str := `do(not,)split,'don''t,split,this',[,{,(a,"this has "" quotes")}]`
	parts, _ := commaSplitter.Split(str)
	println(len(parts))
	for i, pt := range parts {
		fmt.Printf("\t[%d]%s\n", i, pt)
	}
}
```
[try on go-playground](https://go.dev/play/p/bvzC1NXfG3z)
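
For intuition about why the input string above is not split inside its nested enclosures, the following standard-library sketch tracks brackets on a stack and treats quoted text as opaque. It is an assumption-laden illustration, not the package's algorithm: `splitNested` is a hypothetical helper with a hard-coded `,` separator, three bracket pairs and two quote runes, and it relies on a doubled quote simply closing and reopening the quoted run.

```go
package main

import (
	"errors"
	"fmt"
)

// splitNested is a hypothetical helper (not part of the splitter package).
// It splits s on ',' only when no bracket is open and no quote is active.
func splitNested(s string) ([]string, error) {
	closers := map[rune]rune{'(': ')', '[': ']', '{': '}'}
	var stack []rune // closing runes still expected, innermost last
	var quote rune   // the active quote rune, or 0 when outside quotes
	var parts []string
	start := 0
	for i, r := range s {
		switch {
		case quote != 0:
			// inside quotes: brackets and separators are ignored
			if r == quote {
				quote = 0
			}
		case r == '\'' || r == '"':
			quote = r
		case closers[r] != 0:
			stack = append(stack, closers[r])
		case r == ')' || r == ']' || r == '}':
			if len(stack) == 0 || stack[len(stack)-1] != r {
				return nil, fmt.Errorf("unexpected %q at byte %d", r, i)
			}
			stack = stack[:len(stack)-1]
		case r == ',' && len(stack) == 0:
			parts = append(parts, s[start:i])
			start = i + 1
		}
	}
	if quote != 0 || len(stack) > 0 {
		return nil, errors.New("unclosed enclosure")
	}
	return append(parts, s[start:]), nil
}

func main() {
	parts, err := splitNested(`do(not,)split,'don''t,split,this',[,{,(a,"this has "" quotes")}]`)
	fmt.Println(len(parts), err)
	for i, pt := range parts {
		fmt.Printf("\t[%d]%s\n", i, pt)
	}
}
```

With this sketch the string yields three parts: `do(not,)split`, `'don''t,split,this'` and `[,{,(a,"this has "" quotes")}]`.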
## Options
Options define behaviours that are carried out on each part found during splitting.

An option, by virtue of its return values from `.Apply()`, can do one of three things:
1. return a modified string of what is to be added to the split parts
2. return `false` to indicate that the split part is not to be added to the split result
3. return an `error` to indicate that the split part is unacceptable (splitting stops and the error is returned from the `Split` method)

Options can be added directly to the splitter using the `.AddDefaultOptions()` method. These options are applied on every call to the splitter's `.Split()` method.

Options can also be passed when calling the splitter's `.Split()` method. Such options apply only to that call, and run after any default options already set on the splitter.
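
The three possible outcomes, and the order in which default and per-call options run, can be sketched without the package. Everything below is hypothetical: `partOption` is a deliberately simplified stand-in whose signature is an assumption and is not the package's `Option` interface, and `applyOptions` only imitates the behaviour described above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// partOption is a hypothetical, simplified stand-in for an option: it returns
// the (possibly modified) part, whether to keep it, and an error.
type partOption func(part string) (string, bool, error)

// outcome 1: return a modified string
func trimSpaces(part string) (string, bool, error) { return strings.TrimSpace(part), true, nil }

// outcome 2: return false so the part is dropped
func ignoreEmpties(part string) (string, bool, error) { return part, part != "", nil }

// outcome 3: return an error so processing stops
func noEmpties(part string) (string, bool, error) {
	if part == "" {
		return "", false, errors.New("empty part")
	}
	return part, true, nil
}

// applyOptions runs the default options first, then the per-call options,
// over each raw part.
func applyOptions(raw []string, defaults, perCall []partOption) ([]string, error) {
	opts := append(append([]partOption{}, defaults...), perCall...)
	var out []string
	for _, part := range raw {
		keep := true
		var err error
		for _, opt := range opts {
			part, keep, err = opt(part)
			if err != nil {
				return nil, err
			}
			if !keep {
				break
			}
		}
		if keep {
			out = append(out, part)
		}
	}
	return out, nil
}

func main() {
	raw := strings.Split(` / a // c/ `, "/")
	parts, err := applyOptions(raw, []partOption{trimSpaces}, []partOption{ignoreEmpties})
	fmt.Println(parts, err) // [a c] <nil>

	_, err = applyOptions(raw, []partOption{trimSpaces}, []partOption{noEmpties})
	fmt.Println(err) // empty part
}
```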
### Option Examples
#### 1. Stripping empty parts
```go
package main

import (
	"fmt"

	"github.com/go-andiamo/splitter"
)

func main() {
	s := splitter.MustCreateSplitter('/').
		AddDefaultOptions(splitter.IgnoreEmpties)
	parts, _ := s.Split(`/a//c/`)
	println(len(parts))
	fmt.Printf("%+v", parts)
}
```
[try on go-playground](https://go.dev/play/p/l1YnMoeA9Jm)
#### 2. Stripping empty first/last parts
```go
package main

import (
	"fmt"

	"github.com/go-andiamo/splitter"
)

func main() {
	s := splitter.MustCreateSplitter('/').
		AddDefaultOptions(splitter.IgnoreEmptyFirst, splitter.IgnoreEmptyLast)

	parts, _ := s.Split(`/a//c/`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(`a//c/`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(`/a//c`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)
}
```
[try on go-playground](https://go.dev/play/p/n1NEKQhtWsY)
#### 3. Trimming parts
```go
package main

import (
	"fmt"

	"github.com/go-andiamo/splitter"
)

func main() {
	s := splitter.MustCreateSplitter('/').
		AddDefaultOptions(splitter.TrimSpaces)

	parts, _ := s.Split(`/a/b/c/`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(` / a /b / c/ `)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(`/ a / b / c /`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)
}
```
[try on go-playground](https://go.dev/play/p/d8FZXJCBPze)
#### 4. Trimming spaces (and removing empties)
```go
package main

import (
	"fmt"

	"github.com/go-andiamo/splitter"
)

func main() {
	s := splitter.MustCreateSplitter('/').
		AddDefaultOptions(splitter.TrimSpaces, splitter.IgnoreEmpties)

	parts, _ := s.Split(`/a/ /c/`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(` / a // c/ `)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(`/ a / / c /`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)
}
```
[try on go-playground](https://go.dev/play/p/S_ald78xtSi)
#### 5. Error for empties found
```go
package main

import (
	"fmt"

	"github.com/go-andiamo/splitter"
)

func main() {
	s := splitter.MustCreateSplitter('/').
		AddDefaultOptions(splitter.TrimSpaces, splitter.NoEmpties)

	if parts, err := s.Split(`/a/ /c/`); err != nil {
		println(err.Error())
	} else {
		println(len(parts))
		fmt.Printf("%+v\n", parts)
	}

	if parts, err := s.Split(` / a // c/ `); err != nil {
		println(err.Error())
	} else {
		println(len(parts))
		fmt.Printf("%+v\n", parts)
	}

	if parts, err := s.Split(`/ a / / c /`); err != nil {
		println(err.Error())
	} else {
		println(len(parts))
		fmt.Printf("%+v\n", parts)
	}

	if parts, err := s.Split(` a / b/c `); err != nil {
		println(err.Error())
	} else {
		println(len(parts))
		fmt.Printf("%+v\n", parts)
	}
}
```
[try on go-playground](https://go.dev/play/p/LVLkuRMoYJX)