
# Splitter
[![GoDoc](https://godoc.org/github.com/go-andiamo/splitter?status.svg)](https://pkg.go.dev/github.com/go-andiamo/splitter)
[![Latest Version](https://img.shields.io/github/v/tag/go-andiamo/splitter.svg?sort=semver&style=flat&label=version&color=blue)](https://github.com/go-andiamo/splitter/releases)
[![codecov](https://codecov.io/gh/go-andiamo/splitter/branch/main/graph/badge.svg?token=igjnZdgh0e)](https://codecov.io/gh/go-andiamo/splitter)
[![Go Report Card](https://goreportcard.com/badge/github.com/go-andiamo/splitter)](https://goreportcard.com/report/github.com/go-andiamo/splitter)

## Overview

Go package for splitting strings (aware of enclosing braces and quotes)

The problem with the standard Go `strings.Split` is that it does not take into account that the string being split may
contain enclosing braces and/or quotes, inside which the separator should not be treated as a split point.

Take for example a string representing a slice of comma separated strings...
```go
str := `"aaa","bbb","this, for sanity, should not be split"`
```
running `strings.Split` on that...
```go
package main

import "strings"

func main() {
	str := `"aaa","bbb","this, for sanity, should not be split"`
	parts := strings.Split(str, `,`)
	println(len(parts))
}
```
would yield 5 ([try on go-playground](https://go.dev/play/p/bEnwjc-gfQS)) - instead of the desired 3

However, with splitter, the result would be different...
```go
package main

import "github.com/go-andiamo/splitter"

func main() {
	commaSplitter, _ := splitter.NewSplitter(',', splitter.DoubleQuotes)

	str := `"aaa","bbb","this, for sanity, should not be split"`
	parts, _ := commaSplitter.Split(str)
	println(len(parts))
}
```
which yields the desired 3! [try on go-playground](https://go.dev/play/p/lIae-RjzSe6)

Note: The variadic arguments after the separator are the 'enclosures' (e.g. quotes, brackets, etc.) that the splitter
should take into consideration

While splitting, any enclosures specified are checked for balancing!
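
To illustrate the idea of enclosure-aware splitting with a balance check, here is a minimal standard-library sketch. The `splitQuoted` helper is hypothetical and is not how the splitter package is implemented; it assumes a single double-quote enclosure with no escaping.

```go
package main

import (
	"errors"
	"fmt"
)

// splitQuoted is a hypothetical helper (not part of the splitter package).
// It splits s on sep, ignoring separators inside double quotes, and
// returns an error if a quote is left unclosed.
func splitQuoted(s string, sep rune) ([]string, error) {
	var parts []string
	inQuotes := false
	start := 0
	for i, r := range s {
		switch {
		case r == '"':
			inQuotes = !inQuotes
		case r == sep && !inQuotes:
			parts = append(parts, s[start:i])
			start = i + len(string(r)) // skip past the separator
		}
	}
	if inQuotes {
		return nil, errors.New("unclosed double quote")
	}
	return append(parts, s[start:]), nil
}

func main() {
	parts, err := splitQuoted(`"aaa","bbb","this, for sanity, should not be split"`, ',')
	fmt.Println(len(parts), err)

	_, err = splitQuoted(`"aaa","bbb`, ',')
	fmt.Println(err)
}
```

The first call reports 3 parts; the second returns an error because the last quote is never closed.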

## Installation
To install Splitter, use `go get`:

    go get github.com/go-andiamo/splitter

To update Splitter to the latest version, run:

    go get -u github.com/go-andiamo/splitter

## Enclosures
Enclosures tell the splitter which start/end sequences the separator should be ignored within. An enclosure can be one of two types: quotes or brackets.

Quote type enclosures differ from bracket type enclosures only in how their optional escaping works:
* Quote enclosures can be:
  * escaped by an escape prefix - e.g. a quote enclosure starting with `"` and ending with `"`, where `\"` is not seen as the end
  * escaped by doubles - e.g. a quote enclosure starting with `'` and ending with `'`, where any doubled `''` is not seen as the end
* Bracket enclosures can only be:
  * escaped by an escape prefix - e.g. a bracket enclosure starting with `(` and ending with `)` with the escape set to `\`:
    * `\(` is not seen as a start
    * `\)` is not seen as an end

Note that brackets are ignored inside quotes - but quotes can exist within brackets. And when splitting, separators found within any specified quote or bracket enclosure are not considered.
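
The two quote escaping styles described above can be sketched with the standard library alone. The `quoteEnd` function below is a hypothetical illustration, not the package's API; it assumes single-byte (ASCII) quote and escape characters.

```go
package main

import "fmt"

// quoteEnd is a hypothetical helper (not the splitter package's API).
// s must begin with the opening quote q. It returns the byte index of the
// closing quote, or -1 if the quote is never closed.
// If escape is non-zero, a quote preceded by escape (e.g. \") does not close.
// If doubled is true, a doubled quote (e.g. '') does not close.
func quoteEnd(s string, q, escape byte, doubled bool) int {
	for i := 1; i < len(s); i++ {
		switch {
		case escape != 0 && s[i] == escape:
			i++ // skip the escaped character
		case s[i] == q:
			if doubled && i+1 < len(s) && s[i+1] == q {
				i++ // skip the second quote of the pair
				continue
			}
			return i
		}
	}
	return -1
}

func main() {
	// escape prefix style: \" does not end the quote
	fmt.Println(quoteEnd(`"say \"hi\"" rest`, '"', '\\', false))
	// doubles style: '' does not end the quote
	fmt.Println(quoteEnd(`'don''t' rest`, '\'', 0, true))
	// no escaping at all
	fmt.Println(quoteEnd(`"plain" rest`, '"', 0, false))
}
```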

The Splitter provides many pre-defined enclosures:

| Var Name | Type | Start - End | Escaped end |
|---|---|---|---|
| DoubleQuotes | Quote | `" "` | none |
| DoubleQuotesBackSlashEscaped | Quote | `" "` | `\"` |
| DoubleQuotesDoubleEscaped | Quote | `" "` | `""` |
| SingleQuotes | Quote | `' '` | none |
| SingleQuotesBackSlashEscaped | Quote | `' '` | `\'` |
| SingleQuotesDoubleEscaped | Quote | `' '` | `''` |
| SingleInvertedQuotes | Quote | `` ` ` `` | none |
| SingleInvertedQuotesBackSlashEscaped | Quote | `` ` ` `` | `` \` `` |
| SingleInvertedQuotesDoubleEscaped | Quote | `` ` ` `` | ``` `` ``` |
| SinglePointingAngleQuotes | Quote | | none |
| SinglePointingAngleQuotesBackSlashEscaped | Quote | | `\›` |
| DoublePointingAngleQuotes | Quote | `« »` | none |
| LeftRightDoubleDoubleQuotes | Quote | | none |
| LeftRightDoubleSingleQuotes | Quote | | none |
| LeftRightDoublePrimeQuotes | Quote | | none |
| SingleLowHigh9Quotes | Quote | | none |
| DoubleLowHigh9Quotes | Quote | | none |
| Parenthesis | Brackets | `( )` | none |
| CurlyBrackets | Brackets | `{ }` | none |
| SquareBrackets | Brackets | `[ ]` | none |
| LtGtAngleBrackets | Brackets | `< >` | none |
| LeftRightPointingAngleBrackets | Brackets | | none |
| SubscriptParenthesis | Brackets | | none |
| SuperscriptParenthesis | Brackets | | none |
| SmallParenthesis | Brackets | | none |
| SmallCurlyBrackets | Brackets | | none |
| DoubleParenthesis | Brackets | | none |
| MathWhiteSquareBrackets | Brackets | | none |
| MathAngleBrackets | Brackets | | none |
| MathDoubleAngleBrackets | Brackets | | none |
| MathWhiteTortoiseShellBrackets | Brackets | | none |
| MathFlattenedParenthesis | Brackets | | none |
| OrnateParenthesis | Brackets | `﴿` | none |
| AngleBrackets | Brackets | | none |
| DoubleAngleBrackets | Brackets | | none |
| FullWidthParenthesis | Brackets | | none |
| FullWidthSquareBrackets | Brackets | | none |
| FullWidthCurlyBrackets | Brackets | | none |
| SubstitutionBrackets | Brackets | | none |
| SubstitutionQuotes | Quote | | none |
| DottedSubstitutionBrackets | Brackets | | none |
| DottedSubstitutionQuotes | Quote | | none |
| TranspositionBrackets | Brackets | | none |
| TranspositionQuotes | Quote | | none |
| RaisedOmissionBrackets | Brackets | | none |
| RaisedOmissionQuotes | Quote | | none |
| LowParaphraseBrackets | Brackets | | none |
| LowParaphraseQuotes | Quote | | none |
| SquareWithQuillBrackets | Brackets | | none |
| WhiteParenthesis | Brackets | | none |
| WhiteCurlyBrackets | Brackets | | none |
| WhiteSquareBrackets | Brackets | | none |
| WhiteLenticularBrackets | Brackets | | none |
| WhiteTortoiseShellBrackets | Brackets | | none |
| FullWidthWhiteParenthesis | Brackets | | none |
| BlackTortoiseShellBrackets | Brackets | | none |
| BlackLenticularBrackets | Brackets | | none |
| PointingCurvedAngleBrackets | Brackets | | none |
| TortoiseShellBrackets | Brackets | | none |
| SmallTortoiseShellBrackets | Brackets | | none |
| ZNotationImageBrackets | Brackets | | none |
| ZNotationBindingBrackets | Brackets | | none |
| MediumOrnamentalParenthesis | Brackets | | none |
| LightOrnamentalTortoiseShellBrackets | Brackets | | none |
| MediumOrnamentalFlattenedParenthesis | Brackets | | none |
| MediumOrnamentalPointingAngleBrackets | Brackets | | none |
| MediumOrnamentalCurlyBrackets | Brackets | | none |
| HeavyOrnamentalPointingAngleQuotes | Quote | | none |
| HeavyOrnamentalPointingAngleBrackets | Brackets | | none |

_Note: To make any of the above enclosures escapable, use the `MakeEscapable()` or `MustMakeEscapable()` functions._

### Quote enclosures with escaping
Quotes within quotes can be handled by using an enclosure that specifies how the escaping works. For example, the following uses `\` (backslash) prefixed escaping...
```go
package main

import "github.com/go-andiamo/splitter"

func main() {
	commaSplitter, _ := splitter.NewSplitter(',', splitter.DoubleQuotesBackSlashEscaped)

	str := `"aaa","bbb","this, for sanity, \"should\" not be split"`
	parts, _ := commaSplitter.Split(str)
	println(len(parts))
}
```
[try on go-playground](https://go.dev/play/p/wgJ68hXBp1n)

Or with double escaping...
```go
package main

import "github.com/go-andiamo/splitter"

func main() {
	commaSplitter, _ := splitter.NewSplitter(',', splitter.DoubleQuotesDoubleEscaped)

	str := `"aaa","bbb","this, for sanity, """"should,,,,"" not be split"`
	parts, _ := commaSplitter.Split(str)
	println(len(parts))
}
```
[try on go-playground](https://go.dev/play/p/3BpayDZyaA7)

#### Not separating when separator encountered in quotes or brackets...
```go
package main

import (
	"fmt"
	"github.com/go-andiamo/splitter"
)

func main() {
	encs := []*splitter.Enclosure{
		splitter.Parenthesis, splitter.SquareBrackets, splitter.CurlyBrackets,
		splitter.DoubleQuotesDoubleEscaped, splitter.SingleQuotesDoubleEscaped,
	}
	commaSplitter, _ := splitter.NewSplitter(',', encs...)

	str := `do(not,)split,'don''t,split,this',[,{,(a,"this has "" quotes")}]`
	parts, _ := commaSplitter.Split(str)
	println(len(parts))
	for i, pt := range parts {
		fmt.Printf("\t[%d]%s\n", i, pt)
	}
}
```
[try on go-playground](https://go.dev/play/p/bvzC1NXfG3z)
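
To show how nested brackets and quotes can be tracked together, here is a standard-library sketch that uses a stack of expected closing brackets. The `splitNested` helper is hypothetical and only approximates the behaviour described above; it assumes three bracket pairs and two quote characters, and handles doubled quotes simply because the quote closes and immediately reopens.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitNested is a hypothetical helper (not the splitter package itself).
// It splits on sep only at bracket depth zero and outside quotes. Brackets
// inside quotes are ignored; mismatched or unclosed enclosures are errors.
func splitNested(s string, sep rune) ([]string, error) {
	const opens, closes, quotes = "([{", ")]}", `"'`
	var (
		parts []string
		stack []rune // expected closing brackets, innermost last
		quote rune   // active quote character, 0 when outside quotes
		start int
	)
	for i, r := range s {
		switch {
		case quote != 0:
			if r == quote {
				quote = 0 // a doubled quote closes and then reopens
			}
		case strings.ContainsRune(quotes, r):
			quote = r
		case strings.ContainsRune(opens, r):
			stack = append(stack, rune(closes[strings.IndexRune(opens, r)]))
		case strings.ContainsRune(closes, r):
			if len(stack) == 0 || stack[len(stack)-1] != r {
				return nil, fmt.Errorf("unbalanced %q at position %d", r, i)
			}
			stack = stack[:len(stack)-1]
		case r == sep && len(stack) == 0:
			parts = append(parts, s[start:i])
			start = i + len(string(r))
		}
	}
	if quote != 0 || len(stack) > 0 {
		return nil, errors.New("unclosed enclosure")
	}
	return append(parts, s[start:]), nil
}

func main() {
	str := `do(not,)split,'don''t,split,this',[,{,(a,"this has "" quotes")}]`
	parts, err := splitNested(str, ',')
	if err != nil {
		fmt.Println(err)
		return
	}
	for i, pt := range parts {
		fmt.Printf("[%d]%s\n", i, pt)
	}
}
```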

## Options
Options define behaviours that are to be carried out on each found part during splitting.

An option, by virtue of its return values from `.Apply()`, can do one of three things:
1. return a modified string of what is to be added to the split parts
2. return a `false` to indicate that the split part is not to be added to the split result
3. return an `error` to indicate that the split part is unacceptable (and cease further splitting - the error is returned from the `Split` method)

Options can be added directly to the Splitter using the `.AddDefaultOptions()` method. These options are applied on every call to the splitter's `.Split()` method.

Options can also be specified when calling the splitter's `.Split()` method - these options apply only to that call (and run after any options already set on the splitter).
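
The three possible outcomes of an option (modify, drop, or reject a part) can be sketched as a small pipeline. Everything below is a hypothetical simplification using only the standard library: `partOption`, `trimSpaces`, `ignoreEmpties`, `noEmpties` and `applyOptions` are illustrative names, and the package's real `.Apply()` signature is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// partOption is a hypothetical, simplified stand-in for a splitter option:
// it returns the (possibly modified) part, whether to keep it, and an error.
type partOption func(part string) (string, bool, error)

// trimSpaces modifies the part (outcome 1).
func trimSpaces(part string) (string, bool, error) {
	return strings.TrimSpace(part), true, nil
}

// ignoreEmpties drops empty parts (outcome 2).
func ignoreEmpties(part string) (string, bool, error) {
	return part, part != "", nil
}

// noEmpties rejects empty parts with an error (outcome 3).
func noEmpties(part string) (string, bool, error) {
	if part == "" {
		return "", false, errors.New("empty part not allowed")
	}
	return part, true, nil
}

// applyOptions runs each raw part through the options in order, stopping
// at the first error.
func applyOptions(raw []string, opts ...partOption) ([]string, error) {
	result := make([]string, 0, len(raw))
	for _, part := range raw {
		keep := true
		for _, opt := range opts {
			var err error
			if part, keep, err = opt(part); err != nil {
				return nil, err
			}
			if !keep {
				break
			}
		}
		if keep {
			result = append(result, part)
		}
	}
	return result, nil
}

func main() {
	raw := strings.Split(`/ a / / c /`, "/")

	parts, err := applyOptions(raw, trimSpaces, ignoreEmpties)
	fmt.Printf("%q %v\n", parts, err)

	_, err = applyOptions(raw, trimSpaces, noEmpties)
	fmt.Println(err)
}
```

Because options run in order, trimming before the empty check means a part containing only spaces is treated as empty.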

### Option Examples
#### 1. Stripping empty parts
```go
package main

import (
	"fmt"
	"github.com/go-andiamo/splitter"
)

func main() {
	s := splitter.MustCreateSplitter('/').
		AddDefaultOptions(splitter.IgnoreEmpties)

	parts, _ := s.Split(`/a//c/`)
	println(len(parts))
	fmt.Printf("%+v", parts)
}
```
[try on go-playground](https://go.dev/play/p/l1YnMoeA9Jm)

#### 2. Stripping empty first/last parts
```go
package main

import (
	"fmt"
	"github.com/go-andiamo/splitter"
)

func main() {
	s := splitter.MustCreateSplitter('/').
		AddDefaultOptions(splitter.IgnoreEmptyFirst, splitter.IgnoreEmptyLast)

	parts, _ := s.Split(`/a//c/`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(`a//c/`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(`/a//c`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)
}
```
[try on go-playground](https://go.dev/play/p/n1NEKQhtWsY)

#### 3. Trimming parts
```go
package main

import (
	"fmt"
	"github.com/go-andiamo/splitter"
)

func main() {
	s := splitter.MustCreateSplitter('/').
		AddDefaultOptions(splitter.TrimSpaces)

	parts, _ := s.Split(`/a/b/c/`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(` / a /b / c/ `)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(`/ a / b / c /`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)
}
```
[try on go-playground](https://go.dev/play/p/d8FZXJCBPze)

#### 4. Trimming spaces (and removing empties)
```go
package main

import (
	"fmt"
	"github.com/go-andiamo/splitter"
)

func main() {
	s := splitter.MustCreateSplitter('/').
		AddDefaultOptions(splitter.TrimSpaces, splitter.IgnoreEmpties)

	parts, _ := s.Split(`/a/ /c/`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(` / a // c/ `)
	println(len(parts))
	fmt.Printf("%+v\n", parts)

	parts, _ = s.Split(`/ a / / c /`)
	println(len(parts))
	fmt.Printf("%+v\n", parts)
}
```
[try on go-playground](https://go.dev/play/p/S_ald78xtSi)

#### 5. Error for empties found
```go
package main

import (
	"fmt"
	"github.com/go-andiamo/splitter"
)

func main() {
	s := splitter.MustCreateSplitter('/').
		AddDefaultOptions(splitter.TrimSpaces, splitter.NoEmpties)

	if parts, err := s.Split(`/a/ /c/`); err != nil {
		println(err.Error())
	} else {
		println(len(parts))
		fmt.Printf("%+v\n", parts)
	}

	if parts, err := s.Split(` / a // c/ `); err != nil {
		println(err.Error())
	} else {
		println(len(parts))
		fmt.Printf("%+v\n", parts)
	}

	if parts, err := s.Split(`/ a / / c /`); err != nil {
		println(err.Error())
	} else {
		println(len(parts))
		fmt.Printf("%+v\n", parts)
	}

	if parts, err := s.Split(` a / b/c `); err != nil {
		println(err.Error())
	} else {
		println(len(parts))
		fmt.Printf("%+v\n", parts)
	}
}
```
[try on go-playground](https://go.dev/play/p/LVLkuRMoYJX)