https://github.com/turtlemonvh/altscanner
A version of `bufio.Scanner` that works for lines of arbitrary length.
- Host: GitHub
- URL: https://github.com/turtlemonvh/altscanner
- Owner: turtlemonvh
- License: mit
- Created: 2016-03-04T16:42:52.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2016-12-23T18:19:20.000Z (over 9 years ago)
- Last Synced: 2025-08-13T19:39:12.212Z (10 months ago)
- Topics: buffer, go, scanner
- Language: Go
- Size: 9.77 KB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# AltScanner [GoDoc](https://godoc.org/github.com/turtlemonvh/altscanner) [Build Status](https://travis-ci.org/turtlemonvh/altscanner)
A version of `bufio.Scanner` that works with lines of arbitrary length.
## Why
If you're getting a `bufio.Scanner: token too long` error, this may be what you want.
## How
If your code used to look like this:
```golang
import "bufio"

// myIoReader is any io.Reader
s := bufio.NewScanner(myIoReader)
for s.Scan() {
	// Do work
}
```
You can now handle very long lines without errors by changing to:
```golang
import "github.com/turtlemonvh/altscanner"

// myIoReader is any io.Reader
s := altscanner.NewAltScanner(myIoReader)
for s.Scan() {
	// Do work
}
```
## Caveats
* Only breaks on newlines.
* Just appends bytes to a byte slice instead of using [a real buffer](https://golang.org/pkg/bytes/#Buffer).
## Alternatives
If you have a good idea of the size of your data and are running Go 1.6 or later ([where the `Scanner.Buffer` method was introduced](https://golang.org/doc/go1.6#minor_library_changes)), you probably just want to raise the maximum size of the buffer used by the scanner. For example:
```golang
// Create a scanner and let its buffer grow to 10x the default maximum (640 KB instead of 64 KB)
scanner := bufio.NewScanner(file)
scanner.Buffer(make([]byte, bufio.MaxScanTokenSize), bufio.MaxScanTokenSize*10)
```
However, if you need to be compatible with Go versions before 1.6, or you really have no idea about the size of your data, this approach works pretty well.
## Performance
It is robust, but not very fast. The benchmark results below show the performance of reading in 5 lines of content. The lines used in the tests are either 30 bytes (short) or 300K bytes (long).
```bash
$ go test -test.bench=Scanner -test.run=^$ -test.benchmem
BenchmarkBufioScannerSmall-8 1000000 1061 ns/op 4128 B/op 2 allocs/op
BenchmarkBufferedBufioScannerSmall-8 1000000 1059 ns/op 4128 B/op 2 allocs/op
BenchmarkAltScannerSmall-8 1000000 1779 ns/op 5824 B/op 8 allocs/op
BenchmarkBufferedBufioScannerLong-8 50000 28077 ns/op 127008 B/op 6 allocs/op
BenchmarkAltScannerLong-8 2000 1142195 ns/op 7032704 B/op 78 allocs/op
PASS
ok github.com/turtlemonvh/altscanner 13.458s
```
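`AltScanner` is significantly slower, makes many more allocations, and uses significantly more bytes per operation than a `bufio.Scanner` with an enlarged buffer. On the long lines it takes roughly 40 times as long (1,142,195 vs. 28,077 ns/op) and allocates roughly 55 times as many bytes (7,032,704 vs. 127,008 B/op). In short: it is always faster to use `Scanner.Buffer` to adjust the size of the buffer if you are using Go 1.6+ and you are confident about the maximum possible size of a line.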
## License
MIT