Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rtyley/line-break-preserving-line-splitting
io-streams jvm linebreaks newline-character-sequences splitting-text stream-processing
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/rtyley/line-break-preserving-line-splitting
- Owner: rtyley
- License: gpl-3.0
- Created: 2021-01-30T23:21:01.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2024-01-17T10:16:49.000Z (10 months ago)
- Last Synced: 2024-05-01T23:01:42.832Z (7 months ago)
- Topics: io-streams, jvm, linebreaks, newline-character-sequences, splitting-text, stream-processing
- Language: Scala
- Homepage:
- Size: 37.1 KB
- Stars: 3
- Watchers: 3
- Forks: 6
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# line-break-preserving-line-splitting
[![line-break-preserving-line-splitting artifacts](https://index.scala-lang.org/rtyley/line-break-preserving-line-splitting/line-break-preserving-line-splitting/latest-by-scala-version.svg)](https://index.scala-lang.org/rtyley/line-break-preserving-line-splitting/line-break-preserving-line-splitting/)
[![CI](https://github.com/rtyley/line-break-preserving-line-splitting/actions/workflows/ci.yml/badge.svg)](https://github.com/rtyley/line-break-preserving-line-splitting/actions/workflows/ci.yml)

### Features
* Streams from a `java.io.Reader`, so **can process very large files**
* Unlike `java.io.BufferedReader.readLine()`, **retains the line-break at the end of
the line**, whether it's UNIX (`\n`), Windows (`\r\n`), etc.
* Implements the `scala.collection.Iterator` trait. Lines are not retained in memory
after being returned by `next()`.
* Limits memory consumption to a single ring-buffer that lives for the lifetime of the
iterator, plus, for lines that wrap around the ring-buffer, temporary `StringBuilder`
instances that are garbage-collected after the line is returned.
* Reading a line using `next()` returns as soon as the `java.io.Reader` has
supplied enough characters; it will at most attempt to fill the ring-buffer.
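
To illustrate the second point, here is a small sketch using only the JDK (it does not use this library). It shows that `BufferedReader.readLine()` discards the line-break, so the original text can't be reconstructed exactly from the lines it returns. The input string and the value names are arbitrary examples.

```scala
import java.io.{BufferedReader, StringReader}

// Arbitrary sample input mixing UNIX ("\n") and Windows ("\r\n") line-breaks.
val input = "Foo\nBar\r\nBoo"

val bufferedReader = new BufferedReader(new StringReader(input))

// readLine() returns null at end of stream, and strips the line terminator.
val strippedLines: List[String] =
  Iterator.continually(bufferedReader.readLine()).takeWhile(_ != null).toList
// strippedLines == List("Foo", "Bar", "Boo")

// Which line-break followed each line is no longer known, so re-joining
// with a guessed separator does not reproduce the input.
val lineBreaksLost: Boolean = strippedLines.mkString("\n") != input
// lineBreaksLost == true
```
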

For my purposes, the first two requirements were the most important. If I hadn't
wanted to retain the line-breaks, I'd probably have used
`java.io.BufferedReader.readLine()`.

### Installation
```scala
libraryDependencies += "com.madgag" %% "line-break-preserving-line-splitting" % "[version]"
```

### Example usage
```scala
import com.madgag.linesplitting._

val splitLines = {
  val reader: java.io.Reader = new java.io.StringReader("Foo\nBar\rBoo")
  val iter = new LineBreakPreservingIterator(reader)
  iter.toSeq
}.toSeq
// splitLines: Seq[String] = List(
//   """Foo
// """,
//   """Bar
// """,
//   "Boo"
// )

splitLines.map(_.replace("\n", "[LF]").replace("\r", "[CR]"))
// res0: Seq[String] = List("Foo[LF]", "Bar[CR]", "Boo")
```
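
As a rough illustration of the general idea, the sketch below splits a `java.io.Reader` into lines while keeping each line-break. It is not the library's implementation: the class name `NaiveLineBreakKeepingIterator` is hypothetical, and it reads one character at a time with a single character of look-ahead instead of using a ring-buffer. It also assumes that `\n`, `\r`, and `\r\n` each count as one line-break.

```scala
import java.io.{Reader, StringReader}

// Hypothetical class for illustration only; not part of the library.
class NaiveLineBreakKeepingIterator(reader: Reader) extends Iterator[String] {
  // One character of look-ahead; -1 means end of stream.
  // Note: this reads eagerly, so construction may block on a slow Reader.
  private var lookAhead: Int = reader.read()

  def hasNext: Boolean = lookAhead != -1

  def next(): String = {
    if (!hasNext) throw new NoSuchElementException("no more lines")
    val line = new StringBuilder
    var lineComplete = false
    while (!lineComplete && lookAhead != -1) {
      val c = lookAhead.toChar
      line.append(c)
      lookAhead = reader.read()
      if (c == '\n') {
        lineComplete = true
      } else if (c == '\r') {
        // Assumption: "\r\n" is treated as a single line-break.
        if (lookAhead == '\n'.toInt) {
          line.append('\n')
          lookAhead = reader.read()
        }
        lineComplete = true
      }
    }
    line.toString
  }
}

val sampleText = "Foo\nBar\r\nBaz\rBoo"
val keptLines = new NaiveLineBreakKeepingIterator(new StringReader(sampleText)).toList
// keptLines == List("Foo\n", "Bar\r\n", "Baz\r", "Boo")
```

Because every line-break is kept, concatenating the returned lines (`keptLines.mkString`) gives back the original text exactly.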