https://github.com/marschall/readers
Specialized implementations of java.io.Reader with an emphasis on reducing intermediate allocations.
- Host: GitHub
- URL: https://github.com/marschall/readers
- Owner: marschall
- Created: 2020-09-17T11:23:15.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2020-11-29T18:14:20.000Z (over 4 years ago)
- Last Synced: 2025-01-16T02:45:02.209Z (5 months ago)
- Topics: input, java, utf-8
- Language: Java
- Homepage:
- Size: 41 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Readers
=======

Specialized implementations of `java.io.Reader` with an emphasis on reducing intermediate allocations.

`java.io.InputStreamReader` relies on `sun.nio.cs.StreamDecoder`, which in turn relies on `java.nio.charset.CharsetDecoder`. The latter is very generic but produces quite a few intermediate allocations, which can be a problem for small reads.
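
For reference, the stock JDK path described above looks roughly like the following. This is a minimal sketch that uses only standard JDK classes; the class name `JdkBaseline`, the method `decode`, and the sample input are illustrative and not part of this project.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class JdkBaseline {

    // Decodes UTF-8 bytes one char at a time through the stock InputStreamReader.
    static String decode(byte[] utf8) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) { // a "small read": a single char
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(utf8));
    }
}
```

Each single-char `read()` here goes through the generic decoder machinery, which is the overhead the readers in this project aim to avoid.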

* `com.github.marschall.readers.Utf8InputStreamReader`: a UTF-8 decoding `Reader` on an `InputStream` that performs no buffering, e.g. because the `InputStream` already buffers. Avoids intermediate allocations in favor of more `java.io.InputStream#read()` invocations.
* `com.github.marschall.readers.BufferedUtf8InputStreamReader`: a UTF-8 decoding `Reader` that also buffers. Avoids intermediate allocations except for the one-time buffer allocation.

The implementations are currently very biased towards ASCII input.
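
One plausible reading of the ASCII bias is a fast path that copies bytes below 0x80 straight into the destination `char[]` and only falls back to multi-byte decoding when a byte with the high bit set shows up. The sketch below illustrates that idea only; `AsciiFastPath` and `copyAscii` are hypothetical names, and this is not the project's actual code.

```java
final class AsciiFastPath {

    /**
     * Copies leading ASCII bytes from src into dst and stops at the first
     * byte with the high bit set (the start of a multi-byte sequence) or
     * when either the source range or the destination is exhausted.
     *
     * @return the number of bytes (and therefore chars) copied
     */
    static int copyAscii(byte[] src, int srcOff, int srcLen, char[] dst, int dstOff) {
        int max = Math.min(srcLen, dst.length - dstOff);
        int i = 0;
        while (i < max) {
            byte b = src[srcOff + i];
            if (b < 0) { // high bit set, so not ASCII
                break;
            }
            dst[dstOff + i] = (char) b;
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        byte[] input = {'a', 'b', (byte) 0xC3, (byte) 0xA9, 'c'};
        char[] out = new char[8];
        int n = copyAscii(input, 0, input.length, out, 0);
        System.out.println(n + " ASCII chars: " + new String(out, 0, n));
    }
}
```

The loop allocates nothing and touches each byte once, so pure ASCII input never reaches the slower multi-byte branch.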
The implementations fully support non-BMP code points that result in two Java `char` (high and low surrogate).
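
As a concrete illustration of the surrogate handling: a four-byte UTF-8 sequence encodes a code point above U+FFFF, which Java represents as a high and a low surrogate. The following sketch uses only `java.lang.Character` methods; the class and method names are made up for the example, and input validation is deliberately left out.

```java
final class SurrogateExample {

    // Assembles the code point from a 4-byte UTF-8 sequence (no validation here).
    static int decodeFourBytes(int b1, int b2, int b3, int b4) {
        return ((b1 & 0x07) << 18)
             | ((b2 & 0x3F) << 12)
             | ((b3 & 0x3F) << 6)
             |  (b4 & 0x3F);
    }

    // Splits a supplementary code point into its two UTF-16 code units.
    static char[] toSurrogates(int codePoint) {
        return new char[] {
            Character.highSurrogate(codePoint),
            Character.lowSurrogate(codePoint)
        };
    }

    public static void main(String[] args) {
        // U+1F600 is encoded in UTF-8 as F0 9F 98 80
        int cp = decodeFourBytes(0xF0, 0x9F, 0x98, 0x80);
        char[] pair = toSurrogates(cp);
        System.out.printf("U+%X -> %04X %04X%n", cp, (int) pair[0], (int) pair[1]);
    }
}
```

The `char[]` return value keeps the example short. A reader that wants to avoid allocations would more likely return the high surrogate immediately and keep the low surrogate in a field for the next `read()` call.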
The implementations are currently not thread-safe.

The implementations perform full validation against Table 3.1B from [Corrigendum #1: UTF-8 Shortest Form](https://unicode.org/versions/corrigendum1.html) to catch non-shortest forms.
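
To make the validation rule concrete, here is a minimal sketch of a check against the legal byte ranges. The ranges in the comments are transcribed from the table in the linked corrigendum as best understood and should be verified against it. `Utf8Table` and `legalSequenceLength` are hypothetical names, not this project's API.

```java
final class Utf8Table {

    /**
     * Returns the length (1 to 4) of the legal UTF-8 sequence starting at
     * bytes[off], or -1 if the bytes there are illegal or truncated.
     * Later versions of the Unicode Standard also exclude ED A0..BF
     * (surrogates); that restriction is not modelled here.
     */
    static int legalSequenceLength(byte[] bytes, int off) {
        int b1 = bytes[off] & 0xFF;
        if (b1 <= 0x7F) {
            return 1; // 00..7F
        }
        int length;
        int lo = 0x80; // allowed range of the second byte
        int hi = 0xBF;
        if (b1 >= 0xC2 && b1 <= 0xDF) {
            length = 2; // C2..DF 80..BF
        } else if (b1 >= 0xE0 && b1 <= 0xEF) {
            length = 3; // E1..EF 80..BF 80..BF
            if (b1 == 0xE0) {
                lo = 0xA0; // E0 A0..BF: rejects non-shortest 3-byte forms
            }
        } else if (b1 >= 0xF0 && b1 <= 0xF4) {
            length = 4; // F1..F3 80..BF 80..BF 80..BF
            if (b1 == 0xF0) {
                lo = 0x90; // F0 90..BF: rejects non-shortest 4-byte forms
            } else if (b1 == 0xF4) {
                hi = 0x8F; // F4 80..8F: rejects code points above U+10FFFF
            }
        } else {
            return -1; // 80..C1 and F5..FF never start a legal sequence
        }
        if (off + length > bytes.length) {
            return -1; // truncated sequence
        }
        int b2 = bytes[off + 1] & 0xFF;
        if (b2 < lo || b2 > hi) {
            return -1;
        }
        for (int i = 2; i < length; i++) {
            int b = bytes[off + i] & 0xFF;
            if (b < 0x80 || b > 0xBF) {
                return -1; // remaining bytes must be 80..BF
            }
        }
        return length;
    }

    public static void main(String[] args) {
        byte[] overlongSlash = {(byte) 0xC0, (byte) 0xAF}; // non-shortest form of '/'
        byte[] eAcute = {(byte) 0xC3, (byte) 0xA9};        // U+00E9
        System.out.println(legalSequenceLength(overlongSlash, 0));
        System.out.println(legalSequenceLength(eAcute, 0));
    }
}
```

Only the second byte needs a lead-byte-specific range; every later byte is a plain continuation byte, which is why a single pair of bounds is enough.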