https://github.com/clasp-developers/trivial-utf-8

Mirror of https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8.git

# Trivial UTF-8 Manual

###### \[in package TRIVIAL-UTF-8\]
## TRIVIAL-UTF-8 ASDF System

- Description: A small library for doing UTF-8-based input and output.
- Licence: ZLIB
- Author: Marijn Haverbeke
- Maintainer: Gábor Melis
- Homepage: [https://common-lisp.net/project/trivial-utf-8/](https://common-lisp.net/project/trivial-utf-8/)
- Bug tracker: [https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8/-/issues](https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8/-/issues)
- Source control: [GIT](https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8.git)

## Introduction

Trivial UTF-8 is a small library for doing UTF-8-based input and
output on a Lisp implementation that already supports Unicode,
meaning CHAR-CODE and CODE-CHAR deal with Unicode character codes.

This library exists because, while Unicode-enabled implementations
usually do provide some kind of interface for dealing with character
encodings, these interfaces are typically not very flexible or
uniform across implementations.

The [Babel][babel] library solves a similar problem while
understanding more encodings. Trivial UTF-8 was written before Babel
existed, but for new projects you might be better off going with
Babel. The one plus that Trivial UTF-8 has is that it doesn't depend
on any other libraries.

[babel]: https://common-lisp.net/project/babel/

## Links

Here is the [official repository][trivial-utf-8-repo] and the
[HTML documentation][trivial-utf-8-doc] for the latest version.

[trivial-utf-8-repo]: https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8

[trivial-utf-8-doc]: http://melisgl.github.io/mgl-pax-world/trivial-utf-8-manual.html

## Reference

- [function] UTF-8-BYTE-LENGTH STRING

Calculate the number of bytes needed to encode STRING.

- [function] STRING-TO-UTF-8-BYTES STRING &KEY NULL-TERMINATE

Convert STRING into an array of unsigned bytes containing its UTF-8
representation. If NULL-TERMINATE, add an extra 0 byte at the end.
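
As a rough illustration of the two encoding functions above, the following sketch assumes the system can be found by ASDF and that the forms are evaluated one at a time (at the REPL or via LOAD). The variable `*sample*` is a hypothetical name, and the string is built with CODE-CHAR so the example does not depend on the source file's encoding.

```lisp
;; Assumption: the TRIVIAL-UTF-8 ASDF system is installed where ASDF can find it.
(asdf:load-system "trivial-utf-8")

;; #\h followed by U+00E9, built without non-ASCII source characters.
(defparameter *sample* (coerce (list #\h (code-char #xE9)) 'string))

;; U+00E9 takes two bytes in UTF-8, so these two characters need three bytes.
(trivial-utf-8:utf-8-byte-length *sample*)
;; => 3

(trivial-utf-8:string-to-utf-8-bytes *sample*)
;; => #(104 195 169)

;; With :NULL-TERMINATE, a trailing 0 byte is appended.
(trivial-utf-8:string-to-utf-8-bytes *sample* :null-terminate t)
;; => #(104 195 169 0)
```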

- [function] UTF-8-GROUP-SIZE BYTE

Determine the number of bytes that are part of the character whose
encoding starts with BYTE. May signal UTF-8-DECODING-ERROR.
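
A small sketch of how the group size follows from the lead byte, assuming the system is loadable via ASDF. The four bytes are sample lead bytes for 1-, 2-, 3- and 4-byte sequences as defined by UTF-8 itself.

```lisp
;; Assumption: the TRIVIAL-UTF-8 ASDF system is installed where ASDF can find it.
(asdf:load-system "trivial-utf-8")

;; #x41 = 0xxxxxxx (ASCII), #xC3 = 110xxxxx, #xE2 = 1110xxxx, #xF0 = 11110xxx.
(mapcar #'trivial-utf-8:utf-8-group-size '(#x41 #xC3 #xE2 #xF0))
;; => (1 2 3 4)
```

A byte that cannot start a character, such as a continuation byte of the form 10xxxxxx, is the case where UTF-8-DECODING-ERROR should be expected.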

- [function] UTF-8-BYTES-TO-STRING BYTES &KEY (START 0) (END (LENGTH BYTES))

Convert the subsequence of the array BYTES bounded by START and END,
which contains UTF-8 encoded characters, to a [STRING][type]. The
element type of BYTES may be anything as long as it can be `COERCE`d
into an `(UNSIGNED-BYTE 8)` array. May signal UTF-8-DECODING-ERROR.
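
The following sketch shows decoding with and without the START and END bounds, assuming the system is loadable via ASDF. The variable `*bytes*` is a hypothetical name for a hand-built byte vector.

```lisp
;; Assumption: the TRIVIAL-UTF-8 ASDF system is installed where ASDF can find it.
(asdf:load-system "trivial-utf-8")

;; Bytes for #\h, U+00E9 (two bytes: 195 169) and #\!.
(defparameter *bytes*
  (make-array 4 :element-type '(unsigned-byte 8)
                :initial-contents '(104 195 169 33)))

;; Decoding the whole array yields a three-character string.
(length (trivial-utf-8:utf-8-bytes-to-string *bytes*))
;; => 3

;; Decoding only bytes 1 and 2 yields the single character U+00E9.
(char-code (char (trivial-utf-8:utf-8-bytes-to-string *bytes* :start 1 :end 3) 0))
;; => 233
```

Note that START and END are byte offsets, not character offsets, so they need to fall on character boundaries.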

- [function] READ-UTF-8-STRING INPUT &KEY NULL-TERMINATED STOP-AT-EOF (CHAR-LENGTH -1) (BYTE-LENGTH -1)

Read UTF-8 encoded data from INPUT, a byte stream, and construct a
string from the characters found. If NULL-TERMINATED is true, stop
reading at a null character. If STOP-AT-EOF is true, stop at
END-OF-FILE without signalling an error. The CHAR-LENGTH and
BYTE-LENGTH parameters specify the maximum number of characters or
bytes to read, where -1 means no limit. May signal
UTF-8-DECODING-ERROR.
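
A minimal sketch of reading from a file, assuming the system is loadable via ASDF. Standard Common Lisp has no in-memory byte streams, so the example opens a file as a binary stream. The helpers `read-utf-8-file` and `read-utf-8-prefix` are hypothetical names and not part of the library.

```lisp
;; Assumption: the TRIVIAL-UTF-8 ASDF system is installed where ASDF can find it.
(asdf:load-system "trivial-utf-8")

(defun read-utf-8-file (path)
  "Hypothetical helper: return the contents of the file at PATH as a string."
  ;; READ-UTF-8-STRING expects a byte stream, so open the file with an
  ;; (UNSIGNED-BYTE 8) element type rather than as a character stream.
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (trivial-utf-8:read-utf-8-string in :stop-at-eof t)))

(defun read-utf-8-prefix (path n)
  "Hypothetical helper: return at most the first N characters of the file at PATH."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (trivial-utf-8:read-utf-8-string in :stop-at-eof t :char-length n)))
```

Passing `:stop-at-eof t` in both helpers means that a file shorter than the requested limit is returned as-is instead of causing an END-OF-FILE error.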

- [condition] UTF-8-DECODING-ERROR SIMPLE-ERROR
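
Since the condition is a subtype of SIMPLE-ERROR, it can be handled with the usual HANDLER-CASE. The sketch below assumes the system is loadable via ASDF; `decode-or-nil` is a hypothetical helper, not part of the library.

```lisp
;; Assumption: the TRIVIAL-UTF-8 ASDF system is installed where ASDF can find it.
(asdf:load-system "trivial-utf-8")

(defun decode-or-nil (bytes)
  "Hypothetical helper: decode BYTES as UTF-8, returning NIL if they are malformed."
  (handler-case (trivial-utf-8:utf-8-bytes-to-string bytes)
    (trivial-utf-8:utf-8-decoding-error () nil)))

;; #xFF never occurs in valid UTF-8, so decoding is expected to fail here.
(decode-or-nil (make-array 1 :element-type '(unsigned-byte 8)
                             :initial-contents '(#xFF)))
;; => NIL
```

Returning NIL is only one possible policy; callers that need to distinguish malformed input from an empty result may prefer to let the condition propagate.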

* * *
###### \[generated by [MGL-PAX](https://github.com/melisgl/mgl-pax)\]