https://github.com/tahonermann/text_view

A C++ concepts and range based character encoding and code point enumeration library

[![Travis Build Status](https://travis-ci.org/tahonermann/text_view.svg?branch=master)](https://travis-ci.org/tahonermann/text_view)
Travis CI (Linux:gcc)

[![codecov](https://codecov.io/gh/tahonermann/text_view/branch/master/graph/badge.svg)](https://codecov.io/gh/tahonermann/text_view)

[![experimental](http://badges.github.io/stability-badges/dist/experimental.svg)](http://github.com/badges/stability-badges)

# [Text_view]
A [C++ Concepts][ISO/IEC 19217:2015]
based character encoding and code point enumeration library.

This project is the reference implementation for proposal [P0244] for the C++
standard.

This port of [Text_view] requires a C++17 conforming compiler that implements
[ISO/IEC technical specification 19217:2015, **C++ Extensions for concepts**
][ISO/IEC 19217:2015]. A port of [Text_view] that builds with C++11 conforming
compilers is available at [Text_view for range-v3][Text_view-range-v3].

For discussion of this project, please post and/or subscribe to the
`text_view@googlegroups.com` group hosted at
https://groups.google.com/d/forum/text_view

- [Overview](#overview)
- [Current features and limitations](#current-features-and-limitations)
- [Requirements](#requirements)
- [Build and installation](#build-and-installation)
- [Building and installing gcc](#building-and-installing-gcc)
- [Building and installing cmcstl2](#building-and-installing-cmcstl2)
- [Building and installing Text_view](#building-and-installing-text_view)
- [Usage](#usage)
- [Header <experimental/text_view> synopsis](#header-experimentaltext_view-synopsis)
- [Concepts](#concepts)
- [Error Policies](#error-policies)
- [Error Status](#error-status)
- [Exceptions](#exceptions)
- [Type traits](#type-traits)
- [Character sets](#character-sets)
- [Character set identification](#character-set-identification)
- [Character set information](#character-set-information)
- [Characters](#characters)
- [Encodings](#encodings)
- [Text iterators](#text-iterators)
- [Text view](#text-view)
- [Supported Encodings](#supported-encodings)
- [Terminology](#terminology)
- [Code Unit](#code-unit)
- [Code Point](#code-point)
- [Character Set](#character-set)
- [Character](#character)
- [Encoding](#encoding)
- [References](#references)

# Overview
[C++11][ISO/IEC 14882:2011] added support for new character types ([N2249]) and
[Unicode] string literals ([N2442]), but neither [C++11][ISO/IEC 14882:2011]
nor more recent standards have provided a means of efficiently and conveniently
enumerating [code points](#code-point) in [Unicode] or legacy encodings. While
it is possible to implement such enumeration using interfaces provided in the
standard `<codecvt>` library, doing so is awkward, requires that text be
provided as pointers to contiguous memory, and is inefficient due to virtual
function call overhead (__examples and data required to back up these
assertions__).

[Text_view] provides iterator and range based interfaces for encoding and
decoding strings in a variety of [character encodings](#encoding). The
interface is intended to support all modern and legacy
[character encodings](#encoding), though this library does not yet provide
implementations for legacy [encodings](#encoding).

An example usage follows. Note that `\u00F8` (LATIN SMALL LETTER O WITH STROKE)
is encoded as UTF-8 using two [code units](#code-unit) (`\xC3\xB8`), but
iterator based enumeration sees just the single [code point](#code-point).

```C++
using CT = utf8_encoding::character_type;
auto tv = make_text_view(u8"J\u00F8erg is my friend");
auto it = tv.begin();
assert(*it++ == CT{0x004A}); // 'J'
assert(*it++ == CT{0x00F8}); // 'ø'
assert(*it++ == CT{0x0065}); // 'e'
```
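To make the code unit versus code point distinction concrete without depending on [Text_view] itself, the following is a minimal standalone sketch of UTF-8 decoding. The function name `decode_utf8` is hypothetical and is not part of the library; the sketch assumes well-formed input and does not diagnose errors.

```C++
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not part of Text_view: decodes well-formed UTF-8
// into code points. Malformed input is not diagnosed.
inline std::vector<std::uint32_t> decode_utf8(const std::string &s) {
    std::vector<std::uint32_t> code_points;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::uint32_t cp;
        std::size_t trailing;
        if (lead < 0x80)      { cp = lead;        trailing = 0; } // 0xxxxxxx
        else if (lead < 0xE0) { cp = lead & 0x1F; trailing = 1; } // 110xxxxx
        else if (lead < 0xF0) { cp = lead & 0x0F; trailing = 2; } // 1110xxxx
        else                  { cp = lead & 0x07; trailing = 3; } // 11110xxx
        ++i;
        // Each trailing code unit (10xxxxxx) contributes six more bits.
        for (; trailing > 0 && i < s.size(); --trailing, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        }
        code_points.push_back(cp);
    }
    return code_points;
}
```

Calling `decode_utf8("J\xC3\xB8" "erg")` consumes six code units and yields five code points, with the pair `\xC3\xB8` collapsing into `0x00F8`. The string literal is split so that the `e` is not parsed as part of the hexadecimal escape.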

The iterators and ranges that [Text_view] provides are compatible with the
non-modifying sequence utilities provided by the standard C++ `<algorithm>`
library. This enables use of standard algorithms to search encoded text.

```C++
it = std::find(tv.begin(), tv.end(), CT{0x00F8});
assert(it != tv.end());
```

The iterators provided by [Text_view] also provide access to the underlying
[code unit](#code-unit) sequence.

```C++
auto base_it = it.base_range().begin();
assert(*base_it++ == '\xC3');
assert(*base_it++ == '\xB8');
assert(base_it == it.base_range().end());
```

[Text_view] ranges satisfy the requirements for use in
[C++11][ISO/IEC 14882:2011] range-based for statements, provided that the
same-type restriction on the begin and end expressions has been removed as
specified by [P0184R0] and adopted for C++17.

```C++
for (const auto &ch : tv) {
...
}
```

# Current features and limitations
[Text_view] provides interfaces for the following:
- Encoding and decoding of text for the [encodings](#encoding) listed in
[supported encodings](#supported-encodings).
- Encoding text using [C++11][ISO/IEC 14882:2011] compliant output iterators.
- Decoding text using input, forward, bidirectional, and random access
iterators that are compliant with standard iterator requirements as specified
in the [ranges proposal][N4560].
- Constructing view adapters for encoded text stored in arrays, containers,
or std::basic_string, or referenced by another range or view. These view
adapters meet the requirements for views in the [ranges proposal][N4560].

[Text_view] does **not** currently provide interfaces for the following:
- Transcoding of code points from one [character set](#character-set) to
another.
- Iterators for grapheme clusters or other boundary conditions.
- Collation.
- Localization.
- Internationalization.
- [Unicode] code point properties.
- [Unicode] normalization.

# Requirements
[Text_view] requires a C++ compiler that implements [ISO/IEC technical
specification 19217:2015, **C++ Extensions for concepts**][ISO/IEC 19217:2015].
As of 2016-08-26, this specification is only supported by [gcc] release
6.2.0 or later. Additionally, [Text_view] depends on the [cmcstl2]
implementation of the [ranges proposal][N4560] for concept definitions.

# Build and installation
This section provides instructions for building [Text_view] and suitable
versions of its dependencies.

## Building and installing [gcc]
[Text_view] requires [gcc] version 6.2.0 or later. The following commands
can be used to perform a suitable build of the current in-development release
of [gcc] on Linux if an installation of [gcc] 6.2.0 or later is not available.
If you have an installation of [gcc] 6.2.0 or later available, then there is
no need to build [gcc] yourself.

```sh
$ svn co svn://gcc.gnu.org/svn/gcc/trunk gcc-trunk-src
$ curl -O ftp://ftp.gnu.org/gnu/gmp/gmp-5.1.1.tar.bz2
$ curl -O ftp://ftp.gnu.org/gnu/mpfr/mpfr-3.1.2.tar.bz2
$ curl -O ftp://ftp.gnu.org/gnu/mpc/mpc-1.0.1.tar.gz
$ cd gcc-trunk-src
$ svn update -r 234230 # Optional command to select a known good gcc version
$ bzip2 -d -c ../gmp-5.1.1.tar.bz2 | tar -xvf -
$ mv gmp-5.1.1 gmp
$ bzip2 -d -c ../mpfr-3.1.2.tar.bz2 | tar -xvf -
$ mv mpfr-3.1.2 mpfr
$ tar -zxvf ../mpc-1.0.1.tar.gz
$ mv mpc-1.0.1 mpc
$ cd ..
$ mkdir gcc-trunk-build
$ cd gcc-trunk-build
$ LIBRARY_PATH=/usr/lib/$(gcc -print-multiarch); export LIBRARY_PATH
$ CPATH=/usr/include/$(gcc -print-multiarch); export CPATH
$ ../gcc-trunk-src/configure \
CC=gcc \
CXX=g++ \
--prefix $(pwd)/../gcc-trunk-install \
--disable-multilib \
--disable-bootstrap \
--enable-languages=c,c++
$ make -j 4
$ make install
$ cd ..
```

When complete, the new [gcc] build will be present in the `gcc-trunk-install`
directory.

## Building and installing [cmcstl2]
[Text_view] only depends on headers provided by [cmcstl2] and no build or
installation is required. [Text_view] is known to build successfully with
[cmcstl2] git revision `eb5ecdf79e22eb68c86cb62fd0912559593e5597`. The
following commands can be used to check out a known good revision.

```sh
$ git clone https://github.com/CaseyCarter/cmcstl2.git cmcstl2
$ cd cmcstl2
$ git checkout eb5ecdf79e22eb68c86cb62fd0912559593e5597
```

## Building and installing [Text_view]
[Text_view] has a [CMake] based build system sufficient to build and run its
tests, to validate example code, and to perform a minimal installation following
established operating system conventions. By default, files will be installed
under `/usr/local` on UNIX and UNIX-like systems, and under `C:\Program Files`
on Windows. The installation location can be changed by invoking `cmake` with
a `-DCMAKE_INSTALL_PREFIX=` option. On UNIX and UNIX-like systems, header
files will be installed in the `include` directory of the installation
destination, and other files will be installed under `share/text_view`. On
Windows, header files will be installed in the `text_view\include` directory of the
installation destination, and other files will be installed under `text_view`.

Unless [cmcstl2] is installed to a common location, it will be necessary to
inform the build where it is installed. This is typically done by setting the
`CMCSTL2_INSTALL_PATH` environment variable. As of this writing, [cmcstl2]
does not provide an installation option, so `CMCSTL2_INSTALL_PATH` should
specify the location where the [cmcstl2] source resides (the directory that
contains the [cmcstl2] `include` directory).

The following commands suffice to build and run tests and examples, and perform
an installation. If the build succeeds, built test and example programs will
be present in the `test` and `examples` subdirectories of the build directory
(the built test and example programs are not installed), and header files,
example code, cmake package configuration modules, and other miscellaneous files
will be present in the installation directory.

```sh
$ vi setenv.sh # Update GCC_INSTALL_PATH and CMCSTL2_INSTALL_PATH.
$ . ./setenv.sh
$ mkdir build
$ cd build
$ cmake .. [-DCMAKE_INSTALL_PREFIX=/path/to/install/to]
$ cmake --build . --target install
$ ctest
```

`check` and `check-install` [CMake] targets are also available for automating
build and test. The `check` target performs a build without installation and
then runs the tests. The `check-install` target performs a build, runs tests,
installs to a location within the build directory, and then performs tests
(verifying that example code builds) on the installation.

The installation includes a [CMake] based build system for building the example
code. To build all of the examples, run `cmake` specifying the `examples`
directory of the installation as the source directory. Alternatively, each
example can be built independently by specifying its source directory as the
source directory in a `cmake` invocation. If the installation was to a
non-default installation location (`-DCMAKE_INSTALL_PREFIX` was specified),
then it may be necessary to set `CMAKE_PREFIX_PATH` to the [Text_view]
installation location (the location `CMAKE_INSTALL_PREFIX` was set to) or
`text_view_DIR` to the directory containing the installed
`text_view-config.cmake` file, so that the [Text_view] package configuration
file is found. See the [CMake] documentation for more details.

The following commands suffice to build all of the installed examples.

```sh
$ cd /path/to/installation/text_view/examples
$ mkdir build
$ cd build
$ cmake .. [-DCMAKE_PREFIX_PATH=/path/to/installation]
$ cmake --build .
$ ctest
```

# Usage
To use [Text_view] in your own code, perform a build and installation as
described above, add include paths for the `text_view/include` and [cmcstl2]
installation locations, add a library search path for the `text_view/lib`
directory, include the `text_view` header file in your sources, and link the
[text_view] library with your executable.

```C++
#include <experimental/text_view>
```

[Text_view] installations include a [CMake] package configuration file suitable
for use in [CMake] based projects. To use it, specify `text_view` as the
`<package>` argument to `find_package` in your [CMake] file and add invocations
of `target_link_libraries` for each relevant target with the `<item>` argument
set to `text-view`. This will automatically apply compiler and linker options
required to use [Text_view] to each target. See the `CMakeLists.txt` files for
the utilities under the `examples` directory for reference. If [Text_view] was
installed to a non-default installation location (`-DCMAKE_INSTALL_PREFIX` was
specified), then it may be necessary to set `CMAKE_PREFIX_PATH` to the
[Text_view] installation location (the location `CMAKE_INSTALL_PREFIX` was set
to) or `text_view_DIR` to the directory containing the installed
`text_view-config.cmake` file, so that the [Text_view] package configuration
file is found. It is also possible to use the build directory as a
(non-relocatable) installation directory by setting the `CMAKE_PREFIX_PATH`
or `text_view_DIR` variables appropriately. See the [CMake] documentation for
more details. The `CMakeLists.txt` files provided with the installed examples
exemplify a minimal [CMake] based build system for a downstream consumer of
[Text_view].

All interfaces intended for public use are declared in the
`std::experimental::text` namespace. The `text` namespace is an inline
namespace, so all entities are available from the `std::experimental` namespace
itself.

The interface descriptions in the sections that follow use the concept names
from the [ranges proposal][N4560], are intended to be used as specification,
and should be considered authoritative. Any differences in behavior as
defined by these definitions as compared to the [Text_view] implementation are
unintentional and should be considered indicative of a defect in either the
specification or the implementation.

## Header <experimental/text_view> synopsis

```C++
namespace std {
namespace experimental {
inline namespace text {

// concepts:
template<typename T> concept bool CodeUnit();
template<typename T> concept bool CodePoint();
template<typename T> concept bool CharacterSet();
template<typename T> concept bool Character();
template<typename T> concept bool CodeUnitIterator();
template<typename IT, typename CUT> concept bool CodeUnitOutputIterator();
template<typename T> concept bool TextEncodingState();
template<typename T> concept bool TextEncodingStateTransition();
template<typename T> concept bool TextErrorPolicy();
template<typename T> concept bool TextEncoding();
template<typename T, typename CUIT> concept bool TextEncoder();
template<typename T, typename CUIT> concept bool TextForwardDecoder();
template<typename T, typename CUIT> concept bool TextBidirectionalDecoder();
template<typename T, typename CUIT> concept bool TextRandomAccessDecoder();
template<typename T> concept bool TextIterator();
template<typename T, typename I> concept bool TextSentinel();
template<typename T> concept bool TextOutputIterator();
template<typename T> concept bool TextInputIterator();
template<typename T> concept bool TextForwardIterator();
template<typename T> concept bool TextBidirectionalIterator();
template<typename T> concept bool TextRandomAccessIterator();
template<typename T> concept bool TextView();
template<typename T> concept bool TextInputView();
template<typename T> concept bool TextForwardView();
template<typename T> concept bool TextBidirectionalView();
template<typename T> concept bool TextRandomAccessView();

// error policies:
class text_error_policy;
class text_strict_error_policy;
class text_permissive_error_policy;
using text_default_error_policy = text_strict_error_policy;

// error handling:
enum class encode_status : int {
no_error = /* implementation-defined */,
invalid_character = /* implementation-defined */,
invalid_state_transition = /* implementation-defined */
};
enum class decode_status : int {
no_error = /* implementation-defined */,
no_character = /* implementation-defined */,
invalid_code_unit_sequence = /* implementation-defined */,
underflow = /* implementation-defined */
};
constexpr inline bool status_ok(encode_status es) noexcept;
constexpr inline bool status_ok(decode_status ds) noexcept;
constexpr inline bool error_occurred(encode_status es) noexcept;
constexpr inline bool error_occurred(decode_status ds) noexcept;
const char* status_message(encode_status es) noexcept;
const char* status_message(decode_status ds) noexcept;

// exception classes:
class text_error;
class text_encode_error;
class text_decode_error;

// character sets:
class any_character_set;
class basic_execution_character_set;
class basic_execution_wide_character_set;
class unicode_character_set;

// implementation defined character set type aliases:
using execution_character_set = /* implementation-defined */ ;
using execution_wide_character_set = /* implementation-defined */ ;
using universal_character_set = /* implementation-defined */ ;

// character set identification:
class character_set_id;

template<CharacterSet CST>
inline character_set_id get_character_set_id();

// character set information:
class character_set_info;

template<CharacterSet CST>
inline const character_set_info& get_character_set_info();
const character_set_info& get_character_set_info(character_set_id id);

// character set and encoding traits:
template<typename T>
using code_unit_type_t = /* implementation-defined */ ;
template<typename T>
using code_point_type_t = /* implementation-defined */ ;
template<typename T>
using character_set_type_t = /* implementation-defined */ ;
template<typename T>
using character_type_t = /* implementation-defined */ ;
template<typename T>
using encoding_type_t = /* implementation-defined */ ;
template<typename T>
using default_encoding_type_t = /* implementation-defined */ ;

// characters:
template<CharacterSet CST> class character;
template <> class character<any_character_set>;

template<CharacterSet CST>
bool operator==(const character<any_character_set> &lhs,
const character<CST> &rhs);
template<CharacterSet CST>
bool operator==(const character<CST> &lhs,
const character<any_character_set> &rhs);
template<CharacterSet CST>
bool operator!=(const character<any_character_set> &lhs,
const character<CST> &rhs);
template<CharacterSet CST>
bool operator!=(const character<CST> &lhs,
const character<any_character_set> &rhs);

// encoding state and transition types:
class trivial_encoding_state;
class trivial_encoding_state_transition;
class utf8bom_encoding_state;
class utf8bom_encoding_state_transition;
class utf16bom_encoding_state;
class utf16bom_encoding_state_transition;
class utf32bom_encoding_state;
class utf32bom_encoding_state_transition;

// encodings:
class basic_execution_character_encoding;
class basic_execution_wide_character_encoding;
#if defined(__STDC_ISO_10646__)
class iso_10646_wide_character_encoding;
#endif // __STDC_ISO_10646__
class utf8_encoding;
class utf8bom_encoding;
class utf16_encoding;
class utf16be_encoding;
class utf16le_encoding;
class utf16bom_encoding;
class utf32_encoding;
class utf32be_encoding;
class utf32le_encoding;
class utf32bom_encoding;

// implementation defined encoding type aliases:
using execution_character_encoding = /* implementation-defined */ ;
using execution_wide_character_encoding = /* implementation-defined */ ;
using char8_character_encoding = /* implementation-defined */ ;
using char16_character_encoding = /* implementation-defined */ ;
using char32_character_encoding = /* implementation-defined */ ;

// itext_iterator:
template
requires TextForwardDecoder()
class itext_iterator;

// itext_sentinel:
template
class itext_sentinel;

// otext_iterator:
template> CUIT,
TextErrorPolicy TEP = text_default_error_policy>
class otext_iterator;

// otext_iterator factory functions:
template> IT>
auto make_otext_iterator(typename ET::state_type state, IT out)
-> otext_iterator;
template> IT>
auto make_otext_iterator(typename ET::state_type state, IT out)
-> otext_iterator;
template> IT>
auto make_otext_iterator(IT out)
-> otext_iterator;
template> IT>
auto make_otext_iterator(IT out)
-> otext_iterator;

// basic_text_view:
template
class basic_text_view;

// basic_text_view type aliases:
using text_view = basic_text_view;
using wtext_view = basic_text_view;
using u8text_view = basic_text_view;
using u16text_view = basic_text_view;
using u32text_view = basic_text_view;

// basic_text_view factory functions:
template ST>
auto make_text_view(typename ET::state_type state, IT first, ST last)
-> basic_text_view;
template ST>
requires requires () {
typename default_encoding_type_t>;
}
auto make_text_view(typename default_encoding_type_t>::state_type state,
IT first,
ST last)
-> basic_text_view>, /* implementation-defined */ >;
template ST>
auto make_text_view(IT first, ST last)
-> basic_text_view;
template ST>
requires requires () {
typename default_encoding_type_t>;
}
auto make_text_view(IT first, ST last)
-> basic_text_view>, /* implementation-defined */ >;
template
auto make_text_view(typename ET::state_type state,
IT first,
typename std::make_unsigned>::type n)
-> basic_text_view;
template
requires requires () {
typename default_encoding_type_t>;
}
auto make_text_view(typename typename default_encoding_type_t>::state_type state,
IT first,
typename std::make_unsigned>::type n)
-> basic_text_view>, /* implementation-defined */ >;
template
auto make_text_view(IT first,
typename std::make_unsigned>::type n)
-> basic_text_view;
template
requires requires () {
typename default_encoding_type_t>;
}
auto make_text_view(IT first,
typename std::make_unsigned>::type n)
-> basic_text_view>, /* implementation-defined */ >;
template
auto make_text_view(typename ET::state_type state,
const Iterable &iterable)
-> basic_text_view;
template
requires requires () {
typename default_encoding_type_t>>;
}
auto make_text_view(typename default_encoding_type_t>>::state_type state,
const Iterable &iterable)
-> basic_text_view>>, /* implementation-defined */ >;
template
auto make_text_view(const Iterable &iterable)
-> basic_text_view;
template
requires requires () {
typename default_encoding_type_t>>;
}
auto make_text_view(const Iterable &iterable)
-> basic_text_view>>, /* implementation-defined */ >;
template TST>
auto make_text_view(TIT first, TST last)
-> basic_text_view;
template
TVT make_text_view(TVT tv);

} // inline namespace text
} // namespace experimental
} // namespace std
```

## Concepts

- [Concept CodeUnit](#concept-codeunit)
- [Concept CodePoint](#concept-codepoint)
- [Concept CharacterSet](#concept-characterset)
- [Concept Character](#concept-character)
- [Concept CodeUnitIterator](#concept-codeunititerator)
- [Concept CodeUnitOutputIterator](#concept-codeunitoutputiterator)
- [Concept TextEncodingState](#concept-textencodingstate)
- [Concept TextEncodingStateTransition](#concept-textencodingstatetransition)
- [Concept TextErrorPolicy](#concept-texterrorpolicy)
- [Concept TextEncoding](#concept-textencoding)
- [Concept TextEncoder](#concept-textencoder)
- [Concept TextForwardDecoder](#concept-textforwarddecoder)
- [Concept TextBidirectionalDecoder](#concept-textbidirectionaldecoder)
- [Concept TextRandomAccessDecoder](#concept-textrandomaccessdecoder)
- [Concept TextIterator](#concept-textiterator)
- [Concept TextSentinel](#concept-textsentinel)
- [Concept TextOutputIterator](#concept-textoutputiterator)
- [Concept TextInputIterator](#concept-textinputiterator)
- [Concept TextForwardIterator](#concept-textforwarditerator)
- [Concept TextBidirectionalIterator](#concept-textbidirectionaliterator)
- [Concept TextRandomAccessIterator](#concept-textrandomaccessiterator)
- [Concept TextView](#concept-textview)
- [Concept TextInputView](#concept-textinputview)
- [Concept TextForwardView](#concept-textforwardview)
- [Concept TextBidirectionalView](#concept-textbidirectionalview)
- [Concept TextRandomAccessView](#concept-textrandomaccessview)

### Concept CodeUnit
The `CodeUnit` concept specifies requirements for a type usable as the
[code unit](#code-unit) type of a string type.

```C++
template<typename T> concept bool CodeUnit() {
    return /* implementation-defined */ ;
}
```

`CodeUnit<T>()` is satisfied if and only if
`std::is_integral<T>::value` is true and at least one of
`std::is_unsigned<T>::value` is true,
`std::is_same<std::remove_cv_t<T>, char>::value` is true, or
`std::is_same<std::remove_cv_t<T>, wchar_t>::value` is true.
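The rule above can be approximated in standard C++ without the Concepts TS. The following is a sketch only: the trait name `is_code_unit_like` is hypothetical, and it assumes that cv-qualifiers are stripped before the comparison with `char` and `wchar_t`. [Text_view] expresses the rule as a concept, not as this trait.

```C++
#include <type_traits>

// Hypothetical trait mirroring the rule stated above: an integral type that
// is either unsigned, or is char or wchar_t once cv-qualifiers are removed.
template<typename T>
struct is_code_unit_like
    : std::integral_constant<bool,
          std::is_integral<T>::value
          && (std::is_unsigned<T>::value
              || std::is_same<typename std::remove_cv<T>::type, char>::value
              || std::is_same<typename std::remove_cv<T>::type, wchar_t>::value)>
{};

static_assert(is_code_unit_like<char>::value, "char qualifies by name");
static_assert(is_code_unit_like<const wchar_t>::value, "cv-qualifiers are ignored");
static_assert(is_code_unit_like<unsigned char>::value, "unsigned integral type");
static_assert(is_code_unit_like<char16_t>::value, "char16_t is unsigned");
static_assert(is_code_unit_like<char32_t>::value, "char32_t is unsigned");
static_assert(!is_code_unit_like<signed char>::value, "signed and not char");
static_assert(!is_code_unit_like<int>::value, "signed integral type");
static_assert(!is_code_unit_like<float>::value, "not an integral type");
```

Note that `char` is accepted regardless of whether it is signed on the target platform, which is why it is named explicitly in the rule.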

### Concept CodePoint
The `CodePoint` concept specifies requirements for a type usable as the
[code point](#code-point) type of a [character set](#character-set) type.

```C++
template<typename T> concept bool CodePoint() {
    return /* implementation-defined */ ;
}
```

`CodePoint<T>()` is satisfied if and only if
`std::is_integral<T>::value` is true and at least one of
`std::is_unsigned<T>::value` is true,
`std::is_same<std::remove_cv_t<T>, char>::value` is true, or
`std::is_same<std::remove_cv_t<T>, wchar_t>::value` is true.

### Concept CharacterSet
The `CharacterSet` concept specifies requirements for a type that describes
a [character set](#character-set). Such a type has a member typedef-name
declaration for a type that satisfies `CodePoint`, a static member function
that returns a name for the [character set](#character-set), and a static
member function that returns a code point value to be used to construct a
substitution character to stand in when errors occur during encoding and
decoding operations when the permissive error policy is in effect.

```C++
template<typename T> concept bool CharacterSet() {
    return CodePoint<code_point_type_t<T>>()
        && requires () {
               { T::get_name() } noexcept -> const char *;
               { T::get_substitution_code_point() } noexcept -> code_point_type_t<T>;
           };
}
```
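As an illustration of the shape such a type takes, here is a minimal sketch of a character set description. The type `ascii_like_character_set`, its name string, the member typedef name `code_point_type`, and the choice of `?` as the substitution code point are all assumptions made for this example; it is not one of the character sets that [Text_view] provides.

```C++
#include <cstring>

// Hypothetical character set type for illustration only.
struct ascii_like_character_set {
    // Member typedef naming the code point type (an unsigned integral type).
    using code_point_type = unsigned char;

    // Returns a name for the character set.
    static const char *get_name() noexcept { return "ascii-like"; }

    // Returns the code point used to build the substitution character;
    // 0x3F ('?') is an arbitrary choice made for this sketch.
    static code_point_type get_substitution_code_point() noexcept {
        return 0x3F;
    }
};

// Both static member functions are non-throwing, as the concept requires.
static_assert(noexcept(ascii_like_character_set::get_name()),
              "get_name must be noexcept");
static_assert(noexcept(ascii_like_character_set::get_substitution_code_point()),
              "get_substitution_code_point must be noexcept");
```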

### Concept Character
The `Character` concept specifies requirements for a type that describes a
[character](#character) as defined by an associated
[character set](#character-set). Non-static member functions provide access to
the [code point](#code-point) value of the described [character](#character).
Types that satisfy `Character` are regular and copyable.

```C++
template<typename T> concept bool Character() {
    return ranges::Regular<T>()
        && ranges::Constructible<T, code_point_type_t<character_set_type_t<T>>>()
        && CharacterSet<character_set_type_t<T>>()
        && requires (T t,
                     const T ct,
                     code_point_type_t<character_set_type_t<T>> cp)
           {
               { t.set_code_point(cp) } noexcept;
               { ct.get_code_point() } noexcept
                   -> code_point_type_t<character_set_type_t<T>>;
               { ct.get_character_set_id() }
                   -> character_set_id;
           };
}
```

### Concept CodeUnitIterator
The `CodeUnitIterator` concept specifies requirements of an iterator that
has a value type that satisfies `CodeUnit`.

```C++
template<typename T> concept bool CodeUnitIterator() {
    return ranges::Iterator<T>()
        && CodeUnit<ranges::value_type_t<T>>();
}
```

### Concept CodeUnitOutputIterator
The `CodeUnitOutputIterator` concept specifies requirements of an output
iterator that can be assigned from a type that satisfies `CodeUnit`.

```C++
template<typename IT, typename CUT> concept bool CodeUnitOutputIterator() {
    return ranges::OutputIterator<IT, CUT>()
        && CodeUnit<CUT>();
}
```

### Concept TextEncodingState
The `TextEncodingState` concept specifies requirements of types that hold
[encoding](#encoding) state. Such types are semiregular.

```C++
template<typename T> concept bool TextEncodingState() {
    return ranges::Semiregular<T>();
}
```

### Concept TextEncodingStateTransition
The `TextEncodingStateTransition` concept specifies requirements of types
that hold [encoding](#encoding) state transitions. Such types are
semiregular.

```C++
template<typename T> concept bool TextEncodingStateTransition() {
    return ranges::Semiregular<T>();
}
```

### Concept TextErrorPolicy
The `TextErrorPolicy` concept specifies requirements of types used
to specify error handling policies. Such types are semiregular class types
that derive from class `text_error_policy`.

```C++
template<typename T> concept bool TextErrorPolicy() {
    return ranges::Semiregular<T>()
        && ranges::DerivedFrom<T, text_error_policy>()
        && !ranges::Same<std::remove_cv_t<T>, text_error_policy>();
}
```
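The same idea can be approximated with standard type traits. In the sketch below, every name (`policy_base`, `strict_policy`, `permissive_policy`, `is_error_policy_like`) is a hypothetical stand-in, and semiregularity is approximated as default constructible plus copy constructible plus copy assignable, which is weaker than the ranges proposal's definition.

```C++
#include <type_traits>

// Hypothetical stand-ins for the policy base class and two derived policies.
class policy_base {};
class strict_policy : public policy_base {};
class permissive_policy : public policy_base {};

// Approximates the concept: a semiregular type that derives from the base
// class but is not the base class itself.
template<typename T>
struct is_error_policy_like
    : std::integral_constant<bool,
          std::is_default_constructible<T>::value
          && std::is_copy_constructible<T>::value
          && std::is_copy_assignable<T>::value
          && std::is_base_of<policy_base, T>::value
          && !std::is_same<typename std::remove_cv<T>::type, policy_base>::value>
{};

static_assert(is_error_policy_like<strict_policy>::value, "derived policy");
static_assert(is_error_policy_like<permissive_policy>::value, "derived policy");
static_assert(!is_error_policy_like<policy_base>::value, "base is excluded");
static_assert(!is_error_policy_like<int>::value, "unrelated type");
```

Excluding the base class itself means a policy must always be one of the concrete derived types, so the base serves purely as a tag.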

### Concept TextEncoding
The `TextEncoding` concept specifies requirements of types that define an
[encoding](#encoding). Such types define member types that identify the
[code unit](#code-unit), [character](#character), encoding state, and encoding
state transition types, a static member function that returns an initial
encoding state object that defines the encoding state at the beginning of a
sequence of encoded characters, and static data members that specify the
minimum and maximum number of [code units](#code-unit) used to encode any
single character.

```C++
template<typename T> concept bool TextEncoding() {
    return requires () {
               { T::min_code_units } noexcept -> int;
               { T::max_code_units } noexcept -> int;
           }
        && TextEncodingState<typename T::state_type>()
        && TextEncodingStateTransition<typename T::state_transition_type>()
        && CodeUnit<code_unit_type_t<T>>()
        && Character<character_type_t<T>>()
        && requires () {
               { T::initial_state() } noexcept
                   -> const typename T::state_type&;
           };
}
```
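A minimal sketch of a type with this overall shape follows. All names are hypothetical, the member typedef names `code_unit_type` and `character_type` are assumptions, and a plain `char32_t` stands in for a full character type, so this type would not satisfy the real `TextEncoding` concept as written.

```C++
#include <type_traits>

// Hypothetical empty state and state transition types.
struct empty_state {};
struct empty_state_transition {};

// Hypothetical fixed-width encoding description: one code unit per character.
struct fixed_width_encoding {
    using state_type = empty_state;
    using state_transition_type = empty_state_transition;
    using code_unit_type = char32_t;
    // Simplification: a code point stands in for a character type here.
    using character_type = char32_t;

    // Minimum and maximum number of code units per encoded character.
    static constexpr int min_code_units = 1;
    static constexpr int max_code_units = 1;

    // Encoding state at the beginning of a sequence of encoded characters.
    static const state_type &initial_state() noexcept {
        static const state_type state{};
        return state;
    }
};

static_assert(fixed_width_encoding::min_code_units
                  <= fixed_width_encoding::max_code_units,
              "minimum cannot exceed maximum");
static_assert(noexcept(fixed_width_encoding::initial_state()),
              "initial_state must be noexcept");
```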

### Concept TextEncoder
The `TextEncoder` concept specifies requirements of types that are used to
encode [characters](#character) using a particular [code unit](#code-unit)
iterator that satisfies `OutputIterator`. Such a type satisfies
`TextEncoding` and defines static member functions used to encode state
transitions and [characters](#character).

```C++
template<typename T, typename CUIT> concept bool TextEncoder() {
    return TextEncoding<T>()
        && ranges::OutputIterator<CUIT, code_unit_type_t<T>>()
        && requires (
               typename T::state_type &state,
               CUIT &out,
               typename T::state_transition_type stt,
               int &encoded_code_units)
           {
               { T::encode_state_transition(state, out, stt, encoded_code_units) }
                   -> encode_status;
           }
        && requires (
               typename T::state_type &state,
               CUIT &out,
               character_type_t<T> c,
               int &encoded_code_units)
           {
               { T::encode(state, out, c, encoded_code_units) }
                   -> encode_status;
           };
}
```
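To show what an `encode` style function does in practice, here is a standalone sketch of UTF-8 encoding to an output iterator. The names `sketch_encode_status` and `encode_utf8` are hypothetical; the function takes a bare code point and omits the encoding state parameter, so it only loosely follows the signature required above.

```C++
#include <iterator>
#include <string>

// Hypothetical status type, loosely modelled on encode_status.
enum class sketch_encode_status { no_error, invalid_character };

// Hypothetical free function: writes the UTF-8 code units for cp to out and
// reports how many code units were written.
template<typename OutputIt>
sketch_encode_status encode_utf8(OutputIt &out, char32_t cp,
                                 int &encoded_code_units) {
    encoded_code_units = 0;
    // Surrogates and values above U+10FFFF are not Unicode scalar values.
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return sketch_encode_status::invalid_character;
    auto put = [&](char32_t unit) {
        *out++ = static_cast<char>(static_cast<unsigned char>(unit));
        ++encoded_code_units;
    };
    if (cp < 0x80) {                       // one code unit
        put(cp);
    } else if (cp < 0x800) {               // two code units
        put(0xC0 | (cp >> 6));
        put(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // three code units
        put(0xE0 | (cp >> 12));
        put(0x80 | ((cp >> 6) & 0x3F));
        put(0x80 | (cp & 0x3F));
    } else {                               // four code units
        put(0xF0 | (cp >> 18));
        put(0x80 | ((cp >> 12) & 0x3F));
        put(0x80 | ((cp >> 6) & 0x3F));
        put(0x80 | (cp & 0x3F));
    }
    return sketch_encode_status::no_error;
}
```

Used with `std::back_inserter` on a `std::string`, encoding `0x00F8` appends the two code units `\xC3\xB8` and sets the count to 2. Returning a status rather than throwing leaves the choice of error handling to the caller, which is in the spirit of the error policy design described earlier.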

### Concept TextForwardDecoder
The `TextForwardDecoder` concept specifies requirements of types that are used
to decode [characters](#character) using a particular [code unit](#code-unit)
iterator that satisfies `ForwardIterator`. Such a type satisfies
`TextEncoding` and defines a static member function used to decode state
transitions and [characters](#character).

```C++
template<typename T, typename CUIT> concept bool TextForwardDecoder() {
    return TextEncoding<T>()
        && ranges::ForwardIterator<CUIT>()
        && ranges::ConvertibleTo<ranges::value_type_t<CUIT>,
                                 code_unit_type_t<T>>()
        && requires (
               typename T::state_type &state,
               CUIT &in_next,
               CUIT in_end,
               character_type_t<T> &c,
               int &decoded_code_units)
           {
               { T::decode(state, in_next, in_end, c, decoded_code_units) }
                   -> decode_status;
           };
}
```

### Concept TextBidirectionalDecoder
The `TextBidirectionalDecoder` concept specifies requirements of types that
are used to decode [characters](#character) using a particular
[code unit](#code-unit) iterator that satisfies `BidirectionalIterator`. Such
a type satisfies `TextForwardDecoder` and defines a static member function
used to decode state transitions and [characters](#character) in the reverse
order of their encoding.

```C++
template<typename T, typename CUIT> concept bool TextBidirectionalDecoder() {
    return TextForwardDecoder<T, CUIT>()
        && ranges::BidirectionalIterator<CUIT>()
        && requires (
               typename T::state_type &state,
               CUIT &in_next,
               CUIT in_end,
               character_type_t<T> &c,
               int &decoded_code_units)
           {
               { T::rdecode(state, in_next, in_end, c, decoded_code_units) }
                   -> decode_status;
           };
}
```

### Concept TextRandomAccessDecoder
The `TextRandomAccessDecoder` concept specifies requirements of types that
are used to decode [characters](#character) using a particular
[code unit](#code-unit) iterator that satisfies `RandomAccessIterator`. Such a
type satisfies `TextBidirectionalDecoder`, requires that the minimum and
maximum number of [code units](#code-unit) used to encode any character have
the same value, and that the encoding state be an empty type.

```C++
template<typename T, typename CUIT> concept bool TextRandomAccessDecoder() {
    return TextBidirectionalDecoder<T, CUIT>()
        && ranges::RandomAccessIterator<CUIT>()
        && T::min_code_units == T::max_code_units
        && std::is_empty<typename T::state_type>::value;
}
```
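The two extra conditions are what make constant-time positioning possible: with a fixed width and no state, the location of the n-th character is plain arithmetic. The sketch below illustrates this with hypothetical encoding descriptions (`utf32_like`, `utf8_like`) that carry only the members needed here; none of these names come from [Text_view].

```C++
#include <cstddef>
#include <type_traits>

// Hypothetical encoding descriptions with only the members needed here.
struct stateless {};
struct utf32_like {
    using state_type = stateless;
    static constexpr int min_code_units = 1;
    static constexpr int max_code_units = 1;
};
struct utf8_like {
    using state_type = stateless;
    static constexpr int min_code_units = 1;
    static constexpr int max_code_units = 4;
};

// True when every character uses the same number of code units and the
// encoding carries no state.
template<typename E>
struct is_fixed_width_stateless
    : std::integral_constant<bool,
          E::min_code_units == E::max_code_units
          && std::is_empty<typename E::state_type>::value>
{};

// Code unit offset of the n-th character; only meaningful for fixed-width,
// stateless encodings, which the static_assert enforces.
template<typename E>
constexpr std::size_t code_unit_offset(std::size_t n) {
    static_assert(is_fixed_width_stateless<E>::value,
                  "random access requires a fixed-width, stateless encoding");
    return n * static_cast<std::size_t>(E::max_code_units);
}

static_assert(is_fixed_width_stateless<utf32_like>::value, "fixed width");
static_assert(!is_fixed_width_stateless<utf8_like>::value, "variable width");
```

For a variable-width encoding such as the UTF-8-like description, finding the n-th character requires decoding every preceding character, so only forward or bidirectional traversal is practical.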

### Concept TextIterator
The `TextIterator` concept specifies requirements of iterator types that are
used to encode and decode [characters](#character) as an [encoded](#encoding)
sequence of [code units](#code-unit). [Encoding](#encoding) state and error
indication is held in each iterator instance and is made accessible via
non-static member functions.

```C++
template<typename T> concept bool TextIterator() {
    return ranges::Iterator<T>()
        && TextEncoding<encoding_type_t<T>>()
        && TextErrorPolicy<typename T::error_policy>()
        && TextEncodingState<typename T::state_type>()
        && requires (const T ct) {
               { ct.state() } noexcept
                   -> const typename encoding_type_t<T>::state_type&;
               { ct.error_occurred() } noexcept
                   -> bool;
           };
}
```

### Concept TextSentinel
The `TextSentinel` concept specifies requirements of types that are used to
mark the end of a range of encoded [characters](#character). A type T that
satisfies `TextIterator` also satisfies `TextSentinel`, thereby enabling
`TextIterator` types to be used as sentinels.

```C++
template<typename T, typename I> concept bool TextSentinel() {
    return ranges::Sentinel<T, I>()
        && TextIterator<I>()
        && TextErrorPolicy<typename T::error_policy>();
}
```

### Concept TextOutputIterator
The `TextOutputIterator` concept refines `TextIterator` with a requirement that
the type also satisfy `ranges::OutputIterator` for the character type of the
associated encoding and that a member function be provided for retrieving error
information.

```C++
template<typename T> concept bool TextOutputIterator() {
  return TextIterator<T>()
      && ranges::OutputIterator<T, character_type_t<encoding_type_t<T>>>()
      && requires (const T ct) {
           { ct.get_error() } noexcept
               -> encode_status;
         };
}
```

### Concept TextInputIterator
The `TextInputIterator` concept refines `TextIterator` with requirements that
the type also satisfy `ranges::InputIterator`, that the iterator value type
satisfy `Character`, and that a member function be provided for retrieving error
information.

```C++
template<typename T> concept bool TextInputIterator() {
  return TextIterator<T>()
      && ranges::InputIterator<T>()
      && Character<ranges::value_type_t<T>>()
      && requires (const T ct) {
           { ct.get_error() } noexcept
               -> decode_status;
         };
}
```

### Concept TextForwardIterator
The `TextForwardIterator` concept refines `TextInputIterator` with a requirement
that the type also satisfy `ranges::ForwardIterator`.

```C++
template<typename T> concept bool TextForwardIterator() {
  return TextInputIterator<T>()
      && ranges::ForwardIterator<T>();
}
```

### Concept TextBidirectionalIterator
The `TextBidirectionalIterator` concept refines `TextForwardIterator` with a
requirement that the type also satisfy `ranges::BidirectionalIterator`.

```C++
template<typename T> concept bool TextBidirectionalIterator() {
  return TextForwardIterator<T>()
      && ranges::BidirectionalIterator<T>();
}
```

### Concept TextRandomAccessIterator
The `TextRandomAccessIterator` concept refines `TextBidirectionalIterator` with
a requirement that the type also satisfy `ranges::RandomAccessIterator`.

```C++
template<typename T> concept bool TextRandomAccessIterator() {
  return TextBidirectionalIterator<T>()
      && ranges::RandomAccessIterator<T>();
}
```

### Concept TextView
The `TextView` concept specifies requirements of types that provide view access
to an underlying [code unit](#code-unit) range. Such types satisfy
`ranges::View`, provide iterators that satisfy `TextIterator`, and define member
types that identify the [encoding](#encoding), encoding state, and underlying
[code unit](#code-unit) range and iterator types. Non-static member functions
are provided to access the underlying [code unit](#code-unit) range and initial
[encoding](#encoding) state.

Types that satisfy `TextView` do not own the underlying [code unit](#code-unit)
range and are copyable in constant time. The lifetime of the underlying range
must exceed the lifetime of referencing `TextView` objects.

```C++
template concept bool TextView() {
return ranges::View()
&& TextIterator>()
&& TextEncoding>()
&& ranges::View()
&& TextErrorPolicy()
&& TextEncodingState()
&& CodeUnitIterator>()
&& requires (T t, const T ct) {
{ ct.base() } noexcept
-> const typename T::view_type&;
{ ct.initial_state() } noexcept
-> const typename T::state_type&;
};
}
```

### Concept TextInputView
The `TextInputView` concept refines `TextView` with a requirement that the
view's iterator type also satisfy `TextInputIterator`.

```C++
template<typename T> concept bool TextInputView() {
  return TextView<T>()
      && TextInputIterator<ranges::iterator_t<T>>();
}
```

### Concept TextForwardView
The `TextForwardView` concept refines `TextInputView` with a requirement that
the view's iterator type also satisfy `TextForwardIterator`.

```C++
template<typename T> concept bool TextForwardView() {
  return TextInputView<T>()
      && TextForwardIterator<ranges::iterator_t<T>>();
}
```

### Concept TextBidirectionalView
The `TextBidirectionalView` concept refines `TextForwardView` with a requirement
that the view's iterator type also satisfy `TextBidirectionalIterator`.

```C++
template<typename T> concept bool TextBidirectionalView() {
  return TextForwardView<T>()
      && TextBidirectionalIterator<ranges::iterator_t<T>>();
}
```

### Concept TextRandomAccessView
The `TextRandomAccessView` concept refines `TextBidirectionalView` with a
requirement that the view's iterator type also satisfy
`TextRandomAccessIterator`.

```C++
template<typename T> concept bool TextRandomAccessView() {
  return TextBidirectionalView<T>()
      && TextRandomAccessIterator<ranges::iterator_t<T>>();
}
```

## Error Policies

- [Class text_error_policy](#class-text_error_policy)
- [Class text_strict_error_policy](#class-text_strict_error_policy)
- [Class text_permissive_error_policy](#class-text_permissive_error_policy)
- [Alias text_default_error_policy](#alias-text_default_error_policy)

### Class text_error_policy

Class `text_error_policy` is a base class from which all text error policy
classes must derive.

```C++
class text_error_policy {};
```

### Class text_strict_error_policy

The `text_strict_error_policy` class is a policy class that specifies that
exceptions be thrown for errors that occur during encoding and decoding
operations initiated through text iterators. This class satisfies
`TextErrorPolicy`.

```C++
class text_strict_error_policy : public text_error_policy {};
```

### Class text_permissive_error_policy

The `text_permissive_error_policy` class is a policy class that specifies
that substitution characters such as the Unicode replacement character
`U+FFFD` be substituted in place of errors that occur during encoding and
decoding operations initiated through text iterators. This class satisfies
`TextErrorPolicy`.

```C++
class text_permissive_error_policy : public text_error_policy {};
```

### Alias text_default_error_policy

The `text_default_error_policy` alias specifies the default text error policy.
Conforming implementations must alias this to `text_strict_error_policy`, but
may have options to select an alternative default policy for environments that
do not support exceptions. The referred class shall satisfy `TextErrorPolicy`.

```C++
using text_default_error_policy = text_strict_error_policy;
```
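
Because the policy classes are empty tag types, code that handles errors can select its behavior by overloading on the policy. The following is a minimal sketch of that idea, not the library's implementation; `resolve_decode_failure` and `decoded_or_fallback` are hypothetical helpers, and the policy classes are redeclared only to keep the example self-contained.

```cpp
#include <stdexcept>
#include <type_traits>

// Redeclared here so the sketch compiles on its own.
class text_error_policy {};
class text_strict_error_policy : public text_error_policy {};
class text_permissive_error_policy : public text_error_policy {};

// Hypothetical helper: strict policy reports the failure by throwing.
inline char32_t resolve_decode_failure(text_strict_error_policy) {
    throw std::runtime_error("invalid code unit sequence");
}

// Hypothetical helper: permissive policy substitutes U+FFFD.
inline char32_t resolve_decode_failure(text_permissive_error_policy) noexcept {
    return U'\uFFFD';
}

// Returns the decoded code point, or lets the policy decide what a
// failed decode turns into.
template<typename Policy>
char32_t decoded_or_fallback(bool decode_ok, char32_t decoded) {
    static_assert(std::is_base_of<text_error_policy, Policy>::value,
                  "policy classes must derive from text_error_policy");
    return decode_ok ? decoded : resolve_decode_failure(Policy{});
}
```

Overload resolution happens at compile time, so choosing a policy adds no run-time branching beyond the success check itself.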

## Error Status

- [Enum encode_status](#enum-encode_status)
- [Enum decode_status](#enum-decode_status)
- [status_ok](#status_ok)
- [error_occurred](#error_occurred)
- [status_message](#status_message)

### Enum encode_status

The `encode_status` enumeration type defines enumerators used to report errors
that occur during text encoding operations.

The `no_error` enumerator indicates that no error has occurred.

The `invalid_character` enumerator indicates that an attempt was made to encode
a character that was not valid for the encoding.

The `invalid_state_transition` enumerator indicates that an attempt was made to
encode a state transition that was not valid for the encoding.

```C++
enum class encode_status : int {
no_error = /* implementation-defined */,
invalid_character = /* implementation-defined */,
invalid_state_transition = /* implementation-defined */
};
```

### Enum decode_status

The `decode_status` enumeration type defines enumerators used to report errors
that occur during text decoding operations.

The `no_error` enumerator indicates that no error has occurred.

The `no_character` enumerator indicates that no error has occurred, but that no
character was decoded for a code unit sequence. This typically indicates that
the code unit sequence represents an encoding state transition such as for an
escape sequence or byte order mark.

The `invalid_code_unit_sequence` enumerator indicates that an attempt was made
to decode an invalid code unit sequence.

The `underflow` enumerator indicates that the end of the input range was
encountered before a complete code unit sequence was decoded.

```C++
enum class decode_status : int {
no_error = /* implementation-defined */,
no_character = /* implementation-defined */,
invalid_code_unit_sequence = /* implementation-defined */,
underflow = /* implementation-defined */
};
```

### status_ok

The `status_ok` function returns `true` if the `encode_status` argument value
is `encode_status::no_error` or if the `decode_status` argument is either of
`decode_status::no_error` or `decode_status::no_character`. `false` is
returned for all other values.

```C++
constexpr inline bool status_ok(encode_status es) noexcept;
constexpr inline bool status_ok(decode_status ds) noexcept;
```

### error_occurred

The `error_occurred` function returns `false` if the `encode_status` argument
value is `encode_status::no_error` or if the `decode_status` argument is either
of `decode_status::no_error` or `decode_status::no_character`. `true` is
returned for all other values.

```C++
constexpr inline bool error_occurred(encode_status es) noexcept;
constexpr inline bool error_occurred(decode_status ds) noexcept;
```
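
The relationship between `status_ok` and `error_occurred` can be sketched as follows. The enumerator values are implementation-defined, so the values below are placeholders chosen for illustration, and the function bodies are one straightforward reading of the descriptions above rather than the library's source.

```cpp
// Placeholder values; the real values are implementation-defined.
enum class encode_status : int {
    no_error = 0,
    invalid_character = 1,
    invalid_state_transition = 2
};

enum class decode_status : int {
    no_error = 0,
    no_character = 1,
    invalid_code_unit_sequence = 2,
    underflow = 3
};

constexpr bool status_ok(encode_status es) noexcept {
    return es == encode_status::no_error;
}

// no_character is not an error: a state transition was consumed
// without producing a character.
constexpr bool status_ok(decode_status ds) noexcept {
    return ds == decode_status::no_error
        || ds == decode_status::no_character;
}

// error_occurred is the logical complement of status_ok.
constexpr bool error_occurred(encode_status es) noexcept {
    return !status_ok(es);
}

constexpr bool error_occurred(decode_status ds) noexcept {
    return !status_ok(ds);
}
```

Callers that need to distinguish "decoded a character" from "consumed a state transition" still have to compare against `decode_status::no_character` explicitly, since both count as success.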

### status_message

The `status_message` function returns a pointer to a statically allocated
string containing a short description of the value of the `encode_status` or
`decode_status` argument.

```C++
const char* status_message(encode_status es) noexcept;
const char* status_message(decode_status ds) noexcept;
```

## Exceptions

- [Class text_error](#class-text_error)
- [Class text_encode_error](#class-text_encode_error)
- [Class text_decode_error](#class-text_decode_error)

### Class text_error

The `text_error` class defines the base class for the types of objects
thrown as exceptions to report errors detected during text processing.

```C++
class text_error : public std::runtime_error
{
public:
using std::runtime_error::runtime_error;
};
```

### Class text_encode_error

The `text_encode_error` class defines the types of objects thrown as exceptions
to report errors detected during encoding of a [character](#character). Objects
of this type are generally thrown in response to an attempt to encode a
[character](#character) with an invalid [code point](#code-point) value, or to
encode an invalid state transition.

```C++
class text_encode_error : public text_error
{
public:
explicit text_encode_error(encode_status es) noexcept;

const encode_status& status_code() const noexcept;

private:
encode_status es; // exposition only
};
```

### Class text_decode_error

The `text_decode_error` class defines the types of objects thrown as exceptions
to report errors detected during decoding of a [code unit](#code-unit) sequence.
Objects of this type are generally thrown in response to an attempt to decode
an ill-formed [code unit](#code-unit) sequence, a [code unit](#code-unit)
sequence that specifies an invalid [code point](#code-point) value, or a
[code unit](#code-unit) sequence that specifies an invalid state transition.

```C++
class text_decode_error : public text_error
{
public:
explicit text_decode_error(decode_status ds) noexcept;

const decode_status& status_code() const noexcept;

private:
decode_status ds; // exposition only
};
```
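
The exception hierarchy lets callers catch `text_error` generically and still recover the specific status code. The following is a minimal, self-contained sketch; the enumerator values and message strings are placeholders, `is_underflow` is a hypothetical helper, and the constructor is not marked `noexcept` here because the `std::runtime_error` constructor may allocate.

```cpp
#include <stdexcept>

// Placeholder values; the real values are implementation-defined.
enum class decode_status : int {
    no_error, no_character, invalid_code_unit_sequence, underflow
};

// Placeholder messages; the real strings are not specified above.
inline const char* status_message(decode_status ds) noexcept {
    switch (ds) {
        case decode_status::no_error: return "no error";
        case decode_status::no_character: return "no character decoded";
        case decode_status::invalid_code_unit_sequence:
            return "invalid code unit sequence";
        case decode_status::underflow: return "underflow";
    }
    return "unknown status";
}

class text_error : public std::runtime_error {
public:
    using std::runtime_error::runtime_error;
};

class text_decode_error : public text_error {
public:
    explicit text_decode_error(decode_status ds)
        : text_error(status_message(ds)), ds(ds) {}

    const decode_status& status_code() const noexcept { return ds; }

private:
    decode_status ds;
};

// Hypothetical helper: inspects a generically caught text_error.
inline bool is_underflow(const text_error& e) noexcept {
    const auto* de = dynamic_cast<const text_decode_error*>(&e);
    return de != nullptr && de->status_code() == decode_status::underflow;
}
```

A handler written as `catch (const text_error& e)` can pass `e` to a helper like `is_underflow` to decide, for example, whether to wait for more input before retrying.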

## Type traits

- [code_unit_type_t](#code_unit_type_t)
- [code_point_type_t](#code_point_type_t)
- [character_set_type_t](#character_set_type_t)
- [character_type_t](#character_type_t)
- [encoding_type_t](#encoding_type_t)
- [default_encoding_type_t](#default_encoding_type_t)

### code_unit_type_t

The `code_unit_type_t` type alias template provides convenient means for
selecting the associated [code unit](#code-unit) type of some other type,
such as an [encoding](#encoding) type that satisfies `TextEncoding`. The
aliased type is the same as `typename T::code_unit_type`.

```C++
template<typename T>
using code_unit_type_t = /* implementation-defined */ ;
```

### code_point_type_t

The `code_point_type_t` type alias template provides convenient means for
selecting the associated [code point](#code-point) type of some other type,
such as a type that satisfies `CharacterSet` or `Character`. The aliased
type is the same as `typename T::code_point_type`.

```C++
template<typename T>
using code_point_type_t = /* implementation-defined */ ;
```

### character_set_type_t

The `character_set_type_t` type alias template provides convenient means for
selecting the associated [character set](#character-set) type of some other
type, such as a type that satisfies `Character`. The aliased type is the same
as `typename T::character_set_type`.

```C++
template<typename T>
using character_set_type_t = /* implementation-defined */ ;
```

### character_type_t

The `character_type_t` type alias template provides convenient means for
selecting the associated [character](#character) type of some other type, such
as a type that satisfies `TextEncoding`. The aliased type is the same as
`typename T::character_type`.

```C++
template<typename T>
using character_type_t = /* implementation-defined */ ;
```

### encoding_type_t

The `encoding_type_t` type alias template provides convenient means for
selecting the associated [encoding](#encoding) type of some other type, such
as a type that satisfies `TextIterator` or `TextView`. The aliased type is the
same as `typename T::encoding_type`.

```C++
template<typename T>
using encoding_type_t = /* implementation-defined */ ;
```

### default_encoding_type_t

The `default_encoding_type_t` type alias template resolves to the default
[encoding](#encoding) type, if any, for a given type, such as a type that
satisfies `CodeUnit`. Specializations are provided for the following
cv-unqualified and reference removed fundamental types. Otherwise, the alias
will attempt to resolve against a `default_encoding_type` member type.

When `std::remove_cv_t<std::remove_reference_t<T>>` is ... | the default encoding is ...
---------------------------------------------------------- | ---------------------------
`char` | `execution_character_encoding`
`wchar_t` | `execution_wide_character_encoding`
`char16_t` | `char16_character_encoding`
`char32_t` | `char32_character_encoding`

```C++
template<typename T>
using default_encoding_type_t = /* implementation-defined */ ;
```
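
One way to obtain the behavior described above is a class template with explicit specializations for the fundamental types and a primary template that falls back to the `default_encoding_type` member. This is a sketch under that assumption, not the library's implementation; the encoding classes are empty stand-ins, and `detail::default_encoding` and `my_code_unit` are hypothetical names.

```cpp
#include <type_traits>

// Empty stand-ins for the encoding classes named in the table above.
struct execution_character_encoding {};
struct execution_wide_character_encoding {};
struct char16_character_encoding {};
struct char32_character_encoding {};

namespace detail {

// Primary template: defer to a nested default_encoding_type member.
template<typename T>
struct default_encoding {
    using type = typename T::default_encoding_type;
};

template<> struct default_encoding<char> {
    using type = execution_character_encoding;
};
template<> struct default_encoding<wchar_t> {
    using type = execution_wide_character_encoding;
};
template<> struct default_encoding<char16_t> {
    using type = char16_character_encoding;
};
template<> struct default_encoding<char32_t> {
    using type = char32_character_encoding;
};

} // namespace detail

// cv-qualifiers and references are removed before the lookup.
template<typename T>
using default_encoding_type_t = typename detail::default_encoding<
    std::remove_cv_t<std::remove_reference_t<T>>>::type;

// Hypothetical user-defined code unit type that names its own default.
struct my_code_unit {
    using default_encoding_type = char32_character_encoding;
};
```

With this arrangement, `default_encoding_type_t<const char&>` and `default_encoding_type_t<char>` name the same type, and a type with no specialization and no member simply fails to resolve.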

## Character sets

- [Class any_character_set](#class-any_character_set)
- [Class basic_execution_character_set](#class-basic_execution_character_set)
- [Class basic_execution_wide_character_set](#class-basic_execution_wide_character_set)
- [Class unicode_character_set](#class-unicode_character_set)
- [Character set type aliases](#character-set-type-aliases)

### Class any_character_set

The `any_character_set` class provides a generic [character set](#character-set)
type used when a specific [character set](#character-set) type is unknown or
when the ability to switch between specific [character sets](#character-set)
is required. This class satisfies the `CharacterSet` concept and has an
implementation defined `code_point_type` that is able to represent
[code point](#code-point) values from all of the implementation provided
[character set](#character-set) types. The code point returned by
`get_substitution_code_point` is implementation defined.

```C++
class any_character_set {
public:
using code_point_type = /* implementation-defined */;

static const char* get_name() noexcept {
return "any_character_set";
}

static constexpr code_point_type get_substitution_code_point() noexcept;
};
```

### Class basic_execution_character_set

The `basic_execution_character_set` class represents the basic execution
character set specified in `[lex.charset]p3` of the [C++11][ISO/IEC 14882:2011]
standard. This class satisfies the `CharacterSet` concept and has a
`code_point_type` member type that aliases `char`. The code point returned by
`get_substitution_code_point` is the code point for the `'?'` character.

```C++
class basic_execution_character_set {
public:
using code_point_type = char;

static const char* get_name() noexcept {
return "basic_execution_character_set";
}

static constexpr code_point_type get_substitution_code_point() noexcept;
};
```

### Class basic_execution_wide_character_set

The `basic_execution_wide_character_set` class represents the basic execution
wide character set specified in `[lex.charset]p3` of the
[C++11][ISO/IEC 14882:2011] standard. This class satisfies the `CharacterSet`
concept and has a `code_point_type` member type that aliases `wchar_t`. The
code point returned by `get_substitution_code_point` is the code point for the
`L'?'` character.

```C++
class basic_execution_wide_character_set {
public:
using code_point_type = wchar_t;

static const char* get_name() noexcept {
return "basic_execution_wide_character_set";
}

static constexpr code_point_type get_substitution_code_point() noexcept;
};
```

### Class unicode_character_set

The `unicode_character_set` class represents the [Unicode]
[character sets](#character-set). This class satisfies the `CharacterSet`
concept and has a `code_point_type` member type that aliases `char32_t`.
The code point returned by `get_substitution_code_point` is the `U+FFFD`
Unicode replacement character.

```C++
class unicode_character_set {
public:
using code_point_type = char32_t;

static const char* get_name() noexcept {
return "unicode_character_set";
}

static constexpr code_point_type get_substitution_code_point() noexcept;
};
```

### Character set type aliases

The `execution_character_set`,
`execution_wide_character_set`, and
`universal_character_set` type aliases reflect the implementation
defined execution, wide execution, and universal
[character sets](#character-set) specified in `[lex.charset]p2-3` of the C++
standard.

The [character set](#character-set) aliased by `execution_character_set` must be
a superset of the `basic_execution_character_set`
[character set](#character-set). This alias refers to the
[character set](#character-set) that the compiler assumes during
translation; the [character set](#character-set) that the compiler uses when
translating [characters](#character) specified by universal-character-name
designators in ordinary string literals, not the locale sensitive run-time
execution [character set](#character-set).

The [character set](#character-set) aliased by `execution_wide_character_set`
must be a superset of the `basic_execution_wide_character_set`
[character set](#character-set). This alias refers to the
[character set](#character-set) that the compiler assumes during
translation; the [character set](#character-set) that the compiler uses when
translating [characters](#character) specified by universal-character-name
designators in wide string literals, not the locale sensitive run-time
execution wide [character set](#character-set).

The [character set](#character-set) aliased by `universal_character_set` must
be a superset of the `unicode_character_set` [character set](#character-set).

```C++
using execution_character_set = /* implementation-defined */ ;
using execution_wide_character_set = /* implementation-defined */ ;
using universal_character_set = /* implementation-defined */ ;
```

## Character set identification

- [Class character_set_id](#class-character_set_id)
- [get_character_set_id](#get_character_set_id)

### Class character_set_id

The `character_set_id` class provides unique, opaque values used to identify
[character sets](#character-set) at run-time. Values of this type are produced
by `get_character_set_id()` and can be passed to `get_character_set_info()` to
obtain [character set](#character-set) information. Values of this type are
copy constructible, copy assignable, equality comparable, and strictly totally
ordered.

```C++
class character_set_id {
public:
character_set_id() = delete;

friend bool operator==(character_set_id lhs, character_set_id rhs) noexcept;
friend bool operator!=(character_set_id lhs, character_set_id rhs) noexcept;

friend bool operator<(character_set_id lhs, character_set_id rhs) noexcept;
friend bool operator>(character_set_id lhs, character_set_id rhs) noexcept;
friend bool operator<=(character_set_id lhs, character_set_id rhs) noexcept;
friend bool operator>=(character_set_id lhs, character_set_id rhs) noexcept;
};
```

### get_character_set_id

`get_character_set_id()` returns a unique, opaque value for the
[character set](#character-set) type specified by the template parameter.

```C++
template<CharacterSet CST>
inline character_set_id get_character_set_id();
```
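
How the unique values are produced is not specified here. One common technique for generating a distinct, ordered, opaque value per type is to use the address of a function-local static object in a function template. The sketch below assumes that technique purely for illustration; the character set classes are empty stand-ins and the unconstrained `typename CST` parameter replaces the `CharacterSet` constraint so that the code builds without Concepts TS.

```cpp
#include <functional>

// Empty stand-ins for two character set types.
struct basic_execution_character_set {};
struct unicode_character_set {};

class character_set_id {
public:
    character_set_id() = delete;

    friend bool operator==(character_set_id lhs, character_set_id rhs) noexcept {
        return lhs.tag == rhs.tag;
    }
    friend bool operator!=(character_set_id lhs, character_set_id rhs) noexcept {
        return !(lhs == rhs);
    }
    // std::less gives a total order even for unrelated pointers.
    friend bool operator<(character_set_id lhs, character_set_id rhs) noexcept {
        return std::less<const void*>()(lhs.tag, rhs.tag);
    }

private:
    explicit character_set_id(const void* tag) noexcept : tag(tag) {}

    template<typename CST>
    friend character_set_id get_character_set_id();

    const void* tag; // assumed representation, for illustration only
};

template<typename CST>
inline character_set_id get_character_set_id() {
    static char tag; // one distinct object per CST instantiation
    return character_set_id(&tag);
}
```

Repeated calls with the same character set type yield equal IDs, while different types yield IDs that compare unequal and are ordered one way or the other.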

## Character set information

- [Class character_set_info](#class-character_set_info)
- [get_character_set_info](#get_character_set_info)

### Class character_set_info

The `character_set_info` class stores information about a
[character set](#character-set). Values of this type are produced by the
`get_character_set_info()` functions based on a [character set](#character-set)
type or ID.

```C++
class character_set_info {
public:
character_set_info() = delete;

character_set_id get_id() const noexcept;

const char* get_name() const noexcept;

private:
character_set_id id; // exposition only
};
```

### get_character_set_info

The `get_character_set_info()` functions return a reference to a
`character_set_info` object based on a [character set](#character-set) type or
ID.

```C++
const character_set_info& get_character_set_info(character_set_id id);

template<CharacterSet CST>
inline const character_set_info& get_character_set_info();
```

## Characters

- [Class template character](#class-template-character)

### Class template character

Objects of `character` class template specialization type define a
[character](#character) via the association of a [code point](#code-point)
value and a [character set](#character-set). The specialization provided for
the `any_character_set` type is used to maintain a dynamic
[character set](#character-set) association while specializations for other
[character sets](#character-set) specify a static association. These types
satisfy the `Character` concept and are default constructible, copy
constructible, copy assignable, and equality comparable. Member functions
provide access to the [code point](#code-point) and
[character set](#character-set) ID values for the represented
[character](#character). Default constructed objects represent a null
[character](#character) using a zero initialized [code point](#code-point)
value.

Objects with different [character set](#character-set) type are not equality
comparable with the exception that objects with a static
[character set](#character-set) type of `any_character_set` are comparable with
objects with any static [character set](#character-set) type. In this case,
objects compare equal if and only if their [character set](#character-set)
ID and [code point](#code-point) values match. Equality comparison between
objects with different static [character set](#character-set) type is not
implemented to avoid potentially costly unintended implicit transcoding between
[character sets](#character-set).

```C++
template<CharacterSet CST>
class character {
public:
using character_set_type = CST;
using code_point_type = code_point_type_t<character_set_type>;

character() = default;
explicit character(code_point_type code_point) noexcept;

friend bool operator==(const character &lhs,
const character &rhs) noexcept;
friend bool operator!=(const character &lhs,
const character &rhs) noexcept;

void set_code_point(code_point_type code_point) noexcept;
code_point_type get_code_point() const noexcept;

static character_set_id get_character_set_id();

private:
code_point_type code_point; // exposition only
};

template<>
class character<any_character_set> {
public:
using character_set_type = any_character_set;
using code_point_type = code_point_type_t<character_set_type>;

character() = default;
explicit character(code_point_type code_point) noexcept;
character(character_set_id cs_id, code_point_type code_point) noexcept;

friend bool operator==(const character &lhs,
const character &rhs) noexcept;
friend bool operator!=(const character &lhs,
const character &rhs) noexcept;

void set_code_point(code_point_type code_point) noexcept;
code_point_type get_code_point() const noexcept;

void set_character_set_id(character_set_id new_cs_id) noexcept;
character_set_id get_character_set_id() const noexcept;

private:
character_set_id cs_id; // exposition only
code_point_type code_point; // exposition only
};

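template<CharacterSet CST>
bool operator==(const character<any_character_set> &lhs,
                const character<CST> &rhs);
template<CharacterSet CST>
bool operator==(const character<CST> &lhs,
                const character<any_character_set> &rhs);
template<CharacterSet CST>
bool operator!=(const character<any_character_set> &lhs,
                const character<CST> &rhs);
template<CharacterSet CST>
bool operator!=(const character<CST> &lhs,
                const character<any_character_set> &rhs);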
```
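
The difference between the static and dynamic character set associations, and the mixed comparison rule described above, can be sketched with simplified stand-ins. Everything below is an assumption made for illustration: `character_set_id` is reduced to a pointer, `id_of` is a hypothetical helper, and only one direction of the mixed `operator==` is shown.

```cpp
// Simplified stand-ins; not the library's definitions.
struct unicode_character_set { using code_point_type = char32_t; };
struct any_character_set     { using code_point_type = char32_t; };

using character_set_id = const void*; // stand-in for class character_set_id

template<typename CST>
character_set_id id_of() noexcept {
    static char tag; // one distinct object per CST
    return &tag;
}

// Static association: the character set is part of the type.
template<typename CST>
class character {
public:
    using code_point_type = typename CST::code_point_type;

    explicit character(code_point_type cp) noexcept : code_point(cp) {}

    code_point_type get_code_point() const noexcept { return code_point; }
    static character_set_id get_character_set_id() noexcept {
        return id_of<CST>();
    }

private:
    code_point_type code_point;
};

// Dynamic association: the character set ID is stored in the object.
template<>
class character<any_character_set> {
public:
    using code_point_type = any_character_set::code_point_type;

    character(character_set_id id, code_point_type cp) noexcept
        : cs_id(id), code_point(cp) {}

    code_point_type get_code_point() const noexcept { return code_point; }
    character_set_id get_character_set_id() const noexcept { return cs_id; }

private:
    character_set_id cs_id;
    code_point_type code_point;
};

// Mixed comparison: equal only if both the character set ID and the
// code point value match. No transcoding is attempted.
template<typename CST>
bool operator==(const character<any_character_set>& lhs,
                const character<CST>& rhs) noexcept {
    return lhs.get_character_set_id() == rhs.get_character_set_id()
        && lhs.get_code_point() == static_cast<char32_t>(rhs.get_code_point());
}
```

Two objects holding the same code point value therefore compare unequal when the dynamically associated character set differs from the static one.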

## Encodings

- [Class trivial_encoding_state](#class-trivial_encoding_state)
- [Class trivial_encoding_state_transition](#class-trivial_encoding_state_transition)
- [Class basic_execution_character_encoding](#class-basic_execution_character_encoding)
- [Class basic_execution_wide_character_encoding](#class-basic_execution_wide_character_encoding)
- [Class iso_10646_wide_character_encoding](#class-iso_10646_wide_character_encoding)
- [Class utf8_encoding](#class-utf8_encoding)
- [Class utf8bom_encoding](#class-utf8bom_encoding)
- [Class utf16_encoding](#class-utf16_encoding)
- [Class utf16be_encoding](#class-utf16be_encoding)
- [Class utf16le_encoding](#class-utf16le_encoding)
- [Class utf16bom_encoding](#class-utf16bom_encoding)
- [Class utf32_encoding](#class-utf32_encoding)
- [Class utf32be_encoding](#class-utf32be_encoding)
- [Class utf32le_encoding](#class-utf32le_encoding)
- [Class utf32bom_encoding](#class-utf32bom_encoding)
- [Encoding type aliases](#encoding-type-aliases)

### Class trivial_encoding_state

The `trivial_encoding_state` class is an empty class used by stateless
[encodings](#encoding) to implement the parts of the generic
[encoding](#encoding) interfaces necessary to support stateful
[encodings](#encoding).

```C++
class trivial_encoding_state {};
```

### Class trivial_encoding_state_transition

The `trivial_encoding_state_transition` class is an empty class used by
stateless [encodings](#encoding) to implement the parts of the generic
[encoding](#encoding) interfaces necessary to support stateful
[encodings](#encoding) that support non-[code-point](#code-point)
encoding [code unit](#code-unit) sequences.

```C++
class trivial_encoding_state_transition {};
```

### Class basic_execution_character_encoding

The `basic_execution_character_encoding` class implements support for the
[encoding](#encoding) used for ordinary string literals limited to support
for the basic execution character set as defined in `[lex.charset]p3` of
the C++ standard.

This [encoding](#encoding) is trivial, stateless, fixed width, supports
random access decoding, and has a [code unit](#code-unit) of type `char`.

Errors that occur during encoding and decoding operations are reported via
the `encode_status` and `decode_status` return types. Exceptions are not
directly thrown, but may propagate from operations performed on the
dependent code unit iterator.

```C++
class basic_execution_character_encoding {
public:
using state_type = trivial_encoding_state;
using state_transition_type = trivial_encoding_state_transition;
using character_type = character<basic_execution_character_set>;
using code_unit_type = char;

static constexpr int min_code_units = 1;
static constexpr int max_code_units = 1;

static const state_type& initial_state() noexcept;

template<CodeUnitOutputIterator<code_unit_type> CUIT>
static encode_status encode_state_transition(state_type &state,
CUIT &out,
const state_transition_type &stt,
int &encoded_code_units)
noexcept(/* implementation defined */);

template<CodeUnitOutputIterator<code_unit_type> CUIT>
static encode_status encode(state_type &state,
CUIT &out,
character_type c,
int &encoded_code_units)
noexcept(/* implementation defined */);

template<CodeUnitIterator CUIT, typename CUST>
requires ranges::ForwardIterator<CUIT>()
&& ranges::Convertible<ranges::value_type_t<CUIT>, code_unit_type>()
&& ranges::Sentinel<CUST, CUIT>()
static decode_status decode(state_type &state,
CUIT &in_next,
CUST in_end,
character_type &c,
int &decoded_code_units)
noexcept(/* implementation defined */);

template<CodeUnitIterator CUIT, typename CUST>
requires ranges::ForwardIterator<CUIT>()
&& ranges::Convertible<ranges::value_type_t<CUIT>, code_unit_type>()
&& ranges::Sentinel<CUST, CUIT>()
static decode_status rdecode(state_type &state,
CUIT &in_next,
CUST in_end,
character_type &c,
int &decoded_code_units)
noexcept(/* implementation defined */);
};
```

### Class basic_execution_wide_character_encoding

The `basic_execution_wide_character_encoding` class implements support for the
[encoding](#encoding) used for wide string literals limited to support for the
basic execution wide-character set as defined in `[lex.charset]p3` of
the C++ standard.

This [encoding](#encoding) is trivial, stateless, fixed width, supports
random access decoding, and has a [code unit](#code-unit) of type `wchar_t`.

Errors that occur during encoding and decoding operations are reported via
the `encode_status` and `decode_status` return types. Exceptions are not
directly thrown, but may propagate from operations performed on the
dependent code unit iterator.

```C++
class basic_execution_wide_character_encoding {
public:
using state_type = trivial_encoding_state;
using state_transition_type = trivial_encoding_state_transition;
using character_type = character<basic_execution_wide_character_set>;
using code_unit_type = wchar_t;

static constexpr int min_code_units = 1;
static constexpr int max_code_units = 1;

static const state_type& initial_state() noexcept;

template