UTF8 Validator (C edition)
- Host: GitHub
- URL: https://github.com/adamretter/utf8-validator-c
- Owner: adamretter
- License: apache-2.0
- Created: 2018-10-07T08:29:43.000Z (about 6 years ago)
- Default Branch: main
- Last Pushed: 2021-11-24T21:42:24.000Z (almost 3 years ago)
- Last Synced: 2024-10-04T13:15:37.170Z (about 1 month ago)
- Language: C
- Size: 11.7 KB
- Stars: 23
- Watchers: 6
- Forks: 8
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# UTF8 Validator (C edition)
This is a more basic but much faster version of [UTF8 Validator](https://github.com/digital-preservation/utf8-validator). The C edition uses the [fastvalidate-utf-8](https://github.com/lemire/fastvalidate-utf-8) library from [Daniel Lemire](https://github.com/lemire). The C edition only returns a pass or fail result. It does not report the position at which validation fails, nor does it allow validation to continue after the first error.
It is a UTF-8 validation tool intended for use from the command line. If you are looking for a C library to use with your own program, see [fastvalidate-utf-8](https://github.com/lemire/fastvalidate-utf-8).
Released under the [Apache 2.0 Licence](https://opensource.org/licenses/Apache-2.0).
[![CI](https://github.com/adamretter/utf8-validator-c/workflows/CI/badge.svg)](https://github.com/adamretter/utf8-validator-c/actions?query=workflow%3ACI)
## Use from the Command Line
First [build from the source code](#building-from-source-code). You can then run `utf8validate` (Linux/Mac/Unix). For example:
```
$ ./utf8validate
```

## Command Line Exit Codes
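A calling script can branch on the exit status of `utf8validate`. The following is a minimal sketch, not part of the project: `describe_exit_code` and `check_file` are hypothetical helper names, and `check_file` assumes that the file to validate is passed as the only argument, which this README does not spell out. The codes themselves are the ones listed in this section.

```shell
# Map a utf8validate exit code to a short description.
# The codes are taken from the list in this section.
describe_exit_code() {
    case "$1" in
        0) echo "valid UTF-8" ;;
        1) echo "invalid arguments" ;;
        2) echo "not valid UTF-8" ;;
        4) echo "IO error" ;;
        *) echo "unexpected exit code $1" ;;
    esac
}

# Hypothetical wrapper: ASSUMES the file path is the only argument
# accepted by ./utf8validate (not confirmed by the README).
check_file() {
    ./utf8validate "$1"
    code=$?
    echo "$1: $(describe_exit_code "$code")"
    return "$code"
}
```

Because `check_file` returns the validator's own exit status, it can be used directly in a condition such as `if check_file some-file.txt; then ...; fi`.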
* **0** Success
* **1** Invalid Arguments provided to the application
* **2** File was not UTF-8 Valid
* **4** IO Error, e.g. could not read file

## Building from Source Code
### Prerequisite
* C99-compatible compiler, e.g. modern GCC or Clang
* Make

### Steps
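The steps in this section amount to a clone followed by `make`. As a rough sketch, they can be wrapped in a small shell function; the name `build_utf8validate` is hypothetical, and the directory name `utf8-validator-c` is simply the default that `git clone` derives from the repository URL.

```shell
# Hypothetical helper: clone the repository and build it with Make.
# The body runs in a subshell, so the caller's working directory is unchanged.
build_utf8validate() (
    git clone https://github.com/adamretter/utf8-validator-c.git &&
    cd utf8-validator-c &&
    make
)
```

After `build_utf8validate` succeeds, the compiled binary should be at `utf8-validator-c/utf8validate`.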
* Git clone the repository from https://github.com/adamretter/utf8-validator-c.git
* Build using Make by running `make` in the cloned directory. You will then find a binary of the compiled application named `utf8validate`.

## Reference
- John Keiser, Daniel Lemire, [Validating UTF-8 In Less Than One Instruction Per Byte](https://arxiv.org/abs/2010.03090), Software: Practice & Experience (to appear)