https://github.com/etrepum/pyutf8

Python extension for dealing with validation and cleanup of UTF-8 strings

pyutf8 provides tools for efficiently validating and cleaning up UTF-8
strings. The primary use case is "garbage in, UTF-8 out": arbitrary bytes go
in, and only valid UTF-8 comes out.

Usage::

    >>> from pyutf8 import valid_utf8_bytes
    >>> SNOWMAN = u'\N{SNOWMAN}'
    >>> SNOWMAN_BYTES = SNOWMAN.encode('utf-8')
    >>> valid_utf8_bytes(SNOWMAN) == SNOWMAN_BYTES
    True
    >>> valid_utf8_bytes(SNOWMAN_BYTES) == SNOWMAN_BYTES
    True
    >>> valid_utf8_bytes(b'\xff' + SNOWMAN_BYTES + b'\xff') == SNOWMAN_BYTES
    True
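
For comparison, the behaviour shown in the usage example can be approximated with the standard library alone. The sketch below is not pyutf8's implementation. ``strip_invalid_utf8`` is a hypothetical name, the code targets Python 3, and it assumes that invalid bytes are simply dropped, which is what the ``\xff`` case above suggests. How pyutf8 treats other malformed input, such as truncated sequences, may differ.

```python
def strip_invalid_utf8(data):
    """Return `data` as UTF-8 bytes with invalid sequences removed.

    Hypothetical helper: a stdlib-only approximation of the behaviour
    shown in the usage example, not pyutf8's actual code.
    """
    if isinstance(data, str):
        # Text input: encode it, skipping anything that cannot be
        # represented in UTF-8 (for example lone surrogates).
        return data.encode('utf-8', errors='ignore')
    # Bytes input: decode while ignoring malformed sequences, then
    # re-encode so the result is guaranteed to be valid UTF-8.
    return bytes(data).decode('utf-8', errors='ignore').encode('utf-8')


SNOWMAN = '\N{SNOWMAN}'
SNOWMAN_BYTES = SNOWMAN.encode('utf-8')

# Mirrors the three cases from the usage example.
assert strip_invalid_utf8(SNOWMAN) == SNOWMAN_BYTES
assert strip_invalid_utf8(SNOWMAN_BYTES) == SNOWMAN_BYTES
assert strip_invalid_utf8(b'\xff' + SNOWMAN_BYTES + b'\xff') == SNOWMAN_BYTES
```

The decode and re-encode round trip allocates an intermediate string, which is the kind of overhead a dedicated extension can avoid.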