https://github.com/elliotwutingfeng/rfc9839
Python library to check for problematic Unicode code points.
- Host: GitHub
- URL: https://github.com/elliotwutingfeng/rfc9839
- Owner: elliotwutingfeng
- License: MIT
- Created: 2025-09-03T22:25:46.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2026-04-07T20:56:33.000Z (2 months ago)
- Last Synced: 2026-04-07T22:31:55.186Z (2 months ago)
- Topics: assignables, cbor, control, hacktoberfest, json, noncharacters, parse, scalars, security, surrogates, unicode, xml, yaml
- Language: Python
- Homepage: https://pypi.org/project/rfc9839
- Size: 381 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
# RFC9839
[PyPI](https://pypi.org/project/rfc9839)
[Coverage](https://coveralls.io/github/elliotwutingfeng/rfc9839?branch=main)
[License](./LICENSE)
Python library to check for problematic Unicode code points.
A port of the [Go library of the same name](https://github.com/timbray/rfc9839), based on the Unicode code point subsets specified in [RFC 9839](https://www.rfc-editor.org/rfc/rfc9839.html).
## Usage
```python
from rfc9839 import unicode_scalar, xml_character, unicode_assignable

# U+FDDA is a noncharacter (block U+FDD0..U+FDEF), not an assigned character.
code_point = 0xFDDA

# Noncharacters are scalar values and fall inside the XML character ranges.
print(unicode_scalar.is_valid_code_point(code_point))  # True
print(xml_character.is_valid_code_point(code_point))  # True

# The assignable subset excludes noncharacters, for code points and strings alike.
print(unicode_assignable.is_valid_code_point(code_point))  # False
print(unicode_assignable.is_valid_string(chr(code_point)))  # False

# UTF-8 encoded bytes can be checked directly.
print(xml_character.is_valid_utf8(chr(code_point).encode("utf-8")))  # True
```
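
To see why U+FDDA gives those results, here is a minimal stand-alone sketch of the three subsets as inclusive code point ranges. The ranges are transcribed from my reading of RFC 9839, and the names `UNICODE_SCALAR`, `XML_CHARACTER`, `UNICODE_ASSIGNABLE`, and `in_subset` are hypothetical; this is not how the `rfc9839` package is implemented.

```python
# Illustrative sketch only; names below are hypothetical, not the package's API.
# Each subset is a list of inclusive (low, high) code point ranges.

# Everything except the surrogates U+D800..U+DFFF.
UNICODE_SCALAR = [(0x0, 0xD7FF), (0xE000, 0x10FFFF)]

# Tab, LF, CR, then everything from U+0020 up except surrogates,
# U+FFFE, and U+FFFF.
XML_CHARACTER = [
    (0x9, 0x9), (0xA, 0xA), (0xD, 0xD),
    (0x20, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF),
]

# Excludes surrogates, legacy controls (C0 other than tab/LF/CR, DEL, C1),
# and all noncharacters.
UNICODE_ASSIGNABLE = [
    (0x9, 0x9), (0xA, 0xA), (0xD, 0xD),
    (0x20, 0x7E), (0xA0, 0xD7FF),
    (0xE000, 0xFDCF), (0xFDF0, 0xFFFD),
    # Planes 1 to 16, each minus its last two code points (noncharacters).
    *[(plane << 16, (plane << 16) | 0xFFFD) for plane in range(1, 17)],
]


def in_subset(code_point, ranges):
    """Return True if code_point lies in any of the inclusive ranges."""
    return any(low <= code_point <= high for low, high in ranges)


print(in_subset(0xFDDA, UNICODE_SCALAR))      # True
print(in_subset(0xFDDA, XML_CHARACTER))       # True
print(in_subset(0xFDDA, UNICODE_ASSIGNABLE))  # False
```

U+FDDA falls in the gap between `0xFDCF` and `0xFDF0`, which is the only subset boundary that separates the three results.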