Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ratijas/colorize_pinyin
search for chinese pinyin and wrap it with HTML (python version).
- Host: GitHub
- URL: https://github.com/ratijas/colorize_pinyin
- Owner: ratijas
- Created: 2016-01-24T04:44:24.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2016-03-01T17:59:51.000Z (over 8 years ago)
- Last Synced: 2024-09-15T04:22:38.987Z (about 2 months ago)
- Language: Python
- Homepage:
- Size: 10.7 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.rst
README
colorize pinyin
===============

search for Chinese pinyin and wrap it with HTML.

this module provides some useful functions for working with Chinese pinyin,
"phonetic system for transcribing the Mandarin pronunciations of
Chinese characters into the Latin alphabet" (c) Wikipedia.

usage sample
============

>>> colorized_HTML_string_from_string('nǐ hǎo')
'nǐ hǎo'
>>> ranges_of_pinyin_in_string('nǐ hǎo')
[Range(location=0, length=2), Range(location=3, length=3)]  # == [(0, 2), (3, 3)]

requirements
============

``lxml`` is only required if you want to work with the DOM. it is not listed in the requirements because anyone who needs the DOM functions most likely already has ``lxml`` installed; everything else needs no additional third-party libraries.

functions designed to work with ``lxml`` are marked with ``[*]``.
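
since ``lxml`` is optional, calling code may want to check for it before using the ``[*]`` functions. a minimal sketch using only the standard library; ``has_lxml`` is a hypothetical helper name and is not part of this module:

```python
import importlib.util


def has_lxml():
    """Return True when the optional lxml package can be imported."""
    # find_spec returns None for a top-level package that is not installed.
    return importlib.util.find_spec("lxml") is not None


if has_lxml():
    print("lxml found: the [*] DOM helpers should be usable")
else:
    print("lxml missing: stick to the plain-string helpers")
```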
functions
=========

modify the given DOM by replacing child text nodes containing pinyin with a
wrapper element:
``colorize_DOM`` ``[*]``

undo colorize:
``uncolorize_DOM`` ``[*]``

detect and wrap pinyin with HTML in a plain text string:
``colorized_HTML_string_from_string``

do the same, but return a wrapper DOM element:
``colorized_HTML_element_from_string`` ``[*]``

search for pinyin in a string of text:
``ranges_of_pinyin_in_string``

find out which tone a pinyin word has:
``determine_tone``

remove tones (diacritics) from a pinyin string:
``lowercase_string_by_removing_pinyin_tones``
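
the last two functions deal with tone diacritics. one possible way to approach that, sketched here with the standard ``unicodedata`` module, is to decompose each character and look at its combining marks. this is not the module's own code: the names ``sketch_determine_tone`` and ``sketch_remove_tones`` are hypothetical, and returning ``0`` for a syllable without a tone mark is an assumption that the real ``determine_tone`` may not share:

```python
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}


def sketch_determine_tone(syllable):
    """Return 1-4 for a tone-marked syllable, 0 if no tone mark is found."""
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return 0  # assumption: 0 stands for the neutral tone


def sketch_remove_tones(text):
    """Lowercase the text and drop tone marks, keeping the dots of 'ü'."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    stripped = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # Recompose so that 'u' + diaeresis becomes a single 'ü' again.
    return unicodedata.normalize("NFC", stripped)


print(sketch_determine_tone("nǐ"))    # 3
print(sketch_remove_tones("Nǐ hǎo"))  # ni hao
```

only the four tone marks are removed, so the diaeresis in ``ü`` survives the round trip.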
constants
=========

``PINYIN_LIST`` -- specially sorted list of all possible pinyin words.

``PINYIN_LIST_BY_LEN`` -- same as ``PINYIN_LIST`` but grouped by length, descending.

``PINYIN_WRAPPER_CLASS`` -- default class used by ``[un]colorize_DOM``.
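
to illustrate what "grouped by length, descending" could look like, here is a small sketch. ``SAMPLE_SYLLABLES`` is a hypothetical stand-in for the full list, ``group_by_len_descending`` is not a function of this module, and the list-of-lists shape is an assumption about how ``PINYIN_LIST_BY_LEN`` is laid out:

```python
from itertools import groupby

# Hypothetical, tiny stand-in for the module's complete syllable list.
SAMPLE_SYLLABLES = ["ni", "hao", "zhong", "guo", "ma", "zhuang"]


def group_by_len_descending(words):
    """Group words by length, with the longest group first."""
    # sorted() is stable, so words of equal length keep their input order.
    ordered = sorted(words, key=len, reverse=True)
    return [list(group) for _, group in groupby(ordered, key=len)]


print(group_by_len_descending(SAMPLE_SYLLABLES))
# [['zhuang'], ['zhong'], ['hao', 'guo'], ['ni', 'ma']]
```

trying longer syllables first lets a matcher prefer ``zhuang`` over the shorter ``zhu`` or ``zh`` prefixes when scanning text.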
classes
=======

``Range`` -- named 2-tuple with ``[0]`` location and ``[1]`` length.
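
a ``Range`` as described above can be modelled with ``collections.namedtuple``. the sketch below also shows one simplified way to produce such ranges: strip tone marks character by character (so positions stay aligned with the input) and match the longest syllable first. ``SYLLABLES``, ``_strip_tone`` and ``sketch_ranges_of_pinyin`` are hypothetical names, the input is assumed to use precomposed (NFC) characters, and the real ``ranges_of_pinyin_in_string`` may work quite differently:

```python
import re
import unicodedata
from collections import namedtuple

Range = namedtuple("Range", ["location", "length"])

# Hypothetical mini syllable list; the real module ships a complete one.
SYLLABLES = ["hao", "ni", "zhong", "guo"]
TONE_MARKS = "\u0304\u0301\u030c\u0300"  # macron, acute, caron, grave


def _strip_tone(ch):
    """Map one character to its toneless lowercase form, keeping length 1."""
    decomposed = unicodedata.normalize("NFD", ch.lower())
    kept = "".join(c for c in decomposed if c not in TONE_MARKS)
    result = unicodedata.normalize("NFC", kept)
    # Fall back to the original character if the length would change.
    return result if len(result) == 1 else ch


def sketch_ranges_of_pinyin(text):
    """Return a Range for every known syllable found in the text."""
    plain = "".join(_strip_tone(ch) for ch in text)
    # Longest alternatives first, so the regex prefers the longest match.
    alternatives = sorted(SYLLABLES, key=len, reverse=True)
    pattern = re.compile("|".join(map(re.escape, alternatives)))
    return [Range(m.start(), m.end() - m.start()) for m in pattern.finditer(plain)]


print(sketch_ranges_of_pinyin("nǐ hǎo"))
# [Range(location=0, length=2), Range(location=3, length=3)]
```

because ``Range`` is a named tuple, it compares equal to a plain tuple such as ``(0, 2)`` and supports both ``r.location`` and ``r[0]``. note that this sketch also matches syllables inside ordinary words, which a real implementation would need to guard against.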