https://github.com/howl-anderson/hanzi_chaizi

汉字拆字库，可以将汉字拆解成偏旁部首，在机器学习中作为汉字的字形特征 | A Hanzi decomposition library that breaks Chinese characters down into radicals and components, which can serve as glyph-shape features in machine learning.

Host: GitHub
URL: https://github.com/howl-anderson/hanzi_chaizi
Owner: howl-anderson
License: apache-2.0
Created: 2018-11-29T10:14:19.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2024-10-17T04:04:57.000Z (about 1 year ago)
Last Synced: 2025-03-31T23:31:48.871Z (8 months ago)
Topics: chinese, chinese-characters, components, hanzi, radicals, strokes
Language: Python
Homepage:
Size: 469 KB
Stars: 366
Watchers: 3
Forks: 59
Open Issues: 12
Metadata Files:
- Readme: README.md
- License: LICENSE.txt


# Hanzi decomposition (Chinese character decomposition) | 汉字拆字

> 拆字是指將一文字，以筆畫、字形等基本組成單位分解成多個文字。

> The decomposition of characters refers to breaking down a single character into multiple characters based on its basic components, such as strokes and structural elements.

> 汉字拆字让字型相似的字具有相似的拆解结果。

> Hanzi decomposition yields similar decomposition results for characters with similar structures.

> 这种特性可以被深度学习模型用来作为字的特征之一：字形的特征。

> This property lets deep learning models use the decomposition as one feature of a character: its glyph shape.
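
As a rough illustration of the idea above, the sketch below scores two characters by the overlap of their component sets (Jaccard similarity). The `DECOMPOSITIONS` table is a hand-written stand-in, not the library's data: only the entry for `名` comes from this README, and the other entries are illustrative assumptions. `component_similarity` is a hypothetical helper name, not part of the package.

```python
# Hand-written stand-in table; only the entry for 名 is taken from this README.
# The other entries are illustrative assumptions, not library output.
DECOMPOSITIONS = {
    "名": ["夕", "口"],
    "吕": ["口", "口"],
    "明": ["日", "月"],
}


def component_similarity(a, b, table=DECOMPOSITIONS):
    """Jaccard similarity between the component sets of two characters."""
    set_a = set(table.get(a, []))
    set_b = set(table.get(b, []))
    union = set_a | set_b
    if not union:
        return 0.0  # neither character has a known decomposition
    return len(set_a & set_b) / len(union)


print(component_similarity("名", "吕"))  # the two share the component 口
print(component_similarity("名", "明"))  # no shared component
```

A set-based score like this ignores component order and repetition; a model would more likely consume the component sequence directly, for example through an embedding layer.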

## Installation

```bash

pip install hanzi_chaizi

```

## Usage

```python

from hanzi_chaizi import HanziChaizi

hc = HanziChaizi()

result = hc.query('名')

print(result)

```

Output:

```text

['夕', '口']

```
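
To turn a whole string into component features, one option is to wrap the lookup in a helper that keeps characters without a usable decomposition as they are. This is a minimal sketch: `decompose_text` and the `fake_query` stand-in are hypothetical names, and with the package installed `HanziChaizi().query` could presumably be passed as `query` instead. How `query` behaves for characters missing from the dictionary is not documented here, so the helper treats both an empty result and an exception as "no decomposition".

```python
def decompose_text(text, query):
    """Map each character to its components, falling back to the character itself."""
    features = []
    for char in text:
        try:
            parts = query(char)
        except Exception:
            # Assumption: a lookup for an unknown character may raise.
            parts = None
        features.append(list(parts) if parts else [char])
    return features


def fake_query(char):
    # Stand-in for hc.query; only the entry for 名 comes from this README.
    return {"名": ["夕", "口"]}.get(char)


print(decompose_text("名A", fake_query))  # [['夕', '口'], ['A']]
```

Passing the lookup in as a callable keeps the helper independent of the dictionary, so it can be tested without the package's data.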

## Development

### Data source

Data from this project: [漢語拆字字典](https://github.com/kfcd/chaizi)

### Parsing and converting the data format

```bash

python dev_scripts/parse.py

```

## Credits

Data from this project: [漢語拆字字典](https://github.com/kfcd/chaizi)

## Citation

```bibtex

@misc{kong2018hanzichaizi,

  title={Hanzi Chaizi},

  author={Xiaoquan Kong},

  howpublished={https://github.com/howl-anderson/hanzi_chaizi},

  year={2018}

}

```

If the package is cited in books, seminars, or academic research papers, or used in company products, you are welcome (but not required) to email me about it. I'm glad to see the package being used and proving valuable to others.
