https://github.com/yuanjie-ai/inlp

https://pypi.org/project/iNLP/

Host: GitHub
URL: https://github.com/yuanjie-ai/inlp
Owner: yuanjie-ai
License: mit
Created: 2018-06-04T03:36:44.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2018-12-04T07:23:04.000Z (almost 6 years ago)
Last Synced: 2024-10-04T03:19:15.580Z (about 2 months ago)
Topics: nlp, nlp-apis, nlp-keywords-extraction, nlp-library, nlp-machine-learning, nlp-parsing, nlp-resources
Language: Python
Homepage:
Size: 5.88 MB
Stars: 11
Watchers: 4
Forks: 4
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

:rocket: iNLP :facepunch:


---

## Install

```sh
# Install the released package from PyPI
pip install iNLP

# Or install the latest source from GitHub
pip install git+https://github.com/Jie-Yuan/iNLP.git
```

---

## Usage

### **`inlp.convert`**

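- Simplified/Traditional Chinese conversion

```python
from inlp.convert import chinese

# Simplified -> Traditional (same as chinese.simple2tradition('忧郁的台湾乌龟'))
chinese.s2t('忧郁的台湾乌龟')

# Traditional -> Simplified (same as chinese.tradition2simple('憂郁的臺灣烏龜'))
chinese.t2s('憂郁的臺灣烏龜')
```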

- Full-width/half-width character conversion

```python
from inlp.convert import char

# Half-width -> full-width
char.half2full("0123456789")

# Full-width -> half-width
char.full2half("０１２３４５６７８９")
```
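To illustrate what a full-width/half-width conversion typically involves, here is a minimal standalone sketch. It is not the iNLP implementation: the function names `half_to_full` and `full_to_half` are hypothetical, and the sketch assumes the usual Unicode layout, in which the full-width forms U+FF01..U+FF5E sit at a fixed offset of 0xFEE0 from ASCII `!`..`~` and the ideographic space U+3000 pairs with the ASCII space.

```python
# Illustrative sketch only; not taken from the iNLP source.
FULL_OFFSET = 0xFEE0  # U+FF01 - U+0021


def half_to_full(text: str) -> str:
    """Map ASCII space and '!'..'~' to their full-width counterparts."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x20:                 # ASCII space -> ideographic space
            out.append("\u3000")
        elif 0x21 <= code <= 0x7E:       # printable ASCII -> full-width form
            out.append(chr(code + FULL_OFFSET))
        else:                            # everything else is left unchanged
            out.append(ch)
    return "".join(out)


def full_to_half(text: str) -> str:
    """Inverse of half_to_full for the same character ranges."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:               # ideographic space -> ASCII space
            out.append(" ")
        elif 0xFF01 <= code <= 0xFF5E:   # full-width form -> printable ASCII
            out.append(chr(code - FULL_OFFSET))
        else:
            out.append(ch)
    return "".join(out)
```

Characters outside these ranges, such as Chinese characters, pass through untouched, so the two functions round-trip on mixed text.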

### **`inlp.explode`**

- Decompose a Chinese character into its component characters

```python
from inlp.explode import Chars

Chars().get_chars('袁')  # ['土 口 衣']
```

- Decompose a Chinese character into its strokes

```python
from inlp.explode import Strokes

Strokes().get_strokes('袁')  # ['一', '丨', '一', '丨', 'フ', '一', 'ノ', 'フ', 'ノ', '丶']
```

### **`inlp.similarity`**

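- Thesaurus-based similarity

```python
from inlp.similarity import thesaurus

# Both sentences are passed as lists of already-segmented words
s1 = ['周杰伦', '是', '一个', '歌手']
s2 = ['刘若英', '是', '个', '演员']

thesaurus.cilin(s1, s2)   # similarity based on the Cilin thesaurus
thesaurus.hownet(s1, s2)  # similarity based on HowNet
```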

- Hash-based similarity

```python
from inlp.similarity import simhash

# s1 and s2 are the word lists defined in the thesaurus example above
simhash(s1, s2)
```
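For readers unfamiliar with SimHash, the general idea can be sketched as follows. This is a generic, unweighted SimHash over token lists, not the code behind `inlp.similarity.simhash`. The helper names, the use of MD5 as the per-token hash, the 64-bit fingerprint size, and the final score `1 - distance / bits` are all assumptions made for illustration; the README does not say how iNLP computes or scales its result.

```python
# Illustrative sketch only; not taken from the iNLP source.
import hashlib


def simhash_fingerprint(tokens, bits=64):
    """Build a SimHash fingerprint: each token votes +1/-1 on every bit."""
    weights = [0] * bits
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:                        # keep the bit if the votes are positive
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def simhash_similarity(tokens_a, tokens_b, bits=64):
    """Assumed scoring: 1.0 for identical fingerprints, lower as bits differ."""
    distance = hamming_distance(
        simhash_fingerprint(tokens_a, bits),
        simhash_fingerprint(tokens_b, bits),
    )
    return 1 - distance / bits


s1 = ['周杰伦', '是', '一个', '歌手']
s2 = ['刘若英', '是', '个', '演员']
score = simhash_similarity(s1, s2)
```

SimHash is designed for near-duplicate detection on longer texts, so scores on very short sentences like these are noisy and should be read with caution.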

---

> Planned: add word-vector-based methods for finding similar words and similar sentences.