https://github.com/mocobeta/janome
Japanese morphological analysis engine written in pure Python
japanese-language nlp-library python
- Host: GitHub
- URL: https://github.com/mocobeta/janome
- Owner: mocobeta
- License: apache-2.0
- Created: 2015-02-14T09:47:00.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2025-02-11T08:54:15.000Z (10 days ago)
- Last Synced: 2025-02-14T14:30:54.883Z (7 days ago)
- Topics: japanese-language, nlp-library, python
- Language: Python
- Homepage: https://mocobeta.github.io/janome/en/
- Size: 403 MB
- Stars: 865
- Watchers: 32
- Forks: 52
- Open Issues: 15
Metadata Files:
- Readme: README.rst
- Changelog: CHANGES.txt
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
=======
Janome
=======

.. image:: https://github.com/mocobeta/janome/workflows/Checks/badge.svg
    :target: https://github.com/mocobeta/janome/actions?query=workflow%3AChecks

.. image:: https://coveralls.io/repos/github/mocobeta/janome/badge.svg?branch=master
    :target: https://coveralls.io/github/mocobeta/janome?branch=master

.. image:: https://img.shields.io/pypi/dm/Janome.svg
    :target: https://pypistats.org/packages/janome

.. image:: https://img.shields.io/conda/v/conda-forge/janome
    :target: https://anaconda.org/conda-forge/janome

Janome is a Japanese morphological analysis engine written in pure Python.

General documentation:

https://janome.mocobeta.dev/en/ (English)

https://janome.mocobeta.dev/ja/ (Japanese)

Requirements
=============

Python 3.7+ is required.

Install
========

[Note] Building the package consumes about 500 MB of memory.

.. code:: bash

    (venv) $ pip install janome

Run
====

.. code:: bash

    (venv) $ python
    >>> from janome.tokenizer import Tokenizer
    >>> t = Tokenizer()
    >>> for token in t.tokenize('すもももももももものうち'):
    ...     print(token)
    ...
    すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
    も 助詞,係助詞,*,*,*,*,も,モ,モ
    もも 名詞,一般,*,*,*,*,もも,モモ,モモ
    も 助詞,係助詞,*,*,*,*,も,モ,モ
    もも 名詞,一般,*,*,*,*,もも,モモ,モモ
    の 助詞,連体化,*,*,*,*,の,ノ,ノ
    うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ

License
========

Janome is licensed under the Apache License 2.0 and uses the MeCab-IPADIC dictionary/statistical model.

See LICENSE.txt and NOTICE.txt for license details.

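Output format
==============

Each line printed in the Run example is a surface form followed by nine comma-separated features. The sketch below splits such a line into named fields using only the standard library. It is an illustration, not part of Janome: ``ParsedToken`` and ``parse_token_line`` are hypothetical helpers, and the field names assume the MeCab-IPADIC feature order (four part-of-speech levels, inflection type, inflection form, base form, reading, phonetic).

```python
from typing import NamedTuple


class ParsedToken(NamedTuple):
    """One printed token line split into named fields (hypothetical helper)."""
    surface: str
    part_of_speech: str  # four comma-separated levels, e.g. '名詞,一般,*,*'
    infl_type: str
    infl_form: str
    base_form: str
    reading: str
    phonetic: str


def parse_token_line(line: str) -> ParsedToken:
    """Parse a line such as 'すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ'.

    Assumes the surface form contains no whitespace and that no feature
    contains a comma; lines that violate this raise ValueError.
    """
    # The surface form is separated from the features by whitespace (space or tab).
    parts = line.split(None, 1)
    if len(parts) != 2:
        raise ValueError(f'no feature column found: {line!r}')
    surface, features = parts
    fields = features.strip().split(',')
    if len(fields) != 9:
        raise ValueError(f'expected 9 feature fields, got {len(fields)}: {line!r}')
    return ParsedToken(
        surface=surface,
        part_of_speech=','.join(fields[:4]),
        infl_type=fields[4],
        infl_form=fields[5],
        base_form=fields[6],
        reading=fields[7],
        phonetic=fields[8],
    )


if __name__ == '__main__':
    token = parse_token_line('すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ')
    print(token.surface, token.part_of_speech, token.reading)
```

When working with Janome directly, reading these values from the token objects is likely more robust than re-parsing the printed text; the parser above is only meant to make the column layout explicit.
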
Acknowledgement
================

Special thanks to @ikawaha, @takuyaa, @nakagami and @janome_oekaki.

Copyright
==========

Copyright(C) 2015-2025, Tomoko Uchida. All rights reserved.