An open API service indexing awesome lists of open source software.

https://github.com/mocobeta/janome

Japanese morphological analysis engine written in pure Python
https://github.com/mocobeta/janome

japanese-language nlp-library python

Last synced: 42 minutes ago
JSON representation

Japanese morphological analysis engine written in pure Python

Awesome Lists containing this project

README

        

=======
Janome
=======

.. image:: https://github.com/mocobeta/janome/workflows/Checks/badge.svg
:target: https://github.com/mocobeta/janome/actions?query=workflow%3AChecks

.. image:: https://coveralls.io/repos/github/mocobeta/janome/badge.svg?branch=master
:target: https://coveralls.io/github/mocobeta/janome?branch=master

.. image:: https://img.shields.io/pypi/dm/Janome.svg
:target: https://pypistats.org/packages/janome

.. image:: https://img.shields.io/conda/v/conda-forge/janome
:target: https://anaconda.org/conda-forge/janome

Janome is a Japanese morphological analysis engine written in pure Python.

General documentation:

https://janome.mocobeta.dev/en/ (English)

https://janome.mocobeta.dev/ja/ (Japanese)

Requirements
=============

Python 3.7+ is required.

Install
========

[Note] This consumes about 500 MB memory for building.

.. code:: bash

(venv) $ pip install janome

Run
====

.. code:: bash

(venv) $ python
>>> from janome.tokenizer import Tokenizer
>>> t = Tokenizer()
>>> for token in t.tokenize('すもももももももものうち'):
... print(token)
...
すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
の 助詞,連体化,*,*,*,*,の,ノ,ノ
うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ

License
========

Licensed under Apache License 2.0 and uses the MeCab-IPADIC dictionary/statistical model.

See LICENSE.txt and NOTICE.txt for license details.

Acknowledgement
================

Special thanks to @ikawaha, @takuyaa, @nakagami and @janome_oekaki.

Copyright
==========

Copyright(C) 2015-2025, Tomoko Uchida. All rights reserved.