{"id":19530808,"url":"https://github.com/alexeyev/mystem-scala","last_synced_at":"2025-04-26T13:31:01.296Z","repository":{"id":36815877,"uuid":"41122752","full_name":"alexeyev/mystem-scala","owner":"alexeyev","description":"Morphological analyzer `mystem` (Russian language) wrapper for JVM languages","archived":false,"fork":false,"pushed_at":"2024-08-29T11:07:41.000Z","size":61,"stargazers_count":24,"open_issues_count":1,"forks_count":16,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-08-30T10:12:21.014Z","etag":null,"topics":["computational-linguistics","java","lemmatizer","mystem","natural-language-processing","russian-morphology","russian-specific","scala","tokenizer","yandex"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alexeyev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-08-20T23:04:06.000Z","updated_at":"2024-08-29T11:07:45.000Z","dependencies_parsed_at":"2024-08-29T10:06:09.546Z","dependency_job_id":null,"html_url":"https://github.com/alexeyev/mystem-scala","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexeyev%2Fmystem-scala","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexeyev%2Fmystem-scala/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexeyev%2Fmystem-scala/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/alexeyev%2Fmystem-scala/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alexeyev","download_url":"https://codeload.github.com/alexeyev/mystem-scala/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224036202,"owners_count":17245030,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computational-linguistics","java","lemmatizer","mystem","natural-language-processing","russian-morphology","russian-specific","scala","tokenizer","yandex"],"created_at":"2024-11-11T01:36:17.561Z","updated_at":"2024-11-11T01:36:19.619Z","avatar_url":"https://github.com/alexeyev.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# A Scala wrapper for morphological analyzer Yandex.MyStem\n\n## Introduction\n\nDetails about the algorithm can be found in [I. 
Segalovich «A fast morphological algorithm with unknown word guessing induced by a dictionary for a web search engine», MLMTA-2003, Las Vegas, Nevada, USA.](http://download.yandex.ru/company/iseg-las-vegas.pdf)\n\nThe wrapper's code is under the MIT license, but please remember that Yandex.MyStem is not open source and is licensed under the [conditions of the Yandex License](https://legal.yandex.ru/mystem/).\n\n## System Requirements\n\nThe wrapper should work on at least Ubuntu Linux 12.04+ and Windows 7+; users report that it also works on OS X.\n\n## Install\n\n### Maven\n\n[Maven central](http://search.maven.org/#artifactdetails|ru.stachek66.nlp|mystem-scala|0.1.4|jar)\n\n```xml\n\u003cdependency\u003e\n  \u003cgroupId\u003eru.stachek66.nlp\u003c/groupId\u003e\n  \u003cartifactId\u003emystem-scala\u003c/artifactId\u003e\n  \u003cversion\u003e0.1.6\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n## Issues\n\nCurrently, only mystem versions 3.0 and 3.1 are supported.\nPlease [create issues for compatibility troubles and other requests.](https://github.com/alexeyev/mystem-scala/issues)\n\n## Examples\n\nProbably the most important thing to remember when working with mystem-scala is \nthat you should have just one MyStem instance per mystem/mystem.exe file in your application.\n\n### Scala \n\n```scala\nimport java.io.File\n\nimport ru.stachek66.nlp.mystem.holding.{Factory, MyStem, Request}\n\nobject MystemSingletonScala {\n\n  val mystemAnalyzer: MyStem =\n    new Factory(\"-igd --eng-gr --format json --weight\")\n      .newMyStem(\n        \"3.0\",\n        Option(new File(\"/home/coolguy/coolproject/3dparty/mystem\"))).get()\n}\n\nobject AppExampleScala extends App {\n\n  MystemSingletonScala\n    .mystemAnalyzer\n    .analyze(Request(\"Есть большие пассажиры мандариновой травы\"))\n    .info\n    .foreach(info =\u003e println(info.initial + \" -\u003e \" + info.lex))\n}\n```\n\n### Java \n\n```java\nimport ru.stachek66.nlp.mystem.holding.Factory;\nimport 
ru.stachek66.nlp.mystem.holding.MyStem;\nimport ru.stachek66.nlp.mystem.holding.MyStemApplicationException;\nimport ru.stachek66.nlp.mystem.holding.Request;\nimport ru.stachek66.nlp.mystem.model.Info;\nimport scala.Option;\nimport scala.collection.JavaConversions;\n\nimport java.io.File;\n\npublic class MyStemJavaExample {\n\n    private final static MyStem mystemAnalyzer =\n            new Factory(\"-igd --eng-gr --format json --weight\")\n                    .newMyStem(\"3.0\", Option.\u003cFile\u003eempty()).get();\n\n    public static void main(final String[] args) throws MyStemApplicationException {\n\n        final Iterable\u003cInfo\u003e result =\n                JavaConversions.asJavaIterable(\n                        mystemAnalyzer\n                                .analyze(Request.apply(\"И вырвал грешный мой язык\"))\n                                .info()\n                                .toIterable());\n\n        for (final Info info : result) {\n            System.out.println(info.initial() + \" -\u003e \" + info.lex() + \" | \" + info.rawResponse());\n        }\n    }\n}\n```\n\n## How to Cite\n\nIf you use our work, references to this repository are highly appreciated. 
\n\n```bibtex\n@misc{alekseev2018mystemscala, \n    author = {Anton Alekseev}, \n    title = {mystem-scala}, \n    year = {2018}, \n    publisher = {GitHub}, \n    journal = {GitHub repository}, \n    howpublished = {\\url{https://github.com/alexeyev/mystem-scala/}}, \n    commit = {the latest commit of the codebase you have used}\n}\n```\n\nIf you do cite it, please do not forget to cite [the original algorithm's author's paper](http://download.yandex.ru/company/iseg-las-vegas.pdf) as well.\n\n## Contacts\n\nAnton Alekseev \u003canton.m.alexeyev@gmail.com\u003e\n\n## Thanks for reviews, reports and contributions\n\n* Vladislav Dolbilov, @darl\n* Mikhail Malchevsky\n* @anton-shirikov\n* Filipp Malkovsky\n* @dizzy7\n\n## Also please see\n\n* https://tech.yandex.ru/mystem/\n* https://nlpub.ru/Mystem\n* https://github.com/Digsolab/pymystem3\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falexeyev%2Fmystem-scala","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falexeyev%2Fmystem-scala","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falexeyev%2Fmystem-scala/lists"}