{"id":22607859,"url":"https://github.com/patch/lingua-stem-unine-pm5","last_synced_at":"2025-03-28T22:20:00.004Z","repository":{"id":5939215,"uuid":"7159602","full_name":"patch/lingua-stem-unine-pm5","owner":"patch","description":"University of Neuchâtel stemmers for Bulgarian, Czech, German, and Persian","archived":false,"fork":false,"pushed_at":"2014-09-02T20:44:18.000Z","size":4616,"stargazers_count":3,"open_issues_count":4,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-03T08:48:03.881Z","etag":null,"topics":["nlp","perl5"],"latest_commit_sha":null,"homepage":"https://metacpan.org/pod/Lingua::Stem::UniNE","language":"Perl","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/patch.png","metadata":{"files":{"readme":"README.md","changelog":"Changes","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2012-12-14T03:43:11.000Z","updated_at":"2018-09-09T10:39:57.000Z","dependencies_parsed_at":"2022-09-11T08:22:07.675Z","dependency_job_id":null,"html_url":"https://github.com/patch/lingua-stem-unine-pm5","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patch%2Flingua-stem-unine-pm5","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patch%2Flingua-stem-unine-pm5/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patch%2Flingua-stem-unine-pm5/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patch%2Flingua-stem-unine-pm5/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/patch","download_url":"https://codeload.github.com
/patch/lingua-stem-unine-pm5/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246106697,"owners_count":20724401,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["nlp","perl5"],"created_at":"2024-12-08T14:22:35.295Z","updated_at":"2025-03-28T22:19:59.985Z","avatar_url":"https://github.com/patch.png","language":"Perl","funding_links":[],"categories":["📦 Legacy \u0026 Inactive Projects"],"sub_categories":[],"readme":"[![Build status](https://travis-ci.org/patch/lingua-stem-unine-pm5.png)](https://travis-ci.org/patch/lingua-stem-unine-pm5)\n[![Coverage status](https://coveralls.io/repos/patch/lingua-stem-unine-pm5/badge.png)](https://coveralls.io/r/patch/lingua-stem-unine-pm5)\n[![CPAN version](https://badge.fury.io/pl/Lingua-Stem-UniNE.png)](http://badge.fury.io/pl/Lingua-Stem-UniNE)\n\n# NAME\n\nLingua::Stem::UniNE - University of Neuchâtel stemmers\n\n# VERSION\n\nThis document describes Lingua::Stem::UniNE v0.08.\n\n# SYNOPSIS\n\n```perl\nuse Lingua::Stem::UniNE;\n\n# create Bulgarian stemmer\n$stemmer = Lingua::Stem::UniNE-\u003enew(language =\u003e 'bg');\n\n# get stem for word\n$stem = $stemmer-\u003estem($word);\n\n# get list of stems for list of words\n@stems = $stemmer-\u003estem(@words);\n```\n\n# DESCRIPTION\n\nThis module contains a collection of stemmers for multiple languages based on\nstemming algorithms provided by Jacques Savoy of the University of Neuchâtel\n(UniNE). 
The languages currently implemented are\n[Bulgarian](https://metacpan.org/pod/Lingua::Stem::UniNE::BG), [Czech](https://metacpan.org/pod/Lingua::Stem::UniNE::CS),\n[German](https://metacpan.org/pod/Lingua::Stem::UniNE::DE), and [Persian](https://metacpan.org/pod/Lingua::Stem::UniNE::FA). Work\nis ongoing for Arabic, Bengali, Finnish, French, Hindi, Hungarian, Italian,\nPortuguese, Marathi, Russian, Spanish, and Swedish. The top priority is\nlanguages for which there are no stemmers available on CPAN.\n\n## Attributes\n\n- language\n\n    The following language codes are currently supported.\n\n        ┌───────────┬────┐\n        │ Bulgarian │ bg │\n        │ Czech     │ cs │\n        │ German    │ de │\n        │ Persian   │ fa │\n        └───────────┴────┘\n\n    They are in the two-letter ISO 639-1 format and are case-insensitive but are\n    always returned in lowercase when requested.\n\n    ```perl\n    # instantiate a stemmer object\n    $stemmer = Lingua::Stem::UniNE-\u003enew(language =\u003e $language);\n\n    # get current language\n    $language = $stemmer-\u003elanguage;\n\n    # change language\n    $stemmer-\u003elanguage($language);\n    ```\n\n    Country codes such as `cz` for the Czech Republic are not supported, nor are\n    IETF language tags such as `fa-AF` or `fa-IR`.\n\n- aggressive\n\n    By default, if there are multiple strengths of stemmers, a light stemmer will be\n    used. When `aggressive` is set to true, an aggressive stemmer will be used if\n    available.\n\n    ```perl\n    $stemmer-\u003eaggressive(1);\n    ```\n\n    Czech and German have aggressive options.\n\n## Methods\n\n- stem\n\n    Accepts a list of words, stems each word, and returns a list of stems. The list\n    returned will always have the same number of elements in the same order as the\n    list provided. 
When no stemming rules apply to a word, the original word is\n    returned.\n\n    ```perl\n    @stems = $stemmer-\u003estem(@words);\n\n    # get the stem for a single word\n    $stem = $stemmer-\u003estem($word);\n    ```\n\n    The words should be provided as character strings and the stems are returned as\n    character strings. Byte strings in arbitrary character encodings are\n    intentionally not supported.\n\n- languages\n\n    Returns a list of supported two-letter language codes using lowercase letters.\n\n    ```perl\n    # object method\n    @languages = $stemmer-\u003elanguages;\n\n    # class method\n    @languages = Lingua::Stem::UniNE-\u003elanguages;\n    ```\n\n# SEE ALSO\n\n[Lingua::Stem::Any](https://metacpan.org/pod/Lingua::Stem::Any) provides a unified interface to any stemmer on CPAN,\nincluding this module, as well as additional features like normalization,\ncasefolding, and in-place stemming.\n\n[Lingua::Stem::Snowball](https://metacpan.org/pod/Lingua::Stem::Snowball) provides alternate stemming algorithms for Finnish,\nFrench, German, Hungarian, Italian, Portuguese, Russian, Spanish, and Swedish,\nas well as other languages.\n\nThese stemming algorithms are based on definitions and implementations by Jacques\nSavoy and Ljiljana Dolamic of the University of Neuchâtel, which are provided at\n[IR Multilingual Resources at UniNE](http://members.unine.ch/jacques.savoy/clef/).\n\n# AUTHOR\n\nNick Patch \u003cpatch@cpan.org\u003e\n\nThis project is brought to you by [Shutterstock](http://www.shutterstock.com/).\nAdditional open source projects from Shutterstock can be found at\n[code.shutterstock.com](http://code.shutterstock.com/).\n\n# COPYRIGHT AND LICENSE\n\n© 2012–2014 Shutterstock, Inc.\n\nThis library is free software; you can redistribute it and/or modify it under\nthe same terms as Perl itself.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpatch%2Flingua-stem-unine-pm5","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpatch%2Flingua-stem-unine-pm5","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpatch%2Flingua-stem-unine-pm5/lists"}