{"id":39085180,"url":"https://github.com/giellalt/lang-sms","last_synced_at":"2026-01-17T18:36:58.990Z","repository":{"id":59459874,"uuid":"257563678","full_name":"giellalt/lang-sms","owner":"giellalt","description":"Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Skolt Sami language","archived":false,"fork":false,"pushed_at":"2026-01-16T20:19:33.000Z","size":133459,"stargazers_count":4,"open_issues_count":2,"forks_count":0,"subscribers_count":20,"default_branch":"main","last_synced_at":"2026-01-17T09:08:15.699Z","etag":null,"topics":["constraint-grammar","finite-state-transducers","geo-nordic","giellalt-langs","indigenous-languages","langfam-uralic","language-resources","maturity-prod","minority-language","nlp","proofing-tools"],"latest_commit_sha":null,"homepage":"https://giellalt.uit.no","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/giellalt.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-04-21T10:45:56.000Z","updated_at":"2026-01-16T20:15:48.000Z","dependencies_parsed_at":"2025-10-21T12:27:33.288Z","dependency_job_id":null,"html_url":"https://github.com/giellalt/lang-sms","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/giellalt/lang-sms","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/giellalt%2Flang-sms","tags_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/giellalt%2Flang-sms/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/giellalt%2Flang-sms/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/giellalt%2Flang-sms/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/giellalt","download_url":"https://codeload.github.com/giellalt/lang-sms/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/giellalt%2Flang-sms/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28516195,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T18:28:00.501Z","status":"ssl_error","status_checked_at":"2026-01-17T18:28:00.150Z","response_time":85,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["constraint-grammar","finite-state-transducers","geo-nordic","giellalt-langs","indigenous-languages","langfam-uralic","language-resources","maturity-prod","minority-language","nlp","proofing-tools"],"created_at":"2026-01-17T18:36:58.885Z","updated_at":"2026-01-17T18:36:58.967Z","avatar_url":"https://github.com/giellalt.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"The Skolt Sami morphology and 
tools\n===================================\n\n[![Maturity](https://img.shields.io/endpoint?url=https%3A%2F%2Fraw.githubusercontent.com%2Fgiellalt%2Flang-sms%2Fgh-pages%2Fmaturity.json)](https://giellalt.github.io/MaturityClassification.html)\n![Lemma count](https://img.shields.io/endpoint?url=https%3A%2F%2Fraw.githubusercontent.com%2Fgiellalt%2Flang-sms%2Fgh-pages%2Flemmacount.json)\n[![GitHub issues](https://img.shields.io/github/issues-raw/giellalt/lang-sms)](https://github.com/giellalt/lang-sms/issues)\n[![License](https://img.shields.io/github/license/giellalt/lang-sms)](https://github.com/giellalt/lang-sms/blob/main/LICENSE)\n[![Doc Build Status](https://github.com/giellalt/lang-sms/workflows/Docs/badge.svg)](https://github.com/giellalt/lang-sms/actions)\n[![CI/CD Build Status](https://divvun-tc.giellalt.org/api/github/v1/repository/giellalt/lang-sms/main/badge.svg)](https://divvun-tc.giellalt.org/api/github/v1/repository/giellalt/lang-sms/main/latest)\n\nDownload nightly / CI/CD installation packages for testing (contains the core zhfst file(s)):\n\n[![Windows](https://img.shields.io/badge/download%40latest-Windows--bhfst-brightgreen)](https://pahkat.uit.no/main/download/speller-sms?platform=windows\u0026channel=nightly)\n[![MacOS](https://img.shields.io/badge/download%40latest-macOS--bhfst-brightgreen)](https://pahkat.uit.no/main/download/speller-sms?platform=macos\u0026channel=nightly)\n[![Mobile](https://img.shields.io/badge/download%40latest-mobile--bhfst-brightgreen)](https://pahkat.uit.no/main/download/speller-sms?platform=mobile\u0026channel=nightly)\n\n__NB!!__ Note that the nightly / CI/CD installation packages are not tested for language quality, and might contain regressions and errors.\n\nThis directory contains source files for the Skolt Sami language\nmorphology and dictionary. The data and implementation are licenced\nunder \\_\\_LICENSE\\_\\_ licence also detailed in the\n[LICENSE](https://github.com/giellalt/lang-sms/blob/main/LICENSE). 
The\nauthors named in the AUTHORS file are available to grant\nother licencing choices.\n\nInstall proofing tools and [keyboards](https://github.com/giellalt/keyboard-sms)\nfor the Skolt Sami language by using the [Divvun Installer](http://divvun.no)\n\nDownload and test speller files\n-------------------------------\n\nThe speller files downloadable at the top of this page (the `*.bhfst` files) can\nbe used with [divvunspell](https://github.com/divvun/divvunspell), to test their\nperformance. These files are the exact same ones as installed on users' computers\nand mobile phones. Desktop and mobile speller files differ from each other in the\nerror model and should be tested separately — thus also two different downloads.\n\n\nDocumentation\n-------------\n\nDocumentation can be found here:\n\n- [Language specific documentation](https://giellalt.github.io/lang-sms/)\n- [General documentation](https://giellalt.github.io/)\n\nCore dependencies\n-----------------\n\nIn order to compile and use the Skolt Sami language morphology and\ndictionaries, you need:\n\n- an FST compiler: [HFST](https://github.com/hfst/hfst), [Foma](https://github.com/mhulden/foma) or [Xerox Xfst](https://web.stanford.edu/~laurik/fsmbook/home.html)\n- [VislCG3](https://visl.sdu.dk/svn/visl/tools/vislcg3/trunk) Constraint Grammar tools\n\nTo install VislCG3 and HFST, just copy/paste this into your Terminal on **macOS**:\n\n```\ncurl https://apertium.projectjj.com/osx/install-nightly.sh | sudo bash\n```\n\nor terminal on **Ubuntu, Debian or Windows Subsystem for Linux**:\n\n```\nwget https://apertium.projectjj.com/apt/install-nightly.sh -O - | sudo bash\nsudo apt-get install cg3 hfst\n```\n\nor terminal on **RedHat, Fedora, CentOS or Windows Subsystem for Linux**:\n\n```\nwget https://apertium.projectjj.com/rpm/install-nightly.sh -O - | sudo bash\nsudo dnf install cg3 hfst\n```\n\nAlternatively, the Apertium wiki has good instructions on how to [install the dependencies for Mac\nOS 
X](https://wiki.apertium.org/wiki/Apertium_on_Mac_OS_X) and how to [install\nthe dependencies on\nLinux](https://wiki.apertium.org/wiki/Installation_of_grammar_libraries)\n\nFurther details and dependencies are described on the GiellaLT [Getting Started](https://giellalt.uit.no/infra/GettingStarted.html) pages.\n\n## Downloading\n\nUsing Git:\n```\ngit clone https://github.com/giellalt/lang-sms\n```\n\nUsing Subversion:\n```\nsvn checkout https://github.com/giellalt/lang-sms.git/trunk lang-sms\n```\n\nBuilding and installation\n-------------------------\n\n[INSTALL](https://github.com/giellalt/lang-sms/blob/main/INSTALL)\ndescribes the GNU build system in detail, but for most users it is the usual:\n\n```sh\n./autogen.sh # This will automatically clone or check out other GiellaLT dependencies\n./configure\nmake\nsudo make install # installation requires root privileges\n```\n\n## Citing\n\n\u003c!-- Add language specific citation stuff here and to the CITATION.cff --\u003e\n\nIf you use the Skolt Sami FST in an academic publication, please cite it\nas follows:\n\nRueter, Jack \u0026 Hämäläinen, Mika. (2020). [FST Morphology for the Endangered Skolt Sami Language](https://www.researchgate.net/publication/340598493_FST_Morphology_for_the_Endangered_Skolt_Sami_Language). In *Proceedings of the 1st Joint SLTU and CCURL Workshop*, May 2020, Marseille, France. European Language Resources Association, pp. 250\\--257.\n\n```bibtex\n@InProceedings{rueter-hmlinen:2020:SLTUCCURL,\nauthor = {Rueter, Jack and Hämäläinen, Mika}, \ntitle = {FST Morphology for the Endangered Skolt Sami Language},\nbooktitle ={Proceedings of the 1st Joint SLTU and CCURL Workshop},\nyear = {2020}\n}\n```\n\nIf you use language data from more than one GiellaLT language, consider citing\n[our LREC 2022 article on the whole\ninfrastructure](https://aclanthology.org/2022.lrec-1.125/):\n\n\u003e Linda Wiechetek, Katri Hiovain-Asikainen, Inga Lill Sigga Mikkelsen,\n  Sjur Moshagen, Flammie Pirinen, Trond Trosterud, and Børre Gaup. 
2022.\n  *Unmasking the Myth of Effortless Big Data - Making an Open Source\n  Multi-lingual Infrastructure and Building Language Resources from Scratch*.\n  In Proceedings of the Thirteenth Language Resources and Evaluation Conference,\n  pages 1167–1177, Marseille, France. European Language Resources Association.\n\nIf you use BibTeX, the following entry is as it appears in the ACL Anthology:\n\n```bibtex\n@inproceedings{wiechetek-etal-2022-unmasking,\n    title = \"Unmasking the Myth of Effortless Big Data - Making an Open Source\n    Multi-lingual Infrastructure and Building Language Resources from Scratch\",\n    author = \"Wiechetek, Linda  and\n      Hiovain-Asikainen, Katri  and\n      Mikkelsen, Inga Lill Sigga  and\n      Moshagen, Sjur  and\n      Pirinen, Flammie  and\n      Trosterud, Trond  and\n      Gaup, B{\\o}rre\",\n    booktitle = \"Proceedings of the Thirteenth Language Resources and Evaluation\n    Conference\",\n    month = jun,\n    year = \"2022\",\n    address = \"Marseille, France\",\n    publisher = \"European Language Resources Association\",\n    url = \"https://aclanthology.org/2022.lrec-1.125\",\n    pages = \"1167--1177\"\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgiellalt%2Flang-sms","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgiellalt%2Flang-sms","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgiellalt%2Flang-sms/lists"}