{"id":21769926,"url":"https://github.com/mideind/greynirseq","last_synced_at":"2025-04-13T16:32:37.970Z","repository":{"id":40386945,"uuid":"276082894","full_name":"mideind/GreynirSeq","owner":"mideind","description":"GreynirSeq is a natural language parsing toolkit for Icelandic focused on sequence modeling with neural networks.","archived":false,"fork":false,"pushed_at":"2024-10-23T15:08:17.000Z","size":4575,"stargazers_count":9,"open_issues_count":3,"forks_count":1,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-03-27T07:21:25.915Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mideind.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-06-30T11:43:47.000Z","updated_at":"2024-10-23T15:08:20.000Z","dependencies_parsed_at":"2024-10-23T17:03:23.034Z","dependency_job_id":"f98fb0f7-28bb-410c-83aa-5b444be010f9","html_url":"https://github.com/mideind/GreynirSeq","commit_stats":{"total_commits":245,"total_committers":8,"mean_commits":30.625,"dds":0.5428571428571429,"last_synced_commit":"8baa0c11fad36cfa1d94810dbf01a3141e6a988f"},"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mideind%2FGreynirSeq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mideind%2FGreynirSeq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mideind%2FGreynirSeq/releases","manifests_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/mideind%2FGreynirSeq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mideind","download_url":"https://codeload.github.com/mideind/GreynirSeq/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248743997,"owners_count":21154784,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-26T14:10:38.444Z","updated_at":"2025-04-13T16:32:37.950Z","avatar_url":"https://github.com/mideind.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![superlinter](https://github.com/mideind/greynirseq/actions/workflows/superlinter.yml/badge.svg)]()\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![License: CC BY 4.0](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/)\n[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)\n\n---\n\n\u003cimg src=\"assets/greynir-logo-large.png\" alt=\"Greynir\" width=\"200\" height=\"200\" align=\"right\" style=\"margin-left:20px; margin-bottom: 20px;\"\u003e\n\n# GreynirSeq\n\nGreynirSeq is a natural language parsing toolkit for Icelandic focused on sequence modeling with neural networks. 
It is under active development and is in its early stages.\n\nThe modeling part (nicenlp) of GreynirSeq is built on top of the excellent [fairseq](https://github.com/pytorch/fairseq) from Facebook (which is itself built on top of PyTorch).\n\nGreynirSeq is licensed under the MIT license unless otherwise stated at the top of a file.\nModel files hosted by Miðeind or on Hugging Face are under the CC-BY-4.0 or the GNU Affero GPLv3\nlicenses; please see the individual repositories for details.\n\n---\n\nBe aware that using the CLI or otherwise downloading model files will download **gigabytes** of data.\nSimply installing `greynirseq` will not download any models; they are downloaded automatically on demand.\n\n## Installation\nIn a suitable virtual environment:\n``` bash\n# From PyPI\n$ pip install greynirseq\n# or from git main branch\n$ pip install git+https://github.com/mideind/greynirseq@main\n```\n\n## Features\n\n### TL;DR give me the CLI\n\nThe `greynirseq` CLI can be used to run pretrained models for various tasks. 
Run `pip install greynirseq \u0026\u0026 greynirseq -h` to see what options are available.\n\n#### POS\nInput is accepted from a file containing a single [tokenized](https://github.com/mideind/Tokenizer) sentence per line, or from stdin.\n\n``` bash\n$ echo \"Systurnar Guðrún og Monique átu einar um jólin á McDonalds .\" | greynirseq pos --input -\n\nnvfng nven-s c n---s sfg3fþ lvfnsf af nhfog af n----s pl\n```\n\n#### NER\nInput is accepted from a file containing a single [tokenized](https://github.com/mideind/Tokenizer) sentence per line, or from stdin.\n\n``` bash\n$ echo \"Systurnar Guðrún og Monique átu einar um jólin á McDonalds .\" | greynirseq ner --input -\n\nO B-Person O B-Person O O O O O B-Organization O\n```\n\n#### Translation\nInput is accepted from a file containing a single **untokenized** sentence per line, or from stdin.\n\n``` bash\n# For en-\u003eis translation\n$ echo \"This is an awesome test that shows how to use a pretrained translation model.\" | greynirseq translate --source-lang en --target-lang is\n\nÞetta er æðislegt próf sem sýnir hvernig nota má forprófað þýðingarlíkan.\n\n# For is-\u003een translation\n$ echo \"Þetta er æðislegt próf sem sýnir hvernig nota má forprófað þýðingarlíkan.\" | greynirseq translate --source-lang is --target-lang en\n\nThis is an awesome test that shows how a pre-tested translation model can be used.\n```\n\n### Neural Icelandic Language Processing - NIceNLP\n\nIceBERT is an Icelandic BERT-based (RoBERTa) language model that is suitable for fine-tuning on downstream tasks.\n\nThe following fine-tuning tasks are available both through the `greynirseq` CLI and for loading programmatically.\n\n1. [POS tagging](https://github.com/mideind/GreynirSeq/blob/main/src/greynirseq/nicenlp/examples/pos/README.md)\n1. [NER tagging](https://github.com/mideind/GreynirSeq/blob/main/src/greynirseq/nicenlp/examples/ner/README.md)\n\nThere are also some translation models available. 
They are Transformer models trained from scratch or fine-tuned from mBART25.\n\n1. [Translation](https://github.com/mideind/GreynirSeq/blob/main/src/greynirseq/nicenlp/examples/translation/README.md)\n\n## Development\nTo install GreynirSeq in development mode, install it with pip in editable mode:\n\n```bash\npip install -e .[dev]\n```\n\n### Linting\n\nAll code is checked with [Super-Linter](https://github.com/github/super-linter) in a *GitHub Action*; we recommend running it locally before pushing:\n\n```bash\n./run_linter.sh\n```\n\n### Type annotation\n\nType annotations will soon be checked with mypy and should be included.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmideind%2Fgreynirseq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmideind%2Fgreynirseq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmideind%2Fgreynirseq/lists"}