{"id":21769944,"url":"https://github.com/mideind/yfirlestur","last_synced_at":"2025-04-13T16:32:38.316Z","repository":{"id":37956941,"uuid":"246099784","full_name":"mideind/Yfirlestur","owner":"mideind","description":"The yfirlestur.is web application.","archived":false,"fork":false,"pushed_at":"2025-04-03T16:12:49.000Z","size":1437,"stargazers_count":6,"open_issues_count":1,"forks_count":1,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-04-03T17:26:22.113Z","etag":null,"topics":["grammar","greynir","icelandic","icelandic-language","natural-language-processing","nlp","spelling","web"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mideind.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-03-09T17:29:28.000Z","updated_at":"2025-04-03T16:12:53.000Z","dependencies_parsed_at":"2023-01-30T01:15:57.917Z","dependency_job_id":"dd74f3dd-a2c9-4dd5-9c90-7aef145004e3","html_url":"https://github.com/mideind/Yfirlestur","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mideind%2FYfirlestur","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mideind%2FYfirlestur/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mideind%2FYfirlestur/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mideind%2FYfirlestur/manifests","owner_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/owners/mideind","download_url":"https://codeload.github.com/mideind/Yfirlestur/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248743835,"owners_count":21154748,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["grammar","greynir","icelandic","icelandic-language","natural-language-processing","nlp","spelling","web"],"created_at":"2024-11-26T14:10:42.075Z","updated_at":"2025-04-13T16:32:38.288Z","avatar_url":"https://github.com/mideind.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Python 3.8](https://img.shields.io/badge/python-3.8-blue.svg)](https://www.python.org/downloads/release/python-380/)\n[![Build](https://github.com/mideind/Yfirlestur/actions/workflows/python-app.yml/badge.svg)]()\n\n\u003cimg src=\"static/img/yfirlestur-logo-large.png\" alt=\"Yfirlestur\" width=\"200\" height=\"200\"\n  align=\"right\" style=\"margin-left:20px; margin-bottom: 20px;\"\u003e\n\n# Yfirlestur\n\n### Spelling and grammar correction for Icelandic\n\n*Yfirlestur.is* is a web application where you can enter or submit\nIcelandic text and have it checked for spelling and grammar errors.\n\nThe tool also gives hints on words and structures that might not be appropriate,\ndepending on the intended audience for the text.\n\nTry Yfirlestur (in Icelandic) at [https://yfirlestur.is](https://yfirlestur.is)!\n\n\u003cimg 
src=\"static/img/yfirlestur-example-small.png\" width=\"720\" height=\"536\"\n  alt=\"Yfirlestur annotation\" style=\"margin-top: 18px; margin-bottom: 6px\"\u003e\n\n*Text with annotations, as displayed by Yfirlestur.is*\n\nThe core spelling and grammar checking functionality of Yfirlestur.is is provided by the\n[GreynirCorrect](https://github.com/mideind/GreynirCorrect) engine, by the same authors.\nUser feedback is greatly appreciated, either through GitHub Issues\nor by e-mail to [mideind@mideind.is](mailto:mideind@mideind.is).\n\n## HTTPS API\n\nIn addition to its graphical web front-end, Yfirlestur.is exposes a public\nHTTPS/JSON application programming interface (API) to perform spelling and grammar\nchecking.\n\n### From the command line\n\nThis API can, for example, be accessed with `curl` from the Linux/macOS command line\nas follows (try it!):\n\n```bash\n    curl https://yfirlestur.is/correct.api -d \"text=Manninum á verkstæðinu vantar hamar\"\n```\n\n...or, of course, via an HTTPS `POST` from your own code; see below.\n\nAll text is assumed to be encoded in UTF-8.\n\nThe example returns the following JSON (shown indented, for ease of reading):\n\n```json\n{\n  \"result\": [\n    [\n      {\n        \"annotations\": [\n          {\n            \"code\":\"P_WRONG_CASE_þgf_þf\",\n            \"detail\":\"Sögnin 'að vanta' er ópersónuleg. 
Frumlag hennar á að vera í þolfalli í stað þágufalls.\",\n            \"start\":0,\n            \"end\":2,\n            \"start_char\":0,\n            \"end_char\":21,\n            \"suggest\":\"Manninn á verkstæðinu\",\n            \"text\":\"Á líklega að vera 'Manninn á verkstæðinu'\"\n          }\n        ],\n        \"corrected\":\"Manninum á verkstæðinu vantar hamar\",\n        \"tokens\": [\n          {\"k\":6,\"x\":\"Manninum\"},\n          {\"k\":6,\"x\":\"á\"},\n          {\"k\":6,\"x\":\"verkstæðinu\"},\n          {\"k\":6,\"x\":\"vantar\"},\n          {\"k\":6,\"x\":\"hamar\"}\n        ]\n      }\n    ]\n  ],\n  \"stats\":\n    {\n      \"ambiguity\":1.0,\n      \"num_parsed\":1,\n      \"num_sentences\":1,\n      \"num_tokens\":5\n    },\n  \"text\":\"Manninum á verkstæðinu vantar hamar\",\n  \"valid\":true\n}\n```\n\nThe `result` field contains the result of the annotation, as a list of paragraphs,\neach containing a list of sentences, each containing a list of annotations (under\nthe `annotations` field). Of course, if a sentence is correct and has no annotations,\nits annotation list will be empty. An overview of error codes used in annotations is available [here](https://github.com/mideind/GreynirCorrect/blob/master/doc/errorcodes.rst).\n\nEach sentence entry has a field containing a `corrected` version of it, where\nlikely errors have been corrected. The `corrected` string includes corrections\nof most spelling errors but only a subset of suspected grammar errors;\nthe system is intentionally less aggressive about automatically applying those\n(as can be seen in the example above).\n\nSentence entries also contain a list of `tokens`. The tokens\noriginate in the [Tokenizer package](https://github.com/mideind/Tokenizer) and contain the following fields:\n\n`i`: Character index of token start.\n`k`: Number identifying the token type (WORD, DATEREL, AMOUNT, etc.). 
The mapping from numbers to token types can be found in the documentation for the [Tokenizer package](https://github.com/mideind/Tokenizer).\n`o`: Original token text.\n`x`: Corrected text of token.\n\nOther possible fields:\n`s`: Lemma of word. It can contain '-' if the lemma does not appear in BÍN and the word has been identified as a compound word.\n`c`: Part-of-speech (kk/kvk/hk, so, lo, ao, fs, st, etc.).\n`b`: Inflectional form given in BÍN. Can be '-' if the word cannot be inflected.\n`t`: Terminal that the token is connected to in the CFG.\n`v`: Token value (if applicable). Number, amount, date or name of currency.\n`f`: BÍN category (alm, ism, fyr, örn, etc.).\n\nEach annotation applies to a span of sentence tokens, starting\nat the token whose index is\ngiven in `start` and ending with the token whose index is\nin `end`. Both indices are 0-based\nand inclusive. Also, a starting character index is found\nin `start_char` and an ending index in `end_char`. Again,\nboth are 0-based and inclusive. Note that these are character\nindices within the original source string, not byte indices.\n\nAn annotation has a `code` which uniquely determines the type\nof error or warning. If the code ends with `/w`, it is a warning; otherwise\nit is an error.\n\nAn annotation has a short, human-readable `text` field which describes\nthe annotation succinctly, as well as a `detail` field which has further detail\non the annotation, possibly containing grammatical explanations.\n\nFinally, some annotations contain a `suggest` field with text that could\nreplace the text within the token span, if the user agrees with\nthe suggestion being made.\n\nThe result JSON further includes a `stats` field with information about\nthe annotation job, such as the number of tokens and sentences processed,\nand how many of those sentences could be parsed. 
The `valid` field is\n`true` if the request was correctly formatted and could be processed\nwithout error, or `false` if there was a problem.\n\n#### Options\n\nThe `/correct.api` endpoint supports several options that can be included\nwith the request data, either as additional form fields (for `x-www-form-urlencoded`\nrequests) or JSON properties (for `application/json` requests).\n\n| Key                           | Type | Default | Explanation\n| ----------------------------- | ---- | ------- | ------------------------------\n| annotate\\_unparsed\\_sentences | bool | true    | Annotate sentence even when parsing fails\n| suppress_suggestions          | bool | false   | Don't return suggestions\n| ignore_wordlist               | list | []      | Words to accept without comment\n| ignore_rules                  | list | []      | Rules to ignore when annotating\n\nAs an example, to suppress suggestions:\n\n```bash\n    curl https://yfirlestur.is/correct.api -d \"text=Manninum á verkstæðinu vantar hamar\u0026suppress_suggestions=true\"\n```\n\n### From Python\n\nAs an example of accessing the Yfirlestur API from Python, here is\na short demo program which submits two paragraphs of text to the\nspelling and grammar checker:\n\n```python\n# $ pip install requests\nimport requests\nimport json\n\n# The text to check, two paragraphs of two and one sentences, respectively\nmy_text = (\n    \"Manninum á verkstæðinu vanntar hamar. Guðjón setti kókið í kælir.\\n\"\n    \"Mér dreimdi stórann brauðhleyf.\"\n)\n\n# Make the POST request, submitting the text\n# Include additional keys in the dict if you want to specify options,\n# such as dict(text=my_text, suppress_suggestions=True)\nrq = requests.post(\"https://yfirlestur.is/correct.api\", data=dict(text=my_text))\n\n# Retrieve the JSON response\nresp = rq.json()\n\n# Enumerate through the returned paragraphs, sentences and annotations\nfor ix, pg in enumerate(resp[\"result\"]):\n    print(f\"\\n{ix+1}. 
efnisgrein\")\n    for sent in pg:\n        print(f\"   {sent['corrected']}\")\n        for ann in sent[\"annotations\"]:\n            print(\n                f\"      {ann['start']:03} {ann['end']:03} \"\n                f\"{ann['code']:20} {ann['text']}\"\n            )\n```\n\nThis program prints the following output:\n\n```bash\n$ python test.py\n\n1. efnisgrein\n   Manninum á verkstæðinu vantar hamar.\n      000 002 P_WRONG_CASE_þgf_þf  Á líklega að vera 'Manninn á verkstæðinu'\n      003 003 S004                 Orðið 'vanntar' var leiðrétt í 'vantar'\n   Guðjón setti kókið í kælir.\n      004 004 P_NT_EndingIR        Á sennilega að vera 'kæli'\n\n2. efnisgrein\n   Mér dreymdi stóran brauðhleif.\n      000 000 P_WRONG_CASE_þgf_þf  Á líklega að vera 'Mig'\n      001 001 S004                 Orðið 'dreimdi' var leiðrétt í 'dreymdi'\n      002 002 S001                 Orðið 'stórann' var leiðrétt í 'stóran'\n      003 003 S004                 Orðið 'brauðhleyf' var leiðrétt í 'brauðhleif'\n```\n\nThe open source *GreynirCorrect* engine that powers Yfirlestur.is\nis further [documented here](https://yfirlestur.is/doc/).\n\n## Running for development\n\nThe service can be packaged and started in development mode using\n[Docker](https://www.docker.com). Run the following commands to start the service\nand expose it via HTTP on port 5002:\n\n```bash\n# Set internal Gunicorn (WSGI web server) user and password\nif [ ! 
-f \"./gunicorn_user.txt\" ]; then\n    echo 'root' \u003e gunicorn_user.txt\n    echo 'root' \u003e\u003e gunicorn_user.txt\nfi\n\ndocker build -t yfirlestur:latest .\ndocker run -it -p 5002:5002 yfirlestur\n```\n\nFor production use, the Docker module should be packaged inside a robust server\nsuch as [nginx](https://www.nginx.com), and the [Gunicorn](https://gunicorn.org)\nuser should be configured appropriately.\n\n## Acknowledgements\n\nParts of this software were developed under the auspices of the\nIcelandic Government's 5-year Language Technology Programme for Icelandic,\nmanaged by Almannarómur. The LT Programme is described\n[here](https://www.stjornarradid.is/lisalib/getfile.aspx?itemid=56f6368e-54f0-11e7-941a-005056bc530c)\n(English version [here](https://clarin.is/media/uploads/mlt-en.pdf)).\n\n## Copyright and licensing\n\nYfirlestur.is is Copyright © 2023 [Miðeind ehf.](https://mideind.is)  \nThe original author of this software is *Vilhjálmur Þorsteinsson*.\n\n\u003ca href=\"https://mideind.is\"\u003e\u003cimg src=\"static/img/mideind-horizontal-small.png\" alt=\"Miðeind ehf.\"\n    width=\"214\" height=\"66\" align=\"right\" style=\"margin-left:20px; margin-bottom: 20px;\"\u003e\u003c/a\u003e\n\nThis software is licensed under the **MIT License**:\n\n*Permission is hereby granted, free of charge, to any person*\n*obtaining a copy of this software and associated documentation*\n*files (the \"Software\"), to deal in the Software without restriction,*\n*including without limitation the rights to use, copy, modify, merge,*\n*publish, distribute, sublicense, and/or sell copies of the Software,*\n*and to permit persons to whom the Software is furnished to do so,*\n*subject to the following conditions:*\n\n**The above copyright notice and this permission notice shall be**\n**included in all copies or substantial portions of the Software.**\n\n*THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND,*\n*EXPRESS OR IMPLIED, INCLUDING BUT NOT 
LIMITED TO THE WARRANTIES OF*\n*MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.*\n*IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY*\n*CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,*\n*TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE*\n*SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.*\n\nIf you would like to use this software in ways that are incompatible\nwith the standard MIT license, [contact Miðeind ehf.](mailto:mideind@mideind.is)\nto negotiate custom arrangements.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmideind%2Fyfirlestur","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmideind%2Fyfirlestur","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmideind%2Fyfirlestur/lists"}