{"id":38095380,"url":"https://github.com/davidheineman/thresh","last_synced_at":"2026-01-16T21:01:58.776Z","repository":{"id":176818540,"uuid":"655422172","full_name":"davidheineman/thresh","owner":"davidheineman","description":"🌾 Universal, customizable and deployable fine-grained evaluation for text generation.","archived":false,"fork":false,"pushed_at":"2023-10-26T01:10:15.000Z","size":91706,"stargazers_count":24,"open_issues_count":1,"forks_count":5,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-11-28T05:36:58.570Z","etag":null,"topics":["annotation-tool","evaluation-framework","natural-language-processing","nlp","thresh"],"latest_commit_sha":null,"homepage":"https://thresh.tools","language":"Vue","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/davidheineman.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-06-18T20:57:35.000Z","updated_at":"2025-09-25T16:56:49.000Z","dependencies_parsed_at":"2023-11-14T03:44:48.448Z","dependency_job_id":null,"html_url":"https://github.com/davidheineman/thresh","commit_stats":{"total_commits":189,"total_committers":2,"mean_commits":94.5,"dds":0.06349206349206349,"last_synced_commit":"7a06e57487dd759125dfd792a394c47da8aa8efa"},"previous_names":["davidheineman/nlproc.tools","davidheineman/thresh.tools"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/davidheineman/thresh","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidheineman%2Fthresh","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidheineman%2Fthresh/tags","releases_url":"https://repos.ecosyste.ms
/api/v1/hosts/GitHub/repositories/davidheineman%2Fthresh/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidheineman%2Fthresh/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/davidheineman","download_url":"https://codeload.github.com/davidheineman/thresh/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidheineman%2Fthresh/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28482481,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T11:59:17.896Z","status":"ssl_error","status_checked_at":"2026-01-16T11:55:55.838Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["annotation-tool","evaluation-framework","natural-language-processing","nlp","thresh"],"created_at":"2026-01-16T21:01:58.030Z","updated_at":"2026-01-16T21:01:58.767Z","avatar_url":"https://github.com/davidheineman.png","language":"Vue","readme":"\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./public/img/logo.png\" width=\"400\" /\u003e\n    \u003c!-- \u003ch3 style=\"margin: 0\"\u003ethresh.tools: Fine-grained evaluation for text generation\u003c/h3\u003e --\u003e\n\n[**Build an Interface**](https://thresh.tools) | [**Video Tutorial**](#quick_start) | [**Paper**](https://arxiv.org/abs/2308.06953)\n\n\u003c!-- | [**Paper**](https://arxiv.org/) 
--\u003e\n\u003c/div\u003e\n\u003cbr /\u003e\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./public/img/github-banner.jpg\" width=\"100%\" style=\"max-width: 1000px\" /\u003e\n\u003c/div\u003e\n\n\u003c!-- TODO ADD GIF DEMO HERE --\u003e\n\n------------------------------------------------\n\n**thresh.tools** is a platform which makes it easy to create and share fine-grained annotation. It was written specifically for complex annotation of text generation (such as [**Scarecrow**](https://thresh.tools/scarecrow) or [**SALSA**](https://thresh.tools/salsa)) and built to be universal across annotation tasks, quickly customizable and easily deployable.\n\n\u003ca id=\"quick_start\"\u003e\u003c/a\u003e\n\n## Quick Start\nVisit [**thresh.tools/demo**](https://thresh.tools/?t=demo_start) for an explanation of how our interface creation works!\n\nhttps://github.com/davidheineman/thresh/assets/9833172/6138408a-1650-42ec-8588-0affb5c98eb5\n\n## Getting Started with `thresh.tools`\n\n### Overview\n`thresh.tools` can be used to [***customize***](#customize) a fine-grained typology, [***deploy***](#deploy) an interface with co-authors, annotators or the research community and [***manage***](#manage) fine-grained annotations using Python. We support each step of the fine-grained annotation lifecycle:\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./public/img/lifecycle.png\" width=\"100%\" style=\"max-width: 1000px\" /\u003e\n\u003c/div\u003e\n\n### Interface Customization Tutorials\nThese tutorials show how to customize an interface for annotation on `thresh.tools`. 
\n\n| feature | tutorial | documentation |\n|:--- | :--: |  :-- | \n| Edit Types | [🔗](https://thresh.tools/?t=demo_edit_types) | [**Adding Edits**](#demo_edit_types)\n| Recursive Question Trees | [🔗](https://thresh.tools/?t=demo_question_trees) | [**Annotating with Recursive Structure**](#demo_question_trees)\n| Custom Instructions | [🔗](https://thresh.tools/?t=demo_instructions) | [**Add Instructions**](#demo_instructions)\n| Paragraph-level Annotation | [🔗](https://thresh.tools/?t=demo_paragraph) | [**Paragraph-level Annotation**](#demo_paragraph)\n| Adjudication | [🔗](https://thresh.tools/?t=demo_adjudication) | [**Multi-interface Adjudication**](#demo_adjudication)\n| Disable Features | [🔗](https://thresh.tools/?t=demo_disable) | --\n| Sub-word Selection | [🔗](https://thresh.tools/?t=demo_tokenization) | [**Sub-word Selection**](#demo_tokenization)\n| Multi-lingual Annotation | [🔗](https://thresh.tools/?t=demo_multilingual) | [**Multi-lingual Annotation**](#demo_multilingual)\n| Crowdsource Deployment | [🔗](https://thresh.tools/?t=demo_crowdsource) | [**Deploy to Crowdsource Platforms**](#demo_crowdsource)\n\n### Deploy \u0026 Manage Annotation Tutorials\nThese notebook tutorials show broader usage of `thresh` for deploying to annotation platforms and managing annotations in Python:\n\n| description | tutorial |\n|:--- | :--: |\n| Load data using the `thresh` library | [**load_data.ipynb**](./notebook_tutorials/load_data.ipynb) |\n| Deploy an interface to the Prolific platform | [**deploy_to_prolific.ipynb**](./notebook_tutorials/deploy_to_prolific.ipynb) |\n| Use `tokenizers` to pre-process your dataset | [**subword_annotation.ipynb**](./notebook_tutorials/subword_annotation.ipynb) |\n\n\u003ca id=\"customize\"\u003e\u003c/a\u003e\n\n## Customize an Interface\n\nEvery interface consists of two elements: the *typology* and the *data*. 
The typology defines your interface and data defines the examples to be annotated.\n\n`\u003ctypology\u003e.yml`:\n\n```yaml\ntemplate_name: my_template\ntemplate_label: My First thresh.tools Template!\nedits:\n    ...\n```\n\n`\u003cdata\u003e.json`:\n```json\n{\n    \"source\": \"...\",\n    \"target\": \"...\"\n}\n```\n\n\u003ca id=\"demo_edit_types\"\u003e\u003c/a\u003e\n\n### Adding Edits [↗️](https://thresh.tools/?t=demo_edit_types)\n\nThe `edits` command defines a list of edits. Each edit can be one of these types:\n\n| ![](./public/img/docs/edit_input_output.png) | ![](./public/img/docs/edit_multispan.png) | ![](./public/img/docs/edit_composite.png) |\n| :--: | :--: | :--: | \n| `type: single_span` | `type: multi_span` | `type: composite` |\n\nAdditionally, the `enable_input` and `enable_output` commands are used to enable selecting the span on the source or target sentences respectively.\n\n| ![](./public/img/docs/edit_input.png) | ![](./public/img/docs/edit_output.png) | ![](./public/img/docs/edit_input_output.png) |\n|:--: | :--: | :--: |\n| `enable_input: true` | `enable_output: true` | `enable_input: true` \u0026 `enable_output: true` |\n\nTo style your edits, the `icon` is any [**Font Awesome icon**](http://fontawesome.com/search?o=r\u0026m=free) and `color` is the associated edit color. \n\n```yaml\nedits:\n  - name: edit_with_annotation\n    label: \"Custom Annotation\"\n    icon: fa-\u003cicon\u003e\n    color: \u003cred|orange|yellow|green|teal|blue\u003e\n    type: \u003csingle_span|multi_span|composite\u003e\n    enable_input: \u003ctrue|false\u003e\n    enable_output: \u003ctrue|false\u003e\n    annotation: ...\n```\n\n\u003ca id=\"demo_question_trees\"\u003e\u003c/a\u003e\n\n### Annotating with Recursive Structure [↗️](https://thresh.tools/?t=demo_question_trees)\n\nWithin each edit, the `annotation` command is used to specify the annotation questions for each edit. 
Using the `options` command, you can specify the question type:\n\n| ![](./public/img/docs/question_binary.png) | ![](./public/img/docs/question_likert.png) | ![](./public/img/docs/question_text.png) |\n| :--: | :--: | :--: | \n| `options: binary` | `options: likert-3` | `options: textbox` \u0026 `options: textarea` |\n\nQuestions are structured as a tree, so if you list sub-questions under the `options` field, they will appear after the user has selected a certain annotation.\n\n| ![](./public/img/docs/question_custom.png) | ![](./public/img/docs/question_multi.png) | ![](./public/img/docs/question_children.png) |\n|:--: | :--: | :--: |\n| List of children in `options` | Multiple questions in `options` | Nested sub-children in `options` |\n\n```yaml\nedits:\n  - name: edit_with_annotation\n    ...\n    annotation:\n    - name: simple_question\n      question: \"Can you answer this question?\"\n      options: \u003clikert-3|binary|textbox|textarea\u003e\n    - name: grandparent_question\n      question: \"Which subtype question is important?\"\n      options:\n      - name: parent_question_1\n        label: \"Custom Parent Question\"\n        question: \"Which subchild would you like to select?\"\n        options:\n        - name: child_1\n          label: \"Custom Child Option 1\"\n        - name: child_2\n          label: \"Custom Child Option 2\"\n        ...\n      - name: parent_question_2\n        label: \"Pre-defined Parent Question\"\n        question: \"Can you rate the span on a scale of 1-3?\"\n        options: \u003clikert-3|binary|textbox|textarea\u003e\n      ...\n  ...\n```\n\n\u003ca id=\"demo_instructions\"\u003e\u003c/a\u003e\n\n### Add Instructions [↗️](https://thresh.tools/?t=demo_instructions)\n\nUsing the `instructions` flag, you can add an instructions modal, or prepend the text above the interface using the `prepend_instructions` flag. 
Instructions are formatted with [**Markdown**](https://www.markdownguide.org/cheat-sheet/).\n\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./public/img/docs/instructions.png\" width=\"400\" /\u003e\n\u003c/div\u003e\n\n```yaml\nprepend_instructions: \u003ctrue|false\u003e\ninstructions: |\n  Your instruction text in markdown format.\n```\n\n\u003ca id=\"demo_paragraph\"\u003e\u003c/a\u003e\n\n### Paragraph-level Annotation [↗️](https://thresh.tools/?t=demo_paragraph)\n\nTo add text before or after the annotation, add the `context`, `source_context_before`, `source_context_after`, `target_context_before` and `target_context_after` entries to your data JSON. The context field is formatted in [**Markdown**](https://www.markdownguide.org/cheat-sheet/), allowing for titles, subsections or code in your annotation context.\n\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./public/img/docs/paragraph.png\" width=\"400\" /\u003e\n\u003c/div\u003e\n\n```json\n[\n  {\n    \"context\": \"\u003ccontext written in markdown\u003e\",\n    \"source_context_before\": \"...\",\n    \"source\": \"\u003cselectable text with context\u003e\",\n    \"source_context_after\": \"...\",\n    \"target_context_before\": \"...\",\n    \"target\": \"\u003cselectable text with context\u003e\",\n    \"target_context_after\": \"...\"\n  }\n]\n```\n\nAdditionally, we have utilities under the `display` command to help with side-by-side or long-context annotations:\n```yaml\ndisplay:\n - side-by-side         # shows text and editor next to each other\n - text-side-by-side    # shows source and target next to each other\n - disable-lines        # disables lines between annotations which can be distracting\n - hide-context         # hides the context by default, adding a \"show context\" button\n```\n\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./public/img/docs/paragraph_horizontal.png\" width=\"600\" /\u003e\n\u003c/div\u003e\n\n\u003ca id=\"demo_adjudication\"\u003e\u003c/a\u003e\n\n### Multi-interface Adjudication 
[↗️](https://thresh.tools/?t=demo_adjudication)\n\nTo display multiple interfaces simultaneously, use the `adjudication` flag with the number of interfaces you want to show, and use `highlight_first_interface` to add a \"Your Annotations\" label on the first interface. \n\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./public/img/docs/adjudication.png\" width=\"600\" /\u003e\n\u003c/div\u003e\n\n```yaml\nadjudication: 2\nhighlight_first_interface: \u003ctrue|false\u003e\n```\n\nUnlike the traditional data loader (which uses the `d` parameter), you can specify multiple data sources with the `dX` parameter as such:\n\n```\nthresh.tools/?d1=\u003cDATASET_1\u003e\u0026d2=\u003cDATASET_2\u003e\n```\n\n\u003ca id=\"demo_tokenization\"\u003e\u003c/a\u003e\n\n### Sub-word Selection [↗️](https://thresh.tools/?t=demo_tokenization)\n\nTo allow a smoother annotation experience, the span selection will \"snap\" to the closest word boundary. This boundary is `word` by default, but can be changed with the `tokenization` flag:\n\n\u003c!-- TODO: ADD SCREENSHOT --\u003e\n\nFor a guide on pre-processing your dataset, please see [**notebook_tutorials/subword_annotation.ipynb**](./notebook_tutorials/subword_annotation.ipynb).\n\n```yaml\ntokenization: \u003cword|char|tokenized\u003e\n```\n\n\u003ca id=\"demo_multilingual\"\u003e\u003c/a\u003e\n\n### Multi-lingual Annotation [↗️](https://thresh.tools/?t=demo_multilingual)\n\nAny text in our interface can be overridden by specifying its source using the `interface_text` flag. We provide templates for different languages, which can be selected with the `language` flag.\n\nFor a full list of interface text overrides, please reference a [**language template**](./public/lang/en.yml).\n\n```yaml\nlanguage: \u003czh|en|es|hi|pt|bn|ru|ja|vi|tr|ko|fr|ur\u003e\ninterface_text:\n  typology:\n    source_label: \"莎士比亚\"\n    target_label: \"现代英语\"\n  ...\n```\n\nLooking to expand our language support? 
See our section on [**contributing**](#language_contribute).\n\n\u003ca id=\"deploy\"\u003e\u003c/a\u003e\n\n## Deploy an Interface\n\nPlease reference the \"Deploy\" modal within [**the interface builder**](https://thresh.tools/) for more detail!\n\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./public/img/docs/deploy.png\" width=\"400\" /\u003e\n\u003c/div\u003e\n\n\u003ca id=\"demo_database\"\u003e\u003c/a\u003e\n\n### Deploy with a Database [↗️](https://thresh.tools/?t=demo_crowdsource)\n\nUse the `database` command to specify a public database to save annotations after users click a \"Submit\" button. We currently support [**Firebase**](https://firebase.google.com/) for any deployment method (in-house or crowdsourcing). Please see [**notebook_tutorials/deploy_database_with_firebase.md**](./notebook_tutorials/deploy_database_with_firebase.md) for a full tutorial on connecting a Firebase database to Thresh.\n\n```yml\ncrowdsource: \"custom\"\ndatabase: \n    type: firebase\n    project_id: [your-project-id]\n    url: https://[your-project-id].firebaseio.com/\n    # collection: thresh     # (default: thresh) The database to use\n    # document: annotation   # (default: annotation) The document to use\n    field: annotation_set_1  # The document field to store annotations\n```\n\n\u003ca id=\"demo_crowdsource\"\u003e\u003c/a\u003e\n\n### Deploy to Crowdsource Platforms [↗️](https://thresh.tools/?t=demo_crowdsource)\n\nUse the `crowdsource` command to specify a \"Submit\" button at the end of annotation. 
Please see [**notebook_tutorials/deploy_to_prolific.ipynb**](./notebook_tutorials/deploy_to_prolific.ipynb) for a full guide on deploying an interface programmatically.\n\n```yaml\ncrowdsource: \u003cprolific\u003e\nprolific_completion_code: \"XXXXXXX\"\n```\n\n\u003ca id=\"manage\"\u003e\u003c/a\u003e\n\n## Manage Data with the `thresh` Library\n```sh\npip install thresh\n```\n\n### Loading Annotations\nTo load annotations, simply load your JSON data and call `load_annotations`:\n\n```python\nfrom thresh import load_interface\n\n# Serialize your typology into a class\nYourInterface = load_interface(\n    \"\u003cpath_to_typology\u003e.yml\"\n)\n\n# Load \u0026 serialize data from \u003cfile_name\u003e.json\nthresh_data = YourInterface.load_annotations(\n    \"\u003cfile_name\u003e.json\"\n)\n```\n\nFor example, using the SALSA demo data:\n```python\nfrom thresh import load_interface\n\n# Load SALSA data using the SALSA typology\nSalsa = load_interface(\"salsa.yml\")\nsalsa_data = Salsa.load_annotations(\"salsa.json\")\n\nprint(salsa_data[0])\n\u003e\u003e SalsaEntry(\n\u003e\u003e   annotator: annotator_1, \n\u003e\u003e   system: new-wiki-1/Human-2-written, \n\u003e\u003e   source: \"Further important aspects of Fungi ...\", \n\u003e\u003e   target: \"An important aspect of Fungi in Art is ...\", \n\u003e\u003e   edits: [\n\u003e\u003e     DeletionEdit(\n\u003e\u003e       input_idx: [[259, 397]], \n\u003e\u003e       annotation: DeletionAnnotation(\n\u003e\u003e         deletion_type: GoodDeletion(\n\u003e\u003e           val: 3\n\u003e\u003e         ), \n\u003e\u003e         coreference: False, \n\u003e\u003e         grammar_error: False\n\u003e\u003e       ),\n\u003e\u003e     ), \n\u003e\u003e     ...\n\u003e\u003e   ]\n\u003e\u003e )\n```\n\nTo prepare a dataset for annotation, simply export your `List[Annotation]` object and call `export_data`:\n```python\n# Export data to \u003cfile_name\u003e.json for annotation\nYourInterface.export_data(\n    
data=thresh_data,\n    output_filename=\"\u003cfile_name\u003e.json\"\n)\n```\n\nFor a full tutorial with examples and advanced usage, please see [**/notebook_tutorials/load_data.ipynb**](./notebook_tutorials/load_data.ipynb).\n\n### Internal Data Classes\nOur data loading code is backed by custom internal classes which are created based on your typology. You can access these classes directly:\n```python\nfrom thresh import load_interface\n\n# Get the custom data class for the SALSA typology\nSalsa = load_interface(\"salsa.yml\")\nSalsaEntry = Salsa.get_entry_class()\n\n# Create a new entry (string fields must be quoted)\ncustom_entry = SalsaEntry(\n    annotator=\"annotator_1\",\n    system=\"new-wiki-1/GPT-3-zero-shot\",\n    target=\"The film has made more than $552 million at the box office and is currently the eighth most successful movie of 2022.\",\n    source=\"The film has grossed over $552 million worldwide, becoming the eighth highest-grossing film of 2022.\"\n)\n\nprint(custom_entry.system)\n\u003e\u003e new-wiki-1/GPT-3-zero-shot\n```\n\n## Data Conversion\nOur `thresh` data format is meant to be universal across fine-grained annotation tasks. To show this, we have created conversion scripts from existing fine-grained typologies. 
Use the `thresh` library to convert from existing data formats:\n\n```sh\npip install thresh\n```\n\nTo convert to our standardized data format, our library includes bi-directional conversion from existing fine-grained annotation typologies:\n\n```python\nfrom thresh import convert_dataset\n\n# To convert to the thresh.tools standardized format:\nthresh_data = convert_dataset(\n    data_path=\"\u003cpath_to_original_data\u003e\", \n    output_path=\"\u003cpath_to_output_data\u003e.json\", # (Optional) Will save data locally\n    dataset_name=\"\u003cdataset_name\u003e\"\n)\n\n# To convert back to the original format:\noriginal_data = convert_dataset(\n    data_path=\"\u003cpath_to_original_data\u003e.json\", \n    output_path=\"\u003cpath_to_output_data\u003e\",\n    dataset_name=\"\u003cdataset_name\u003e\", \n    reverse=True\n)\n```\n\nWe support conversion for the following datasets:\n```\nfrank, scarecrow, mqm, snac, fg-rlhf, propaganda, arxivedits\n```\n\n### Demo Data Sources\n\nIn the table below you can find all the original data for each interface. For our demo data, we randomly selected 50 annotations from each dataset. 
We include the file names of the specific datasets we use below, selecting from the test set when applicable:\n\n| interface | data | implementation | file name |\n|:--- | :--: | :--- | :---: |\n| FRANK | [🔗](https://github.com/artidoro/frank) | [**thresh.tools/frank**](https://thresh.tools/frank) | `human_annotations.json` |\n| Scarecrow | [🔗](https://yao-dou.github.io/scarecrow) | [**thresh.tools/scarecrow**](https://thresh.tools/scarecrow) | `grouped_data.csv` |\n| MQM | [🔗](https://github.com/google/wmt-mqm-human-evaluation) | [**thresh.tools/mqm**](https://thresh.tools/mqm) | `mqm_newstest2020_ende.tsv` |\n| SALSA | [🔗](https://github.com/davidheineman/salsa) | [**thresh.tools/salsa**](https://thresh.tools/salsa) | `salsa_test.json` |\n| SNaC | [🔗](https://github.com/tagoyal/snac) | [**thresh.tools/snac**](https://thresh.tools/snac) | `SNaC_data.json` |\n| arXivEdits | [🔗](https://github.com/chaojiang06/arXivEdits) | [**thresh.tools/arxivedits**](https://thresh.tools/arxivedits) | `test.json` |\n| Wu et al., 2023 | [🔗](https://github.com/allenai/FineGrainedRLHF) | [**thresh.tools/fg-rlhf**](https://thresh.tools/fg-rlhf) | `dev_feedback.json` |\n| Da San Martino et al., 2019 | [🔗](https://propaganda.qcri.org/) | [**thresh.tools/propaganda**](https://thresh.tools/propaganda) | `test/article\u003cX\u003e.labels.tsv` |\n\nWe do not create dataloaders for the following interfaces:\n\n| interface | reason |\n|:---: | :-- |\n| MultiPIT | This is an inspection interface, examples are taken from Table 7 of the [**MultiPIT paper**](https://aclanthology.org/2022.emnlp-main.631). |\n| CWZCC | The example is taken from App. B of the [**CWZCC paper**](https://aclanthology.org/2020.lrec-1.327). Full dataset is not publicly available due to copyright and privacy concerns. 
|\n| ERRANT | Our example data is taken from the annotations from the [**W\u0026I+LOCNESS corpus**]() collected by [**Bryant et al., 2019**](https://aclanthology.org/W19-4406/) from original excerpts from [**Yannakoudakis et al., 2018**](https://www.tandfonline.com/doi/abs/10.1080/08957347.2018.1464447) and [**Granger, 1998**](https://www.learnercorpusassociation.org/resources/tools/locness-corpus/). The dataset was released as part of the [**Building Educational Applications 2019 Shared Task**](https://www.cl.cam.ac.uk/research/nl/bea2019st/#data). |\n\n## Contributing\n\n\u003ca id=\"local\"\u003e\u003c/a\u003e\n\n### Set Up `thresh.tools` Locally\nClone this repo: \n```sh\ngit clone https://github.com/davidheineman/thresh.git\n```\n\nSet up Vue: \n```sh\nnpm install\nnpm run dev     # To run a dev environment\nnpm run build   # To build a prod environment in ./build\nnpm run deploy  # Push to gh-pages\n```\n\nDeployment will create a `gh-pages` branch. You will need to go into GitHub Pages settings and set the source branch to `gh-pages`.\n\n### Submit a New Typology\nYou do *not* need to do this if you want to use your interface (please see [**Deploy an Interface**](#deploy)). This will add your interface to the `thresh.tools` homepage!\n\n\u003cdiv align=\"left\"\u003e\n    \u003cimg src=\"./public/img/hosted-interface.png\" width=\"300px\" /\u003e\n\u003c/div\u003e\n\nTo make your interface available in the `thresh.tools` builder, please clone this repo and submit a pull request with the following:\n\n1. Add your typology YML file to [**public/templates/**](./public/templates/).\n2. Add your demo data JSON file to [**public/data/**](./public/data/). We encourage authors to submit a sample of 50 examples for their full dataset, but this is not required.\n3. 
Modify [**src/main.js**](./src/main.js) to link to your dataset, by adding a line to `templates`:\n\n    ```js\n    const templates = [\n        { name: \"SALSA\", path: \"salsa\", task: \"Simplification\", hosted: true },\n        { name: \"Scarecrow\", path: \"scarecrow\", task: \"Open-ended Generation\", hosted: true },\n        ...\n        { name: \"\u003cdisplay_name\u003e\", path: \"\u003cyour_interface\u003e\", task: \"\u003cyour_task\u003e\", hosted: true }\n    ]\n    ```\n    In this case `\u003cyour_task\u003e` will correspond to the task you are grouped with. *Note: You can preview your changes by setting up [**thresh.tools locally**](#local)!*\n4. Submit a [**pull request**](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request) with your changes! Then we will merge with the `thresh.tools` main branch. Please reach out if you have any questions.\n\n\u003ca id=\"language_contribute\"\u003e\u003c/a\u003e\n\n### Add Language Support\nMulti-lingual deployment is core to `thresh.tools`, and we are actively working to add support for more languages. 
If you would like to add support for a new language (or revise our existing support), our language templates are located in [**public/lang/**](./public/lang/).\n- To add support for a new language, simply create a new `.yml` file using the structure of an [**existing language template**](./public/lang/en.yml).\n- To revise an existing template, simply make changes within the template.\n\nWhen you are finished, please submit a [**pull request**](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request) with your changes.\n\n### Set Up the `thresh` Python Library\nClone this repo: \n```sh\ngit clone https://github.com/davidheineman/thresh.git\ncd data_tools\n```\n\nMake any changes to the library and push to PyPI:\n```sh\nrm -r dist \npython -m build\npython -m twine upload --repository pypi dist/*\n```\n\n## Cite `thresh.tools`\nIf you find our library helpful, please consider citing [**our work**](https://arxiv.org/abs/2308.06953):\n```\n@article{heineman2023thresh,\n  title={Thresh: A Unified, Customizable and Deployable Platform for Fine-Grained Text Evaluation},\n  author={Heineman, David and Dou, Yao and Xu, Wei},\n  journal={arXiv preprint arXiv:2308.06953},\n  year={2023}\n}\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavidheineman%2Fthresh","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdavidheineman%2Fthresh","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavidheineman%2Fthresh/lists"}