{"id":15014136,"url":"https://github.com/d5555/tageditor","last_synced_at":"2025-08-20T22:32:06.403Z","repository":{"id":49320643,"uuid":"182334223","full_name":"d5555/TagEditor","owner":"d5555","description":"🏖TagEditor - Annotation tool for spaCy","archived":false,"fork":false,"pushed_at":"2022-09-23T12:28:14.000Z","size":511629,"stargazers_count":193,"open_issues_count":5,"forks_count":12,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-05-12T02:40:54.095Z","etag":null,"topics":["annotation","annotation-tool","coreference-resolution","data-science","labeling-tool","machine-learning","named-entities","named-entity-recognition","natural-language-processing","neural-networks","neuralcoref","nlp","spacy","spacy-visualizer","tagging-tool","text-annotation","text-tagging","training-data"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/d5555.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-04-19T22:48:02.000Z","updated_at":"2025-04-24T18:22:17.000Z","dependencies_parsed_at":"2023-01-18T20:45:51.284Z","dependency_job_id":null,"html_url":"https://github.com/d5555/TagEditor","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/d5555/TagEditor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d5555%2FTagEditor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d5555%2FTagEditor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d5555%2FTagEditor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/d5555%2FTagEditor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/d5555","download_url":"https://codeload.github.com/d5555/TagEditor/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d5555%2FTagEditor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271397958,"owners_count":24752640,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-20T02:00:09.606Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["annotation","annotation-tool","coreference-resolution","data-science","labeling-tool","machine-learning","named-entities","named-entity-recognition","natural-language-processing","neural-networks","neuralcoref","nlp","spacy","spacy-visualizer","tagging-tool","text-annotation","text-tagging","training-data"],"created_at":"2024-09-24T19:45:14.774Z","updated_at":"2025-08-20T22:32:01.389Z","avatar_url":"https://github.com/d5555.png","language":null,"funding_links":["https://paypal.me/d5555"],"categories":[],"sub_categories":[],"readme":"### TagEditor(v3.4.1) annotation tool\n\nTagEditor is a desktop application (requires **_Windows 10, 64-bit_**) that allows you to quickly annotate text with the help of spaCy library.\u003cbr/\u003e\nWith TagEditor you can annotate **dependencies, parts of speech, Named entities, text categories and 
Coreference resolution**, create custom annotated data, or build a training dataset in .json or .spacy format for training with the spaCy library or PyTorch. \n\n### Installation\nNo installation required.\u003cbr/\u003e\nDownload and unpack [**TagEditor.7z**](https://github.com/d5555/TagEditor/raw/master/TagEditor.7z)\u003cbr/\u003e\nRun 'TagEditor.exe' in the main folder.\n\n### Demo\n\n![alt text](https://github.com/d5555/TagEditor/blob/master/pics/dep800.gif)\n\n### Usage\n\n\u0026nbsp; Insert your text or open a text file and press **Start tagging** (or choose one of the options in Menu/Tools). Choose the type of annotation and the labels, as in the screenshot below, and press Ok. Alternatively, start by loading your datasets in .spacy or .json format.\n\n![alt text](https://github.com/d5555/TagEditor/blob/master/pics/select.png)\u003cbr/\u003e\nSelect a tag in the **TAG SET panel**, then select a word to assign the tag. Select a head tag to assign a dependency if you are working in the Dependencies window.\u003cbr/\u003e\n\u0026nbsp; To edit the Doc or its tokens, right-click on any word. The context menu lets you edit, delete, or insert words or sentences, and merge or split sentences. \nTo merge sentences, right-click on the first word of a sentence. It is checked as **Sentence start**. Uncheck it and the sentence will merge with the previous sentence.\u003cbr/\u003e \n\u003cimg src=\"https://github.com/d5555/TagEditor/blob/master/pics/Context.png\" width=\"700\" \u003e\n\n\u0026nbsp; To assign a new paragraph, use the context menu or click on the sentence number on the left side. Alternatively, use the **Assign paragraphs** button in the **Words** tab to assign paragraphs after newline characters '\\n' in the text. \u003cbr/\u003e\n\u0026nbsp; To delete all newline characters and extra whitespace in the text, select the **Words** tab and press **Remove Whitespaces**. 
All **Sentence starts** are highlighted in rose and **whitespaces** in yellow.\u003cbr/\u003e \n\u003c!---![alt text](https://github.com/d5555/TagEditor/blob/master/pics/words.png)---\u003e\n\u003cimg src=\"https://github.com/d5555/TagEditor/blob/master/pics/words.png\" width=\"700\" \u003e\n\nPress the **Create DATA** button to create training data in \"simple training style\" or JSON. You can save it in text, json or spacy format ...\u003cbr/\u003e\n**Save project** for future editing. **Load project** to continue where you left off.\u003cbr/\u003e\nYou can also save and load your datasets in .spacy or .json format. \n\n\u003cimg src=\"https://github.com/d5555/TagEditor/blob/master/pics/MenuFile.png\" width=\"700\" \u003e\n\nIf you don't have a pretrained model for a given language, select the language from the list for proper tokenization: \n\n\u003cimg src=\"https://github.com/d5555/TagEditor/blob/master/pics/Menu_Mod.png\" width=\"450\" \u003e\n\n\u003c!--- \u003eTry **[NeuralGym](https://github.com/d5555/NeuralGym)** to train spaCy model with your training data. ---\u003e\n\n**Named Entities**\u003cbr/\u003e\nFirst click on a label in the TAG SET panel, then select the words in the main window that you want to assign the label to. To delete an assigned label, just click on it in the editor window. Create output data with char/token offsets or the BILUO / IOB scheme. Nested or overlapping tags are allowed if you use char/token offsets.\u003cbr/\u003e\nIf the option **NER search all** is on and you select a new span, the selected label will be assigned to all such spans found in the text.\u003cbr/\u003e\nThe **--Annotate--** option lets you switch models and annotate on top of your already annotated text in different modes. This way you can compare two (or more) different models or annotate text using several models in tandem. 
\u003cbr/\u003e\nThe **TAG SET** panel allows editing labels, adding new labels and their descriptions, and saving and uploading labels.  \n\n![alt text](https://github.com/d5555/TagEditor/blob/master/pics/ner1.png) \n\n\nPress the **Create data** button, select items, and save in **\\*.spacy, \\*.txt or \\*.json** format or print the data on the screen. If you assigned paragraphs, select **Manually assigned paragraphs**.\u003cbr/\u003e\n\u003c!--- ![alt text](https://github.com/d5555/TagEditor/blob/master/pics/create_data.png width=400) ---\u003e\n\u003cimg src=\"https://github.com/d5555/TagEditor/blob/master/pics/create_data.png\" width=\"450\" \u003e\n\n![alt text](https://github.com/d5555/TagEditor/blob/master/pics/data_onscreen.png)\n\n**POS tags**\u003cbr/\u003e\nIn this window you can edit POS tags (fine-grained) and also view coarse-grained POS tags and morphs.\u003cbr/\u003e\n![alt text](https://github.com/d5555/TagEditor/blob/master/pics/pos_pic.png)\n\n**Dependencies**\u003cbr/\u003e\nSelect a tag in the TAG SET panel, then click on a word in the editor window to assign the tag. Click on another word (token) to assign a head tag. Click on the word again to remove the tag.\n![alt text](https://github.com/d5555/TagEditor/blob/master/pics/dep.png)\n\n**Co-reference tagger**\u003cbr/\u003e\n\u0026nbsp;Coreference annotation follows the PreCo 'Data Format'.\u003cbr/\u003eThe dataset can be downloaded from here: https://github.com/d5555/Coreference-dataset\u003cbr/\u003eCompatible with **NeuralCoref 4.0**. To use NeuralCoref for annotating, select \"Enable NeuralCoref\" after 'Start tagging'. Set the 'greedyness' parameter to 0.55.\n\nhttps://preschool-lab.github.io/PreCo/\u003cbr/\u003e\nhttps://arxiv.org/abs/1810.09807\u003cbr/\u003e\n\"sentences\" is a list of sentences. Each sentence is a list of tokens. Each token is a string, which can be a word or a punctuation mark. \u003cbr/\u003e\n\"mention_clusters\" is a list of mention clusters. Each mention cluster is a list of mentions. 
Each mention is a tuple of integers [sentence_idx, begin_idx, end_idx]. Sentence_idx is the index of the sentence of the mention. Begin_idx is the index of the first token of the mention in the sentence. End_idx is the index of the last token of the mention in the sentence plus one. All indices are zero-based.\u003cbr/\u003e\n\u0026nbsp;\u0026nbsp;Select a word or a span of words in the editor window. It becomes a singleton (single entity) with no connection to other entities, framed with a dashed line. Then select another span. Every time you select an entity, it is highlighted with a green frame. While it is selected, click on another entity: the two will be linked together, highlighted in the same color, and given the same coref number (a number in the right corner of the frame). That simple! \u003cbr/\u003e\nTo deselect, just click on empty space in the main window.\u003cbr/\u003e\nTo unlink a span from the entity, select it and then click on it again. It will turn into a singleton. You can also use the table on the right side. If the text is long and you don't want to scroll, just click on an entity in the table to link spans. Entities that are not singletons are added to the table automatically, though you can add singletons too. The entity color can be changed, except for singletons. \u003cbr/\u003e\nYou can load data from the PreCo dataset into TagEditor directly. Unzip the PreCo dataset, run TagEditor, and select the menu **File-\u003eLoad PreCO/Coref-\u003e(select file)**. 
You can test it with the file **coref_example.jsonl** \u003cbr/\u003e\n![alt text](https://github.com/d5555/TagEditor/blob/master/pics/corefpic.png)\n![alt text](https://github.com/d5555/TagEditor/blob/master/pics/coref_annot.png)\n\n**Text Categories**\u003cbr/\u003e\nhttps://spacy.io/api/textcategorizer\u003cbr/\u003e\nIn the Text Categories window you can assign labels to paragraphs, sentences, or spans (see below).\u003cbr/\u003e\n\u0026nbsp; Select the score in the TAG SET panel, True or False (i.e. 1.0 or 0.0), and select a category label. Go to the editor window and click on a sentence. The category and score will be added. You can easily **switch the score True/False** by clicking on the score label in the editor window. Multiple, non-mutually exclusive labels are supported.\u003cbr/\u003e\nUse the **Assign/unassign all** check button to assign or unassign all labels for all sentences in one click. You can then manually change the True/False status of each label by clicking on the label, or delete the label in the editor window.\u003cbr/\u003e\nFor demo purposes, the text classifier of this tool was trained on the IMDB dataset with the labels 'POSITIVE NEGATIVE'. The partial dataset converted to spaCy format can be downloaded from here:\u003cbr/\u003e\nhttps://github.com/d5555/textcat_dataset_IMDB\u003cbr/\u003e\n\nIf you have a model with a pretrained text classifier and want to classify paragraphs (instead of sentences or the whole text), close the **Text Categories** window, assign paragraphs (because by default the whole text is one paragraph), then press **tools** and, in the **Text Categories** tab, select **Paragraph**. \u003cbr/\u003e\n![alt text](https://github.com/d5555/TagEditor/blob/master/pics/cats.png)\n![alt text](https://github.com/d5555/TagEditor/blob/master/pics/cat_data.png)\n\n'Spans classification mode' allows **multiple overlapping labels**. It can be used as an **all-purpose text tagger** with the data format (index of first token, index of last token+1, label name). 
Indices are zero-based.\u003cbr/\u003e\n\u003c!---![alt text](https://github.com/d5555/TagEditor/blob/master/pics/spansclass.png) ---\u003e\n\u003cimg src=\"https://github.com/d5555/TagEditor/blob/master/pics/spansclass.png\" width=\"700\" \u003e\n\n\u003c!--- _Try **[NeuralGym](https://github.com/d5555/NeuralGym)** to train spaCy model with your training data._ \u003cbr/\u003e  ---\u003e\n\u003eTo use your own pretrained models or other spaCy models with TagEditor, acquire the full version of TagEditor. Please contact gitprojects5@gmail.com \n#### *If you have any suggestions for improving the program or adding extra features, feel free to leave a comment or email gitprojects5@gmail.com\n\u003c!---**************\n### Extended version\nNeed help, found a bug or you would like get the extended version? [**New issue**](https://github.com/d5555/TagEditor/issues/new) or contact us at gitprojects5@gmail.com\n**************\ngitprojects5@gmail.com \n\n\n[![Donate](https://img.shields.io/badge/Donate-PayPal-green.svg)](https://paypal.me/d5555)\u003cbr/\u003e---\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fd5555%2Ftageditor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fd5555%2Ftageditor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fd5555%2Ftageditor/lists"}