{"id":24491276,"url":"https://github.com/lcvriend/humannotator","last_synced_at":"2026-02-20T01:02:08.692Z","repository":{"id":62569795,"uuid":"200514455","full_name":"lcvriend/humannotator","owner":"lcvriend","description":"Library for creating annotator tools for Python/Jupyter","archived":false,"fork":false,"pushed_at":"2020-01-07T23:00:01.000Z","size":383,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-01-28T05:23:56.316Z","etag":null,"topics":["annotation-tool","annotator","data-science","dataframe","jupyter","pandas","python","text-classification"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lcvriend.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-08-04T16:12:06.000Z","updated_at":"2020-01-07T22:32:01.000Z","dependencies_parsed_at":"2022-11-03T17:15:31.601Z","dependency_job_id":null,"html_url":"https://github.com/lcvriend/humannotator","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/lcvriend/humannotator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lcvriend%2Fhumannotator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lcvriend%2Fhumannotator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lcvriend%2Fhumannotator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lcvriend%2Fhumannotator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lcvriend"
,"download_url":"https://codeload.github.com/lcvriend/humannotator/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lcvriend%2Fhumannotator/sbom","scorecard":{"id":581224,"data":{"date":"2025-08-11","repo":{"name":"github.com/lcvriend/humannotator","commit":"ba46928d90d8db0a123b5b500dd7cd6787d5a19d"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":2.7,"checks":[{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev 
branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security 
policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE.txt:0","Info: FSF or OSI recognized license: GNU General Public License v3.0: LICENSE.txt:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":0,"reason":"Project has not signed or included provenance with any releases.","details":["Warn: release artifact v0.0.3 not signed: https://api.github.com/repos/lcvriend/humannotator/releases/22474635","Warn: release artifact v0.0.2 not signed: https://api.github.com/repos/lcvriend/humannotator/releases/21206862","Warn: release artifact v0.0.1 not signed: https://api.github.com/repos/lcvriend/humannotator/releases/20495024","Warn: release artifact v0.0.3 does not have provenance: https://api.github.com/repos/lcvriend/humannotator/releases/22474635","Warn: release artifact v0.0.2 does not have provenance: https://api.github.com/repos/lcvriend/humannotator/releases/21206862","Warn: release artifact v0.0.1 does not have provenance: 
https://api.github.com/repos/lcvriend/humannotator/releases/20495024"],"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}}]},"last_synced_at":"2025-08-20T19:13:33.243Z","repository_id":62569795,"created_at":"2025-08-20T19:13:33.243Z","updated_at":"2025-08-20T19:13:33.243Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29637917,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-19T22:32:43.237Z","status":"ssl_error","status_checked_at":"2026-02-19T22:32:38.330Z","response_time":117,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["annotation-tool","annotator","data-science","dataframe","jupyter","pandas","python","text-classification"],"created_at":"2025-01-21T18:17:40.018Z","updated_at":"2026-02-20T01:02:08.670Z","avatar_url":"https://github.com/lcvriend.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Humannotator\n\n**Library for conveniently creating simple customizable annotators \nfor manual annotation of your data**  \n*Jenia Kim, Lawrence Vriend*\n\nWorks well with Jupyter notebooks:\n\n[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/lcvriend/humannotator/master?filepath=examples%2Fexamples.ipynb)\n\n## Use case\n\nThe humannotator provides an easy way to set up custom annotators.\nThis tool is for you if manual annotation is part of your workflow \nand you are looking for a solution that is:\n\n- Lightweight\n- Customizable\n- Easy to set up\n- Integrates with Jupyter/pandas/Python\n\n## Quick start\n\n### Install the humannotator\n\nInstall with conda:\n\n```\n    conda install -c lcvriend humannotator\n```\n\nOr use pip:\n\n```\n    pip install humannotator\n```\n\n### Create a simple annotator\n\n1. [Load the data](#load-data)\n2. [Define the tasks](#define-tasks)\n3. 
[Instantiate the annotator](#annotator)\n\n```Python\n    import pandas as pd\n    from humannotator import Annotator\n\n    # load data\n    df = pd.read_csv('examples/popcorn_classics.csv', sep=';', index_col=0)\n\n    # set up the annotator\n    ratings = [\n        'One bag',\n        'Two bags',\n        'Three bags',\n        'Four bags',\n        'Five-bagger',\n    ]\n    annotator = Annotator(df, name='VFA | Rate my popcorn classics')\n    annotator.tasks['Bags of popcorn'] = ratings\n\n    # run annotator\n    annotator(user='GT')\n```\n\nIn Jupyter this gives:\n\n\u003cimg src=\"examples/popcorn_classics.png\" alt=\"Humannotator\" width=\"726\"\u003e\n\n### Annotate your data\n\n- Use the annotator by calling it: `annotator()`.\n- The annotator keeps track of where you were.\n- Highlight phrases with the `phrases` argument.\n- The annotator stores user (if provided) and timestamp with the annotation.\n\n### Access your annotations\n\n- The annotations are conveniently stored in a pandas `DataFrame`.\n- Access the annotations with the `annotated` attribute.\n- Get the indices of the records without annotation with `unannotated`.\n- Return the data merged with its annotations with the `merged` method.\n\n### Store your annotations\n\n- Store the annotator with the `save` method.\n- Load the annotator with the `load` method.\n\n## Load data\n\nThe annotator accepts `list`, `dict`, `Series` and `DataFrame` objects as data.  \nThe data will be converted to a dataframe internally.\n\n### Dataframes\n\n- By default, the annotator will use the dataframe's `index` and all `columns`.\n- Use `load_data` to easily create a `data` object if you need more control:\n    1. `id_col` sets the column to be used as index.\n    2. 
`item_cols` sets the column or columns to be displayed.\n\n## Define tasks\n\nTasks can be set up through subscription or with the `task_factory`.\n\n### Setting up tasks with the task factory\n\nCreate a task by passing `task_factory`:\n\n- the `kind` of task\n- the `name` of the task\n- (optionally) an `instruction`\n- (optionally) a list of `dependencies`\n- whether it is `nullable` (default is False)\n- any [kwargs](#Available-tasks) necessary (depends on the kind of task)\n\nTypically:\n\n```Python\n    task_factory(\n        'kind',\n        'name',\n        instruction='instruction',\n        dependencies=dependencies,\n        nullable=True/False,\n        **kwargs,\n    )\n```\n\nPassing a dict or list to `kind` will create a categorical task.  \nIn this case the `categories` kwarg is ignored.\n\n### Setting up tasks through subscription\n\nIt is also possible to instantiate an annotator and add tasks through subscription:  \n\n```Python\n    a = Annotator()\n    a.tasks['topic'] = ['economy', 'politics', 'media', 'other']\n    a.tasks['factual'] = bool, \"Is the article factual?\", False\n```\n\nTo add a task like this, you minimally need to provide the `kind` of task you are trying to create.\nOptionally, you can add `instruction`, `nullability`, `dependencies` and any other kwargs (as a dictionary).\nChange the order in which tasks are prompted to the user with the `order` attribute on `tasks`.\n\n### Available tasks\n\nkind      | kwargs     | dtype            | description\n--------- | -----------| ---------------- | ----------------\nstr       |            | object           | String\nregex     | regex      | object           | String validated by regex\nint       |            | Int64            | Nullable integer\nfloat     |            | float64          | Float\nbool      |            | bool             | Boolean\ncategory  | categories | CategoricalDtype | Categorical variable\ndate      |            | datetime64[ns]   | Date\n\n### 
Dependencies\n\nDependencies consist of a *condition* and a *value*, which can be passed as a tuple:\n\n```Python\n    (\"col1 == 'x'\", False)\n```\n\nThe condition is a [pandas query statement](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html#pandas.DataFrame.query).\nBefore prompting the user for input, the condition is evaluated on the current annotation.\nIf the query evaluates to True then the value will be assigned automatically.\n\n## Annotator\n\n### Calling the annotator\n\nThe annotator detects if it is run from Jupyter.\nIf so, the annotator will render itself in HTML and CSS.\nIf not, the annotator will render itself as text.\nYou can annotate a selection of records by passing a list of ids to the annotator call. If you want to reannotate ids that have already been annotated, then set `redo` to True when calling the annotator.\n\n### Instantiating the annotator\n\n\u003e arguments\n\u003e ---------\n\u003e tasks : *Task, list of Task objects, Tasks, Annotations or DataFrame*\n\u003e\n\u003e     Annotation task(s).\n\u003e     If passed a DataFrame, then the tasks will be inferred from it.\n\u003e     Annotation data in the dataframe will also be initialized.\n\u003e\n\u003e data : *data, list-/dict-like, Series or DataFrame, default None*  \n\u003e\n\u003e     Data to be annotated.\n\u003e     If `data` is not already a data object,\n\u003e     then it will be passed through `load_data`.\n\u003e     The annotator can be instantiated without data,\n\u003e     but will only work after data is loaded.\n\u003e\n\u003e user : *str, default None*  \n\u003e\n\u003e     Name of the user.\n\u003e\n\u003e name : *str, default 'HUMANNOTATOR'*  \n\u003e\n\u003e     Name of the annotator.\n\u003e\n\u003e save_data : *boolean, default False*  \n\u003e\n\u003e     Set flag to True if you want to store the data with the annotator.\n\u003e     This will ensure that the pickled object will contain the data.\n\u003e \n\u003e other 
parameters\n\u003e ----------------\n\u003e **DISPLAY**  \n\u003e text_display : *boolean, default None*  \n\u003e\n\u003e     If True will display the annotator in plain text instead of html.\n\u003e\n\u003e **HTML**  \n\u003e\n\u003e markdown : *boolean, default {markdown}*\n\u003e\n\u003e      If True will pass values through markdown before rendering.\n\u003e\n\u003e markdown_extensions : *list, default {markdown_extensions}*\n\u003e\n\u003e      List of markdown extensions to apply.\n\u003e\n\u003e escape_html : *boolean, default {escape_html}*\n\u003e\n\u003e     If true will escape html content within items.\n\u003e\n\u003e maxheight : *str, default '{maxheight_items}'*\n\u003e\n\u003e     Max height before item gets y-scroll bar.\n\u003e     Set to None to have no maximum.\n\u003e\n\u003e **DATA**  \n\u003e item_cols : *str or list of str, default None*  \n\u003e\n\u003e     Name(s) of dataframe column(s) to display when annotating.\n\u003e     By default: display all columns.\n\u003e\n\u003e id_col : *str, default None*  \n\u003e\n\u003e     Name of dataframe column to use as index.\n\u003e     By default: use the dataframe's index.\n\u003e \n\u003e **HIGHLIGHTER**  \n\u003e phrases : *str, list of str, default None*  \n\u003e\n\u003e     Phrases to highlight in the display.\n\u003e     The phrases can be regexes.\n\u003e     It is also possible to pass in a dict where:\n\u003e     - the keys are the phrases\n\u003e     - the values are the css styling\n\u003e\n\u003e escape : *boolean, default False*  \n\u003e\n\u003e     Set escape to True in order to escape the phrases.\n\u003e\n\u003e flags : *int, default 0 (no flags)*  \n\u003e\n\u003e     Flags to pass through to the re module, e.g. 
re.IGNORECASE.\n\u003e \n\u003e **TRUNCATER**  \n\u003e truncate : *boolean, default {truncate}*  \n\u003e\n\u003e     Set to False to not truncate items.\n\u003e\n\u003e trunc_limit : *int, default {truncate_word_limit}*  \n\u003e\n\u003e     The number of words beyond which an item will be truncated.\n\u003e\n\nThe module contains a [configuration file](humannotator/config.ini) in which some of the default behaviour of the humannotator can be configured.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flcvriend%2Fhumannotator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flcvriend%2Fhumannotator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flcvriend%2Fhumannotator/lists"}