awesome-human-label-variation
A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, accompanying the paper The "Problem" of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation (EMNLP 2022).
https://github.com/mainlp/awesome-human-label-variation
The "Problem" of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation
:mortar_board: Citing
Human Label Variation - Related Initiatives and further reading
Initiatives, Evaluation Campaigns and Workshops
- SemEval 2023 Shared Task 11 on Learning with Disagreement (Le-Wi-Di) - on-going!
- SemEval 2021 Shared Task 11 on Learning with Disagreement
- Perspectivist Data Manifesto (PDAI) - non-aggregated datasets
- [2nd Workshop on Perspectivist Approaches to NLP, co-located with ECAI 2023](https://nlperspectives.di.unito.it/w/2nd-workshop-on-perspectivist-approaches-to-nlp/)
Survey and Key Selected References
- Learning from Disagreement: A Survey
- Learning part-of-speech taggers with inter-annotator agreement loss. Uses non-aggregated data to improve performance on morphosyntactic NLP tasks. Inspired follow-up work such as [Linguistically debatable or just plain wrong?](https://aclanthology.org/P14-2083/) (ACL 2014), an analysis of the systematicity of annotator agreement on objective linguistic annotation tasks (POS tagging).
- Truth is a lie: Crowd truth and the seven myths of human annotation
- Inherent Disagreements in Human Textual Inferences. Inspired follow-up work such as dataset re-annotation studies like [ChaosNLI](https://github.com/easonnie/ChaosNLI) by Nie et al., 2020, and later work on embracing the collective human opinion for NLI.
- Subjective Natural Language Problems: Motivations, Applications, Characterizations, and Implications
- Toward a Perspectivist Turn in Ground Truthing for Predictive Computing. Inspired follow-up work on subjective tasks (see e.g. the Le-Wi-Di 2023 shared task).
- Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations
- Investigating Reasons for Disagreement in Natural Language Inference
- Everyone’s Voice Matters: Quantifying Annotation Disagreement Using Demographic Information
- The Disagreement Deconvolution: Bringing Machine Learning Performance Metrics In Line With Reality. From the Human-Computer Interaction (CHI) conference.
:bar_chart: Datasets
- Nguyen et al., 2017 - data | |
- Zhang et al., 2022 - OEI | |
- Grubenmann et al., 2018 - CH | |
- Ji et al., 2022 - lab/kilogram | |
- Kennedy et al., 2020
- Haber et al., 2023 - https://github.com/rewire-online/singapore-online-attacks/tree/main | |
- Liu et al., 2022
- Frermann et al., 2023 - label frame annotations of 428 news articles, each labeled by 2-3 annotators | https://github.com/phenixace/narrative-framing/tree/main/data | |
- Sap et al., 2020 - bias-frames/ | :small_orange_diamond: |
- Fleisig et al., 2023 - Related Harms in Generated Text (3 annotators) | https://github.com/microsoft/FairPrism | |
- Forbes et al., 2020 - chemistry-101 | :small_orange_diamond: |
- Lourie et al., 2021 - dilemmas: A Corpus of Community Ethical Judgments (with 5 crowd annotations per instance) | https://github.com/allenai/scruples | :small_orange_diamond: |
- Potts et al., 2021 - Sentiment (5 crowd annotations) | https://github.com/cgpotts/dynasent | :small_orange_diamond: |
- Danescu-Niculescu-Mizil et al. 2013 - Annotation-Disagreement | :small_orange_diamond: |
- Madeddu et al., 2023