https://github.com/chkla/nlp-standards
🗣 Talks on NLP checklists and sheets to standardize the development pipeline for transparency and accountability @ TUMunich & TADA
- Host: GitHub
- URL: https://github.com/chkla/nlp-standards
- Owner: chkla
- Created: 2022-05-03T20:43:09.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-07-02T13:55:27.000Z (over 1 year ago)
- Last Synced: 2025-01-27T07:16:33.654Z (12 months ago)
- Topics: benchmark, datasets, evaluation, modeling, nlp-machine-learning, presentation-slides
- Size: 8.78 MB
- Stars: 6
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
This repository contains the slide decks from my talks at the _Technical University of Munich_ (24.10.22) and our _text-as-data reading group (TADA.cool)_ (25.10.22). The talks give a brief overview of the NLP development pipeline (📚 datasets -> 🤖 models -> 📊 evaluation) and present several templates for standardizing basic steps in NLP research in the interest of transparency and accountability:
* **The Model Openness Framework** ([White et al. 2024](https://arxiv.org/pdf/2403.13784))
* **Model Risk Card** ([Derczynski et al. 2023](https://arxiv.org/abs/2303.18190))
* **Data Statements** ([Bender & Friedman 2018](https://aclanthology.org/Q18-1041.pdf))
* **Datasheets** ([Gebru et al. 2021](https://arxiv.org/pdf/1803.09010.pdf)) --> possible extensions [Heger et al. (2022)](https://www.semanticscholar.org/reader/b43e2d429f6a2f52336c9749651f34d354062418)
* **Responsible Data Use Checklist** ([Rogers, Baldwin & Leins 2021](https://aclanthology.org/2021.findings-emnlp.414.pdf))
* **Model Cards** ([Mitchell et al. 2019](https://arxiv.org/pdf/1810.03993.pdf)) --> interactive model cards ([Crisan et al. 2022](https://arxiv.org/abs/2205.02894))
* **Experimental Results Checklist** ([Dodge et al. 2019](https://arxiv.org/pdf/1909.03004.pdf))
* **Benchmark Checklist** ([Reimers 2022](https://nils-reimers.de/talks/2022_03_Chasing_Wrong_Benchmarks.zip))
* **Benchmark Checklist for Reviewers** ([Dehghani et al. 2021](https://arxiv.org/pdf/2107.07002.pdf), based on [Gebru et al. 2021](https://arxiv.org/pdf/1803.09010.pdf))
* **Framework for Algorithmic Auditing** ([Raji et al. 2020](https://www.semanticscholar.org/reader/0412076e1004d030ac02de77bc44cc7d92b13ab9))
* **Dataset Development Lifecycle** ([Hutchinson et al. 2020](https://www.semanticscholar.org/reader/27ad3d92a9d02698ae10be1a86f1f6e52c8f0644))
* **CheckList** ([Ribeiro et al. 2020](https://github.com/marcotcr/checklist))
* **Preregistering NLP research** ([Van Miltenburg et al. 2021](https://aclanthology.org/2021.naacl-main.51/))
* **Ethical Guidelines** ([Pistilli et al. 2023](https://huggingface.co/blog/ethics-diffusers))
* **Self-contained Artifacts** ([Arvan et al. 2022](https://aclanthology.org/2022.emnlp-main.150.pdf))
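To make one of the templates above concrete, here is a minimal sketch of how the model-card structure proposed by Mitchell et al. (2019) could be filled in and rendered programmatically. The section names follow the paper; the model name, the rendering function, and its content are illustrative assumptions, not part of any of the cited frameworks.

```python
# Sketch: render a model card (section names per Mitchell et al. 2019)
# as Markdown, flagging sections that have not been documented yet.
# The model and its details below are hypothetical examples.

MODEL_CARD_SECTIONS = [
    "Model Details",
    "Intended Use",
    "Factors",
    "Metrics",
    "Evaluation Data",
    "Training Data",
    "Quantitative Analyses",
    "Ethical Considerations",
    "Caveats and Recommendations",
]

def render_model_card(title: str, content: dict) -> str:
    """Render filled-in sections to Markdown; mark missing sections as TODO."""
    lines = [f"# Model Card: {title}", ""]
    for section in MODEL_CARD_SECTIONS:
        lines.append(f"## {section}")
        lines.append(content.get(section, "_TODO: not yet documented._"))
        lines.append("")
    return "\n".join(lines)

card = render_model_card(
    "toy-sentiment-classifier",  # hypothetical model name
    {
        "Model Details": "Logistic regression over TF-IDF features.",
        "Intended Use": "Teaching demo only; not intended for production use.",
    },
)
print(card)
```

Emitting an explicit TODO for every undocumented section keeps gaps visible, which is the point these checklists and sheets share: missing documentation should be conspicuous rather than silently absent.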