https://github.com/centre-for-humanities-computing/danish_literary_sentiment
- Host: GitHub
- URL: https://github.com/centre-for-humanities-computing/danish_literary_sentiment
- Owner: centre-for-humanities-computing
- License: mit
- Created: 2024-05-17T18:36:55.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-06-12T10:54:31.000Z (about 1 year ago)
- Last Synced: 2025-09-09T07:17:06.514Z (9 months ago)
- Language: Python
- Size: 7.11 MB
- Stars: 1
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# The Fiction2 Danish Literature Corpus

## 🌡️📖 Danish literary sentiment
This repository holds the data for comparing sentiment analysis methods on Danish literature, specifically fairy tales and religious hymns of the 19th century.
Our study compares human annotations to the continuous valence scores of both transformer- and dictionary-based sentiment analysis methods to assess their performance, seeking to understand how distinct methods handle the language of Danish prose and poetry.
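
To illustrate what a continuous valence score from a dictionary-based method can look like, here is a minimal sketch. It assumes the common approach of looking up each word in a valence lexicon and averaging the matches; the lexicon, its values, and the function names are invented for illustration and are not the tools or scores used in the paper.

```python
import re

# Hypothetical toy lexicon (word -> valence). The values are made up for
# illustration and do not come from any real Danish sentiment dictionary.
TOY_LEXICON = {"glad": 3.0, "smuk": 2.0, "sorg": -3.0, "kold": -1.0}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def dictionary_valence(sentence, lexicon=TOY_LEXICON):
    """Mean valence of the words found in the lexicon; 0.0 if none match."""
    hits = [lexicon[token] for token in tokenize(sentence) if token in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


# Hypothetical example sentences, scored one at a time.
sentences = [
    "Hun var glad og smuk.",
    "Det var en kold nat fuld af sorg.",
    "Han gik hjem.",
]
scores = [dictionary_valence(s) for s in sentences]
```

Averaging rather than summing keeps scores comparable across sentences of different lengths; real dictionary-based tools may aggregate differently.
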
## 🔎 What is included
- Original and modernized Danish text
- Continuous valence annotation (0-10) by human annotators (n=2-3) per sentence/verse
- Automatic annotation scores per sentence/verse (using dictionary- and transformer-based Sentiment Analysis tools)

This data allows for the comparison of human/human and human/model sentiment scoring on Danish literary texts.
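
The human/human and human/model comparisons can be sketched as rank correlations over per-sentence scores. The example below is only an illustration: the annotator names, the toy scores, and the choice of Spearman correlation are assumptions, not the repository's actual column names or necessarily the measure used in the paper. It relies on the standard library only.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: one score per sentence/verse, humans on a 0-10 scale.
annotators = {
    "annotator_1": [2.0, 5.0, 7.5, 9.0, 4.0],
    "annotator_2": [3.0, 5.5, 7.0, 8.5, 6.0],
}
model_scores = [-0.8, 0.1, 0.6, 0.9, -0.2]  # hypothetical model output

# Human/human: correlation for every pair of annotators.
human_human = {
    (a, b): spearman(annotators[a], annotators[b])
    for a, b in combinations(annotators, 2)
}

# Human/model: correlate the mean human score per sentence with the model.
human_mean = [
    sum(scores[i] for scores in annotators.values()) / len(annotators)
    for i in range(len(model_scores))
]
human_model = spearman(human_mean, model_scores)
```

A rank-based measure is convenient here because human annotations (0-10) and model outputs need not share a scale.
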

## 🔬 Data
We use two datasets: (i) H.C. Andersen fairy tales (HCA) and (ii) religious hymns.

| | No. texts | No. annotations | No. words | Mean no. verses/sents per text | Period |
|-------------|-----|------|--------|--------------|------------|
| **HCA** | 3 | 791 | 18,910 | 263.7 | 1837-1847 |
| **Hymns** | 65 | 1,914 | 10,303 | 32.9 | 1798-1873 |
## 📖 Documentation
Code for the **hymns** and **fairy tales** analyses (run separately), covering annotator agreement and human/model correlations, is available in this folder. The SHAP values analysis of RoBERTa scores is available in [a separate GitHub repository](https://github.com/centre-for-humanities-computing/fabula-shap).

| | |
| --------------------------- | --------------------------------------------------------------------------------- |
| 📄 **[Paper]** | Link to our paper comparing SA resources on Danish literary texts. |
| 🔬 **[CHC]** | Center for Humanities Computing, hosting the project. |
| ✉️ **[Contact]** | Contact the authors. |
[Paper]: https://aclanthology.org/2024.wassa-1.15/
[CHC]: https://chc.au.dk
[Contact]: mailto:pascale.moreira@cc.au.dk