https://github.com/allegro/klejbenchmark-allegroreviews
Allegro Reviews is a sentiment analysis dataset, consisting of 11,588 product reviews written in Polish and extracted from Allegro.pl - a popular e-commerce marketplace.
- Host: GitHub
- URL: https://github.com/allegro/klejbenchmark-allegroreviews
- Owner: allegro
- Created: 2020-05-13T13:49:39.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-07-07T12:55:34.000Z (almost 5 years ago)
- Last Synced: 2025-04-11T23:49:23.771Z (3 months ago)
- Size: 2.93 KB
- Stars: 10
- Watchers: 4
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Allegro Reviews
**Allegro Reviews** is a sentiment analysis dataset, consisting of 11,588 product reviews written in Polish and extracted from [Allegro.pl](https://allegro.pl) - a popular e-commerce marketplace. Each review contains at least 50 words and has a rating on a scale from one (negative review) to five (positive review).

We recommend using the provided train/dev/test split. The ratings for the test set reviews are kept hidden. You can evaluate your model using the online evaluation tool available on [klejbenchmark.com](https://klejbenchmark.com/).
The dataset can be downloaded from [here](https://klejbenchmark.com/static/data/klej_ar.zip).
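For orientation, here is a minimal loading sketch. The file and column names are assumptions (e.g. `train.tsv` and `dev.tsv` with `text` and `rating` columns), so verify them against the contents of the extracted archive.

```python
import pandas as pd

# Assumed layout of the extracted klej_ar.zip: tab-separated files with
# 'text' and 'rating' columns. Check the actual file names after extraction.
train = pd.read_csv("klej_ar/train.tsv", sep="\t")
dev = pd.read_csv("klej_ar/dev.tsv", sep="\t")

print(len(train), len(dev))
# Rating distribution on the 1-5 scale (shows the slight class imbalance).
print(train["rating"].value_counts().sort_index())
```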
## Evaluation
To counter the slight class imbalance in the dataset, we propose to evaluate models using wMAE, i.e. the macro-average of the mean absolute error per class. Additionally, we transform the ratings to be between zero and one and report 1 − wMAE to obtain the final score. A Python implementation of the proposed metric is given below.
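Spelled out (the notation below is ours, not taken from the paper: $C$ is the set of rating classes and $S_c$ the set of test examples whose true rating is $c$):

$$
\mathrm{wMAE} = \frac{1}{|C|} \sum_{c \in C} \frac{1}{|S_c|} \sum_{i \in S_c} \left| \frac{y_i - 1}{4} - \frac{\hat{y}_i - 1}{4} \right|,
\qquad \text{AR score} = 1 - \mathrm{wMAE}.
$$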
```python
import pandas as pd
from sklearn.metrics import mean_absolute_error


def ar_score(y_true, y_pred):
    # Rescale the 1-5 ratings to the [0, 1] range.
    ds = pd.DataFrame({
        'y_true': (y_true - 1.0) / 4.0,
        'y_pred': (y_pred - 1.0) / 4.0,
    })
    # Macro-average of the mean absolute error per true-rating class.
    wmae = ds \
        .groupby('y_true') \
        .apply(lambda df: mean_absolute_error(df['y_true'], df['y_pred'])) \
        .mean()
    return 1 - wmae
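
# Toy usage example (our addition, not part of the original snippet):
# five reviews with true ratings and predictions on the 1-5 scale.
example_true = pd.Series([1, 2, 3, 4, 5], dtype=float)
example_pred = pd.Series([1, 3, 3, 4, 4], dtype=float)
print(ar_score(example_true, example_pred))  # -> 0.9 for this toy input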
```

## Results
| Model | AR Score |
| ----------------- | --------: |
| [ELMo](https://clarin-pl.eu/dspace/handle/11321/690) | **86.15** |
| [Multilingual BERT](https://github.com/google-research/bert/blob/master/multilingual.md) | 83.33 |
| [Slavic BERT](https://github.com/deepmipt/Slavic-BERT-NER) | 84.31 |
| [XLM-17](https://github.com/facebookresearch/XLM/#the-17-and-100-languages) | 84.52 |
| [HerBERT](https://github.com/allegro/HerBERT) | 84.48 |

## License
CC BY-SA 4.0

## Citation
If you use this dataset, please cite the following paper:

```
@inproceedings{rybak-etal-2020-klej,
title = "{KLEJ}: Comprehensive Benchmark for Polish Language Understanding",
author = "Rybak, Piotr and Mroczkowski, Robert and Tracz, Janusz and Gawlik, Ireneusz",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.acl-main.111",
pages = "1191--1201",
}
```

## Authors
The dataset was created by the **Allegro Machine Learning Research** team.

You can contact us at: [email protected]