https://github.com/dinhanhx/performance_calculation_tool_for_hm
Performance calculation tool for Hateful Memes Challenge
- Host: GitHub
- URL: https://github.com/dinhanhx/performance_calculation_tool_for_hm
- Owner: dinhanhx
- Created: 2021-08-08T14:06:30.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2021-08-08T14:26:40.000Z (almost 5 years ago)
- Last Synced: 2025-01-28T23:50:04.154Z (over 1 year ago)
- Topics: accuracy-score, auc-roc-score, cli, dataset, hateful-memes-challenge, python-3, python3
- Language: Python
- Homepage:
- Size: 610 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Performance calculation tool for [Hateful Memes Challenge](https://hatefulmemeschallenge.com/)
This simple project uses basic functions from [pretty errors](https://pypi.org/project/pretty-errors/), [click](https://click.palletsprojects.com/en/8.0.x/), [pandas](https://pandas.pydata.org/getting_started.html), and [sklearn](https://scikit-learn.org/stable/install.html). Follow these links for installation instructions. The project works with Python 3.7.
`calc_test.py` calculates the AUC ROC and accuracy scores. Run `python calc_test.py --help` for usage instructions, or read the source code, which is very short.
`calc_test.py` takes `test_seen.jsonl` ([Phase 1](https://www.drivendata.org/competitions/64/hateful-memes/page/206/)) or `test_unseen.jsonl` ([Phase 2](https://www.drivendata.org/competitions/70/hateful-memes-phase-2/page/267/)) **and** `result.csv`. Importantly, `test_seen.jsonl` and `test_unseen.jsonl` must have labels.
`result.csv` must have three columns:
- Meme identification number, `id`
- Probability that the meme is hateful, `proba` (must be a float)
- Binary label that the meme is hateful (`1`) or non-hateful (`0`), `label` (must be an int)
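The scoring step described above can be sketched roughly as follows. This is not the code in `calc_test.py`; it is a minimal illustration that assumes the labeled `.jsonl` file has `id` and `label` fields and that `result.csv` has the three columns listed above. The function name `score` and the in-memory sample data are hypothetical.

```python
import io

import pandas as pd
from sklearn.metrics import accuracy_score, roc_auc_score


def score(labels_jsonl, result_csv):
    """Return AUC ROC and accuracy for predictions matched to labels by id.

    Both arguments may be file paths or file-like objects.
    """
    # Ground truth: one JSON object per line, with at least `id` and `label`.
    truth = pd.read_json(labels_jsonl, lines=True)[["id", "label"]]
    # Predictions: columns `id`, `proba`, `label`.
    pred = pd.read_csv(result_csv)
    # Join on `id` so row order in the two files does not matter.
    merged = truth.merge(pred, on="id", suffixes=("_true", "_pred"))
    return {
        # AUC ROC is computed from the predicted probabilities.
        "auc_roc": roc_auc_score(merged["label_true"], merged["proba"]),
        # Accuracy is computed from the predicted binary labels.
        "accuracy": accuracy_score(merged["label_true"], merged["label_pred"]),
    }


# Tiny made-up example, kept in memory instead of on disk.
sample_labels = io.StringIO(
    '{"id": 1, "label": 0}\n{"id": 2, "label": 1}\n'
    '{"id": 3, "label": 0}\n{"id": 4, "label": 1}\n'
)
sample_result = io.StringIO(
    "id,proba,label\n1,0.1,0\n2,0.9,1\n3,0.7,1\n4,0.6,1\n"
)
print(score(sample_labels, sample_result))
```

Merging on `id` rather than relying on row order is a design choice of this sketch; the actual script may match rows differently.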
## Other scripts
`combine.py` combines `train.jsonl`, `dev_seen.jsonl`, `dev_unseen.jsonl`, `test_seen.jsonl`, and `test_unseen.jsonl` into `data_test.jsonl`. Importantly, `test_seen.jsonl` and `test_unseen.jsonl` must have labels. The resulting `data_test.jsonl` contains the metadata of every meme in the dataset.
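A rough sketch of how such a combination could be done is shown below. It is not the code in `combine.py`: the function name `combine_jsonl` is hypothetical, and dropping records whose `id` was already written is an assumption of this sketch, since the README does not say how repeated memes are handled.

```python
import json


def combine_jsonl(paths, out_path):
    """Concatenate JSON Lines files into one file and return the record count.

    Records whose `id` was already written are skipped (an assumption of
    this sketch, not documented behaviour of combine.py).
    """
    seen_ids = set()
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            with open(path, encoding="utf-8") as src:
                for line in src:
                    line = line.strip()
                    if not line:
                        continue  # ignore blank lines
                    record = json.loads(line)
                    if record["id"] in seen_ids:
                        continue  # same meme already copied from another split
                    seen_ids.add(record["id"])
                    out.write(json.dumps(record) + "\n")
                    written += 1
    return written
```

A call might look like `combine_jsonl(["train.jsonl", "dev_seen.jsonl", "dev_unseen.jsonl", "test_seen.jsonl", "test_unseen.jsonl"], "data_test.jsonl")`, using the file names given above.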