https://github.com/dinhanhx/performance_calculation_tool_for_hm

Performance calculation tool for Hateful Memes Challenge

accuracy-score auc-roc-score cli dataset hateful-memes-challenge python-3 python3

Host: GitHub
URL: https://github.com/dinhanhx/performance_calculation_tool_for_hm
Owner: dinhanhx
Created: 2021-08-08T14:06:30.000Z (almost 5 years ago)
Default Branch: main
Last Pushed: 2021-08-08T14:26:40.000Z (almost 5 years ago)
Last Synced: 2025-01-28T23:50:04.154Z (over 1 year ago)
Topics: accuracy-score, auc-roc-score, cli, dataset, hateful-memes-challenge, python-3, python3
Language: Python
Homepage:
Size: 610 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Performance calculation tool for [Hateful Memes Challenge](https://hatefulmemeschallenge.com/)

This project uses a few simple functions from [pretty errors](https://pypi.org/project/pretty-errors/), [click](https://click.palletsprojects.com/en/8.0.x/), [pandas](https://pandas.pydata.org/getting_started.html), and [sklearn](https://scikit-learn.org/stable/install.html). Follow the installation instructions on each linked page. The project works with Python 3.7.

`calc_test.py` calculates the AUC ROC and accuracy scores. Run `python calc_test.py --help` for usage instructions, or read the source code, which is short and simple.

`calc_test.py` takes `test_seen.jsonl` ([Phase 1](https://www.drivendata.org/competitions/64/hateful-memes/page/206/)) or `test_unseen.jsonl` ([Phase 2](https://www.drivendata.org/competitions/70/hateful-memes-phase-2/page/267/)) **and** `result.csv`. Importantly, whichever of `test_seen.jsonl` or `test_unseen.jsonl` is used must have labels.

`result.csv` must have three columns:

- Meme identification number, `id`

- Probability that the meme is hateful, `proba` (must be a float)

- Binary label that the meme is hateful (`1`) or non-hateful (`0`), `label` (must be an int)
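To make the two scores concrete, here is a minimal sketch of how they can be computed from a labeled `.jsonl` file and a `result.csv` in the format above. It is not the code in `calc_test.py`: it uses only the standard library, so the AUC ROC is computed with a pairwise ranking formula in place of sklearn, and all function names are hypothetical. It assumes each `.jsonl` record has `id` and `label` fields.

```python
import csv
import io
import json


def load_labels(jsonl_text):
    """Map meme id -> ground-truth label from labeled JSON Lines text.

    Assumes every record carries "id" and "label" fields.
    """
    labels = {}
    for line in jsonl_text.splitlines():
        if line.strip():
            record = json.loads(line)
            labels[int(record["id"])] = int(record["label"])
    return labels


def load_predictions(csv_text):
    """Parse result.csv text into {id: (proba, label)}, enforcing column types."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {"id", "proba", "label"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"result.csv is missing columns: {sorted(missing)}")
    # float() and int() raise ValueError if a cell has the wrong type.
    return {int(r["id"]): (float(r["proba"]), int(r["label"])) for r in reader}


def accuracy(truth, predictions):
    """Fraction of memes whose predicted label equals the ground truth.

    Raises KeyError if a labeled meme has no prediction.
    """
    hits = sum(predictions[i][1] == y for i, y in truth.items())
    return hits / len(truth)


def auc_roc(truth, predictions):
    """Chance that a hateful meme gets a higher proba than a non-hateful one.

    Ties count as half a win. This compares every pair, so it is quadratic
    in the number of memes.
    """
    pos = [predictions[i][0] for i, y in truth.items() if y == 1]
    neg = [predictions[i][0] for i, y in truth.items() if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Note that accuracy depends only on the `label` column, while AUC ROC depends only on the ordering of the `proba` column, which is why `result.csv` needs both.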

## Other scripts

`combine.py` combines `train.jsonl`, `dev_seen.jsonl`, `dev_unseen.jsonl`, `test_seen.jsonl`, and `test_unseen.jsonl` into `data_test.jsonl`. Importantly, `test_seen.jsonl` and `test_unseen.jsonl` must have labels. The resulting `data_test.jsonl` contains the metadata of every meme in the dataset.
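As an illustration of this kind of merge, the sketch below concatenates several JSON Lines files into one. It is an assumption about the general approach, not the code in `combine.py`: the function name `combine_jsonl` is hypothetical, and it does no deduplication, since the README does not say whether the original script does.

```python
import json


def combine_jsonl(input_paths, output_path):
    """Concatenate JSON Lines files into one file.

    Returns the number of records written. Records are copied in order,
    with no deduplication.
    """
    count = 0
    with open(output_path, "w", encoding="utf-8") as out:
        for path in input_paths:
            with open(path, encoding="utf-8") as src:
                for line in src:
                    if not line.strip():
                        continue  # skip blank lines
                    record = json.loads(line)  # fail early on malformed lines
                    out.write(json.dumps(record, ensure_ascii=False) + "\n")
                    count += 1
    return count
```

A call might look like `combine_jsonl(["train.jsonl", "dev_seen.jsonl", "dev_unseen.jsonl", "test_seen.jsonl", "test_unseen.jsonl"], "data_test.jsonl")`.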
