{"id":22938797,"url":"https://github.com/braindatalab/gecobench","last_synced_at":"2025-08-12T18:34:00.048Z","repository":{"id":244784071,"uuid":"636268498","full_name":"braindatalab/gecobench","owner":"braindatalab","description":"NLP Benchmark for XAI methods","archived":false,"fork":false,"pushed_at":"2024-12-02T12:52:29.000Z","size":10389,"stargazers_count":3,"open_issues_count":1,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-12-02T13:45:47.679Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/braindatalab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-04T13:23:16.000Z","updated_at":"2024-12-02T12:52:33.000Z","dependencies_parsed_at":null,"dependency_job_id":"12af82c1-2888-4ad7-8c8c-9186ccbd0d13","html_url":"https://github.com/braindatalab/gecobench","commit_stats":null,"previous_names":["braindatalab/gecobench"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/braindatalab%2Fgecobench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/braindatalab%2Fgecobench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/braindatalab%2Fgecobench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/braindatalab%2Fgecobench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/braindatalab","download_u
rl":"https://codeload.github.com/braindatalab/gecobench/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":229703533,"owners_count":18110572,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-14T12:28:32.789Z","updated_at":"2024-12-14T12:28:33.357Z","avatar_url":"https://github.com/braindatalab.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GECOBench: A Gender-Controlled Text Dataset and Benchmark for Quantifying Biases in Explanations\n\nThis repository contains the code for the paper \"GECOBench: A Gender-Controlled Text Dataset and Benchmark for Quantifying Biases in Explanations\" submitted to NeurIPS 2024 (Datasets and Benchmarks track).\n\n**Abstract:**\n\nLarge pre-trained language models have become popular for many applications and form an important backbone of many downstream tasks in natural language processing (NLP). Applying 'explainable artificial intelligence' (XAI) techniques to enrich such models' outputs is considered crucial for assuring their quality and shedding light on their inner workings. However, large language models are trained on a plethora of data containing a variety of biases, such as gender biases, affecting model weights and, potentially, behavior. Currently, it is unclear to what extent such biases also impact model explanations in unfavorable ways. We create a gender-controlled text dataset, GECO, in which otherwise identical sentences appear in male and female forms. 
This gives rise to ground-truth 'world explanations' for gender classification tasks, enabling the objective evaluation of the correctness of XAI methods. We provide GECOBench, a rigorous quantitative evaluation framework benchmarking popular XAI methods, applying them to pre-trained language models fine-tuned to different degrees. This allows us to investigate how pre-training induces undesirable bias in model explanations and to what extent fine-tuning can mitigate such explanation bias. We show a clear dependency between explanation performance and the number of fine-tuned layers, where XAI methods are observed to particularly benefit from fine-tuning or complete retraining of embedding layers. Remarkably, this relationship holds for models achieving similar classification performance on the same task. With that, we highlight the utility of the proposed gender-controlled dataset and novel benchmarking approach for research and development of novel XAI methods.\n\n| [Download Dataset \u0026 Artifacts](https://osf.io/74j9s/?view_only=8f80e68d2bba42258da325fa47b9010f) |\n| ------------------------------------------------------------------------------------------------ |\n\n## Project Structure\n\n![Project Structure](./misc/overview_full_v2_border.png)\n\nThe project is structured in the following way:\n\n- `configs/`: Contains the configuration files for the dataset and the project.\n- `data/`: Contains the code to generate the **GECO** dataset.\n- `training/`: Contains the code to train the different BERT models according to the training schemes defined in the config.\n- `xai/`: Contains the code to generate the explanations for the different models and explanation methods.\n- `evaluation/`: Contains the code to evaluate the explanations and compare them to the ground truth explanations from **GECO**.\n- `visualization/`: Contains the code to visualize the evaluation results.\n\nThe project can be run locally or on a Slurm cluster. 
The `scripts/` directory contains the scripts to set up the environment and run the code on the cluster.\n\nThe benchmark pipeline consists of the following steps:\n\n![Pipeline](./misc/pipeline.png)\n\n## Dataset Format\n\n**GECO** consists of two gender-controlled datasets: `gender_all` and `gender_subj`. In the `gender_all` dataset, every word referring to a protagonist is replaced with the male or female version. In the `gender_subj` dataset, only the subject of the sentence is changed. Both datasets are split into a training and a test set.\n\nThe training set contains 1288 male and 1288 female sentences, totaling 2576 sentences. The test set contains 322 male and 322 female sentences, resulting in 644 test sentences.\n\nFolder structure of the dataset:\n\n```\n├── data_config.json\n├── gender_all\n│   ├── test.jsonl\n│   └── train.jsonl\n└── gender_subj\n    ├── test.jsonl\n    └── train.jsonl\n```\n\nFormat:\n\nEach dataset is stored as a jsonl file in which every line is an entry with the following fields:\n\n```js\n// Example entry labelled according to the gender_all scheme\n{\n  \"sentence\": [\"Paul\", \"loves\", \"his\", \"dog\"],\n  \"ground_truth\": [1.0, 0.0, 1.0, 0.0],\n  \"target\": 1,\n  \"sentence_idx\": 0\n}\n```\n\nThe `sentence` is tokenized into a list of strings. The `ground_truth` is a list of floats, where 1.0 indicates that the word is part of the ground truth explanation and 0.0 that it is not. The `target` indicates whether the sentence is in the female (0) or male (1) form. The `sentence_idx` is the index of the raw unaltered sentence in the dataset and can be used to retrieve both the male and the female version of a sentence.\n\n## Results\n\n![MassAccuracy](./misc/mass_accuracy__filter_correct_best_no_legend.png)\n\nMass Accuracy for different post-hoc XAI methods applied on the five training schemes, with the null performance baseline for random explanations _Uniform Random_ for correctly classified sentences. 
Within the results for the _BERT_ models, fine-tuning or retraining the embedding layer leads to consistent changes in explanation performance, even though model performance is equivalent for all models. Applying XAI methods to the _OLA_ model leads to overall higher explanation performance, with InputXGradient on par with the Pattern Variant baseline.\n\n![OneSentencePlot](./misc/sentence_plot_2.png)\n\nExplanations by popular XAI methods for one sample sentence, broken down into input tokens as given to the respective model, with the ground truth manipulations highlighted in green. Many methods correctly attribute the majority of importance to the word 'she'; however, all tokenized words show non-zero attribution for multiple methods, including the period character '.'.\n\n## Getting Started\n\nAll artifacts are available on [OSF](https://osf.io/74j9s/?view_only=8f80e68d2bba42258da325fa47b9010f), including the **GECO** dataset, the trained models, the generated explanations, evaluation results and visualizations. You can start from any step of the pipeline by downloading the artifacts and unpacking them in the main directory of the project.\n\n### Installation\n\nWe use [poetry](https://python-poetry.org/) to manage the dependencies. To install the dependencies, run the following commands:\n\n```bash\npoetry config virtualenvs.in-project true\npoetry install --no-root\nsource .venv/bin/activate\n```\n\n### Building the datasets\n\nThe dataset generation consists of two steps:\n\n1. Scraping, labeling and preprocessing the data for the GECO dataset.\n2. Generating the final datasets as .jsonl files for the GECO and Sentiment datasets.\n\nFor Step 1, please refer to `data/dataset_generation/README.md` for more information.\n\nAfter finishing Step 1 
or downloading the artifacts from OSF, you can generate the datasets.\n\nCurrently we have three datasets: `gender_all`, `gender_subj`, and `sentiment_imdb`.\nThe config in `configs/data_config.json` specifies the datasets and the parameters for the data generation.\nThe script expects the raw pickle files to be in `data/raw`.\n\nTo generate the datasets, run the following command:\n\n```bash\npython generate_data.py --config=./configs/data_config.json\n```\n\nThis will generate a timestamped folder in the local `artifacts/data` directory.\nTo upload the data to the cluster, use the `copy_data_to_cluster.sh` script. This requires the environment variables to be set in the `.env` file as described below.\n\n```bash\n./scripts/hydra/copy_data_to_cluster.sh nlp-benchmark_2024-02-15-10-14-37\n```\n\nLastly, the project config has to be updated to point to the correct data directory in `configs/gender_no_sub_samp_project_config.json` and `configs/sentiment_config.json`:\n\n```json\n{\n  \"data\": {\n    \"data_dir\": \"/path/to/nlp-benchmark_2024-02-15-10-14-37\"\n  }\n}\n```\n\n### Running experiments locally\n\nEither download the artifacts from OSF or generate a new timestamped experiment run with the following command:\n\n```bash\npython setup_experiment.py\n```\n\nThis will create a timestamped folder in the `artifacts` directory with the necessary config files for the project.\n\nThe output gives you the instructions to run the different steps of the experiment.\n\n#### Run model experiment\n\nAfter setting up the experiment folder, we can run any step of the pipeline with the command below. Set the mode to `training`, `xai`, `evaluation` or `visualization` and the config to the project config in the artifacts directory. The modes depend on each other and have to be run in the order `training`, `xai`, `evaluation` and `visualization`. 
Additionally, we provide a bias analysis script to analyze the bias in the dataset and models; to run it, set the mode to `bias` or `model_bias`.\n\n```bash\npython run_experiments.py --mode=MODE --config=artifacts/xai-nlp-benchmark-2024-02-15-16-45-19/configs/gender_project_config.json\n```\n\n### On Hydra\n\nTo run the code on the cluster, follow the steps below:\n\n#### Step 1: Set up the environment\n\nCopy the `.env.example` file to `.env` and fill in the environment variables.\nThe script assumes you have added lazy access to hydra in your ssh config, as described in the [hydra documentation](https://git.tu-berlin.de/ml-group/hydra/documentation).\n\n```\nHYDRA_SSH_USER= # Name of the user on the cluster\nHYDRA_DATA_DIR= # The path where to place the data on the cluster\nHYDRA_PROJECT_DIR= # The path to the code on the cluster\nKNOWN_HOSTS= # Path to the known_hosts file\n```\n\n#### Step 2: Move the code to the cluster\n\nEither clone the remote repository (recommended) or use the `upload_code_to_cluster.py` script.\n\n```bash\npython ./scripts/hydra/upload_code_to_cluster.py hydra\n```\n\n#### Step 3: Set up the project\n\nSSH into the cluster, navigate to the code directory and run `setup_experiment.py`.\nAs mentioned above, this creates a timestamped folder for the artifacts and copies the config files to the folder.\n\n```bash\npython3 setup_experiment.py\n```\n\nBy default it will create the artifacts folder in the code directory.\n\n#### Step 4: Build and run the container\n\nTo run the code we need to first build the container. 
This step only needs to be repeated if the dependencies change.\n\n```bash\npython3 ./scripts/hydra/submit_hydra_job.py --mode build --config ./artifacts/xai-nlp-benchmark-2024-02-15-16-45-19/configs/sentiment_project_config.json\n```\n\nAfterwards we can run the container with the following command:\n\n```bash\npython3 ./scripts/hydra/submit_hydra_job.py --mode training --config ./artifacts/xai-nlp-benchmark-2024-02-15-16-45-19/configs/sentiment_project_config.json\n```\n\nAgain, the modes depend on each other and have to be run in the order `training`, `xai`, `evaluation` and `visualization`.\n\nThe machine the code is run on and the timeslot can be configured in `./scripts/hydra/cluster_job_hydra_gpu.sh`.\nFurther details can be found in the hydra documentation: https://git.tu-berlin.de/ml-group/hydra/documentation\nThe logger outputs of the container can be found in the code directory under `logs`.\n\n#### Step 5: View and cancel jobs\n\nTo view your current jobs, run e.g. `squeue --user=USERNAME`.\nTo cancel a job, run `scancel job_id` with the job id you get from the command above.\n\n#### Step 6: Retrieve results\n\nTo copy the results from the cluster to your local machine, use the `get_results_from_cluster.sh` script.\n\n```bash\n./scripts/hydra/get_results_from_cluster.sh xai-nlp-benchmark-2024...\n```\n\n## Visualization\n\nAll visualizations are saved in the `artifacts` directory as images. The only exceptions are the html plots. 
To view the html plots start a local server in the root directory of the project:\n\n```bash\npython -m http.server 9000\n```\n\nThen navigate to `http://localhost:9000` in your browser, go to the artifacts folder for the run, select the visualization folder and click on the html file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbraindatalab%2Fgecobench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbraindatalab%2Fgecobench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbraindatalab%2Fgecobench/lists"}