# DExperts
Hi! This repository contains code for the paper [DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts](https://aclanthology.org/2021.acl-long.522/), published at ACL 2021. If you have any questions, please feel free to open a GitHub issue or reach out to the first author at alisaliu@cs.washington.edu.
Create a conda environment called `dexperts` with
```
conda env create -f environment.yml
```

## Toxicity
To generate continuations with DExperts and score them for toxicity using the [Perspective API](https://github.com/conversationai/perspectiveapi) toxicity scorer, run the following command.
```
OUTPUT_DIR=generations/toxicity/dexperts
PROMPTS_DATASET=prompts/nontoxic_prompts-10k.jsonl

python -m scripts.run_toxicity_experiment \
    --use-dataset \
    --dataset-file $PROMPTS_DATASET \
    --model-type dexperts \
    --model gpt2-large \
    --nontoxic-model $MODEL_DIR/finetuned_gpt2_nontoxic \
    --toxic-model $MODEL_DIR/finetuned_gpt2_toxic \
    --perspective-rate-limit $API_RATE \
    --alpha 2.0 \
    --filter_p 0.9 \
    $OUTPUT_DIR
```

In general, `model_type` is one of `gpt2` (the base model), `dexperts` (our method), or `pplm`. With an [OpenAI API](https://beta.openai.com/) key for GPT-3 access, you can also try `gpt3` and `dexperts-gpt3`. Different methods require different additional parameters; to see the commands we used for each method in our paper, look under `scripts/our_scripts/toxicity`. For experiments with GeDi, we used the original [authors' codebase](https://github.com/salesforce/GeDi) directly.

When `model_type` is `dexperts`, we can steer away from toxicity using only a toxic anti-expert. To do this, leave `--nontoxic-model` empty, and DExperts will reuse the base model as the expert. The hyperparameter `alpha` controls the strength of steering over the base model. We use `filter_p` to restrict sampling to the top-p nucleus of the base model, as described in Section 2.2 of our paper.

This script will create three files in `OUTPUT_DIR`: `generations.jsonl` with all of the generated continuations, `perspective.jsonl` with all of the scores from Perspective API, and `prompted_gens_[model_type].jsonl`, which collates the previous two files.

To try a model's output on your own prompts, simply create your own prompts file! To see the format of the prompts file, see `prompts/toy_prompt.jsonl`.

## Sentiment
To generate continuations with DExperts conditioned on sentiment prompts and score them for sentiment using HuggingFace's sentiment classifier, run the following command.

```
PROMPTS_DATASET=prompts/sentiment_prompts-10k/neutral_prompts.jsonl
OUTPUT_DIR=generations/sentiment/neutral_prompts/dexperts/positive/

python -m scripts.run_sentiment_experiment \
    --use-dataset \
    --dataset-file $PROMPTS_DATASET \
    --model-type dexperts \
    --model gpt2-large \
    --pos-model $MODEL_DIR/finetuned_gpt2_positive \
    --neg-model $MODEL_DIR/finetuned_gpt2_negative \
    --alpha 3.2 \
    --filter_p 0.9 \
    $OUTPUT_DIR
```

The `model_type` can be any of the options from before, with the addition of `ctrl`. Again, the full commands used for each method can be found under `scripts/our_scripts/sentiment`.

When `model_type` is `dexperts`, we always interpret `--pos-model` as the expert and `--neg-model` as the anti-expert; for negative steering, use `alpha` < 0. If you leave one of `--pos-model` or `--neg-model` empty, DExperts will reuse the base model as the missing expert or anti-expert.
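For intuition on what `alpha` and `filter_p` do at each decoding step, the paper's ensemble combines next-token logits as z = z_base + alpha * (z_expert − z_anti), restricted to the top-p nucleus of the base distribution. Below is a minimal NumPy sketch of that rule; it is an illustration, not code from this repo, and the function name is our own.

```python
import numpy as np

def dexperts_logits(base, expert, anti, alpha=2.0, filter_p=0.9):
    """Sketch of the DExperts ensemble: z = z_base + alpha * (z_expert - z_anti),
    applied only to tokens in the top-p nucleus of the base distribution."""
    base = np.asarray(base, dtype=float)
    expert = np.asarray(expert, dtype=float)
    anti = np.asarray(anti, dtype=float)

    # Softmax over the base logits to find the nucleus.
    probs = np.exp(base - base.max())
    probs /= probs.sum()
    order = np.argsort(-probs)
    cum = np.cumsum(probs[order])
    # Keep the smallest set of tokens whose base probability mass >= filter_p.
    keep = order[: int(np.searchsorted(cum, filter_p)) + 1]

    # Tokens outside the nucleus get -inf (zero probability after softmax).
    combined = np.full_like(base, -np.inf)
    combined[keep] = base[keep] + alpha * (expert[keep] - anti[keep])
    return combined
```

With `alpha > 0`, tokens the expert prefers over the anti-expert are boosted; flipping the sign of `alpha` steers in the opposite direction, which is exactly the negative-steering trick used in the sentiment experiments below.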
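If you prefer to script prompt-file creation rather than use the notebooks, here is a hedged sketch. The one-JSON-object-per-line layout with a `{"prompt": {"text": ...}}` field is an assumption (RealToxicityPrompts-style); confirm the exact schema against `prompts/toy_prompt.jsonl` before using it.

```python
import json

def write_prompts(path, prompts):
    """Hypothetical helper: write one JSON object per line.
    The {"prompt": {"text": ...}} schema is an assumption;
    check prompts/toy_prompt.jsonl for the real format."""
    with open(path, "w") as f:
        for text in prompts:
            f.write(json.dumps({"prompt": {"text": text}}) + "\n")

write_prompts("my_prompts.jsonl", ["The weather today is", "My favorite book is"])
```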
## Evaluation
To evaluate generated output for fluency and diversity, run the following command. The `GENERATIONS_FILE` should have the format `prompted_gens_[model_type].jsonl`.
```
python -m scripts.evaluation.evaluate_generations \
    --generations_file $GENERATIONS_FILE
```

## Notebooks
Our Jupyter notebooks are in `notebooks/`. To obtain the same tables and plots that appear in the paper, look in `sentiment_results.ipynb`, `toxicity_results.ipynb`, and `human_eval_results.ipynb`. To create your own prompts dataset with a couple of lines of code, get started with `prompts_playground.ipynb`. Sample and compare generations from each model with `review_sentiment_generations.ipynb` and `review_toxicity_generations.ipynb`.

## Downloading the original data and models from our paper

To download the prompts we used for evaluation, the generations output by each model, and the finetuning datasets from our paper, make sure you have `gdown` installed, then run the following commands inside the `dexperts/` root directory. A description of the contents of each folder can be found within the folder.
```
# prompts
gdown https://drive.google.com/uc?id=1bI49aJvmEoLdqSNb30JkORdsNJmv7Aep
unzip prompts.zip && rm prompts.zip
# generations
gdown https://drive.google.com/uc?id=10jL1-eCv8w3oeGFgA_jrel0enrNVdFW7
unzip generations.zip && rm generations.zip
# datasets
gdown https://drive.google.com/uc?id=1MeEjLPxQ77AYtzL0nd1hYJTlL8OJgHkI
unzip datasets.zip && rm datasets.zip
```

To download models from our paper,
```
mkdir models
cd models
# (anti-)expert models
gdown https://drive.google.com/uc?id=1HSrNMrq4OZ3nyTobNd2TZFcB5NYwluu-
unzip experts.zip && rm experts.zip
# DAPT models
gdown https://drive.google.com/uc?id=1eDlRU04s-H1elWWtPuDoBNAqyoqj3_p9
unzip dapt.zip && rm dapt.zip
# PPLM classifiers
gdown https://drive.google.com/uc?id=17s26QM9vJp9hCUkRBrDx5Wa__4BlrqGL
unzip pplm_classifiers.zip && rm pplm_classifiers.zip
```
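As a reference point for the diversity numbers reported by `scripts.evaluation.evaluate_generations`: diversity in this line of work is commonly measured as distinct-n, the fraction of unique n-grams among all n-grams in the generations. The sketch below is a minimal, self-contained version of that metric, written independently of the repo's evaluation code.

```python
def distinct_n(texts, n):
    """Fraction of unique n-grams across whitespace-tokenized texts.
    A common diversity metric; not the repo's own implementation."""
    ngrams = []
    for text in texts:
        tokens = text.split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0
```

A score of 1.0 means every n-gram is unique; heavily repetitive generations score close to 0.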
## Citation
```
@inproceedings{liu-etal-2021-dexperts,
    title = "{DE}xperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts",
    author = "Liu, Alisa  and
      Sap, Maarten  and
      Lu, Ximing  and
      Swayamdipta, Swabha  and
      Bhagavatula, Chandra  and
      Smith, Noah A.  and
      Choi, Yejin",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-long.522",
    doi = "10.18653/v1/2021.acl-long.522",
    pages = "6691--6706",
}
```

This code was built on top of [allenai/real-toxicity-prompts](https://github.com/allenai/real-toxicity-prompts) and with inspiration from [yangkevin2/naacl-2021-fudge-controlled-generation](https://github.com/yangkevin2/naacl-2021-fudge-controlled-generation).