{"id":15035455,"url":"https://github.com/uber-research/pplm","last_synced_at":"2025-05-16T04:03:42.778Z","repository":{"id":44473752,"uuid":"219833342","full_name":"uber-research/PPLM","owner":"uber-research","description":"Plug and Play Language Model implementation. Allows to steer topic and attributes of GPT-2 models.","archived":false,"fork":false,"pushed_at":"2024-02-20T16:47:37.000Z","size":2477,"stargazers_count":1145,"open_issues_count":30,"forks_count":204,"subscribers_count":27,"default_branch":"master","last_synced_at":"2025-05-16T04:03:33.581Z","etag":null,"topics":["deep-learning","language-modeling","machine-learning","natural-language-generation","natural-language-processing","nlp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/uber-research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-05T19:25:15.000Z","updated_at":"2025-05-13T19:37:59.000Z","dependencies_parsed_at":"2023-01-19T20:18:59.382Z","dependency_job_id":"c7c3c2e2-e923-4777-9ecc-ebc22ecd3ed1","html_url":"https://github.com/uber-research/PPLM","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uber-research%2FPPLM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uber-research%2FPPLM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uber-research%2FPPLM/releases","manifests_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/repositories/uber-research%2FPPLM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/uber-research","download_url":"https://codeload.github.com/uber-research/PPLM/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254464891,"owners_count":22075570,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","language-modeling","machine-learning","natural-language-generation","natural-language-processing","nlp"],"created_at":"2024-09-24T20:28:43.738Z","updated_at":"2025-05-16T04:03:42.751Z","avatar_url":"https://github.com/uber-research.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PPLM\n\nThis repository contains code to run the Plug and Play Language Model (PPLM), as described in this **[blog post](https://eng.uber.com/pplm)** and **[arXiv paper](https://arxiv.org/abs/1912.02164)**. 
A **[demo](https://transformer.huggingface.co/model/pplm)** and **[Colab notebook](https://colab.research.google.com/drive/1Ux0Z4-ruiVtJ6jUk98uk6FqfvGHCOYL3)** are also available.\n\n\nNote: If you are planning on using PPLM as a baseline, and would like to use the parameters listed in the paper's Appendix, please use the LM and the discriminator from this **[folder](https://github.com/uber-research/PPLM/tree/master/paper_code)**.\nAlternatively, if you are using the code/models in the main directory and/or the **[🤗/Transformers](https://transformer.huggingface.co/model/pplm)**, tune the hyperparameters on your own for a fair comparison (the optimal parameters for these models/discriminators differ from those used in the paper by roughly a factor of 5).\n\nPPLM is also integrated into the **[🤗/Transformers](https://github.com/huggingface/transformers/tree/master/examples/pplm)** repository.\n\n![header image](./imgs/headfigure.png)\n\n## Plug and Play Language Models: a Simple Approach to Controlled Text Generation\nAuthors: [Sumanth Dathathri](https://dathath.github.io/), [Andrea Madotto](https://andreamad8.github.io/), Janice Lan, Jane Hung, Eric Frank, [Piero Molino](https://w4nderlu.st/), [Jason Yosinski](http://yosinski.com/), and [Rosanne Liu](http://www.rosanneliu.com/)\n\nPPLM allows a user to flexibly plug in one or more tiny attribute models representing the desired steering objective into a large, unconditional language model (LM). 
The method has the key property that it uses the LM _as is_ (no training or fine-tuning is required), which enables researchers to leverage best-in-class LMs even if they do not have the extensive hardware required to train them.\n\nSee also our [arXiv paper](https://arxiv.org/abs/1912.02164) and [blog post](https://eng.uber.com/pplm), or try it out for yourself with no setup using the [Colab notebook](https://colab.research.google.com/drive/1Ux0Z4-ruiVtJ6jUk98uk6FqfvGHCOYL3).\n\n## Setup\n\n```bash\npip install -r requirements.txt\n```\n\n## Citation\n```\n@inproceedings{\nDathathri2020Plug,\ntitle={Plug and Play Language Models: A Simple Approach to Controlled Text Generation},\nauthor={Sumanth Dathathri and Andrea Madotto and Janice Lan and Jane Hung and Eric Frank and Piero Molino and Jason Yosinski and Rosanne Liu},\nbooktitle={International Conference on Learning Representations},\nyear={2020},\nurl={https://openreview.net/forum?id=H1edEyBKDS}\n}\n```\n\n## PPLM-BoW \n\n### Example command for bag-of-words control\n\n```bash\npython run_pplm.py -B military --cond_text \"The potato\" --length 50 --gamma 1.5 --num_iterations 3 --num_samples 10 --stepsize 0.03 --window_length 5 --kl_scale 0.01 --gm_scale 0.99 --colorama --sample\n```\n\n### Tuning hyperparameters for bag-of-words control\n\n1. Increase `--stepsize` to intensify topic control, and decrease its value to soften the control. `--stepsize 0` recovers the original uncontrolled GPT-2 model. \n\n2. If the language being generated is repetitive (e.g. \"science science experiment experiment\"), there are several options to consider: \u003c/br\u003e\n\ta) Reduce the `--stepsize` \u003c/br\u003e\n\tb) Increase `--kl_scale` (the KL-loss coefficient) or decrease `--gm_scale` (the gm-scaling term) \u003c/br\u003e\n\tc) Add `--grad-length xx` where xx is an integer \u003c= length (e.g. 
`--grad-length 30`).\u003c/br\u003e\n\n\n## PPLM-Discrim\n\n### Example command for discriminator-based sentiment control\n\n```bash\npython run_pplm.py -D sentiment --class_label 2 --cond_text \"My dog died\" --length 50 --gamma 1.0 --num_iterations 10 --num_samples 10 --stepsize 0.04 --kl_scale 0.01 --gm_scale 0.95 --sample\n```\n\n### Tuning hyperparameters for discriminator control\n\n1. Increase `--stepsize` to intensify attribute control, and decrease its value to soften the control. `--stepsize 0` recovers the original uncontrolled GPT-2 model. \n\n2. Use `--class_label 3` for negative, and `--class_label 2` for positive.\n\n\nThe discriminator and the GPT-2 model in the root directory are different from those used for the analysis in the paper. Code and models corresponding to the paper can be found [here](https://github.com/uber-research/PPLM/tree/master/paper_code).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuber-research%2Fpplm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fuber-research%2Fpplm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuber-research%2Fpplm/lists"}