{"id":20435548,"url":"https://github.com/jacksonchen1998/empowering-nlg","last_synced_at":"2025-04-12T21:34:13.351Z","repository":{"id":174435482,"uuid":"652228252","full_name":"jacksonchen1998/Empowering-NLG","owner":"jacksonchen1998","description":"Official code for Empower NLG: Offline Reinforcement Learning for Informal Summarization in Online Domains","archived":false,"fork":false,"pushed_at":"2023-07-03T02:36:33.000Z","size":99,"stargazers_count":7,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-26T15:48:05.007Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jacksonchen1998.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-11T13:53:16.000Z","updated_at":"2024-03-30T08:56:35.000Z","dependencies_parsed_at":"2023-07-09T18:45:48.706Z","dependency_job_id":null,"html_url":"https://github.com/jacksonchen1998/Empowering-NLG","commit_stats":null,"previous_names":["jacksonchen1998/empowering-nlg"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacksonchen1998%2FEmpowering-NLG","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacksonchen1998%2FEmpowering-NLG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacksonchen1998%2FEmpowering-NLG/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacksonchen1998%2FEmpo
wering-NLG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jacksonchen1998","download_url":"https://codeload.github.com/jacksonchen1998/Empowering-NLG/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248636965,"owners_count":21137527,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-15T08:35:18.807Z","updated_at":"2025-04-12T21:34:13.345Z","avatar_url":"https://github.com/jacksonchen1998.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Empowering-NLG\n\n[![license](https://img.shields.io/pypi/l/ansicolortags.svg)](LICENSE) [![Release](https://img.shields.io/github/v/release/jacksonchen1998/Empowering-NLG)](https://github.com/jacksonchen1998/Empowering-NLG/releases/)\n\n| [Paper](https://arxiv.org/abs/2306.17174) | [Code](https://github.com/jacksonchen1998/Empowering-NLG) | [Slide](https://www.slideshare.net/jacksonChen22/offline-reinforcement-learning-for-informal-summarization-in-online-domainspdf-258382461) |\n\nOfficial code for Empower NLG: Offline Reinforcement Learning for Informal Summarization in Online Domains\n\nCode author: [Zhi-Xuan Tai](https://github.com/will0010077)\n\n## Abstract\n\nThis paper proposes a new approach to Natural Language Generation (NLG) that aims to improve user experience and reduce the workload of customer support agents. \n\nIt focuses on generating natural language informal summaries for online articles and posts using an offline reinforcement learning method. 
\n\nThe proposed method is compared to previous approaches to text generation, and the architecture of the design, including crawling, reinforcement learning, and text generation modules, is discussed. \n\nThe contribution of this work lies in providing a novel approach to generating natural language summaries for online content, which can enhance customer support services and improve the user experience of online content consumption.\n\n## Dataset\n\nWe crawled the Twitter data using [Twitter-Crawler](https://github.com/jacksonchen1998/Twitter-Crawler).\n\nThe dataset is called [Famous Keyword Twitter Replies](https://www.kaggle.com/datasets/jackksoncsie/famous-keyword-twitter-replies-dataset).\n\n## Architecture\n\n![Arch](./image/empower_nlg.png)\n\n## Program information\n\n- `dataprepare.py`: Data preprocessing\n- `infer_gpt.py`: Run inference with the GPT-2 model on input text\n- `infer_score.py`: Run inference with the score model on input text\n- `model.py`: Score model based on GPT-2 and RoBERTa\n- `trainRL.py`: Train PPO model, using GPT-2 and score model\n- `traingpt.py`: Train GPT-2 model\n- `trainscore.py`: Train score model\n\n## Workflow\n\n1. Data preprocessing\n2. Train Score model (RoBERTa)\n3. Train GPT-2 model\n4. Train PPO model\n\n## Training data sample Output (reply, score)\n\n- Keyword: Covid-19\n\n```\n\u003carticle input\u003e This man deserves life in jail for what he did. Faucci and the NIH funded gain of function research in Wuhan. He and the U.S. government literally created covid 19. \u003carticle end\u003e\n\u003cresponse\u003e Is this you? How many are weapons of mass destruction? \u003cresponse end\u003e 0.00\n\u003cresponse\u003e You mean like your dad who created the vaccines cause it is still wanna be done with you? 
I thought you were a moron \u003cresponse end\u003e 0.00\n\u003cresponse\u003e I thought this one took the prize (they must all be together), but you all did an excellent job of putting together anCOVID-19 campaign for Americans, Let's get this thing off the advice of Biden! \u003cresponse end\u003e 0.69\n\u003cresponse\u003e I thought this was a bad idea. 🙄😂🤦‍♂️ \u003cresponse end\u003e 1.00\n\u003cresponse\u003e Yes I did hear that one. It’s called working with the defense budget. You want to cut them? Just do away with them and their science? What are they help us toplain to others? \u003cresponse end\u003e 0.00\n```\n\n- Keyword: Bitcoin\n\n```\n\u003carticle input\u003e Bitcoin and Crypto currency. I was built different 😴 \u003carticle end\u003e\n\u003cresponse\u003e Do you remember when you were young and you wanted to do something that will make the world better? I didn't know anything till now but let's just do it! \u003cresponse end\u003e 0.00\n\u003cresponse\u003e Is this a meme? \u003cresponse end\u003e 1.00\n\u003cresponse\u003e Is this a meme? \u003cresponse end\u003e 1.00\n\u003cresponse\u003e Is this one for you??? #HODL \u003cresponse end\u003e 0.97\n\u003cresponse\u003e Is this the first time you've done something like this? 😳🤯#Bitcoin #Ethereum # game development #BitcoinDay #DeFi \u003cresponse end\u003e 0.00\n```\n\n- Keyword: weather\n\n```\n\u003carticle input\u003e Great question! Also, what about the weather?? Going to be a crazy 4 days \u003carticle end\u003e\n\u003cresponse\u003e This is a very long thread! Let’s all be sure to read it if you are interested. \u003cresponse end\u003e 0.00\n\u003cresponse\u003e Is this serious?? And how does anyone think they're going to help the people who are dying from covid? We are living through a pandemic, and we thought this was a bad idea. \u003cresponse end\u003e 1.00\n\u003cresponse\u003e Is there anyone out there that is going to tell me how to make it in life? 
\u003cresponse end\u003e 0.01\n\u003cresponse\u003e Is it really that bad? It's about as close to aIndependence Day as you can get. \u003cresponse end\u003e 0.69\n\u003cresponse\u003e I thought he was a bad idea. \u003cresponse end\u003e 0.99\n```\n\n## Citation\n\n### Paper Citation\n\n```\n@misc{tai2023empowering,\n      title={Empowering NLG: Offline Reinforcement Learning for Informal Summarization in Online Domains}, \n      author={Zhi-Xuan Tai and Po-Chuan Chen},\n      year={2023},\n      eprint={2306.17174},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL}\n}\n```\n\n### Code Citation\n\n```\n@misc{20230611,\n  author = {Zhi-Xuan Tai and Po-Chuan Chen},\n  title = {Empower NLG: Offline Reinforcement Learning for Informal Summarization in Online Domains},\n  year = {2023},\n  month = {06},\n  note = {Version 1.0},\n  howpublished = {GitHub},\n  url = {https://github.com/jacksonchen1998/Empowering-NLG}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjacksonchen1998%2Fempowering-nlg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjacksonchen1998%2Fempowering-nlg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjacksonchen1998%2Fempowering-nlg/lists"}