# Awesome Reinforcement Learning in NLP [![Awesome](https://awesome.re/badge.svg)](https://awesome.re)



A curated list of reinforcement learning in NLP. :-)

## Contributing
Please feel free to send me [pull requests](https://github.com/zhjohnchan/awesome-reinforcement-learning-in-nlp/pulls) or email ([email protected]) to add links.

## Table of Contents
- [Papers](#papers)
  - [Survey](#survey)
  - [Research Paper](#research-paper)

## Papers
### Research Paper
| Year | Venue | Title |
|-------:|:---------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 2001 | NAACL | [Learning Optimal Dialogue Management Rules by Using Reinforcement Learning and Inductive Logic Programming](https://aclanthology.org/N01-1028.pdf) |
| 2005 | EMNLP | [Learning Mixed Initiative Dialog Strategies By Using Reinforcement Learning On Both Conversants](https://aclanthology.org/H05-1127.pdf) |
| 2006 | EMNLP | [Unsupervised Information Extraction Approach Using Graph Mutual Reinforcement](https://aclanthology.org/W06-1659.pdf) |
| 2006 | NAACL | [Comparing the Utility of State Features in Spoken Dialogue Using Reinforcement Learning](https://aclanthology.org/N06-1035.pdf) |
| 2006 | EACL | [Using Reinforcement Learning to Build a Better Model of Dialogue State](https://aclanthology.org/E06-1037.pdf) |
| 2006 | EACL | [An ISU Dialogue System Exhibiting Reinforcement Learning of Dialogue Policies: Generic Slot-Filling in the TALK In-car System](https://aclanthology.org/E06-2009.pdf) |
| 2007 | ACL | [Towards an Iterative Reinforcement Approach for Simultaneous Document Summarization and Keyword Extraction](https://aclanthology.org/P07-1070.pdf) |
| 2008 | COLING | [PNR2: Ranking Sentences with Positive and Negative Reinforcement for Query-Oriented Update Summarization](https://aclanthology.org/C08-1062.pdf) |
| 2008 | CL | [Hybrid Reinforcement/Supervised Learning of Dialogue Policies from Fixed Data Sets](https://aclanthology.org/J08-4002.pdf) |
| 2009 | ACL | [Reinforcement Learning for Mapping Instructions to Actions](https://aclanthology.org/P09-1010.pdf) |
| 2009 | NAACL | [An Iterative Reinforcement Approach for Fine-Grained Opinion Mining](https://aclanthology.org/N09-1055.pdf) |
| 2010 | ACL | [From Structured Prediction to Inverse Reinforcement Learning](https://aclanthology.org/P10-5005.pdf) |
| 2010 | COLING | [Simultaneous Ranking and Clustering of Sentences: A Reinforcement Approach to Multi-Document Summarization](https://aclanthology.org/C10-1016.pdf) |
| 2011 | ACL | [Hierarchical Reinforcement Learning and Hidden Markov Models for Task-Oriented Natural Language Generation](https://aclanthology.org/P11-2115.pdf) |
| 2011 | ACL | [Beyond Structured Prediction: Inverse Reinforcement Learning](https://aclanthology.org/P11-5001.pdf) |
| 2011 | EMNLP | [Improved Transliteration Mining Using Graph Reinforcement](https://aclanthology.org/D11-1128.pdf) |
| 2012 | EMNLP-CoNLL | [Framework of Automatic Text Summarization Using Reinforcement Learning](https://aclanthology.org/D12-1024.pdf) |
| 2012 | EACL | [A Comparative Study of Reinforcement Learning Techniques on Dialogue Management](https://aclanthology.org/E12-3003.pdf) |
| 2014 | ACL | [Single-Agent vs. Multi-Agent Techniques for Concurrent Reinforcement Learning of Negotiation Dialogue Policies](https://aclanthology.org/P14-1047.pdf) |
| 2014 | ACL | [Comparing Multi-label Classification with Reinforcement Learning for Summarisation of Time-series Data](https://aclanthology.org/P14-1116.pdf) |
| 2014 | EMNLP | [Fear the REAPER: A System for Automatic Multi-Document Summarization with Reinforcement Learning](https://aclanthology.org/D14-1075.pdf) |
| 2014 | EMNLP | [Don’t Until the Final Verb Wait: Reinforcement Learning for Simultaneous Machine Translation](https://aclanthology.org/D14-1140.pdf) |
| 2014 | COLING | [Reinforcement Learning of Cooperative Persuasive Dialogue Policies using Framing](https://aclanthology.org/C14-1161.pdf) |
| 2014 | EACL | [Undirected Machine Translation with Discriminative Reinforcement Learning](https://aclanthology.org/E14-1002.pdf) |
| 2015 | EMNLP | [Language Understanding for Text-based Games using Deep Reinforcement Learning](https://aclanthology.org/D15-1001.pdf) |
| 2016 | ACL | [Deep Reinforcement Learning with a Natural Language Action Space](https://aclanthology.org/P16-1153.pdf) |
| 2016 | EMNLP | [Deep Reinforcement Learning for Dialogue Generation](https://aclanthology.org/D16-1127.pdf) |
| 2016 | EMNLP | [Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads](https://aclanthology.org/D16-1189.pdf) |
| 2016 | EMNLP | [Deep Reinforcement Learning for Mention-Ranking Coreference Models](https://aclanthology.org/D16-1245.pdf) |
| 2016 | EMNLP | [Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning](https://aclanthology.org/D16-1261.pdf) |
| 2017 | ACL | [Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access](https://aclanthology.org/P17-1045.pdf) |
| 2017 | ACL | [Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning](https://aclanthology.org/P17-1062.pdf) |
| 2017 | ACL | [From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood](https://aclanthology.org/P17-1097.pdf) |
| 2017 | EMNLP | [DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning](https://aclanthology.org/D17-1060.pdf) |
| 2017 | EMNLP | [Task-Oriented Query Reformulation with Reinforcement Learning](https://aclanthology.org/D17-1061.pdf) |
| 2017 | EMNLP | [Sentence Simplification with Deep Reinforcement Learning](https://aclanthology.org/D17-1062.pdf) |
| 2017 | EMNLP | [Learning how to Active Learn: A Deep Reinforcement Learning Approach](https://aclanthology.org/D17-1063.pdf) |
| 2017 | EMNLP | [Reinforced Video Captioning with Entailment Rewards](https://aclanthology.org/D17-1103.pdf) |
| 2017 | EMNLP | [Mapping Instructions and Visual Observations to Actions with Reinforcement Learning](https://aclanthology.org/D17-1106.pdf) |
| 2017 | EMNLP | [Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback](https://aclanthology.org/D17-1153.pdf) |
| 2017 | EMNLP | [Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning](https://aclanthology.org/D17-1237.pdf) |
| 2017 | EMNLP | [Speeding up Reinforcement Learning-based Information Extraction Training using Asynchronous Methods](https://aclanthology.org/D17-1281.pdf) |
| 2017 | EACL | [Tackling Error Propagation through Reinforcement Learning: A Case of Greedy Dependency Parsing](https://aclanthology.org/E17-1064.pdf) |
| 2017 | EACL | [Evaluating Persuasion Strategies and Deep Reinforcement Learning methods for Negotiation Dialogue agents](https://aclanthology.org/E17-2077.pdf) |
| 2018 | ACL | [Deep Reinforcement Learning for Chinese Zero Pronoun Resolution](https://aclanthology.org/P18-1053.pdf) |
| 2018 | ACL | [Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting](https://aclanthology.org/P18-1063.pdf) |
| 2018 | ACL | [Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach](https://aclanthology.org/P18-1090.pdf) |
| 2018 | ACL | [Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference](https://aclanthology.org/P18-1091.pdf) |
| 2018 | ACL | [Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning](https://aclanthology.org/P18-1165.pdf) |
| 2018 | ACL | [Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning](https://aclanthology.org/P18-1199.pdf) |
| 2018 | ACL | [End-to-End Reinforcement Learning for Automatic Taxonomy Induction](https://aclanthology.org/P18-1229.pdf) |
| 2018 | ACL | [Reinforced Extractive Summarization with Question-Focused Rewards](https://aclanthology.org/P18-3015.pdf) |
| 2018 | ACL | [Deep Reinforcement Learning for NLP](https://aclanthology.org/P18-5007.pdf) |
| 2018 | EMNLP | [Improving Reinforcement Learning Based Image Captioning with Natural Language Prior](https://aclanthology.org/D18-1083.pdf) |
| 2018 | EMNLP | [Automatic Essay Scoring Incorporating Rating Schema via Reinforcement Learning](https://aclanthology.org/D18-1090.pdf) |
| 2018 | EMNLP | [Automatic Poetry Generation with Mutual Reinforcement Learning](https://aclanthology.org/D18-1353.pdf) |
| 2018 | EMNLP | [Playing 20 Question Game with Policy-Based Reinforcement Learning](https://aclanthology.org/D18-1361.pdf) |
| 2018 | EMNLP | [A Study of Reinforcement Learning for Neural Machine Translation](https://aclanthology.org/D18-1397.pdf) |
| 2018 | EMNLP | [Paraphrase Generation with Deep Reinforcement Learning](https://aclanthology.org/D18-1421.pdf) |
| 2018 | EMNLP | [APRIL: Interactively Learning to Summarise by Combining Active Preference Learning and Reinforcement Learning](https://aclanthology.org/D18-1445.pdf) |
| 2018 | EMNLP | [Macquarie University at BioASQ 6b: Deep learning and deep reinforcement learning for query-based summarisation](https://aclanthology.org/W18-5303.pdf) |
| 2018 | EMNLP | [Joint Modeling for Query Expansion and Information Extraction with Reinforcement Learning](https://aclanthology.org/W18-5506.pdf) |
| 2018 | EMNLP | [Autonomous Sub-domain Modeling for Dialogue Policy with Hierarchical Deep Reinforcement Learning](https://aclanthology.org/W18-5702.pdf) |
| 2018 | EMNLP | [A Reinforcement Learning-driven Translation Model for Search-Oriented Conversational Systems](https://aclanthology.org/W18-5705.pdf) |
| 2018 | EMNLP | [Curriculum Learning Based on Reward Sparseness for Deep Reinforcement Learning of Task Completion Dialogue Management](https://aclanthology.org/W18-5707.pdf) |
| 2018 | EMNLP | [Approximate Dynamic Oracle for Dependency Parsing with Reinforcement Learning](https://aclanthology.org/W18-6021.pdf) |
| 2018 | NAACL | [Reinforced Co-Training](https://aclanthology.org/N18-1113.pdf) |
| 2018 | NAACL | [Ranking Sentences for Extractive Summarization with Reinforcement Learning](https://aclanthology.org/N18-1158.pdf) |
| 2018 | NAACL | [Multi-Reward Reinforced Summarization with Saliency and Entailment](https://aclanthology.org/N18-2102.pdf) |
| 2018 | NAACL | [Feudal Reinforcement Learning for Dialogue Management in Large Domains](https://aclanthology.org/N18-2112.pdf) |
| 2018 | NAACL | [Bootstrapping a Neural Conversational Agent with Dialogue Self-Play, Crowdsourcing and On-Line Reinforcement Learning](https://aclanthology.org/N18-3006.pdf) |
| 2018 | COLING | [Neural Math Word Problem Solver with Reinforcement Learning](https://aclanthology.org/C18-1018.pdf) |
| 2018 | COLING | [A New Concept of Deep Reinforcement Learning based Augmented General Tagging System](https://aclanthology.org/C18-1143.pdf) |
| 2018 | COLING | [A Reinforcement Learning Framework for Natural Question Generation using Bi-discriminators](https://aclanthology.org/C18-1150.pdf) |
| 2018 | COLING | [Distantly Supervised NER with Partial Annotation Learning and Reinforcement Learning](https://aclanthology.org/C18-1183.pdf) |
| 2018 | COLING | [Source Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language](https://aclanthology.org/C18-1305.pdf) |
| 2019 | ACL | [End-to-end Deep Reinforcement Learning Based Coreference Resolution](https://aclanthology.org/P19-1064.pdf) |
| 2019 | ACL | [Reinforced Training Data Selection for Domain Adaptation](https://aclanthology.org/P19-1189.pdf) |
| 2019 | ACL | [Reinforced Dynamic Reasoning for Conversational Question Generation](https://aclanthology.org/P19-1203.pdf) |
| 2019 | ACL | [Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards](https://aclanthology.org/P19-1208.pdf) |
| 2019 | ACL | [Aspect Sentiment Classification Towards Question-Answering with Reinforced Bidirectional Attention Network](https://aclanthology.org/P19-1345.pdf) |
| 2019 | ACL | [Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning](https://aclanthology.org/P19-1451.pdf) |
| 2019 | ACL | [A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer](https://aclanthology.org/P19-1482.pdf) |
| 2019 | ACL | [A Deep Reinforced Sequence-to-Set Model for Multi-Label Classification](https://aclanthology.org/P19-1518.pdf) |
| 2019 | ACL | [Using Semantic Similarity as Reward for Reinforcement Learning in Sentence Generation](https://aclanthology.org/P19-2056.pdf) |
| 2019 | ACL | [Implementing a Multi-lingual Chatbot for Positive Reinforcement in Young Learners](https://aclanthology.org/W19-3629.pdf) |
| 2019 | EMNLP | [Learning the Extraction Order of Multiple Relational Facts in a Sentence with Reinforcement Learning](https://aclanthology.org/D19-1035.pdf) |
| 2019 | EMNLP | [Hierarchical Text Classification with Reinforced Label Assignment](https://aclanthology.org/D19-1042.pdf) |
| 2019 | EMNLP | [Reinforced Product Metadata Selection for Helpfulness Assessment of Customer Reviews](https://aclanthology.org/D19-1177.pdf) |
| 2019 | EMNLP | [Deep Reinforcement Learning-based Text Anonymization against Private-Attribute Inference](https://aclanthology.org/D19-1240.pdf) |
| 2019 | EMNLP | [Incorporating Graph Attention Mechanism into Knowledge Graph Reasoning Based on Deep Reinforcement Learning](https://aclanthology.org/D19-1264.pdf) |
| 2019 | EMNLP | [Clickbait? Sensational Headline Generation with Auto-tuned Reinforcement Learning](https://aclanthology.org/D19-1303.pdf) |
| 2019 | EMNLP | [Answers Unite! Unsupervised Metrics for Reinforced Summarization Models](https://aclanthology.org/D19-1320.pdf) |
| 2019 | EMNLP | [Neural Topic Model with Reinforcement Learning](https://aclanthology.org/D19-1350.pdf) |
| 2019 | EMNLP | [LexicalAT: Lexical-Based Adversarial Reinforcement Training for Robust Sentiment Classification](https://aclanthology.org/D19-1554.pdf) |
| 2019 | EMNLP | [Human-Like Decision Making: Document-level Aspect Sentiment Classification via Hierarchical Reinforcement Learning](https://aclanthology.org/D19-1560.pdf) |
| 2019 | EMNLP | [An Empirical Comparison on Imitation Learning and Reinforcement Learning for Paraphrase Generation](https://aclanthology.org/D19-1619.pdf) |
| 2019 | EMNLP | [Deep Reinforcement Learning with Distributional Semantic Rewards for Abstractive Summarization](https://aclanthology.org/D19-1623.pdf) |
| 2019 | EMNLP | [Transfer in Deep Reinforcement Learning Using Knowledge Graphs](https://aclanthology.org/D19-5301.pdf) |
| 2019 | EMNLP | [Reinforcement-based denoising of distantly supervised NER with partial annotation](https://aclanthology.org/D19-6125.pdf) |
| 2019 | NAACL | [Learning Interpretable Negation Rules via Weak Supervision at Document Level: A Reinforcement Learning Approach](https://aclanthology.org/N19-1038.pdf) |
| 2019 | NAACL | [Courteously Yours: Inducing courteous behavior in Customer Care responses using Reinforced Pointer Generator Network](https://aclanthology.org/N19-1091.pdf) |
| 2019 | NAACL | [Rethinking Action Spaces for Reinforcement Learning in End-to-end Dialog Agents with Latent Variable Models](https://aclanthology.org/N19-1123.pdf) |
| 2019 | NAACL | [Reinforcement Learning based Curriculum Optimization for Neural Machine Translation](https://aclanthology.org/N19-1208.pdf) |
| 2019 | NAACL | [Posterior-regularized REINFORCE for Instance Selection in Distant Supervision](https://aclanthology.org/N19-1290.pdf) |
| 2019 | NAACL | [Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction](https://aclanthology.org/N19-1315.pdf) |
| 2019 | NAACL | [Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus](https://aclanthology.org/N19-1320.pdf) |
| 2019 | NAACL | [Playing Text-Adventure Games with Graph-Based Deep Reinforcement Learning](https://aclanthology.org/N19-1358.pdf) |
| 2019 | NAACL | [Learning When Not to Answer: a Ternary Reward Structure for Reinforcement Learning Based Question Answering](https://aclanthology.org/N19-2016.pdf) |
| 2020 | ACL | [Zero-shot Text Classification via Reinforced Self-training](https://aclanthology.org/2020.acl-main.272.pdf) |
| 2020 | ACL | [A Reinforced Generation of Adversarial Examples for Neural Machine Translation](https://aclanthology.org/2020.acl-main.319.pdf) |
| 2020 | ACL | [Improving Entity Linking through Semantic Reinforced Entity Embeddings](https://aclanthology.org/2020.acl-main.612.pdf) |
| 2020 | ACL | [Meta-Reinforced Multi-Domain State Generator for Dialogue Systems](https://aclanthology.org/2020.acl-main.636.pdf) |
| 2020 | ACL | [Noise Pollution in Hospital Readmission Prediction: Long Document Classification with Reinforcement Learning](https://aclanthology.org/2020.bionlp-1.10.pdf) |
| 2020 | ACL | [A Deep Reinforced Model for Zero-Shot Cross-Lingual Summarization with Bilingual Semantic Similarity Rewards](https://aclanthology.org/2020.ngt-1.7.pdf) |
| 2020 | EMNLP | [Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning](https://aclanthology.org/2020.emnlp-main.136.pdf) |
| 2020 | EMNLP | [Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning](https://aclanthology.org/2020.emnlp-main.175.pdf) |
| 2020 | EMNLP | [Human-centric dialog training via offline reinforcement learning](https://aclanthology.org/2020.emnlp-main.327.pdf) |
| 2020 | EMNLP | [Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning](https://aclanthology.org/2020.emnlp-main.469.pdf) |
| 2020 | EMNLP | [Interactive Fiction Game Playing as Multi-Paragraph Reading Comprehension with Reinforcement Learning](https://aclanthology.org/2020.emnlp-main.624.pdf) |
| 2020 | EMNLP | [Knowledge-guided Open Attribute Value Extraction with Reinforcement Learning](https://aclanthology.org/2020.emnlp-main.693.pdf) |
| 2020 | EMNLP | [Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation](https://aclanthology.org/2020.emnlp-main.726.pdf) |
| 2020 | EMNLP | [Production-based Cognitive Models as a Test Suite for Reinforcement Learning Algorithms](https://aclanthology.org/2020.cmcl-1.3.pdf) |
| 2020 | COLING | [A Learning-Exploring Method to Generate Diverse Paraphrases with Multi-Objective Deep Reinforcement Learning](https://aclanthology.org/2020.coling-main.209.pdf) |
| 2020 | COLING | [Reinforced Multi-task Approach for Multi-hop Question Generation](https://aclanthology.org/2020.coling-main.249.pdf) |
| 2020 | COLING | [Combining Cognitive Modeling and Reinforcement Learning for Clarification in Dialogue](https://aclanthology.org/2020.coling-main.391.pdf) |
| 2020 | COLING | [Answer-driven Deep Question Generation based on Reinforcement Learning](https://aclanthology.org/2020.coling-main.452.pdf) |
| 2020 | COLING | [Interactive Question Clarification in Dialogue via Reinforcement Learning](https://aclanthology.org/2020.coling-industry.8.pdf) |
| 2020 | AACL | [ExpanRL: Hierarchical Reinforcement Learning for Course Concept Expansion in MOOCs](https://aclanthology.org/2020.aacl-main.77.pdf) |
| 2020 | AACL | [Text Simplification with Reinforcement Learning Using Supervised Rewards on Grammaticality, Meaning Preservation, and Simplicity](https://aclanthology.org/2020.aacl-srw.22.pdf) |
| 2020 | Findings | [Reinforcement Learning with Imbalanced Dataset for Data-to-Text Medical Report Generation](https://aclanthology.org/2020.findings-emnlp.202.pdf) |
| 2020 | Findings | [Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems](https://aclanthology.org/2020.findings-emnlp.316.pdf) |
| 2021 | ACL | [How Helpful is Inverse Reinforcement Learning for Table-to-Text Generation?](https://aclanthology.org/2021.acl-short.11.pdf) |
| 2021 | ACL | [Reinforcement Learning for Abstractive Question Summarization with Question-aware Semantic Rewards](https://aclanthology.org/2021.acl-short.33.pdf) |
| 2021 | ACL | [Efficient Text-based Reinforcement Learning by Jointly Leveraging State and Commonsense Graph Representations](https://aclanthology.org/2021.acl-short.91.pdf) |
| 2021 | ACL | [A Proposal: Interactively Learning to Summarise Timelines by Reinforcement Learning](https://aclanthology.org/2021.internlp-1.4.pdf) |
| 2021 | ACL | [Meta-Reinforcement Learning for Mastering Multiple Skills and Generalizing across Environments in Text-based Games](https://aclanthology.org/2021.metanlp-1.1.pdf) |
| 2021 | ACL | [Interactive Reinforcement Learning for Table Balancing Robot](https://aclanthology.org/2021.splurobonlp-1.8.pdf) |
| 2021 | ACL | [RewardsOfSum: Exploring Reinforcement Learning Rewards for Summarisation](https://aclanthology.org/2021.spnlp-1.1.pdf) |
| 2021 | ACL | [Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks](https://aclanthology.org/2021.spnlp-1.4.pdf) |
| 2021 | NAACL | [Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning](https://aclanthology.org/2021.naacl-main.83.pdf) |
| 2021 | NAACL | [Revisiting the Weaknesses of Reinforcement Learning for Neural Machine Translation](https://aclanthology.org/2021.naacl-main.133.pdf) |
| 2021 | NAACL | [Quantitative Day Trading from Natural Language using Reinforcement Learning](https://aclanthology.org/2021.naacl-main.316.pdf) |
| 2021 | EACL | [ECOL-R: Encouraging Copying in Novel Object Captioning with Reinforcement Learning](https://aclanthology.org/2021.eacl-main.104.pdf) |
| 2021 | EACL | [Implicit Unlikelihood Training: Improving Neural Text Generation with Reinforcement Learning](https://aclanthology.org/2021.eacl-main.123.pdf) |
| 2021 | EACL | [Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation](https://aclanthology.org/2021.eacl-main.281.pdf) |
| 2021 | Findings | [Better Chinese Sentence Segmentation with Reinforcement Learning](https://aclanthology.org/2021.findings-acl.25.pdf) |
| 2021 | Findings | [Language-based General Action Template for Reinforcement Learning Agents](https://aclanthology.org/2021.findings-acl.187.pdf) |
| 2021 | Findings | [Rule-Aware Reinforcement Learning for Knowledge Graph Reasoning](https://aclanthology.org/2021.findings-acl.412.pdf) |
| 2021 | Findings | [Phrase-Level Action Reinforcement Learning for Neural Dialog Response Generation](https://aclanthology.org/2021.findings-acl.446.pdf) |

## Licenses

[![CC0](http://i.creativecommons.org/p/zero/1.0/88x31.png)](http://creativecommons.org/publicdomain/zero/1.0/)

To the extent possible under law, [Zhihong Chen](https://github.com/zhjohnchan) has waived all copyright and related or neighboring rights to this work.