https://github.com/thunlp/RCPapers

## Must-read papers on Machine Reading Comprehension.

Contributed by [Yankai Lin](http://www.thunlp.org/~lyk/), Deming Ye and Haozhe Ji.

### Model Architecture

1. **Memory networks.** Jason Weston, Sumit Chopra, and Antoine Bordes. arXiv preprint arXiv:1410.3916 (2014). [paper](https://arxiv.org/pdf/1410.3916)
2. **Teaching Machines to Read and Comprehend.** Karl Moritz Hermann, Tomáš Kočiský, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. NIPS 2015. [paper](https://papers.nips.cc/paper/5945-teaching-machines-to-read-and-comprehend.pdf)
3. **Text Understanding with the Attention Sum Reader Network.** Rudolf Kadlec, Martin Schmid, Ondrej Bajgar, and Jan Kleindienst. ACL 2016. [paper](http://www.aclweb.org/anthology/P16-1086)
4. **A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task.** Danqi Chen, Jason Bolton, and Christopher D. Manning. ACL 2016. [paper](https://www.aclweb.org/anthology/P16-1223)
4. **Long Short-Term Memory-Networks for Machine Reading.** Jianpeng Cheng, Li Dong, and Mirella Lapata. EMNLP 2016. [paper](https://aclweb.org/anthology/D16-1053)
4. **Key-value Memory Networks for Directly Reading Documents.** Alexander Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, and Jason Weston. EMNLP 2016. [paper](http://www.aclweb.org/anthology/D16-1147)
5. **Modeling Human Reading with Neural Attention.** Michael Hahn and Frank Keller. EMNLP 2016. [paper](http://www.aclweb.org/anthology/D16-1009)
6. **Learning Recurrent Span Representations for Extractive Question Answering.** Kenton Lee, Shimi Salant, Tom Kwiatkowski, Ankur Parikh, Dipanjan Das, and Jonathan Berant. arXiv preprint arXiv:1611.01436 (2016). [paper](https://arxiv.org/pdf/1611.01436)
7. **Multi-Perspective Context Matching for Machine Comprehension.** Zhiguo Wang, Haitao Mi, Wael Hamza, and Radu Florian. arXiv preprint arXiv:1612.04211. [paper](https://arxiv.org/pdf/1612.04211)
5. **Natural Language Comprehension with the EpiReader.** Adam Trischler, Zheng Ye, Xingdi Yuan, and Kaheer Suleman. EMNLP 2016. [paper](https://www.aclweb.org/anthology/D16-1013)
6. **Iterative Alternating Neural Attention for Machine Reading.** Alessandro Sordoni, Philip Bachman, Adam Trischler, and Yoshua Bengio. arXiv preprint arXiv:1606.02245 (2016). [paper](https://arxiv.org/pdf/1606.02245)
7. **Bidirectional Attention Flow for Machine Comprehension.** Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, and Hannaneh Hajishirzi. ICLR 2017. [paper](https://arxiv.org/pdf/1611.01603.pdf)
8. **Machine Comprehension Using Match-LSTM and Answer Pointer.** Shuohang Wang and Jing Jiang. arXiv preprint arXiv:1608.07905 (2016). [paper](https://arxiv.org/pdf/1608.07905)
9. **Gated Self-matching Networks for Reading Comprehension and Question Answering.** Wenhui Wang, Nan Yang, Furu Wei, Baobao Chang, and Ming Zhou. ACL 2017. [paper](http://www.aclweb.org/anthology/P17-1018)
10. **Attention-over-attention Neural Networks for Reading Comprehension.** Yiming Cui, Zhipeng Chen, Si Wei, Shijin Wang, Ting Liu, and Guoping Hu. ACL 2017. [paper](http://aclweb.org/anthology/P17-1055)
11. **Gated-attention Readers for Text Comprehension.** Bhuwan Dhingra, Hanxiao Liu, Zhilin Yang, William W. Cohen, and Ruslan Salakhutdinov. ACL 2017. [paper](http://aclweb.org/anthology/P17-1168)
12. **A Constituent-Centric Neural Architecture for Reading Comprehension.** Pengtao Xie and Eric Xing. ACL 2017. [paper](http://aclweb.org/anthology/P17-1129)
12. **Structural Embedding of Syntactic Trees for Machine Comprehension.** Rui Liu, Junjie Hu, Wei Wei, Zi Yang, and Eric Nyberg. EMNLP 2017. [paper](http://aclweb.org/anthology/D17-1085)
13. **Accurate Supervised and Semi-Supervised Machine Reading for Long Documents.** Izzeddin Gur, Daniel Hewlett, Alexandre Lacoste, and Llion Jones. EMNLP 2017. [paper](http://aclweb.org/anthology/D17-1214)
13. **MEMEN: Multi-layer Embedding with Memory Networks for Machine Comprehension.** Boyuan Pan, Hao Li, Zhou Zhao, Bin Cao, Deng Cai, and Xiaofei He. arXiv preprint arXiv:1707.09098 (2017). [paper](https://arxiv.org/pdf/1707.09098)
14. **Dynamic Coattention Networks For Question Answering.** Caiming Xiong, Victor Zhong, and Richard Socher. ICLR 2017. [paper](https://arxiv.org/pdf/1611.01604.pdf)
14. **R-NET: Machine Reading Comprehension with Self-matching Networks.** Natural Language Computing Group, Microsoft Research Asia. [paper](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/r-net.pdf)
15. **ReasoNet: Learning to Stop Reading in Machine Comprehension.** Yelong Shen, Po-Sen Huang, Jianfeng Gao, and Weizhu Chen. KDD 2017. [paper](https://arxiv.org/pdf/1609.05284)
16. **FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension.** Hsin-Yuan Huang, Chenguang Zhu, Yelong Shen, and Weizhu Chen. ICLR 2018. [paper](https://arxiv.org/pdf/1711.07341)
17. **Making Neural QA as Simple as Possible but not Simpler.** Dirk Weissenborn, Georg Wiese, and Laura Seiffe. CoNLL 2017. [paper](http://www.aclweb.org/anthology/K17-1028)
18. **Efficient and Robust Question Answering from Minimal Context over Documents.** Sewon Min, Victor Zhong, Richard Socher, and Caiming Xiong. ACL 2018. [paper](http://aclweb.org/anthology/P18-1160)
19. **Simple and Effective Multi-Paragraph Reading Comprehension.** Christopher Clark and Matt Gardner. ACL 2018. [paper](http://aclweb.org/anthology/P18-1078)
18. **Neural Speed Reading via Skim-RNN.** Minjoon Seo, Sewon Min, Ali Farhadi, and Hannaneh Hajishirzi. ICLR 2018. [paper](https://arxiv.org/pdf/1711.02085)
19. **Hierarchical Attention Flow for Multiple-Choice Reading Comprehension.** Haichao Zhu, Furu Wei, Bing Qin, and Ting Liu. AAAI 2018. [paper](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/16331/16177)
20. **Towards Reading Comprehension for Long Documents.** Yuanxing Zhang, Yangbin Zhang, Kaigui Bian, and Xiaoming Li. IJCAI 2018. [paper](https://www.ijcai.org/proceedings/2018/0638.pdf)
21. **Joint Training of Candidate Extraction and Answer Selection for Reading Comprehension.** Zhen Wang, Jiachen Liu, Xinyan Xiao, Yajuan Lyu, and Tian Wu. ACL 2018. [paper](http://aclweb.org/anthology/P18-1159)
22. **Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification.** Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li, and Haifeng Wang. ACL 2018. [paper](http://aclweb.org/anthology/P18-1178)
23. **Reinforced Mnemonic Reader for Machine Reading Comprehension.** Minghao Hu, Yuxing Peng, Zhen Huang, Xipeng Qiu, Furu Wei, and Ming Zhou. IJCAI 2018. [paper](https://www.ijcai.org/proceedings/2018/0570.pdf)
24. **Stochastic Answer Networks for Machine Reading Comprehension.** Xiaodong Liu, Yelong Shen, Kevin Duh, and Jianfeng Gao. ACL 2018. [paper](http://aclweb.org/anthology/P18-1157)
25. **Multi-Granularity Hierarchical Attention Fusion Networks for Reading Comprehension and Question Answering.** Wei Wang, Ming Yan, and Chen Wu. ACL 2018. [paper](http://aclweb.org/anthology/P18-1158)
26. **A Multi-Stage Memory Augmented Neural Network for Machine Reading Comprehension.** Seunghak Yu, Sathish Indurthi, Seohyun Back, and Haejun Lee. ACL 2018 workshop. [paper](http://aclweb.org/anthology/W18-2603)
27. **S-NET: From Answer Extraction to Answer Generation for Machine Reading Comprehension.** Chuanqi Tan, Furu Wei, Nan Yang, Bowen Du, Weifeng Lv, and Ming Zhou. AAAI 2018. [paper](https://arxiv.org/abs/1706.04815)
28. **Ask the Right Questions: Active Question Reformulation with Reinforcement Learning.** Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, and Wei Wang. ICLR 2018. [paper](https://arxiv.org/abs/1705.07830)
29. **QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension.** Adams Wei Yu, David Dohan, Minh-Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, and Quoc V. Le. ICLR 2018. [paper](https://arxiv.org/abs/1804.09541)
30. **Read + Verify: Machine Reading Comprehension with Unanswerable Questions.** Minghao Hu, Furu Wei, Yuxing Peng, Zhen Huang, Nan Yang, and Ming Zhou. AAAI 2019. [paper](https://arxiv.org/abs/1808.05759)
31. **Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering.** Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, Caiming Xiong. [paper](https://arxiv.org/abs/1911.10470)

### Utilizing External Knowledge
1. **Leveraging Knowledge Bases in LSTMs for Improving Machine Reading.** Bishan Yang and Tom Mitchell. ACL 2017. [paper](http://aclweb.org/anthology/P17-1132)
2. **Learned in Translation: Contextualized Word Vectors.** Bryan McCann, James Bradbury, Caiming Xiong, and Richard Socher. arXiv preprint arXiv:1708.00107 (2017). [paper](https://arxiv.org/pdf/1708.00107)
3. **Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge.** Todor Mihaylov and Anette Frank. ACL 2018. [paper](http://aclweb.org/anthology/P18-1076)
4. **A Comparative Study of Word Embeddings for Reading Comprehension.** Bhuwan Dhingra, Hanxiao Liu, Ruslan Salakhutdinov, and William W. Cohen. arXiv preprint arXiv:1703.00993 (2017). [paper](https://arxiv.org/pdf/1703.00993)
5. **Deep contextualized word representations.** Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. NAACL 2018. [paper](http://aclweb.org/anthology/N18-1202)
6. **Improving Language Understanding by Generative Pre-Training.** Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. OpenAI. [paper](https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf)
6. **BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.** Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. arXiv preprint arXiv:1810.04805 (2018). [paper](https://arxiv.org/pdf/1810.04805.pdf)

### Exploration
1. **Adversarial Examples for Evaluating Reading Comprehension Systems.** Robin Jia and Percy Liang. EMNLP 2017. [paper](https://web.stanford.edu/~robinjia/pdf/emnlp2017-adversarial.pdf)
2. **Did the Model Understand the Question?** Pramod Kaushik Mudrakarta, Ankur Taly, Mukund Sundararajan, and Kedar Dhamdhere. ACL 2018. [paper](http://aclweb.org/anthology/P18-1176)

### Open Domain Question Answering
1. **Reading Wikipedia to Answer Open-Domain Questions.** Danqi Chen, Adam Fisch, Jason Weston, and Antoine Bordes. ACL 2017. [paper](http://aclweb.org/anthology/P17-1171)
2. **R^3: Reinforced Reader-Ranker for Open-Domain Question Answering.** Shuohang Wang, Mo Yu, Xiaoxiao Guo, Zhiguo Wang, Tim Klinger, Wei Zhang, Shiyu Chang, Gerald Tesauro, Bowen Zhou, and Jing Jiang. AAAI 2018. [paper](https://arxiv.org/pdf/1709.00023)
3. **Evidence Aggregation for Answer Re-Ranking in Open-Domain Question Answering.** Shuohang Wang, Mo Yu, Jing Jiang, Wei Zhang, Xiaoxiao Guo, Shiyu Chang, Zhiguo Wang, Tim Klinger, Gerald Tesauro, and Murray Campbell. ICLR 2018. [paper](https://arxiv.org/pdf/1711.05116)
4. **Denoising Distantly Supervised Open-Domain Question Answering.** Yankai Lin, Haozhe Ji, Zhiyuan Liu, and Maosong Sun. ACL 2018. [paper](http://aclweb.org/anthology/P18-1161)
5. **Answering Complex Open-domain Questions Through Iterative Query Generation.** Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, and Christopher D. Manning. EMNLP 2019. [paper](https://arxiv.org/abs/1910.07000v1)

### Datasets
1. (SQuAD 1.0) **SQuAD: 100,000+ Questions for Machine Comprehension of Text.** Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. EMNLP 2016. [paper](https://aclweb.org/anthology/D16-1264)
2. (SQuAD 2.0) **Know What You Don't Know: Unanswerable Questions for SQuAD.** Pranav Rajpurkar, Robin Jia, and Percy Liang. ACL 2018. [paper](http://aclweb.org/anthology/P18-2124)
3. (MS MARCO) **MS MARCO: A Human Generated MAchine Reading COmprehension Dataset.** Tri Nguyen, Mir Rosenberg, Xia Song, Jianfeng Gao, Saurabh Tiwary, Rangan Majumder, and Li Deng. arXiv preprint arXiv:1611.09268 (2016). [paper](https://arxiv.org/pdf/1611.09268)
4. (Quasar) **Quasar: Datasets for Question Answering by Search and Reading.** Bhuwan Dhingra, Kathryn Mazaitis, and William W. Cohen. arXiv preprint arXiv:1707.03904 (2017). [paper](https://arxiv.org/pdf/1707.03904)
5. (TriviaQA) **TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension.** Mandar Joshi, Eunsol Choi, Daniel S. Weld, and Luke Zettlemoyer. arXiv preprint arXiv:1705.03551 (2017). [paper](https://arxiv.org/pdf/1705.03551)
6. (SearchQA) **SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine.** Matthew Dunn, Levent Sagun, Mike Higgins, V. Ugur Guney, Volkan Cirik, and Kyunghyun Cho. arXiv preprint arXiv:1704.05179 (2017). [paper](https://arxiv.org/pdf/1704.05179)
7. (QuAC) **QuAC: Question Answering in Context.** Eunsol Choi, He He, Mohit Iyyer, Mark Yatskar, Wen-tau Yih, Yejin Choi, Percy Liang, and Luke Zettlemoyer. arXiv preprint arXiv:1808.07036 (2018). [paper](https://arxiv.org/pdf/1808.07036)
8. (CoQA) **CoQA: A Conversational Question Answering Challenge.** Siva Reddy, Danqi Chen, and Christopher D. Manning. arXiv preprint arXiv:1808.07042 (2018). [paper](https://arxiv.org/pdf/1808.07042)
7. (MCTest) **MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text.** Matthew Richardson, Christopher J.C. Burges, and Erin Renshaw. EMNLP 2013. [paper](http://www.aclweb.org/anthology/D13-1020)
8. (CNN/Daily Mail) **Teaching Machines to Read and Comprehend.** Karl Moritz Hermann, Tomáš Kočiský, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. NIPS 2015. [paper](https://papers.nips.cc/paper/5945-teaching-machines-to-read-and-comprehend.pdf)
9. (CBT) **The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations.** Felix Hill, Antoine Bordes, Sumit Chopra, and Jason Weston. arXiv preprint arXiv:1511.02301 (2015). [paper](https://arxiv.org/pdf/1511.02301)
10. (bAbI) **Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks.** Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin, and Tomas Mikolov. arXiv preprint arXiv:1502.05698 (2015). [paper](https://arxiv.org/pdf/1502.05698)
11. (LAMBADA) **The LAMBADA Dataset: Word Prediction Requiring a Broad Discourse Context.** Denis Paperno, Germán Kruszewski, Angeliki Lazaridou, Quan Ngoc Pham, Raffaella Bernardi, Sandro Pezzelle, Marco Baroni, Gemma Boleda, and Raquel Fernández. ACL 2016. [paper](https://www.aclweb.org/anthology/P16-1144)
12. (SCT) **LSDSem 2017 Shared Task: The Story Cloze Test.** Nasrin Mostafazadeh, Michael Roth, Annie Louis, Nathanael Chambers, and James F. Allen. ACL 2017 workshop. [paper](http://aclweb.org/anthology/W17-0906)
13. (Who did What) **Who did What: A Large-Scale Person-Centered Cloze Dataset.** Takeshi Onishi, Hai Wang, Mohit Bansal, Kevin Gimpel, and David McAllester. EMNLP 2016. [paper](https://aclweb.org/anthology/D16-1241)
14. (NewsQA) **NewsQA: A Machine Comprehension Dataset.** Adam Trischler, Tong Wang, Xingdi Yuan, Justin Harris, Alessandro Sordoni, Philip Bachman, and Kaheer Suleman. arXiv preprint arXiv:1611.09830 (2016). [paper](https://arxiv.org/pdf/1611.09830)
15. (RACE) **RACE: Large-scale ReAding Comprehension Dataset From Examinations.** Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, and Eduard Hovy. EMNLP 2017. [paper](http://aclweb.org/anthology/D17-1082)
16. (ARC) **Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge.** Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. arXiv preprint arXiv:1803.05457 (2018). [paper](https://arxiv.org/pdf/1803.05457)
17. (MCScript) **MCScript: A Novel Dataset for Assessing Machine Comprehension Using Script Knowledge.** Simon Ostermann, Ashutosh Modi, Michael Roth, Stefan Thater, and Manfred Pinkal. arXiv preprint arXiv:1803.05223. [paper](https://arxiv.org/pdf/1803.05223.pdf)
18. (NarrativeQA) **The NarrativeQA Reading Comprehension Challenge.** Tomáš Kočiský, Jonathan Schwarz, Phil Blunsom, Chris Dyer, Karl Moritz Hermann, Gábor Melis, and Edward Grefenstette. TACL 2018. [paper](http://aclweb.org/anthology/Q18-1023)
19. (DuoRC) **DuoRC: Towards Complex Language Understanding with Paraphrased Reading Comprehension.** Amrita Saha, Rahul Aralikatte, Mitesh M. Khapra, and Karthik Sankaranarayanan. ACL 2018. [paper](http://aclweb.org/anthology/P18-1156)
20. (CLOTH) **Large-scale Cloze Test Dataset Created by Teachers.** Qizhe Xie, Guokun Lai, Zihang Dai, and Eduard Hovy. EMNLP 2018. [paper](https://arxiv.org/pdf/1711.03225)
21. (DuReader) **DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications.** Wei He, Kai Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yuan Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, and Haifeng Wang. ACL 2018 Workshop. [paper](https://arxiv.org/abs/1711.05073)
22. (CliCR) **CliCR: a Dataset of Clinical Case Reports for Machine Reading Comprehension.** Simon Suster and Walter Daelemans. NAACL 2018. [paper](http://aclweb.org/anthology/N18-1140)
23. (QUOREF) **Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning.** Pradeep Dasigi, Nelson F. Liu, Ana Marasović, Noah A. Smith, and Matt Gardner. EMNLP 2019. [paper](https://arxiv.org/abs/1908.05803)