{"id":13994095,"url":"https://github.com/nlpdata/dream","last_synced_at":"2025-07-22T18:33:16.974Z","repository":{"id":175775364,"uuid":"172094210","full_name":"nlpdata/dream","owner":"nlpdata","description":"DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension","archived":false,"fork":false,"pushed_at":"2019-04-23T03:56:31.000Z","size":1420,"stargazers_count":78,"open_issues_count":0,"forks_count":13,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-11-29T15:50:54.217Z","etag":null,"topics":["dataset","dialogue","machine-reading-comprehension"],"latest_commit_sha":null,"homepage":"https://dataset.org/dream/","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nlpdata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"license.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-02-22T16:01:14.000Z","updated_at":"2024-11-21T09:09:08.000Z","dependencies_parsed_at":null,"dependency_job_id":"d2a4f8ae-199a-4f09-ba6d-a720cc586cfb","html_url":"https://github.com/nlpdata/dream","commit_stats":null,"previous_names":["nlpdata/dream"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nlpdata/dream","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlpdata%2Fdream","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlpdata%2Fdream/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlpdata%2Fdream/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlpdata%2Fdream/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nlpdata","download_url":"https:/
/codeload.github.com/nlpdata/dream/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlpdata%2Fdream/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266552546,"owners_count":23947178,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-22T02:00:09.085Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataset","dialogue","machine-reading-comprehension"],"created_at":"2024-08-09T14:02:42.240Z","updated_at":"2025-07-22T18:33:11.958Z","avatar_url":"https://github.com/nlpdata.png","language":"Python","readme":"DREAM\n=====\nOverview\n--------\nThis repository maintains **DREAM**, a multiple-choice **D**ialogue-based **REA**ding comprehension exa**M**ination dataset.\n\n* Paper: https://arxiv.org/abs/1902.00164\n```\n@article{sundream2018,\n  title={{DREAM}: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension},\n  author={Sun, Kai and Yu, Dian and Chen, Jianshu and Yu, Dong and Choi, Yejin and Cardie, Claire},\n  journal={Transactions of the Association for Computational Linguistics},\n  year={2019},\n  url={https://arxiv.org/abs/1902.00164v1}\n}\n```\n\n* Leaderboard: https://dataset.org/dream/\n\nFiles in this repository:\n\n* ```data``` folder: the dataset.\n* ```annotation``` folder: question type annotations.\n* ```dsw++``` folder: code of DSW++.\n* ```ftlm++``` folder: code of FTLM++.\n* 
```license.txt```: the license of DREAM.\n* ```websites.txt```: list of websites used for the data collection of DREAM.\n\nDataset\n-------\n```data/train.json```, ```data/dev.json``` and ```data/test.json``` are the training, development and test sets, respectively. The format of them is as follows:\n\n```\n[\n  [\n    [\n      dialogue 1 / turn 1,\n      dialogue 1 / turn 2,\n      ...\n    ],\n    [\n      {\n        \"question\": dialogue 1 / question 1,\n        \"choice\": [\n          dialogue 1 / question 1 / answer option 1,\n          dialogue 1 / question 1 / answer option 2,\n          dialogue 1 / question 1 / answer option 3\n        ],\n        \"answer\": dialogue 1 / question 1 / correct answer option\n      },\n      {\n        \"question\": dialogue 1 / question 2,\n        \"choice\": [\n          dialogue 1 / question 2 / answer option 1,\n          dialogue 1 / question 2 / answer option 2,\n          dialogue 1 / question 2 / answer option 3\n        ],\n        \"answer\": dialogue 1 / question 2 / correct answer option\n      },\n      ...\n    ],\n    dialogue 1 / id\n  ],\n  [\n    [\n      dialogue 2 / turn 1,\n      dialogue 2 / turn 2,\n      ...\n    ],\n    [\n      {\n        \"question\": dialogue 2 / question 1,\n        \"choice\": [\n          dialogue 2 / question 1 / answer option 1,\n          dialogue 2 / question 1 / answer option 2,\n          dialogue 2 / question 1 / answer option 3\n        ],\n        \"answer\": dialogue 2 / question 1 / correct answer option\n      },\n      {\n        \"question\": dialogue 2 / question 2,\n        \"choice\": [\n          dialogue 2 / question 2 / answer option 1,\n          dialogue 2 / question 2 / answer option 2,\n          dialogue 2 / question 2 / answer option 3\n        ],\n        \"answer\": dialogue 2 / question 2 / correct answer option\n      },\n      ...\n    ],\n    dialogue 2 / id\n  ],\n  ...\n]\n```\n\nQuestion Type 
Annotations\n-------------------------\n\n```annotation/{annotator1,annotator2}_{dev,test}.json``` are the question type annotations for 25% of the questions in the development and test sets from two annotators.\n\nIn accordance with the format explanation above, the question index starts from ```1```.\n\nWe adopt the following abbreviations:\n\n| Abbreviation | Question Type | \n| ------------ | ------------- |\n| m            | matching      |\n| s            | summary       |\n| l            | logic         |\n| a            | arithmetic    |\n| c            | commonsense   |\n\nCode\n----\n\n* DSW++\n\n  1. Copy the data folder ```data``` to ```dsw++/```.\n  2. Download ```numberbatch-en-17.06.txt.gz``` from https://github.com/commonsense/conceptnet-numberbatch, and put it into ```dsw++/data/```.\n  3. In ```dsw++```, execute ```python run.py```.\n  4. Execute ```python evaluate.py``` to get the accuracy on the test set.\n\n* FTLM++\n\n  1. Download the pre-trained language model from https://github.com/openai/finetune-transformer-lm, and copy the model folder ```model``` to ```ftlm++/```.\n  2. Copy the data folder ```data``` to ```ftlm++/```.\n  3. In ```ftlm++```, execute ```python train.py --submit```. You may want to also specify ```--n_gpu``` (e.g., 4) and ```--n_batch``` (e.g., 2) based on your environment.\n  4. Execute ```python evaluate.py``` to get the accuracy on the test set.\n\n\n**Note**: The results you get may be slightly different from those reported in the paper. For example, the dev and test accuracies for DSW++ in this repository are 51.2 and 50.2, respectively, while the paper reports 51.4 and 50.1. 
This is because (1) we refactored the code with different dependencies to make it portable, and (2) some of the code is non-deterministic on GPUs.\n\n**Environment**: The code has been tested with Python 3.6/3.7 and TensorFlow 1.4.\n\nOther Useful Code\n-----------------\nYou can refer to [this repository](https://github.com/nlpdata/mrc_bert_baseline) for a fine-tuned transformer baseline based on BERT.\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnlpdata%2Fdream","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnlpdata%2Fdream","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnlpdata%2Fdream/lists"}