{"id":13589145,"url":"https://github.com/google-deepmind/narrativeqa","last_synced_at":"2025-06-17T00:39:19.563Z","repository":{"id":46145549,"uuid":"114897541","full_name":"google-deepmind/narrativeqa","owner":"google-deepmind","description":"This repository contains the NarrativeQA dataset. It includes the list of documents with Wikipedia summaries, links to full stories, and questions and answers.","archived":false,"fork":false,"pushed_at":"2020-04-15T09:16:14.000Z","size":4995,"stargazers_count":478,"open_issues_count":0,"forks_count":67,"subscribers_count":23,"default_branch":"master","last_synced_at":"2025-06-07T05:45:39.910Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-deepmind.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-12-20T14:39:57.000Z","updated_at":"2025-05-17T13:35:20.000Z","dependencies_parsed_at":"2022-09-24T14:50:23.614Z","dependency_job_id":null,"html_url":"https://github.com/google-deepmind/narrativeqa","commit_stats":null,"previous_names":["google-deepmind/narrativeqa","deepmind/narrativeqa"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/google-deepmind/narrativeqa","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fnarrativeqa","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fnarrativeqa/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fnarrativeqa/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fnarrativeqa/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-deepmind","download_url":"https://codeload.github.com/google-deepmind/narrativeqa/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fnarrativeqa/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260268635,"owners_count":22983601,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T16:00:23.925Z","updated_at":"2025-06-17T00:39:19.527Z","avatar_url":"https://github.com/google-deepmind.png","language":"Shell","funding_links":[],"categories":["Datasets","Document Question Answering","Shell"],"sub_categories":["English"],"readme":"# The NarrativeQA Reading Comprehension Challenge Dataset\n\nThis repository contains the NarrativeQA dataset. It includes the list of\ndocuments with Wikipedia summaries, links to full stories, and questions and\nanswers.\n\nFor a detailed description of this see the paper\n[The NarrativeQA Reading Comprehension\nChallenge](https://arxiv.org/abs/1712.07040).  Please cite the paper if you use\nthis corpus in your work.\n\n\n### Files\n\n* documents.csv - contains document_id, set, kind, story_url, story_file_size,\n  wiki_url, wiki_title, story_word_count, story_start, story_end. The word count\n  is approximate after some basic cleanup and tokenization.\n* third_party/wikipedia/summaries.csv - contains document_id, set, summary,\n  summary_tokenized. The summaries are from Wikipedia.\n* qaps.csv - contains document_id, set, question, answer1, answer2,\n  question_tokenized, answer1_tokenized, answer2_tokenized.\n* download_stories.sh - script to download the stories.\n* compare.sh - compare downloaded story's file size to the document size we had.\n  (At the time of publication, all stories have \u003c3.5% file difference (except\n  one), likely due to punctuation encoding.)\n\n### Bibtex\n\n```\n@article{narrativeqa,\nauthor = {Tom\\'a\\v s Ko\\v cisk\\'y and Jonathan Schwarz and Phil Blunsom and\n          Chris Dyer and Karl Moritz Hermann and G\\'abor Melis and\n          Edward Grefenstette},\ntitle = {The {NarrativeQA} Reading Comprehension Challenge},\njournal = {Transactions of the Association for Computational Linguistics},\nurl = {https://TBD},\nvolume = {TBD},\nyear = {2018},\npages = {TBD},\n}\n```\n\n### Dataset Metadata\nThe following table is necessary for this dataset to be indexed by search\nengines such as \u003ca href=\"https://g.co/datasetsearch\"\u003eGoogle Dataset Search\u003c/a\u003e.\n\u003cdiv itemscope itemtype=\"http://schema.org/Dataset\"\u003e\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eproperty\u003c/th\u003e\n    \u003cth\u003evalue\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ename\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"name\"\u003eThe NarrativeQA Reading Comprehension Challenge Dataset\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ealternateName\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"alternateName\"\u003eNarrativeQA\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eurl\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"url\"\u003ehttps://github.com/deepmind/narrativeqa\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003esameAs\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"sameAs\"\u003ehttps://github.com/deepmind/narrativeqa\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003edescription\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"description\"\u003eThis repository contains the NarrativeQA dataset. It includes the list of\ndocuments with Wikipedia summaries, links to full stories, and questions and answers.\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eprovider\u003c/td\u003e\n    \u003ctd\u003e\n      \u003cdiv itemscope itemtype=\"http://schema.org/Organization\" itemprop=\"provider\"\u003e\n        \u003ctable\u003e\n          \u003ctr\u003e\n            \u003cth\u003eproperty\u003c/th\u003e\n            \u003cth\u003evalue\u003c/th\u003e\n          \u003c/tr\u003e\n          \u003ctr\u003e\n            \u003ctd\u003ename\u003c/td\u003e\n            \u003ctd\u003e\u003ccode itemprop=\"name\"\u003eDeepMind\u003c/code\u003e\u003c/td\u003e\n          \u003c/tr\u003e\n          \u003ctr\u003e\n            \u003ctd\u003esameAs\u003c/td\u003e\n            \u003ctd\u003e\u003ccode itemprop=\"sameAs\"\u003ehttps://en.wikipedia.org/wiki/DeepMind\u003c/code\u003e\u003c/td\u003e\n          \u003c/tr\u003e\n        \u003c/table\u003e\n      \u003c/div\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003elicense\u003c/td\u003e\n    \u003ctd\u003e\n      \u003cdiv itemscope itemtype=\"http://schema.org/CreativeWork\" itemprop=\"license\"\u003e\n        \u003ctable\u003e\n          \u003ctr\u003e\n            \u003cth\u003eproperty\u003c/th\u003e\n            \u003cth\u003evalue\u003c/th\u003e\n          \u003c/tr\u003e\n          \u003ctr\u003e\n            \u003ctd\u003ename\u003c/td\u003e\n            \u003ctd\u003e\u003ccode itemprop=\"name\"\u003eApache License, Version 2.0\u003c/code\u003e\u003c/td\u003e\n          \u003c/tr\u003e\n          \u003ctr\u003e\n            \u003ctd\u003eurl\u003c/td\u003e\n            \u003ctd\u003e\u003ccode itemprop=\"url\"\u003ehttps://www.apache.org/licenses/LICENSE-2.0.html\u003c/code\u003e\u003c/td\u003e\n          \u003c/tr\u003e\n        \u003c/table\u003e\n      \u003c/div\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ecitation\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"citation\"\u003ehttps://identifiers.org/arxiv:1712.07040\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Fnarrativeqa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle-deepmind%2Fnarrativeqa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Fnarrativeqa/lists"}