{"id":13754252,"url":"https://github.com/allenai/macaw","last_synced_at":"2025-10-13T15:58:04.032Z","repository":{"id":45071125,"uuid":"366787578","full_name":"allenai/macaw","owner":"allenai","description":"Multi-angle c(q)uestion answering","archived":false,"fork":false,"pushed_at":"2022-08-22T13:19:41.000Z","size":357,"stargazers_count":458,"open_issues_count":7,"forks_count":57,"subscribers_count":19,"default_branch":"main","last_synced_at":"2024-11-16T07:33:15.451Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/allenai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-05-12T16:53:07.000Z","updated_at":"2024-07-25T15:13:36.000Z","dependencies_parsed_at":"2022-07-18T06:00:37.895Z","dependency_job_id":null,"html_url":"https://github.com/allenai/macaw","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allenai%2Fmacaw","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allenai%2Fmacaw/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allenai%2Fmacaw/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allenai%2Fmacaw/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/allenai","download_url":"https://codeload.github.com/allenai/macaw/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253335702,"owners_count":21892714,"icon_url":"https://g
ithub.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T09:01:52.167Z","updated_at":"2025-10-13T15:58:03.932Z","avatar_url":"https://github.com/allenai.png","language":"Python","funding_links":[],"categories":["文本匹配 文本检索 文本相似度","Python"],"sub_categories":["其他_文本生成、文本对话"],"readme":"# Macaw\n\n## Introduction\n\nMacaw (\u003cb\u003eM\u003c/b\u003eulti-\u003cb\u003ea\u003c/b\u003engle \u003cb\u003ec\u003c/b\u003e(q)uestion \u003cb\u003ea\u003c/b\u003ens\u003cb\u003ew\u003c/b\u003eering) is a ready-to-use model capable of general \nquestion answering, showing robustness outside the domains it was \ntrained on. It has been trained in \"multi-angle\" fashion, which means it can handle a flexible set of input\nand output \"slots\" (like question, answer, explanation) .\n\nMacaw was built on top of [T5](https://github.com/google-research/text-to-text-transfer-transformer) and \ncomes in different sizes:  [macaw-11b](https://huggingface.co/allenai/macaw-11b), [macaw-3b](https://huggingface.co/allenai/macaw-3b), \nand [macaw-large](https://huggingface.co/allenai/macaw-large), as well as an answer-focused version featured on \nvarious leaderboards: [macaw-answer-11b](https://huggingface.co/allenai/macaw-answer-11b) (see [below](#training-data)).\n\n### Examples\n\nSome suggestive examples from the Macaw (11B) model, for different angles:\n\n  * (Q→A) \u003ci\u003eGiven a question, what's the answer?\u003c/i\u003e \u003cbr\u003e\n  **Q: James went camping in the woods, but forgot to bring a hammer to bang the tent pegs in. What else might he use? 
\u003cbr\u003e \n  → A: rocks**\n  \n  * (QM→A) \u003ci\u003eGiven a question and answer choices, what's the answer?\u003c/i\u003e \u003cbr\u003e\n  **Q: James went camping in the woods, but forgot to bring a hammer to bang the tent pegs in. What else might he use? \u003cbr\u003e \n           M: (A) a leaf (B) a log (C) a worm \u003cbr\u003e\n  → A: a log**\n\n  * (Q→AE) \u003ci\u003eGiven a question, what's the answer and an explanation?\u003c/i\u003e\u003cbr\u003e\n  **Q: Which force pulls objects to the ground? \u003cbr\u003e\n  → A: gravity \u003cbr\u003e\n  → E: Gravitational force causes objects that have mass to be pulled down on a planet.**\n\n  * (A→QE) \u003ci\u003eGiven an answer, what's a plausible question and explanation?\u003c/i\u003e\u003cbr\u003e\n  **A: elephant \u003cbr\u003e\n  → Q: Which animal has the largest ears? \u003cbr\u003e\n  → E: The ears of an elephant are the largest.**\n\n  * (C→QA) \u003ci\u003eGiven a context, what's a plausible question and answer?\u003c/i\u003e\u003cbr\u003e\n  **C: A car needs a battery to start. \u003cbr\u003e\n  → Q: What is required for a car to start? 
\u003cbr\u003e\n  → A: battery**\n  \nFor many more examples of the basic Q→A angle, see [examples.md](examples.md).\n\n## Usage examples\n\nMacaw can be used directly with the Hugging Face [transformers](https://github.com/huggingface/transformers) \nlibrary. The example below uses the \nsmallest model (not generally recommended, but with a much \nsmaller footprint) and, given a question, asks for an answer and \nsuggested multiple-choice answer options.\n\n```\nfrom transformers import AutoTokenizer, AutoModelForSeq2SeqLM\ntokenizer = AutoTokenizer.from_pretrained(\"allenai/macaw-large\")\nmodel = AutoModelForSeq2SeqLM.from_pretrained(\"allenai/macaw-large\")\n# Slots without a value (answer, mcoptions) are requested as output; question is the input\ninput_string = \"$answer$ ; $mcoptions$ ; $question$ = What is the color of a cloudy sky?\"\ninput_ids = tokenizer.encode(input_string, return_tensors=\"pt\")\noutput = model.generate(input_ids, max_length=200)\n\n\u003e\u003e\u003e tokenizer.batch_decode(output, skip_special_tokens=True)\n['$answer$ = gray ; $mcoptions$ = (A) blue (B) white (C) grey (D) white']\n```\n\n(Run `pip install -r requirements.txt` if any dependencies are missing.) 
Note that there is no guarantee the different \nslots are fully coherent, as with gray/grey (and the duplicate \"white\") here.\nThis is more pronounced for the macaw-large model than for the larger ones.\n\nThe code in `macaw/utils.py` includes some convenience wrappers, such as `load_model` and \n`run_macaw`. Here are some examples that\nload the macaw-11b model onto two GPUs (the largest model needs around 48GB of total GPU memory \nto work):\n\n```\nfrom macaw.utils import load_model, run_macaw\nmodel_dict = load_model(\"allenai/macaw-11b\", cuda_devices=[0,1])\nres1 = run_macaw(\"Q: Which force pulls objects to the ground?\\nA\\nE\", model_dict)\n# Alternate input syntax\nres2 = run_macaw({\"Q:\":\"Which force causes a compass needle to point north?\", \"A\":\"\"}, model_dict)\n# Add sampling options for the output\nres3 = run_macaw(\"Q: Which force pulls objects to the ground?\\nA\\nE\", model_dict, {\"do_sample\": True, \"temperature\": 2.0})\n\n\u003e\u003e\u003e [print(res[\"output_slots_list\"][0]) for res in [res1, res2, res3]]\n{'answer': 'gravity', 'explanation': 'Gravitational force causes objects that have mass to be pulled down on a planet.'}\n{'answer': 'magnetism'}\n{'answer': 'gravitional force', 'explanation': 'Gravitational force causes objects that have mass to be pulled down on a planet.'}\n```\n\nFor pointers on batch evaluation of instances at various angles, see [`macaw/batch_eval.py`](macaw/batch_eval.py).\n\n## Supported slots\n\nHere are the slots available in Macaw, generally applicable for both input and output:\n\n| Slot name | Description | Example | \n|---|---|---|\n|question (Q) | Question text | What is the color of a cloudy sky? |\n|answer (A) | Answer text | The sky is blue |\n|mcoptions (M) | Multiple-choice answer options |  (A) blue (B) white (C) grey |\n|context (C) | Potentially relevant context (noisy IR) | The sky looks blue to us because... |\n|explanation (E) | Sentences explaining the answer | A cloudy sky is usually gray in color... 
|\n\nAn angle is a specific set of input/output slots. For instance, QM→AE is the task of producing an answer and explanation,\ngiven a question and multiple-choice options. Macaw is trained on a wide variety of angles and handles unseen angles\nas well. One exception is that the context (C) only appears as an input slot in the training data.\n\n  \n## The Challenge300 dataset of probing questions\n\nThe **Challenge300** dataset of 300 diverse probing examples can be found in \n[challenge300-probes-v1.jsonl](challenge300-probes-v1.jsonl). The basic Q→A output\nfrom Macaw (at different sizes), as well as outputs from [GPT3](https://arxiv.org/pdf/2005.14165.pdf), \n[Jurassic-1](https://www.ai21.com/blog/announcing-ai21-studio-and-jurassic-1) and \n[alternate T5 models](https://www.aclweb.org/anthology/2020.emnlp-main.437/) trained on NaturalQuestions, can be seen in\n[examples.md](examples.md).\n\n## Demo\n\nSee [DEMO.md](DEMO.md) for instructions and code to host an interactive version of Macaw.\n\n## Training data\n\nMacaw was trained in two steps from the text-to-text transformer \nmodel [T5](https://github.com/google-research/text-to-text-transfer-transformer):\n\n   1. Multi-angle version of [UnifiedQA](https://github.com/allenai/unifiedqa) by fine-tuning T5\n   on the following 7 datasets and associated angles:\n       * [BoolQ](https://github.com/google-research-datasets/boolean-questions), \n       [SQuAD2.0](https://rajpurkar.github.io/SQuAD-explorer), \n       [NarrativeQA](https://github.com/deepmind/narrativeqa): QC→A, AC→Q\n       * [ARC](https://allenai.org/data/arc), [OBQA](https://allenai.org/data/open-book-qa): \n       QMC→A, QC→A, QM→A, QAC→M, MAC→Q, AC→QM\n       * [RACE](https://www.cs.cmu.edu/~glai1/data/race/), \n       [MCTest](https://mattr1.github.io/mctest/): QMC→A, QC→A, QAC→M, MAC→Q\n       \n   2. 
Further fine-tuning of Multi-Angle UnifiedQA on multiple-choice and direct-answer elementary science questions, \n   along with (up to 5) explanation sentences from [WorldTreeV2](http://cognitiveai.org/explanationbank/): \n       * [ARC](https://allenai.org/data/arc): QMC→AE, AQC→M, QMEC→A, QME→A, QE→A, QMC→A, QC→AE, QM→AE, QMAC→E, QMA→E\n       * [ARC-DA](https://allenai.org/data/arc-da): QC→AE, Q→AE, QC→A, Q→A, QEC→A, QE→A, AE→Q, AC→Q, QA→E, AQC→E\n       \n   3. A specialized answer-focused model, \u003cb\u003emacaw-answer-11b\u003c/b\u003e (called \"UnifiedQA + ARC MC/DA + IR\" on the \n   leaderboards for [ARC](https://leaderboard.allenai.org/arc/submissions/public), \n   [ARC-Easy](https://leaderboard.allenai.org/arc_easy/submissions/public), and \n   [ARC-DA](https://leaderboard.allenai.org/genie-arcda/submissions/public))\n   was trained on a smaller set of angles, not including explanations:\n       * ARC: QMC→A, QAC→M, QC→A, QM→A, MAC→Q, AC→QM, M→QA\n       * ARC-DA: QC→A, Q→A, AC→Q, C→QA\n       \n   \n## Available models\n\nThe Macaw models can be accessed from the Hugging Face model hub:\n\n   * [macaw-11b](https://huggingface.co/allenai/macaw-11b)  (11 billion parameters)\n   * [macaw-3b](https://huggingface.co/allenai/macaw-3b)  (3 billion parameters)\n   * [macaw-large](https://huggingface.co/allenai/macaw-large)  (770 million parameters)\n   * [macaw-answer-11b](https://huggingface.co/allenai/macaw-answer-11b)  (11 billion parameters)\n\nFor a sense of the degradation in performance for the smaller sizes, here are baseline scores on the ARC Challenge and \nARC Easy multiple-choice \u003cb\u003edevelopment\u003c/b\u003e questions. 
Included are variants with and without IR context from a large science \ncorpus (corresponding to angles QMC→A and QM→A respectively).\n\n|Model | ARC Challenge | ARC Challenge (no IR) | ARC Easy | ARC Easy (no IR)|\n|---|---|---|---|---|\n|Macaw (11B) | 76.9 | 74.6 | 91.2 | 84.9|\n|Macaw-3B | 68.2 | 67.9 | 87.9 |  77.7|\n|Macaw-large | 57.2 | 50.5 | 82.5 | 63.9|\n|Macaw-answer (11B) | 79.9 | 75.2 | 90.5 | 85.8|\n\n## Disclaimer\n\nBecause Macaw generates free-form text, its output is not guaranteed to be free of\noffensive material, so appropriate caution is advised when using the model.\n\n## Citation\n\nIf you use Macaw in your work, please reference the related [paper](https://arxiv.org/abs/2109.02593) using\n\n```\n@article{Tafjord2021Macaw,\n  title={General-Purpose Question-Answering with {M}acaw},\n  author={Oyvind Tafjord and Peter Clark},\n  journal={ArXiv},\n  year={2021},\n  volume={abs/2109.02593}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fallenai%2Fmacaw","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fallenai%2Fmacaw","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fallenai%2Fmacaw/lists"}