Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/allenai/macaw
Multi-angle c(q)uestion answering
- Host: GitHub
- URL: https://github.com/allenai/macaw
- Owner: allenai
- License: apache-2.0
- Created: 2021-05-12T16:53:07.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-08-22T13:19:41.000Z (over 2 years ago)
- Last Synced: 2024-08-03T09:07:08.927Z (5 months ago)
- Language: Python
- Size: 349 KB
- Stars: 458
- Watchers: 19
- Forks: 57
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- StarryDivineSky - allenai/macaw - Multi-angle c(q)uestion answering is a ready-to-use model capable of general question answering, showing robustness outside the domains it was trained on. It was trained in a "multi-angle" fashion, meaning it can handle a flexible set of input and output "slots" (such as question, answer, explanation). Macaw is built on top of T5 and comes in different sizes: macaw-11b, macaw-3b, and macaw-large, as well as an answer-focused version featured on various leaderboards: macaw-answer-11b. (Text matching / text retrieval / text similarity / other text generation, dialogue)
README
# Macaw
## Introduction
Macaw (Multi-angle c(q)uestion answering) is a ready-to-use model capable of general
question answering, showing robustness outside the domains it was
trained on. It has been trained in "multi-angle" fashion, which means it can handle a flexible set of input
and output "slots" (like question, answer, explanation). Macaw was built on top of [T5](https://github.com/google-research/text-to-text-transfer-transformer) and
comes in different sizes: [macaw-11b](https://huggingface.co/allenai/macaw-11b), [macaw-3b](https://huggingface.co/allenai/macaw-3b),
and [macaw-large](https://huggingface.co/allenai/macaw-large), as well as an answer-focused version featured on
various leaderboards: [macaw-answer-11b](https://huggingface.co/allenai/macaw-answer-11b) (see [below](#training-data)).

### Examples
Some suggestive examples from the Macaw (11B) model, for different angles:
* (Q→A) Given a question, what's the answer?
**Q: James went camping in the woods, but forgot to bring a hammer to bang the tent pegs in. What else might he use?
→ A: rocks**
* (QM→A) Given a question and answer choices, what's the answer?
**Q: James went camping in the woods, but forgot to bring a hammer to bang the tent pegs in. What else might he use?
M: (A) a leaf (B) a log (C) a worm
→ A: a log**
* (Q→AE) Given a question, what's the answer and an explanation?
**Q: Which force pulls objects to the ground?
→ A: gravity
→ E: Gravitational force causes objects that have mass to be pulled down on a planet.**
* (A→QE) Given an answer, what's a plausible question and explanation?
**A: elephant
→ Q: Which animal has the largest ears?
→ E: The ears of an elephant are the largest.**
* (C→QA) Given a context, what's a plausible question and answer?
**C: A car needs a battery to start.
→ Q: What is required for a car to start?
→ A: battery**
For many more examples of the basic Q→A angle, see [examples.md](examples.md).

## Usage examples
Macaw can easily be used in the Hugging Face [transformers](https://github.com/huggingface/transformers)
library, as shown here for the smallest model (not generally recommended for quality, but it has a much
smaller footprint). Given a question, we want the model to return an answer and
suggested multiple-choice answer options.

```
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("allenai/macaw-large")
model = AutoModelForSeq2SeqLM.from_pretrained("allenai/macaw-large")
input_string = "$answer$ ; $mcoptions$ ; $question$ = What is the color of a cloudy sky?"
input_ids = tokenizer.encode(input_string, return_tensors="pt")
output = model.generate(input_ids, max_length=200)

>>> tokenizer.batch_decode(output, skip_special_tokens=True)
['$answer$ = gray ; $mcoptions$ = (A) blue (B) white (C) grey (D) white']
```

(Run `pip install -r requirements.txt` if any dependencies are missing.) Note that there is no guarantee the different
slots are fully coherent, as with gray/grey (and the duplicate "white") here; this is
more pronounced for the macaw-large model than for the larger ones.

The code in `macaw/utils.py` includes some convenience wrappers, such as `load_model` and
`run_macaw`. Here are some examples
loading the macaw-11b model onto two GPUs (around 48 GB of total GPU memory is needed for the
largest model):

```
from macaw.utils import load_model, run_macaw
model_dict = load_model("allenai/macaw-11b", cuda_devices=[0,1])
res1 = run_macaw("Q: Which force pulls objects to the ground?\nA\nE", model_dict)
# Alternate input syntax
res2 = run_macaw({"Q:":"Which force causes a compass needle to point north?", "A":""}, model_dict)
# Add sampling options for the output
res3 = run_macaw("Q: Which force pulls objects to the ground?\nA\nE", model_dict, {"do_sample": True, "temperature": 2.0})

>>> [print(res["output_slots_list"][0]) for res in [res1, res2, res3]]
{'answer': 'gravity', 'explanation': 'Gravitational force causes objects that have mass to be pulled down on a planet.'}
{'answer': 'magnetism'}
{'answer': 'gravitional force', 'explanation': 'Gravitational force causes objects that have mass to be pulled down on a planet.'}
```

For batch evaluation of instances at various angles, see [`macaw/batch_eval.py`](macaw/batch_eval.py) for pointers.
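The `$slot$ = value` strings that Macaw consumes and produces can be unpacked with a few lines of plain Python. The helper below is an illustrative sketch, not part of the Macaw codebase (`run_macaw` already returns parsed slots via `output_slots_list`):

```python
def parse_slots(output_string):
    """Parse a Macaw-style slot string such as
    '$answer$ = gray ; $mcoptions$ = (A) blue (B) white'
    into a dict like {'answer': 'gray', 'mcoptions': '(A) blue (B) white'}."""
    slots = {}
    for part in output_string.split(" ; "):
        name, sep, value = part.partition(" = ")
        if sep:  # skip bare slots that carry no value
            slots[name.strip().strip("$")] = value.strip()
    return slots

print(parse_slots("$answer$ = gray ; $mcoptions$ = (A) blue (B) white (C) grey (D) white"))
# → {'answer': 'gray', 'mcoptions': '(A) blue (B) white (C) grey (D) white'}
```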
## Supported slots
Here are the slots available in Macaw, generally applicable for both input and output:
| Slot name | Description | Example |
|---|---|---|
|question (Q) | Question text | What is the color of a cloudy sky? |
|answer (A) | Answer text | The sky is blue |
|mcoptions (M) | Multiple-choice answer options | (A) blue (B) white (C) grey |
|context (C) | Potentially relevant context (noisy IR) | The sky looks blue to us because... |
|explanation (E) | Sentences explaining the answer | A cloudy sky is usually gray in color... |

An angle is a specific set of input/output slots; for instance, QM→AE is the task of producing an answer and explanation,
given a question and multiple-choice options. Macaw is trained on a wide variety of angles and handles unseen angles
as well. One exception is that the context (C) only appears as an input slot in the training data.
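Concretely, an angle maps onto the input-string format from the usage example above: requested output slots appear bare, while input slots carry their values. A hypothetical helper (not part of the Macaw codebase) sketches this:

```python
def make_input(input_slots, output_slots):
    """Build a Macaw input string for a given angle:
    output slots are listed bare ('$answer$'), input slots
    carry their text ('$question$ = ...'), joined by ' ; '."""
    parts = [f"${name}$" for name in output_slots]
    parts += [f"${name}$ = {text}" for name, text in input_slots.items()]
    return " ; ".join(parts)

# The Q→AM angle from the usage example above:
print(make_input({"question": "What is the color of a cloudy sky?"},
                 ["answer", "mcoptions"]))
# → $answer$ ; $mcoptions$ ; $question$ = What is the color of a cloudy sky?
```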
## The Challenge300 dataset of probing questions

The **Challenge300** dataset of 300 diverse probing examples can be found in
[challenge300-probes-v1.jsonl](challenge300-probes-v1.jsonl). The basic Q→A output
from Macaw (at different sizes), as well as outputs from [GPT3](https://arxiv.org/pdf/2005.14165.pdf),
[Jurassic-1](https://www.ai21.com/blog/announcing-ai21-studio-and-jurassic-1) and
[alternate T5 models](https://www.aclweb.org/anthology/2020.emnlp-main.437/) trained on NaturalQuestions, can be seen in
[examples.md](examples.md).

## Demo
See [DEMO.md](DEMO.md) for instructions and code to host an interactive version of Macaw.
## Training data
Macaw was trained in two steps from the text-to-text transformer
model [T5](https://github.com/google-research/text-to-text-transfer-transformer):

1. Multi-angle version of [UnifiedQA](https://github.com/allenai/unifiedqa) by fine-tuning T5
on the following 7 datasets and associated angles:
* [BoolQ](https://github.com/google-research-datasets/boolean-questions),
[SQuAD2.0](https://rajpurkar.github.io/SQuAD-explorer),
[NarrativeQA](https://github.com/deepmind/narrativeqa): QC→A, AC→Q
* [ARC](https://allenai.org/data/arc), [OBQA](https://allenai.org/data/open-book-qa):
QMC→A, QC→A, QM→A, QAC→M, MAC→Q, AC→QM
* [RACE](https://www.cs.cmu.edu/~glai1/data/race/),
[MCTest](https://mattr1.github.io/mctest/): QMC→A, QC→A, QAC→M, MAC→Q
2. Further fine-tuning of Multi-Angle UnifiedQA on multiple-choice and direct-answer elementary science questions,
along with (up to 5) explanation sentences from [WorldTreeV2](http://cognitiveai.org/explanationbank/):
* [ARC](https://allenai.org/data/arc): QMC→AE, AQC→M, QMEC→A, QME→A, QE→A, QMC→A, QC→AE, QM→AE, QMAC→E, QMA→E
* [ARC-DA](https://allenai.org/data/arc-da): QC→AE, Q→AE, QC→A, Q→A, QEC→A, QE→A, AE→Q, AC→Q, QA→E, AQC→E
3. A specialized answer-focused model, macaw-answer-11b (called "UnifiedQA + ARC MC/DA + IR" on the
leaderboards for [ARC](https://leaderboard.allenai.org/arc/submissions/public),
[ARC-Easy](https://leaderboard.allenai.org/arc_easy/submissions/public), and
[ARC-DA](https://leaderboard.allenai.org/genie-arcda/submissions/public))
was trained on a smaller set of angles, not including explanations:
* ARC: QMC→A, QAC→M, QC→A, QM→A, MAC→Q, AC→QM, M→QA
* ARC-DA: QC→A, Q→A, AC→Q, C→QA
## Available models

The Macaw models can be accessed from the Hugging Face model hub:
* [macaw-11b](https://huggingface.co/allenai/macaw-11b) (11 billion parameters)
* [macaw-3b](https://huggingface.co/allenai/macaw-3b) (3 billion parameters)
* [macaw-large](https://huggingface.co/allenai/macaw-large) (770 million parameters)
* [macaw-answer-11b](https://huggingface.co/allenai/macaw-answer-11b) (11 billion parameters)

For a sense of the degradation in performance for the smaller sizes, here are baseline scores on the ARC Challenge and
ARC Easy multiple-choice development questions. Included are variants with and without IR context from a large science
corpus (corresponding to angles QMC→A and QM→A respectively).

|Model | ARC Challenge | ARC Challenge (no IR) | ARC Easy | ARC Easy (no IR)|
|---|---|---|---|---|
|Macaw (11B) | 76.9 | 74.6 | 91.2 | 84.9|
|Macaw-3B | 68.2 | 67.9 | 87.9 | 77.7|
|Macaw-large | 57.2 | 50.5 | 82.5 | 63.9|
|Macaw-answer (11B) | 79.9 | 75.2 | 90.5 | 85.8|

## Disclaimer
As a model capable of generating free-form text, the output of the model is not guaranteed to be free of
offensive material, so appropriate caution is advised when using the model.

## Citation
If you use Macaw in your work, please reference the related [paper](https://arxiv.org/abs/2109.02593) using
```
@article{Tafjord2021Macaw,
title={General-Purpose Question-Answering with {M}acaw},
author={Oyvind Tafjord and Peter Clark},
journal={ArXiv},
year={2021},
volume={abs/2109.02593}
}
```