Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/allenai/macaw
Multi-angle c(q)uestion answering
- Host: GitHub
- URL: https://github.com/allenai/macaw
- Owner: allenai
- License: apache-2.0
- Created: 2021-05-12T16:53:07.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-08-22T13:19:41.000Z (over 2 years ago)
- Last Synced: 2024-08-03T09:07:08.927Z (5 months ago)
- Language: Python
- Size: 349 KB
- Stars: 458
- Watchers: 19
- Forks: 57
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- StarryDivineSky - allenai/macaw - Multi-angle c(q)uestion answering is a ready-to-use model capable of general question answering, showing robustness outside the domains it was trained on. It was trained in a "multi-angle" fashion, meaning it can handle a flexible set of input and output "slots" (such as question, answer, explanation). Macaw is built on top of T5 and comes in different sizes: macaw-11b, macaw-3b, and macaw-large, as well as an answer-focused version featured on various leaderboards: macaw-answer-11b. (Text matching / text retrieval / text similarity / other text generation, dialogue)
README
# Macaw
## Introduction
Macaw (Multi-angle c(q)uestion answering) is a ready-to-use model capable of general
question answering, showing robustness outside the domains it was
trained on. It has been trained in "multi-angle" fashion, which means it can handle a flexible set of input
and output "slots" (like question, answer, explanation). Macaw was built on top of [T5](https://github.com/google-research/text-to-text-transfer-transformer) and
comes in different sizes: [macaw-11b](https://huggingface.co/allenai/macaw-11b), [macaw-3b](https://huggingface.co/allenai/macaw-3b),
and [macaw-large](https://huggingface.co/allenai/macaw-large), as well as an answer-focused version featured on
various leaderboards: [macaw-answer-11b](https://huggingface.co/allenai/macaw-answer-11b) (see [below](#training-data)).

### Examples
Some suggestive examples from the Macaw (11B) model, for different angles:
* (Q→A) Given a question, what's the answer?
**Q: James went camping in the woods, but forgot to bring a hammer to bang the tent pegs in. What else might he use?
→ A: rocks**
* (QM→A) Given a question and answer choices, what's the answer?
**Q: James went camping in the woods, but forgot to bring a hammer to bang the tent pegs in. What else might he use?
M: (A) a leaf (B) a log (C) a worm
→ A: a log**
* (Q→AE) Given a question, what's the answer and an explanation?
**Q: Which force pulls objects to the ground?
→ A: gravity
→ E: Gravitational force causes objects that have mass to be pulled down on a planet.**
* (A→QE) Given an answer, what's a plausible question and explanation?
**A: elephant
→ Q: Which animal has the largest ears?
→ E: The ears of an elephant are the largest.**
* (C→QA) Given a context, what's a plausible question and answer?
**C: A car needs a battery to start.
→ Q: What is required for a car to start?
→ A: battery**
For many more examples of the basic Q→A angle, see [examples.md](examples.md).

## Usage examples
Macaw can easily be used in the Hugging Face [transformers](https://github.com/huggingface/transformers)
library, as shown here for the smallest model (not generally recommended for quality, but it has a much
smaller footprint). Given a question, we want the model to return an answer and
suggested multiple-choice answer options.

```
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("allenai/macaw-large")
model = AutoModelForSeq2SeqLM.from_pretrained("allenai/macaw-large")
input_string = "$answer$ ; $mcoptions$ ; $question$ = What is the color of a cloudy sky?"
input_ids = tokenizer.encode(input_string, return_tensors="pt")
output = model.generate(input_ids, max_length=200)

>>> tokenizer.batch_decode(output, skip_special_tokens=True)
['$answer$ = gray ; $mcoptions$ = (A) blue (B) white (C) grey (D) white']
```

(Run `pip install -r requirements.txt` if any dependencies are missing.) Note that there is no guarantee the different
slots are fully coherent, as with gray/grey (and the duplicate "white") here; this is
more pronounced for the macaw-large model than for the larger ones.

The code in `macaw/utils.py` includes some convenience wrappers, such as `load_model` and
`run_macaw`. Here are some examples
loading the macaw-11b model onto two GPUs (around 48 GB of total GPU memory is needed for the
largest model):

```
from macaw.utils import load_model, run_macaw
model_dict = load_model("allenai/macaw-11b", cuda_devices=[0,1])
res1 = run_macaw("Q: Which force pulls objects to the ground?\nA\nE", model_dict)
# Alternate input syntax
res2 = run_macaw({"Q:":"Which force causes a compass needle to point north?", "A":""}, model_dict)
# Add sampling options for the output
res3 = run_macaw("Q: Which force pulls objects to the ground?\nA\nE", model_dict, {"do_sample": True, "temperature": 2.0})

>>> [print(res["output_slots_list"][0]) for res in [res1, res2, res3]]
{'answer': 'gravity', 'explanation': 'Gravitational force causes objects that have mass to be pulled down on a planet.'}
{'answer': 'magnetism'}
{'answer': 'gravitional force', 'explanation': 'Gravitational force causes objects that have mass to be pulled down on a planet.'}
```

For batch evaluation of instances at various angles, see [`macaw/batch_eval.py`](macaw/batch_eval.py) for pointers.
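The `$slot$ = value` strings that Macaw consumes and produces can be unpacked with a few lines of plain Python. The helper below is an illustrative sketch, not part of the Macaw codebase (`run_macaw` already returns parsed slots via `output_slots_list`):

```python
def parse_slots(output_string):
    """Parse a Macaw-style slot string such as
    '$answer$ = gray ; $mcoptions$ = (A) blue (B) white'
    into a dict like {'answer': 'gray', 'mcoptions': '(A) blue (B) white'}."""
    slots = {}
    for part in output_string.split(" ; "):
        name, sep, value = part.partition(" = ")
        if sep:  # skip bare slots that carry no value
            slots[name.strip().strip("$")] = value.strip()
    return slots

print(parse_slots("$answer$ = gray ; $mcoptions$ = (A) blue (B) white (C) grey (D) white"))
# → {'answer': 'gray', 'mcoptions': '(A) blue (B) white (C) grey (D) white'}
```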
## Supported slots
Here are the slots available in Macaw, generally applicable for both input and output:
| Slot name | Description | Example |
|---|---|---|
|question (Q) | Question text | What is the color of a cloudy sky? |
|answer (A) | Answer text | The sky is blue |
|mcoptions (M) | Multiple-choice answer options | (A) blue (B) white (C) grey |
|context (C) | Potentially relevant context (noisy IR) | The sky looks blue to us because... |
|explanation (E) | Sentences explaining the answer | A cloudy sky is usually gray in color... |

An angle is a specific set of input/output slots; for instance, QM→AE is the task of producing an answer and explanation,
given a question and multiple-choice options. Macaw is trained on a wide variety of angles and handles unseen angles
as well. One exception is that the context (C) only appears as an input slot in the training data.
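Concretely, an angle maps onto the input-string format from the usage example above: requested output slots appear bare, while input slots carry their values. A hypothetical helper (not part of the Macaw codebase) sketches this:

```python
def make_input(input_slots, output_slots):
    """Build a Macaw input string for a given angle:
    output slots are listed bare ('$answer$'), input slots
    carry their text ('$question$ = ...'), joined by ' ; '."""
    parts = [f"${name}$" for name in output_slots]
    parts += [f"${name}$ = {text}" for name, text in input_slots.items()]
    return " ; ".join(parts)

# The Q→AM angle from the usage example above:
print(make_input({"question": "What is the color of a cloudy sky?"},
                 ["answer", "mcoptions"]))
# → $answer$ ; $mcoptions$ ; $question$ = What is the color of a cloudy sky?
```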
## The Challenge300 dataset of probing questions

The **Challenge300** dataset of 300 diverse probing examples can be found in
[challenge300-probes-v1.jsonl](challenge300-probes-v1.jsonl). The basic Q→A output
from Macaw (at different sizes), as well as outputs from [GPT3](https://arxiv.org/pdf/2005.14165.pdf),
[Jurassic-1](https://www.ai21.com/blog/announcing-ai21-studio-and-jurassic-1) and
[alternate T5 models](https://www.aclweb.org/anthology/2020.emnlp-main.437/) trained on NaturalQuestions, can be seen in
[examples.md](examples.md).

## Demo
See [DEMO.md](DEMO.md) for instructions and code to host an interactive version of Macaw.
## Training data
Macaw was trained in two steps from the text-to-text transformer
model [T5](https://github.com/google-research/text-to-text-transfer-transformer):

1. Multi-angle version of [UnifiedQA](https://github.com/allenai/unifiedqa) by fine-tuning T5
on the following 7 datasets and associated angles:
* [BoolQ](https://github.com/google-research-datasets/boolean-questions),
[SQuAD2.0](https://rajpurkar.github.io/SQuAD-explorer),
[NarrativeQA](https://github.com/deepmind/narrativeqa): QC→A, AC→Q
* [ARC](https://allenai.org/data/arc), [OBQA](https://allenai.org/data/open-book-qa):
QMC→A, QC→A, QM→A, QAC→M, MAC→Q, AC→QM
* [RACE](https://www.cs.cmu.edu/~glai1/data/race/),
[MCTest](https://mattr1.github.io/mctest/): QMC→A, QC→A, QAC→M, MAC→Q
2. Further fine-tuning of Multi-Angle UnifiedQA on multiple-choice and direct-answer elementary science questions,
along with (up to 5) explanation sentences from [WorldTreeV2](http://cognitiveai.org/explanationbank/):
* [ARC](https://allenai.org/data/arc): QMC→AE, AQC→M, QMEC→A, QME→A, QE→A, QMC→A, QC→AE, QM→AE, QMAC→E, QMA→E
* [ARC-DA](https://allenai.org/data/arc-da): QC→AE, Q→AE, QC→A, Q→A, QEC→A, QE→A, AE→Q, AC→Q, QA→E, AQC→E
3. A specialized answer-focused model, macaw-answer-11b (called "UnifiedQA + ARC MC/DA + IR" on the
leaderboards for [ARC](https://leaderboard.allenai.org/arc/submissions/public),
[ARC-Easy](https://leaderboard.allenai.org/arc_easy/submissions/public), and
[ARC-DA](https://leaderboard.allenai.org/genie-arcda/submissions/public))
was trained on a smaller set of angles, not including explanations:
* ARC: QMC→A, QAC→M, QC→A, QM→A, MAC→Q, AC→QM, M→QA
* ARC-DA: QC→A, Q→A, AC→Q, C→QA
## Available models

The Macaw models can be accessed from the Hugging Face model hub:
* [macaw-11b](https://huggingface.co/allenai/macaw-11b) (11 billion parameters)
* [macaw-3b](https://huggingface.co/allenai/macaw-3b) (3 billion parameters)
* [macaw-large](https://huggingface.co/allenai/macaw-large) (770 million parameters)
* [macaw-answer-11b](https://huggingface.co/allenai/macaw-answer-11b) (11 billion parameters)

For a sense of the degradation in performance for the smaller sizes, here are baseline scores on the ARC Challenge and
ARC Easy multiple-choice development questions. Included are variants with and without IR context from a large science
corpus (corresponding to angles QMC→A and QM→A respectively).

|Model | ARC Challenge | ARC Challenge (no IR) | ARC Easy | ARC Easy (no IR)|
|---|---|---|---|---|
|Macaw (11B) | 76.9 | 74.6 | 91.2 | 84.9|
|Macaw-3B | 68.2 | 67.9 | 87.9 | 77.7|
|Macaw-large | 57.2 | 50.5 | 82.5 | 63.9|
|Macaw-answer (11B) | 79.9 | 75.2 | 90.5 | 85.8|

## Disclaimer
As a model capable of generating free-form text, the output of the model is not guaranteed to be free of
offensive material, so appropriate caution is advised when using the model.

## Citation
If you use Macaw in your work, please reference the related [paper](https://arxiv.org/abs/2109.02593) using
```
@article{Tafjord2021Macaw,
title={General-Purpose Question-Answering with {M}acaw},
author={Oyvind Tafjord and Peter Clark},
journal={ArXiv},
year={2021},
volume={abs/2109.02593}
}
```