https://github.com/pabloamc/lm_aaai22

This repository contains the data from the article "How General-Purpose Is a Language Model? Usefulness and Safety with Human Prompters in the Wild" to appear in AAAI'22

Host: GitHub
URL: https://github.com/pabloamc/lm_aaai22
Owner: PabloAMC
Created: 2021-12-03T12:12:23.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2022-07-20T17:48:39.000Z (almost 4 years ago)
Last Synced: 2025-10-24T01:33:07.493Z (7 months ago)
Size: 10.1 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# How General-Purpose Is a Language Model? Usefulness and Safety with Human Prompters in the Wild

[![Conference](http://img.shields.io/badge/AAAI-2022-4b44ce.svg)](https://ojs.aaai.org/index.php/AAAI/article/view/20466)

This repository contains the supplementary material and data from the article "How General-Purpose Is a Language Model? Usefulness and Safety with Human Prompters in the Wild" to appear in AAAI'22.

## How to read the data
The data is encrypted with Fernet so that the prompts are not picked up and used to train language models.
To decrypt and load it, use:
```
import pandas as pd
from cryptography.fernet import Fernet

# Key published with the repository
key = b'RR3hjInZO_IRyLyeaJWk0Jd1msBpmUVvk0NkFokufRc='
fernet = Fernet(key)

for file in ['with_GPT3.json', 'without_GPT3.json']:
    # Read the encrypted bytes from the 'encoded_' file
    with open('encoded_' + file, 'rb') as f:
        encrypted = f.read()

    # Decrypt the bytes and store them in 'data'
    data = fernet.decrypt(encrypted)

    # Write the decrypted data to a new JSON file
    with open(file, 'wb') as f:
        f.write(data)

# Forms 1 and 2: tasks completed with the help of GPT-3
df = pd.read_json('with_GPT3.json')
df.head()

# Form 3: tasks completed without GPT-3
df3 = pd.read_json('without_GPT3.json')
df3.head()
```
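
Since the encryption is meant to keep the prompts out of training corpora, it can be preferable not to leave decrypted copies on disk. The following is a minimal sketch of an in-memory variant; `load_encrypted_json` is a hypothetical helper and not part of this repository, and the structure of the parsed object is whatever the decrypted JSON contains.

```python
import json

from cryptography.fernet import Fernet


def load_encrypted_json(path, key):
    """Decrypt a Fernet-encrypted JSON file and return the parsed object.

    Hypothetical helper (not part of the repository). The decrypted
    content is kept in memory and never written to disk.
    """
    with open(path, 'rb') as f:
        token = f.read()
    # Fernet.decrypt raises cryptography.fernet.InvalidToken if the key is
    # wrong or the file content has been altered.
    plaintext = Fernet(key).decrypt(token)
    return json.loads(plaintext)
```

With the key given above, `load_encrypted_json('encoded_with_GPT3.json', key)` should return the same content that the loop writes to `with_GPT3.json`.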

## Data description

The file `with_GPT3.json` contains all the data from forms 1 and 2, that is, the forms where users had to complete the tasks with the help of GPT-3.
The columns labelled `form_1_...` refer to the first form, where the user is asked to prompt the system. Thus, `form_1_response_1` is the prompt written for the first instance of the first task, and `form_1_time_1` is the time taken to write that prompt.
There are 4 tasks with 3 instances each, so `form_1_...` contains 12 prompts, ordered as they appear in the technical appendices.
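
The column numbering can be sketched as follows. This assumes that the 12 columns are numbered task by task (columns 1 to 3 for the first task, 4 to 6 for the second, and so on); that ordering is an assumption and should be checked against the technical appendices. `form_1_columns` is a hypothetical helper.

```python
N_TASKS = 4      # number of tasks, as stated above
N_INSTANCES = 3  # instances per task, as stated above


def form_1_columns(task, instance):
    """Return the (prompt, time) column names for a 1-based task and instance.

    Assumes task-major numbering: columns 1-3 belong to task 1,
    columns 4-6 to task 2, and so on.
    """
    if not (1 <= task <= N_TASKS and 1 <= instance <= N_INSTANCES):
        raise ValueError('task must be in 1..4 and instance in 1..3')
    index = (task - 1) * N_INSTANCES + instance
    return f'form_1_response_{index}', f'form_1_time_{index}'
```

For example, `form_1_columns(1, 1)` gives the two column names mentioned above, `form_1_response_1` and `form_1_time_1`.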

Similarly, `gpt3_answer_...` refers to the answer generated by GPT-3. The `form_2_...` columns contain 24 responses: for each GPT-3 completion, the user is first asked to extract the relevant part of the answer and then to rate (from 1 to 5) its usefulness for the task at hand, where 5 is a completely satisfactory answer and 1 is a completely useless one, before moving on to the next completion. The time taken for each answer is also recorded. Finally, demographic data is included:
- `age`.
- `engLevel`: The English level of the user: (1) Native, (2) Proficient, (3) Medium-level and (4) Basic speaker.
- `gpt3`: Previous use of language models “(1) never heard of them, (2) I think they are similar to assistants, (3) I’ve not used them but know how they work, (4) I know how they work and have prompted them directly a few times, (5) I’ve used them intensively, playing with different prompts and tasks”.
- `assistants`: Previous use of virtual assistants: (yes/no).
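
The numeric codes above can be turned into readable labels with a small lookup. This sketch assumes the codes are stored as the integers listed above (or strings of them); the dictionaries and the `decode` helper are hypothetical, and the `gpt3` labels are shortened from the full wording.

```python
# Labels taken from the list above; `gpt3` labels are abbreviated.
ENG_LEVEL = {
    1: 'Native',
    2: 'Proficient',
    3: 'Medium-level',
    4: 'Basic speaker',
}
GPT3_EXPERIENCE = {
    1: 'Never heard of them',
    2: 'Thinks they are similar to assistants',
    3: 'Has not used them but knows how they work',
    4: 'Has prompted them directly a few times',
    5: 'Has used them intensively',
}


def decode(code, labels):
    """Return the label for a numeric code, or None if it is missing or unlisted."""
    try:
        return labels.get(int(code))
    except (TypeError, ValueError):
        return None
```

If the codes are indeed numeric, a pandas column could be decoded with something like `df['engLevel'].map(lambda c: decode(c, ENG_LEVEL))`.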

On the other hand, `without_GPT3.json` includes the answers of users assigned to form 3, who had to solve the tasks without using GPT-3. There are also 24 responses, because users both solve each task and rate the usefulness of their own answer. For the first instance of the first task, `response_1` solves the task and `response_2` rates the usefulness of the answer provided. Demographic data (`age` and `engLevel`) is also collected.
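
A user's 24 form 3 responses can be regrouped into one pair per task instance, as in the sketch below. It assumes that the pattern described for `response_1` and `response_2` continues for the remaining responses (odd positions hold answers, even positions hold usefulness ratings), which should be verified against the data. `pair_responses` is a hypothetical helper.

```python
def pair_responses(responses):
    """Group a flat sequence of responses into (answer, rating) pairs.

    Assumes the responses alternate: answer, rating, answer, rating, ...
    """
    responses = list(responses)
    if len(responses) % 2 != 0:
        raise ValueError('expected an even number of responses')
    # Even indices (0, 2, ...) are answers; odd indices (1, 3, ...) are ratings.
    return list(zip(responses[0::2], responses[1::2]))
```

Applied to the 24 responses of one user, this would yield 12 pairs, one per task instance.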

## Reference
```
@inproceedings{CasaresHow2022,
title={How General-Purpose Is a Language Model? Usefulness and Safety with Human Prompters in the Wild},
author={Casares, PAM and Sheng Loe, Bao and Burden, John and Ó hÉigeartaigh, Seán and Hernández-Orallo, Jose},
booktitle={Proceedings of the 36th Conference on Artificial Intelligence (AAAI)},
year={2022},
publisher={Association for the Advancement of Artificial Intelligence}
}
```
