https://github.com/jacksonchen1998/multi-turn-dialogue-response

Master Thesis

Host: GitHub
URL: https://github.com/jacksonchen1998/multi-turn-dialogue-response
Owner: jacksonchen1998
Created: 2024-05-24T14:31:15.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-08-13T09:01:17.000Z (almost 2 years ago)
Last Synced: 2025-03-26T15:48:03.928Z (about 1 year ago)
Language: Python
Size: 299 KB
Stars: 20
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

# Multi-turn-Dialogue-Response

![image](fig/kitm.png)

## Install requirements.txt

```bash
conda install --yes --file requirements.txt
```

## Pre-processing

* Download [**Pretrained GloVe Embeddings**](http://nlp.stanford.edu/data/glove.6B.zip) and save it in `/vectors`.
* The preprocessed dataset is saved as `/data/ED/dataset_preproc.p`. To rebuild it yourself or to change the knowledge types generated by COMET, delete this file, download the [COMET checkpoint](https://github.com/allenai/comet-atomic-2020), and place it in `/data/ED/Comet`. The preprocessed dataset is then regenerated when you run the training script. We use the BART version of COMET, since the GPT-2 version cannot be used.
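
To sanity-check the downloaded embeddings before training, the GloVe text format (one token per line, followed by its space-separated vector components) can be parsed with a few lines of Python. This is a minimal sketch, not the loader used by this repository: the function name `load_glove` and the path in the usage comment are assumptions. The archive `glove.6B.zip` contains 50d, 100d, 200d, and 300d files, and which one the code expects is not stated here.

```python
def load_glove(path, vocab=None):
    """Parse a GloVe text file into a {word: [float, ...]} dict.

    `vocab` is an optional set of words to keep; all other lines are skipped.
    Hypothetical helper for inspection only, not part of this repository.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if vocab is not None and word not in vocab:
                continue
            embeddings[word] = [float(v) for v in values]
    return embeddings


# Example usage (the file name is an assumption; pick the dimension you need):
# vectors = load_glove("vectors/glove.6B.300d.txt", vocab={"happy", "sad"})
# print(len(vectors["happy"]))
```

Passing a `vocab` set keeps memory usage low, since the full 6B vocabulary is rarely needed for a quick check.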

## Dataset

In this study, we use two datasets. In this repository, the config and mapping are set for EmpatheticDialogues.

- [EmpatheticDialogues](https://github.com/facebookresearch/EmpatheticDialogues)
- [DailyDialog](http://yanran.li/dailydialog)

### Data Resource

- [EmpatheticDialogues](https://drive.google.com/drive/folders/1UiEr4ug0nc4uJQYvvO2U4MHt3bhf66VW?usp=sharing)
- [DailyDialog](https://drive.google.com/drive/folders/1nUBCQjNjlNjqLZykrYoiKxDAKUVYQApK?usp=sharing)

### Data Structure

The `data` folder is organized as follows:

### Directory and File Descriptions

- **`data/`**: The root directory containing all data files.
  - **`ED/`**: This directory contains files for the "EmpatheticDialogues" dataset.
    - **`Comet/`**: A subdirectory containing the "Comet" package used to generate common-sense knowledge.
    - **`emp.pkl`**: A pickle file containing topic appearance probabilities for the "EmpatheticDialogues" dataset.
    - **`train.csv`**: The training data file for the "EmpatheticDialogues" dataset.
    - **`valid.csv`**: The validation data file for the "EmpatheticDialogues" dataset.
    - **`test.csv`**: The testing data file for the "EmpatheticDialogues" dataset.
  - **`DD/`**: This directory contains files for the "DailyDialog" dataset.
    - **`Comet/`**: A subdirectory containing the "Comet" package used to generate common-sense knowledge.
    - **`dd.pkl`**: A pickle file containing topic appearance probabilities for the "DailyDialog" dataset.
    - **`train.csv`**: The training data file for the "DailyDialog" dataset.
    - **`valid.csv`**: The validation data file for the "DailyDialog" dataset.
    - **`test.csv`**: The testing data file for the "DailyDialog" dataset.
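
After downloading the data, it can help to confirm that the folder matches the layout described above before starting a run. The following is a minimal sketch under the assumption that `ED/` and `DD/` sit directly under `data/` and each contain the entries listed. The names `EXPECTED` and `missing_entries` are hypothetical and not part of this repository.

```python
from pathlib import Path

# Expected entries per dataset folder, taken from the descriptions above.
EXPECTED = {
    "ED": ["Comet", "emp.pkl", "train.csv", "valid.csv", "test.csv"],
    "DD": ["Comet", "dd.pkl", "train.csv", "valid.csv", "test.csv"],
}


def missing_entries(data_root="data"):
    """Return the expected files or folders that are absent under data_root."""
    root = Path(data_root)
    return [
        f"{dataset}/{name}"
        for dataset, names in EXPECTED.items()
        for name in names
        if not (root / dataset / name).exists()
    ]


# Example usage:
# for entry in missing_entries("data"):
#     print("missing:", entry)
```

An empty list means every expected entry was found. The check only tests for existence and does not validate file contents.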

## Training

```bash
python main.py --cuda --save_path save/your_dir
```

## Testing

```bash
python main.py --cuda --test --save_path save/your_dir --model_path save/dir_save/KITM_XXXX.XXX
```
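
The `--model_path` argument needs the name of a saved checkpoint. As a convenience, the most recently written checkpoint in a save directory can be looked up as sketched below. This assumes checkpoints are regular files whose names start with `KITM_`, which is inferred only from the placeholder `KITM_XXXX.XXX` above. The helper `latest_checkpoint` is hypothetical and not part of this repository.

```python
from pathlib import Path


def latest_checkpoint(save_dir, prefix="KITM_"):
    """Return the most recently modified file in save_dir whose name starts
    with prefix, or None if there is no such file.

    The "KITM_" prefix is an assumption based on the placeholder file name.
    """
    candidates = [p for p in Path(save_dir).glob(f"{prefix}*") if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Example usage (directory name is a placeholder):
# print(latest_checkpoint("save/your_dir"))
```

The returned path can then be passed to `--model_path` in the testing command.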

Please feel free to contact us at present90308.ee11@nycu.edu.tw.
