https://github.com/demfier/prepare-switchboard
Updated swda class for preparing switchboard dataset for NLP tasks
- Host: GitHub
- URL: https://github.com/demfier/prepare-switchboard
- Owner: Demfier
- License: mit
- Created: 2020-05-14T17:03:56.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2020-05-20T00:22:42.000Z (about 6 years ago)
- Last Synced: 2025-08-13T22:03:05.087Z (10 months ago)
- Language: Python
- Homepage:
- Size: 12.7 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# prepare-switchboard
This repository updates the original `swda` classes written for [The Switchboard Dialog Act Corpus](https://compprag.christopherpotts.net/swda.html) and uses them to parse the dataset.
# Instructions to run
1. Extract the `swda.zip` dataset into `data/raw/`.
2. Run `python main.py` to create the train/val/test splits. The generated `*_sentences.tsv` files can be used to train an autoencoder, while the `*_dialog.tsv` files can be used to train a simple sequence-to-sequence model for dialog generation.
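
Once the splits exist, the generated files can be loaded for a downstream model. The snippet below is a minimal sketch, not part of this repository: the helper names `read_tsv_rows` and `load_generated_files` are hypothetical, the output directory is left as a parameter because the README does not state where `main.py` writes its files, and the column layout of each row is not assumed (rows are returned as plain lists of strings).

```python
import csv
from pathlib import Path


def read_tsv_rows(path):
    """Return every row of a tab-separated file as a list of string fields."""
    # QUOTE_NONE keeps quotation marks in utterances as literal characters.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))


def load_generated_files(output_dir, suffix):
    """Map each file name ending in `suffix` inside `output_dir` to its rows.

    `output_dir` is an assumption: pass whichever directory main.py wrote to.
    """
    return {
        path.name: read_tsv_rows(path)
        for path in sorted(Path(output_dir).glob(f"*{suffix}"))
    }
```

For example, `load_generated_files(some_dir, "_sentences.tsv")` would collect the sentence files for all splits, and the same call with `"_dialog.tsv"` would collect the dialog files, where `some_dir` stands for the actual output location.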