Complex Questions Decomposition Model
https://github.com/alontalmor/CQD
- Host: GitHub
- URL: https://github.com/alontalmor/CQD
- Owner: alontalmor
- Created: 2018-11-25T08:51:03.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2018-11-25T08:53:38.000Z (about 6 years ago)
- Last Synced: 2024-08-04T10:01:11.040Z (6 months ago)
- Language: Python
- Homepage:
- Size: 96.7 KB
- Stars: 0
- Watchers: 1
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
Awesome Lists containing this project:
- awesome-kbqa
README
## WebAsKB
This repository contains code for our paper [The Web as a Knowledge-base for Answering Complex Questions](https://arxiv.org/abs/1803.06643).
It can be used to train a neural model for answering complex questions, when the answer needs to be derived from multiple web snippets.
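The decomposition idea can be illustrated with a toy sketch (a hypothetical helper for intuition only, not this repository's API): a composition question is split at a predicted split point into a first sub-question and a remainder; the first is answered, and its answer is substituted into the second.

```python
# Toy illustration of split-point decomposition. Hypothetical code, for
# intuition only -- in the paper, a pointer network predicts the split point.

def decompose(question_tokens, split_point):
    """Split a question's tokens at split_point into two sub-questions."""
    first = " ".join(question_tokens[:split_point])
    second = " ".join(question_tokens[split_point:])
    return first, second

# Answering then proceeds by solving `first`, plugging its answer into
# `second`, and solving that with a reading-comprehension model.
```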
This model was trained on the dataset [ComplexWebQuestions](http://nlp.cs.tau.ac.il/compwebq), and the code is in PyTorch.

## Setup
### Setting up a virtual environment
1. First, clone the repository:
```
git clone https://github.com/alontalmor/webkb_dev.git
```
2. Change your directory to where you cloned the files:
```
cd webkb_dev
```
3. Create a virtual environment with Python 3.6:
```
virtualenv -p python3 venv
# or, using the built-in venv module:
python3.6 -m venv venv
```
4. Activate the virtual environment. You will need to activate the venv environment in each terminal in which you want to use WebAsKB.
```
source venv/bin/activate   # or: source venv/bin/activate.csh
```
5. Install the required dependencies:
```
pip3 install -r requirements_cloud.txt
```
6. Install pytorch 0.3.1 from their [website](http://pytorch.org/). Prebuilt wheels referenced here:
   - macOS: `torch-0.3.0.post4-cp36-cp36m-macosx_10_7_x86_64.whl`
   - Linux (CUDA 8.0): http://download.pytorch.org/whl/cu80/torch-0.3.0.post4-cp36-cp36m-linux_x86_64.whl
7. Download external libraries:
```
wget https://www.dropbox.com/s/k867s25qitdo8bc/Lib.zip
unzip Lib.zip
```
8. Download the data:
```
wget https://www.dropbox.com/s/tn45a3crehht7c1/Data.zip
unzip Data.zip
```
9. Optional: install and run the Stanford CoreNLP server, to generate noisy supervision:
```
wget http://nlp.stanford.edu/software/stanford-corenlp-full-2016-10-31.zip
unzip stanford-corenlp-full-2016-10-31.zip
cd stanford-corenlp-full-2016-10-31
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
```

### Data
By default, we expect source data and preprocessed data to be stored in the "data" directory.
The expected file locations can be changed by altering config.py.
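For orientation, the kind of layout config.py controls looks roughly like the sketch below; the variable names are assumptions for illustration, so consult the actual file before editing.

```python
# Hypothetical sketch of data-path settings like those in config.py.
# Variable names here are illustrative assumptions, not the file's real contents.
import os

DATA_DIR = "data"  # default location for source and preprocessed data
RC_ANSWER_CACHE = os.path.join(DATA_DIR, "RC_answer_cache")
NOISY_SUPERVISION_DIR = os.path.join(DATA_DIR, "noisy_supervision")
```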
Note: the dataset downloaded here contains only the question-answer pairs; the full dataset (including web snippets) can be downloaded from [ComplexWebQuestions](http://nlp.cs.tau.ac.il/compwebq).

## Running
Now you can do any of the following:
* Generate the noisy supervision data for training: `python -m webaskb_run gen_noisy_sup`.
* Run a pointer network to generate split points in the question: `python -m webaskb_run run_ptrnet`.
* Train the pointer network: `python -m webaskb_run train_ptrnet`.
* Run the final prediction and calculate p@1 scores: `python -m webaskb_run splitqa`.

Options: `--eval_set dev` or `--eval_set test` to choose between the development and test set.
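Here p@1 is precision-at-1: the fraction of questions whose single top-ranked predicted answer is among the gold answers. A minimal sketch of the metric (not the repository's evaluation code):

```python
def precision_at_1(predictions, gold_answers):
    """predictions: top-ranked answer string per question.
    gold_answers: collection of acceptable answers per question."""
    correct = sum(1 for pred, gold in zip(predictions, gold_answers)
                  if pred in gold)
    return correct / len(predictions)
```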
Please note, Reading Comprehension answer prediction data is provided in Data/RC_answer_cache. However, the WebAnswer model was not included due to its complexity and its reliance on the ability to query a search engine.
You may replace the RC component with any other RC model to be used with the web snippets in [ComplexWebQuestions](http://nlp.cs.tau.ac.il/compwebq).
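Concretely, a drop-in RC component only needs a function with roughly this shape; the name and the toy ranking below are hypothetical, so adapt them to the actual call site in this codebase.

```python
# Hypothetical interface for a replacement RC component: given a sub-question
# and its web snippets, return a ranked list of candidate answer strings.
from typing import List

def rc_answer(question: str, snippets: List[str]) -> List[str]:
    """Toy stand-in: treat title-cased snippet tokens as candidate answers,
    ranked by frequency. Replace this body with a call into your own RC model."""
    candidates = [tok for snip in snippets for tok in snip.split()
                  if tok.istitle()]
    return sorted(set(candidates), key=candidates.count, reverse=True)
```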