Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/RUCKBReasoning/SubgraphRetrievalKBQA
The pytorch implementation of Subgraph Retrieval Enhanced Model for Multi-hop Knowledge Base Question Answering
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/RUCKBReasoning/SubgraphRetrievalKBQA
- Owner: RUCKBReasoning
- Created: 2022-02-27T13:19:07.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-10-01T10:58:56.000Z (about 2 years ago)
- Last Synced: 2024-08-03T09:07:11.732Z (5 months ago)
- Language: Python
- Size: 18.7 MB
- Stars: 92
- Watchers: 2
- Forks: 14
- Open Issues: 7
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- StarryDivineSky - RUCKBReasoning/SubgraphRetrievalKBQA
README
# Dataset
## QA benchmark
1. WebQuestionSP: Same as the original [WebQuestionSP QA dataset](https://www.microsoft.com/en-us/download/details.aspx?id=52763).
2. CWQ: Same as the original [CWQ dataset](https://allenai.org/data/complexwebquestions).

## KG
1. Setup Freebase: We use the whole Freebase as the knowledge base. Please follow [Freebase-Setup](https://github.com/dki-lab/Freebase-Setup) to build a Virtuoso instance serving the Freebase dump (a minimal connectivity-check sketch follows this list).
2. To improve data-access efficiency, we extract a 2-hop topic-centric subgraph for each question in WebQSP and a 4-hop topic-centric subgraph for each question in CWQ, producing relatively small per-question knowledge graphs. We extract these subgraphs following [NSM](https://github.com/RichardHGL/WSDM2021_NSM). You can download the graphs from [here](https://drive.google.com/drive/folders/1qNauEQJHuMs4uPQcCtMb-M9Seco5mTUl?usp=sharing).
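Before running anything, it is worth checking that the Virtuoso endpoint answers SPARQL queries. The sketch below is a minimal connectivity check, assuming the default local endpoint `http://localhost:8890/sparql` and the standard Freebase `ns:` prefix; the endpoint URL and the example MID are illustrative and not taken from this repository.

```python
# Minimal sanity check against a local Virtuoso endpoint hosting Freebase.
# Assumptions (not from this repo): the endpoint URL and the example entity MID.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://localhost:8890/sparql"  # default Virtuoso SPARQL endpoint

def one_hop_relations(mid: str, limit: int = 20):
    """Return up to `limit` outgoing relations of a Freebase entity MID."""
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(f"""
        PREFIX ns: <http://rdf.freebase.com/ns/>
        SELECT DISTINCT ?p WHERE {{ ns:{mid} ?p ?o . }} LIMIT {limit}
    """)
    results = sparql.query().convert()
    return [b["p"]["value"] for b in results["results"]["bindings"]]

if __name__ == "__main__":
    # m.02mjmr is the Freebase MID for Barack Obama (illustrative only).
    print(one_hop_relations("m.02mjmr"))
```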
# Running Instructions for WebQSP

## Step0: Prepare the weakly supervised dataset for training the retriever:
cd WebQSP, then run the following script:

python run_preprocess.py
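Broadly speaking, the weak supervision for the retriever is derived from relation paths that connect a question's topic entity to its answer entities. The toy sketch below only illustrates that idea; the in-memory graph, function name, and hop limit are illustrative and not the repository's actual preprocessing code.

```python
# Toy sketch: collect relation paths from a topic entity to answer entities,
# which can serve as weak positives for retriever training.
# The in-memory KG and helper names are illustrative only.
from collections import deque

def relation_paths(graph, topic, answers, max_hops=2):
    """graph: {head: [(relation, tail), ...]}; returns relation-path tuples
    that reach any answer entity within `max_hops` steps."""
    paths = []
    queue = deque([(topic, ())])
    while queue:
        node, rel_path = queue.popleft()
        if node in answers and rel_path:
            paths.append(rel_path)
            continue
        if len(rel_path) == max_hops:
            continue
        for rel, tail in graph.get(node, []):
            queue.append((tail, rel_path + (rel,)))
    return paths

if __name__ == "__main__":
    toy_kg = {
        "m.obama": [("people.person.spouse_s", "m.marriage1")],
        "m.marriage1": [("people.marriage.spouse", "m.michelle")],
    }
    print(relation_paths(toy_kg, "m.obama", {"m.michelle"}))
    # -> [('people.person.spouse_s', 'people.marriage.spouse')]
```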
## Step1: Train the retriever:
python run_train_retriever.py
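The retriever scores candidate relations against the question. A minimal sketch of that kind of dual-encoder scoring is shown below, assuming a generic Hugging Face encoder with mean pooling; the checkpoint name, pooling choice, and input format are assumptions and may differ from this repository's retriever and its training objective.

```python
# Sketch of dual-encoder relation scoring: embed the question and each candidate
# relation with the same encoder and rank relations by cosine similarity.
# The checkpoint and pooling are illustrative, not the repo's actual retriever.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "roberta-base"  # assumption: any sentence encoder works for the sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def embed(texts):
    """Mean-pooled token embeddings, one vector per input string."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state      # [B, T, H]
    mask = batch["attention_mask"].unsqueeze(-1)         # [B, T, 1]
    return (hidden * mask).sum(1) / mask.sum(1)          # [B, H]

def rank_relations(question, relations):
    q = embed([question])
    r = embed(relations)
    scores = torch.nn.functional.cosine_similarity(q, r)
    return sorted(zip(relations, scores.tolist()), key=lambda x: -x[1])

if __name__ == "__main__":
    print(rank_relations(
        "who is the spouse of barack obama",
        ["people.person.spouse_s", "people.person.place_of_birth"],
    ))
```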
## Step2: Extract a subgraph for each data instance:
python run_retrieve_subgraph.py
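This step uses the trained retriever to carve a question-specific subgraph out of the KG. The sketch below shows the general expand-and-prune idea on a toy in-memory graph: at each hop, keep only the best-scoring relations and collect the triples they reach. The scorer, data layout, and hop/beam sizes are stand-ins, not the repository's implementation.

```python
# Toy sketch of retriever-guided subgraph expansion: starting from the topic
# entity, keep the top-k relations per hop (by retriever score) and collect
# the triples they reach. The scorer and the in-memory KG are stand-ins.
def expand_subgraph(kg, topic, score_relation, hops=2, top_k=2):
    """kg: {head: [(relation, tail), ...]}; returns a set of (h, r, t) triples."""
    frontier = {topic}
    triples = set()
    for _ in range(hops):
        next_frontier = set()
        for head in frontier:
            edges = kg.get(head, [])
            # keep only the top_k best-scoring relations leaving this node
            kept = sorted(edges, key=lambda e: score_relation(e[0]), reverse=True)[:top_k]
            for rel, tail in kept:
                triples.add((head, rel, tail))
                next_frontier.add(tail)
        frontier = next_frontier
    return triples

if __name__ == "__main__":
    toy_kg = {
        "m.obama": [("people.person.spouse_s", "m.marriage1"),
                    ("people.person.place_of_birth", "m.honolulu")],
        "m.marriage1": [("people.marriage.spouse", "m.michelle")],
    }
    score = lambda rel: 1.0 if "spouse" in rel else 0.1  # stand-in scorer
    for triple in sorted(expand_subgraph(toy_kg, "m.obama", score, top_k=1)):
        print(triple)
```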
## Step3: Train the reasoner:
python run_train_nsm.py
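Step3 trains the NSM reasoner on the retrieved subgraphs. The snippet below is only a heavily simplified illustration of subgraph reasoning in general (maintain a distribution over entities and push probability mass along triples whose relation is relevant at the current step); it is not the NSM model or this repository's code.

```python
# Heavily simplified illustration of multi-step reasoning over a subgraph:
# keep a probability distribution over entities and propagate mass along
# triples weighted by how relevant their relation is at the current step.
# This is NOT the NSM implementation; weights here are hand-set for the demo.
from collections import defaultdict

def reasoning_step(triples, entity_prob, relation_weight):
    """triples: iterable of (h, r, t); entity_prob: {entity: prob};
    relation_weight: {relation: weight in [0, 1]}."""
    next_prob = defaultdict(float)
    for h, r, t in triples:
        next_prob[t] += entity_prob.get(h, 0.0) * relation_weight.get(r, 0.0)
    total = sum(next_prob.values()) or 1.0
    return {e: p / total for e, p in next_prob.items()}

if __name__ == "__main__":
    triples = [("m.obama", "people.person.spouse_s", "m.marriage1"),
               ("m.marriage1", "people.marriage.spouse", "m.michelle")]
    probs = {"m.obama": 1.0}
    for step_weights in ({"people.person.spouse_s": 1.0},
                         {"people.marriage.spouse": 1.0}):
        probs = reasoning_step(triples, probs, step_weights)
    print(probs)  # probability mass ends on m.michelle
```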
## Step4: Fine-tune the retriever with feedback from the reasoner:
python run_retriever_finetune.py
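Step4 closes the loop: the reasoner's results are fed back to improve the retriever. One common way to realize such feedback is to relabel retrieved relation paths by whether the subgraph they induce lets the reasoner hit the gold answer; the sketch below illustrates only that relabeling idea, with hypothetical function names and data layout that are not the repository's format.

```python
# Toy sketch of turning reasoner feedback into new retriever training labels:
# a retrieved path becomes a positive if its induced subgraph let the reasoner
# answer correctly, otherwise a negative. Names and layout are hypothetical.
def relabel_paths(candidates, reasoner_hits):
    """candidates: list of (question, relation_path) pairs; reasoner_hits maps
    (question, relation_path) -> True if the reasoner found the gold answer."""
    positives, negatives = [], []
    for question, path in candidates:
        bucket = positives if reasoner_hits.get((question, path), False) else negatives
        bucket.append((question, path))
    return positives, negatives

if __name__ == "__main__":
    cands = [
        ("who is obama's wife", ("people.person.spouse_s", "people.marriage.spouse")),
        ("who is obama's wife", ("people.person.place_of_birth",)),
    ]
    hits = {cands[0]: True}  # reasoner succeeded only on the spouse path
    pos, neg = relabel_paths(cands, hits)
    print("positives:", pos)
    print("negatives:", neg)
```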
## You can also directly run:
./run.sh

## Download the data folder tmp from [here](https://drive.google.com/drive/folders/1qNauEQJHuMs4uPQcCtMb-M9Seco5mTUl?usp=sharing).
## For CWQ, you can run ./cwq/run.sh
### If you have any questions about the code, please contact Xiaokang Zhang ([email protected])!