https://github.com/microsoft/IRNet

An algorithm for cross-domain NL2SQL

Host: GitHub
URL: https://github.com/microsoft/IRNet
Owner: microsoft
License: mit
Created: 2019-11-01T01:10:43.000Z (about 6 years ago)
Default Branch: master
Last Pushed: 2023-07-22T20:20:22.000Z (over 2 years ago)
Last Synced: 2024-12-04T15:51:10.616Z (11 months ago)
Language: Python
Size: 2.96 MB
Stars: 269
Watchers: 17
Forks: 80
Open Issues: 29
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md


# IRNet

Code for our ACL 2019 paper: [Towards Complex Text-to-SQL in Cross-Domain Database with Intermediate Representation](https://arxiv.org/pdf/1905.08205.pdf)
## Environment Setup

* `Python 3.6`
* `PyTorch 0.4.0` or higher

Once Python and PyTorch are set up, install the Python dependencies with `pip install -r requirements.txt`.
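Before installing the dependencies, it can help to confirm that the interpreter and PyTorch meet the versions listed above. The following is a minimal sketch and is not part of the repository: the helper names `parse_version` and `meets_minimum` are hypothetical, and the check compares only the leading numeric components of each version string.

```python
import re
import sys


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple.

    Suffixes such as "+cu117" or "a0" are ignored, so "1.13.1+cu117" gives (1, 13, 1).
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("no numeric version found in %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(found, minimum):
    """Return True if version string `found` is at least `minimum`."""
    a, b = parse_version(found), parse_version(minimum)
    # Pad the shorter tuple with zeros so that "0.4" compares equal to "0.4.0".
    width = max(len(a), len(b))
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    python_version = "%d.%d" % sys.version_info[:2]
    print("Python", python_version, "ok:", meets_minimum(python_version, "3.6"))
    try:
        import torch  # only importable once PyTorch is installed
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch", torch.__version__, "ok:",
              meets_minimum(torch.__version__, "0.4.0"))
```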

## Running Code

#### Data preparation

* Download the [GloVe embeddings](https://nlp.stanford.edu/data/wordvecs/glove.42B.300d.zip) and put `glove.42B.300d` under the `./data/` directory
* Download the [pretrained IRNet](https://drive.google.com/open?id=1VoV28fneYss8HaZmoThGlvYU3A-aK31q) and put `IRNet_pretrained.model` under the `./saved_model/` directory
* Download the preprocessed train/dev datasets from [here](https://drive.google.com/open?id=1YFV1GoLivOMlmunKW0nkzefKULO4wtrn) and put `train.json`, `dev.json` and `tables.json` under the `./data/` directory
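The download steps above produce a fixed directory layout, which can be checked before training. This is a minimal sketch rather than a script shipped with the repository: `EXPECTED_FILES` and `missing_files` are hypothetical names, and the GloVe entry assumes the archive extracts to a text file named `glove.42B.300d.txt`.

```python
import os

# Paths taken from the download steps above. The GloVe entry is an assumption:
# it presumes the archive extracts to `glove.42B.300d.txt`; adjust if yours differs.
EXPECTED_FILES = [
    "data/glove.42B.300d.txt",
    "data/train.json",
    "data/dev.json",
    "data/tables.json",
    "saved_model/IRNet_pretrained.model",
]


def missing_files(root=".", expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under `root`."""
    return [p for p in expected if not os.path.isfile(os.path.join(root, p))]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected files are in place.")
```

Run it from the repository root so that the relative paths resolve against `./data/` and `./saved_model/`.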

##### Generating train/dev data yourself

You can also process the original [Spider data](https://drive.google.com/uc?export=download&id=11icoH_EA-NYb0OrPTdehRWm_d7-DIzWX) yourself. Download it, put `train.json`, `dev.json` and `tables.json` under the `./data/` directory, and follow the instructions in `./preprocess/`.
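Whether the data files are downloaded or generated, a quick look at their contents can catch a misplaced or truncated file early. The sketch below is illustrative only: `load_examples` and `summarize_examples` are hypothetical helpers, and it assumes that each file holds a JSON list of objects and that examples carry a `db_id` field, as in the Spider release.

```python
import json
from collections import Counter


def load_examples(path):
    """Load a JSON file that is assumed to contain a list of example objects."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, list):
        raise ValueError("%s does not contain a JSON list" % path)
    return data


def summarize_examples(examples, key="db_id"):
    """Count examples per value of `key`; entries without the key are counted under None."""
    return Counter(example.get(key) for example in examples)
```

For instance, `summarize_examples(load_examples("data/train.json")).most_common(5)` would list the five most frequent values of the assumed `db_id` field.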

#### Training

Run `train.sh` to train IRNet.

`sh train.sh [GPU_ID] [SAVE_FOLD]`

#### Testing

Run `eval.sh` to evaluate IRNet.

`sh eval.sh [GPU_ID] [OUTPUT_FOLD]`
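Both scripts take the same two positional arguments, so they are easy to drive from Python when queuing several runs. This is a hedged sketch, not part of the repository: `build_command` and `run` are hypothetical helpers, and they pass along only the two arguments shown in the commands above.

```python
import subprocess


def build_command(script, gpu_id, folder):
    """Build the argument list for `sh <script> <gpu_id> <folder>`."""
    if script not in ("train.sh", "eval.sh"):
        raise ValueError("unknown script: %r" % script)
    return ["sh", script, str(gpu_id), str(folder)]


def run(script, gpu_id, folder):
    """Run the script from the current directory and raise if it exits with an error."""
    subprocess.run(build_command(script, gpu_id, folder), check=True)
```

Calling `run("train.sh", 0, "saved_model/run1")` from the repository root is then equivalent to `sh train.sh 0 saved_model/run1`; the folder name here is only an example.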

#### Evaluation

You can follow the general evaluation process described on the [Spider page](https://github.com/taoyds/spider).

## Results

| **Model** | Dev Exact Set Match Accuracy | Test Exact Set Match Accuracy |
| ----------- | ------------------------------------- | -------------------------------------- |
| IRNet | 53.2 | 46.7 |
| IRNet+BERT(base) | 61.9 | **54.7** |

## Citation

If you use IRNet, please cite the following work.

```
@inproceedings{GuoIRNet2019,
  author={Jiaqi Guo and Zecheng Zhan and Yan Gao and Yan Xiao and Jian-Guang Lou and Ting Liu and Dongmei Zhang},
  title={Towards Complex Text-to-SQL in Cross-Domain Database with Intermediate Representation},
  booktitle={Proceeding of the 57th Annual Meeting of the Association for Computational Linguistics (ACL)},
  year={2019},
  organization={Association for Computational Linguistics}
}
```

## Thanks

We would like to thank [Tao Yu](https://taoyds.github.io/) and [Bo Pang](https://www.linkedin.com/in/bo-pang/) for running evaluations on our submitted models.

We are also grateful to the flexible semantic parser [TranX](https://github.com/pcyin/tranX), which inspired our work.

# Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
