https://github.com/borealisai/code-gen-tae
Code generation from natural language with less prior and more monolingual data
- Host: GitHub
- URL: https://github.com/borealisai/code-gen-tae
- Owner: BorealisAI
- License: other
- Created: 2021-05-31T15:05:23.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2021-08-24T14:18:36.000Z (almost 5 years ago)
- Last Synced: 2025-04-07T07:36:28.021Z (about 1 year ago)
- Language: Python
- Size: 41 KB
- Stars: 13
- Watchers: 5
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# Code generation from natural language with less prior and more monolingual data (TAE)
Paper published in [ACL 2021](https://aclanthology.org/2021.acl-short.98/)
Install the requirements:
```
pip install -r requirements.txt
```
To train the model on Django:
```
python3 train.py --dataset_name django --save_dir CHECKPOINT_DIR --copy_bt --no_encoder_update --monolingual_ratio 1.0 --early_stopping
```
To evaluate the provided Django checkpoint:
```
python3 train.py --dataset_name django --save_dir pretrained_weights/django --copy_bt --no_encoder_update --monolingual_ratio 1.0 --early_stopping --just_evaluate --seed 2
```
To train the model on CoNaLa:
```
python3 train.py --dataset_name conala --save_dir CHECKPOINT_DIR --copy_bt --no_encoder_update --monolingual_ratio 0.5 --epochs 80
```
To evaluate the provided CoNaLa checkpoint:
```
python3 train.py --dataset_name conala --save_dir pretrained_weights/conala --copy_bt --no_encoder_update --monolingual_ratio 0.5 --epochs 80 --just_evaluate --seed 4
```
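If you want to evaluate both provided checkpoints in one go, the two evaluation commands can be driven from a small script. The following is only a sketch: the helper names (`EVAL_CONFIGS`, `build_eval_command`, `run_all`) are hypothetical and not part of this repository, and the flags are copied verbatim from the commands in this README rather than taken from `train.py` itself.

```python
import subprocess

# Per-dataset flags and seeds, copied from the evaluation commands in this README.
# Nothing else about train.py's interface is assumed.
EVAL_CONFIGS = {
    "django": {"flags": ["--monolingual_ratio", "1.0", "--early_stopping"], "seed": "2"},
    "conala": {"flags": ["--monolingual_ratio", "0.5", "--epochs", "80"], "seed": "4"},
}


def build_eval_command(dataset, python="python3"):
    """Return the argv list that evaluates the provided checkpoint for `dataset`."""
    if dataset not in EVAL_CONFIGS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    config = EVAL_CONFIGS[dataset]
    return [
        python, "train.py",
        "--dataset_name", dataset,
        "--save_dir", f"pretrained_weights/{dataset}",
        "--copy_bt", "--no_encoder_update",
        *config["flags"],
        "--just_evaluate",
        "--seed", config["seed"],
    ]


def run_all():
    """Run both evaluations in sequence; raises CalledProcessError if one fails."""
    for dataset in EVAL_CONFIGS:
        subprocess.run(build_eval_command(dataset), check=True)
```

Calling `run_all()` from the repository root would run the Django evaluation followed by the CoNaLa one, assuming the `pretrained_weights/` directories are in place.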
### Evaluation Results
Here are the evaluation numbers for the provided checkpoints:
| Dataset | Results | Metric |
| ------- | ------------ | ------------------ |
| Django | 81.77 | Exact Match Acc. |
| CoNaLa | 33.41 | Corpus BLEU |
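
For readers unfamiliar with the Django metric, exact match accuracy is the percentage of predictions that are identical to their reference. The sketch below illustrates the idea only; it is not the evaluation code used in this repository, the function name `exact_match_accuracy` is hypothetical, and the whitespace normalization step is an assumption.

```python
def exact_match_accuracy(predictions, references):
    """Percentage of predictions that equal their reference string.

    Assumption: runs of whitespace are collapsed before comparing. The
    repository's own evaluation may normalize differently, or not at all.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")

    def normalize(code):
        # Collapse whitespace runs so spacing differences do not count as errors.
        return " ".join(code.split())

    matches = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * matches / len(references)
```

For example, if one of two generated snippets matches its reference exactly, the function returns `50.0`.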