https://github.com/amazon-science/multiatis

Data and code for the paper "End-to-End Slot Alignment and Recognition for Cross-Lingual NLU" (Accepted to EMNLP 2020)

## MultiATIS++ Corpus

### Description

The ATIS (Air Travel Information Services) collection was developed to support the research and development of speech understanding systems [1]. The original English data includes intent and slot annotations and was later extended to Hindi and Turkish [2]. MultiATIS++ further extends ATIS to 6 more languages and therefore covers 9 languages in total: English, Spanish, German, French, Portuguese, Chinese, Japanese, Hindi and Turkish. These languages belong to a diverse set of language families: Indo-European, Sino-Tibetan, Japonic and Altaic.
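To make "intent and slot annotations" concrete, here is a minimal sketch of how one annotated utterance can be represented and how per-token slot labels can be grouped into slot values. It assumes BIO-style labels, as commonly used for ATIS. The utterance, the labels, and the helper `bio_to_spans` are illustrative and hypothetical; they are not copied from the corpus or from the authors' code.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (slot_type, slot_text) pairs."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the span that is currently open, if any.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))

    for token, label in zip(tokens, labels):
        starts_span = label.startswith("B-") or (
            label.startswith("I-") and label[2:] != current_type
        )
        if starts_span:
            # A "B-" tag (or a stray "I-" tag of a new type) opens a new span.
            flush()
            current_type = label[2:]
            current_tokens = [token]
        elif label.startswith("I-"):
            # Continuation of the span that is already open.
            current_tokens.append(token)
        else:
            # "O" marks a token outside any slot.
            flush()
            current_type = None
            current_tokens = []
    flush()
    return spans


# Illustrative ATIS-style example (hypothetical, not taken from the dataset).
tokens = "show flights from boston to denver".split()
labels = ["O", "O", "O", "B-fromloc.city_name", "O", "B-toloc.city_name"]
intent = "atis_flight"

slots = bio_to_spans(tokens, labels)
print(intent, slots)
```

Joining tokens with a space is a simplification that suits whitespace-delimited languages; for Chinese or Japanese a different joining rule would likely be needed.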

The MultiATIS++ corpus has been released to foster further research in multilingual and cross-lingual natural language understanding.

For more details, please check the paper:
Xu, W., Haider, B. and Mansour, S., 2020. End-to-End Slot Alignment and Recognition for Cross-Lingual NLU. arXiv preprint arXiv:2004.14353 (https://arxiv.org/abs/2004.14353)

### Accessing MultiATIS++

To obtain a copy of the *MultiATIS++* data, please visit:
https://catalog.ldc.upenn.edu/LDC2021T04

Please send your queries/comments to multiatis@amazon.com.

### Citation

Please cite [3] when referring to the MultiATIS++ dataset.

## Soft-Align Implementation

An implementation of the *soft-align* method introduced in [3] will be made available here soon.

## Security

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.

## License

This project is licensed under the Apache-2.0 License.

## References

[1] LDC93S5 ATIS2, LDC94S19 ATIS3 Training Data, LDC95S26 ATIS3 Test Data

[2] Shyam Upadhyay, Manaal Faruqui, Gokhan Tur, Dilek Hakkani-Tur, Larry Heck. (Almost) Zero-Shot Cross-Lingual Spoken Language Understanding. IEEE ICASSP 2018.

[3] Weijia Xu, Batool Haider, Saab Mansour. 2020. End-to-End Slot Alignment and Recognition for Cross-Lingual NLU. arXiv preprint arXiv:2004.14353.