{"id":18954853,"url":"https://github.com/jasonwu0731/trade-dst","last_synced_at":"2025-04-05T16:11:01.032Z","repository":{"id":35911311,"uuid":"187520069","full_name":"jasonwu0731/trade-dst","owner":"jasonwu0731","description":"Source code for transferable dialogue state generator (TRADE, Wu et al., 2019). https://arxiv.org/abs/1905.08743 ","archived":false,"fork":false,"pushed_at":"2022-12-08T05:13:09.000Z","size":850,"stargazers_count":391,"open_issues_count":6,"forks_count":114,"subscribers_count":21,"default_branch":"master","last_synced_at":"2025-03-29T15:11:20.885Z","etag":null,"topics":["dialogue","dialogue-state-tracking","machine-learning","multi-domain","natural-language-processing","seq2seq"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jasonwu0731.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-05-19T19:37:03.000Z","updated_at":"2025-01-17T13:11:26.000Z","dependencies_parsed_at":"2023-01-16T09:05:26.225Z","dependency_job_id":null,"html_url":"https://github.com/jasonwu0731/trade-dst","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jasonwu0731%2Ftrade-dst","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jasonwu0731%2Ftrade-dst/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jasonwu0731%2Ftrade-dst/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jasonwu0731%2Ftrade-dst/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/owners/jasonwu0731","download_url":"https://codeload.github.com/jasonwu0731/trade-dst/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247361695,"owners_count":20926643,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dialogue","dialogue-state-tracking","machine-learning","multi-domain","natural-language-processing","seq2seq"],"created_at":"2024-11-08T13:46:27.972Z","updated_at":"2025-04-05T16:11:01.002Z","avatar_url":"https://github.com/jasonwu0731.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"## TRADE Multi-Domain and Unseen-Domain Dialogue State Tracking\n\u003cimg src=\"plot/pytorch-logo-dark.png\" width=\"10%\"\u003e \n\n\u003cimg align=\"right\" src=\"plot/einstein-scroll.png\" width=\"8%\"\u003e\n\u003cimg align=\"right\" src=\"plot/salesforce-research.jpg\" width=\"18%\"\u003e\n\u003cimg align=\"right\" src=\"plot/HKUST.jpg\" width=\"12%\"\u003e\n\nThis is the PyTorch implementation of the paper:\n**Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems**. [**Chien-Sheng Wu**](https://jasonwu0731.github.io/), Andrea Madotto, Ehsan Hosseini-Asl, Caiming Xiong, Richard Socher and Pascale Fung. ***ACL 2019***. \n[[PDF]](https://arxiv.org/abs/1905.08743)\n\nThis code has been written using PyTorch \u003e= 1.0. If you use any source codes or datasets included in this toolkit in your work, please cite the following paper. 
The bibtex is listed below:\n\u003cpre\u003e\n@InProceedings{WuTradeDST2019,\n  \tauthor = \"Wu, Chien-Sheng and Madotto, Andrea and Hosseini-Asl, Ehsan and Xiong, Caiming and Socher, Richard and Fung, Pascale\",\n  \ttitle = \t\"Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems\",\n  \tbooktitle = \t\"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)\",\n  \tyear = \t\"2019\",\n  \tpublisher = \"Association for Computational Linguistics\"\n}\n\u003c/pre\u003e\n\n## Abstract\nOver-dependence on domain ontology and lack of knowledge sharing across domains are two practical and yet less studied problems of dialogue state tracking. Existing approaches generally fall short in tracking unknown slot values during inference and often have difficulties in adapting to new domains. In this paper, we propose a Transferable Dialogue State Generator (TRADE) that generates dialogue states from utterances using a copy mechanism, facilitating knowledge transfer when predicting (domain, slot, value) triplets not encountered during training. Our model is composed of an utterance encoder, a slot gate, and a state generator, which are shared across domains. Empirical results demonstrate that TRADE achieves state-of-the-art joint goal accuracy of 48.62% for the five domains of MultiWOZ, a human-human dialogue dataset. In addition, we show its transferring ability by simulating zero-shot and few-shot dialogue state tracking for unseen domains. TRADE achieves 60.58% joint goal accuracy in one of the zero-shot domains, and is able to adapt to few-shot cases without forgetting already trained domains.\n\n## Model Architecture\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"plot/model.png\" width=\"75%\" /\u003e\n\u003c/p\u003e\nThe architecture of the proposed TRADE model, which includes (a) an utterance encoder, (b) a state generator, and (c) a slot gate, all of which are shared among domains. 
The state generator decodes J times independently, once for each possible (domain, slot) pair. At the first decoding step, the state generator takes the j-th (domain, slot) embeddings as input to generate the corresponding slot values and slot gate. The slot gate predicts whether the j-th (domain, slot) pair is triggered by the dialogue.\n\n\n## Data\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"plot/dataset.png\" width=\"50%\" /\u003e\n\u003c/p\u003e\n\nDownload the MultiWOZ dataset and its processed DST version.\n```console\n❱❱❱ python3 create_data.py\n```\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"plot/example.png\" width=\"50%\" /\u003e\n\u003c/p\u003e\n\nAn example of multi-domain dialogue state tracking in a conversation. The solid arrows on the left are the single-turn mapping, and the dotted arrows on the right are the multi-turn mapping. The state tracker needs to track the slot values mentioned by the user for all the slots in all the domains.\n\n## Dependency\nCheck the required packages, or simply run the command\n```console\n❱❱❱ pip install -r requirements.txt\n```\nIf you run into an error related to Cython, try upgrading it first.\n```console\n❱❱❱ pip install --upgrade cython\n```\n\n\n## Multi-Domain DST\nTraining\n```console\n❱❱❱ python3 myTrain.py -dec=TRADE -bsz=32 -dr=0.2 -lr=0.001 -le=1\n```\nTesting\n```console\n❱❱❱ python3 myTest.py -path=${save_path}\n```\n* -bsz: batch size\n* -dr: dropout ratio\n* -lr: learning rate\n* -le: load pretrained embeddings\n* -path: path to the saved model\n\n\u003e [2019.08 Update] The decoder can now generate all the (domain, slot) pairs in one batch at the same time to speed up the decoding process. 
Set the flag \"--parallel_decode=1\" to decode all (domain, slot) pairs in one batch.\n\n\n## Unseen Domain DST\n\n#### Zero-Shot DST\nTraining\n```console\n❱❱❱ python3 myTrain.py -dec=TRADE -bsz=32 -dr=0.2 -lr=0.001 -le=1 -exceptd=${domain}\n```\nTesting\n```console\n❱❱❱ python3 myTest.py -path=${save_path} -exceptd=${domain}\n```\n* -exceptd: the domain to exclude; choose one of {hotel, train, attraction, restaurant, taxi}.\n\n#### Few-Shot DST with CL\nTraining\nNaive \n```console\n❱❱❱ python3 fine_tune.py -bsz=8 -dr=0.2 -lr=0.001 -path=${save_path_except_domain} -exceptd=${except_domain}\n```\nEWC\n```console\n❱❱❱ python3 EWC_train.py -bsz=8 -dr=0.2 -lr=0.001 -path=${save_path_except_domain} -exceptd=${except_domain} -fisher_sample=10000 -l_ewc=${lambda}\n```\nGEM\n```console\n❱❱❱ python3 GEM_train.py -bsz=8 -dr=0.2 -lr=0.001 -path=${save_path_except_domain} -exceptd=${except_domain}\n```\n* -l_ewc: lambda value in EWC training\n\n## Other Notes\n- We found that results can vary across runs, especially in the few-shot setting. The experiments reported in the paper use a single random seed (seed=10). For results averaged over three runs, please see our [ACL presentation](https://jasonwu0731.github.io/files/TRADE-DST-ACL-2019.pdf). 
\n\n## Bug Report\nFeel free to create an issue or send email to jason.wu@connect.ust.hk\n\n## License\n```\ncopyright 2019-present https://jasonwu0731.github.io/\n\nPermission is hereby granted, free of charge, to any person obtaining a copy \nof this software and associated documentation files (the \"Software\"), to deal \nin the Software without restriction, including without limitation the rights \nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell \ncopies of the Software, and to permit persons to whom the Software is \nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all \ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR \nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, \nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE \nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER \nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, \nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE \nSOFTWARE.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjasonwu0731%2Ftrade-dst","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjasonwu0731%2Ftrade-dst","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjasonwu0731%2Ftrade-dst/lists"}