# MPC-BERT & GIFT for Multi-Party Conversation Understanding
This repository contains the source code for the following papers:
- [GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding](https://aclanthology.org/2023.acl-long.651.pdf). <br>
  Jia-Chen Gu, Zhen-Hua Ling, Quan Liu, Cong Liu, Guoping Hu <br>
  _ACL 2023_ <br>

- [MPC-BERT: A Pre-Trained Language Model for Multi-Party Conversation Understanding](https://aclanthology.org/2021.acl-long.285.pdf). <br>
  Jia-Chen Gu, Chongyang Tao, Zhen-Hua Ling, Can Xu, Xiubo Geng, Daxin Jiang <br>
  _ACL 2021_ <br>


## Introduction of MPC-BERT
Recently, various neural models for multi-party conversation (MPC) have achieved impressive improvements on tasks such as addressee recognition, speaker identification, and response prediction.
However, existing methods for MPC typically represent interlocutors and utterances individually, ignoring the inherently complicated structure of MPCs. This structure provides crucial interlocutor and utterance semantics that can enhance conversation understanding.
To this end, we present MPC-BERT, a pre-trained model for MPC understanding that learns who says what to whom in a unified model with several carefully designed self-supervised tasks.
These tasks fall into two categories: (1) interlocutor structure modeling, including reply-to utterance recognition, identical speaker searching, and pointer consistency distinction; and (2) utterance semantics modeling, including masked shared utterance restoration and shared node detection.
We evaluate MPC-BERT on three downstream tasks: addressee recognition, speaker identification, and response selection.
Experimental results show that MPC-BERT outperforms previous methods by large margins and achieves new state-of-the-art performance on all three downstream tasks on two benchmarks.

<div align=center><img src="image/result_addressee_recognition.png" width=80%></div>

<div align=center><img src="image/result_speaker_identification.png" width=80%></div>

<div align=center><img src="image/result_response_selection.png" width=80%></div>


## Introduction of GIFT
Addressing who says what to whom in multi-party conversations (MPCs) has recently attracted considerable research attention. However, existing methods for MPC understanding typically embed interlocutors and utterances into sequential information flows, or make only superficial use of the inherent graph structures in MPCs. To this end, we present a plug-and-play and lightweight method named graph-induced fine-tuning (GIFT), which can adapt various Transformer-based pre-trained language models (PLMs) for universal MPC understanding. Specifically, the full and uniform connections among utterances in a regular Transformer ignore the sparse but distinctive dependencies of utterances on one another in MPCs. To distinguish different relationships between utterances, four types of edges are designed to integrate graph-induced signals into the attention mechanism, refining PLMs originally designed for processing sequential text. We evaluate GIFT by implementing it in three PLMs and test the performance on three downstream tasks: addressee recognition, speaker identification, and response selection. Experimental results show that GIFT significantly improves the performance of all three PLMs on the three downstream tasks and two benchmarks with only 4 additional parameters per encoding layer, achieving new state-of-the-art performance on MPC understanding.

<div align=center><img src="image/result_addressee_recognition_gift.png" width=80%></div>

<div align=center><img src="image/result_speaker_identification_gift.png" width=80%></div>

<div align=center><img src="image/result_response_selection_gift.png" width=80%></div>
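The paragraph above describes four edge types whose graph-induced signals are injected into the attention mechanism at a cost of only 4 extra parameters per encoding layer. As a rough illustration of the idea (not the paper's exact formulation: the edge typing and the multiplicative modulation of attention logits below are assumptions), a minimal NumPy sketch:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def graph_induced_attention(Q, K, V, edge_types, edge_weights):
    """Scaled dot-product attention whose logits are modulated by a
    learned scalar per edge type (4 extra parameters per layer).

    Q, K, V:      [n, d] matrices, one row per utterance representation
    edge_types:   [n, n] integer matrix with entries in {0, 1, 2, 3},
                  encoding the relationship between utterance pairs
    edge_weights: [4] learned scalars, one per edge type
    """
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d)               # regular attention logits
    logits = logits * edge_weights[edge_types]  # graph-induced modulation
    return softmax(logits) @ V

# Toy example: 3 utterances with a hypothetical edge typing.
rng = np.random.default_rng(0)
n, d = 3, 8
Q, K, V = rng.normal(size=(3, n, d))
edge_types = np.array([[0, 1, 2],
                       [1, 0, 3],
                       [2, 3, 0]])
out = graph_induced_attention(Q, K, V, edge_types, np.ones(4))
# with all four edge weights equal to 1, this reduces to plain attention
```

Because the modulation is a single scalar per edge type, inserting it into every encoding layer of a PLM adds only 4 parameters per layer, which matches the lightweight character claimed above.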

## Dependencies
Python 3.6 <br>
TensorFlow 1.13.1


## Download
- Download the [BERT model released by Google Research](https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip)
  and move it to the path ./uncased_L-12_H-768_A-12 <br>

- We also release the [pre-trained MPC-BERT model](https://drive.google.com/file/d/1krmSuy83IQ0XXYyS9KfurnclmprRHgx_/view?usp=sharing);
  move it to the path ./uncased_L-12_H-768_A-12_MPCBERT. You only need to fine-tune it to reproduce our results. <br>

- Download the [Hu et al. (2019) dataset](https://drive.google.com/file/d/1qSw9X22oGGbuRtfaOAf3Z7ficn6mZgi9/view?usp=sharing) used in our paper
  and move it to the path ```./data/ijcai2019/``` <br>

- Download the [Ouchi and Tsuboi (2016) dataset](https://drive.google.com/file/d/1nMiH6dGZfWBoOGbIvyBJp8oxhD8PWSNc/view?usp=sharing) used in our paper
  and move it to the path ```./data/emnlp2016/``` <br>
  Unzip the dataset and run the following commands. <br>
  ```
  cd data/emnlp2016/
  python data_preprocess.py
  ```


## Pre-training
Create the pre-training data.
```
python create_pretraining_data.py
```
Run the pre-training process.
```
cd scripts/
bash run_pretraining.sh
```
The pre-trained model will be saved to the path ```./uncased_L-12_H-768_A-12_MPCBERT```. <br>
Rename the files in this folder to match those in Google's BERT release.


## Regular Fine-tuning and Testing
Take the task of addressee recognition as an example. <br>
Create the fine-tuning data.
```
python create_finetuning_data_ar.py
```
Run the fine-tuning process.
```
cd scripts/
bash run_finetuning.sh
```

Modify the variable ```restore_model_dir``` in ```run_testing.sh```. <br>
Run the testing process.
```
cd scripts/
bash run_testing.sh
```

## GIFT Fine-tuning and Testing
Take the task of addressee recognition as an example. <br>
Create the fine-tuning data.
```
python create_finetuning_data_ar_gift.py
```
Run the fine-tuning process.
```
cd scripts/
bash run_finetuning_gift.sh
```

Modify the variable ```restore_model_dir``` in ```run_testing_gift.sh```. <br>
Run the testing process.
```
cd scripts/
bash run_testing_gift.sh
```


## Downstream Tasks
Replace these scripts and their corresponding data when evaluating on the other downstream tasks.
```
create_finetuning_data_{ar, si, rs}_gift.py
run_finetuning_{ar, si, rs}_gift.py
run_testing_{ar, si, rs}_gift.py
```
Specifically, for the task of response selection, an ```output_test.txt``` file recording a score for each context-response pair will be saved to the path ```restore_model_dir``` after testing. <br>
Modify the variable ```test_out_filename``` in ```compute_metrics.py``` and then run ```python compute_metrics.py```; various metrics will be reported.
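To give a feel for the kind of ranking metric computed from such a score file, here is a purely illustrative sketch. The real file format and metric set are defined by the repository's ```compute_metrics.py```; the tab-separated `score<TAB>label` line format and fixed candidate group size assumed below are hypothetical.

```python
def recall_at_k(lines, group_size=10, k=1):
    """Illustrative R@k over a flat score file.

    Assumes (hypothetically) that each line holds "<score>\t<label>" and
    that every consecutive block of `group_size` lines scores the
    candidate responses for one context.
    """
    pairs = [(float(s), int(l)) for s, l in (ln.split("\t") for ln in lines)]
    hits = total = 0
    for i in range(0, len(pairs), group_size):
        group = pairs[i:i + group_size]
        # rank the candidates of this context by descending score
        ranked = sorted(group, key=lambda p: p[0], reverse=True)
        hits += any(label == 1 for _, label in ranked[:k])
        total += 1
    return hits / total

scores = ["0.9\t1", "0.1\t0", "0.2\t0",   # context 1: correct reply ranked first
          "0.8\t0", "0.3\t1", "0.7\t0"]   # context 2: correct reply ranked last
print(recall_at_k(scores, group_size=3, k=1))  # prints 0.5
```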

## Cite
If you find our work helpful or use the code, please cite the following papers:

```
@inproceedings{gu-etal-2023-gift,
    title = "{GIFT}: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding",
    author = "Gu, Jia-Chen  and
      Ling, Zhen-Hua  and
      Liu, Quan  and
      Liu, Cong  and
      Hu, Guoping",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-long.651",
    pages = "11645--11658",
}
```

```
@inproceedings{gu-etal-2021-mpc,
    title = "{MPC}-{BERT}: A Pre-Trained Language Model for Multi-Party Conversation Understanding",
    author = "Gu, Jia-Chen  and
      Tao, Chongyang  and
      Ling, Zhen-Hua  and
      Xu, Can  and
      Geng, Xiubo  and
      Jiang, Daxin",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-long.285",
    pages = "3682--3692",
}
```


## Acknowledgments
Thanks to Wenpeng Hu and Zhangming Chan for providing the processed Hu et al. (2019) dataset used in their [paper](https://www.ijcai.org/proceedings/2019/0696.pdf). <br>
Thanks to Ran Le for providing the processed Ouchi and Tsuboi (2016) dataset used in their [paper](https://www.aclweb.org/anthology/D19-1199.pdf). <br>
Thanks to Prasan Yapa for providing a [TF 2.0 version of MPC-BERT](https://github.com/CyraxSector/MPC-BERT-2.0).


## Update
Please keep an eye on this repository if you are interested in our work.
Feel free to contact us (gujc@ustc.edu.cn) or open issues.