{"id":16779704,"url":"https://github.com/wxl1999/cfcrs","last_synced_at":"2025-10-10T19:08:32.907Z","repository":{"id":172344555,"uuid":"649154774","full_name":"wxl1999/CFCRS","owner":"wxl1999","description":"[KDD23] Official PyTorch implementation for \"Improving Conversational Recommendation Systems via Counterfactual Data Simulation\".","archived":false,"fork":false,"pushed_at":"2023-06-05T12:52:32.000Z","size":10961,"stargazers_count":10,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-10-10T19:08:29.060Z","etag":null,"topics":["conversation","conversational-ai","conversational-bots","conversational-recommendation","conversational-recommender-system","data-augmentation","data-augmentation-strategies","data-augmentations","dialog","dialogue","dialogue-systems","pretrained-language-model","pretrained-models","pretraining","recommendation","recommendation-system","recommender-system"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wxl1999.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-06-04T00:42:14.000Z","updated_at":"2025-06-04T08:03:10.000Z","dependencies_parsed_at":null,"dependency_job_id":"e25944e3-6c09-4b94-a610-257c388d2384","html_url":"https://github.com/wxl1999/CFCRS","commit_stats":null,"previous_names":["wxl1999/cfcrs"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/wxl1999/CFCRS","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/wxl1999%2FCFCRS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wxl1999%2FCFCRS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wxl1999%2FCFCRS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wxl1999%2FCFCRS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wxl1999","download_url":"https://codeload.github.com/wxl1999/CFCRS/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wxl1999%2FCFCRS/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279005030,"owners_count":26083826,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["conversation","conversational-ai","conversational-bots","conversational-recommendation","conversational-recommender-system","data-augmentation","data-augmentation-strategies","data-augmentations","dialog","dialogue","dialogue-systems","pretrained-language-model","pretrained-models","pretraining","recommendation","recommendation-system","recommender-system"],"created_at":"2024-10-13T07:31:46.536Z","updated_at":"2025-10-10T19:08:32.902Z","avatar_url":"https://github.com/wxl1999.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 
CFCRS\n\nThis is the official PyTorch implementation for the paper:\n\n\u003e Xiaolei Wang, Kun Zhou, Xinyu Tang, Wayne Xin Zhao, Fan Pan, Zhao Cao, Ji-Rong Wen. Improving Conversational Recommendation Systems via Counterfactual Data Simulation. KDD 2023.\n\n## Overview\n\nConversational recommender systems (CRSs) aim to provide recommendation services via natural language conversations. Although a number of approaches have been proposed for developing capable CRSs, they typically rely on sufficient training data for training. Since it is difficult to annotate recommendation-oriented dialogue datasets, existing CRS approaches often suffer from the issue of insufficient training due to the scarcity of training data.\n\nTo address this issue, in this paper, we propose a CounterFactual data simulation approach for CRS, named **CFCRS**, to alleviate the issue of data scarcity in CRSs. Our approach is developed based on the framework of counterfactual data augmentation, which gradually incorporates the rewriting to the user preference from a real dialogue without interfering with the entire conversation flow. To develop our approach, we characterize user preference and organize the conversation flow by the entities involved in the dialogue, and design a multi-stage recommendation dialogue simulator based on a conversation flow language model. Under the guidance of the learned user preference and dialogue schema, the flow language model can produce reasonable, coherent conversation flows, which can be further realized into complete dialogues. 
Based on the simulator, we perform the intervention at the representations of the interacted entities of target users, and design an adversarial training method with a curriculum schedule that can gradually optimize the data augmentation strategy.\n\n![model](asset/model.png)\n\n## Requirements\n\n- python == 3.8\n- pytorch == 1.8.1\n- cudatoolkit == 11.1.1\n- transformers == 4.21.3\n- pyg == 2.0.1\n- accelerate == 0.12\n- nltk == 3.6\n\nYou can also see ``requirements.txt``.\n\nWe only list the versions of key packages here. \n\n## Quick-Start\n\nWe run all experiments and tune hyperparameters on a GPU with 24GB memory. You can adjust `per_device_train_batch_size` and `per_device_eval_batch_size` in the script according to your GPU; the optimization hyperparameters (e.g., `learning_rate`) may then also need to be tuned.\n\nThe number after each command is used to set ``CUDA_VISIBLE_DEVICES``.\n\nYou can change ``save_dir_prefix`` in the script to set your own saving directory.\n\n### Training Recommendation Dialogue Simulator\n\n- dataset: [redial, inspired]\n\n```bash\nbash script/simualtor/{dataset}/train_FLM.sh 0\nbash script/simualtor/{dataset}/train_schema.sh 0\n```\n\n### Training CRS models\n\n- model: [KBRD, BARCOR, UniCRS]\n- dataset: [redial, inspired]\n\n```bash\nbash script/{model}/{dataset}/train_pre.sh 0  # only for UniCRS\nbash script/{model}/{dataset}/train_rec.sh 0\nbash script/{model}/{dataset}/train_cf.sh 0\nbash script/{model}/{dataset}/train_conv.sh 0\n```\n\n## Contact\n\nIf you have any questions about our paper or code, please send an email to wxl1999@foxmail.com.\n\n[//]: # (## Acknowledgement)\n\n[//]: # ()\n[//]: # (Please cite the following papers as the references if you use our codes or the processed datasets.)\n\n[//]: # ()\n[//]: # (```bibtex)\n\n[//]: # (@inproceedings{wang2022towards,)\n\n[//]: # (  title={Towards Unified Conversational Recommender Systems via Knowledge-Enhanced Prompt Learning},)\n\n[//]: # (  author={Wang, 
Xiaolei and Zhou, Kun and Wen, Ji-Rong and Zhao, Wayne Xin},)\n\n[//]: # (  booktitle={Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},)\n\n[//]: # (  pages={1929--1937},)\n\n[//]: # (  year={2022})\n\n[//]: # (})\n\n[//]: # (```)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwxl1999%2Fcfcrs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwxl1999%2Fcfcrs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwxl1999%2Fcfcrs/lists"}