{"id":13958282,"url":"https://github.com/HKUDS/RLMRec","last_synced_at":"2025-07-20T23:31:02.411Z","repository":{"id":203345160,"uuid":"708685675","full_name":"HKUDS/RLMRec","owner":"HKUDS","description":"[WWW'2024] \"RLMRec: Representation Learning with Large Language Models for Recommendation\"","archived":false,"fork":false,"pushed_at":"2024-06-26T19:57:38.000Z","size":473380,"stargazers_count":401,"open_issues_count":3,"forks_count":51,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-07-04T00:08:41.827Z","etag":null,"topics":["collaborative-filtering","graph-neural-networks","large-language-models","recommendation","recommender-systems"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2310.15950","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HKUDS.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-23T07:18:57.000Z","updated_at":"2025-06-26T08:46:17.000Z","dependencies_parsed_at":"2024-01-27T08:20:00.711Z","dependency_job_id":"a0738d2f-b5fd-4689-a2fb-99506a4369bd","html_url":"https://github.com/HKUDS/RLMRec","commit_stats":null,"previous_names":["hkuds/rlmrec"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/HKUDS/RLMRec","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HKUDS%2FRLMRec","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HKUDS%2FRLMRec/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HKUDS%2FRLMRec/releases","manifests_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/repositories/HKUDS%2FRLMRec/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HKUDS","download_url":"https://codeload.github.com/HKUDS/RLMRec/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HKUDS%2FRLMRec/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266214683,"owners_count":23893936,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["collaborative-filtering","graph-neural-networks","large-language-models","recommendation","recommender-systems"],"created_at":"2024-08-08T13:01:28.555Z","updated_at":"2025-07-20T23:31:01.542Z","avatar_url":"https://github.com/HKUDS.png","language":"Python","funding_links":[],"categories":["推荐系统算法库与列表"],"sub_categories":["网络服务_其他"],"readme":"# RLMRec: Representation Learning with Large Language Models for Recommendation\n\n\u003cimg src='RLMRec_cover.png' /\u003e\n\n This is the PyTorch implementation by \u003ca href='https://github.com/Re-bin'\u003e@Re-bin\u003c/a\u003e for RLMRec model proposed in this [paper](https://arxiv.org/abs/2310.15950):\n\n \u003e**Representation Learning with Large Language Models for Recommendation**  \n \u003eXubin Ren, Wei Wei, Lianghao Xia, Lixin Su, Suqi Cheng, Junfeng Wang, Dawei Yin, Chao Huang*\\\n \u003e*WWW2024*\n\n\n\\* denotes corresponding author\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"RLMRec.png\" alt=\"RLMRec\" /\u003e\n\u003c/p\u003e\n\nIn this paper, we propose a model-agnostic framework **RLMRec** that enhances existing recommenders with 
LLM-empowered representation learning. It proposes a paradigm that integrates representation learning with LLMs to capture intricate semantic aspects of user behaviors and preferences. RLMRec incorporates auxiliary textual signals, develops a user/item profiling paradigm empowered by LLMs, and aligns the semantic space of LLMs with the representation space of collaborative relational signals through a cross-view alignment framework.\n\n## 📝 Environment\nYou can run the following command to clone the repository faster (a shallow clone that fetches only the latest commit):\n```bash\ngit clone --depth 1 https://github.com/HKUDS/RLMRec.git\n```\n\nThen run the following commands to create a conda environment:\n\n```bash\nconda create -y -n rlmrec python=3.9\nconda activate rlmrec\npip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116\npip install torch-scatter -f https://data.pyg.org/whl/torch-1.13.1+cu117.html\npip install torch-sparse -f https://data.pyg.org/whl/torch-1.13.1+cu117.html\npip install pyyaml tqdm\n```\n\n😉 The code is built on the [SSLRec](https://github.com/HKUDS/SSLRec) framework.\n\n## 📚 Text-attributed Recommendation Dataset\n\nWe use three public datasets to evaluate RLMRec: *Amazon-book, Yelp,* and *Steam*.\n\nEach user and item has a generated text description.\n\nFirst, please **download the data** by running the following commands:\n ```\n cd data/\n wget https://archive.org/download/rlmrec_data/data.zip\n unzip data.zip\n ```\n\nYou can also download our data from [Google Drive](https://drive.google.com/file/d/1PzePFsBcYofG1MV2FisFLBM2lMytbMdW/view?usp=sharing).\n\n\nEach dataset consists of a training set, a validation set, and a test set. 
During training, we use the validation set to decide when to stop (early stopping) in order to prevent overfitting.\n```\n- amazon(yelp/steam)\n|--- trn_mat.pkl    # training set (sparse matrix)\n|--- val_mat.pkl    # validation set (sparse matrix)\n|--- tst_mat.pkl    # test set (sparse matrix)\n|--- usr_prf.pkl    # text description of users\n|--- itm_prf.pkl    # text description of items\n|--- usr_emb_np.pkl # user text embeddings\n|--- itm_emb_np.pkl # item text embeddings\n```\n\n### User/Item Profile\n- Each profile is a **high-quality text description** of a user/item.\n- Both user and item profiles are generated by **Large Language Models** from raw text data.\n- The `user profile` (in `usr_prf.pkl`) describes the types of items that the user tends to prefer. \n- The `item profile` (in `itm_prf.pkl`) describes the types of users that the item is likely to attract. \n\n😊 You can run `python data/read_profile.py` to read the profiles, for example:\n```\n$ python data/read_profile.py\nUser 123's Profile:\n\nPROFILE: Based on the kinds of books the user has purchased and reviewed, they are likely to enjoy historical\nfiction with strong character development, exploration of family dynamics, and thought-provoking themes. The user \nalso seems to enjoy slower-paced plots that delve deep into various perspectives. Books with unexpected twists, \nconnections between unrelated characters, and beautifully descriptive language could also be a good fit for \nthis reader.\n\nREASONING: The user has purchased several historical fiction novels such as 'Prayers for Sale' and 'Fall of \nGiants' which indicate an interest in exploring the past. Furthermore, the books they have reviewed, like 'Help \nfor the Haunted' and 'The Leftovers,' involve complex family relationships. 
Additionally, the user appreciates \nthought-provoking themes and character-driven narratives as shown in their review of 'The Signature of All \nThings' and 'The Leftovers.' The user also enjoys descriptive language, as demonstrated in their review of \n'Prayers for Sale.'\n```\n\n### Semantic Representation\n- Each user and item has a semantic embedding encoded from its own profile using **Text Embedding Models**.\n- The encoded semantic embeddings are stored in `usr_emb_np.pkl` and `itm_emb_np.pkl`.\n\n### Mapping to Original Data\n\nThe original data behind our datasets can be found at the following links (thanks to the authors for their work):\n- Yelp: https://www.yelp.com/dataset\n- Amazon-book: https://cseweb.ucsd.edu/~jmcauley/datasets/amazon/links.html\n- Steam: https://github.com/kang205/SASRec\n\nWe provide **mapping dictionaries** in JSON format in the `data/mapper` folder to map each `user/item ID` in our processed data to its `original identifier` in the original data (e.g., the ASIN for items in Amazon-book).\n\n🤗 You are welcome to use our processed data in your research!\n\n## 🚀 Examples of Running the Code\n\nThe commands to evaluate the backbone models and RLMRec are as follows. 
\n\n  - Backbone:\n\n    ```python encoder/train_encoder.py --model {model_name} --dataset {dataset} --cuda 0```   \n\n  - RLMRec-Con **(Contrastive Alignment)**:\n\n    ```python encoder/train_encoder.py --model {model_name}_plus --dataset {dataset} --cuda 0```\n\n  - RLMRec-Gen **(Generative Alignment)**:\n\n    ```python encoder/train_encoder.py --model {model_name}_gene --dataset {dataset} --cuda 0```\n\nSupported models/datasets:\n\n* model_name:  `gccf`, `lightgcn`, `sgl`, `simgcl`, `dccf`, `autocf`\n* dataset: `amazon`, `yelp`, `steam`\n\nHyperparameters:\n\n* The hyperparameters of each model are stored in `encoder/config/modelconf` (obtained by grid search).\n\n **For advanced usage of arguments, run the code with the `--help` argument.**\n\n## 🔮 Profile Generation and Semantic Representation Encoding\nHere we provide some examples using the *Yelp* data to generate user/item profiles and semantic representations.\n\nFirst, complete the following three steps:\n- Install the openai library: `pip install openai`\n- Prepare your **OpenAI API Key**\n- Enter your key on `Line 5` of these files: `generation/{item/user/emb}/generate_{profile/emb}.py`.\n\nThen, here are the commands to generate the desired output with examples:\n\n  - **Item Profile Generation**:\n\n    ```python generation/item/generate_profile.py```   \n\n  - **User Profile Generation**:\n\n    ```python generation/user/generate_profile.py```\n\n  - **Semantic Representation**:\n\n    ```python generation/emb/generate_emb.py```\n\nFor semantic representation encoding, you can also try other text embedding models like [Instructor](https://github.com/xlang-ai/instructor-embedding) or [Contriever](https://github.com/facebookresearch/contriever).\n\n😀 The **instructions** we designed are saved in the `{user/item}_system_prompt.txt` files and in the `generation/instruction` folder. 
You can modify them to suit your requirements and generate the desired output!\n\n## 🌟 Citation\nIf you find this work helpful for your research, please consider citing our paper:\n```bibtex\n@inproceedings{ren2024representation,\n  title={Representation learning with large language models for recommendation},\n  author={Ren, Xubin and Wei, Wei and Xia, Lianghao and Su, Lixin and Cheng, Suqi and Wang, Junfeng and Yin, Dawei and Huang, Chao},\n  booktitle={Proceedings of the ACM on Web Conference 2024},\n  pages={3464--3475},\n  year={2024}\n}\n```\n\n**Thanks for your interest in our work!**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FHKUDS%2FRLMRec","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FHKUDS%2FRLMRec","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FHKUDS%2FRLMRec/lists"}