Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/THUDM/ComiRec

Source code and dataset for KDD 2020 paper "Controllable Multi-Interest Framework for Recommendation"
https://github.com/THUDM/ComiRec

controllable multi-interest recommendation recommender-system

Last synced: about 2 months ago
JSON representation

Source code and dataset for KDD 2020 paper "Controllable Multi-Interest Framework for Recommendation"

Awesome Lists containing this project

README

        

# Controllable Multi-Interest Framework for Recommendation

Original implementation for paper [Controllable Multi-Interest Framework for Recommendation](https://arxiv.org/abs/2005.09347).

[Yukuo Cen](https://sites.google.com/view/yukuocen), Jianwei Zhang, Xu Zou, Chang Zhou, [Hongxia Yang](https://sites.google.com/site/hystatistics/home), [Jie Tang](http://keg.cs.tsinghua.edu.cn/jietang/)

Accepted to KDD 2020 ADS Track!

## Prerequisites

- Python 3
- TensorFlow-GPU >= 1.8 (< 2.0)
- Faiss-GPU

## Getting Started

### Installation

- Install TensorFlow-GPU 1.x

- Install Faiss-GPU based on the instructions here: https://github.com/facebookresearch/faiss/blob/master/INSTALL.md

- Clone this repo `git clone https://github.com/THUDM/ComiRec`.

### Dataset

- Original links of datasets are:

- http://jmcauley.ucsd.edu/data/amazon/index.html
- https://tianchi.aliyun.com/dataset/dataDetail?dataId=649&userId=1

- Two preprocessed datasets can be downloaded through:

- Tsinghua Cloud: https://cloud.tsinghua.edu.cn/f/e5c4211255bc40cba828/?dl=1
- Dropbox: https://www.dropbox.com/s/m41kahhhx0a5z0u/data.tar.gz?dl=1

- You can also download the original datasets and preprocess them by yourself. You can run `python preprocess/data.py {dataset_name}` and `python preprocess/category.py {dataset_name}` to preprocess the datasets.

### Training

#### Training on the existing datasets

You can use `python src/train.py --dataset {dataset_name} --model_type {model_name}` to train a specific model on a dataset. Other hyperparameters can be found in the code. (If you share the server with others or you want to use the specific GPU(s), you may need to set `CUDA_VISIBLE_DEVICES`.)

For example, you can use `python src/train.py --dataset book --model_type ComiRec-SA` to train ComiRec-SA model on Book dataset.

When training a ComiRec-DR model, you should set `--learning_rate 0.005`.

#### Training on your own datasets

If you want to train models on your own dataset, you should prepare the following three(or four) files:
- train/valid/test file: Each line represents an interaction, which contains three numbers `,,`.
- category file (optional): Each line contains two numbers `,` used for computing diversity..

## Common Issues

The computation of NDCG score.



I'm so sorry that the computation of NDCG score in the original version (now in `paper` branch) is not consistent with the definition in the paper, as mentioned in the issue #6.
I have updated the computation of NDCG score in the `master` branch according to the correct definition. For reproducing the NDCG scores reported in the paper, please use the `paper` branch.
By the way, I personally recommend to use the reported results of recall and hit rate only.

If you have ANY difficulties to get things working in the above steps, feel free to open an issue. You can expect a reply within 24 hours.

## Acknowledgement

The structure of our code is based on [MIMN](https://github.com/UIC-Paper/MIMN).

## Cite

Please cite our paper if you find this code useful for your research:

```
@inproceedings{cen2020controllable,
title = {Controllable Multi-Interest Framework for Recommendation},
author = {Cen, Yukuo and Zhang, Jianwei and Zou, Xu and Zhou, Chang and Yang, Hongxia and Tang, Jie},
booktitle = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
year = {2020},
pages = {2942–2951},
publisher = {ACM},
}
```