Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/joisino/private-recsys

Code for "Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data?" (SDM 2022)
https://github.com/joisino/private-recsys

recommender-system user-side

Last synced: about 7 hours ago
JSON representation

Code for "Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data?" (SDM 2022)

Awesome Lists containing this project

README

        

# Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data? (SDM 2022)

We consider how a user of a web service can build their own recommender system. Many recommender systems on the Internet are still unfair/undesirable for some users, in which case the users need to leave the service or unwillingly continue to use the system. Our proposed concept, private recommender systems, provides a way for the users to resolve this dilemma.

Paper: https://arxiv.org/abs/2105.12353

## ๐Ÿ’ฟ Dependency

```
$ pip install -r requirements.txt
$ sudo apt install wget unzip
```

## ๐Ÿ—ƒ๏ธ Download and Preprocess Datasets

You can download and preprocess data by the following command. It may take time.

```
$ bash download.sh
```

`hetrec.npy` is the Last.fm dataset. `home_and_kitchen.npy` is the Amazon dataset. `adult_*.npy` and `adult_*.npz` are the Adult dataset.

## ๐Ÿงช Evaluation

```
$ python evaluate.py --data 100k --prov cosine --sensitive popularity
$ python evaluate.py --data 100k --prov bpr --sensitive popularity
$ python evaluate.py --data 100k --prov cosine --sensitive old
$ python evaluate.py --data 100k --prov bpr --sensitive old
$ python evaluate.py --data hetrec --prov bpr --sensitive popularity
$ python evaluate.py --data home --prov bpr --sensitive popularity
$ python evaluate_adult.py
```

* `100k` is the MovieLens 100k dataset. `hetrec` is the LastFM dataset. `home` is the Amazon Home and Kitchen dataset.
* `--prov` specifys the algorithm of the service provider's recommender system.
* `--sensitive` specifyies the sensitive attribute. `old` is available only for the MovieLens datasets.

These scripts compute the sums of recalls, NDCGs, least ratios, and entropies for all users. Be sure to divide these values by the number of users to obtain the average values.

When your environment supports multi-processing, run, for example, the following commands to speed up the computation (with background executions):

```
$ python evaluate.py --data 100k --prov cosine --sensitive popularity --split 7 --block 0
$ python evaluate.py --data 100k --prov cosine --sensitive popularity --split 7 --block 1
$ python evaluate.py --data 100k --prov cosine --sensitive popularity --split 7 --block 2
$ python evaluate.py --data 100k --prov cosine --sensitive popularity --split 7 --block 3
$ python evaluate.py --data 100k --prov cosine --sensitive popularity --split 7 --block 4
$ python evaluate.py --data 100k --prov cosine --sensitive popularity --split 7 --block 5
$ python evaluate.py --data 100k --prov cosine --sensitive popularity --split 7 --block 6
$ python summary.py 7
```

## ๐Ÿ–‹๏ธ Citation

```
@inproceedings{sato2022retrieving,
author = {Ryoma Sato},
title = {Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data?},
booktitle = {Proceedings of the 2022 {SIAM} International Conference on Data Mining, {SDM}},
year = {2022},
}
```