https://github.com/HeardACat/recsys-deploy
Recommendation System Deployment
- Host: GitHub
- URL: https://github.com/HeardACat/recsys-deploy
- Owner: HeardACat
- Created: 2021-10-04T06:30:29.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2021-10-22T08:36:57.000Z (over 3 years ago)
- Last Synced: 2025-01-31T17:20:03.624Z (5 months ago)
- Language: Python
- Size: 129 KB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# recsys-deploy
Recommendation System Deployment
# Setup and Requirements
Our project uses Python and the FAISS library, which must be installed before running our code. Unfortunately, FAISS has no officially supported `pip install` option.
The easiest way to set it up is with conda:
```sh
conda install -c conda-forge faiss-cpu
```
# Training
The easiest way to run the whole thing end to end is to:
* move `tags_on_posts_sample.csv` to `data/tags_on_posts_sample.csv`
* automatically install the package and its dependencies, then train the model, via `make train_quick_run`

```sh
conda create -n recsys-deploy -c conda-forge python=3.8 faiss-cpu -y
conda activate recsys-deploy
wget -O data/tags_on_posts_sample.csv
make train_quick_run
```
This won't train `LSI` to completion, but it shouldn't take more than a few minutes. It will then build and run the container. The longer form:
```sh
conda create -n recsys-deploy -c conda-forge python=3.8 faiss-cpu -y
conda activate recsys-deploy
wget -O data/tags_on_posts_sample.csv
conda install -c conda-forge faiss-cpu # see setup notes above
pip install -e .
mkdir -p notebooks/model_quick
wget -O notebooks/model_quick/lid.176.ftz https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.ftz
python notebooks/model_quick.py
cp -r notebooks/model_quick recsys/model_quick
docker build -t recsys -f docker/Dockerfile.api .
docker run --rm -it -p 8000:8000 recsys
```
In a separate shell:
```sh
pip install -e ".[dev]"
python benchmark/benchmark.py
```
# Usage
You can try the CLI as shown below:
```sh
$ python -m recsys.cli '{"query":["dog", "dog park"], "limit": 5}'
[{'tag': 'dogsarefamily', 'score': 31.691408157348633}, {'tag': 'cute dogs', 'score': 31.691408157348633}, {'tag': 'huskylove', 'score': 31.691408157348633}, {'tag': 'petlovers', 'score': 31.691408157348633}, {'tag': 'mtblife', 'score': 35.07106399536133}]
```
```sh
$ python -m recsys.cli '{"query":["广州"], "limit": 5}'
[{'tag': 'c25', 'score': 98.29067993164062}, {'tag': '混凝土直销', 'score': 98.29067993164062}, {'tag': 'sherlolly', 'score': 99.99999237060547}, {'tag': '三遊亭わん丈', 'score': 99.99999237060547}, {'tag': '春風亭一蔵', 'score': 99.99999237060547}]
```
# Benchmark results
You can build and start the API server as follows (`podman` was used on Ubuntu 21.04):
```sh
make podman_build
make podman_run
```
The benchmark can be run with `python benchmark/benchmark.py`, which assumes port `8000` is used (as hard-coded in the `Makefile`).
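Because the benchmark assumes the server is already listening on port `8000`, a readiness check before starting it can avoid spurious connection errors. The following is a minimal sketch, assuming the API is reachable on `localhost`; `wait_for_port` is a hypothetical helper and is not part of this repository.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 8000, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that something is listening;
            # it does not prove the API itself is healthy.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused or timed out: back off briefly and retry.
            time.sleep(0.5)
    return False
```

Calling `wait_for_port()` with its defaults polls `localhost:8000` for up to 30 seconds, so it could be run in the second shell just before the benchmark.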
The results below were obtained against the container started as above:
```
$ python benchmark/benchmark.py
100%|███████████████████████████████████████████████████████████████████████████| 2000/2000 [03:14<00:00, 10.29it/s]
   percentile   score
0          50   92.00
1          75  105.00
2          90  122.00
3          95  133.00
4          99  158.01
```
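The table above has the shape of a percentile summary over per-request measurements. As a rough illustration of how such a summary can be derived from raw timings, here is a minimal sketch. It is not the code in `benchmark/benchmark.py`: the helper names are hypothetical, and reporting milliseconds is an assumption, since the unit of `score` is not stated here.

```python
import time
from typing import Callable, Dict, List, Sequence


def percentile(sorted_samples: Sequence[float], pct: float) -> float:
    """Percentile with linear interpolation between the two closest ranks."""
    if not sorted_samples:
        raise ValueError("no samples")
    rank = (len(sorted_samples) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_samples) - 1)
    frac = rank - lower
    return sorted_samples[lower] + (sorted_samples[upper] - sorted_samples[lower]) * frac


def time_calls(call: Callable[[], object], n: int) -> List[float]:
    """Run `call` n times and return each duration in milliseconds (assumed unit)."""
    timings = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings: Sequence[float], pcts: Sequence[int] = (50, 75, 90, 95, 99)) -> Dict[int, float]:
    """Map each requested percentile to its value over the given timings."""
    ordered = sorted(timings)
    return {p: percentile(ordered, p) for p in pcts}
```

In practice `call` would wrap a single request to the running API; a no-op such as `lambda: None` is enough to exercise the helpers.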