Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jonaylor89/wineinamillion
Wine Recommender created with sentence-BERT and NearestNeighbor on AWS SageMaker
bert jupyter-notebook nlp python pytorch sagemaker sentence-bert sentence-embeddings sklearn
- Host: GitHub
- URL: https://github.com/jonaylor89/wineinamillion
- Owner: jonaylor89
- License: MIT
- Created: 2019-10-19T06:45:25.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2023-03-06T20:45:03.000Z (almost 2 years ago)
- Last Synced: 2024-05-01T16:51:15.817Z (8 months ago)
- Topics: bert, jupyter-notebook, nlp, python, pytorch, sagemaker, sentence-bert, sentence-embeddings, sklearn
- Language: Jupyter Notebook
- Homepage:
- Size: 1.53 MB
- Stars: 9
- Watchers: 3
- Forks: 1
- Open Issues: 10
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Wine in a Million
Wine Recommender created with sentence-BERT and NearestNeighbor on AWS SageMaker
Authors: [Zephyr](https://github.com/JZHeadley) and [Johannes](https://jonaylor.xyz)
[Blog Post](https://blog.jonaylor.xyz/create-a-wine-recommender-using-nlp-on-aws)
-----
## Overview
In the associated Jupyter notebook, we demonstrate how BERT can be used in tandem with Nearest Neighbors to build a recommendation engine that takes natural language as input. To do this, we use a dataset of about 130k wine reviews. We use sentence-BERT to convert each review into a sentence embedding (i.e. a vector) and store those embeddings in AWS S3. The stored embeddings then serve as the dataset for the Nearest Neighbors algorithm, which accepts new user input, creates an embedding for it, and finds the K closest embeddings to that input. In essence, it finds the wines whose reviews are most similar to what the user typed.
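The embed-then-search flow described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's code: `toy_embed` is a hypothetical bag-of-words stand-in for the sentence-BERT encoder, the brute-force `k_nearest` stands in for a Nearest Neighbors model, the four reviews are invented for the example, and the S3 storage step is left out entirely.

```python
import math
import re
from collections import Counter


def toy_embed(text, vocab):
    """Stand-in for a sentence-BERT encoder: a bag-of-words count vector.

    Assumption: the real pipeline would call a sentence-BERT model here.
    """
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [float(counts[word]) for word in vocab]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def k_nearest(query_vec, vectors, k):
    """Brute-force search: indices of the k vectors closest to query_vec."""
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cosine_distance(query_vec, vectors[i]))
    return ranked[:k]


# Hypothetical reviews, invented for illustration (not from the dataset).
reviews = [
    "Crisp and dry with bright citrus and green apple notes.",
    "Bold red with dark cherry, oak, and firm tannins.",
    "Sweet dessert wine with honey and apricot flavors.",
    "Light dry white with lemon citrus and mineral finish.",
]

# Build a shared vocabulary, then embed every review once up front.
vocab = sorted({w for r in reviews for w in re.findall(r"[a-z]+", r.lower())})
review_vecs = [toy_embed(r, vocab) for r in reviews]

# Embed the user's request the same way and look up its neighbors.
query = "a dry white with citrus"
nearest = k_nearest(toy_embed(query, vocab), review_vecs, k=2)
for i in nearest:
    print(reviews[i])
```

With these toy reviews, the query ranks the "Light dry white" review first and the "Crisp and dry" review second. Swapping `toy_embed` for a real sentence encoder should let paraphrases match as well, since dense embeddings do not depend on exact word overlap the way this stand-in does.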