Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/gil--/shoppy-gpt

A Nextjs + Pinecone + OpenAI GPT site to surface answers to Shopify help center content
nextjs openai pinecone shopify

Last synced: 4 months ago

Host: GitHub
URL: https://github.com/gil--/shoppy-gpt
Owner: gil--
Created: 2023-02-06T02:32:50.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2023-02-06T14:52:29.000Z (about 2 years ago)
Last Synced: 2024-05-01T14:32:18.099Z (9 months ago)
Topics: nextjs, openai, pinecone, shopify
Language: JavaScript
Homepage:
Size: 167 KB
Stars: 63
Watchers: 4
Forks: 14
Open Issues: 0
Metadata Files:
- Readme: README.md

README

**This is a research demo. Support is not provided.**

# Shopify Help Center Search via GPT
Quickly surface answers from Shopify's help center using GPT.

## Technologies used
- ScrapingBee to scrape the list of help center URLs
- MongoDB to store the scraped data
- OpenAI to create embedding vectors and run the completion prompt
- Pinecone to store the vectors

## How this works
1. Run `tasks/1-sitemap-to-csv.js` to convert Shopify's Help Center Sitemap.xml into CSV and drop all columns except the URLs.
2. Convert the CSV into an array of links.
3. Run `tasks/2-scrape.js` to scrape the article text from every link using ScrapingBee and insert the text into MongoDB, using the URL as a unique index.
4. Run `tasks/3-generate-embeddings.js` to generate OpenAI embeddings and upsert into Pinecone.
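
Steps 1 and 2 above can be sketched roughly as follows. This is a minimal illustration, not the code in `tasks/1-sitemap-to-csv.js`: the function names `extractSitemapUrls`, `urlsToCsv`, and `csvToUrls` are hypothetical, the sample sitemap uses a placeholder domain, and it assumes a standard sitemap with `<loc>` elements and URLs that contain no commas or quotes.

```javascript
// Hypothetical helpers, not taken from the repository.

// Step 1a: pull every <loc> URL out of a sitemap XML string.
function extractSitemapUrls(xml) {
  const found = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  for (const match of xml.matchAll(locPattern)) {
    // Sitemaps escape "&" as "&amp;", so undo that to get usable links.
    found.push(match[1].replace(/&amp;/g, "&"));
  }
  // Deduplicate while preserving order.
  return [...new Set(found)];
}

// Step 1b: write a single-column CSV with a "url" header.
// Assumes URLs need no CSV quoting (no commas or double quotes).
function urlsToCsv(urlList) {
  return ["url", ...urlList].join("\n");
}

// Step 2: read the single-column CSV back into an array of links.
function csvToUrls(csv) {
  return csv
    .split(/\r?\n/)
    .slice(1) // skip the header row
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Placeholder sitemap for illustration only.
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset>
  <url><loc>https://help.example.com/en/manual/orders</loc><lastmod>2023-01-01</lastmod></url>
  <url><loc>https://help.example.com/en/manual/products?a=1&amp;b=2</loc></url>
</urlset>`;

const links = csvToUrls(urlsToCsv(extractSitemapUrls(sampleSitemap)));
console.log(links);
```

A regular expression is enough here because only the `<loc>` values are kept; a real XML parser would be the safer choice for sitemaps with CDATA sections or nested sitemap indexes.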

## Why?
I created this as a research experiment to learn OpenAI embeddings and Pinecone. An added bonus was a way to quickly surface answers to my Shopify platform questions.

## How to optimize this further
- Split article text into smaller documents, for example by H2/section, to reduce token usage and cost.
- Test different models to compare cost. Curie is 10x cheaper than Davinci.
- Search documents with a normal search engine (Algolia) and pass the matching document to OpenAI rather than using embeddings and Pinecone.
- Cache results for common queries.
- Test a shorter prompt to further save tokens.
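
Two of the ideas above, splitting by H2 and caching common queries, might look something like the sketch below. It assumes the scraped article is available as an HTML string containing `<h2>` tags, which the README does not confirm. `splitByH2`, `stripTags`, and `createQueryCache` are hypothetical names, and `answerFn` stands in for whatever function performs the embedding lookup and completion.

```javascript
// Hypothetical helpers, not part of the repository.

// Remove tags and collapse whitespace to get plain text.
function stripTags(html) {
  return html.replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").trim();
}

// Split an article into one document per <h2> section.
// Content before the first <h2> becomes a section with heading null.
function splitByH2(html) {
  // The lookahead splits just before each opening <h2> tag,
  // so every heading stays attached to the text that follows it.
  return html
    .split(/(?=<h2[\s>])/i)
    .map((part) => {
      const headingMatch = part.match(/<h2[^>]*>([\s\S]*?)<\/h2>/i);
      return {
        heading: headingMatch ? stripTags(headingMatch[1]) : null,
        text: stripTags(part),
      };
    })
    .filter((section) => section.text.length > 0);
}

// Wrap an async answer function with a small in-memory cache.
// Keys are normalized so "Refunds?" and " refunds? " share an entry.
function createQueryCache(answerFn, maxEntries = 100) {
  const cache = new Map();
  return async function cachedAnswer(query) {
    const key = query.trim().toLowerCase();
    if (cache.has(key)) return cache.get(key);
    const answer = await answerFn(query);
    if (cache.size >= maxEntries) {
      // Map preserves insertion order, so the first key is the oldest.
      cache.delete(cache.keys().next().value);
    }
    cache.set(key, answer);
    return answer;
  };
}

const sampleArticle =
  "<p>Intro text.</p><h2>Refunds</h2><p>How to refund.</p>" +
  '<h2 id="taxes">Taxes</h2><p>How taxes work.</p>';
console.log(splitByH2(sampleArticle));
```

Each section could then be embedded and stored separately, so a query only pulls the relevant section into the prompt instead of the whole article. The cache here lives in process memory and is lost on restart; a shared store would be needed across multiple server instances.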

## Preview
![preview.png](./preview.png)