https://github.com/micahcantor/redditsemanticsearchflask

Host: GitHub
URL: https://github.com/micahcantor/redditsemanticsearchflask
Owner: micahcantor
Created: 2023-04-20T04:16:34.000Z (about 2 years ago)
Default Branch: master
Last Pushed: 2023-05-01T01:06:57.000Z (almost 2 years ago)
Last Synced: 2025-01-13T18:45:53.559Z (3 months ago)
Language: Python
Size: 5.54 MB
Stars: 1
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Reddit/Semantic Search Forum

This repository contains the demo code for a technical article on Cohere's semantic search API. The app is written in [Flask](https://flask.palletsprojects.com/en/2.2.x/) and styled with [Bootstrap](https://getbootstrap.com/). Here's a screenshot of what it looks like:

![demo](./demo-preview.png)

## Credentials

To run the app, credentials are needed to access the Reddit and Cohere APIs. Create a `credentials.json` file at the root level of this repository with the following structure:

```json
{
  "client_id": "REDDIT_CLIENT_ID",
  "client_secret": "REDDIT_CLIENT_SECRET",
  "username": "REDDIT_USERNAME",
  "password": "REDDIT_PASSWORD",
  "cohere_api_key": "COHERE_API_KEY"
}
```
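As a quick sanity check before starting the app, the file can be validated with a few lines of Python. This is a minimal sketch, not code from the repository: the helper name `load_credentials` and the `REQUIRED_KEYS` tuple are hypothetical, and the only thing taken from this README is the list of keys shown above.

```python
import json
from pathlib import Path

# Keys taken from the credentials.json example in this README.
REQUIRED_KEYS = ("client_id", "client_secret", "username", "password", "cohere_api_key")


def load_credentials(path="credentials.json"):
    """Hypothetical helper: read the file and fail early on missing or empty keys."""
    creds = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if not creds.get(key)]
    if missing:
        raise KeyError(f"credentials file is missing: {', '.join(missing)}")
    return creds
```

Calling `load_credentials()` from the repository root returns the parsed dictionary, or raises a `KeyError` that names the keys still to be filled in.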

## Development

First, create a new Python virtual environment, activate it, and install the dependencies:

```sh
python -m venv venv
. venv/bin/activate
python -m pip install -r requirements.txt
```

Then, to start a development server for the app, run `flask run` from the root level of this repository.

## Local Database

The app relies on a local database of cached subreddit data and the FAISS index built for each subreddit. The database is built with the scripts `get_reddit_data.py` and `get_embeddings.py`, run in that order. To set up the database from scratch, do the following:

```sh
touch subreddit_data_db.json
python get_reddit_data.py programming
python get_reddit_data.py technology
python get_embeddings.py
```

First we create the empty database JSON file. Then we fetch the Reddit data for each subreddit we want (here, `programming` and `technology`). Finally we generate the embeddings for these subreddits. The indices for these embeddings are stored in a new `indices/` directory.
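To give a rough idea of what a lookup against such an index does, here is a conceptual sketch in plain Python. It does not use FAISS or Cohere and is not taken from the repository's scripts: the function names are hypothetical, the vectors in the usage note are made-up toy values, and cosine similarity is an assumption (the metric the app's indices actually use is not stated here).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def search(query_vec, post_vecs, k=3):
    """Return the positions of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(post_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]
```

For example, with toy post vectors `[[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]` and the query `[1.0, 0.0]`, `search(..., k=2)` ranks the first and third posts highest. A real index performs the same kind of nearest-neighbor ranking over high-dimensional embeddings, only far more efficiently.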

## Deployment

To deploy the app, refer to the Flask documentation on [deploying to production](https://flask.palletsprojects.com/en/2.2.x/deploying/).
