https://github.com/dieg0code/portfolio_04_personal_intelligent_assistant
Serverless Personal Intelligent Assistant (PIA), with semantic search and Retrieval Augmented Generation (OpenAI, Supabase, PgVector)
aws golang lambda pgvector retrieval-augmented-generation supabase terraform
Last synced: 7 months ago
- Host: GitHub
- URL: https://github.com/dieg0code/portfolio_04_personal_intelligent_assistant
- Owner: Dieg0Code
- Created: 2024-09-16T23:25:05.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-27T05:29:17.000Z (8 months ago)
- Last Synced: 2025-01-27T06:27:09.408Z (8 months ago)
- Topics: aws, golang, lambda, pgvector, retrieval-augmented-generation, supabase, terraform
- Language: Go
- Homepage:
- Size: 137 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# PIA

## Configuration
### Environment Variables
Set the following environment variables:
```env
SUPABASE_URL=yoursupabaseurl
SUPABASE_KEY=yoursupabasekey
OPENAI_API_KEY=youropenaikey
```

*In my case, I'm using GitHub secrets to store these values.*
### Terraform
You also need to create an S3 bucket and a DynamoDB table to hold the Terraform state and its lock.
S3 bucket:
```bash
aws s3api create-bucket --bucket terraform-state-rag-diary --region sa-east-1 --create-bucket-configuration LocationConstraint=sa-east-1
```

Enable versioning for the bucket (optional):
```bash
aws s3api put-bucket-versioning --bucket terraform-state-rag-diary --versioning-configuration Status=Enabled
```

DynamoDB table:
```bash
aws dynamodb create-table \
  --table-name terraform_locks_diary \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region sa-east-1
```

### Supabase
I'm using Supabase and pgvector to store the diary entries, with a column that holds the OpenAI text embedding of each entry.
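To make the embedding step concrete, here is a rough Go sketch of requesting an embedding from the OpenAI embeddings endpoint and rendering the result in pgvector's text format. It assumes the `text-embedding-3-large` model, which returns 3072-dimensional vectors by default. The helper names (`newEmbeddingRequest`, `parseEmbedding`, `toVectorLiteral`) are hypothetical and not taken from the repository, and the sketch only builds the request and parses a sample response instead of calling the API.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// embeddingModel is an assumption: text-embedding-3-large produces
// 3072-dimensional vectors by default.
const embeddingModel = "text-embedding-3-large"

// newEmbeddingRequest builds (but does not send) a request to the
// OpenAI embeddings endpoint for a single piece of text.
func newEmbeddingRequest(apiKey, text string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"model": embeddingModel, "input": text})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, "https://api.openai.com/v1/embeddings", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// embeddingResponse mirrors only the response fields this sketch needs.
type embeddingResponse struct {
	Data []struct {
		Embedding []float32 `json:"embedding"`
	} `json:"data"`
}

// parseEmbedding extracts the first vector from a response body.
func parseEmbedding(raw []byte) ([]float32, error) {
	var resp embeddingResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	if len(resp.Data) == 0 {
		return nil, fmt.Errorf("response contains no embedding")
	}
	return resp.Data[0].Embedding, nil
}

// toVectorLiteral renders a vector in pgvector's text format, e.g. "[0.1,0.2]".
func toVectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, x := range v {
		parts[i] = strconv.FormatFloat(float64(x), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

func main() {
	// A shortened sample response; real vectors have 3072 values.
	raw := []byte(`{"data":[{"embedding":[0.25,-0.5,1]}]}`)
	vec, err := parseEmbedding(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(toVectorLiteral(vec)) // prints [0.25,-0.5,1]
}
```

The resulting string can be sent as the value of a `vector` column or parameter, where Postgres casts it to the `vector` type.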
First enable the pgvector extension (named `vector` in Postgres):
```sql
create extension if not exists vector;
```

Create a new table called `diary` with the following columns:
```sql
CREATE TABLE IF NOT EXISTS diary (
  id bigserial PRIMARY KEY,
  title text NOT NULL,
  content text NOT NULL,
  created_at timestamp NOT NULL DEFAULT NOW(),
  embedding vector(3072) NOT NULL -- openai text embedding (large)
);
```

Create a function called `search_diary`:
```sql
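-- Returns the diary entries closest to query_embedding.
-- Note: <#> is pgvector's negative inner product, so the "similarity"
-- column holds the negated inner product. Lower values mean a closer
-- match, and for normalized embeddings a threshold of -0.5 keeps the
-- entries whose inner product with the query is above 0.5.
create or replace function search_diary(
  query_embedding vector(3072),
  similarity_threshold float,
  match_count int
)
returns table (
  id bigint,
  title text,
  content text,
  created_at timestamp,
  similarity float
)
language plpgsql
as $$
begin
  return query
  select
    diary.id,
    diary.title,
    diary.content,
    diary.created_at,
    diary.embedding <#> query_embedding as similarity
  from
    diary
  where
    diary.embedding <#> query_embedding < similarity_threshold
  order by
    diary.embedding <#> query_embedding -- ascending: best match first
  limit
    match_count;
end;
$$;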
```

This function performs the semantic search.
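Create an index for the `embedding` column (optional). The operator class should match the operator used in `search_diary` (`vector_ip_ops` for `<#>`), otherwise the planner cannot use the index. Note that pgvector only indexes `vector` columns with up to 2,000 dimensions, so this statement fails on a `vector(3072)` column unless the embeddings are shortened or indexed as `halfvec`:
```sql
create index on public.diary
using ivfflat (embedding vector_ip_ops)
with (lists = 100);
```

---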