https://github.com/dieg0code/portfolio_04_personal_intelligent_assistant

Serverless Personal Intelligent Assistant (PIA), with semantic search and Retrieval Augmented Generation (OpenAI, Supabase, PgVector)

aws golang lambda pgvector retrieval-augmented-generation supabase terraform



Host: GitHub
URL: https://github.com/dieg0code/portfolio_04_personal_intelligent_assistant
Owner: Dieg0Code
Created: 2024-09-16T23:25:05.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2025-01-27T05:29:17.000Z (8 months ago)
Last Synced: 2025-01-27T06:27:09.408Z (8 months ago)
Topics: aws, golang, lambda, pgvector, retrieval-augmented-generation, supabase, terraform
Language: Go
Homepage:
Size: 137 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# PIA

![infra](infra.png)

## Configuration

### Environment Variables

Set the following environment variables:

```env
SUPABASE_URL=yoursupabaseurl
SUPABASE_KEY=yoursupabasekey
OPENAI_API_KEY=youropenaikey
```

*In my case, I'm using GitHub secrets to store these values.*
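Since the project is written in Go, a minimal sketch of reading these three variables at startup could look like the following. The `Config` type and `loadConfig` helper are hypothetical names used for illustration; they are not taken from the repository.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Config holds the three settings listed above (hypothetical type).
type Config struct {
	SupabaseURL string
	SupabaseKey string
	OpenAIKey   string
}

// loadConfig reads the variables through getenv (os.Getenv in production)
// and reports every missing name in a single error.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		SupabaseURL: getenv("SUPABASE_URL"),
		SupabaseKey: getenv("SUPABASE_KEY"),
		OpenAIKey:   getenv("OPENAI_API_KEY"),
	}
	var missing []string
	if cfg.SupabaseURL == "" {
		missing = append(missing, "SUPABASE_URL")
	}
	if cfg.SupabaseKey == "" {
		missing = append(missing, "SUPABASE_KEY")
	}
	if cfg.OpenAIKey == "" {
		missing = append(missing, "OPENAI_API_KEY")
	}
	if len(missing) > 0 {
		return Config{}, fmt.Errorf("missing environment variables: %s", strings.Join(missing, ", "))
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("configuration loaded for", cfg.SupabaseURL)
}
```

Passing `getenv` as a parameter keeps the helper easy to test without touching the real process environment.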

### Terraform

You also need to create an S3 bucket and a DynamoDB table to hold the Terraform state and its lock.

S3 bucket:

```bash
aws s3api create-bucket --bucket terraform-state-rag-diary --region sa-east-1 --create-bucket-configuration LocationConstraint=sa-east-1
```

Enable versioning for the bucket (optional):

```bash
aws s3api put-bucket-versioning --bucket terraform-state-rag-diary --versioning-configuration Status=Enabled
```

DynamoDB table:

```bash
aws dynamodb create-table \
    --table-name terraform_locks_diary \
    --attribute-definitions AttributeName=LockID,AttributeType=S \
    --key-schema AttributeName=LockID,KeyType=HASH \
    --billing-mode PAY_PER_REQUEST \
    --region sa-east-1
```

### Supabase

I'm using Supabase and pgvector to store the diary entries, with a column for the OpenAI text embedding.

Enable the `pgvector` extension:

```sql
create extension if not exists vector;
```

Create a new table called `diary` with the following columns:

```sql
CREATE TABLE IF NOT EXISTS diary (
    id bigserial PRIMARY KEY,
    title text NOT NULL,
    content text NOT NULL,
    created_at timestamp NOT NULL DEFAULT NOW(),
    embedding vector(3072) NOT NULL -- OpenAI text embedding (large)
);
```
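When a row is written to this table, the embedding has to arrive in a form pgvector can parse. A minimal sketch, assuming the Go code sends the embedding in pgvector's text format (`[x1,x2,...]`); the `vectorLiteral` helper and the `embeddingDims` constant are hypothetical names, not part of the repository:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// embeddingDims matches the vector(3072) column in the diary table.
const embeddingDims = 3072

// vectorLiteral renders an embedding in pgvector's text input format,
// e.g. "[0.25,0,-1]". It rejects slices whose length does not match
// the column, since Postgres would reject the row anyway.
func vectorLiteral(v []float32) (string, error) {
	if len(v) != embeddingDims {
		return "", fmt.Errorf("embedding has %d dimensions, want %d", len(v), embeddingDims)
	}
	var b strings.Builder
	b.WriteByte('[')
	for i, x := range v {
		if i > 0 {
			b.WriteByte(',')
		}
		// Shortest representation that round-trips a float32.
		b.WriteString(strconv.FormatFloat(float64(x), 'g', -1, 32))
	}
	b.WriteByte(']')
	return b.String(), nil
}

func main() {
	emb := make([]float32, embeddingDims) // placeholder; a real one comes from the OpenAI API
	emb[0] = 0.25
	lit, err := vectorLiteral(emb)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("literal length:", len(lit))
}
```

Checking the dimension count on the client gives a clearer error than the one Postgres would raise at insert time.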

Create a function called `search_diary`:

```sql
create or replace function search_diary(
    query_embedding vector(3072),
    similarity_threshold float, -- upper bound on the negative inner product
    match_count int
)
returns table (
    id bigint,
    title text,
    content text,
    created_at timestamp,
    similarity float
)
language plpgsql
as $$
begin
    return query
    select
        diary.id,
        diary.title,
        diary.content,
        diary.created_at,
        -- <#> is pgvector's negative inner product: lower means more similar
        diary.embedding <#> query_embedding as similarity
    from
        diary
    where
        diary.embedding <#> query_embedding < similarity_threshold
    order by
        diary.embedding <#> query_embedding
    limit
        match_count;
end;
$$;
```

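That's the function that does the semantic search. Because `<#>` returns the negative inner product, the `similarity` column is more negative for closer matches, so `similarity_threshold` should be passed as a negative number (for example, `-0.5` keeps entries whose inner product with the query is above 0.5).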
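To make the filtering and ordering in `search_diary` concrete, here is a small in-memory sketch of the same logic in Go. It is only an illustration of the semantics, not how the project queries Supabase; the `Entry`, `Match`, `negInnerProduct`, and `searchDiary` names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is a cut-down diary row (hypothetical type).
type Entry struct {
	ID        int64
	Title     string
	Embedding []float32
}

// Match pairs an entry with the value the SQL function calls "similarity".
type Match struct {
	Entry
	Similarity float64 // negative inner product: lower means more similar
}

// negInnerProduct mirrors what pgvector's <#> operator computes.
func negInnerProduct(a, b []float32) float64 {
	var dot float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
	}
	return -dot
}

// searchDiary keeps entries whose negative inner product is below the
// threshold, sorts them ascending (best match first), and truncates the
// result to matchCount, like WHERE / ORDER BY / LIMIT in the SQL function.
func searchDiary(entries []Entry, query []float32, threshold float64, matchCount int) []Match {
	var out []Match
	for _, e := range entries {
		if len(e.Embedding) != len(query) {
			continue // skip rows with a mismatched dimension
		}
		if d := negInnerProduct(e.Embedding, query); d < threshold {
			out = append(out, Match{Entry: e, Similarity: d})
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Similarity < out[j].Similarity })
	if matchCount < 0 {
		matchCount = 0
	}
	if len(out) > matchCount {
		out = out[:matchCount]
	}
	return out
}

func main() {
	entries := []Entry{
		{ID: 1, Title: "gym", Embedding: []float32{1, 0}},
		{ID: 2, Title: "work", Embedding: []float32{0, 1}},
		{ID: 3, Title: "run", Embedding: []float32{0.8, 0.6}},
	}
	for _, m := range searchDiary(entries, []float32{1, 0}, -0.5, 5) {
		fmt.Printf("%d %s %.2f\n", m.ID, m.Title, m.Similarity)
	}
}
```

With unit-length embeddings the inner product equals the cosine similarity, so in this toy data a threshold of `-0.5` keeps the "gym" and "run" entries and drops "work".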

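Create an index on the `embedding` column (optional). The operator class should match the operator used in `search_diary`, so `<#>` needs `vector_ip_ops`. Note that pgvector only indexes `vector` columns of up to 2,000 dimensions, so this statement may be rejected for a `vector(3072)` column:

```sql
create index on public.diary
using ivfflat (embedding vector_ip_ops) -- matches the <#> operator used by search_diary
with (lists = 100);
```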
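The `lists` value is a tuning knob rather than a fixed constant. As a rough sketch, the starting point suggested in the pgvector documentation (rows / 1000 up to one million rows, the square root of the row count beyond that) can be written as a small Go helper. The `suggestedLists` name is hypothetical, and the result is only a starting point to adjust after measuring recall and speed.

```go
package main

import (
	"fmt"
	"math"
)

// suggestedLists returns a starting value for the ivfflat "lists" option:
// rows/1000 for tables of up to 1M rows, sqrt(rows) for larger ones,
// and never less than 1.
func suggestedLists(rows int) int {
	if rows <= 0 {
		return 1
	}
	var n int
	if rows <= 1_000_000 {
		n = rows / 1000
	} else {
		n = int(math.Sqrt(float64(rows)))
	}
	if n < 1 {
		n = 1
	}
	return n
}

func main() {
	for _, rows := range []int{500, 100_000, 4_000_000} {
		fmt.Printf("rows=%d lists=%d\n", rows, suggestedLists(rows))
	}
}
```

By this rule, `lists = 100` corresponds to a table of roughly 100,000 entries; a small personal diary would likely work with a much lower value or with no index at all.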
