Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/Stell0/pdm
PDM - Personal Data Manager
https://github.com/Stell0/pdm
Last synced: 8 days ago
JSON representation
PDM - Personal Data Manager
- Host: GitHub
- URL: https://github.com/Stell0/pdm
- Owner: Stell0
- License: agpl-3.0
- Created: 2023-04-30T09:12:13.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-08-30T14:35:02.000Z (over 1 year ago)
- Last Synced: 2024-08-13T07:07:52.356Z (4 months ago)
- Language: Python
- Size: 60.5 KB
- Stars: 4
- Watchers: 1
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- jimsghstars - Stell0/pdm - PDM - Personal Data Manager (Python)
README
# PDM - Personal Data Manager
PDM aims to be a manager of all your personal data. It ingests your documents, creates embedding for them and allows you to chat with a ChatGPT that knows your stuff.
Data dat you offer to PDM are ingested and embedding are created using OpenAI, then text chunks and embedding are stored in a PostgreSQL Vectorstore. When you chat with assistant, semantically similar chunk of text are retrieved from ingested data, and are given as context to GPT.## give data to PDM
Create a directory and put your files in it. At the moment .txt and .rtdocs are supported.
```
mkdir -p sources
```.rtdocs are dump of ReadTheDocs websites created with the command:
```
wget -r -A.html -P my.documentation.rtdocs https://mydocumentation.readthedocs.io/en/latest/
```## launch PostgreSQL pgvector storage and application
```
export OPENAI_API_KEY=xx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
export POSTGRES_PASSWORD=some-password
```
Other available environment variables are:
```
POSTGRES_DATABASE - Default: postgres
POSTGRES_HOST - Default: 127.0.0.1
POSTGRES_PORT - Default: 5432
POSTGRES_USER - Default: postgres
``````
podman run --name postgres -e POSTGRES_PASSWORD -v pgdata:/var/lib/postgresql/data -d -p 5432:5432 --replace docker.io/ankane/pgvector:latest
```
Launch API server
```
podman run --rm --interactive=true --name pdm-server -e POSTGRES_PASSWORD -e OPENAI_API_KEY -p 5000:5000 --network=host --replace ghcr.io/stell0/pdm
```Upload data
http://127.0.0.1:5000/uploadList of sources
GET http://127.0.0.1:5000/sourcesAsk something to the assistant
The history parameter is optional, it is used to give context to the assistant if you want to follow up on a previous question.
```
curl 'http://127.0.0.1:5000/ask' -H 'Content-Type: application/json' -X POST --data '{"question": "What is CQR?","history":""}'
```Launch interactive shell
```
podman exec -it --interactive=true --name pdm-cli -e POSTGRES_PASSWORD -e OPENAI_API_KEY pdm-server python3 console.py
```## Langchain
Of course all of that is possible thanks to https://github.com/hwchase17/langchain