https://github.com/eggsyntax/py-user-knowledge

Predicting demographics of users with GPT based on text they've written

Host: GitHub
URL: https://github.com/eggsyntax/py-user-knowledge
Owner: eggsyntax
Created: 2024-04-17T16:59:40.000Z (almost 2 years ago)
Default Branch: master
Last Pushed: 2024-05-09T18:10:32.000Z (over 1 year ago)
Last Synced: 2025-02-15T13:42:57.206Z (12 months ago)
Language: Python
Size: 12.5 MB
Stars: 2
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README

README

Series of experiments testing how well LLMs (mainly GPT-3.5) can predict demographics from text (mainly OKCupid profiles).

To run from the command line:

- Make sure that OPENAI_API_KEY is defined in your environment

- Install packages (untested as yet, please let me know if you encounter difficulties): `conda install --file conda_requirements.txt`

- `python test-demographics.py`
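
A missing API key is the most likely first failure when following the steps above. The helper below is a minimal sketch of a pre-flight check and is not code from this repository: the function name `require_api_key` and its `env` parameter are hypothetical, and only the variable name `OPENAI_API_KEY` comes from the instructions above.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or fail with a clear message.

    `env` is a hypothetical hook so the check can be exercised with a plain
    dict; by default the real process environment is used.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before running test-demographics.py"
        )
    return key
```

Calling `require_api_key()` once at the top of a script would surface the problem before any requests are made.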

NOTE: this is HORRIBLE CODE. This was my experiment with letting GPT-4 generate most of the individual functions, and then it's just patches on patches from there.

It suffers further from my initial naivete about typical ML conventions for, e.g., data representation, so I'm munging data back and forth in a bunch of places.

Ideally I will rewrite it when I get time, but also when do I ever get time?

Caveat emptor.

Note that despite using temperature=0, the probability distribution predicted by GPT will vary somewhat between runs, so results will differ.
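
One way to put a number on that run-to-run variation is to compare the predicted distributions directly. The sketch below is not taken from this repository. It assumes each run yields a mapping from category label to log-probability, and it uses total variation distance as the comparison, which is my choice of metric. The labels and values are made up purely for illustration.

```python
import math


def normalize(logprobs):
    """Convert {label: logprob} into probabilities that sum to 1."""
    probs = {label: math.exp(lp) for label, lp in logprobs.items()}
    total = sum(probs.values())
    return {label: p / total for label, p in probs.items()}


def total_variation(p, q):
    """Total variation distance: half the summed absolute difference."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(label, 0.0) - q.get(label, 0.0)) for label in labels)


# Hypothetical log-probabilities from two runs on the same input.
run_a = {"category_a": math.log(0.70), "category_b": math.log(0.30)}
run_b = {"category_a": math.log(0.66), "category_b": math.log(0.34)}

drift = total_variation(normalize(run_a), normalize(run_b))
print(f"total variation distance between runs: {drift:.3f}")
```

A distance of 0 means the two runs agree exactly and 1 means they share no probability mass, so small nonzero values are what the note above would lead you to expect.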
