Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tom-draper/persona
Probabilistic generation of character profiles using real-world demographic data.
- Host: GitHub
- URL: https://github.com/tom-draper/persona
- Owner: tom-draper
- Created: 2022-11-01T10:41:04.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-07-16T08:52:01.000Z (4 months ago)
- Last Synced: 2024-10-18T06:30:45.029Z (26 days ago)
- Topics: api, character-creation, character-generator, data-api, data-generation, dataset, demographic, demographics, demographics-data, fastapi, generator, probabilistic-models, profile, profile-builder, random-generation, rest-api, sample, story-creation, survey
- Language: Python
- Homepage: https://persona-api.vercel.app/v1
- Size: 4.17 MB
- Stars: 4
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Persona
**Make your characters more representative and realistic.**
A REST API and CLI tool for probabilistically generating random character profiles from a given input location using real-world demographic data. Generating a new persona rolls the dice on features such as age, sex, sexuality, ethnicity, language and religion. This project was born out of a lack of tools for building representative and realistic characters for stories.
## REST API
### Generate Persona
```
https://persona-api.vercel.app/v1/<location>/
```

```bash
$ curl https://persona-api.vercel.app/v1/england/
[
  {
    "age": 21,
    "sex": "Female",
    "sexuality": "Heterosexual",
    "ethnicity": "British, White",
    "religion": "Christian",
    "language": "English",
    "location": "Oldham, North West"
  }
]
```
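For use from a script, the request above can be reproduced with Python's standard library. The following is a minimal sketch, not part of the project: the helper names `persona_url`, `parse_personas` and `fetch_personas` are hypothetical, and it assumes the endpoint returns a JSON array of persona objects, as in the example response.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the examples in this README.
BASE_URL = "https://persona-api.vercel.app/v1"


def persona_url(location: str) -> str:
    """Build the generation URL for a location such as "england" (hypothetical helper)."""
    return f"{BASE_URL}/{quote(location, safe='')}/"


def parse_personas(body: str) -> list[dict]:
    """Decode a response body, assumed to be a JSON array of persona objects."""
    personas = json.loads(body)
    if not isinstance(personas, list):
        raise ValueError("expected a JSON array of personas")
    return personas


def fetch_personas(location: str, timeout: float = 10.0) -> list[dict]:
    """Request personas for a location over HTTP (requires network access)."""
    with urlopen(persona_url(location), timeout=timeout) as response:
        return parse_personas(response.read().decode("utf-8"))
```

Calling `fetch_personas("england")` would then return a list of dictionaries with keys such as `age`, `sex` and `location`, assuming the service is reachable.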
#### Count Query
Multiple personas from the same location can be generated at once by providing a `count` query parameter.
```
https://persona-api.vercel.app/v1/<location>/?count=5
```

### List Locations
All locations currently included can be listed with the `/v1/locations/` endpoint.
```
https://persona-api.vercel.app/v1/locations/
```

```bash
$ curl https://persona-api.vercel.app/v1/locations/
[
  "australia",
  "canada",
  "germany",
  "global",
  "united_kingdom",
  "england",
  "london",
  "northern_ireland",
  "scotland",
  "wales",
  "california",
  "florida",
  "texas"
]
```
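A client may want to check a name against this list before requesting a persona. The sketch below is illustrative only: the identifiers are copied from the example response above (the live list may differ), the helper names `to_location_id` and `is_supported` are hypothetical, and the lowercase, underscore-separated naming is an assumption inferred from the entries shown.

```python
# Location identifiers copied from the example response above; the live list may differ.
KNOWN_LOCATIONS = [
    "australia", "canada", "germany", "global", "united_kingdom",
    "england", "london", "northern_ireland", "scotland", "wales",
    "california", "florida", "texas",
]


def to_location_id(name: str) -> str:
    """Convert a display name to the lowercase, underscore-separated form seen in the list."""
    return "_".join(name.strip().lower().split())


def is_supported(name: str, locations=KNOWN_LOCATIONS) -> bool:
    """Return True if the name maps to one of the known location identifiers."""
    return to_location_id(name) in locations
```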
### Location Features
Not all features are currently available for every location. The features available for a given location can be retrieved with the `/v1/<location>/features/` endpoint.
```
https://persona-api.vercel.app/v1/<location>/features/
```

```bash
$ curl https://persona-api.vercel.app/v1/england/features/
{
  "england": [
    "age",
    "sex",
    "sexuality",
    "religion",
    "ethnicity",
    "language",
    "location"
  ]
}
```

## Command-line Tool
### Installation
Install Python dependencies from `requirements.txt`.
```bash
pip install -r requirements.txt
```

### Generate Persona
Run `main.py` from the root directory.
```bash
python src/main.py
```

The generated persona can be limited to specific features by passing the corresponding feature flags.
```bash
python src/main.py --age --location --language
```

Multiple personas can be generated at once using the `-n` flag.
```bash
python src/main.py -n
```

### Example
```bash
python src/main.py united_kingdom
> United Kingdom
Age: 48
Sex: Female
Sexuality: Heterosexual
Ethnicity: British, White
Religion: No religion
Language: English
Location: Blackburn with Darwen, North West, England
```

## Data
The demographic data is carefully sourced from reputable census data for each location. Sources for each location can be found alongside the data in each `README.md` in `/data`. The data is stored in a raw JSON format to make it as transparent, accessible and modifiable as possible.
### Locations
The full list of locations currently available can be found [here](data/README.md). It includes countries, groups of locations (e.g. UK, USA), and cities. More locations and features will continue to be added in future.
## Limitations
Generated personas are basic approximations. Each feature is generated naively, under the assumption that it is independent of every other feature. This assumption does not hold: knowing a person's age, for example, could help you better predict their religion. However, the accurate, large-scale data needed to estimate joint probabilities for every combination of features is exponentially harder to source. As a result, generated characters should be taken with a pinch of salt, and very occasionally a persona will have a combination of features that seems extremely unlikely or even impossible. The fewer features a persona includes, the easier it is to approximate and the less likely this is to occur.
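The independence assumption can be illustrated with a short sketch. This is not the project's code: the feature tables and their probabilities are made-up placeholder values, and `generate_persona` and `combination_probability` are hypothetical names. Each feature is drawn from its own marginal distribution without looking at any other feature, so the chance of any combination is simply the product of the individual probabilities.

```python
import random

# Placeholder marginal distributions; the values are invented for illustration
# and are not taken from the project's data.
FEATURES = {
    "age_group": {"0-17": 0.2, "18-64": 0.6, "65+": 0.2},
    "religion": {"Christian": 0.5, "No religion": 0.4, "Muslim": 0.1},
}


def generate_persona(features, rng):
    """Draw every feature independently from its own distribution."""
    persona = {}
    for name, distribution in features.items():
        values = list(distribution)
        weights = list(distribution.values())
        # random.choices accepts relative weights, so they need not sum to 1.
        persona[name] = rng.choices(values, weights=weights, k=1)[0]
    return persona


def combination_probability(features, persona):
    """Probability of a persona under independence: a product of marginals."""
    probability = 1.0
    for name, value in persona.items():
        distribution = features[name]
        probability *= distribution[value] / sum(distribution.values())
    return probability


rng = random.Random(0)  # seeded so repeated runs give the same draws
print(generate_persona(FEATURES, rng))
```

Under these placeholder numbers, a child and a pensioner are equally likely to be assigned any given religion, which is the kind of unrealistic pairing that independent sampling cannot rule out.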
Demographic data can change quite rapidly, and surveys take a long time to conduct, so the data used to generate profiles will always be somewhat outdated. Even so, I believe that using outdated data in this way is an improvement over manual character creation in terms of representation, as it bypasses any biases or misconceptions you may hold.
With this aim, this project is only as good as its data. There will certainly be minorities that make up a tiny proportion of the population and are missing from survey data (or grouped into an 'other' category), and therefore cannot surface during character generation. Improving the data is an essential and continuous goal for this project.
## Contributions
Contributions are very welcome for data or general improvements.
To contribute:
1. Fork the repo.
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am "Add some feature"`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new pull request

When contributing data, keep content, directory structure and JSON formatting consistent, and remember to note your source (including URL) in `data/...//README.md`. Sources should be reputable organisations conducting census research. Avoid "Other" as a feature attribute. Do not worry if probabilities do not sum to exactly 1; all feature probabilities are normalised during generation.
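
The normalisation step can be sketched as follows. This is an illustration under assumptions rather than the project's implementation: the function name `normalise` and the sample values are hypothetical. Each weight is divided by the total so that the results sum to 1.

```python
def normalise(weights: dict[str, float]) -> dict[str, float]:
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive total")
    return {value: weight / total for value, weight in weights.items()}


# Invented sample values that sum to 0.95 rather than 1.
raw = {"a": 0.5, "b": 0.3, "c": 0.15}
print(normalise(raw))
```

Because of this rescaling, contributed values only need to be correct relative to one another.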