https://github.com/umd-lib/papaya
IIIF Presentation API Application
- Host: GitHub
- URL: https://github.com/umd-lib/papaya
- Owner: umd-lib
- License: apache-2.0
- Created: 2025-11-19T15:49:16.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2026-05-27T15:44:31.000Z (19 days ago)
- Last Synced: 2026-05-27T17:12:54.268Z (19 days ago)
- Language: Python
- Size: 68.4 KB
- Stars: 0
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# papaya
IIIF Presentation API Application
## Configuration
### Environment Variables
* **`PAPAYA_URL`** Public facing base URL of this application.
* **`PAPAYA_FCREPO_ENDPOINT`** URL of the Fedora repository. This is not
directly accessed, but is used when translating between URIs and IIIF
identifiers.
* **`PAPAYA_FCREPO_PREFIX`** Prefix string to use in IIIF identifiers for
resources in the Fedora repository.
* **`PAPAYA_SOLR_ENDPOINT`** URL of the Solr server that will provide the
metadata about the resources.
* **`PAPAYA_SOLR_TEXT_MATCH_FIELD`** Field name in Solr to use for text
queries that will return hit highlight annotation lists.
* **`PAPAYA_IIIF_IMAGE_ENDPOINT`** URL of the IIIF Image API server that
provides additional metadata about the images.
* **`PAPAYA_IIIF_IMAGE_ORIGIN`** Actual request URL to use for the IIIF
Image API server, if it differs from `PAPAYA_IIIF_IMAGE_ENDPOINT`.
* **`PAPAYA_THUMBNAIL_WIDTH`** Maximum width of thumbnail images included
in the manifest.
* **`PAPAYA_LOGO_URL`** URL of an image file to be used as the logo in the
manifest.
* **`PAPAYA_METADATA_QUERIES_FILE`** YAML or JSON formatted file that
contains a mapping from metadata field label to a
[jq query](https://jqlang.org/manual/) to retrieve the value or values
for that field from the Solr document for a resource.
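
As a rough illustration of how `PAPAYA_FCREPO_ENDPOINT` and `PAPAYA_FCREPO_PREFIX` could work together, the sketch below translates a repository URI into an identifier and back. The helper names and the percent-encoding of the path are assumptions made for this example; Papaya's actual translation may differ.

```python
from urllib.parse import quote, unquote

# Values taken from the example .env file in this README.
FCREPO_ENDPOINT = "http://fcrepo-local:8080/fcrepo/rest"
FCREPO_PREFIX = "fcrepo:"


def uri_to_iiif_id(uri: str) -> str:
    """Hypothetical helper: swap the repository base URL for the prefix."""
    base = FCREPO_ENDPOINT.rstrip("/") + "/"
    if not uri.startswith(base):
        raise ValueError(f"{uri} is not under {FCREPO_ENDPOINT}")
    # Percent-encode slashes so the identifier fits in a single URL path
    # segment (an assumption; the real encoding may be different).
    return FCREPO_PREFIX + quote(uri[len(base):], safe="")


def iiif_id_to_uri(iiif_id: str) -> str:
    """Hypothetical inverse of uri_to_iiif_id."""
    if not iiif_id.startswith(FCREPO_PREFIX):
        raise ValueError(f"{iiif_id} does not start with {FCREPO_PREFIX}")
    relative_path = unquote(iiif_id[len(FCREPO_PREFIX):])
    return FCREPO_ENDPOINT.rstrip("/") + "/" + relative_path


example_uri = "http://fcrepo-local:8080/fcrepo/rest/pcdm/ab/cd"
print(uri_to_iiif_id(example_uri))
print(iiif_id_to_uri(uri_to_iiif_id(example_uri)) == example_uri)
```

Note that the repository itself is never contacted here, which matches the description of `PAPAYA_FCREPO_ENDPOINT` as a value used only for translation.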
### Files
* **`METADATA_QUERIES_FILE`** YAML or JSON file that maps metadata field
labels to `jq` queries. It also includes the queries necessary to build
the IIIF Manifest structure (see `papaya.source.Resource` for more info).
For example:
```yaml
Title: .object__title__display[]?
Date: .object__date__edtf
Bibliographic Citation: .object__bibliographic_citation__display[]?
Creator: .object__creator[]?.agent__label__display[]
Contributor: .object__contributor[]?.agent__label__display[]?
Subject: .object__subject[]?.subject__label__display[]
# structural metadata fields
# used by Papaya to generate canvases, sequences, etc.
$uri: .id
$label: .object__title__display[]?
$date: .object__date__dt?
$license_uri: .object__rights__same_as__uris[0]
$page_uris: .page_uri_sequence__uris[]?
$page_image_ids: .iiif_thumbnail_sequence__ids[]?
$*page_doc: .object__has_member[]|select(.id == $uri)
$*page_label: .object__has_member[]|select(.id == $uri).page__title__txt
$*file_page_uri: .object__has_member[]|select(.page__has_file[].id == $uri).id
```
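
Since the file may also be JSON, the sketch below loads a trimmed JSON equivalent of the example and separates display labels from the `$`-prefixed structural keys. The `split_queries` helper is hypothetical, and treating the `$` prefix as the distinguishing marker is inferred from the comments in the example, not from Papaya's code.

```python
import json

# A trimmed JSON equivalent of the YAML example above.
QUERIES_JSON = """
{
  "Title": ".object__title__display[]?",
  "Date": ".object__date__edtf",
  "$uri": ".id",
  "$label": ".object__title__display[]?"
}
"""


def split_queries(queries: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Hypothetical helper: separate display labels from `$`-prefixed keys."""
    display = {k: q for k, q in queries.items() if not k.startswith("$")}
    structural = {k: q for k, q in queries.items() if k.startswith("$")}
    return display, structural


display, structural = split_queries(json.loads(QUERIES_JSON))
print(sorted(display))     # ['Date', 'Title']
print(sorted(structural))  # ['$label', '$uri']
```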
### IIIF Image Service Endpoint vs. Origin
The `PAPAYA_IIIF_IMAGE_ENDPOINT` is the canonical base URI for the
IIIF Image server associated with this instance of Papaya. For many cases,
it will also be the base URL that is used to make requests to that service.
However, there are cases where it makes more sense to be able to separate
the canonical base URI from the request base URL. For instance, consider
the case where both Papaya and the IIIF Image server are running inside a
Kubernetes cluster and can be connected directly without leaving the
internal Kubernetes network. In this case, it would be beneficial to be
able to use the cluster-internal base URL to make the HTTP connections,
while retaining the canonical URI in any links in the generated manifest.
In this case, use `PAPAYA_IIIF_IMAGE_ORIGIN` to set the request base URL
for the IIIF Image service. When this value is set, Papaya will use it
instead of `PAPAYA_IIIF_IMAGE_ENDPOINT` to generate request URLs. In
addition, Papaya will create a set of `X-Forwarded-*` headers to add to
requests that reflect the canonical URI.
For example, given:
* `PAPAYA_IIIF_IMAGE_ENDPOINT` is `https://iiif.example.com/images/iiif/2`
* `PAPAYA_IIIF_IMAGE_ORIGIN` is `http://papaya:3001/iiif/2`
The headers would be:
* `X-Forwarded-Proto: https`
* `X-Forwarded-Host: iiif.example.com`
* `X-Forwarded-Path: /images`
The `X-Forwarded-Path` is calculated by removing the path of the origin
URL (e.g., `/iiif/2`) from the end of the path of the endpoint URI (e.g.,
`/images/iiif/2`).
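
The header calculation described above can be sketched with the standard library alone. The `forwarded_headers` function is a hypothetical name, and its fallback when the origin path is not a suffix of the endpoint path is an assumption; this is not Papaya's actual implementation.

```python
from urllib.parse import urlsplit


def forwarded_headers(endpoint: str, origin: str) -> dict[str, str]:
    """Hypothetical helper mirroring the rule described above."""
    endpoint_parts = urlsplit(endpoint)
    endpoint_path = endpoint_parts.path.rstrip("/")
    origin_path = urlsplit(origin).path.rstrip("/")
    # Remove the origin path from the end of the endpoint path, if present.
    if origin_path and endpoint_path.endswith(origin_path):
        forwarded_path = endpoint_path[: -len(origin_path)]
    else:
        # Assumption: keep the whole endpoint path when there is no match.
        forwarded_path = endpoint_path
    return {
        "X-Forwarded-Proto": endpoint_parts.scheme,
        "X-Forwarded-Host": endpoint_parts.netloc,
        "X-Forwarded-Path": forwarded_path,
    }


headers = forwarded_headers(
    "https://iiif.example.com/images/iiif/2",
    "http://papaya:3001/iiif/2",
)
for name, value in headers.items():
    print(f"{name}: {value}")
```

With the example values above, this prints the same three headers listed in this section.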
## Development Setup
Requires Python 3.14.
These setup instructions also assume that you are running the development
stacks for both [umd-fcrepo](https://github.com/umd-lib/umd-fcrepo) and
[umd-iiif](https://github.com/umd-lib/umd-iiif).
```zsh
git clone git@github.com:umd-lib/papaya.git
cd papaya
python -m venv --prompt "papaya-py$(cat .python-version)" .venv
source .venv/bin/activate
```
```zsh
pip install -e . --group test
```
Create a `.env` file with the following contents:
```dotenv
FLASK_DEBUG=1
PAPAYA_URL=http://localhost:3001/manifests
PAPAYA_FCREPO_ENDPOINT=http://fcrepo-local:8080/fcrepo/rest
PAPAYA_FCREPO_PREFIX=fcrepo:
PAPAYA_SOLR_ENDPOINT=http://localhost:8985/solr/fcrepo
PAPAYA_SOLR_TEXT_MATCH_FIELD=extracted_text__dps_txt
PAPAYA_IIIF_IMAGE_ENDPOINT=http://localhost:8182/iiif/2
PAPAYA_THUMBNAIL_WIDTH=250
PAPAYA_LOGO_URL=https://www.lib.umd.edu/images/wrapper/liblogo.png
PAPAYA_METADATA_QUERIES_FILE=metadata-queries.yml
```
### Running
```zsh
flask --app papaya.web run
```
By default, Flask serves the application on port 5000.
To listen on a different port, supply the `--port` option:
```zsh
flask --app papaya.web run --port 3001
```
### Tests
```zsh
pytest
```
With coverage information:
```zsh
pytest --cov src --cov-report term-missing tests
```
### API Documentation
```zsh
pip install -e . --group docs
pdoc papaya
```
API documentation generated by [pdoc](https://pdoc.dev/)
will be served locally on pdoc's default port (8080).
To serve the documentation on an alternate port:
```zsh
pdoc -p 8888 papaya
```
Now the documentation will be served on port 8888.
### Docker Image
Build the image:
```zsh
docker build -t docker.lib.umd.edu/papaya .
```
When running in a Docker container, the `PAPAYA_SOLR_ENDPOINT` and
`PAPAYA_IIIF_IMAGE_ENDPOINT` environment variables will need to be
adjusted to refer to the correct hostname.
Copy the `.env` file set up earlier to `docker.env`, and make these
changes:
```dotenv
PAPAYA_SOLR_ENDPOINT=http://host.docker.internal:8985/solr/fcrepo
PAPAYA_IIIF_IMAGE_ENDPOINT=http://host.docker.internal:8182/iiif/2
```
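
The copy-and-edit step above could also be scripted. The following is a minimal sketch, assuming the `.env` file uses `localhost` for both services as in the example earlier; `to_docker_env` is a hypothetical helper, not part of Papaya.

```python
def to_docker_env(env_text: str) -> str:
    """Hypothetical helper: point selected variables at the Docker host."""
    adjusted = ("PAPAYA_SOLR_ENDPOINT", "PAPAYA_IIIF_IMAGE_ENDPOINT")
    lines = []
    for line in env_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name in adjusted:
            # Replace only the hostname part of the URL.
            value = value.replace("//localhost", "//host.docker.internal", 1)
        lines.append(f"{name}{sep}{value}")
    return "\n".join(lines) + "\n"


sample = (
    "FLASK_DEBUG=1\n"
    "PAPAYA_SOLR_ENDPOINT=http://localhost:8985/solr/fcrepo\n"
    "PAPAYA_IIIF_IMAGE_ENDPOINT=http://localhost:8182/iiif/2\n"
)
print(to_docker_env(sample), end="")
```

To write the result to disk, the output of `to_docker_env(Path(".env").read_text())` could be passed to `Path("docker.env").write_text(...)` using `pathlib`.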
Run, using this new `docker.env` file:
```zsh
docker run --rm -it -p 3001:5000 --env-file docker.env docker.lib.umd.edu/papaya
```
## Name
This application is so named because the phrase "Presentation API
Application" could be abbreviated "PAPIA", which could be pronounced the
same as "papaya", and because it is paired with the
[Cantaloupe](https://cantaloupe-project.github.io/) IIIF image server in
the UMD Libraries' IIIF services stack.
## License
Apache-2.0
See the [LICENSE](LICENSE) file for license rights and limitations.