https://github.com/datadavev/igsn_resolver
Performs IGSN resolution in preparation for DataCite switch
- Host: GitHub
- URL: https://github.com/datadavev/igsn_resolver
- Owner: datadavev
- License: agpl-3.0
- Created: 2022-08-12T20:12:48.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2024-11-28T17:23:50.000Z (10 months ago)
- Last Synced: 2025-03-29T13:35:19.567Z (6 months ago)
- Topics: doi, identifier, igsn
- Language: Python
- Homepage: https://datadavev.github.io/igsn_resolver/
- Size: 592 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# igsn_resolver
NOTE: These docs are out of date and are being updated. The implementation has changed significantly since they were written. Most notably, IGSNs are now managed by DataCite, and the underlying implementation of this service is being modified to use the DataCite API to assist with IGSN resolution.
Performs IGSN resolution by leveraging the DataCite API.
`igsn_resolver` provides a simple proof of concept for an IGSN resolver service implemented using [FastAPI](https://fastapi.tiangolo.com/). It uses the [DataCite API](https://support.datacite.org/docs/api) and, optionally, the [handle.net](https://handle.net/proxy_servlet.html) infrastructure, while supporting the expected content negotiation behavior for RDF resources.

The service is composed of two components, the API which performs the resolution functions, and a minimal Web UI implemented as a Web Component. The UI component has minimal dependencies and may be deployed in any HTML page.

A test instance of the API is deployed on [Vercel](https://vercel.com/) at [https://igsn-resolver.vercel.app/](https://igsn-resolver.vercel.app/). The UI is deployed using GitHub pages, available at [https://datadavev.github.io/igsn_resolver/](https://datadavev.github.io/igsn_resolver/).
The API supports two endpoints, one for redirection and the other for basic metadata. These methods are described in the API documentation at [https://igsn-resolver.vercel.app/docs](https://igsn-resolver.vercel.app/docs), with some examples below.
Identifiers are provided as strings, and the service will attempt to normalize a provided identifier string prior to lookup. Examples of IGSN identifier strings that are recognized include:
```
au1234
AU1234
igsn:au1234
10273/au1234
igsn:10273/au1234
```
Since the service uses the handle system under the hood, DOI identifier strings are also accepted, for example:
```
10.1594/PANGAEA.930327
doi:10.1594/PANGAEA.930327
```
### `/.info/{identifier}`
The `/.info/{identifier}` endpoint will return metadata from the handle system about the identifier. For example:
```
curl "https://igsn-resolver.vercel.app/.info/au1234" | jq '.'
[
{
"original": "au1234",
"scheme": "igsn",
"normalized": "igsn:10273/au1234",
"handle": "10273/au1234",
"target": "http://www.ga.gov.au/sample-catalogue/10273/AU1234",
"ttl": 86400,
"timestamp": "2015-07-22T05:19:38Z"
}
]
```
Where:
- `original`: The provided identifier string.
- `scheme`: Recognized identifier scheme, either "igsn" or "doi".
- `normalized`: Normalized representation of the identifier string.
- `handle`: Handle representation of the identifier string.
- `target`: Identifier target as reported by the Handle System.
- `ttl`: Time to live in seconds, as reported by the Handle System.
- `timestamp`: The entry timestamp as reported by the Handle System.
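The `scheme`, `normalized`, and `handle` values could be derived from an identifier string roughly as follows. This is a minimal sketch, not the service's actual code: the function name `normalize`, the constant `IGSN_PREFIX`, and the DOI-detection rule are assumptions made for illustration, and the case of the input is preserved because the README does not say whether the service folds case.

```python
import re

# Assumption: "10273" is the handle prefix seen in the IGSN examples above.
IGSN_PREFIX = "10273"


def normalize(identifier: str) -> dict:
    """Return the scheme, normalized, and handle forms of an IGSN or DOI string."""
    text = identifier.strip()
    # Split off an optional "igsn:" or "doi:" label (case-insensitive).
    match = re.match(r"^(igsn|doi):(.*)$", text, flags=re.IGNORECASE)
    label = match.group(1).lower() if match else None
    value = match.group(2).strip() if match else text
    # Assumption: anything shaped like "10.<digits>/..." is treated as a DOI.
    if label == "doi" or re.match(r"^10\.\d+/", value):
        scheme, handle = "doi", value
    else:
        # Bare IGSN values get the IGSN handle prefix added.
        if not value.startswith(IGSN_PREFIX + "/"):
            value = f"{IGSN_PREFIX}/{value}"
        scheme, handle = "igsn", value
    return {
        "original": identifier,
        "scheme": scheme,
        "normalized": f"{scheme}:{handle}",
        "handle": handle,
    }
```

Under these assumptions, `normalize("au1234")` yields the `original`, `scheme`, `normalized`, and `handle` values shown in the `/.info/au1234` response; the `target`, `ttl`, and `timestamp` fields still have to come from the Handle System lookup.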
Multiple identifiers (up to 50) may be sent to the `/.info/` endpoint using a comma as a delimiter. For example:
```
curl "https://igsn-resolver.vercel.app/.info/au1234,10.1594/PANGAEA.930327" | jq '.'
[
{
"original": "au1234",
"scheme": "igsn",
"normalized": "igsn:10273/au1234",
"handle": "10273/au1234",
"target": "http://www.ga.gov.au/sample-catalogue/10273/AU1234",
"ttl": 86400,
"timestamp": "2015-07-22T05:19:38Z"
},
{
"original": "10.1594/PANGAEA.930327",
"scheme": "doi",
"normalized": "doi:10.1594/PANGAEA.930327",
"handle": "10.1594/PANGAEA.930327",
"target": "https://doi.pangaea.de/10.1594/PANGAEA.930327",
"ttl": 86400,
"timestamp": "2021-06-10T01:14:56Z"
}
]
```
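A client might assemble such a batch request as sketched below. The helper names `info_url` and `fetch_info` are hypothetical, and the sketch assumes the test instance URL and the 50-identifier limit mentioned above; it is not part of the project.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://igsn-resolver.vercel.app"  # test instance named in this README
MAX_IDENTIFIERS = 50  # per-request limit stated in this README


def info_url(identifiers, base_url=BASE_URL):
    """Build a /.info/ URL for one or more identifiers, comma-delimited."""
    ids = [i.strip() for i in identifiers if i.strip()]
    if not ids:
        raise ValueError("at least one identifier is required")
    if len(ids) > MAX_IDENTIFIERS:
        raise ValueError(f"at most {MAX_IDENTIFIERS} identifiers per request")
    # Keep "/" and ":" literal so handles and scheme labels stay readable.
    path = ",".join(quote(i, safe="/:") for i in ids)
    return f"{base_url}/.info/{path}"


def fetch_info(identifiers):
    """Fetch and decode the JSON list returned by the /.info/ endpoint."""
    with urlopen(info_url(identifiers)) as response:
        return json.load(response)
```

Calling `info_url(["au1234", "10.1594/PANGAEA.930327"])` produces the same URL as the `curl` example above; `fetch_info` needs network access to the test instance.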
### `/{identifier}`
The resolve endpoint `/{identifier}` accepts a single identifier string and returns a redirect (status code 307) to the target address listed by the handle system. For example:
```
curl -v -q "https://igsn-resolver.vercel.app/au1234"
...
< HTTP/1.1 307 Temporary Redirect
< Link:
;
rel="canonical",
;
type="application/json";
rel="alternate";
profile="https://igsn.org/info",
;
rel="alternate";
profile="https://schema.datacite.org/"
< Location: http://www.ga.gov.au/sample-catalogue/10273/AU1234
```
Note the `Link` header response which provides a hint to the client about alternate locations and profiles for accessing information about the identified resource as described below.
The behavior of this method can be modified by an optional `Accept-Profile` header[^1] sent by the client.
[^1]: Content Negotiation by Profile is currently a W3C draft, https://www.w3.org/TR/dx-prof-conneg/
If the client includes an `Accept-Profile` header of `https://igsn.org/info` the response is the same as a call to the `/.info/{identifier}` endpoint. For example:
```
curl -q -H "Accept-Profile: https://igsn.org/info" \
"https://igsn-resolver.vercel.app/au1234"
...
{
"original": "au1234",
"scheme": "igsn",
"normalized": "igsn:10273/au1234",
"handle": "10273/au1234",
"target": "http://www.ga.gov.au/sample-catalogue/10273/AU1234",
"ttl": 86400,
"timestamp": "2015-07-22T05:19:38Z"
}
```
If the client includes an `Accept-Profile` header of `https://schema.datacite.org/` then the redirect response is to the handle system resolve address, which will subsequently return a redirect to the known target.
This approach enables correct resolution of some resource content types (such as RDF formats) in the DOI system which otherwise return metadata about the identifier rather than the identified resource. The IGSN infrastructure is in the process of migrating to using DOI infrastructure provided by DataCite, and a service such as this will be necessary for correct resolution of IGSN identifiers when that change is implemented.
For example, the DOI identifier `doi:10.1594/PANGAEA.930327` has a target of `https://doi.pangaea.de/10.1594/PANGAEA.930327`. Resolving this with the handle system for a content type of `text/html` results in the expected redirect:
```
curl -q -v -H "Accept: text/html" "https://hdl.handle.net/10.1594/PANGAEA.930327"
...
< HTTP/2 302
< vary: Accept
< location: https://doi.pangaea.de/10.1594/PANGAEA.930327
```
If a content type of `application/ld+json` is requested instead, the location of the DataCite metadata is returned rather than the identified resource:
```
curl -q -v -H "Accept: application/ld+json" "https://hdl.handle.net/10.1594/PANGAEA.930327"
...
< HTTP/2 302
< vary: Accept
< location: https://data.crosscite.org/10.1594%2FPANGAEA.930327
```
Resolving the same identifier with this `igsn-resolver` service results in the expected location:
```
curl -q -v -H "Accept: application/ld+json" "https://igsn-resolver.vercel.app/10.1594/PANGAEA.930327"
...
< HTTP/1.1 307 Temporary Redirect
< Link:
;
rel="canonical",
;
type="application/json";
rel="alternate";
profile="https://igsn.org/info",
;
rel="alternate";
profile="https://schema.datacite.org/"
< Location: https://doi.pangaea.de/10.1594/PANGAEA.930327
```
The DataCite metadata may be retrieved by specifically requesting that format:
```
curl -q -v -H "Accept: application/ld+json" \
-H "Accept-Profile: https://schema.datacite.org/" \
"https://igsn-resolver.vercel.app/10.1594/PANGAEA.930327"
...
< HTTP/1.1 307 Temporary Redirect
< Link:
;
rel="canonical",
;
type="application/json";
rel="alternate";
profile="https://igsn.org/info",
;
rel="alternate";
profile="https://schema.datacite.org/"
< Location: https://hdl.handle.net/10.1594/PANGAEA.930327
```
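The profile-dependent behavior described in this section can be summarized as a small dispatch function. This is a hedged sketch rather than the service's FastAPI code: `plan_response` and the returned dictionary shape are invented for illustration, the 200 status for the info profile is an assumption, and `record` stands for one entry as returned by `/.info/{identifier}`.

```python
from typing import Optional

INFO_PROFILE = "https://igsn.org/info"
DATACITE_PROFILE = "https://schema.datacite.org/"
HANDLE_RESOLVER = "https://hdl.handle.net"


def plan_response(record: dict, accept_profile: Optional[str] = None) -> dict:
    """Decide how to answer a resolve request for an already looked-up record."""
    profile = (accept_profile or "").strip()
    if profile == INFO_PROFILE:
        # Same payload as the /.info/ endpoint (status 200 is an assumption).
        return {"status": 200, "body": record}
    if profile == DATACITE_PROFILE:
        # Hand the request over to the handle system resolver.
        return {"status": 307, "location": f"{HANDLE_RESOLVER}/{record['handle']}"}
    # Default: redirect straight to the registered target.
    return {"status": 307, "location": record["target"]}
```

For the PANGAEA record used above, the default branch returns the `doi.pangaea.de` target, while the DataCite profile returns a redirect to the `hdl.handle.net` address for the same handle.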
## Development
After cloning this repo, create a virtual environment and install development dependencies:
```
pip install -r dev_requirements.txt
```
Then run a local instance on port 8000 by:
```
cd app
uvicorn main:app --reload
```
A push to `main` on the origin repo will result in a re-deployment to `deta.sh` and deployment of the web interface to GitHub pages.
Tests can be run with `pytest`, e.g.:
```
pytest
================================ test session starts =================================
platform darwin -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0
rootdir: /Users/vieglais/Documents/Projects/IGSN/igsn_resolver
plugins: anyio-3.6.1, asyncio-0.19.0
asyncio: mode=strict
collected 13 items
tests/test_igsnresolve.py ............. [100%]
================================= 13 passed in 1.33s =================================
```
The web component is in the `identifier-resolver` folder. It is implemented in JavaScript using [Lit](https://lit.dev/docs/) and may be deployed without building or bundling. See the [`README`](identifier-resolver/README.md) in that folder for details.