https://github.com/internethealthreport/rpki-history
An API to query historical RPKI data
https://github.com/internethealthreport/rpki-history
Last synced: about 2 months ago
JSON representation
An API to query historical RPKI data
- Host: GitHub
- URL: https://github.com/internethealthreport/rpki-history
- Owner: InternetHealthReport
- License: gpl-3.0
- Created: 2025-07-14T11:18:43.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-08-26T08:23:20.000Z (10 months ago)
- Last Synced: 2025-08-26T10:31:04.600Z (10 months ago)
- Language: Python
- Size: 77.1 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# RPKI History
An API to query historical RPKI data, more specifically, Validated ROA Payloads (VRPs).
Links:
* [GitHub](https://github.com/InternetHealthReport/rpki-history)
* [API](https://www.ihr.live/rpki-history/api/)
We obtain VRP data from the [RPKI Views](https://www.rpkiviews.org/) project and store
it in a compact form for fast access. From each dump we extract the
`output/rpki-client.csv` file, which contains just the VRPs, and update our database.
A VRP entry consists of five fields:
```csv
ASN,IP Prefix,Max Length,Trust Anchor,Expires
AS13335,1.0.0.0/24,24,apnic,1753280249
```
We ignore the expiry time and only work on dump-time granularity. For each ingested dump
we compare the set of VRPs (in form of `asn, prefix, max_length`) with the previous
dump. Each VRP has a `visible` time range, during which is was continuously visible.
Since most VRPs are stable, this allows us to update just the time range, keeping the
database size compact.
Data limitations:
* **High-frequency updates are invisible.** Currently, dumps are created roughly
every 20 minutes. If a VRP expires / is removed and potentially recreated between dumps,
this will not be reflected in the database.
* **Data is at dump-time granularity.** If a VRP is created, expires, or is removed
between dumps, the exact point in time is lost. For example, if there are two dumps at
time `A` and `C`, a VRP expires at time `B` (`A < B < C`), and a user queries for
timestamp `D` (`A < D < B`) then the query will not find the VRP, since its visibility
ended at `A`. Similarly, VRPs that are only visible in one dump will have equal values
for the start and end visibility time.
* **Queryable time range is limited.** Naturally, we are limited by the amount of
available data. Trying queries outside of the available time range will result in an
error (to distinguish from non-existent VRPs).
Data sources:
* 2020-12-06 16:37:23 UTC to 2022-06-14 14:54:59 UTC:
[josephine.sobornost.net](https://josephine.sobornost.net/rpkidata/)
* 2022-06-14 15:08:09 UTC to now: [dango.attn.jp](https://dango.attn.jp/rpkidata/)
## Endpoints
### `/vrp`
Returns the list of covering VRPs for a prefix at a specific time, time range, or at the
latest dump time if no time parameter is specified.
#### Parameters
Mandatory:
* `prefix`: The prefix for which to return covering VRPs.
There are two types of time parameters, which are mutually exclusive: Point-in-time and
time range. For a time range only one bound can be specified, in which case the returned
list will include *all* earlier/later data available.
Point-in-time:
* `timestamp`: The timestamp (in `%Y-%m-%dT%H:%M:%S` [assumes UTC] or unix
epoch format) for which to return VRPs.
Time range:
* `timestamp__gte`: The start of the time range (inclusive; in
`%Y-%m-%dT%H:%M:%S`[assumes UTC] or unix epoch format)
for which to return VRPs.
* `timestamp__lte`: The end of the time range (inclusive; in `%Y-%m-%dT%H:%M:%S`
[assumes UTC] or unix epoch format) for which to return VRPs.
#### Result Format
```json
// https://www.ihr.live/rpki-history/api/vrp?prefix=8.8.8.0/24
[
{
"prefix": "8.8.8.0/24",
"asn": 15169,
"max_length": 24,
"visible": {
"from": "2023-12-29T17:30:54+00:00",
"to": "2025-08-13T06:29:10+00:00"
}
}
]
```
`visible` refers to the timespan during which the VRP was *continuously* visible, i.e.,
present in the dumps. Thus, if a VRP is missing from a dump, a new entry with a separate
`visible` range is created. **This time is unrelated to the validity time (`Not
before`/`Not after`) of the ROA!**
### `/status`
Returns the RPKI status for the specified prefix/ASN combination at the specified time,
or at the latest dump time if no timestamp is specified.
#### Parameters
Mandatory:
* `prefix`: The prefix to be checked.
* `asn`: The origin ASN of the prefix.
Optional:
* `timestamp`: The timestamp (in `%Y-%m-%dT%H:%M:%S` [assumes UTC] or unix epoch format)
for which to check the status. If omitted, use the latest available dump time.
#### Result Format
```json
// https://www.ihr.live/rpki-history/api/status?prefix=8.8.8.0/24&asn=15169
{
"status": "Valid"
}
// https://www.ihr.live/rpki-history/api/status?prefix=8.8.8.0/25&asn=15169
{
"status": "Invalid",
"reason": {
"code": "moreSpecific",
"description": "Covering VRP with matching origin ASN found, but queried prefix is more specific than maxLength attribute allows."
}
}
```
`status` is one of `[Valid, Invalid, NotFound]`.
`reason` (only for `Invalid` status) gives more detailed information about why the
status is invalid. `code` is for automatic processing, while `description` provides a
human-readable explanation.
### `/metadata`
Returns the list of dumps contained in the database. Since this list is very long, this
endpoint is paginated and returns at most 10000 results per page.
#### Parameters
Mandatory: None
Optional:
* `timestamp__gte`: The start of the time range (inclusive; in
`%Y-%m-%dT%H:%M:%S`[assumes UTC] or unix epoch format)
for which to return data.
* `timestamp__lte`: The end of the time range (inclusive; in `%Y-%m-%dT%H:%M:%S` [assumes UTC] or unix epoch format) for which to return data.
* `page`: The page number to load (defaults to 1).
* `page_size`: The number of results to include in one page (defaults to 10000).
#### Result Format
```json
// https://www.ihr.live/rpki-history/api/metadata
{
"next": "https://www.ihr.live/rpki-history/api/metadata?page_size=1000&page=2",
"results": [
{
"timestamp": "2020-12-06T16:37:23+00:00",
"deleted_vrps": 0,
"unchanged_vrps": 0,
"new_vrps": 205850
},
// ...
]
}
```
`next` is the URL to the next page. It will be an empty string if there are no results
left.
`timestamp` refers to the dump time (taken from the filename).
`[deleted|updated|new]_vrps` indicates the number of VRPs differing from the previous
dump. Note that `deleted` refers to a VRP that was present in the previous dump, but not
in the current one.
## Database Dump
For easier analysis of the dataset, or self-hosting, [a dump of the database (updated
weekly) is available.](https://archive.ihr.live/ihr/rpki-history/)
## Self-hosting
If for some reason you want to host your own version of this page, here is how.
### Getting Started
Create secrets files containing the Postgres user passwords.
```bash
# For normal user
touch ./secrets/postgres-pw.txt
# For read-only user
touch ./secrets/postgres-ro-pw.txt
# Of course write actual passwords to these files...
```
Initialize database. This will also build the initial Docker image, which might take
some time.
```bash
docker compose run --rm init-db
```
**Optional:** Restore database dump.
```bash
docker compose exec -T database pg_restore -d rpki_history < rpki-history.dump
```
**Optional (but recommended):** Create an index over the prefix column to greatly
decrease query time. This is not built into the init-db script, since it is faster to
build the index once after all data was imported.
```bash
docker compose exec database psql -c "CREATE INDEX ON vrps USING gist (prefix inet_ops)"
```
Start the API server.
```bash
docker compose up -d api
```
### Updating the Data
To import the latest available data (uses RPKIViews by default):
```bash
docker compose run --rm update-db
```
For more usage options (e.g., importing a specific timestamp):
```bash
docker compose run --rm update-db --help
```
### Backup/restore Database
#### Backup
```bash
docker compose exec database pg_dump --data-only -Fc > rpki-history.dump
```
#### Restore
Assumes fresh install or that `postgres_data` volume was deleted.
```bash
# Reinitialize database to create schema *and additional user*.
docker compose run --rm init-db
# Restore data
docker compose exec -T database pg_restore -d rpki_history < rpki-history.dump
```