https://github.com/fedi-e2ee/public-key-directory-specification

Specification for a Fediverse Directory Server for Public Keys
https://github.com/fedi-e2ee/public-key-directory-specification
fediverse key-transparency
Last synced: 5 months ago
JSON representation
Specification for a Fediverse Directory Server for Public Keys
Host: GitHub
URL: https://github.com/fedi-e2ee/public-key-directory-specification
Owner: fedi-e2ee
License: other
Created: 2024-06-08T06:50:01.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2026-01-24T20:53:50.000Z (5 months ago)
Last Synced: 2026-01-25T06:46:50.050Z (5 months ago)
Topics: fediverse, key-transparency
Language: PHP
Homepage: https://publickey.directory
Size: 525 KB
Stars: 155
Watchers: 13
Forks: 3
Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project

README

          # Public Key Directory Server Specification

This repository contains the specification for a **Public Key Directory** for the Fediverse.

> [!CAUTION]

> This is still a work in progress.

> 

> There may be bugs. Some things may be under-specified, or there could be subtle contradictions.

> [Please tell us](https://github.com/fedi-e2ee/public-key-directory-specification/issues/new) if we made a mistake.

## What Is This All About?

> [!NOTE]

> If you're more technical, [feel free to skip this section](#our-proposal).

This project is an infrastructure component to improve the privacy of the Fediverse. Its intended audience is software

developers, system administrators, security researchers, academics, cryptographers, and technology experts.

If you're not already familiar with the Fediverse (e.g., Mastodon), [The Verge](https://www.theverge.com/24063290/fediverse-explained-activitypub-social-media-open-protocol) 

([archived](https://archive.is/TuNUu)) tried to explain it in 2024.

Currently, Direct Messages (DMs) on the Fediverse are like postcards: Server admins and malicious hackers can read them,

if they want to. Law enforcement can also compel server admins to disclose them if they want to prosecute you, and there

is little recourse for them except to comply. (Unlike large tech companies, most instance admins do not have a large

legal defense fund to fight the government in court.)

The architects of the Fediverse are investigating a technique called [end-to-end encryption](https://github.com/swicg/activitypub-e2ee)

to alleviate this privacy risk. This is like upgrading from postcards to sealed envelopes that only the recipient can

open. This envelope upgrade requires careful use of math that involves something called a "public key". You can think of

it as a special lock that only the intended party can open.

### Where does this project come in?

The hardest part of designing end-to-end encryption for the Fediverse is actually a **Key management** problem, which

can be summed up as one question:

> How do you know which public key belongs to a stranger you want to chat privately with?

How ever you decide to initially answer that question, the obvious follow-up question is:

> How do you know that you weren't deceived?

And if you repeat this exercise, you will eventually reinvent a trust model for a public key infrastructure.

### Why This Problem Is Harder Than It Looks

There are some obvious questions that arise when first faced with this problem, depending on your technical background.

#### Why not just a simple directory?

A simple directory is quick and easy: Just look up a username and get their public key. ActivityPub already offers such

a mechanism with [webfinger](https://webfinger.net/). Why not just rely on that?

The problem is: You have to **completely trust** whoever runs that directory. If a simple directory operator is 

compromised, or decides to be malicious, they can silently hand you the wrong key for someone you want to message, and

you'd have no way to detect it. The key you think belongs to Alice might actually belong to an attacker.

#### Why not just use Certificates and Certificate Authorities?

This creates a power structure where a few trusted entities (Authorities) have the capability to impersonate anyone in

the ecosystem. This gives hackers and hostile governments easy targets to prioritize, and is fundamentally incompatible

with the goals and culture of the Fediverse.

#### Why not use an existing solution?

The technical details matter a lot. Read [the threat model](Specification.md#threat-model) to understand more about what

features and security properties we're trying to provide, which isn't compatible with existing designs.

(This README only gets more technical from this point on.)

## Our Proposal

Our solution is to require all relevant actions (public key enrollment and revocation) be published immediately onto an 

append-only data structure (i.e., a [Merkle tree](https://en.wikipedia.org/wiki/Merkle_tree)). In the cryptographic

literature, this is called "Key Transparency".

The Public Key Directory vends a user's public keys (which originate from the user's device) that can be used with 

digital signature algorithms, and includes a machine-verifiable proof of when each public key was enrolled.

This machine-verifiable proof is useful for establishing a baseline, minimal level of trust that a given public key is

correct for the party you wish to talk to. That is to say: **It keeps the directory honest.**

Additional manual key verification mechanisms (key fingerprints, safety numbers, etc.) are out-of-scope but totally 

permitted for technical users in higher-level protocols. Really, we're trying to do better than Trust on First Use 

(TOFU), so [Johnny can finally encrypt](https://people.eecs.berkeley.edu/~tygar/papers/Why_Johnny_Cant_Encrypt/OReilly.pdf). 

Other applications can build atop our Public Key Directory design to build advanced use cases (i.e., authenticated key 

exchanges for end-to-end encryption).

It's worth keeping in mind that the Public Key Directory isn't *just* the Merkle Tree, it's an API built on top of a

Merkle Tree. To that end, you can query the API to retrieve every currently-trusted public key for a user, rather than

having to manually parse this information out of the data stored in the underlying data structure.

### How Does This Help Non-Technical Users?

**This doesn't *directly* help non-technical users,** but it does help developers solve a difficult problem so they can

focus on making software better and easier to use, which (in turn) helps everyone.

The Public Key Directory is a building block for developers. Our immediate audience for this specific component is 

necessarily somewhat technical.

However, the projects that build *atop* this building block should take pains to minimize the friction for non-technical

users. (That includes the other projects we will be opening in this GitHub organization!)

### What kind of keys are vended by the Public Key Directory?

Public keys meant for verifying digital signatures, not encryption keys.

To integrate the Public Key Directory with a broader E2EE project, you can use the signature verification keys to verify

a signed [key package (in MLS parlance)](https://www.rfc-editor.org/rfc/rfc9420.html#name-key-packages). This provides

assurance that the "key package" your software is using to start a private chat is actually owned by your friend.

### Does the Public Key Directory publish anything except those public keys?

**Yes!** The Public Key Directory additionally publishes [Auxiliary Data](Specification.md#auxiliary-data), which allows

developers a convenient way to build key transparency into their own systems by building atop ours.

### How does Key Transparency help the trust issue?

It proves [everyone sees the same thing](https://defuse.ca/triangle-of-secure-code-delivery.htm#:~:text=are%20successfully%20attacked.-,Userbase%20Consistency%20Verification%3A,-Users%20of%20the).

Transparency systems allow you to prove that everyone has an identical view of history. This means the directory cannot

lie about its contents without detection. **This keeps everyone honest.**

The mapping of public keys (and auxiliary data) to ActivityPub actors is entirely deterministic: If you replayed the 

entire Transparency log from the first entry to the current latest, starting with a blank slate, you will always arrive

at the same final mapping.

### How the Pieces Fit Together

```mermaid

flowchart TB

    subgraph Users["End Users"]

        Alice["Alice's Client"]

        Bob["Bob's Client"]

    end

    subgraph Fediverse["Fediverse Instances"]

        ServerA["Instance A
(e.g., mastodon.social)"]

        ServerB["Instance B
(e.g., hachyderm.io)"]

    end

    subgraph PKD["Public Key Directory"]

        API["JSON REST API
(read-only queries)"]

        Inbox["ActivityPub Inbox
(Protocol Messages)"]

        State["Current Key State
(valid keys per Actor)"]

        TLog["Transparency Log
(append-only Merkle tree)"]

    end

    subgraph Federation["PKD Federation"]

        PKD2["Other PKD Instances"]

    end

    Auditors["Auditors & Verifiers"]

    %% User to Instance flows

    Alice -->|"HPKE-encrypted
Protocol Messages"| ServerA

    Bob -->|"HPKE-encrypted
Protocol Messages"| ServerB

    %% Instance to PKD flows

    ServerA -->|"HTTP Signatures +
AddKey, RevokeKey, etc."| Inbox

    ServerB -->|"HTTP Signatures +
AddKey, RevokeKey, etc."| Inbox

    %% Direct user revocation (bypass instance)

    Alice -.->|"RevokeKeyThirdParty
(emergency revocation)"| API

    %% PKD internal flows

    Inbox --> State

    State --> TLog

    %% Query flows

    ServerA & ServerB -->|"Query public keys
for other users"| API

    API --> State

    %% Federation

    TLog <-->|"Checkpoint
(cross-PKD Merkle roots)"| PKD2

    %% Auditing

    TLog -->|"Verify append-only
history"| Auditors

```

* **Users** generate keys locally and send encrypted [Protocol Messages](Specification.md#protocol-messages) through

  their Fediverse instance

* **Instances** add HTTP Message Signatures ([RFC 9421](https://www.rfc-editor.org/rfc/rfc9421.html)) and relay messages

  to the Public Key Directory (PKD)

* **PKD** validates messages, updates key state, and commits everything to the transparency log

* **Disaster recovery** that can [pass the Mud Puddle test](https://blog.cryptographyengineering.com/2012/04/05/icloud-who-holds-key/)

  is baked into our design:

  * Third-party revocation allows users to revoke compromised keys directly, bypassing a potentially malicious

    instance

  * Recovery is enabled through [BurnDown](Specification.md#burndown); users can elect to become 

    [Fireproof](Specification.md#fireproof) to become immune to BurnDown

* **Federation** enables multiple PKDs to cross-verify via Checkpoint messages

* **Auditors** can independently verify the append-only log hasn't been tampered with

## Our Guiding Principles

All design decisions for this proposal have been influenced by the following guiding principles.

1. Build for people, and

2. Security over legacy

### Build for People

The main goal of this project is to enable more people to securely communicate with each others.

From this it follows that we

* don't require any expert knowledge from the users of this system.

* minimize the number of steps a user has to take to use this system securely.

* value the privacy of the users, by only storing the minimal amount of information necessary and make it possible to delete data when the user demands it.

* clearly communicate errors and incidents to the users and by doing so give them a proper understanding of their security state.

As a side note, we don't consider companies as people.

### Security Over Legacy

We want to build a system that solves key management for people, nothing less and nothing more.

There are many ways to solves this problem and many existing solutions which could become part of our solution.

But we want to take the opportunity to focus on security and verifiability.

This means if we have the choice we will not take an existing solution, when this solution leads to an unwarranted increase in complexity or a security compromise. Even if this solution is an established standard and used by everyone else.

## Documents In Scope

* [Architecture](Architecture.md)

  \- This document succinctly describes how the Public Key Directory fits into the Fediverse.

* **[Specification](Specification.md)**

  \- This document contains the specification text in its entirety.

  1. [Introduction](Specification.md#introduction)

  2. [Concepts](Specification.md#concepts)

  3. [Threat Model](Specification.md#threat-model)

  4. [Protocol Messages](Specification.md#protocol-messages)

  5. [The Federated Public Key Directory](Specification.md#the-federated-public-key-directory)

  6. [Cryptography Protocols](Specification.md#cryptography-protocols)

  7. [Security Considerations](Specification.md#security-considerations)

* [Test Vectors](Test-Vectors.md)

  \- This document will contain test vectors for the protocols used in the Public Key Directory.

## Reference Implementation

The [**Public Key Directory Reference Implementation**](https://github.com/fedi-e2ee/pkd-server-php) is now available.

## Historical Development Blogs

This section includes some highlights that may be worth considering to understand the technical underpinnings of our

design. Reading them is not mandatory, but should provide insight into how we approached these problems.

1. [Towards Federated Key Transparency](https://soatok.blog/2024/06/06/towards-federated-key-transparency) (June 2024)

    * This blog post kicked this project off. It explains the motivation and how it fits into the goal of delivering

      "end-to-end encryption for the Fediverse".

2. [Key Transparency and the Right to be Forgotten](https://soatok.blog/2024/11/21/key-transparency-and-the-right-to-be-forgotten/)

   (November 2024)

    * This blog post describes how we square the auditable, append-only nature of Merkle Trees with a desire to not make

      complying with a nation's *Right To Be Forgotten* technically impossible.

## Extensions for Auxiliary Data

See the [Extensions](https://github.com/fedi-e2ee/fedi-pkd-extensions) repository.
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/fedi-e2ee/public-key-directory-specification

Awesome Lists containing this project

README