2 Dec 2024 |
richvdh | you could come up with various ways of improving that, but it's Hard to do in a way that preserves Matrix's availability promises | 10:26:16 |
Solodric | Hmm. | 11:10:11 |
Solodric | In reply to @richvdh:sw1v.org: "the principal problem for metadata on Matrix is that all servers have to know the full list of members in a room, so that they can correctly fan out new messages to all the other servers in the room." This is a good summary, thanks. Put that way, I'm not sure how to apply something like an ODoH framework to it, although it feels like it should be possible. | 11:10:39 |
Solodric | I'd have to consider a specific piece of metadata, though. | 11:11:10 |
Solodric | What's currently the metadata tidbit of most concern? | 11:11:20 |
richvdh | like I said, imho the most fundamental concern is the membership list. | 11:11:58 |
Solodric | Okay, so the goal is to conceal the membership list of each room from the homeserver. Hmm. | 11:12:35 |
Solodric | I think you could actually apply an ODoH framework to that pretty directly? I'm trying to work out what it would look like. | 11:13:09 |
Solodric | There's probably some interoperability aspect I'm not considering. | 11:13:35 |
Solodric | To be sure, I'd need to know exactly what that metadata's currently used for. | 11:14:03 |
richvdh | (you could argue that reactions being in plaintext are a larger concern on a day-to-day basis, but I think that's a much easier problem to solve) | 11:14:05 |
Solodric | In reply to @richvdh:sw1v.org: "(you could argue that reactions being in plaintext are a larger concern on a day-to-day basis, but I think that's a much easier problem to solve)" (I had the exact same thought) | 11:14:17 |
Solodric | I don't think you need any ODoH trickery for that. | 11:14:34 |
Solodric | ODoH is an inherently imperfect solution to metadata - the best solution is to outright encrypt it. ODoH doesn't E2E the data or eliminate it, it just distributes trust. | 11:14:57 |
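The "distributes trust" point can be made concrete with a toy model of who learns what in an ODoH-style relay. This is only a sketch: `Sealed` is a hypothetical stand-in for real public-key encryption (ODoH itself uses HPKE), and the class names, key IDs, and addresses are invented for illustration. The proxy sees the client's address but only an opaque blob; the target sees the query but only the proxy's address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sealed:
    """Hypothetical stand-in for ciphertext: only the holder of key_id can open it."""
    key_id: str
    _plaintext: object

    def open(self, key_id):
        if key_id != self.key_id:
            raise PermissionError("wrong key")
        return self._plaintext


class Target:
    key_id = "target-key"  # assumed key identifier, illustration only

    def __init__(self):
        self.seen = []  # everything the target learns

    def handle(self, from_addr, blob):
        question, reply_key = blob.open(self.key_id)
        self.seen.append((from_addr, question))
        # The answer is sealed to the client, so the proxy cannot read it either.
        return Sealed(reply_key, f"answer to {question}")


class Proxy:
    addr = "proxy.example"

    def __init__(self, target):
        self.target = target
        self.seen = []  # everything the proxy learns

    def relay(self, client_addr, blob):
        # The proxy knows who is asking, but the blob is opaque to it.
        self.seen.append((client_addr, blob.key_id))
        return self.target.handle(self.addr, blob)
```

Neither party alone can link a client to a query, but the two colluding can, which is why this spreads trust rather than removing it.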
richvdh | Currently, in a nutshell
- client wants to send a message to a room. It makes an http hit on the server with the room_id and the (encrypted, if necessary) content
- server constructs an event and adds it to the DAG of events that comprises a room
- server forwards that event to all other servers in the room
- other servers authorize that event based on their knowledge of the room membership, which is based on their knowledge of the event DAG
- servers forward authorized events to clients
| 11:17:32 |
richvdh | it's steps 3 and 4 that require servers to know room membership, and that is somewhat inherent to the way Matrix works | 11:18:08 |
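The flow described above can be sketched roughly as follows, to show where the membership list is needed. This is a simplified model and not how any real homeserver is written: the `Event` and `Server` classes, the `send` helper, and the flat per-room list standing in for the DAG are all assumptions made for illustration. Real Matrix derives membership from the event DAG via state resolution, which is omitted here.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    sender: str    # full user ID, e.g. "@alice:a.example"
    room_id: str
    content: str   # may be ciphertext; servers never need to read it


@dataclass
class Server:
    name: str
    # room_id -> set of member user IDs (assumed to be precomputed here)
    members: dict = field(default_factory=dict)
    # room_id -> list of accepted events (flat stand-in for the DAG)
    dag: dict = field(default_factory=dict)

    def servers_in_room(self, room_id):
        # Step 3 needs this: the destination list is derived from membership.
        return {u.split(":", 1)[1] for u in self.members.get(room_id, set())}

    def authorize(self, event):
        # Step 4 needs this: only known members of the room may send.
        return event.sender in self.members.get(event.room_id, set())

    def accept(self, event):
        if not self.authorize(event):
            return False
        self.dag.setdefault(event.room_id, []).append(event)
        return True


def send(origin, network, event):
    """Steps 2-4: the origin accepts the event, then fans it out."""
    if not origin.accept(event):
        return []
    delivered = []
    for name in sorted(origin.servers_in_room(event.room_id) - {origin.name}):
        if network[name].accept(event):
            delivered.append(name)
    return delivered
```

Note that `content` is never inspected, so encrypting it does nothing to hide `members`: both the fan-out and the authorization check read the membership set directly.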
Solodric | Tch. Yeah, I was going to say. | 11:18:22 |
Solodric | You could probably get through steps 1-3 anonymously with the framework I'm thinking of, but I'm not sure how you'd get past 4 and 5. | 11:18:48 |
Solodric | Step 3 also implies the need to know which other servers are connected, which means there's either a soft-metadata leak or you're trusting the client to truthfully report their homeserver, which introduces a whole other set of issues. | 11:19:46 |
Solodric | Although I guess the client can already spoof that actually, so that's probably not a problem with this anyway. | 11:20:02 |
Solodric | The client would need to authenticate with their homeserver before sending requests to other servers, but I think that part's okay. | 11:20:37 |
Solodric | Still, that does imply the servers know which other servers they're forwarding events to - they just don't know which members are attached to which servers. | 11:21:42 |
Solodric | What I called a 'soft metadata' leak. Not a term of art, just the only way I can think of to say the metadata is semi-anonymized, depending on context. | 11:22:15 |
Solodric | (For an extreme example, if a person is the only user of a given homeserver, presumably this offers no benefit) | 11:22:50 |
Solodric | I guess you might be able to manage step 4 due to the authentication I mentioned before. Once everyone has authenticated with their own homeservers and presumably received some kind of anonymized session token from them, they could use that for event authorization and forwarding. | 11:24:15 |
Solodric | This doesn't fully solve the problem, like I said. At best you've just softened the usefulness of the metadata significantly. | 11:24:40 |
Solodric | That is, the homeserver doesn't know which of its users goes with which requests and other servers. | 11:25:08 |
Solodric | Unfortunately, this all depends on the homeserver implementing the system correctly and non-maliciously, since it's not true E2EE like I said. A malicious homeserver could still reveal all of this metadata. | 11:26:29 |
Solodric | You could use an ODoH framework to distribute that trust across multiple servers, but I can't see any outright solution. | 11:26:55 |