InfoCentral
InfoCentral.org project discussion


9 Jul 2020
@pukkamustard:matrix.org@pukkamustard:matrix.org * ECRS from GNUnet (https://grothoff.org/christian/ecrs.pdf) also encrypts leaf nodes. They have a nice description of the threat model and what an adversary could do if nodes were not encrypted.09:27:53
@chrisgebhardt:matrix.orgChris GebhardtI've gone back and forth on these issues within my similar designs for collections and large data support. I'll read that ECRS paper when I get a chance.09:40:33
10 Jul 2020
@chrisgebhardt:matrix.orgChris Gebhardt * Are second-preimage attacks the main concern or content (data size / known block) obfuscation? https://flawed.net.nz/2018/02/21/attacking-merkle-trees-with-a-second-preimage-attack/00:32:15
@pukkamustard:matrix.org@pukkamustard:matrix.orgBoth are a concern. The second-preimage attack is mitigated by encoding the node's position in the tree into the node itself (via the nonce - https://openengiadina.net/papers/eris.html#orgecda04e). Content obfuscation is important for plausible deniability / censorship resistance.05:43:12
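The position-nonce mitigation described above can be sketched as follows. This is a minimal illustration of the idea, not the actual ERIS construction; the 4-byte level nonce and the BLAKE2b parameters are assumptions made purely for the sketch:

```python
import hashlib

def node_hash(data: bytes, level: int) -> bytes:
    """Hash a tree node, binding its position (here: its level) into the digest.

    Sketch only, NOT the actual ERIS construction: because the level is
    mixed into the hash as a hypothetical 4-byte nonce, a leaf (level 0)
    and an internal node (level >= 1) can never produce the same digest
    for the same bytes, which blocks the classic Merkle-tree
    second-preimage attack.
    """
    nonce = level.to_bytes(4, "big")
    return hashlib.blake2b(nonce + data, digest_size=32).digest()

# The same bytes hash differently as a leaf vs. as an internal node, so
# an attacker cannot pass off a crafted concatenation of child hashes as
# a leaf:
payload = b"\x00" * 64
assert node_hash(payload, 0) != node_hash(payload, 1)
```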
@chrisgebhardt:matrix.orgChris GebhardtRight, but if the leaves are encrypted, I'm not seeing the practical difference. Routing obfuscation and network path redundancy are the only way to gain anonymity / availability, which is a different layer than the data model. Am I missing something?16:04:44
@chrisgebhardt:matrix.orgChris Gebhardt(i.e. even if branch nodes are encrypted, an adversary can often do timing analysis on network traffic to match which data belongs to which root requests, thereby determining length.)16:17:12
@chrisgebhardt:matrix.orgChris Gebhardt * Partly, I suspect we have radically different objectives here. A core principle of the InfoCentral data model is to make a fair amount of graph structure (hash refs) publicly visible, even if content is encrypted. This is for the sake of reference collection, which allows all parties to participate in routing public information toward its adjacencies in the graph. (Thus helping build out a shared semantic graph, RDF or otherwise.) It also eliminates the need for the fiasco that is owned namespaces and federation, because new information can simply coalesce around existing information and be filtered by end users.17:27:12
@chrisgebhardt:matrix.orgChris GebhardtAll network attributes are thus considered orthogonal to this core data model, as competitive QoS parameters: consistency, redundancy, availability, anonymity, etc. I think a lot of techniques you've mentioned would fit well here.17:27:44
11 Jul 2020
@pukkamustard:matrix.org@pukkamustard:matrix.org

Hi Chris Gebhardt. Thanks for the comments!

Right, but if the leaves are encrypted, I'm not seeing the practical difference. Routing obfuscation and network path redundancy are the only way to gain anonymity / availability, which is a different layer than the data model. Am I missing something?

I agree completely. An adversary can observe network traffic and learn which blocks belong to which content. One would require some kind of transport-level obfuscation (e.g. transmitting a bunch of random blocks) to mitigate this. I think the indistinguishability of block types can help with these kinds of obfuscation.

A case where encryption of branch nodes might make more sense is plausible deniability for caching peers: caching peers can choose to know nothing about the blocks they are caching. If the caching peers hold the verification capability, then they can decode the branch nodes, and we end up back at the case where branch nodes are simply not encrypted.

So, I admit, the case for encrypting branch nodes is not so strong. Still, I think it is worth the effort.

Partly, I suspect we have radically different objectives here. A core principle of the InfoCentral data model is to make a fair amount of graph structure (hash refs) publicly visible, even if content is encrypted. This is for the sake of reference collection, which allows all parties to participate in routing public information toward its adjacencies in the graph. (Thus helping build out a shared semantic graph, RDF or otherwise..) It also eliminates the need for the fiasco that is owned namespaces and federation, because new information can simply coalesce around existing information and be filtered by end users.

I think this still works with what we propose. Namely, you can reference the verification capabilities. This makes the hash refs publicly visible, even if content is encrypted.

08:46:01
@how:public.cathellekin
So, I admit, the case for encrypting branch nodes is not so strong. Still, I think it is worth the effort.
11:59:11
@how:public.cathellekinI think plausible deniability for caching actors is a strong case, as it effectively enables network neutrality and provides another defense layer against censorship attempts by strong attackers.12:00:56
12 Jul 2020
@chrisgebhardt:matrix.orgChris Gebhardt *

I think this still works with what we propose. Namely, you can reference the verification capabilities. This makes the hash refs publicly visible, even if content is encrypted.

Here, in particular, I'm talking about not just the references that build a Merkle tree but the hash references within serialized content, such as the subjects, predicates, and objects of semantic graph statements (RDF or otherwise). My new "standard entity" data model (WIP) allows contained references to be selectively plaintext or encrypted, even to different keys. This could be used to hide the Merkle branch references as in your model. But it also allows, say, subject references in RDF to be publicly indexed by repositories and collected (for the sake of gathering multi-sourced statements within the entities for different subjects). Of course, any statement data may also be encrypted, with the same or different keys.

04:46:37
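The selectively encrypted references described above might look something like the following sketch. It is purely illustrative and not Chris's actual "standard entity" format: the entity layout, field names, and the toy XOR keystream (standing in for a real cipher) are all assumptions:

```python
import hashlib
import os

def toy_stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy keystream XOR for the sketch only -- NOT secure; a real
    # implementation would use an AEAD cipher such as ChaCha20-Poly1305.
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.blake2b(
            nonce + counter.to_bytes(8, "big"), key=key).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def make_entity(refs: dict, keys: dict) -> dict:
    """Store each contained hash reference either in plaintext or
    encrypted under a per-reference key: e.g. RDF subject refs can stay
    publicly indexable while Merkle branch refs are hidden."""
    entity = {}
    for name, ref in refs.items():
        if name in keys:   # encrypt this reference
            nonce = os.urandom(16)
            entity[name] = ("enc", nonce,
                            toy_stream_xor(keys[name], nonce, ref))
        else:              # leave this reference publicly visible
            entity[name] = ("plain", ref)
    return entity

entity = make_entity(
    refs={"rdf-subject": b"\x11" * 32, "merkle-branch": b"\x22" * 32},
    keys={"merkle-branch": b"k" * 32},
)
assert entity["rdf-subject"] == ("plain", b"\x11" * 32)
assert entity["merkle-branch"][0] == "enc"
```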
@chrisgebhardt:matrix.orgChris GebhardtThus far, I didn't have any particular plans to force people to use a certain storage block size or Merkle tree structure. Rather, the latter was to be made available as a facility for handling large data like multimedia, but with clean logical segments of arbitrary size rather than a uniform binary chunk size. (e.g. a video's large data segments may be divided cleanly at keyframes.)04:53:31
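A Merkle root over logical segments of arbitrary size, as described above, can be sketched like this. It is a minimal illustration under assumptions of the sketch (plain SHA-256, simple pairwise combining), and it deliberately omits the position-nonce hardening discussed earlier in the thread:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(segments: list) -> bytes:
    """Build a Merkle root over segments of arbitrary length, e.g. video
    data split cleanly at keyframes rather than at a fixed chunk size."""
    level = [h(s) for s in segments]          # leaf hashes
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            # Hash adjacent pairs; an odd node is carried up unchanged.
            nxt.append(h(pair[0] + pair[1]) if len(pair) == 2 else pair[0])
        level = nxt
    return level[0]

# Segment sizes need not be uniform:
root = merkle_root([b"keyframe-1" * 100, b"keyframe-2" * 37, b"tail"])
assert len(root) == 32
```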
@chrisgebhardt:matrix.orgChris Gebhardt
In reply to @how:public.cat
I think plausible deniability for caching actors is a strong case, as it effectively enables network neutrality and provides another defense layer against censorship attempts by strong attackers.
* One of my assumptions is that there will not be a single topology, like true P2P, but rather a wide range of network overlays. I think practical economic forces will drive toward variable centralization of network/storage services. Future ISPs are the most logical caching peers for popular public data (with dynamic QoS agreements replacing static CDNs), and they'll need to see many references to be able to do any sort of optimization efficiently (i.e. without guessing adjacency via observed request patterns). But then there can be overlays of private repositories, public and private P2P wide nets of any scale, local ad hoc and mesh networks, sneakernet, etc. In the public cases, root hashes and visible references to root hashes are censorship targets, if the adversary knows what roots to target. Alternative roots that lead to the same leaves can often be detected by network analysis (perhaps leading to targeting the leaves as well). And alternative roots also break valuable reference collection by forking into multiple aliases. So my overall thinking was more on the order of layered networks filling in gaps if censorship occurs.
05:28:57
@chrisgebhardt:matrix.orgChris Gebhardt * The full plausible deniability model is certainly useful, though I suspect it is necessarily quite costly in performance. Imagine a DHT P2P ring topology using a 4K block size. Since peers cannot aggregate adjacent data if branches are encrypted, a single 4MB image would involve on the order of 1000 requests to randomized peers within the network, also involving retries for unavailable blocks, varying latencies, etc. (And possibly also with onion routing!)05:35:30
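The "on the order of 1000 requests" figure above is straightforward block-count arithmetic:

```python
# Back-of-the-envelope for the scenario above: a 4 MB image stored as
# 4K blocks, with each block fetched from a different randomized peer.
block_size = 4 * 1024              # 4 KiB per block
image_size = 4 * 1024 * 1024       # 4 MiB image
leaf_requests = image_size // block_size
print(leaf_requests)               # 1024 requests for the leaves alone,
                                   # before branch nodes, retries, or
                                   # onion-routing hops
```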
@pukkamustard:matrix.org@pukkamustard:matrix.org

Here, in particular, I'm talking about not just the references that build a Merkle tree but the hash references within serialized content, such as the subjects, predicates, and objects of semantic graph statements (RDF or otherwise).

The ERIS and "Content-addressable RDF" proposals split up these two ideas as well. "Content-addressable RDF" defines some restrictions on RDF that make sub-graphs content-addressable. Subjects, predicates, and objects can be referenced by a (not-so-)good old URL, by a simple hash of the serialized content being referenced, or by a more intricate (and secure) encoding of the content (e.g. ERIS).

Thus far, I didn't have any particular plans to force people to use a certain storage block size or Merkle tree structure

Same, same. ERIS is just one possible way of securely encoding content that can be referenced.

I recently learnt that GNUnet also defines URIs (https://docs.gnunet.org/handbook/gnunet.html#File_002dSharing-URIs). So ECRS-encoded content can also be referenced from RDF.

One of my assumptions is that there will not be a single topology, like true P2P, but rather a wide range of network overlays.

I share this assumption. @tg has proposed one such topology (https://p2pcollab.net/#upsycle).

The full plausible deniability model is certainly useful, though I suspect it is necessarily quite costly in performance.

This might very well be the case. But there are situations where plausible deniability (and censorship resistance) is absolutely necessary. Maybe encrypting the branch nodes will help in lowering the total cost of full plausible deniability?

10:48:56
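The middle option pukkamustard lists above -- referencing a sub-graph by "a simple hash of the serialized content" -- can be sketched as follows. The URN-style prefix is a hypothetical placeholder, not what the "Content-addressable RDF" proposal actually specifies, and real use would require a canonical serialization:

```python
import hashlib

def hash_reference(serialized_subgraph: bytes) -> str:
    """Reference content by a plain hash of its serialization.

    Sketch only: the reference scheme shown here is hypothetical, and
    the serialization must be canonical for the hash to be stable.
    """
    digest = hashlib.sha256(serialized_subgraph).hexdigest()
    return "urn:example:sha-256:" + digest

ref = hash_reference(b"<#a> <#knows> <#b> .")
assert ref.startswith("urn:example:sha-256:")
```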
