That would be a good analogy if they just claimed the network gives you anonymit...

Chris2048 · on May 1, 2020

If the law were clear cut, yes. When the law is uncertain, it's not possible to either fully comply with or fully violate the law - avoiding "being caught" prevents legal action on an issue that may or may not be legal. That this is possible/necessary is due to the structure of the legal system.

What incentives do people have to risk being a legal test case? It's clear to me those with influence over the law purposefully keep things ambiguous in order to have it both ways (threaten, without risk of being threatened) - maybe the opposing sides should be able to take advantage of the ambiguity just as much - after all, privacy is the right even of those with nothing to hide.

amelius · on May 1, 2020

The allegation will be: I downloaded A from X and B from Y, XORed them together and got copyrighted work.

However, X will say that Y could have manufactured B that way. And Y will say the same.

> This is the part that will be laughed out of the court room.

You can laugh all you want, but you can't convict anyone.

greenshackle2 · on May 1, 2020

Great, X and Y have infringed the copyright for both A and B. I'm not convinced they don't have more legal exposure rather than less (if B is copyrighted too).

amelius · on May 1, 2020

The main point is that there will be nobody in court in the first place.

Assume you have published random block A.

Now I claim that you have infringed copyright of a work because XORing your A with my C gives that work.

I hope you can agree that I can't sue you based on that. And this is exactly how this system works.

Kalium · on May 1, 2020

If I need your A in order to produce the copyrighted work, it seems likely to me that the argument will be made that it's a derivative work or another way of copying the copyrighted work. It might even work. It would certainly be expensive to find out.

As a rule, trying to get clever with technology to get around the legal system rarely works well. Especially when you can't afford to defend yourself.

LukeShu · on May 1, 2020

The point being made is that the system creates such deniability that it's impossible to say who the perpetrator is.

Let A and B be two data blocks produced by different people, and C be a copyrighted work.

It becomes known that A⊕B⇒C, and so the copyright holder of C would like to sue someone.

It is obvious that either the person who produced A or the person who produced B is infringing on the copyright of C. However, it is impossible to say which one it is; it is impossible to say whether A is a derivative work of B and C (B⊕C⇒A) or if B is a derivative work of A and C (A⊕C⇒B).
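To make the symmetry concrete, here is a minimal Python sketch. The helper `xor_blocks` and the placeholder bytes standing in for C are my own illustration, not anything taken from the OFFSystem code. It builds the same relation A⊕B⇒C under two different histories, one where A was the random block and one where B was.

```python
import os

def xor_blocks(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length byte strings (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("blocks must have the same length")
    return bytes(a ^ b for a, b in zip(x, y))

# Placeholder bytes standing in for the copyrighted work C.
C = b"stand-in for a copyrighted work."

# History 1: A was random, and B was derived from A and C.
A1 = os.urandom(len(C))
B1 = xor_blocks(A1, C)

# History 2: B was random, and A was derived from B and C.
B2 = os.urandom(len(C))
A2 = xor_blocks(B2, C)

# Both histories end in the same relation: A xor B == C.
assert xor_blocks(A1, B1) == C
assert xor_blocks(A2, B2) == C
```

In either history, each block taken on its own is uniformly random bytes, so inspecting the blocks gives no hint which one was derived from C.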

(Of course, good luck explaining that to a judge.)

Let's iterate on that more. So A and B are blocks on the network. Yet a 3rd person publishes some freely-licensed works; call them F1 and F2. In order to be stored in OFF, they get xor'ed against existing blocks, and just so happen to xor against A and B. So A⊕F1⇒A' and B⊕F2⇒B'. So someone wanting to download F1 will download A and A', then get F1 by A⊕A'⇒F1. Neither the person publishing F1 and F2 nor the person downloading them has any idea that A⊕B⇒C. So now someone lawfully downloading F1 and F2 appears to be downloading C, because the blocks they're downloading are [A, B, A', B']; and [A, B] are what it takes to infringe on C. But they had no idea that A⊕B⇒C, they were just interested in F1 and F2. So now you can't say that downloading A and B together indicates unlawfully downloading C.
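The same scenario as a toy Python sketch. The 32-byte block size, the placeholder contents, and the `xor_blocks` helper are assumptions made purely for illustration; the real system uses much larger blocks.

```python
import os

BLOCK = 32  # toy block size, an assumption for this sketch

def xor_blocks(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length byte strings (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("blocks must have the same length")
    return bytes(a ^ b for a, b in zip(x, y))

# A and B already exist on the network, and A xor B happens to give C.
C = b"stand-in for copyrighted work C".ljust(BLOCK, b".")
A = os.urandom(BLOCK)
B = xor_blocks(A, C)

# A 3rd person stores two freely-licensed works against A and B.
F1 = b"freely-licensed work one".ljust(BLOCK, b".")
F2 = b"freely-licensed work two".ljust(BLOCK, b".")
A_prime = xor_blocks(A, F1)
B_prime = xor_blocks(B, F2)

# Someone who only wants F1 and F2 downloads these four blocks.
downloaded = {"A": A, "A'": A_prime, "B": B, "B'": B_prime}
got_F1 = xor_blocks(downloaded["A"], downloaded["A'"])
got_F2 = xor_blocks(downloaded["B"], downloaded["B'"])

# The same download set also happens to contain everything needed for C.
also_possible = xor_blocks(downloaded["A"], downloaded["B"])
```

The downloader only ever computes `got_F1` and `got_F2`; `also_possible` is there to show that an observer looking at the set of fetched blocks cannot rule C in or out.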

(Again, good luck explaining that to a judge.)

So the point is to make it really hard to say who did what.

Kalium · on May 1, 2020

Maybe it's just me, but doesn't this make it exceptionally clear that multiple people have colluded to both do a thing and hide who was the responsible party? I think the US legal system might have tools for handling such groups of people. I'm reasonably sure this isn't unprecedented, except perhaps in the technological details.

I think it might strain credulity that these "random" blocks "just happen" to have these particular properties. So much so that I doubt the argument would stand in face of the argument that F1 and F2 were deliberately crafted derivative works of C, made in an effort to allow piracy of C.

LukeShu · on May 1, 2020

> multiple people have colluded

Yes, everyone using OFFSystem is colluding to make it hard to tell who did what. There are legitimate reasons someone might want to engage in such a system--including privacy. It's the same thing with Tor; everyone with a Tor node is colluding to make it hard to tell who's talking with what other nodes.

> it might strain credulity that these "random" blocks "just happen" to have these particular properties

The OFFSystem is designed such that blocks do "just happen" to have those particular properties. Any new thing put on the OFFSystem will make use of many random blocks that are already on the OFFSystem. For simplicity, in the examples I gave, storing 1 new data block used just 1 existing data block, but that number is configurable; the default is 2. So when a user stores F into the OFFSystem, the software will randomly select 2 existing data blocks on the network (S, R), and combine F with them in order to store it: S⊕R⊕F⇒data-block-to-store; the software has no idea what S and R have previously been used for. For simplicity, I've also been assuming that a file you want to store fits in 1 block; a source block is 128KiB; storing a file larger than that will take storing many blocks. So in all, by default, storing a file will make use of 2×(filesize/128KiB) randomly-selected data blocks already on the network. The idea is that after a short time, virtually every data block on the network will have many unrelated uses.
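A rough Python sketch of that storage step, under stated assumptions: `store_block`, `randomizers_needed`, the zero-padding of short files, and rounding the block count up are all my own choices for illustration. Only the 128KiB block size and the default of 2 randomizers come from the description above.

```python
import math
import os
import random

BLOCK_SIZE = 128 * 1024      # 128KiB source block, as described above
RANDOMIZERS_PER_BLOCK = 2    # the default mentioned above

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together."""
    size = len(blocks[0])
    acc = 0
    for block in blocks:
        if len(block) != size:
            raise ValueError("blocks must have the same length")
        acc ^= int.from_bytes(block, "big")
    return acc.to_bytes(size, "big")

def store_block(source: bytes, pool: list, rng: random.Random):
    """Combine one source block with randomly chosen existing blocks."""
    padded = source.ljust(BLOCK_SIZE, b"\0")  # zero padding is an assumption
    randomizers = rng.sample(pool, RANDOMIZERS_PER_BLOCK)
    return xor_blocks(padded, *randomizers), randomizers

def randomizers_needed(file_size: int) -> int:
    """2 x (filesize / 128KiB), rounding the block count up (assumption)."""
    return RANDOMIZERS_PER_BLOCK * math.ceil(file_size / BLOCK_SIZE)

# Toy network: a pool of blocks that already exist.
pool = [os.urandom(BLOCK_SIZE) for _ in range(8)]
F = b"some file that fits in one block"
stored, (S, R) = store_block(F, pool, random.Random(0))

# Retrieval: XOR the stored block with the same S and R, then trim padding.
recovered = xor_blocks(stored, S, R)[:len(F)]
```

Under these assumptions a 1MiB file spans 8 source blocks, so `randomizers_needed(1024 * 1024)` comes to 16 pre-existing blocks drawn from the network.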

It would be entirely plausible that someone has both a legitimate use for A (e.g. F1) and a legitimate use for B (e.g. F2) with no one involved in that legitimate use having any idea that together they have the illegitimate use A⊕B⇒C, because the whole system was designed to make that likely.

LukeShu · on May 1, 2020

Switching sides, kinda: OK, so you might not be able to assign much culpability to any given data block, so just assign culpability to sharing the metadata that A⊕B⇒C, right?

It's my understanding that that metadata is stored in a "descriptor block" stored in the network like any other data block (but isn't combined with a set of random other blocks already on the network). I am unsure what the format of that data is; it might very well be easy to identify a block as being a descriptor block (or maybe not, I have no idea).

Going after those who download the descriptor block is problematic for the same reasons--that block will end up being reused for other files, the same as any other data block; they could be using it for any number of purposes.

OK, so you go after those that say "hey, use this X descriptor block to get such-and-such copyrighted file". Well, there's a reasonable amount of precedent that sharing a link to a copyrighted work doesn't constitute infringement in many jurisdictions (e.g. hosting a torrent file or a magnet link isn't infringing; it's actually seeding or leeching the contents of the torrent that's infringing).

OK, so you go after those that downloaded the descriptor block, then shortly afterward downloaded the data blocks referred to by the descriptor block, and that's reasonable suspicion. Well (1) tracking that is a lot harder than tracking who downloaded an individual block, and (2) there's still a sliver of plausible deniability, so then you get a subpoena to search their computer for C.

Again, the whole point is to make it really _hard_ to tell who's doing what.

heavenlyblue · on May 1, 2020

No, I think this whole thing makes it impeccably clear which blocks are the data and which blocks are the encryption pad since all of that is in the URL.

There’s no philosophical argument here, in my opinion.

LukeShu · on May 1, 2020

xor is a commutative operation; there is no technical distinction between the "data" and "encryption pad" as you call them. The URL identifies an unordered set of blocks, and does not distinguish which of them are the pre-existing "randomizer" blocks ("encryption pad" blocks, as you call them), and which one is the new block ("data" block, as you call it). The URL has a set of blocks, you xor them all together in any order, and you get the result.
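A quick Python check of that order-independence. The three random 16-byte blocks and the `xor_pair` helper are made up for illustration and have nothing to do with the real URL format.

```python
import itertools
import os
from functools import reduce

def xor_pair(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length byte strings (hypothetical helper)."""
    return bytes(a ^ b for a, b in zip(x, y))

# Three blocks, with no record of which one was created last.
blocks = [os.urandom(16) for _ in range(3)]

# XOR them together in every possible order and collect the distinct results.
results = {
    reduce(xor_pair, ordering)
    for ordering in itertools.permutations(blocks)
}
```

All six orderings collapse to a single value in `results`, which is why an unordered set of block identifiers is enough to reconstruct the data and why the set carries no "this one is the data" marker.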

heavenlyblue · on May 1, 2020

All encryption functions are total, it’s just that the block size of XOR is 1 bit.

If you decided to take AES-256 and then split your data into blocks of 256 bits and then encrypted them with other groups of blocks of 256 bits, you would get exactly the same conceptual result.

That being said (this definition is absolutely irrelevant) - it doesn’t matter if you have an unordered set of blocks; if by applying them together you get the data then everyone who held the block is liable.

Would you say that if you stored child porn on a RAID array and then distributed this RAID array across several people and removed one disk then philosophically the only moment when that porn really exists is only when it is combined to get the data in memory?

LukeShu · on May 2, 2020

I'm not sure why you've brought up philosophy twice now; what I'm showing is that the system makes it hard, practically, to tell who did what; no philosophy involved.

All but one of the blocks were already used by other people for entirely unrelated purposes. Only one of the blocks was created in the process of uploading illicit material, and it is impossible to know which one that is (from inspecting the blocks themselves; if you knew when each block was created--which the system does not track--then you could just say "the last one to be created"). You had asserted that the URL tells you which block that is; all I said to you is that it does not.

And even if you did identify which block that was, after a short time, another unsuspecting user of the network will end up using that block-involved-in-illicit-activity as a randomizer block for an entirely unrelated purpose; say: uploading cat gifs. And then others download those cat gifs, and download that block; having no idea that that block is also used to download child porn.

The idea of the system is that after a short time, virtually every data block on the network will have many unrelated uses. And because of those many unrelated uses, one cannot say that transmission of any given block indicates one of those uses over any of the other uses.

I haven't made any sort of philosophical argument that the illicit material does not exist until the data blocks are combined in memory, or anything like that. I have only made the argument that because each data block has many unrelated uses, it is difficult to prove who is using it for which uses. As I said in another comment, you can start to figure out things if you keep track of patterns of which blocks are downloaded when; but tracking that is a lot harder than tracking a single block. And that's the point of the system; to make it hard to prove specific assertions about who did what.

richardwhiuk · on May 1, 2020

How were A and C constructed?

Were they constructed from the derivative work? Or did they choose or generate A and C so that A xor C produces the derived work?

In all of those cases, they are still derivative works.

kelnos · on May 2, 2020

It feels like you don't really understand how the laws work, perhaps?

Intent is very very important.

If I'm hosting stuff for the express purpose of allowing people to download what I have, download what someone else has, and then do math with the two parts to get a copyrighted work, a judge (& jury) will have more than enough to rule against all of the people involved.

> You can laugh all you want, but you can't convict anyone.

That is an incredibly naive and incorrect view of how the (US, at least) legal system works.

Also consider that in a civil suit, the burden of proof is much, much lower than in a criminal case. So the word "convict" does not apply here.

The only thing here unique to OFFSystem is that if you have hundreds or thousands of nodes, it's (theoretically) not possible to prove who "owns" A and who owns B, so who do you take to court?

But then we get back to intent. The current legal consensus is that if a tool has legitimate non-infringing uses that are actually realistically happening in practice, then it gets very muddy and a plaintiff has to work hard to prove that any particular user is responsible for infringement. But in the case of OFFSystem, its express purpose is to enable copyright infringement. That is literally the stated purpose of the entire thing. Legally, anyone who participates in that system is assumed to be there to enable copyright infringement, and can be held liable, if not for direct infringement, then contributory infringement.

That's how the law actually works, for better or worse. And even if you're in the clear, do you want to be dragged through court and be responsible for legal fees, and the general uncertainty of what's going to happen to you?