Deniable plaintext: sharing file without anyone being able to prove it

I’d like to describe a very simple scheme to publish messages in a such a way that you can deny you’re publishing them. I call it “deniable plaintext”, by reference to “deniable encryption”, because no encryption or authentication is involved. Despite some search, I haven’t found this scheme described in this specific of file sharing.

== Scenario ==

In short, Alice takes the file F she wants to publish and a decoy (“chaff”) file D, she’s ready to admit publishing.

I’d like to describe a very simple scheme to publish messages in a such a way that you can deny you’re publishing them. I call it “deniable plaintext”, by reference to “deniable encryption”, because no encryption or authentication is involved. Despite some search, I haven’t found this scheme described in this specific of file sharing.

== Scenario ==

In short, Alice takes the file F she wants to publish and a decoy (“chaff”) file D, she’s ready to admit publishing. F and D are then used to derive four files a, b, c and d such that by recombining a and c or b and d, you get the decoy file D; by recombining the files a and b or c and d, you get the original file F. Alice then publishes the file a. Ideally, someone else (Bob) should publish the file b (Alice can send the files b and d to Bob over a secure channel).

If someone else, Charlie, wants to get the original file F, he downloads the file a from Alice and the file b from Bob, and recombines them to get the file F.

Now, Eve wants to prove that either Alice or Bob (or both) are somehow publishing the file F. Eve downloads the files a from Alice and b from Bob. She then comes to Alice and says: “Look, you’re publishing file F. I know because by combining your file a with the file b, I obtain file F”. Alice then exhibits file c and replies: “You’re wrong. I’m publishing file D. Look, by combining files a you got from me and file c, we obtain the file D”.

Eve then comes to Bob and says: “Look, you’re publishing file F. I know because by combining your file b with the file a, I obtain file F”. Bob then exhibits file d and replies: “You’re wrong. I’m publishing file D. Look, by combining files b you got from me and file d, we obtain the file D”.

== How it works ==

We now turn to a way to construct the files, that is, we look for the “split” and “combine” functions such as

split(F, D) -> {a, b, c, d}
combine(a, b) -> F
combine(c, d) -> F
combine(a, c) -> D
combine(b, d) -> D

a is completely arbitrary, thus a random file is not a bad idea. There are many ways to combine the files to achieve the desired result. I use XOR, but clearly (by looking at the equations below), any abelian group would do it.

Thus, by treating files as ”same length” bit strings, we form (I use ^ to note XOR):

b = a ^ F

c = a ^ D

d = b ^ D

And the combine is XOR also, so we have:

a ^ b = a ^ (a ^ F) = (a ^ a) ^ F = F

c ^ d = (a ^ D) ^ ((a ^ F) ^ D) = (a ^ D) ^ (a ^ D) ^ F = F

a ^ c = a ^ (a ^ D) = D

b ^ d = b ^ (b ^ D) = D

== Working code ==

I wrote a tiny peace of code (C, unix environment) attached.

Note that the files F and D must be of the same length. The code doesn’t process error gracefully and isn’t user-friendly. It is intended to be a proof of concept, not something usable.

== Some related works ==

The main source of inspiration for the present scheme to a method to derive n random files from an original file F in such a way that any p <= n files can be combined to recreate the original F files. I was introduced to it during my MS course (in 1990). I think the author was Shamir, but I couldn't find a reference. According to wikipedia, the concept of "deniable encryption" was coined by Julian Assange and Ralf Weinmann in the Rubberhose filesystem. Their goal is to have data buried in an encrypted file with two keys. One you can reveal if forced to do so will extract the decoy data. The other one let you access the real (secret) data. The context is a bit different from deniable plaintext, since in deniable plaintext you do publish (potentially) both sets of data.