[dlc-dev] CET collision attacks

Lloyd Fournier lloyd.fourn at gmail.com
Mon Mar 29 07:02:26 CEST 2021

Hi all,

A few days ago Nadav pointed out that when you add attestation points from
the same oracle in the "no hash"[2] scheme you will get the same point even
if the individual points were different.
This means you can get a CET encryption key that is the same for a
different set of outcomes. This is bad.

Observe a bet on a number 0-3 decomposed into two binary nonces (R_0, R_1).
The encryption keys for each outcome would be:

0: R_0 + 0*X + R_1 + 0*X
1: R_0 + 1*X + R_1 + 0*X
2: R_0 + 0*X + R_1 + 1*X
3: R_0 + 1*X + R_1 + 1*X

Note that CETs 1 and 2 are the equal? I call this a CET collision.

## How did you miss this Lloyd?

I actually have not studied the case of combining points from the same
oracle. I assumed it would be simpler than the case of having different
oracles and so security would follow on from that. It turns out this was a
very very wrong assumption.

I also never considered an adversary's ability to "collide" CET encryption
keys so that they are the same. I only found that if a CET is used then all
the oracle's whose attestation points were combined to create it did indeed
participate in the attestation (their contribution cannot be canceled out
as long as you have these proofs of knowledge on each R).
But if the decryption key for a CET is the same as another one then (from
the point of view of the protocol) they have actually attested to two
contradictory things.

In this case the oracles are not to blame for the equivocation so this
breaks fraud proofs in the case of multiple nonces from the same oracle.
Clearly in the no hash scheme these collisions happen without any malicious
behavior at all but that doesn't mean they can't happen in the hash scheme.
This is what this post is about.

## Collision attacks without oracle participation fundamentals

Consider an adversary who is going to find a magic combination of
events/nonces to combine to create a CETs such that if the oracle(s) attest
to one thing the adversary is going to be able to decrypt more than one CET
from the information available to it. The oracles are behaving completely
honestly in this scenario. What are the constraints on this attack?

Conjecture   #1 The two CETs will have exactly the same encryption key.

If they didn't then the adversary would have created a combination of
attention points where they know the scalar offset between the two groups
that caused the collision. They don't know the discrete logarithms of the
component points so this should be impossible..

Conjecture #2 The points that make up the two CETs will be derived from the
same nonces.

First let's denote sum(R + c * X) = R_0 + c_0 * X + R_1 + c_1 * X ... R_n +
c_n * X for n possible nonces.

If you know sum(R_a + c_b*X) = sum(R_b + c_b*X) where not every R_a on the
left is the same as every R_b on the right then you have figured out an
equality between two groups of independent points which means you can find
discrete logarithms (same reason as Conjecture #1).

The two conjectures above allow us to hone in on what the adversary must be
doing. Since they are using the same nonces then we can remove those from
the equation:

1. sum(R_a + c_a*X) = sum(R_b + c_b*X)
2. sum(c_a*X) = sum(c_b*X)                      # Every R_a = R_b so
subtract from both sides
3. sum(c_a) = sum(c_b)                             # Divide by X

Ok so the adversary MUST be creating the collision by having two sets of
c's (of equal size) that are different but sum to the same value. Clearly
in the no hash scheme this is very very easy to create such sums but what
about the hash scheme?

## Collision attacks on original DLC scheme

In the original proposal "c"s were computed as H(R || X || o) where o is
the outcome (the no-hash hash scheme can be thought of as just c = o).
So to cause a collision the attacker has to find two sets of o's (items in
the set denoted as o_a and o_b) against a fixed set of Rs  such that
sum(H(R || X || o_a)) = sum(H(R || X || o_b)) which can be simplified to:

sum(H(R || X || o_a) - H(R || X || o_b)) =  0

Just to be clear this means finding o_a0, ..., o_an and o_b0, ..., o_bn
such that:

H(R_0 || X || o_a0) - H(R_0 || X || o_b0) + H(R_1 || X || o_a1) - H(R_1 ||
X || o_b1) + ... + H(R_0 || X || o_an) - H(R_0 || X || o_bn) = 0

On average how many nonces and outcomes per nonce do we need before there
is a collision?

Each term like H(R_0 || X || o_a0) - H(R_0 || X || o_b0) can take on m * (m
- 1) possible values where m is the number of possible outcomes.
For n nonces we have (m^2 - m) ^ n possible CET encryption keys. Let's say
we have m = 2^10 = 1024 outcomes per nonce this means to get at least 2^256
possible CETs we need roughly n = 13 nonces to be involved on average to
have a collision.

But how hard would it be to find a collision if one were to exist? Well it
looks a bit like Wagner's k-sum problem [1] where k is the number of nonces
but I think it's best to leave out estimating this for now since it
requires a more in-depth study.

## Don't hash the outcomes

What if instead of H(R || X || o) we computed c as o * H(R || X). A bit of
a mid-way between the hash and no-hash schemes. On the surface this would
seem to fix the original problem with c = o because the H(R || X) term
delinearizes it so collisions don't happen naturally. It also preserves
most of the speedup that the no hash scheme offers since each CET
encryption key can be produced by a single addition from the previous one.
The collision problem for this scheme becomes to find:

1. sum(H(R || X) *o_a) = sum(H(R|| X) * o_b)
2. sum(H(R || X)* o_a - H(R || X) * o_b) = 0
3. sum(H(R || X) * (o_a - o_b) ) = 0

So what are the possible values of (o_a - o_b)? If there are m outcomes
then it should be 2 * (m-1). e.g. if we have 1024 outcomes the values could
be -1023..1023 (not including 0). This is far fewer than the m^2 - m we
have in the original.
For m = 2^10 outcomes per nonce we would need around n = 24 nonces on
average to have a collision or roughly double what the original needs for a
collision to exist.

To actually find a collision it looks like a subset sum kind of problem
except with being able to choose coefficients from -m..m. I am not going to
guess how difficult this is yet.

## What about malicious oracles

A malicious oracle may choose their R values such that there is a
collision. I *think* this would be much more efficient than a malicious
user just finding a collision on an existing set of nonces. In order for
fraud proofs to work we would have to identify what threshold of nonces is
it likely to be a malicious user who created the collision by combining a
bunch of nonces together or whether they had help from a malicious oracle.

Of course we really want to avoid this whole game altogether. I am hopeful
that with the "nonce-only-hash" H(R ||X) * o scheme it might be hard for
the oracle to find collisions no matter how many Rs they try. But showing
this to be the case will take more work. It feels like the full-hash scheme
will struggle to do this.


Please review my claims here and happy to discuss at the next meeting.
Sorry for the awful notation.



[1] https://joinmarket.me/blog/blog/avoiding-wagnerian-tragedies/
[2] https://github.com/LLFourn/dlc-sec/blob/master/main.pdf (section 4)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailmanlists.org/mailman/private/dlc-dev/attachments/20210329/d68beff6/attachment.htm>

More information about the dlc-dev mailing list