Tough decisions
4 minutes to read
Champagne for my real friends, real pain for my sham friends.
Challenge contributed by CryptoHack
Challenge files:
Source code analysis
We are fiven a Python script that takes the flag as bits and, for each bit, it prints 6 outputs of either the real
function (bit 0
) or the fake
function (bit 1
):
if __name__ == "__main__":
s = sample_key()
for b in get_flag():
print([[real, fake][b](s) for _ in range(6)])
Therefore, the objective is to find a way to differentiate between real
and fake
outputs in order to learn the bits of the flag.
Learning With Errors
The fake
function returns just random bytes:
def fake(s):
a = secrets.token_bytes(len(s))
return a, secrets.randbelow(256)
Whereas real
uses LWE (Learning With Errors) using s
as key:
def real(s):
a = secrets.token_bytes(len(s))
e = sample_noise(7) - 64
return a, (dot(a, s) + e) % 256
In LWE, the encryption works as follows:
$$ b = \vec{a} \cdot \vec{s} + e \mod{n} = \sum_{i = 1}^N (a_i \, s_i) + e \mod{n} $$
Where we only know vector $\vec{a}$, modulus $n = 256$ and the ciphertext $b$. The dimension of the vectors $\vec{a}$ and $\vec{s}$ is $16$, and all their elements are 8-bit numbers:
def sample_key():
# LWE estimator told me I'd have more bits of security than bits in my key...
return secrets.token_bytes(16)
Error distribution
Moreover, we know the possible values for the error $e$:
def drbg():
# wuch wow, very deterministic
return secrets.randbelow(2)
def sample_noise(n):
e = 0
for _ in range(n):
e |= drbg()
e <<= 1
return e
Since e = sample_noise(7) - 64
, we observe that all errors are even numbers:
$ python3 -q
>>> import secrets
>>>
>>> def drbg():
... # wuch wow, very deterministic
... return secrets.randbelow(2)
...
>>>
>>> def sample_noise(n):
... e = 0
... for _ in range(n):
... e |= drbg()
... e <<= 1
... return e
...
>>> {sample_noise(7) - 64 for _ in range(10000)}
{0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, -64, -62, -60, -58, -56, -54, -52, -50, -48, -46, -44, -42, -40, -38, -36, -34, -32, -30, -28, -26, -24, -22, -20, -18, -16, -14, -12, -10, -8, -6, -4, -2}
Solution
So, we know that $e \in \{-128, -126, \dots, -2, 0, 2, \dots, 126, 128\}$, and this is the oracle we need.
Moreover, since the flag is formed by ASCII bytes, we know that the most significant bit of every byte is $0$. So, we already know some real
outputs.
Using this knowledge, we can form this equation using $m$ samples:
$$ \vec{b} = A \cdot \vec{s} + \vec{e} \mod{256} $$
Which is equivalent to this matrix equation:
$$ \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix} = \begin{pmatrix} a_{1, 1} & \dots & a_{1, N} \\ \vdots & \ddots & \vdots \\ a_{m, 1} & \dots & a_{m, N} \end{pmatrix} \cdot \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_N \end{pmatrix} + \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_m \end{pmatrix} \mod{256} $$
It would be nice to find the value of $\vec{s}$. There are challenges related to LWE where it is possible to find $\vec{s}$ by solving the Closest Vector Problem (CVP) using lattice reduction algorithms. However, it is not possible in this challenge because the errors $e$ are very large.
Instead, we can use the fact that all errors are even, so we can apply $\mod{2}$ and find $\vec{s} \mod{2}$:
$$ \vec{b} = A \cdot \vec{s} \mod{2} \iff \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix} = \begin{pmatrix} a_{1, 1} & \dots & a_{1, N} \\ \vdots & \ddots & \vdots \\ a_{m, 1} & \dots & a_{m, N} \end{pmatrix} \cdot \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_N \end{pmatrix} \mod{2} $$
Therefore, we can find $\vec{s} \mod{2}$ with at least 16 pairs of vectors $\vec{a_i}$ and ciphertexts $b_i$.
Once we have a value for $\vec{s} \mod{2}$, we can go through all the outputs we have, take each vector $\vec{a}$, multiply by $\vec{s} \mod{2}$ and check if the ciphertext $b \mod{2}$ matches. If any of the 6 outputs we have for each bit of the flag does not match, we are sure that it is a fake
output. Otherwise, if every 6 outputs match, with high probability we have a real
output.
Flag
If we implement the above procedure, we will find the flag:
$ python3 solve.py
ECSC{D3c1s10n4l_LWE_m0d_2_445fb8dc91}
The full script can be found in here: solve.py
.