zielmicha (zlmch)

ZeroTier proof-of-work function is not memory-hard

09 Jan 2016

ZeroTier is "a smart ethernet switch for earth". It creates an encrypted virtual L2 network that can be joined by any authorized device. Devices are identified by 5-byte strings, which are hashes of their Curve25519 public keys.

The hash function used for identity derivation is not a standard one like SHA-256 - with only 2^40 possible values, it would be too easy to spoof someone's identity. Instead, a custom proof-of-work function is used. This function is claimed to be memory-hard (i.e. requiring a lot of memory to compute) - this property would make it harder to crack on GPUs or custom ASICs.

I show that the proof-of-work function is not, in fact, memory-hard. For now, this is a mostly theoretical result - the computation still takes a long time. Most importantly, the full public keys are cached by devices and by centralized root servers, so even after spoofing an identity, an attacker would still need a lot of luck and the cooperation of ZeroTier Inc, which runs the root servers. I notified ZeroTier Inc about 3 months ago and they weren't too worried about the consequences of my 'attack' (an opinion I agree with), saying they will eventually redesign the identity derivation function during the transition to post-quantum ciphers.

The hash function

The hash function used for identity derivation is not specified anywhere, so I have extracted the following description from the ZeroTier source code.

computeMemoryHardHash(publicKey: string) -> string
   digest = sha512(publicKey) # 64 bytes
   initialize Salsa20 stream with IV=digest[0:32] key=digest[32:64]
   genmem = empty 2MB array
   current = 64 zero bytes
   # the genmem array is initialized with CBC-like scheme
   for i=0..2000000/64: # loop 1
     current = encrypt with salsa(current)
     genmem[64*i:64*(i+1)] = current

   for i=0..2000000/16: # loop 2, consuming two 8-byte words of genmem per iteration
     num1 = convert genmem[16*i:16*i+8] to big-endian integer
     num2 = convert genmem[16*i+8:16*(i+1)] to big-endian integer
     idx1 = num1 % 8
     idx2 = num2 % (2000000/8)
     swap digest[8*idx1:8*(idx1+1)], genmem[8*idx2:8*(idx2+1)]
     digest = encrypt with salsa(digest)

   if digest[0] >= 17:
     if generating identity, retry with different public key
     if validating identity, abort
     (this part makes it cheaper to validate an identity than to generate one - a defining characteristic of a proof-of-work function)

   return digest[59:64] # last 5 bytes
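
To make the description concrete, here is a small runnable model of this structure in Python. It is a sketch under two simplifying assumptions: the Salsa20 stream is replaced by a SHA-512 counter-mode stand-in, and the 2 MB buffer is shrunk to 2 KB so it runs instantly; the control flow otherwise follows the pseudocode.

```python
import hashlib

MEM = 2048  # stand-in for the real 2 MB buffer; the structure is identical


class ToyStream:
    """SHA-512 in counter mode, as a stand-in for the Salsa20 stream.

    Only the XOR-keystream structure matters here; this is NOT the
    real cipher used by ZeroTier.
    """

    def __init__(self, seed):
        self.seed = bytes(seed)
        self.counter = 0

    def encrypt(self, data):
        ks = b""
        while len(ks) < len(data):
            ks += hashlib.sha512(
                self.seed + self.counter.to_bytes(8, "big")).digest()
            self.counter += 1
        return bytes(a ^ b for a, b in zip(data, ks))


def compute_memory_hard_hash(public_key):
    digest = bytearray(hashlib.sha512(public_key).digest())  # 64 bytes
    stream = ToyStream(digest)  # keyed by the digest, as in the pseudocode

    # loop 1: fill genmem with a CBC-like chain of cipher blocks
    genmem = bytearray(MEM)
    current = bytes(64)
    for i in range(MEM // 64):
        current = stream.encrypt(current)
        genmem[64 * i:64 * (i + 1)] = current

    # loop 2: data-dependent swaps between digest and genmem,
    # consuming two 8-byte words of genmem per iteration
    i = 0
    while i < MEM // 8:
        num1 = int.from_bytes(genmem[8 * i:8 * (i + 1)], "big"); i += 1
        num2 = int.from_bytes(genmem[8 * i:8 * (i + 1)], "big"); i += 1
        idx1, idx2 = num1 % 8, num2 % (MEM // 8)
        digest[8 * idx1:8 * (idx1 + 1)], genmem[8 * idx2:8 * (idx2 + 1)] = \
            genmem[8 * idx2:8 * (idx2 + 1)], digest[8 * idx1:8 * (idx1 + 1)]
        digest = bytearray(stream.encrypt(digest))

    # (the real function would also retry here unless digest[0] < 17)
    return bytes(digest[59:64])  # the 5-byte identity
```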

We can see that computeMemoryHardHash tries to shuffle memory around in a way that depends on the results of previous computation. However, if we look at this code long enough, we can see that it sometimes fails miserably. Specifically, if during the last iteration of loop 2 idx1 == 7, then the result (the last 5 bytes of digest) depends only on X=genmem[8*idx2:8*(idx2+1)]. This happens because Salsa20 is a stream cipher - encryption is just XOR with a keystream - so digest = encrypt with salsa(digest) encrypts digest[59:64] in a way that depends only on the old digest[59:64].
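
This key observation - that with an XOR keystream, byte i of the ciphertext depends only on byte i of the plaintext - is easy to check directly. A toy demonstration (the keystream here is an arbitrary fixed 64 bytes, not real Salsa20 output):

```python
import hashlib

# a fixed 64-byte keystream; for the argument it only matters that
# encryption is plaintext XOR keystream, as in any stream cipher
keystream = hashlib.sha512(b"some fixed key material").digest()

def stream_encrypt(plaintext):
    return bytes(p ^ k for p, k in zip(plaintext, keystream))

# two 64-byte "digests" that differ everywhere except the last 5 bytes
a = bytes(range(64))
b = bytes(63 - i for i in range(59)) + bytes(range(59, 64))
assert a[59:64] == b[59:64] and a[:59] != b[:59]

# after encryption the last 5 bytes still agree: they depended only
# on the old last 5 bytes, not on the rest of the digest
assert stream_encrypt(a)[59:64] == stream_encrypt(b)[59:64]
```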

At first glance, X still can't be computed without lots of memory - it could be changed at any time by the swap in loop 2. However, it turns out that most values of genmem won't ever be modified by loop 2. The loop runs for 2000000/16 iterations and each time pseudorandomly selects one of 2000000/8 indexes to modify - so any particular index will be modified with probability 1 - (1 - 8/2000000)^(2000000/16) ≈ 39%.
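
The arithmetic behind these figures can be checked in a few lines (the constants are the loop bounds from the pseudocode):

```python
import math

ITER = 2000000 // 16   # loop 2 iterations
SLOTS = 2000000 // 8   # 8-byte slots in genmem that a swap can hit

# probability that one particular slot is never chosen as idx2
p_untouched = (1 - 1 / SLOTS) ** ITER
print(round(p_untouched, 4))       # 0.6065, i.e. about 61% untouched
print(round(1 - p_untouched, 4))   # 0.3935, i.e. about 39% modified

# same thing via the exponential approximation, since ITER/SLOTS = 1/2
assert abs(p_untouched - math.exp(-0.5)) < 1e-3
```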

With this knowledge, we can sketch a memory-easy version of the hash function:

computeMemoryEasyHash(publicKey: string) -> string
   digest = sha512(publicKey) # 64 bytes
   initialize Salsa20 stream with IV=digest[0:32] key=digest[32:64]

   current = 64 zero bytes
   # replay loop 1, keeping only the latest 64-byte block instead of storing genmem
   for i=0..2000000/64: # loop 1
     current = encrypt with salsa(current)

   # equal to the last iteration's num1 with probability 61%
   lastnum1 = convert current[48:56] to big-endian integer
   # equal to the last iteration's num2 with probability 61%
   lastnum2 = convert current[56:64] to big-endian integer

   lastidx1 = lastnum1 % 8
   lastidx2 = lastnum2 % (2000000/8)

   if lastidx1 != 7: # lastidx1 == 7 with probability 12%
     retry

   initialize Salsa20 stream with IV=digest[0:32] key=digest[32:64]
   # replay loop 1 again, just far enough to reach the block that
   # contains genmem[8*lastidx2:8*(lastidx2+1)]
   current = 64 zero bytes
   for i=0..lastidx2/8:
     current = encrypt with salsa(current)

   # 61% chance that swappedvalue == genmem[8*lastidx2:8*(lastidx2+1)]
   swappedvalue = current[8*(lastidx2%8):8*(lastidx2%8+1)]

   # a full attack would still XOR these bytes with the final keystream
   # block of loop 2; that also needs no memory, as Salsa20 is seekable
   return swappedvalue[3:8]

Overall, the probability that computeMemoryEasyHash returns the correct result is about 1/35.

Suppose Eve wants to spoof Alice's identity. She needs to generate the same identity as Alice, but from a private/public key pair known to her (a "preimage"). She runs computeMemoryEasyHash on her GPU or ASIC. Whenever the function returns Alice's identity, Eve computes the identity again using computeMemoryHardHash (on a CPU). If both functions return the same result, she has what she wants. Otherwise she just repeats the whole procedure.

computeMemoryEasyHash is correct with probability 1/35, but Eve also needs to fulfill the proof-of-work condition (digest[0] < 17), so she will need to call computeMemoryHardHash about 35*(256/17) ≈ 527 times to find the preimage.
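
The expected number of expensive verification calls follows directly from the two probabilities above:

```python
# Eve only runs computeMemoryHardHash on candidates where
# computeMemoryEasyHash already matched. A match is a true collision
# with probability 1/35, and a colliding key yields a *valid* identity
# (digest[0] < 17) with probability 17/256.
easy_success = 1 / 35    # computeMemoryEasyHash correctness
pow_pass = 17 / 256      # hashcash condition digest[0] < 17

expected_hard_calls = (1 / easy_success) * (1 / pow_pass)
print(round(expected_hard_calls))   # 527
```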

The whole operation requires 2^26*2^40*(256/17)*35 ≈ 2^75 Salsa20 operations. Even with ASIC acceleration (assuming 60 Gops/s), the computation would take about 20,000 years. Algorithmic improvements to computeMemoryEasyHash are almost surely possible.

A rather messy proof of concept is available.