I am learning about Android's dm-verity protection and am trying to understand how dm-verity uses the hash tree to validate a single block.
https://source.android.com/security/verifiedboot/dm-verity says:
Instead, dm-verity verifies blocks individually and only when each one is accessed. When read into memory, the block is hashed in parallel. The hash is then verified up the tree. And since reading the block is such an expensive operation, the latency introduced by this block-level verification is comparatively nominal.
After the block is read and hashed, it is verified up the tree. But how can I verify the root hash when I have not read all the blocks? I can verify only the part of the tree I have read, and that means I do not have to go up to the root hash.
I do not understand why a hash tree is used. A StackOverflow thread says that the main reason for using hash trees is when the hash is computed for every block and then for the whole file again; I don't get why it is used here.
So how is it actually implemented? My assumption is that when a block is loaded into memory, Android checks only that block's branch, and the remaining values are taken from the pre-computed hash tree. But then I don't see the reason for using the tree. I would just store the block hash values and, after reading and hashing a block, compare only that hash.
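To make that assumption concrete, here is a minimal Python sketch of checking one branch against a pre-computed tree. It is a simplified model, not the real dm-verity on-disk format: there is no salt, the names `build_tree` and `verify_block` are made up for illustration, and the fanout of 128 assumes SHA-256 digests packed into 4K hash blocks.

```python
import hashlib

BLOCK_SIZE = 4096   # data block size, as in the question
FANOUT = 128        # assumption: 4096 / 32 SHA-256 digests per hash block


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Return a list of layers; layers[0] holds one digest per data block."""
    layers = [[h(b) for b in blocks]]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([h(b"".join(prev[i:i + FANOUT]))
                       for i in range(0, len(prev), FANOUT)])
    return layers


def verify_block(index, block, layers, trusted_root):
    """Check one block using the stored (untrusted) tree and the trusted root."""
    if h(block) != layers[0][index]:
        return False
    # Walk up: re-hash the group of stored digests containing our entry
    # and compare it with the parent entry one level higher.
    for level in range(len(layers) - 1):
        group = index // FANOUT
        siblings = layers[level][group * FANOUT:(group + 1) * FANOUT]
        if h(b"".join(siblings)) != layers[level + 1][group]:
            return False
        index = group
    # Only this last comparison involves a value that is not read from storage.
    return layers[-1][0] == trusted_root
```

In this model only the one block being verified is read and hashed; every other value on the path comes from the stored tree, and the stored values are tied to the trusted root by re-hashing one group of digests per level.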
Edit: Let's assume this implementation:
- split the whole block device into blocks of 4K size.
- hash each block and concatenate the hashes (this creates layer 0 of dm-verity)
- store the hashes (layer 0) at the end of block device
Now, when I want to verify a 4K block loaded into memory, I find the block's position and compare the hash of the loaded block with the stored hash.
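The flat scheme described above can be sketched as follows. This is only an illustration of my proposed implementation, assuming SHA-256; the function names `build_layer0` and `verify_flat` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # 4K blocks, as in the steps above


def build_layer0(device: bytes):
    """Hash every 4K block of the device; the result is layer 0."""
    return [hashlib.sha256(device[i:i + BLOCK_SIZE]).digest()
            for i in range(0, len(device), BLOCK_SIZE)]


def verify_flat(index: int, block: bytes, layer0) -> bool:
    """Compare the hash of a loaded block with the stored hash at its position."""
    return hashlib.sha256(block).digest() == layer0[index]
```

Note that `verify_flat` is only as trustworthy as `layer0` itself: if the stored hashes sit unprotected next to the data, anyone who can change a block can also change its entry in `layer0`, and the check still passes.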
In a situation where you only have the Merkle root available, using a tree makes sense. But in Android we have the whole tree, so why not just use layer 0 (the implementation above) and throw away the rest?
And while writing this, I think I came up with an answer. Android stores the whole hash tree at the end of the device, but the tree itself is not signed; only the dm-verity table (metadata) that contains the root hash is signed. So, in my implementation, I would have to sign the whole of layer 0. That is probably a waste of resources, so it is better to use the tree.
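A rough size estimate shows the difference between the two options. The sketch below assumes SHA-256 (32-byte digests) and 4K hash blocks, counts raw digest bytes only, and ignores any padding or salt a real layout may add; `layer_sizes` is a hypothetical helper.

```python
import math


def layer_sizes(device_bytes: int, block_size: int = 4096, digest_size: int = 32):
    """Return the size in bytes of each tree layer, from layer 0 up to the root."""
    fanout = block_size // digest_size        # digests that fit in one hash block
    n = math.ceil(device_bytes / block_size)  # number of data blocks
    sizes = []
    while True:
        sizes.append(n * digest_size)
        if n == 1:
            break
        n = math.ceil(n / fanout)
    return sizes
```

For a 1 GiB device, `layer_sizes(1 << 30)` returns `[8388608, 65536, 512, 32]`: layer 0 alone is 8 MiB, while the root is a single 32-byte hash. Covering layer 0 with one signature would mean reading and hashing all 8 MiB before trusting any entry, whereas with the tree only the 32-byte root has to be covered by the signed table.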