Last edited by stephenwong on 2014-8-31 10:03
What you guys said about using a GPU, blah blah blah, to 'easily' find the original plaintext of the hash codes captured by PopVote is simply wrong. You made the wrong assumption that there is ONE hash for the HKID and ONE hash for the telephone number, and that, because the 'alphabets' of the HKID and telephone number are limited, you can 'easily' exhaust all the combinations and compare them with the hash (to find out the original HKID and telephone number of the voters).
There must be a data structure that holds the HKID and telephone number before hashing is applied. Usually the data structure is of some fixed size (or padding is applied). For example, if the data structure is 16 bytes (128 bits) in size, you don't know which bytes correspond to the HKID and which bytes store the telephone number. The valid plaintexts won't cover all 2^128 combinations, because some combinations are not valid (e.g. no such HKID, no such telephone number), but you still can't reduce your brute-force search space, because you don't know the data structure in the first place. Assuming you can achieve 10,000M hashes per second, you would still need roughly 1E21 years to try all 2^128 candidates for a given hash.
This has nothing to do with 'salt'. Without 'salt', if you have enough time, you can generate a 'dictionary' of the hash codes corresponding to all 2^128 possible plaintext combinations, and then find the plaintext of a hash by looking it up in your 'dictionary'. Adding 'salt' just makes the 'dictionary' approach even more expensive: if you add a 2-byte 'salt', you multiply the work by 65536, because there are 65536 dictionaries to generate.
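The arithmetic above can be checked with a short sketch. It assumes the 16-byte (128-bit) structure and the 10,000M hashes per second from the example; the function names are made up for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def brute_force_years(bits, hashes_per_second):
    """Years needed to try every candidate of a `bits`-bit plaintext."""
    return 2 ** bits / hashes_per_second / SECONDS_PER_YEAR


def salted_dictionary_count(salt_bytes):
    """Number of separate dictionaries needed for a salt of `salt_bytes` bytes."""
    return 2 ** (8 * salt_bytes)


# 16-byte structure, 10,000M hashes per second (figures from the example above)
years = brute_force_years(128, 10_000 * 10**6)
print(f"{years:.2e} years")        # on the order of 1e21 years
print(salted_dictionary_count(2))  # 65536 dictionaries for a 2-byte salt
```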
But hey, that was just an example IF the data structure is 16 bytes in size. Who knows whether the data structure is really 64 bytes or 128 bytes in size? You cannot tell from the hash how big the original plaintext was!
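A quick way to see that the digest says nothing about the input size is the sketch below. It uses SHA-256 purely as an assumption, since the hash function PopVote used is not stated here, and the all-zero inputs are placeholders.

```python
import hashlib


def digest_length(plaintext: bytes) -> int:
    """Length in bytes of the SHA-256 digest of `plaintext`."""
    return len(hashlib.sha256(plaintext).digest())


# Hypothetical plaintext structures of 16, 64 and 128 bytes
for size in (16, 64, 128):
    print(size, "byte input ->", digest_length(b"\x00" * size), "byte digest")
```

Every input gives a digest of the same length, so the attacker has to guess the structure size as well as its layout.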
You guys also question how the app works, and whether the hashing and encryption were done on the client or on the server. There are a lot of possibilities, and even if the hashing and encryption were done on the client, it can be designed so that you won't be able to cheat, say, by adding a round of asymmetric cryptography.
You said the server won't be able to handle the encryption / hashing? You must be joking. There were 700k voters in the last PopVote. A simple Intel i5 can easily sustain 20MB to 30MB of AES encryption per second. Just as with BitLocker in Windows, the bottleneck is usually still the speed of your hard disk. Those 700k voters did not all vote in the same second (they were spread over a few weeks), so don't worry about the server; worry more about the DDoS attack from the North!
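To put rough numbers on the server load, here is a back-of-the-envelope sketch. The 700k voters and the 20MB per second figure come from the post; the 14-day voting window and the 1 KB per vote record are assumptions picked only for illustration.

```python
VOTERS = 700_000                   # from the post
AES_BYTES_PER_SECOND = 20 * 10**6  # lower bound quoted in the post
VOTING_DAYS = 14                   # assumption: "a few weeks" taken as two weeks
RECORD_BYTES = 1024                # assumption: 1 KB of data per vote


def average_votes_per_second(voters, days):
    """Average arrival rate if votes are spread evenly over the window."""
    return voters / (days * 24 * 3600)


def total_encryption_seconds(voters, record_bytes, bytes_per_second):
    """CPU time needed to encrypt every vote record once."""
    return voters * record_bytes / bytes_per_second


print(average_votes_per_second(VOTERS, VOTING_DAYS))  # well under 1 vote per second
print(total_encryption_seconds(VOTERS, RECORD_BYTES, AES_BYTES_PER_SECOND))  # about 36 seconds in total
```

Even with much bigger records or a much shorter window, encryption throughput is nowhere near the limiting factor.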