Journal Journal: What constitutes a good hash anyway? 3
But another question one must ask is why there are so many applicants for this, when NESSIE (the European version of this challenge) managed just one? Has the mathematics become suddenly easier? Was this challenge better-promoted? (In which case, why did Slashdot only mention it on the day it closed?) Were the Europeans' criteria that much tougher to meet? If so, why did NIST loosen the requirements so much that they were overwhelmed?
These questions, and others, look doomed to not be seriously answered. However, we can take a stab at the criteria and evaluation problem. A strong cryptographic hash must have certain mathematical properties. For example, the distance between any two distinct inputs must be unconnected to the distance between the corresponding outputs. Otherwise, knowing the output for a known input and the output for an unknown input will tell you something about the unknown input, which you don't want. If you have a large enough number of inputs and plot the distance of inputs in relation to the distance in outputs, you should get a completely random scatter-plot. Also, if you take a large enough number of inputs at fixed intervals, the distance between the corresponding outputs should be a uniform distribution. Since you can't reasonably test 2^512 inputs, you can only apply statistical tests on a reasonable subset and see if the probability that you have the expected patterns is within your desired limits. These two tests can be done automatically. Any hash that exhibits a skew that could expose information can then be rejected equally automatically.
This is a trivial example. There will be other tests that can also be applied automatically that can weed out the more obviously flawed hashing algorithms. But this raises an important question. If you can filter out the more problematic entries automatically, why does NIST have a problem with the number of entries per-se? They might legitimately have a problem with the number of GOOD entries, but even then all they need to do is have multiple levels of acceptance and an additional round or two. eg: At the end of human analysis round 2, NIST might qualify all hashes that are successful at that level as "sensitive-grade" with respect to FIPS compliance, so that people can actually start using them, then have a round 3 which produces a pool of 3-4 hashes that are "classified-grade" and a final round to produce the "definitive SHA-3". By adding more rounds, it takes longer, but by producing lower-grade certifications, the extra time needed to perform a thorough cryptanalysis isn't going to impede those who actually use such functions.
(Yes, it means vendors will need to support more functions. Cry me a river. At the current scale of ICs, you can put one hell of a lot of hash functions onto one chip, and have one hell of a lot of instances of each. Software implementations are just as flexible, with many libraries supporting a huge range. Yes, validating will be more expensive, but it won't take any longer if the implementations are orthogonal, as they won't interact. If you can prove that, then one function or a hundred will take about the same time to validate to accepted standards. If the implementations are correctly designed and documented, then proving the design against the theory and then the implementation against the design should be relatively cheap. It's crappy programming styles that make validation expensive, and if you make crappy programming too expensive for commercial vendors, I can't see there being any problems for anyone other than cheap-minded PHBs - and they deserve to have problems.)