It's not particularly clear from the story, nor from the abstract, how they actually controlled for natural bias in the random selection. Did each professor receive the same candidate twice, once as a male and once as a female (hopefully spaced far enough apart not to be noticed)?
It sounds like the data set was randomly generated once and then used for the entire study. It's quite possible that the data set simply had a lopsided pool of better-qualified "males" versus "females." Since they don't state the number of students in the pool in either the story or the abstract, it also seems plausible that the results come from a very small pool of students, which makes bias from bad random data much more likely regardless of the large pool of graders (the 127 professors).
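To illustrate that concern, here's a quick Python sketch. The pool sizes and the quality distribution are made up for illustration and have nothing to do with the actual study; it just shows that if a small one-off pool of applications is randomly labeled male or female, chance differences in quality between the two halves are common, and no number of graders can average that away.

# Rough Monte Carlo sketch of the "small pool" concern: a one-time applicant
# pool is randomly split into "male" and "female" halves, and we count how
# often the male half looks meaningfully better purely by chance.
# Pool sizes and the quality distribution are illustrative assumptions only.
import random
import statistics

def lopsided_rate(pool_size, trials=10_000):
    """Fraction of random labelings where the 'male' half of a one-off pool
    ends up at least half a standard deviation better on true quality."""
    lopsided = 0
    for _ in range(trials):
        # One-time random pool: each applicant gets a true quality score.
        quality = [random.gauss(0, 1) for _ in range(pool_size)]
        # Randomly assign half the applications a male name, half a female name.
        random.shuffle(quality)
        male, female = quality[:pool_size // 2], quality[pool_size // 2:]
        if statistics.mean(male) - statistics.mean(female) >= 0.5:
            lopsided += 1
    return lopsided / trials

for n in (6, 10, 40, 200):
    print(f"pool of {n:>3}: chance the 'male' half looks >=0.5 sd better = "
          f"{lopsided_rate(n):.2f}")

With a pool of six applications the "male" half comes out noticeably stronger roughly a quarter of the time by luck alone, while with hundreds of applications that essentially never happens, which is why the undisclosed pool size matters so much here.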
Of course, it could also be that mandated gender equality, where the candidates are otherwise not equal, has led to a worse perception of genuinely qualified female candidates because of bad past experiences. Anyone in a decent engineering program has seen women coast through when they should have failed like many of their male peers, and I suspect the same happens in science, where the number of women in the programs is simply far lower than the number of men. This creates a large quality bias for the same academic record: when women present a given academic record, it gets called into question because of those past experiences.
Now, with that said, I would like to believe that people would rate candidates as equals on paper until the actual interview process differentiates them. Without seeing the random applicant data (specifically how many applicants there were and how the randomization was done), it's impossible to say. And just as we've seen women coasting where they shouldn't, literally this week I heard of a professor in academia who still holds the age-old idea that women are only good for dictation and secretarial duties.