The summary (and article) is also missing all those kinds of relevant data :) I thought I had added a comment to my post that I was ignoring things like suitability ranking, but apparently I didn't actually type it. However, it should be unsurprising that my calculations don't include data that's not available, no?

From the article: "The likelihood that this result occurred according to chance is approximately one in a billion," said the lawsuit.

Yeah - I'm going to have to call bullshit on that - I'd love to see that math.

73% of 130ish applicants were Asian, so 94.9ish Asian applicants. Let's call it 94, and 36 non-Asian. Then we have 17 non-Asian hires and 4 Asian hires, 21 in total. So the question is, using random sampling without replacement from that pool of 94A and 36N, what're the odds of getting 17N and 4A?

I'm using the formulas from http://people.wku.edu/david.ne...
So in our case the figure we want is C(36,17)*C(94,4)/C(130,21) - ways to choose 17N and 4A divided by ways to choose just any 21.
Conveniently, you can google "36 choose 17" to do the calculation, but the formula for N choose K is N! / (K! * (N-K)!)
36 choose 17 is 8597496600
94 choose 4 is 3049501
so C(36,17)*C(94,4) is about 2.6218074e+16 (yeah, we lost a little precision but we can live without it)
130 choose 21 is 8.7664606e+23
So our final result is 2.62e16 / 8.77e23, or about 1/3.34e7
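
For anyone who wants to check the arithmetic without googling each term, here is a minimal Python sketch of the same calculation. The pool sizes are the rough estimates from this post (94 Asian, 36 non-Asian), not figures from the lawsuit, and the names (hypergeom_pmf, ASIAN, and so on) are just made up for illustration. It also sums the probability of 4 or fewer Asian hires, which is an extra step not done above; a "this extreme or worse" figure like that may be closer to what a formal test would report, though I don't know what the lawsuit actually computed.

```python
from fractions import Fraction
from math import comb

# Pool and hire counts as estimated in this post (assumptions, not lawsuit data).
ASIAN, NON_ASIAN = 94, 36
ASIAN_HIRED, NON_ASIAN_HIRED = 4, 17
HIRES = ASIAN_HIRED + NON_ASIAN_HIRED  # 21 hires in total


def hypergeom_pmf(k: int) -> Fraction:
    """Exact chance that exactly k of the HIRES random picks are Asian."""
    ways_for_this_split = comb(ASIAN, k) * comb(NON_ASIAN, HIRES - k)
    ways_for_any_split = comb(ASIAN + NON_ASIAN, HIRES)
    return Fraction(ways_for_this_split, ways_for_any_split)


# Same quantity as above: C(36,17)*C(94,4)/C(130,21), with no precision lost.
exact = hypergeom_pmf(ASIAN_HIRED)

# Extra step: chance of ASIAN_HIRED or fewer Asian hires (the lower tail).
tail = sum(hypergeom_pmf(k) for k in range(ASIAN_HIRED + 1))

print(f"exactly {ASIAN_HIRED} Asian hires: 1 in {float(1 / exact):,.0f}")
print(f"{ASIAN_HIRED} or fewer Asian hires: 1 in {float(1 / tail):,.0f}")
```

Using Fraction keeps everything as exact integers until the final print, so the precision lost in the hand calculation doesn't creep in here.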

One in 33 million isn't one in a billion, so I would also be interested in seeing their math, but the conclusion that the selection was biased seems to be fairly well supported.

The problem is that they can spend more on lawyers to make your actions "bad faith" while theirs are still "good faith". I think a more effective route would be to hold this up as an example of their automated scanner being so bad that trusting it is no longer "good faith" but simple blind stupidity, with an eye to forcing them to expose the scanner internals and actually fix the damn thing.

