Actually, the problem they are wrestling with here is one that science has had to deal with for a long time: the uncertainty on a measurement. The star ratings are a measure of the popularity of a game, so what you are really asking is "given the ratings it received, which game is best?".
Unfortunately, with a finite statistical sample you always have some degree of uncertainty, and within this uncertainty your data does not provide any ranking at all: you simply do not know which game is best to any sensible degree of certainty. Being strictly correct about this would lead to really confusing rankings, though, since to be fair you would need to randomize the order of any games whose scores agree within their uncertainties. This would be complex and confusing to users!
Instead, what they suggest is using a confidence level limit: what score can I be 95% confident that the game's true rating is higher than? We do this all the time in particle physics when we put limits on some new physics which we looked for and did not see. For example, LEP, the precursor to the LHC, had a result that it was 95% confident that the Higgs boson had a mass higher than about 114 GeV/c² (IIRC). There are better ways to do this than the method they quote, but since this is just a game rating and not science, it's a fine method to use.
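
To make the idea of ranking by a lower confidence limit concrete, here is a minimal sketch in Python. It assumes the ratings have been reduced to simple positive/negative votes and uses the Wilson score interval as the limit; that is one common choice, not necessarily the exact method they quote. The function name `wilson_lower_bound`, the one-sided 95% level, and the example vote counts are all made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_lower_bound(positive: int, total: int, confidence: float = 0.95) -> float:
    """One-sided lower confidence limit on the true fraction of positive ratings.

    Assumption: each rating is a simple positive/negative vote. Uses the
    Wilson score interval; returns 0.0 when there are no ratings at all.
    """
    if total == 0:
        return 0.0
    # z is the normal quantile for a one-sided limit (about 1.645 at 95%).
    z = NormalDist().inv_cdf(confidence)
    p = positive / total  # observed fraction of positive ratings
    centre = p + z * z / (2 * total)
    spread = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - spread) / (1 + z * z / total)


# Hypothetical games: (positive votes, total votes).
games = {"A": (90, 100), "B": (9, 10), "C": (2, 2)}

# Rank by the lower limit rather than by the raw average.
ranking = sorted(games, key=lambda name: wilson_lower_bound(*games[name]), reverse=True)
print(ranking)
```

With these made-up numbers, game C has a perfect raw average but only two votes, so its lower limit is far below that of game A, and A and B share the same raw average but A's larger sample gives it the higher limit. That is exactly the behaviour you want from ranking on a limit instead of on the average.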