Okay, let's say you have two independent papers that both come to the same conclusion. If each one independently has a 60% chance of being wrong, there is only a 36% chance that both are wrong. If you take 10 independent papers with 9 of them agreeing, it is far more likely that the 1 outlier is wrong than the 9 other papers, regardless of the reproducibility failure rate of each individual paper.
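To make the arithmetic explicit, here is a minimal sketch. It assumes the errors really are independent and that every paper has the same error rate; the function name `prob_all_wrong` is just made up for illustration.

```python
def prob_all_wrong(p_wrong: float, n_papers: int) -> float:
    """Chance that n papers are ALL wrong, assuming each one fails
    independently with the same probability p_wrong."""
    return p_wrong ** n_papers


# Two papers, each with a 60% chance of being wrong.
print(f"{prob_all_wrong(0.6, 2):.2f}")  # prints 0.36
```

The multiplication is only valid because of the independence assumption. If the papers share a method, a dataset, or a blind spot, the product no longer applies.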

I think using a probabilistic model presents its own issues, though. Take your example, but ask: what if they both made the exact same major error? Perhaps it was a really easy one to make, or it was due to some unknown factor that no one could have seen. Then they're both 100% wrong, and the errors were never independent in the first place. Science is really hard because there's a natural human tendency to ask, "What do I need to do in order to prove this correct?" when we should really be asking, "Have I done everything I can to rule out the alternative explanations and to account for other factors that might lead to the same result?"
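Here's a rough sketch of how a shared error changes the numbers. This is a toy model, not a claim about real papers: `p_shared` is a hypothetical probability that both papers contain the same fatal mistake, and otherwise each fails independently with probability `p_wrong`.

```python
def prob_both_wrong(p_wrong: float, p_shared: float) -> float:
    """Chance that two papers are both wrong under a toy common-cause model.

    With probability p_shared (an assumed, hypothetical number) both papers
    share the same fatal error and are wrong together. Otherwise each one
    fails independently with probability p_wrong.
    """
    return p_shared + (1 - p_shared) * p_wrong ** 2


# Same 60% individual error rate, with increasing chance of a shared mistake.
for p_shared in (0.0, 0.2, 0.5, 1.0):
    print(p_shared, round(prob_both_wrong(0.6, p_shared), 3))
```

With no shared error this gives the original 36%. With a 20% chance of a shared error it rises to roughly 49%, and if the shared error is certain, both papers are wrong no matter how many of them agree.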

Your single outlier could be the one paper that discovered and accounted for a previously unknown factor, or that found some other problem with the earlier research. You can't conclude that because this is unlikely in the majority of cases, the outlier can be dismissed. You still have to look at it, have a discussion about it, and decide whether it's worth considering, or whether it means the other 9 studies need to be rerun to account for the new information.