Reminds me of debugging errors caused by digital logic race conditions. When you're on the edge of meeting timing, a slight shift in the wrong direction can make the result incorrect, sometimes in a seemingly random way. Until you actually violate that timing, everything runs smoothly, which gives a false sense of security. I'm sure there's a more mathematical way to frame this, but similarly I think much more testing could be done to understand which variables affect the outcome.
It would be interesting to see more details, such as: how many pixels must be modified to cause a failure? By what magnitude do the pixels have to be changed? Is there a tradeoff between the # of pixels and the magnitude of change per pixel? Are certain pixels more important than others (pixels along edges, for example)?
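The pixels-vs-magnitude tradeoff could at least be probed in a toy setting. Here's a minimal sketch against a stand-in linear "classifier" (the weights, image size, and sweep grid are all placeholders I made up, not from any real experiment): for each count of perturbed pixels, find the smallest per-pixel magnitude that flips the predicted label.

```python
import numpy as np

# Toy stand-in for a classifier: a fixed linear model on a flattened
# 8x8 "image". Everything here is an illustrative placeholder.
rng = np.random.default_rng(0)
n_pixels = 64
w = rng.normal(size=n_pixels)   # "decision boundary" weights
x = rng.normal(size=n_pixels)   # a sample input image

def label(v):
    # Sign of the linear score stands in for the predicted class.
    return 1 if w @ v > 0 else -1

base = label(x)

def flips(k, m):
    # For a linear model the gradient of the score w.r.t. the input is
    # just w, so the most influential pixels are those with largest |w|.
    idx = np.argsort(-np.abs(w))[:k]
    x_adv = x.copy()
    # Push each chosen pixel by magnitude m against the current label.
    x_adv[idx] -= base * m * np.sign(w[idx])
    return label(x_adv) != base

for k in (1, 4, 16):
    # Smallest magnitude on a coarse grid that flips the label, if any.
    m_min = next((m for m in np.linspace(0.05, 5, 100) if flips(k, m)), None)
    print(f"pixels changed={k:2d}  min magnitude to flip: {m_min}")
```

With a linear model the answer is trivially monotone (more pixels means a smaller magnitude suffices), but the same sweep run against a real network would start to answer the questions above.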