Just because a result is statistically significant doesn't mean it's correct. I can't search for the appropriate xkcd here at work, so google "xkcd jelly beans" or "xkcd significant" to find it.
Statistical significance means that the probability of getting results like yours by pure random chance is below some threshold, typically 5%. There's nothing magic or theoretically special about 5%; it just seemed like a reasonable value.
Therefore, if you take a batch of statistically significant results, some of them are going to be statistical flukes, and retests will not confirm the original significance. This happens. If you run twenty studies correlating pairs of genuinely unrelated factors, you will get, on average, one publishable result (20 × 5% = 1). If a single study makes more than one comparison, a publishable result is even more likely, because psychologists generally don't correct for multiple comparisons. I read a psych paper once that ran eight or ten correlations and suggested that the one with only a 10% chance of arising from random variance was "promising."
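You can check the arithmetic with a quick simulation. It relies on a standard fact: when the null hypothesis is true (the factors really are unrelated), a correctly computed p-value is uniformly distributed on [0, 1], so each null study can be simulated by drawing a uniform number and calling it "significant" if it lands below 0.05. This is a sketch, not anyone's actual study design; the constants are just illustrative.

```python
import random

random.seed(42)   # reproducible runs

ALPHA = 0.05       # conventional significance threshold
N_STUDIES = 20     # studies per batch, all testing unrelated factors
N_TRIALS = 10_000  # repeated batches, to estimate the averages

total_significant = 0
batches_with_hit = 0
for _ in range(N_TRIALS):
    # Under the null, each study's p-value is Uniform(0, 1).
    hits = sum(1 for _ in range(N_STUDIES) if random.random() < ALPHA)
    total_significant += hits
    if hits > 0:
        batches_with_hit += 1

avg_hits = total_significant / N_TRIALS
p_any = batches_with_hit / N_TRIALS
print(f"average 'publishable' results per 20 null studies: {avg_hits:.2f}")
print(f"fraction of batches with at least one: {p_any:.2f}")
```

The average comes out near 20 × 0.05 = 1 false positive per batch, and the chance that a batch contains at least one is 1 − 0.95²⁰ ≈ 64% — which is why a lone significant result out of many comparisons is weak evidence.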