3. Making invalid comparisons using objective metrics. I explained this earlier in the linked blog post, but in short, if you’re going to measure PSNR, make sure all the encoders are optimized for PSNR. Equally, if you’re going to leave the encoder optimized for visual quality, don’t measure PSNR — post screenshots instead. Same with SSIM or any other objective metric. Furthermore, don’t blindly do metric comparisons — always at least look at the output as a sanity test. Finally, do not claim that PSNR is particularly representative of visual quality, because it isn’t.
How to spot it: Encoders with psy optimizations, such as x264 or Theora 1.2, do considerably worse than expected in PSNR tests, but look much better in visual comparisons.
4. Lying with graphs. Using misleading scales on graphs is a great way to make the differences between encoders seem larger or smaller than they actually are. A common mistake is to scale SSIM linearly: in fact, 0.99 is about twice as good as 0.98, not 1% better. One solution for this is to use db to compare SSIM values.