Comment Re:Why the distros? (Score 1) 112
"well, distributions backport security fixes, so 5.3.3 is secure on distro XYZ".
Are you aware of any analysis as to the extent that is actually true, ie for distro X or Y which patches really have been backported and which are skipped?
I had a quick poke about the W3Tech site and couldn't really see much of their methodology, especially in terms of how they identify PHP usage and what version is being used. I'd have though that if you looked at their PHP page there should be a not insignificant number where they can reasonably guess it's using PHP (due to file extensions in URLs perhaps) but not be able to identify the version being used.
I wonder how much your "% of installs that are secure" statistic could be inaccurate due to most (I'd hope) sites that care even slightly about security suppressing the Apache header PHP version information. Are they just missing from the W3Tech stats? It's possible that a significant number of the "secure" PHP installs could be invisible to your calculations because the sort of people who keep their software up to date are the same people who follow fairly basic server set up recommendations.
I suppose there are also questions as to what "insecure" means in practice. For bulk hosting sites running unknown third party code everything is critical but for a lot of sites running their own code whether they are actually "insecure" depends not only on what PHP does but also what their code does. Eg for the most recent PHP 5.4 release there is a fix for a fairly nasty looking bug in unserialize(), but (as I understand it) a site admin with a defined codebase might quite legitimately determine that they never use unserialize() on user generated data and not be in any rush to update if they have other things to be doing. PHP version 5.4.35 might be "insecure" for the purposes of your stats but may not be in practice someone's server if they know they don't use unserialize() in an exploitable fashion (or mcrypt).
None of the above should be interpreted as criticism of your analysis, just food for thought. I find what you have done very interesting and expect that even if there are 'hidden' secure servers, the number of insecure ones would still be alarmingly high.