Microsoft's Hosts file changes in Windows 8 generated a predictable discussion about, and then by, APK, but one part of that discussion caught my eye, and I thought it was worth raising in a "Mind boggled" way.
The pertinent points:
- APK made a claim about his code that resulted in someone calculating that it was taking about 4 million CPU cycles, or up to 16 million instructions, to process each HOSTS file entry.
- In the ensuing discussion, APK said that his algorithm processes each entry multiple times. He also claims that a slight optimization to his method would result in the algorithm becoming 98% accurate.
- He also claims, if I understand it correctly, that his code actually turns off the Windows process scheduler while it runs, for extra speed - apparently by giving this batch processing code a "realtime" priority.
- The reason it takes 4-16 million cycles per record? Apparently, again quoting APK, this is because there's string processing involved with 11 string operations on each record.
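For a sense of scale, here's the arithmetic behind those figures as a quick Python sketch. The 4 million cycles and the 11 string operations are the numbers from the discussion; the 4 instructions per cycle is my assumption, inferred from the "4 million cycles, or up to 16 million instructions" pairing.

```python
# Figures quoted in the discussion
cycles_per_entry = 4_000_000
string_ops_per_entry = 11

# Assumption: peak of 4 instructions retired per cycle, which is what
# turns 4 million cycles into "up to 16 million instructions"
instructions_per_cycle = 4

max_instructions_per_entry = cycles_per_entry * instructions_per_cycle
cycles_per_string_op = cycles_per_entry / string_ops_per_entry

print(f"Up to {max_instructions_per_entry:,} instructions per entry")
print(f"About {cycles_per_string_op:,.0f} cycles per string operation")
```

So even if all the time really is going into those 11 string operations, each one would be burning several hundred thousand cycles on a single hostname.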
Now, to be clear, we're talking about a program whose job is apparently to generate a hosts file from a list of hostnames, deduping and doing other minor clean-up operations on the list, and writing it all out with "0.0.0.0" in front of each hostname. For the deduping, it seems APK's using a sort - and he denies using a bubblesort, so I guess that's something. Me, I'd use a hash table, but what do I know?
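To show what I mean by the hash table approach, here's a minimal Python sketch. This is my own illustration, not APK's code, and everything in it is an assumption on my part: the function name `build_hosts`, the lowercasing and whitespace trimming as the "minor clean-up", and skipping blank and `#` comment lines.

```python
def build_hosts(hostnames, sink="0.0.0.0"):
    """Return deduplicated hosts-file lines, preserving first-seen order."""
    seen = set()   # Python sets are hash tables: average O(1) membership test
    lines = []
    for raw in hostnames:
        # Assumed clean-up: trim whitespace and fold case
        name = raw.strip().lower()
        # Assumed clean-up: skip blank lines and comments
        if not name or name.startswith("#"):
            continue
        # Each duplicate costs one hash lookup, no sorting needed
        if name in seen:
            continue
        seen.add(name)
        lines.append(f"{sink} {name}")
    return lines


if __name__ == "__main__":
    sample = ["Ads.Example.com", "ads.example.com ", "", "# comment", "tracker.test"]
    print("\n".join(build_hosts(sample)))
```

That's one pass over the list with a handful of cheap string operations per entry, which is why millions of cycles per record looks so strange to me.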
I'm not trying to get at APK, but is anyone else having a WTF moment based upon the above description?