First, let's remember (I know I didn't): Bush hid the facts
Today, I spent most of the day doing some tests with PHP, writing binary and text strings to files. Later in the day, when I was about to stop, I opened one of the files I had written to that was supposed to contain 10000 numeric characters. But to my utmost incomprehension it was filled with a garble of Chinese characters!
I had been playing around with binary strings so I thought I must have messed up somewhere in my code. I check everything, remove lines until nothing but the string generation and writing sequence remains, but the file still comes up in Chinese characters! It's clearly an encoding problem, and I get confirmation of this when reading the file from PHP: the content is intact. When reading the output text file from a remote server, it opens normally. My PHP code looks like this by then:
$t=""; $v=0; for($i=1; $i code won't display here *sigh*
If I change $v==100 to 99 or 101, then the file content appears normal. On the other hand, if I replace the 10000 with 1000, I still get the Chinese characters. After more testing, I come to realize that it has nothing to do with PHP. It's just the character sequence in the file. Somehow I'm too confident (and stupid), and it doesn't occur to me that the issue might be with Notepad.
What seem to me the most logical explanations at this point are: 1/ it's a bug in Vista (I've seen many as I've never upgraded from the original; usually they don't disturb me much) 2/ it's a fault in the RAM that produces a corrupted byte or something (I guess it doesn't make any sense, but I had so many hardware problems in the past I'm just waiting for more) 3/ CHINESE HACKERS GOT ME! (or the yellow peril over Internet)
I desperately try to look for someone on Twitter or Facebook who could help me test and replicate the bug. If the same happens to another Windows user, then I'm safe. But in all likelihood I will need to find another Vista user and possibly one with the same old version as mine, which will prove difficult. For starters, I don't know many people.
Finally, I find someone (thank you Narayan!) who helps me test and, to my surprise, replicate the bug in Windows 7, no less. She's smart enough to also try opening the file in another text editor. I use Notepad for everything, so doing this didn't come to me naturally. I don't even have another text editor (or just WordPad, somewhere deep down in my hard drive).
And here you have it: Notepad is even buggier than I thought. Contrary to what Wikipedia states:
This bug does not occur in Windows Vista and Windows 7 because their version of IsTextUnicode has been altered to make it much more likely to guess a byte-based encoding rather than UTF-16LE.
It does still occur. Looks like Microsoft will need to make further alterations to get their simplest text editor to work as well as the competition. Sad but true.
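To see why plain text can turn into Chinese characters at all, here is a small sketch of what a wrong UTF-16LE guess does to ASCII bytes. It is not Notepad's actual detection logic, only the decoding step that follows a wrong guess. The helper name `misreadAsUtf16le` is made up for this illustration, and the code assumes PHP 7.2 or later with the mbstring extension enabled.

```php
<?php
// Hypothetical helper (not from my original test script): reinterpret raw
// bytes as UTF-16LE, the encoding Notepad seems to guess by mistake.
function misreadAsUtf16le(string $bytes): string
{
    // UTF-16LE works on 2-byte units, so drop a trailing odd byte.
    if (strlen($bytes) % 2 !== 0) {
        $bytes = substr($bytes, 0, -1);
    }
    // Decode the bytes as UTF-16LE and return the result as UTF-8.
    return mb_convert_encoding($bytes, 'UTF-8', 'UTF-16LE');
}

$ascii   = 'Bush hid the facts';   // 18 plain ASCII bytes
$garbled = misreadAsUtf16le($ascii);

// Each pair of ASCII bytes becomes one 16-bit code unit, low byte first:
// 'B' (0x42) followed by 'u' (0x75) gives U+7542, inside the CJK range.
printf("%d bytes -> %d characters\n", strlen($ascii), mb_strlen($garbled, 'UTF-8'));
printf("first code point: U+%04X\n", mb_ord($garbled, 'UTF-8'));
echo $garbled, "\n";
```

The same thing happens with digits: the bytes for "12" (0x31 0x32) pair up into U+3231. So a file that is nothing but numeric characters can show up as a screen full of unrelated symbols once the editor decides it is looking at UTF-16LE.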
And for those who'd like to test, here's the line:
You must copy it as such, in one line without carriage return, in a