'Defective by design' is exactly it. On a standard Windows/Office system everything is set up to make this error the default behaviour, and the problem with gene names is so prevalent that we include a warning about it in our genomics course. The typical situation is this: some upstream tool generates a very long gene list (maybe with thousands of rows) in CSV format. So far so good. But then a naive user wants to do some simple manipulation of the data and double-clicks the file to open it. Its icon includes the Excel logo and, sure enough, Excel is registered as the default application for CSV files, so it opens as a spreadsheet in Excel. Everything seems to be fine - there's a nice column of gene symbols that all seem to be correct. But hundreds or thousands of rows further down, something looks like a date and has been 'helpfully' converted into one by Excel. At this point, you can't reverse the change by changing the data type of the column - the corruption happened silently on import and will be permanent in any saved (even CSV) version of the file. The counterintuitive but correct way to deal with a genomics CSV file (if you're mad or uninformed enough to use Excel in the first place) is to open Excel first, then run a file import with the data type specified for each column (for gene symbols, you need 'text' rather than 'general'). The answer to all this is education (avoid Excel, but if you must use it, understand the dumb way it works), but would it kill Microsoft to change the default behaviour to something more sensible (this can hardly be the only use case where this is an issue), and to include a global setting to switch it off?
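For anyone who wants to check whether a file has already been through this, here is a rough sketch in Python. It reads the CSV with the standard `csv` module, which keeps every field as a plain string and so never reinterprets a symbol, and flags values that look like the "1-Mar" style dates Excel tends to leave behind. The column name `gene_symbol`, the function name `find_date_like_symbols`, the sample data and the exact date pattern are my own assumptions for illustration; Excel's output varies with locale and cell format, so treat it as a starting point rather than a complete detector.

```python
import csv
import io
import re

# Assumed pattern: the "d-Mmm" form (e.g. "1-Mar", "2-Sep") that an
# English-locale Excel typically shows after converting a gene symbol.
DATE_LIKE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)


def find_date_like_symbols(csv_text, column="gene_symbol"):
    """Return (row_number, value) pairs whose symbol looks like a date.

    `column` is a hypothetical header name; adjust it to your file.
    """
    # csv.DictReader yields every field as a str, so nothing is converted.
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    # start=2 because row 1 of the file is the header line.
    for row_number, row in enumerate(reader, start=2):
        value = row[column].strip()
        if DATE_LIKE.match(value):
            hits.append((row_number, value))
    return hits


# Made-up sample: two symbols have already been mangled into dates.
sample = "gene_symbol,score\nTP53,0.9\n1-Mar,0.4\nBRCA1,0.7\n2-Sep,0.1\n"
print(find_date_like_symbols(sample))
```

An intact file (symbols such as MARCH1 or SEPT2 still spelled out) gives an empty list. Note that this only detects the damage; it can't reliably tell you which original symbol a given date came from.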