And encoding the filetype into the file means that you have to examine (and potentially interpret) the file to work out what to open it in.
Yes, wherever some information is, it has to be read and processed. ANY information.
That's fine for certain things (e.g. executables all start with MZ) but not for others (e.g. JAR files are indistinguishable from ZIP until you interpret the ZIP file contents and act upon that interpretation).
Many applications can process jar files as well as zip files. Inasmuch as they use the same container, they ARE the same type of file. Inasmuch as the purpose is different, there has to be another piece of data to read somewhere. Remember you could change the "separate" metadata from application/zip to application/jar, so ALL the confusion that can result from decisions taken by examining file content is equally possible in decisions taken by examining the separate metadata.
But with this separate metadata, not only might the OS attempt to run a zip application on a jar file, it could also run Photoshop on an Excel sheet. That is a much more varied set of possibilities; testing against it is many orders of magnitude more difficult than testing the security of zip applications opening jar files, and hence much, much more risky.
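To make the MZ and jar-versus-zip cases above concrete, here is a rough Python sketch of content sniffing. The function names (sniff, make_zip) are made up for illustration, and the "has a META-INF/MANIFEST.MF entry" test is only a heuristic I am assuming for the example: a jar is not required to carry a manifest, and an empty zip starts with a different signature that this sketch does not handle.

```python
import io
import zipfile


def sniff(data: bytes) -> str:
    """Guess a coarse file type from content alone (illustrative, not exhaustive)."""
    if data[:2] == b"MZ":
        # DOS/Windows executables start with these two bytes.
        return "dos-or-windows-executable"
    if data[:4] == b"PK\x03\x04":
        # ZIP local-file-header signature. It cannot tell a jar from a plain
        # zip, so the archive directory has to be parsed and interpreted.
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                names = zf.namelist()
        except zipfile.BadZipFile:
            return "unknown"
        # Heuristic (assumption): treat an archive with a manifest as a jar.
        if "META-INF/MANIFEST.MF" in names:
            return "jar"
        return "zip"
    return "unknown"


def make_zip(members: dict) -> bytes:
    """Hypothetical helper: build an in-memory zip from {name: text}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, body in members.items():
            zf.writestr(name, body)
    return buf.getvalue()


plain = make_zip({"notes.txt": "hello"})
jar_like = make_zip({"META-INF/MANIFEST.MF": "Manifest-Version: 1.0\n",
                     "A.class": "placeholder"})
print(sniff(plain), sniff(jar_like), sniff(b"MZ\x90\x00"))
```

Note that the first check reads two bytes, while the jar check has to run a full archive parser over the data, which is exactly the kind of interpretation both sides are arguing about.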
As soon as the contents could be malicious, and you're running even a regexp of any complexity on it, it's a risk.
Yes, so rather than execute a regexp of "any complexity", just run a multi-megabyte application on it because that is not a risk.
Encoding it into the filename itself is shoving metadata into other metadata. There's even a metadata separator involved here, the period in between! As such, they should be two separate and independently changeable pieces of information. Parsing the filename to work out how to interpret the data inside is a nonsense, when you could just store "filename" (without the extension) and "filetype" separately. This also allows .jpg and .jpeg to be seen as the same thing (which they are!) and not require two separate and confusing entries!
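As a sketch of what "filename" and "filetype" stored separately could look like, assuming a made-up FileEntry record and a tiny, hypothetical alias table (neither comes from any real filesystem), in Python:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical alias table: several extension spellings, one canonical type.
EXTENSION_TO_TYPE = {
    "jpg": "image/jpeg",
    "jpeg": "image/jpeg",
    "jpe": "image/jpeg",
    "zip": "application/zip",
}


@dataclass(frozen=True)
class FileEntry:
    name: str                 # display name, no type information embedded
    filetype: Optional[str]   # independently changeable type field


def from_legacy_name(filename: str) -> FileEntry:
    """Split an extension-style name into separate name and type fields."""
    stem, dot, ext = filename.rpartition(".")
    filetype = EXTENSION_TO_TYPE.get(ext.lower()) if dot and stem else None
    if filetype is None:
        # Unknown or missing extension: keep the name intact, leave type unset.
        return FileEntry(filename, None)
    return FileEntry(stem, filetype)


a = from_legacy_name("holiday.jpg")
b = from_legacy_name("holiday.JPEG")
print(a == b)  # both spellings collapse to the same entry
```

With the type held in its own field, retyping a file is a change to filetype alone and the name never has to be parsed again.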
Adding any in-data identifiers to existing files also means modifying the file, potentially modifying hashes and security on them.
Which is a good thing. A Perl script IS different from an image, even though an image can be made to look like a Perl script by a relatively minimal change. The actual security scenario around the file has changed with that change in "in-data identifiers".
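For anyone who wants to see the hash point concretely, a minimal Python sketch, assuming the "in-data identifier" is a shebang line prepended to some made-up content:

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 of the raw file content, as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Made-up content for illustration only.
original = b"print 'hello';\n"
# Adding an in-data type identifier changes the bytes of the file itself.
tagged = b"#!/usr/bin/perl\n" + original

print(digest(original) == digest(tagged))  # the two digests differ
```

Any checksum or signature computed over the content is invalidated by the new identifier, which is the effect both posts are describing, whichever way you judge it.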
Changing the way they are interpreted on one machine will affect every machine they are visible on and require write-access to the file.
Which is again a good thing. A file that triggers notepad by default being converted into a file that triggers photoshop by default IS a modification to the file's most common behaviour. Why should it not require write access?
I think either you don't know what you are talking about, or you haven't understood the thread subject at all. I repeat my example - "The context is that OS components need to distinguish between an image and a perl script."