Stories
Slash Boxes
Comments

News for nerds, stuff that matters

MD5 Proven Ineffective for App Signatures

Posted by Zonk on Sunday December 02, @06:31AM
from the needs-a-bit-of-retooling dept.
prostoalex writes "Marc Stevens, Arjen K. Lenstra, and Benne de Weger have released their paper 'Vulnerability of software integrity and code signing applications to chosen-prefix collisions for MD5'. It describes a reproducible attack on MD5 algorithms to fake software signatures. Researchers start off with two simplistic Windows applications — HelloWorld.exe and GoodbyeWorld.exe, and apply a known prefix attack that makes md5() signatures for both of the applications identical. Researchers point out: 'For abusing a chosen-prefix collision on a software integrity protection or a code signing scheme, the attacker should be able to manipulate the files before they are being hashed and/or signed. This may mean that the attacker needs insider access to the party operating the trusted software integrity protection or code signing process.'"

Related Stories

Display Options Threshold:
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Hmm... (Score:1)

    by Char-i-o's (1195873) on Sunday December 02, @06:41AM (#21550937)
    Need more salt.

    Perhaps they should consider using something else than SHA-1? SHA-2 anyone?
    • Well, duh! (Score:5, Insightful)

      by YA_Python_dev (885173) on Sunday December 02, @06:59AM (#21550993)
      (Last Journal: Thursday August 23, @11:40AM)

      The problem has nothing to do with salt, and can be certainly temporarily "fixed" switching to SHA-1 or, even better, SHA-2. But the real root of the problem here is that, for the attack to work, someone signed as trusted a binary file that contained malicious code in the first place, even if in a disable form.

      Let me explain that. First, this is very old news: we know since 2004 [wikipedia.org] that collision can be found in MD5 hashes (two different files with the same md5sum), and there now are tools that can generate collisions in seconds. All you need is a common prefix and suffix for both files and two block of 128 bytes that are generated automatically and you can insert between the prefix and the suffix to create the two files.

      Applying this to pretty much any file type that can contain binary data (even XML 1.1!) is trivial. For an executable file you can simply insert code in your prefix/suffix that looks at the pseudo-random 128 bytes and does radically different things depending on it. This as already been demonstrated for HTML+JS and even for postscript files.

      Bottom line: if you have an executable file from an untrusted source it may contain bad things (the attack described requires that both the original signed file and the file that you are actually executing are generated by the same hostile source).

  • Nothing new (Score:5, Insightful)

    by grumbel (592662) on Sunday December 02, @06:43AM (#21550943)
    Unless I am missing something this is really nothing new. The same has been demonstrated with a webpage and javascript years ago, i.e. two different webpages producing the same MD5, doing it again with an .exe doesn't really sound all that interesting, especially since the attacker still needs to manipulate both the good .exe and the evil .exe and when he has access to the good .exe you are toast anyway.

    This of course doesn't mean we should continue to use MD5, but the attack is really of rather theoretical nature.
    • Re:Nothing new by Anonymous Coward (Score:2) Sunday December 02, @06:46AM
    • Re:Nothing new by Instine (Score:2) Sunday December 02, @07:16AM
    • Re:Nothing new (Score:5, Interesting)

      by Bert64 (520050) <bert&slashdot,firenzee,com> on Sunday December 02, @07:16AM (#21551053)
      (http://www.ev4.org/)
      If he has access to the good exe *before* it's signed, why not simply replace it with the malicious one so that the malicious one gets signed and distributed instead of the good one...
    • Re:Nothing new (Score:5, Informative)

      by Anonymous Coward on Sunday December 02, @07:40AM (#21551135)
      No, this is different. In the case of the colliding webpages, bit level inspection immediately reveals what's going on: both "good" and "bad" version are included in the webpages, with an if-statement to choose which one to display.

      When you inspect these binaries at bit level, they contain only the "good" or the "bad" version, and some random data appended to it to make the MD5 hash of the files collide. This technique thus also works for file formats which don't have control statements such as "if" or "file starts at offset". See also: http://www.win.tue.nl/hashclash/Nostradamus/ [win.tue.nl], scroll down to: "Didn't Daum and Lucks do something like this in 2005?"

      Marc Stevens already constructed these "chosen-prefix" collisions for X.509 Certificates, see the HashClash [win.tue.nl] project page. What's new in these results, is that it did not require massively distributed computing efforts, only one Playstation 3 and less than two days of computation. There is no paper available yet as to how he achieved this major optimization, but his MSc thesis [win.tue.nl] gives a clue: see "future work" at the end of section 7.4.
    • Re:Nothing new (Score:5, Informative)

      by MathFox (686808) on Sunday December 02, @07:44AM (#21551151)
      This is a different kind of attack: the "old" collision prefix attack had two blocks X and Y with the same hash that allowed one to create two programs:
            X; if (X) then GOOD else EVIL
      and
            Y; if (X) then GOOD else EVIL
      but the evil code would be in the signed good program, it would not be run.

      The new attack is different: it is a method to generate blocks GX and EX for two random files such that the files GOOD+GX and EVIL+EX hash to the same checksum.
    • Re:Nothing new by The New Andy (Score:2) Sunday December 02, @07:44AM
    • Re:Nothing new by houghi (Score:2) Sunday December 02, @08:00AM
    • Re:Nothing new by owlstead (Score:2) Sunday December 02, @09:15AM
    • Re:Nothing new by fatphil (Score:1) Sunday December 02, @11:16AM
  • Well everyone's boned then (Score:4, Interesting)

    by tietokone-olmi (26595) on Sunday December 02, @06:53AM (#21550969)

    [...] This may mean that the attacker needs insider access to the party operating the trusted software integrity protection or code signing process.

    An attack that requires insider access? Well colour me frightened!

    Or don't. That's more accurate anyhow.

  • Right (Score:1)

    by setrops (101212) on Sunday December 02, @06:54AM (#21550977)
    "This may mean that the attacker needs insider access to the party operating the trusted software integrity protection or code signing process.'"

    Let's see now, the attacker already has access to the machine and is probably the one creating or comparing the MD5. Is the problem really with MD5?

  • Not accurate, not new (Score:5, Insightful)

    by Niten (201835) on Sunday December 02, @07:05AM (#21551015)
    (http://markshroyer.com/)

    MD5 collision attacks aren't really new, although this is a powerful example. An equally meaningful example of a collision attack on the algorithm, in the form of two different PostScript files with the same MD5 hash [cits.rub.de], was provided at least two years ago (IIRC).

    The key to understanding the limits of this demonstration's significance is to realize that a collision attack is quite different from a prefix attack. These researchers were able to create a pair of executables having the same hash value by specially constructing them as such; crafting a new executable to match a specific hash value corresponding to some other party's executable is vastly more difficult to achieve.

    So while this demonstrates MD5 to be useless for uses where the purported signatory is to be included in our threat analysis -- as has already been demonstrated to us by other researchers -- the algorithm is still relatively safe if our only goal is to ensure that a given executable almost certainly came from a specific party (rather than showing that it is a specific executable from said party). In other words, one could conceivably use MD5 to verify that the Ubuntu packages on that FTP server were in fact produced by Canonical. So no, demonstration does not mark MD5 as completely useless for code signing; the most common applications of code signing are entirely unconcerned with collisions in the hash function.

    In conclusion: the title is terribly misleading, or possibly just misinformed. Boo! Hiss!

  • Birthday Attack (Score:5, Insightful)

    by tangent3 (449222) on Sunday December 02, @07:15AM (#21551043)
    This is an example of a Birthday Attack [wikipedia.org]. 1. Attacker generates Good.exe and Evil.exe which hashes to the same MD5 2. Attacker passes Good.exe to the key owner to sign 3. Key owner signs and release Good.exe and Good.exe.MD5 4. Attacker releases Evil.exe as Good.exe This of course, requires some serious social engineering to work. MD5 is outdated, yes, but at the moment it is still resilient against a normal attack where an attacker has to generate an Evil.exe to hash to the same MD5 as an already-available Good.exe
    • Re:Birthday Attack (Score:4, Informative)

      by Anonymous Coward on Sunday December 02, @07:28AM (#21551087)
      Sorry but you are wrong. The attack uses two md5 inputs which collide to construct two programs which are otherwise identical. The program can then be contrived to exhibit different behaviour depending on which of the two colliding inputs was used. This is nothing to do with the birthday paradox (except that it may have been used to find the collisions in the first place). Otherwise you description of the attack is accurate.
  • use two hash functions (Score:5, Informative)

    by m2943 (1140797) on Sunday December 02, @07:17AM (#21551055)
    The particular scenario they describe is irrelevant; MD5 checksums aren't intended to protect against that. If the attacker can manipulate the original file, he can usually simply alter it to become malicious itself.

    The case that matters is producing a program with the same checksum as a given program, without the ability to manipulate the correct program beforehand. That's still hard.

    Nevertheless, code signing mechanisms in general should probably be prepared for flaws in hash functions. It might be best always to use two hash functions and to have some strategy of migrating. That way, if one hash function gets compromised, there is still another one in place and can be used until the original one has been replaced.
    • Re:use two hash functions (Score:4, Interesting)

      by mollymoo (202721) on Sunday December 02, @09:53AM (#21551549)
      (Last Journal: Friday December 17 2004, @07:14PM)

      The particular scenario they describe is irrelevant; MD5 checksums aren't intended to protect against that. If the attacker can manipulate the original file, he can usually simply alter it to become malicious itself.

      The problem as I see is that the harmless version can be released and gain trust. That version can be tested and inspected, even checking the binary wouldn't reveal malicious code because there wouldn't be any malicious code to find - no dodgy looking system calls, for example. Just a chunk of seemingly random data, which could be disguised as a lookup table, compressed image or whatever. At some later point, after the harmless version has gained trust, its use has become more widespread and the rate of downloads has increased correspondingly, it can be replaced by the malicious version. So while you could initially release a malicious version, being able to first release a harmless version can widen the impact of an attack.

    • Re:use two hash functions by noidentity (Score:1) Sunday December 02, @04:41PM
    • One hash to rule them all--wot? that's all? by slashdotard (Score:1) Sunday December 02, @04:45PM
  • Ah yes, this again (Score:5, Interesting)

    by Effugas (2378) * on Sunday December 02, @07:18AM (#21551063)
    (http://www.doxpara.com/)
    OK, it's pretty damn cool to see people 'round here referencing my work on Javascript MD5 collisions :)

    The relevant links are:

    http://www.doxpara.com/research/md5/t1.html [doxpara.com]
    http://www.doxpara.com/research/md5/t2.html [doxpara.com] ...and the original paper:

    http://www.doxpara.com/research/md5/md5_someday.pdf [doxpara.com]

    I'm pretty sure I talked about third party attestation in that paper.

    A more interesting point was made to me just the other day, which is that there's always enough ambient entropy in any real world system to deviate between trusted and untrusted behavior. In other words, for a turing complete app, you *can't* create a meaningful hash, because you aren't capturing all bits that will drive the execution flow. So, getting code signed really doesn't assert anything other than a business relationship. App signatures don't actually work, for any arbitrarily good hash.
  • Use GnuPG instead (Score:5, Insightful)

    by gweihir (88907) on Sunday December 02, @07:39AM (#21551125)
    As many projects have done for years. md5 sums as crypto-protection are more or less a historic way to do it.
  • This may mean that the attacker needs insider access to the party operating the trusted software integrity protection or code signing process.
    Isn't this a bit like saying that door locks are insecure although you may need access to a party trusted with the keys in order to exploit? Aren't these "trusted parties" *always* a potential weak-link in the security chain?
  • by houghi (78078) on Sunday December 02, @08:03AM (#21551215)
    (http://www.houghi.org/)
    I alsways thought that MD5SUMS where there only to verify wether a download was successfull or not.
  • Not a real life scenario... (Score:3, Insightful)

    by Matthieu Araman (823) on Sunday December 02, @08:05AM (#21551217)
    Real life scenario :

    developper A produce software X(for example openssh), calculate hash of program X and sign the hash with his PGP key.
    He then put all these files on mirrors servers on Internet (but not his private PGP key !)

    One mirror is hijacked by B.
    B wan't to replace X by X' with the same hash than X

    This article doesn't provide anything as it says MD5(X+a)=MD5(Y+a), which imply you have to change A in the first place which can't be done easily (and if you can change the original program, then what's the point ?)
  • A workaround? (Score:1)

    by finnw (415539) on Sunday December 02, @08:41AM (#21551319)
    (http://www.finnw.me.uk/)
    Create trusted_program.

    Take trusted_program and /bin/false, use this technique to generate trusted_program2 and false2.

    Post both trusted_program2 and false2 on your web page along with their shared md5sum and invite the user to download them (presumably the user trusts you and your web server or he wouldn't download them in the first place.)

    The user is now confident that you cannot replace trusted_program2 with malicious_program without changing the md5sum, because this technique only works with two prefixes, not three.
  • Nothing new (Score:2)

    by kasperd (592156) on Sunday December 02, @09:24AM (#21551453)
    (http://kasperd.net/~kasperd/ | Last Journal: Thursday July 08 2004, @10:18AM)
    As others have pointed out, there is nothing new in this. The same has been demonstrated with other languages before. For example a few years ago it was demonstrated with postscript, and that was as far as I know the first demonstration with meaningful content. While that may have come as a surprise to some people, it was only a minor curiosity to people understanding how md5 works. Doing this thing with exe files is less significant than it was to do it with postscript files for the following reason. You are not likely to sign an exe file from an untrusted source, because there is no way to verify if the content is malicious, and most people know this. In fact that is the very reason for having signatures in the first place. With postscript files it is different, to most people a postscript file is just a document. With a document you can read it, and then you know exactly what the content is. At least I can understand why people would think that way. So less social engineering is required to get your crafted postscript file signed than it would to get a crafted exe file signed. If you can get somebody to sign your carefully crafted exe file, then this attack doesn't matter anyway. Because if you could get your malicious code into the signed executable, there are lots of other ways to trigger it. Having an interchangable piece of pseudorandom binary data in the file itself is a neat way to trigger your code. But it can be based on other external factors such as timing, the IP address of the machine, the existence of certain other files on the system, or just some secret sequence of inputs. It is all just the matter of putting a backdoor into a piece of code, and attacking md5 in this way is not even the most convenient backdoor you could make.

    If it was possible to make a crafted file that would match the md5 of some existing file, which you had no control over, then there would be a lot more reason to worry. Luckily that is not the case yet. Still the demonstration of collisions does serve as a warning, that md5 is weak and maybe somebody will be able to completely break it at some point. There has been a reason to worry about that ever since the collisions were first demonstrated. The construction of additional collisions with meaningful content doesn't change that threat in any way. If you were not worried before this news but you are worried afterwards, it is because you didn't understand the threat.
    • Re:Nothing new (Score:4, Interesting)

      by kasperd (592156) on Sunday December 02, @09:51AM (#21551537)
      (http://kasperd.net/~kasperd/ | Last Journal: Thursday July 08 2004, @10:18AM)
      After having read the actual article I realize that there in fact is something new in it. The slashdot story put all the focus on software signing, which is not the interesting part of the article. The interesting part of the article is, that they have found a new and stronger way to produce collisions. For one thing it is going to be a lot less obvious that a file is crafted. The original attack required all the colliding files to contain all the meaningful content with some psuedorandom content to select between them. The new attack doesn't require this, in fact you could even produce collisions beteween files of different formats. Like a jpg file and an exe file with the same md5 hash. But still it is just a collision attack, it produces collisions between two crafted files. They don't produce collisions between a collision between an arbitrary original file and one crafted file.
  • Security hashes (Score:1)

    by billcopc (196330) <vrillco@yahoo.com> on Sunday December 02, @09:40AM (#21551497)
    (http://fnarg.com/)
    Okay so someone was a bit late to learn that MD5 collisions are indeed possible. Congrats, you're still retarded!

    It's not exactly hard to understand that a 128-bit hash is going to be less unique than a multi-kilobyte executable. I believe 3rd grade math has that covered. With processor speed increasing steadily, these things become easier to break with each passing day.
  • Md5 as a signature (Score:2)

    by Almahtar (991773) on Sunday December 02, @10:38AM (#21551769)
    I see a lot of comments about how, since this attack requires access to the file both before and after signing, this is a non-issue. In most cases you're right, but get creative.

    You have a lengthy verification process for new software - you check it over thoroughly to make sure it can be trusted, and after you certify it as trustworthy you sign it and only need to re-certify if the signature changes next time you download it from me.

    I deliver a new version of the software to you (the "good" version), you certify and sign it (using MD5, unfortunately for you). I swap out the "evil" one, and next time you download it -- sure enough, the signature verifies it's fine.

    What if you even had a virus scanner that used MD5's on executables for lazy re-scanning when they'd been modified?

    I'm not sounding the "holy crap we're doomed" alarms, just pointing out that if you can take two different files and get the same "signature" from them, it's not a very good "signature", now is it?
  • Use multiple hash functions... (Score:1, Redundant)

    by Roogna (9643) on Sunday December 02, @11:33AM (#21552087)
    Back in the day I remember always being told that a single hash function was never secure for verifying information... and that for security you should use two -different- algorithms or more. Simply because an attacker can manipulate the data to collide in a single function, it's that much more difficult to manipulate the data to collide in two entirely different hash spaces.

    Did this concept change over the years, or is it just me? heh
  • MD5+SHA1? (Score:2)

    by Midnight Thunder (17205) on Sunday December 02, @01:01PM (#21552731)
    (http://slashdot.org/ | Last Journal: Saturday February 05 2005, @03:50AM)
    Surely a hybrid MD5+SHA1 signature would prove better? You can find weaknesses in each, but putting them together and the likelihood of the both weaknesses appearing at the same time would be greatly diminished. Other than extra CPU requirements, are there any issues with this approach?
    • Re:MD5+SHA1? by kasperd (Score:2) Sunday December 02, @01:30PM
      • Re:MD5+SHA1? by grumbel (Score:2) Sunday December 02, @02:40PM
        • Re:MD5+SHA1? by kasperd (Score:1) Sunday December 02, @02:55PM
    • Re:MD5+SHA1? by Cairnarvon (Score:2) Sunday December 02, @03:00PM
  • by gr8dude (832945) on Sunday December 02, @02:23PM (#21553449)
    (http://railean.net/)
    Although AES-256 is not a hashing algorithm, I've seen it applied in hashing [lazybit.com]. Since it is a block cipher, when you encrypt a file, at the last iteration you have a chunk of 256 bits, which is used as a digest. If you change anything in the file, the change will propagate to other blocks (if encryption is done in CBC mode), so the last block (i.e. digest) will be different.
  • 3 replies beneath your current threshold.