SHA-1 Collisions for Meaningful Messages 128

mrogers writes "Following on the heels of last year's collision search attack against SHA-1, researchers at the Crypto 2006 conference have announced a new attack that allows the attacker to choose part of the colliding messages: 'Using the new method, it is possible, for example, to produce two HTML documents with a long nonsense part after the closing </html> tag, which, despite slight differences in the HTML part, thanks to the adapted appendage have the same hash value.' A similar attack against MD5 was announced last year."
This discussion has been archived. No new comments can be posted.


Comments Filter:
  • by Name Anonymous ( 850635 ) on Sunday August 27, 2006 @10:46AM (#15989436)
    Provide the following 3 pieces of data:

    1) Message/file length
    2) SHA1 hash
    3) MD5 checksum or some other hash/checksum that's calculated way differently from SHA1.

    Providing the length means that anyone trying to tamper with the data must keep it the same length, which makes the attack more difficult.

    Using two different hash/checksum methods means an attacker has to match both of them in order to swap the data.

    The more restrictions we toss on the data, the harder it is to manipulate. I do think that using more than two or three hashing/checksumming methods would be overkill, however.
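A minimal sketch of the scheme described above, using only Python's standard hashlib; the function names `describe` and `verify` are just illustrative, not from any particular tool:

```python
import hashlib

def describe(data: bytes) -> dict:
    """Return the three pieces of data the parent suggests publishing:
    message length, SHA-1 hash, and MD5 checksum."""
    return {
        "length": len(data),
        "sha1": hashlib.sha1(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
    }

def verify(data: bytes, expected: dict) -> bool:
    """A forged file must now match all three values at once."""
    return describe(data) == expected
```

An attacker now needs a simultaneous SHA-1 and MD5 collision at the exact same length, which no published attack produces.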
  • by Ckwop ( 707653 ) * on Sunday August 27, 2006 @11:02AM (#15989507) Homepage
    Whirlpool is a good choice these days. It's longer than most of the hashes out there, but I don't believe any attacks have yet been demonstrated against it. For the Pythoners out there, I wrote a quick wrapper for it that should get you started. Excuse any site errors and just hit refresh.

    Seconded. Whirlpool uses similar mathematics to AES, so an attack that breaks Whirlpool is likely (although not certain by any stretch of the imagination) to also break AES.

    I think that, much as it is harder to design a cipher that resists attack when you use an LFSR as your base primitive, it is hard to design a secure hash that uses an Unbalanced Feistel Network (UFN).

    This is why I do not advocate moving to the higher SHAs. I believe that some weakness will be discovered, and it will be found that the UFN made it worse.

    If you're going to use AES, you've already put all your eggs in the wide-trail design basket. If you're going to do that for the cipher, you might as well do the same for the hash too.

    In fact, in most cases you will use the hash as part of an authentication primitive anyway. In that case, there's a good argument for skipping a new hash and using an encrypt-authenticate mode of operation instead of something like HMAC. That way, you reduce the number of assumptions that have to be true for the system to be secure, which can only be a good thing.

    In short, if you need to authenticate, use your favourite encrypt-authenticate mode. If you need a hash for some other purpose, use Whirlpool.

    Simon
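For comparison, the HMAC construction mentioned above is a one-liner with Python's standard hmac module. This sketch uses SHA-1 only because it is the hash under discussion; note that an HMAC forgery needs the key, so a bare collision in the hash is not by itself enough to forge a tag:

```python
import hmac
import hashlib

def make_tag(key: bytes, message: bytes) -> bytes:
    # HMAC binds the message to a secret key.
    return hmac.new(key, message, hashlib.sha1).digest()

def check_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(make_tag(key, message), tag)
```

An encrypt-authenticate mode replaces this pair of calls with a single primitive, which is the reduction in assumptions the parent is arguing for.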

  • by The_Wilschon ( 782534 ) on Sunday August 27, 2006 @11:19AM (#15989567) Homepage
    What if it were tacked onto the end of the hash? So your hash would look like

    071883abcdef18234760abdefcd38f387bc93-78b

    And 78b would be the size of the file. Not being a cryptologist, not even remotely, I am probably missing something quite simple and obvious here, but at first glance it seems like this would work.
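The suggestion above is easy to sketch, assuming (as the `78b` in the example suggests) that the size is appended in hex; the function name `tagged_digest` is just illustrative:

```python
import hashlib

def tagged_digest(data: bytes) -> str:
    """Append the message size, in hex, to the SHA-1 digest,
    as the parent comment proposes."""
    return "%s-%x" % (hashlib.sha1(data).hexdigest(), len(data))
```

Note this only helps against attacks that change the length; the new collision attacks already produce two messages of equal length, so the suffix alone adds little there.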
  • by Lockejaw ( 955650 ) on Sunday August 27, 2006 @11:39AM (#15989653)
    That would work for typical everyday use (like a checksum next to a link to a downloadable file). Of course, this assumes the birthday attack wasn't already producing two messages of the same size.

    It would also take a bit more (though maybe not much) to apply it to digital signatures.
  • by KalvinB ( 205500 ) on Sunday August 27, 2006 @12:10PM (#15989790) Homepage
    If the MD5 hashes of two different strings with the same SHA-1 value are themselves different, then there's no real reason to panic. For an added level of security you could also record the byte length of the data.

    Software will just need to be updated to calculate two hashes. Good luck finding two sets of data that are different yet have the same length, the same SHA-1 hash and the same MD5 hash.
  • by jd ( 1658 ) <imipak@yahoGINSBERGo.com minus poet> on Sunday August 27, 2006 @02:35PM (#15990376) Homepage Journal
    Whirlpool is also in mhash, which is now many versions on from the ones supplied by Fedora Core and Gentoo. Oh well. It's also in the Linux kernel's crypto library. Whirlpool is a damn good hash and uses the same principles as the Rijndael cipher, which means that the underlying concepts have been deeply analyzed twice - once in each form - showing the basic ideas are fairly solid. Being long should reduce the risk of collisions so is actually a strength in many cases - particularly as we're talking bytes, not megabytes.


    Tiger is another good hash function - faster than Whirlpool, smaller for those embedded cases where even the bytes matter, and I believe it is not known to have any attacks against it. Tiger also appears in mhash, not certain if it's in the kernel but it should be.


    I don't see that it is really of any consequence whether anyone has actually demonstrated an attack on SHA - the point of security is to not wait until AFTER the house has been plundered to upgrade. SHA is FIPS-180, but if there is even a theory on how it might become broken, I would urge people to use something stronger. Security that is only certain to be good against script kiddies is really not very useful security.

  • by suv4x4 ( 956391 ) on Sunday August 27, 2006 @03:29PM (#15990614)
    Someone could modify the original tarball, for example, introducing a trojan horse, and append some other not useful data to it so the sum matches.

    That's the nature of hash keys anyway. The security weakness is not in the mere existence of collisions, but in how fast and feasibly they can be found.

    It's a simple counting argument: if you have a 256-bit hash, there are 2^256 possible hash values. If you could list all possible 257-bit messages, then on average each hash value would be shared by two of them.

    Make it 258 bits, and on average four messages share each hash value. Now make it a 100-kilobyte file...

    Again, it's not about the fact that collisions exist (it's apparent they do), it's whether you can abuse that fact or not. MD5 has been compromised for certain applications, for example, but that doesn't make it useless.
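The counting argument above can be made concrete by shrinking the hash to 8 bits, so a collision appears almost immediately; `toy_hash` is of course a made-up truncation for illustration, not a real hash:

```python
import hashlib
from itertools import product

def toy_hash(data: bytes) -> int:
    # An 8-bit "hash": the first byte of SHA-1, so only 256 possible values.
    return hashlib.sha1(data).digest()[0]

# 2-byte messages: 65536 inputs into 256 buckets, so by the pigeonhole
# principle a collision is unavoidable within the first 257 messages.
buckets = {}
collision = None
for pair in product(range(256), repeat=2):
    msg = bytes(pair)
    h = toy_hash(msg)
    if h in buckets:
        collision = (buckets[h], msg)
        break
    buckets[h] = msg
```

With a real 160- or 256-bit hash the same collisions exist in principle; the entire question is whether finding one is computationally feasible.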
  • Re:git (Score:4, Interesting)

    by kasperd ( 592156 ) on Sunday August 27, 2006 @05:05PM (#15990929) Homepage Journal
    This is bad news for the git.
    It is not a major problem. First of all, to exploit it you'd have to generate a collision and have one of the two versions accepted into mainstream. Second, you'd have to get the wrong version onto some user's machine before the correct version. Linus explained this in a posting somewhere after the original SHA-1 weakness was published. And though Linus AFAIK does not have any education in cryptography, he has demonstrated that he clearly knows how to apply cryptographic primitives in a sound way. I have a PhD in cryptography, and I have read about the design of git, and I did not spot any weakness.

    For now, though this is a major weakness from a theoretical viewpoint, it still requires far too much processing power to be practical. And the way git is designed, I don't think it will be any major problem to switch to a new hash once cryptographers start to agree on which one should be considered secure in the future. Once they start using a new hash, you can actually still safely use old repositories based on SHA-1: once no new data is being added under SHA-1, a collision is no longer enough to perform an attack; rather, you need a second preimage, something for which no efficient method has yet been demonstrated.
  • by DigitAl56K ( 805623 ) on Monday August 28, 2006 @03:36AM (#15992558)
    That's fine, except where the interpreted data can be created from more than one set of actual data. An example might be when creating a hash of a zip archive. If I take the original zip archive, and recompress it using a more efficient algorithm I can reduce the actual data size. I could even go so far as to manipulate the compression algorithm for every byte encoded until I found an encoding that enabled me to produce a hash collision and still match the original file size.

    For other document types, I could, for example, take a marked-up UTF-16 document and change the character encoding to UTF-8, saving roughly half the space for a document whose characters were mostly English glyphs, then execute the same attack. For rich documents, space could be saved by minimally degrading image quality. For binaries, by running packers on them.

    Adding the file size to the hash string doesn't offer as much protection as you might at first think, when at the end of the day you can still find ways to create a collision at the original size.
