SHA-1 Collisions for Meaningful Messages 128
mrogers writes "Following on the heels of last year's collision search attack against SHA-1, researchers at the Crypto 2006 conference have announced a new attack that allows the attacker to choose part of the colliding messages. "Using the new method, it is possible, for example, to produce two HTML documents with a long nonsense part after the closing </html> tag, which, despite slight differences in the HTML part, thanks to the adapted appendage have the same hash value." A similar attack against MD5 was announced last year."
yeah... nonsense (Score:5, Funny)
To achieve this, the method uses material pulled from myspace.com.
Re: (Score:2, Funny)
Falling back to sco.com, Microsoft.com, digg and slashdot, in that order.
Re: (Score:1)
Quite simple to check file size also (Score:1)
Re: (Score:2, Insightful)
Re: (Score:1)
(Of course collisions would still be possible, just more difficult--but that's true about any hash of any length less than file size, is it not?)
Re: (Score:2, Interesting)
071883abcdef18234760abdefcd38f387bc93-78b
And 78b would be the size of the file. Not being a cryptologist, not even remotely, I am probably missing something quite simple and obvious here, but at first glance it seems like this would work.
Re: (Score:2, Interesting)
It would also take a bit more (though maybe not much) to apply it to digital signatures.
Re:Quite simple to check file size also (Score:4, Informative)
From the article:
Using the new method, it is possible, for example, to produce two HTML documents with a long nonsense part after the closing </html>tag, which, despite slight differences in the HTML part, thanks to the adapted appendage have the same hash value.
So it appears that both the original and the new messages need that appendage. This isn't just about adding an appendage to a known, appendage-less document.Re: (Score:2, Interesting)
For other document types, I coul
Re: (Score:3, Insightful)
Re: (Score:2)
Come on guys, SHA-1 was broken years ago. That people are now devising attacks is hardly newsworthy.
Re: (Score:3, Informative)
Let's not forget this attack is against a reduced 64 round version of SHA-1.
SHA-1 still does what it says on the tin: the best attack known against the 80-round version is 2^63, which is still not practical.
Re:Quite simple to check file size also (Score:4, Informative)
As the heise.de article points out; the twins are of equal length - the file size would be the same.
Finding hash twins whereby the chosen one is, oh, let's say 160 bits longer is a degree less sophisticated.
Re: (Score:2, Informative)
I don't know the details of this particular attack, but usually attacks on hashes like this produce two documents with the same file size. Certainly the MD5 collisions a couple years ago had the same file size.
Re: (Score:2)
Re: (Score:2)
Not like if it was AES (Score:2)
Re:Not like if it was AES (Score:4, Informative)
Re: (Score:1)
Re: (Score:2)
Re:Not like if it was AES (Score:4, Insightful)
Now, if all emails were encrypted, it would be harder to immediately see what messages in a mailbox deserve your attention. But then at a later date CPU speed may make that a negligible difference.
Re: (Score:2)
and upon close inspection the good one would look very suspicious (lots of random garbage)
so its a break of sorts but nowhere near as bad as a preimage attack would be.
Re: (Score:3, Interesting)
That's the neture of hash keys anyway. The security weakness is not into collisions possibly existing, but how fast and feasible it is to find them.
It's a simple logical rule: if you have a 256-bit key, this is 2^256 possible hash combinations. If you put in a folder all possible 257 byte long text files, then each file will have a key that matches a the key of a
Re: (Score:2)
This is a big deal (Score:5, Insightful)
While expected since last year, selecting and using crypto-hashes just got a lot more difficult and error prone.
Re: (Score:1, Insightful)
How so? By default, a hash is breakable since its reducing the infinite input into a finite output. Ciphers do not.
Re: (Score:2)
That is nonsese. Breaking a Hash means to find collisions with reasonable effort. Breaking a cipher means to reverse it with reasonable effort.
Re:This is NOT a big deal (Score:5, Informative)
The second reason to keep cool is just as important, if not even more so: hackers will have to execute a pre-image attack to manipulate, for instance, a contract that has been digitally signed. In other words, hackers will have to find a second, manipulated contract with the same hash value as the real contract. In principle, the number of operations needed is thus far greater (2160). Indeed, as far as we know all attacks to date have only concerned collisions, and Wang et al. does not change that. There are no known methods to reduce significantly the number of operations needed for pre-image attacks.
Don't you think you're flying off the meter here a bit... Just because a collision was found means truly little. So a garbage laced HTML page was created after the actual HTML closing tag... 1) No one will see what comes after that unless you like viewing the source of a webpage as opposed to an actual page. 2) You should read up on birthday paradoxes. If someone created two similar messages, it would take years for them to figure out how to compute a hash to match. Now in the field of sending out something so so so secure, what makes you think that even if a someone did re-computate a hash to match, that message would be worth anything years down the line. Someone would have to be able to accomplish a collision, re-computate the hash in their new message and send it all within minutes for it to truly be a threat.
Let's look at this scenario... A massive kernel update is made to say Linux... The information is hashed, posted, and everyone is now going to update their Linux boxes... Unless someone is so quick fast to intercept along this path, most are safe unless they choose to verify the hash years down the line (which by then would be worthless). So unless someone can exploit this within minutes (no more than I would guesstimate 36 hours), I see little reason to get all bent out of shape over this...
Re:This is NOT a big deal (Score:5, Informative)
Common web browsers (I just tried Opera, FF, and Lynx) will happily display everything after the closing tag. You would have to put it inside <!-- --> comment delimiters, but then it doesn't matter whether it is before or after the closing tag. Unless the attack doesn't work if the --> has to come at the end, but then you can just omit the closing tag. Only an XHTML-compliant browser would complain. From cursory scanning TFA it is not clear to me what the reason is for mentioning the closing </html.
Re: (Score:3, Informative)
Re: (Score:2)
Re: (Score:2)
http://www.heise-security.co.uk/articles/75686 [heise-security.co.uk]
Re: (Score:3, Insightful)
Whirlpool [terra.com.br] is a good choice these days. It's longer than most of the hashes out there, but I don't believe there have been any attacks yet demonstrated against it.
For those pythoners out there I wrote a quick wrapper for it [kepty.com] that should get you started. Excuse any site errors and just hit refresh
Re:This is a big deal (Score:4, Informative)
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
Re:This is a big deal (Score:5, Interesting)
Seconded. Whirpool uses similiar mathematics to AES so an attack that breaks Whirpool is likely (although not certain by any stretch of the imagination) to also break AES.
I think much like it is harder to design a cipher that resists attack when you use an LFSR as your base primitive it is hard to design a hash that is secure that uses an Unbalanced Fiestel Network (UFN).
This is why I do not advocate moving to the higher SHAs. I believe that some weakness will be discovered and it will be found the UFN made it worse.
If you're going to use AES, you've already thrown all your eggs in the Wide-trail design basket. If you're going to do that for the cipher, you might aswell do the same for the hash too.
In fact, in most cases you will use the hash has part of an authentication primitive anyway. In this case, there's a good argument for dumping a new hash and using an encrypt-authenticate mode of operation instead of something like HMAC. That way, you reduce the number of assumptions which have to be true for the system to be secure, which can only be a good thing.
In short, if you need to authenticate use your favourite encrypt-authenticate mode. If you need a hash for some other purpose, use Whirlpool.
Simon
Re: (Score:2)
Alright, so there's this SSH packet coming down the line. For those of you not in the know, it loos like this:
[encrypted block][hmac of decrypted
Re:This is a big deal (Score:4, Interesting)
Tiger is another good hash function - faster than Whirlpool, smaller for those embedded cases where even the bytes matter, and I believe it is not known to have any attacks against it. Tiger also appears in mhash, not certain if it's in the kernel but it should be.
I don't see that it is really of any consequence whether anyone has actually demonstrated an attack on SHA - the point of security is to not wait until AFTER the house has been plundered to upgrade. SHA is FIPS-180, but if there is even a theory on how it might become broken, I would urge people to use something stronger. Security that is only certain to be good against skript kiddies is really not very useful security.
A little information theory (Score:4, Informative)
A hash will either contain all of the non-redundant information in the original content, or some of the information gets lost during the hash. Non-redundant information being defined in an information-theory sense that a given bit is completely random/unpredictable based on the content of preceding bits.
In order for a hash to be completely collision proof, it has to contain all of the non-redundant information contained in the original file. Otherwise information in the orignal message is lost in the hash. And if information is lost from the original message, that creates a possibility of constructing a message that differs only in the information that is removed by the hash. Only if the original message is reconstructible from the hash (plus possible information contained in the hash algorithm itself) will it be collision-proof. You've either got the information-content, or you don't. And if you don't have the content, you can't validate it.
Re: (Score:2)
This statement makes little sense to me. "Collision resistant"? It's still quite collision resistant. But by the very definition of such a hash (mapping more than n messages, into n hashes), means a hash could never be collision-*proof*. Since any hash is not going to be collision proof, you can't truly "rely" upon it in the absolute sense.
Perhaps a more precise definition is that there is now a known way to
Re: (Score:2)
See? You answered your question yourself. Except that in crypto terms SHA-1 is now not collision restistant anymore.
Original (Score:1)
Re: (Score:1)
Re: (Score:3, Informative)
Re: (Score:1)
Re: (Score:2)
How about this combination: (Score:4, Interesting)
1) Message/file length
2) SHA1 hash
3) MD5 checksum or some other hash/checksum that's calculated way differently from SHA1.
Providing the length means that the person trying to change the data needs to keep it the same length which makes it more difficult.
Using 2 different hashing/checksumming methods means they have to be able to match both of them in order to be able to switch the data.
The more restrictrictsion we toss on the data, the harder it is to manipulate. I do think that using more than 2 or 3 hashing/checksumming methods would be overkill however.
Why not more rounds? (was Re:How about this c...) (Score:1)
Why not just create an algorithm with more rounds? I'm sure it would have more benefits than just slapping two algorithms together. In fact MD5 and SHA came from the same algorithm anyway (MD4), so I don't see much of a benefit... You may even need less processing time to calculate the hash than just doing both.
Re: (Score:1)
Thi smight be good is the number that was output was two seperate numbers concatenated together. Otherwise it's still just one number they have to match when they are modifying the hash protected data.
Re: (Score:2)
The better alternative is to move to an existing well tested algoritm while the security researchers are testing the future repla
Re: (Score:2)
My favorite example is when script kiddies pull out some wacky scheme like "Take a message, type it on an image in photoshop, save as JPEG, re
Re:How about this combination: - Not so good (Score:1, Informative)
Re: (Score:2)
Channel: 1691 (TEST)
Upload object: 1820 (Test AH)
Time of upload: 24-05-2006 14:12:41
Size of file: 65298 bytes
Uploader: doe (John Doe)
File received as: report.doc
File stored as: doe.doc
File SHA1 digest: e9dd2b06972aac8ef63a5b33e75775ad88d84556
File MD5 digest: df914e41a7d85d4bfd897d368a052f8b
Re: (Score:2)
The only reasonable use for stuff like this is the birthday attack. In the birthday attack, the attacker generates both documents with the hash collisions. They both have a big block of invisible junk in them. That leaves plenty to play with.
Your suggestion does nothing against the birthday attack.
Re: (Score:2, Insightful)
I really am asking-- I'm not all that up on the guts-and-wherefores of encryption/hashing, and I've wondered about this question as well.
Re: (Score:1)
Actually when you going for data verification of non-secret data, more precise information is better.
to find one... how big will the html get (Score:1)
Re: (Score:1)
IANACE (I am not a crypto expert), but...
Well, since you can easily squeeze about 6 bits for readable gibberish (meaning up/lowercase letters numbers and a few punctuation marks), I calculate you should only need 27 characters (160 bits/6). However that is only an estimate. You may need more because the inbetween bits may affect the result. If you were willing to go with unreadable characters, then 20 bytes should be enough.
Either way, I don't see why you'd need more than one line, so this attack may no
Already expected (Score:1)
NO SHA-1 COLLISIONS HAVE EVER BEEN FOUND (Score:5, Informative)
Ahem.
Sorry, my caps lock key got stuck there.
No SHA-1 collisions have ever been found. The first person or group to find one will achieve considerable fame. I say this as an attendee of both last week's Crypto conference and the immediately following hash function workshop.
The work factor estimated for a SHA-1 collision is something over 2^60 cycles. That would put it on par with the biggest calculations that have ever been done (publicly anyway). So far nobody has put together a sufficient effort to achieve a collision.
At the hash function workshop, cryptographer Antoine Joux published a set of recommendations for how such a hash collision effort should be mounted, in order to minimize the damaged from a published collision. The main goal is to make it difficult to take a published collision and use it to create harmful effects in various ways. Hopefully Joux's guidelines will be followed if and when a SHA-1 collision finding project gets started.
easy tiger... (Score:3, Insightful)
No SHA1 collisions have ever been published
whether or not they have been found is a different matter entirely.
multiple hashes MD5 and SHA-1 (Score:3, Interesting)
Software will just need to be updated to calculate two hashes. Good luck finding two sets of data that are different yet have the same length, the same SHA-1 hash and the same MD5 hash.
Re: (Score:1, Insightful)
Re: (Score:2)
You are a crypto n00b! Good luck? It's called a birthday attack. And calculating 2 hashes on the same data is not fundamentally different from just having a longer hash.
All you are proposing is, instead of using SHA1(data), use NewHash(data) where
NewHash(data)=SHA1(data),MD5(data)
That just a longer hash, and just as attackable by the birthday attack.
Add size of file (Score:1, Redundant)
Re: (Score:3, Informative)
MD5, SHA, and every other hash function I've ever read the spec for appends some zeros followed by the original message size (the zeros are so it comes to an integer number of blocks) as the first thing it does. For exactly this reason.
At a guess, this attack requires that the t
Re: (Score:2)
Re: (Score:3, Informative)
Re: (Score:1)
Re: (Score:1)
I know you're an expert in cryptography, but my simple mind sees it this way. MD5.
1. Takes original message
2. Appended Message = original message plus some added bits (includes the file size)
3. Message Digest = Md5 algorithm run on the Appended Message
Would it not be simpler to just include the file size as part of the message digest itself.
1. The original message
2. Message Digest = md5 run on the original message
3. Final me
Re: (Score:2)
Re: (Score:2)
PS: The attacks on both algorithms create files of equal size.
Re: (Score:1)
Passwords are 'files', too (Score:1)
Some Software does include size optionally, other does compute two hashes.
The basic problem though, is that MDn and SHA-n are in the same family and these algorithms have been fundamentally broken for some time now. Instead of slathering on more of old security (Hi, 3DES!), get a new, better and above all different approach like http://en.wikipedia.org/wiki/Whirlpool [wikipedia.org]
OK, what do we use now? (Score:1)
Insecure (Score:5, Informative)
Complaining that SHA-1 is insecure is like complaining that Windows 98 is insecure.
Oblig Wikipedia link: http://en.wikipedia.org/wiki/SHA_hash_functions [wikipedia.org]
Re: (Score:3, Informative)
Slected as the replacement for SHA-1 years ago by the European NSA, 512 bits long, based on AES.
Who loves ya?
Re: (Score:2)
d41d8cd98f00b204e9800998ecf8427e
Haha, I have cracked your worthless hash and I can tell you that the sha1sum of your very short kernel also begins with a 'd', see:
sha1sum
da39a3ee5e6b4b0d3255bfef95601890afd80709
Re: (Score:2)
Oh no!!! (Score:2)
Re: (Score:2)
what? (Score:2)
So how can SHA-2 be secure? It has no proven collisions? I can prove it has collisions. It accepts more than 512 bits of input and only procues 512 bits of output, it must collide.
There have not been shown to be collisions for meaningful messages in SHA-1 yet. And the "break" for it requires on the order of 2^63 operations.
SHA-1 is a long way from dangerous to use at this point.
Re: (Score:1)
If a bug is found, the function can be simplified so that instead of two weeks the password can be found in two days, because the number of iterations is reduced.
that's not what hash function security is about (Score:2)
First of all, all hash functions of the same size are pretty much equally vulnerable to a brute force attack - that's practically the defining feature of a brute force attack. It's resistance to shortcut attacks that matters.
Secondly, there are loads of properties people want hash functions to have - not only preimage resistance but second preimage resistance, collision resistance, correlation-freedom, multiplication-freedom and more. In fact no
Horray for validation (Score:2)
Nothing to see here.. (Score:2)
file size is used as part of the final message digest.
If your application doesn't do this then you should use another
applications.
Using the above mentioned processes none of these attacks are
viable, both in the short and long terms. (supposedly)
When receiving the following message:
Jon D0e owe$ Jane Do3 $1000`000 d011ar5.
or
Jon Doe owes Jane Doe $1000000 do11ars.
$4#^$%^&*5#$^()^%ER$%^RT#$3
Would you not ask the sender to resend
their original messa
Re: (Score:2)
Re: (Score:2)
uncompressed (original) size.
The question is can you tunnel a match for the MD that is also a valid
compressed stream of some compression algorithm X. FIPS 140 doesn't
say to compress but it does say to use a format in the message that
is totally reliant on every bit in the message to be decoded properly
and that does not ignore bits it does not understand. I believe an
amendment to this should be for every erroneous bit it receives it
should a
If you can replace the file... (Score:2)
Even if the hash and the file came from seperate sources, odds are a user went to 'xyzzysoft.com' from which the link to both the hash and the file are provided; so if xyzzy's page were hacked, they are screwed anyway.
To use a car analogy, to steal a purse in a car, a theif dosn't pick the lock, he smashes the window.
Unless you have a seperate, secure chann
Re: (Score:2)
Consider this: A patch to an OS or popular application comes out. The patch is subject to peer review before public release, the hash is published. 30 seconds after the public release happens, I replace the file on one of the mirrors (and NOT the hash -- I don't control the main site, only a mirror or two) with a malicious version with the same hash.
Consider bittorrent -- If you could quickly and easily generate new data that matches a known hash, you could join any torrent you w
Yeah, but.. (Score:1)
Re: (Score:1)
Re: (Score:1)
Re: (Score:1)
Re: (Score:1, Informative)
Re: (Score:1)
No, usually the attacks produce 2 files with the same size.
Maybe factoring in message size as part of the hash is the solutio
Re:git (Score:4, Interesting)
For now though from a theoretical viewpoint this is a major weakness, it still requires way too much processing power to be realistic. And the way git is designed, I don't think it is going to be any major problem switching to a new hash once cryptographers starts to agree which one should be considered secure in the future. Once they start using a new hash, you can actually still safely use old repositories based on SHA-1. Because once there is no longer being added new data based on SHA-1, a collision is no longer enough to perform an attack, rather you need a second preimage, something which there has not yet been demonstrated an efficient way to produce.