What happens with passwords in other languages, and more specifically with forcing the use of multi-byte UTF-8 characters? What about passwords that mix multiple languages?
Most brute-force password cracking at least uses a dictionary to get at the low-hanging fruit, so why not increase the size of the dictionary an attacker needs? My guess is that English has something like a million words, versus millions in Chinese?
It would seem that just branching out to Spanish, German, or whatever combination would greatly decrease the success of brute-force attacks.
I've analyzed password lists in several languages, and it depends on how the hashing algorithm encodes the password, or more specifically on how the program sends the password to the hashing algorithm. That is, the MD5 of a UTF-8 encoded password is different from the MD5 of the same password encoded with a codepage. That gets really interesting when someone switches between languages mid-password (e.g. half of the password in a right-to-left language such as Arabic, and the other half in a left-to-right language such as English).

Oh, and yes, increasing the keyspace with multiple alphabets certainly can hurt a brute-force attack, but not as much as you would expect if the password set is mostly from the same language group.

There are other patterns as well. For example, non-native English speakers tend to use more number replacements (1 for 'l', 3 for 'e', etc.), while English speakers favor symbol replacements (@ for 'a'). Also, in a Spanish set, numbers at the front of the password, such as '123password', were much more frequent than I've seen in other datasets (most people put the numbers at the end). Like all things though, these are just averages, so it's really hard to nail down the origin of a user based on their password unless they use a non-English word in it.
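To make the encoding point concrete, here is a minimal sketch in Python. The sample passwords and the helper name `md5_hex` are made up for illustration, and cp1252 stands in for "a codepage"; a real application may use a different one. It hashes the same string after encoding it two different ways.

```python
import hashlib


def md5_hex(password: str, encoding: str) -> str:
    """Encode the password with the given codec, then MD5 the resulting bytes."""
    return hashlib.md5(password.encode(encoding)).hexdigest()


# Hypothetical sample passwords, chosen only for illustration.
samples = ["password123", "contraseña123"]

for pw in samples:
    utf8_bytes = pw.encode("utf-8")
    cp1252_bytes = pw.encode("cp1252")  # assumption: a Windows-1252 codepage
    same = md5_hex(pw, "utf-8") == md5_hex(pw, "cp1252")
    print(f"{pw!r}: utf-8={len(utf8_bytes)} bytes, "
          f"cp1252={len(cp1252_bytes)} bytes, digests match: {same}")
```

The pure-ASCII password produces identical bytes, and therefore identical digests, under both encodings. The one containing 'ñ' does not, because that character is two bytes in UTF-8 and one byte in cp1252. So a cracker has to guess not only the word but also the encoding the application used before hashing.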