MD5 Collisions a Game Changer for SSL and AV Companies?
There has been quite a bit of press over the last day or two with respect to a design flaw with SSL that could allow an attacker to forge a security certificate such that it circumvents the built-in authentication methods within your browser. This means that your browser could believe that a malicious, look-alike web site for your bank could authenticate to your browser as your real bank web site if this attack is carried out correctly. See this story from CNET that has a graphical proof of concept example using Bank of America.
If you are not familiar with MD5, it is essentially a hashing algorithm that produces a 128-bit digest and is used by many security applications. For example, an MD5 hash is commonly used as a checksum by system integrity validators (SIV) to ensure that key binaries on your system have not changed from their known-good state (if they have, this could indicate that a trojan or rootkit has been installed on your system).
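As a rough illustration of the checksum comparison an integrity checker performs, here is a minimal Python sketch. The function names md5_of_file and changed_files, and the idea of a baseline dictionary mapping paths to previously recorded digests, are assumptions made for this example and are not taken from any particular SIV product.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large binaries are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(baseline):
    """Return the paths whose current MD5 differs from the recorded one.

    `baseline` is a hypothetical dict of {path: expected_md5_hex_digest}
    captured while the system was known to be clean.
    """
    return [
        path
        for path, expected in baseline.items()
        if md5_of_file(path) != expected
    ]
```

A check like this only reports a file whose hash has changed, so it cannot tell two different files apart if they happen to share the same MD5 value.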
MD5 checksums have been known for some time to not be completely secure. An MD5 hash is typically expressed as a 32-digit hexadecimal number, which means that there are only a finite number (2^128) of possible hash values. This has been considered good enough for many applications, but with the power of today's clustered computing environments (including botnets), the time it takes to generate a targeted MD5 collision has been greatly reduced. According to the CNET article, performing the initial forgery proof of concept took about 2 weeks on a cluster of 200 PlayStation 3 consoles. This kind of computing power is tiny compared to most botnets. Quite a few articles on the web (do a Google search for "md5 collision example" and some will yield source code) already discuss how easy it is to create an MD5 collision.
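The numbers in this paragraph can be checked with a few lines of Python. This is only a sketch of the arithmetic: the input string and variable names are arbitrary choices for the example, and the "birthday bound" line reflects the generic estimate that a collision in any 128-bit hash is expected after roughly 2^64 attempts. Published MD5 collision attacks exploit weaknesses in the algorithm itself and need far less work than that generic figure.

```python
import hashlib

# Arbitrary example input; any bytes would give a digest of the same size.
raw_digest = hashlib.md5(b"example input").digest()
hex_digest = hashlib.md5(b"example input").hexdigest()

digest_bits = len(raw_digest) * 8         # 16 bytes, so 128 bits
hex_digits = len(hex_digest)              # 4 bits per hex digit, so 32 digits
possible_hashes = 2 ** digest_bits        # size of the output space, 2^128
birthday_bound = 2 ** (digest_bits // 2)  # generic collision estimate, 2^64

print(digest_bits, hex_digits)
print(possible_hashes)
print(birthday_bound)
```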
Web site forgeries are only one example of how MD5 collisions can be used to circumvent security technologies. My friend Adam O'Donnell from Cloudmark points out in a Twitter update that an MD5 collision could also be used to make malicious software look legitimate. Take our SIV example from earlier. If a malicious version of a binary were created with the same MD5 checksum as its legitimate counterpart, your security checks might never notice that the original executable had been modified if your PC were infected with some type of trojan or rootkit. This could also force AV companies to rethink some of their own scanning methods.
What all of this really highlights is the fact that MD5 is no longer a "good enough" hashing algorithm (and in reality hasn't been for some time, but that hasn't stopped people from using it) if your intention is to create a hash that will be used as part of any kind of security/authentication system. I agree with Paul Kocher's statements from the CNET article: although this is certainly not one of the biggest security issues facing us right now, given all of the other application-based attacks that exist, it could potentially be very dangerous. It is another one of those attacks we have discussed that do not require elaborate social engineering to be carried out effectively (at least for web site forgeries), as the redirection to a malicious site can be carried out at the network level.
This is not the type of attack that is likely to occur on a large scale against widely used web sites (like the Bank of America proof of concept), as it would likely get sniffed out very quickly, but it could prove effective if used for smaller, more localized attacks.
