As a follow-up to yesterday's roundup of encryption algorithms, I bring you a roundup of popular hashing algorithms. Unlike encryption, which is generally a 2-way function (encrypt, then decrypt), hashing is generally one-way. There is no way to determine the original text based on a hash (except brute force) if it is properly implemented. So what's the point of that? Well, primarily it's used to verify the integrity of a given bit of data. Say, for instance, you are doing forensics on a computer system, and you generate hashes of all the files on the file system. You can demonstrate later, in a court of law perhaps, that those files have not changed since. Because if even one bit had changed the resulting hash would be different now than it was when you started your investigation.
Hashing algorithms are also useful for storing passwords, because the system doesn't need to record what your password is. It only needs to be able to verify that the hash for the password you are giving the system now matches the hash in your user profile. Since hashes are one-way, even if someone did manage to get your password hash out of the database there is no way for them to reverse that hash into a usable password.
A word about collisions:
Theoretically, hashing functions should produce a unique result (digest) for a given piece of content being hashed. If two pieces of content produce the same hash digest it's referred to as a "collision". Collisions are bad, because it casts doubt on the integrity of the hash.
All hash functions have potential for collisions. Let me say that again: All hash functions have potential for collisions. There is no way you can produce a fixed-length digest from a variable length bit of data and not have potential for collisions.
The goal of a good hash function is not to eliminate collisions, which is impossible. The goal is to make collisions so astronomically infrequent that it is computationally infeasible to find them.
OK, with that said, I give you The List:
Popular Hashing Algorithms
|
Algorithm
|
Digest Length (bits)
|
Passes
|
Cracked?
|
| HAVAL |
128-256 |
3, 4 or 5 |
128-bit/3-pass cracked |
| HAVAL-160 |
160 |
4x20 |
No |
| MD2 |
128 |
1 |
Yes |
| MD4 |
128 |
1 |
Yes |
| MD5 |
128 |
1 |
Yes |
| Panama |
256 |
|
Yes |
| RadioGatun |
Variable |
Variable |
No |
| RIPEMD |
128 |
4x16 |
Yes |
| RIPEMD-160 |
160 |
5x16 |
No |
| RIPEMD-320 |
320 |
4x16 |
No |
| SHA-1 |
160 |
80 |
Theoretical |
| SHA-2 Family |
224-512 |
64 or 80 |
No |
| SMASH |
256 |
|
Yes |
| Whirlpool |
512 |
10 |
No |
HAVAL and HAVAL-160: Introduced in 1992, HAVAL showed a lot of promise with its variable digest length and number of passes. However, HAVAL has some fatal flaws, which have been largely fixed by its successor, HAVAL-160, which uses a fixed length digest and a set number of passes. HAVAL-160 has no known weaknesses at this time and is considered secure.
MD Family (MD2, MD4, MD5): Designed in the late 80s and early 90s by Ron Rivest, the MD-series of algorithms were some of the earliest, and MD5 is to this day the most popular. MD5, the most secure of the bunch, has been shown to be abnormally susceptible to collisions, and its use is now actively discouraged. Some consider all current hash functions to be some strengthened variant of MD4.
Panama and Radio-Gatun: Panama was introduced in 1998, and has influenced a number of other popular designs. Unfortunately, in 2007, researches demonstrated a practical attack on Panama that demonstrated an unacceptable collision frequency. Luckily, in 2006, a variant of Panama was introduced, called Radio-Gatun (where do they get all these cool names!), that does not have the weaknesses that were exploited in its parent algorithm. Panama is considered broken, and Radio-Gatun is considered secure at this time.
RIPEMD Family: The original RIPEMD was based on MD4, and unfortunately inherited MD4's fundamental flaws. It is considered broken. The successors, RIPEMD-160, RIPEMD-256 and RIPEMD-320, are considered secure. There was also a RIPEMD-128, but that was found to have similar flaws as its parent, and is not in widespread use any longer.
SHA-1: Designed by the NSA, and published by NIST, SHA-1 was the darling of the federal government for quite a while, and as a result it's probably the most widely used hashing algorithm in place today aside from MD5. While no attacks have been reported in SHA-1, research has discovered a number of theoretical flaws in the algorithm, and NIST has mandated that all federal agencies must stop using it by 2010. It is being replaced by its successor, aptly named "SHA-2"
SHA-2 family: The SHA-2 family is actually 4 separate algorithms, distinguished by the length of their resulting digests. SHA-224, SHA-256, SHA-384 and SHA-512 are all technically SHA-2 algorithms. Along with SHA-1, these algorithms are the hashing algorithms currently recommended by NIST. Unfortunately, since SHA-2 is mathematically similar to SHA-1, there are concerns that it may share some of its weaknesses. A search for its successor, to be called SHA-3, is currently underway at NIST, using the same process that brought us the wildly popular AES encryption algorithm. The winning SHA-3 algorithm will be crowned in 2012. In the meantime, NIST is still recommending the SHA-2 family of algorithms for all purposes.
SMASH: Debuting in 2005, SMASH showed a lot of promise, and was a refreshing departure from the usual MD4 pedigree. Unfortunately, it was quickly broken.
Whirlpool: Released in 2000, Whirlpool actually harnesses the AES algorithm in its computations. Which isn't all that surprising when you consider that it was designed by one of Rijndael's original co-authors. There have been 3 releases, but even the early versions have stood up well to attack. Subsequent releases have improved the algorithm's speed, taking advantage of advances in processor technologies. It has been adopted by ISO for its 10118-3 standard.
Which to choose?
My current preference for hashing algorithm, due to its widespread adoption, is the SHA family, either SHA-1 of a variant of SHA-2 if possible. If I can get it, I'll use Whirlpool.