Hashing is a method for reducing a large input to a smaller, fixed-size output. In forensics, cryptographic hash algorithms such as MD5 and SHA-1 are typically used. These functions have properties useful to forensics: the same input always yields the same hash, and even a small change to the input yields a very different hash. Other types of hashing, such as Context Triggered Piecewise Hashing, can also be used.
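As a minimal sketch of how a file is reduced to MD5 and SHA-1 values, the following uses Python's standard `hashlib` module. The function name `hash_file` and the 1 MiB chunk size are illustrative choices, not part of any tool mentioned on this page.

```python
import hashlib


def hash_file(path, algorithms=("md5", "sha1"), chunk_size=1 << 20):
    """Return {algorithm name: hex digest} for the file at `path`.

    The file is read in chunks so that large evidence files do not
    have to fit in memory. `hash_file` is a hypothetical helper name.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}
```

Calling `hash_file("image.dd")` (a hypothetical file name) would return a dictionary with one hexadecimal digest per algorithm, which can then be recorded or compared with a previously recorded value.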
Hundreds of hashing programs exist; a few that are relevant to forensics are:
- md5sum - Part of the GNU coreutils suite, this program is standard on many computers.
- md5deep - Computes hashes, recursively if desired, and can compare the results to known values.
- ssdeep - Computes and matches Context Triggered Piecewise Hashes.
- National Software Reference Library (NSRL) - The largest hash database.
Online NSRL Lookup
- Allows searching NSRL 2.17 by MD5 or SHA-1. The dataset reportedly contains 43,103,492 files.
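Comparing hashes against a set of known values, as the tools and databases above are used to do, can be sketched roughly as follows. This is not how md5deep or the NSRL lookup is implemented; it is an illustration under stated assumptions. The names `md5_of_file` and `split_known_unknown` are hypothetical, and `known_md5s` stands in for a set of hexadecimal MD5 strings obtained elsewhere (loading an actual NSRL data set is not shown).

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def split_known_unknown(root, known_md5s):
    """Walk `root` recursively and sort files by whether their MD5 is known.

    `known_md5s` is an assumed iterable of hex MD5 strings (any case).
    Returns two lists of (path, digest) tuples: matched and unmatched.
    """
    known = {h.lower() for h in known_md5s}
    matched, unmatched = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest = md5_of_file(path)
            (matched if digest in known else unmatched).append((path, digest))
    return matched, unmatched
```

In an examination, files whose hashes match a known-software set can usually be set aside, leaving the unmatched files for closer review.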
MD5 Reverse Hash Services
There are several online services that allow you to enter a hash value and find out what the preimage might have been. One way to find these services is to search the web for 'd41d8cd98f00b204e9800998ecf8427e' (the MD5 of the empty string).
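Such services cannot invert MD5; they can only look a hash up in a table built by hashing candidate inputs in advance. The following is a minimal sketch of that idea, not the implementation of any particular service. The function name `build_reverse_table` and the candidate list are made up for illustration.

```python
import hashlib


def build_reverse_table(candidates):
    """Map MD5 hex digest -> the candidate string that produced it."""
    return {hashlib.md5(c.encode("utf-8")).hexdigest(): c for c in candidates}


# A tiny, hypothetical candidate list; real services hash millions of inputs.
table = build_reverse_table(["", "password", "letmein"])

# The MD5 of the empty string is found because "" was among the candidates.
print(repr(table.get("d41d8cd98f00b204e9800998ecf8427e")))  # prints ''

# A hash whose preimage was never hashed in advance cannot be reversed.
print(table.get("00000000000000000000000000000000"))  # prints None
```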
Here are some services that we have been able to find:
- MD5 reverse lookup, operated by Stephen D Cope. As of December 2007 this database had 28 million MD5 hashes. The author states that the database is divided into 256 MySQL tables to make the problem more tractable. The database claims to include every two, three, and four digit combination, all dictionary words, and a pile of user-submitted data. The author also states that they are attempting to calculate and index all possible MD5 hashes. That is impossible in practice: MD5 has 2^128 possible outputs, far more than any database could store.
- Similar to the NZ server, but with only 16 million MD5 hashes.
- A nice forward and reverse demonstration system, with an XML and AJAX interface.
- Reverse hash lookup of MD5, SHA-1, MySQL, NTLM, and Lanman hashes. Claims 75 million hashes of 13.2 million unique words.
- MD5 reverse lookup with approximately 1 million entries.
- This site is another simple MD5 reverse lookup. It claims a database with "billions" of entries and is aimed mostly at password cracking. (Who uses straight MD5s for passwords?)
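The scale of these databases, and the infeasibility of indexing every MD5 value noted for the first service above, can be put into numbers with a short calculation. The partitioning function below is only an assumption about how a database might be split into 256 tables (by the first byte of the hash); the source does not say how the split is actually done, and `table_for` is a hypothetical name.

```python
TOTAL_DIGESTS = 2 ** 128          # MD5 produces 128-bit outputs
BYTES_PER_DIGEST = 16             # 128 bits = 16 bytes

# Storing only the digests, with no preimages or index overhead.
storage_bytes = TOTAL_DIGESTS * BYTES_PER_DIGEST   # equals 2**132 bytes

# Fraction of the MD5 output space covered by 28 million stored hashes.
indexed = 28_000_000
fraction = indexed / TOTAL_DIGESTS


def table_for(md5_hex):
    """Pick one of 256 tables from the first byte (two hex digits) of a hash.

    This partitioning scheme is an assumption made for illustration.
    """
    return int(md5_hex[:2], 16)


print(storage_bytes)
print(f"{fraction:.1e}")
print(table_for("d41d8cd98f00b204e9800998ecf8427e"))  # prints 212 (0xd4)
```

Splitting by a hash prefix keeps each table roughly 1/256 of the total size, since MD5 outputs are spread evenly over their leading bytes, but it does nothing to shrink the 2^128 output space itself.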