We are having problems with this website. Until further notice, please use this download link as the root of the download hierarchy:

Collection of Hash Values of Forensically Uninteresting Files Available

This file lists SHA-1 hash values of files that are uninteresting for
forensic investigations according to a variety of criteria, including
frequency on drives of both hash value and path, time of creation
within both the minute and the week, file size, directory context both
in path and in sibling files, and file extension. Hash values are listed as 40
hexadecimal characters. This data is derived from the Real Drive
Corpus collected by the DEEP Project at the U.S. Naval Postgraduate
School, plus data from drives in classrooms and laboratories at NPS
and some other sources. Hash values in the January 2014
version of NSRL (the National Software Reference Library)
have been excluded.
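
As a rough illustration of how such a list might be used, the sketch below loads 40-character hexadecimal SHA-1 values into a set and checks whether a file's hash is in it. It assumes one hash value per line of text, which is an assumption about the layout and not a statement of the actual file format. The function names are hypothetical.

```python
import hashlib
import re

# A SHA-1 value written as exactly 40 hexadecimal characters.
SHA1_HEX = re.compile(r"[0-9a-fA-F]{40}")


def load_hash_set(lines):
    """Collect well-formed SHA-1 values from an iterable of text lines.

    Assumes one hash per line (an assumption, not the documented layout).
    Values are normalized to upper case; malformed lines are skipped.
    """
    hashes = set()
    for line in lines:
        token = line.strip()
        if SHA1_HEX.fullmatch(token):
            hashes.add(token.upper())
    return hashes


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 of a file as 40 upper-case hexadecimal characters."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def is_uninteresting(path, hashes):
    """True if the file's SHA-1 appears in the loaded hash set."""
    return sha1_of_file(path) in hashes
```

A set gives constant-time membership tests, which matters when millions of hash values are loaded at once.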

The criteria for selecting these hash values and the methods used to
obtain them are described in but have now been
applied to significantly more files than the corpus used for the
paper. Our methods focus on cross-correlation of files in a large
corpus and are thus quite different from those used in collecting the
NSRL data. The hash values were obtained from images of 245 million
files on 3905 drives. Our set currently has 16 million hash values not
in NSRL, and NSRL currently has 36 million hash values, so this is a
significant supplement to NSRL.

This data was produced in July 2014 by Neil Rowe.
Please acknowledge us in publications if you use this data.


MySQL tables for NIST NSRL RDS 2.26 posted

Ever wanted SQL access to the NIST RDS but didn’t want to spend a month building the MySQL tables? Well, we did too… So we took one of our 8-core, 32GB servers, imported all of the NSRL, and then made a tar file of the tables available for download on this server.

To use these files, just download them and put them in your MySQL data directory. You’ll be up and running in no time.
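
The unpacking step could be scripted roughly as follows. This is only a sketch: the archive name and data directory in the usage comment are hypothetical, the layout inside the tar file is not known here, and it is generally safest to copy table files while the MySQL server is stopped and to check file ownership afterwards.

```python
import os
import tarfile


def unpack_tables(tar_path, data_dir):
    """Extract the regular files in tar_path into data_dir.

    Returns the list of member names that were extracted.
    """
    extracted = []
    root = os.path.realpath(data_dir)
    with tarfile.open(tar_path) as archive:
        for member in archive.getmembers():
            target = os.path.realpath(os.path.join(root, member.name))
            # Refuse entries that would land outside the data directory.
            if os.path.commonpath([root, target]) != root:
                raise ValueError(f"unsafe path in archive: {member.name}")
            if member.isfile():
                archive.extract(member, root)
                extracted.append(member.name)
    return extracted


# Hypothetical usage; both paths are placeholders, not the real names:
# unpack_tables("nsrl_tables.tar", "/var/lib/mysql")
```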

