Announcement: hashdb toolset

The text file govdocs1-first512-first4096-docid.txt, which contained MD5 hashes of the first 512 bytes and first 4096 bytes of every file in the GOVDOCS1 corpus, has been removed. The file was provided to assist research on block hashes. We have since created the hashdb toolset, which supports creating and working with hash block databases. Please refer to https://github.com/simsong/hashdb/wiki to download the code, follow continuing progress on this topic, and find links to relevant papers, including:

Distinct Sector Hashes for Target File Detection

A related master's thesis on this topic was completed at the Naval Postgraduate School in 2012 and can be downloaded for additional reading: http://simson.net/ref/2012/kmf_thesis.pdf
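
For readers who relied on the removed file, here is a minimal sketch of how MD5 hashes of the first 512 and first 4096 bytes of a file could be recomputed with Python's standard library. It is not the hashdb toolset and does not reproduce the layout of the removed text file. The function name prefix_md5s is hypothetical, and hashing whatever bytes are available when a file is shorter than the requested size is an assumption of this sketch.

```python
import hashlib


def prefix_md5s(path, sizes=(512, 4096)):
    """Return {size: hex MD5 digest of the first `size` bytes of the file}.

    Hypothetical helper for illustration only. If the file is shorter than
    a requested size, the digest covers the bytes that exist (an assumption;
    the removed file may have treated short files differently).
    """
    with open(path, "rb") as f:
        # Read only as many bytes as the largest prefix requires.
        data = f.read(max(sizes))
    return {n: hashlib.md5(data[:n]).hexdigest() for n in sizes}
```

Calling prefix_md5s on a file path returns a dictionary keyed by prefix length, so prefix_md5s(path)[512] is the hash of the first 512 bytes. For building and querying block hash databases, use the hashdb toolset linked above.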

 

 

 


 

"This material is based upon work supported by the National Science Foundation under Grant No. 0919593. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation."