DigitalCorpora.org is a website of digital corpora for use in computer forensics education and research. All of the disk images, memory dumps, and network packet captures on this website are freely available and may be used without prior authorization or IRB approval. We also maintain a research corpus of real data acquired from around the world. Use of that dataset is possible under special arrangement.
The Digital Corpora project gets free hosting for the corpus as part of the AWS Open Data Sponsorship Program, for which we are grateful! We could not make this resource available without Amazon’s help.
Digital Corpora was started by Simson Garfinkel when he was an Associate Professor at the Naval Postgraduate School. The original funding for this project came from what is now called the Software Quality Group at the National Institute of Standards and Technology. The funding created our initial set of corpora and produced the paper, “Bringing science to digital forensics with standardized forensic corpora,” by Simson Garfinkel, Paul Farrell, Vassil Roussev and George Dinolt.
Digital Corpora is a website filled with realistic data for use in digital forensics research and education. Unlike other datasets, all of the information on this website was either created for this purpose or previously made freely available. As such, it can be used in education and research without concern about the existence of sensitive personally identifiable information (PII). No IRB approval is required to work with these data.
In 2023 and 2024, the Digital Corpora website underwent a significant expansion with the donation of the SAFEDOCS and UNSAFE-DOCS corpora that were produced by the DARPA SafeDocs program.
Questions regarding this website should be directed to Simson Garfinkel.