Author Archive

New Scenario: the 2012 National Gallery DC Scenario Digital Evidence and Instruction Materials

July 27th, 2018 No comments

I am happy to announce that a new digital corpus has been posted to the scenarios collection of digitalcorpora.org.

SIX YEARS IN THE MAKING!

The scenario is the 2012 National Gallery DC scenario. Working in collaboration with Joseph Greenfield at USC, we have now completed the collection, annotation, and preparation of teacher guides. This has been a massive project and all of the data are now available for public consumption.

The 2012 National Gallery DC scenario spans approximately 10 days and encompasses two distinct yet intertwined story arcs. The scenario is centered around an employee at the National Gallery DC Art Gallery. Criminal plans for both theft and defacement are discussed amongst actors during the scenario, and evidence may remain across the digital devices they used. The scenario ends when suspicious activity is reported to law enforcement, at which point certain devices are seized and network traffic logs are requested. (A wiretap had been previously ordered, so there are some full content traces available.)

The scenario includes several believable characters:

Tracy —
Tracy is a recently divorced mother in the middle of a child custody battle. Unfortunately, Tracy’s daughter is in an expensive private school, which Tracy can no longer afford on her salary. Her ex-husband will only pay for the school if Tracy will give over custody of their daughter to him. Worse, Tracy’s daughter, Terry, age 15, has stated that she would rather live with her dad if it comes to staying in school. “After all, you ran Dad off in the first place.”

Pat —
Pat is Tracy’s brother. He is a police officer with the D.C. Enforcers Bureau and holds the rank of detective. He is very devoted to his sister and his niece Terry. To this point he isn’t an outright criminal, but he walks the line very closely. He busted King with some items that were against his parole, but hasn’t arrested him on the promise of a future “favor.”

Joe —
Joe is the father of Terry and is currently going through the divorce with Tracy. Joe is financially well-off, and still bitter about the relationship problems. He previously installed a key logger on the MacBook Air in an attempt to keep track of Terry’s online behavior. Now that Joe and Tracy are going through a divorce, he has motivation to use the key logger to spy on both Tracy and Terry. Joe used to have an account on the family MacBook Air; however, it was deleted. The home folder may have been preserved.

Alex —
Alex is a Krasnovian supporter who wishes to embarrass the United States. He is a foreigner who lives outside the country, presumably in a region called Krasnovia. He knows Carry through extended family connections and contacts her because they share family ties and she is a fellow Krasnovian. He plans to deface foreign works that are on exhibit in the National Gallery DC. Defacing the artwork would embarrass the United States and possibly damage relations between the United States and the foreign country providing the exhibit. (In some documentation this is referred to as ‘Majavia’, a second pseudo-nation.)

Carry —
Carry is a somewhat criminally involved individual who shares family ties with Alex. She is a Krasnovian supporter. Carry is both technologically savvy and an occasional social media user. She is contacted by Alex at the beginning of the scenario and asked to orchestrate the defacing of the artwork, both because she is aligned with Krasnovia and because she has ‘Connections’. She is a casual friend or acquaintance of Tracy.

Terry —
Terry is the daughter of Tracy and Joe. Terry attends an expensive private school (Prufrock Preparatory School). She wants to stay in school to avoid having to start over and so that she can keep her current friends, despite the fact that her mother can no longer afford to pay the tuition.

The evidence in the scenario includes the following:

• Carry’s phone on 2012-07-15 [ZIP] [FTK Logical Dump]
• Carry’s tablet on 2012-07-16 [E01] [TAR]
• Email messages generated by the spyware installed on Tracy’s MacBook Air and periodically emailed to Joe [ZIP]
• Tracy’s phone on 2012-07-15 (EnCase) [L01] [ZIP]
• Tracy’s phone on 2012-07-15 (other extraction tools) [E01] [TAR]
• Tracy’s external hard drive [E01]
• Tracy’s home computer [E01] [E02]
• Exterior Network Packet Dumps
    • 2012-07-06: exterior-2012-07-06.pcap
    • 2012-07-09: exterior-2012-07-09.pcap
    • 2012-07-10: exterior-2012-07-10.pcap
    • 2012-07-12: exterior-2012-07-12.txt
• Interior Network Packet Dumps
    • 2012-07-06: interior-2012-07-06.pcap
    • 2012-07-09: interior-2012-07-09.pcap
    • 2012-07-10: interior-2012-07-10.pcap
    • 2012-07-12: interior-2012-07-12.txt
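
For a quick sanity check on the packet dumps before opening them in a full analysis tool, the sketch below counts the packets and captured bytes in a capture file. It assumes the .pcap files use the classic libpcap file format rather than pcapng, which this post does not state. The helper name summarize_pcap is hypothetical and is not part of the scenario materials.

```python
import struct

# Classic libpcap magic numbers as they appear on disk, mapped to the
# struct byte-order prefix needed to decode the rest of the file.
PCAP_MAGICS = {
    b"\xd4\xc3\xb2\xa1": "<",  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": ">",  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": "<",  # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": ">",  # big-endian, nanosecond timestamps
}


def summarize_pcap(stream):
    """Count packets in a classic pcap stream opened in binary mode."""
    header = stream.read(24)  # the global header is 24 bytes
    if len(header) < 24 or header[:4] not in PCAP_MAGICS:
        raise ValueError("not a classic pcap stream (pcapng is not handled)")
    endian = PCAP_MAGICS[header[:4]]

    packets = 0
    captured_bytes = 0
    first_ts = last_ts = None
    while True:
        record = stream.read(16)  # per-packet record header
        if len(record) < 16:
            break
        ts_sec, _ts_frac, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            break  # truncated final packet; stop counting
        packets += 1
        captured_bytes += incl_len
        if first_ts is None:
            first_ts = ts_sec
        last_ts = ts_sec

    return {
        "packets": packets,
        "captured_bytes": captured_bytes,
        "first_ts": first_ts,
        "last_ts": last_ts,
    }
```

A typical call would be `with open("exterior-2012-07-06.pcap", "rb") as f: print(summarize_pcap(f))`. The timestamps are returned as raw Unix seconds. The 2012-07-12 entries are .txt files and would not be read by this sketch.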

In addition to these final images, we also have day-by-day images of the two phones, the tablet, and the external hard drive. These day-by-day images are for *digital forensics research* and are not needed for scenario analysis.

The phones and the tablet were imaged using a variety of techniques, including logging in to the devices and running ‘tar’, as well as using commercial digital forensics tools.

It’s true that this dataset is six years old! Please accept our apologies: it took a long time to clear all of this information. The advantage is that there should be good support for all of the file formats on these media, in both commercial and open source tools.

You can download the scenario and the teacher guides from:

National Gallery DC 2012 Attack

The teacher guides are encrypted with the same passphrase that is used to encrypt all of the digitalcorpora.org teacher’s guides. If you do not have it, you can request it using the website’s contact form.

Finally, as a reminder: these are all fictional people and institutions. There is no National Gallery DC, there is no country of Krasnovia, and there is no D.C. Enforcers Bureau. Any similarity to actual people or organizations is entirely coincidental.

Categories: Disk Images, Scenarios Tags:

New Scenario: 2018 Lone Wolf

July 15th, 2018 No comments

We are pleased to announce a new scenario in the digitalcorpora family!

Released today is the 2018 Lone Wolf Scenario, created by GMU student Thomas Moore. The scenario consists of more than 32GB (compressed) of data that was seized from a fictional individual who was planning a mass shooting.

The 2018 Lone Wolf Scenario is based on a (fictional) unstable individual who is planning a mass shooting. The individual is interrupted when a family member calls the police and his apartment is raided. The task for the investigators is to determine if anyone else was involved.

This scenario contains a disk image and memory dump from a laptop. It’s an image of a real, physical machine that was actually used, so it’s quite big. Also included in the scenario are the results of modern commercial digital forensics tools applied to the dataset, so that students who don’t have access to these tools can still see their results. There is a teacher’s guide that includes a report on all of the planted evidence.

The 2018 Lone Wolf Scenario was created by Thomas J. Moore, a student at George Mason University.

Please remember: this is a fictional scenario about fictional people!

Unlike the other scenarios on our website, this scenario also includes output of commercial forensic tools for student use. The idea is that there is nothing especially creative about running evidence through a tool, and a lot of students do not have access to state-of-the-art commercial tools, so we have run the tools for your students!

A teacher’s guide is available for this scenario.

You can find more information about the 2018 Lone Wolf scenario here: https://digitalcorpora.org/corpora/scenarios/2018-lone-wolf-scenario

Categories: Disk Images, Scenarios Tags:

website transition

April 29th, 2017 No comments

The website has been transitioned to Dreamhost. The downloads remain at George Mason University and can be reached at http://downloads.digitalcorpora.org/corpora/ for the corpora and http://downloads.digitalcorpora.org/downloads/ for files.

Categories: General Tags:

“non-deterministic” USB image contributed

May 27th, 2014 No comments

We are happy to announce the contribution of four disk images of a non-deterministic USB drive. Read More.

Categories: General Tags:

Announcing New File Type Sample Files

February 5th, 2014 No comments

UT San Antonio has kindly provided digitalcorpora with open source, publicly releasable samples of 32 file types. These are the samples that were used by Dr. Nicole Beebe to develop the Sceadan File Type Classifier.

Included file types are ASP, AVI, B64, B85, BZ2, CSS, DLL, ELF, EXE, EXT3, FAT, FLV, JAR, JB2, JS, M4A, MOV, MP3, MP4, NTFS, PST, RPM, RTF, Random, SWF, TXT, Tbird, URL, WAV, WMA, XLSX, ZIP. Each file type sample can be downloaded from the website:
* http://digitalcorpora.org/corp/nps/files/filetypes1/

Also included is a _README directory that includes a list of every file downloaded and a copyright statement for the files that are covered under copyright. You can access that directory at:
* http://digitalcorpora.org/corp/nps/files/filetypes1/_README/

This “FILETYPES1” corpus supplements the files in the GOVDOCS1 corpus.

Please let us know if you use these by including this citation in your paper:

“FILETYPES1 File type samples,” Beebe, Nicole, University of Texas, San Antonio, hosted at http://digitalcorpora.org/corp/nps/files/filetypes1/. 2014

Categories: Files, General Tags:

Malware Scan of Govdocs1 now available

August 15th, 2013 No comments

A malware scan of the govdocs1 corpus is now available at http://digitalcorpora.org/corp/nps/files/govdocs1/MetascanClientLog_201306281214.txt

 

Categories: General Tags:

35GB of JPEGs ready for download

March 7th, 2012 2 comments

We have created a tar and a ZIP file with 109,223 JPEG files from the govdocs1 corpus. You can download them from:

http://downloads.digitalcorpora.org/corpora/files/govdocs1/by_type/files.jpeg.tar   [37.6 GB]

Browse all by type: http://downloads.digitalcorpora.org/corpora/files/govdocs1/by_type/

Please note that the ZIP file is necessarily a ZIP-64 file and will not decompress with the ZIP implementations built into MacOS or Windows.
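
If the built-in tools refuse the archive, one alternative is Python's standard zipfile module, which reads the ZIP64 extensions. The following is a minimal sketch rather than a recommended workflow. The function name extract_zip64 and any local paths passed to it are assumptions for illustration.

```python
import zipfile


def extract_zip64(zip_path, dest_dir):
    """Extract an archive (ZIP64 or not) and report what it contained.

    Returns a tuple: (number of entries, total uncompressed bytes).
    """
    with zipfile.ZipFile(zip_path) as archive:
        infos = archive.infolist()
        total_bytes = sum(info.file_size for info in infos)
        # extractall creates dest_dir and any subdirectories as needed.
        archive.extractall(dest_dir)
    return len(infos), total_bytes
```

Make sure the destination has enough free space before running this on the full JPEG archive; the sketch extracts everything in one pass.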

Categories: Files Tags:

M57-Jean Scenario Posted

February 8th, 2011 No comments

The scenario page for M57-Jean has now been posted.

Categories: Scenarios Tags:

test disk image of emails available

February 2nd, 2011 4 comments

I have created a new disk image called 2010-nps-emails that can be used for testing programs that find email addresses or perform string search.

The disk image consists of 30 different email addresses, each one stored in a different document with a different coding scheme.

Below is a list of the email addresses and their codings:

email address                             Application (Encoding)

plain_text@textedit.com                   Apple TextEdit  (UTF-8)
plain_text_pdf@textedit.com               Apple TextEdit print-to-PDF (/FlateDecode)
rtf_text@textedit.com                     Apple TextEdit (RTF)
rtf_text_pdf@textedit.com                 Apple TextEdit print-to-PDF (/FlateDecode)
plain_utf16@textedit.com                  Apple TextEdit (UTF-16)
plain_utf16_pdf@textedit.com              Apple TextEdit print-to-PDF (/FlateDecode)

pages@iwork09.com                         Apple Pages '09
pages_comment@iwork09.com                 Apple Pages (comment) '09
keynote@iwork09.com                       Apple Keynote '09
keynote_comment@iwork09.com               Apple Keynote '09 (comment)
numbers@iwork09.com                       Apple Numbers '09
numbers_comment@iwork09.com               Apple Numbers '09 (comment)

user_doc@microsoftword.com                Microsoft Word 2008 (Mac) (.doc file)
user_doc_pdf@microsoftword.com            Microsoft Word 2008 (Mac) print-to-PDF
user_docx@microsoftword.com
user_docx_pdf@microsoftword.com           Microsoft Word 2008 (Mac) print-to-PDF (.docx file)
xls_cell@microsoft_excel.com
xls_comment@microsoft_excel.com           Microsoft Word 2008 (Mac)
xlsx_cell@microsoft_excel.com             Microsoft Word 2008 (Mac)
xlsx_comment@microsoft_excel.com          Microsoft Word 2008 (Mac) (Comment)

doc_within_doc@document.com               Microsoft Word 2007 (OLE .doc file within .doc)
docx_within_docx@document.com             Microsoft Word 2007 (OLE .docx file within .docx)
ppt_within_doc@document.com               Microsoft PowerPoint and Word 2007 (OLE .ppt file within .doc)
pptx_within_docx@document.com             Microsoft PowerPoint and Word 2007 (OLE .pptx file within .docx)
xls_within_doc@document.com               Microsoft Excel and Word 2007 (OLE .xls file within .doc)
xlsx_within_docx@document.com             Microsoft Excel and Word 2007 (OLE .xlsx file within .docx)

email_in_zip@zipfile1.com                 text file within ZIP
email_in_zip_zip@zipfile2.com             ZIP'ed text file, ZIP'ed
email_in_gzip@gzipfile.com                text file within GZIP
email_in_gzip_gzip@gzipfile.com           GZIP'ed text file, GZIP'ed

The image can be downloaded from http://downloads.digitalcorpora.org/corpora/disk-images/nps-2010-emails/
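
To illustrate why the different codings matter, here is a minimal sketch of a byte-level email scanner. It handles only three of the cases listed above: plain ASCII/UTF-8 text, UTF-16 (little-endian is assumed), and a buffer that is itself one or more layers of GZIP. It is not the tool used to build or validate the image, and the function names are hypothetical. PDF /FlateDecode streams, ZIP members, and OLE containers would each need their own decoding step.

```python
import gzip
import re
import zlib

# Underscores are allowed in the domain because the list above includes
# addresses such as xls_cell@microsoft_excel.com.
EMAIL_ASCII = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9._-]+\.[A-Za-z]{2,}")

# The same pattern for UTF-16LE, where each ASCII character is followed
# by a zero byte.
EMAIL_UTF16LE = re.compile(
    rb"(?:[A-Za-z0-9._%+-]\x00)+@\x00(?:[A-Za-z0-9._-]\x00)+\.\x00(?:[A-Za-z]\x00){2,}"
)


def find_emails(data):
    """Return the set of addresses visible in raw bytes without decoding."""
    found = {m.group().decode("ascii") for m in EMAIL_ASCII.finditer(data)}
    found |= {m.group().decode("utf-16-le") for m in EMAIL_UTF16LE.finditer(data)}
    return found


def find_emails_deep(data, max_depth=3):
    """Also look inside a buffer that is a (possibly nested) GZIP stream."""
    found = find_emails(data)
    if max_depth > 0 and data[:2] == b"\x1f\x8b":  # GZIP magic bytes
        try:
            inner = gzip.decompress(data)
        except (OSError, EOFError, zlib.error):
            return found  # looked like GZIP but did not decode
        found |= find_emails_deep(inner, max_depth - 1)
    return found
```

Note that find_emails_deep expects a buffer that begins at the start of a GZIP stream. On a raw disk image, a scanner would first have to locate candidate stream offsets, which this sketch does not attempt.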

Edit, 2011-11-26 19:32 PST: One email was incorrectly recorded above. xlsx_comment@microsoft_excel.com is within the disk image, but xlsx_cell_comment@microsoft_excel.com was recorded here. That is now corrected above.

Categories: Disk Images Tags:

First 512 and 4096 byte block hashes of govdocs1

January 4th, 2011 No comments

I have posted a text file containing MD5 hashes for the first 512 bytes and first 4096 bytes of every file in the GOVDOCS1 corpus. This file is intended for research on sector hashing. You can download the file from http://digitalcorpora.org/corp/nps/files/govdocs1/govdocs1-first512-first4096-docid.txt
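
For readers who want to compute comparable hashes for their own files, here is a minimal sketch. It assumes that a file shorter than the block size is hashed as-is, without padding; the post does not say how short files were handled. The layout of the posted text file is not reproduced here, and the function name first_block_md5s is hypothetical.

```python
import hashlib


def first_block_md5s(path, sizes=(512, 4096)):
    """Return {size: hex MD5 of the first `size` bytes of the file}.

    Files shorter than `size` are hashed without padding (an assumption).
    """
    with open(path, "rb") as f:
        head = f.read(max(sizes))  # read once, slice for each block size
    return {size: hashlib.md5(head[:size]).hexdigest() for size in sizes}
```

Calling `first_block_md5s("example.pdf")` on a local file returns a dictionary with one hash for the first 512 bytes and one for the first 4096 bytes.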

Categories: Files Tags:
"This material is based upon work supported by the National Science Foundation under Grant No. 0919593. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation."