nps-2014-usb-nondeterministic

May 27th, 2014 No comments

The submission contains four raw (dd) image files of the USB flash disk «Transcend JF V10 / 1GB, D33193», two packet capture (pcap) files and four log files. The disk is non-partitioned and contains no file systems; it contains many non-deterministic sectors (each sector contains 512 bytes).

Namely, each sector that doesn’t belong to a written block of flash memory cells contains non-deterministic data (instead of null bytes, as many forensic examiners tend to expect). The disk functions properly, though. Several tests show that writing to a sector makes its contents deterministic (i.e. you will read back exactly what you wrote).

It took several days to understand why the disk returns non-deterministic blocks of data. The study showed that each non-deterministic sector holds the contents of the SCSI READ(10) command that was issued to read that sector. In other words, when the disk receives a SCSI READ command that covers non-written sectors, it simply sends the contents of the command back to the host, and the operating system sees these contents as sector data.

In the experiment, two raw images of the USB flash disk were acquired on a Linux host using dc3dd (the image files and the corresponding dc3dd log files are in «linux-dc3dd/»), and two more raw images were acquired on a Windows 7 host using FTK Imager (image files and log files are in «windows7-ftkimager/»). All four images have different hash values. The Windows host was also running capture software to intercept all USB commands and replies; this data was written to pcap files named «usb-1» and «usb-2» (for the first and the second acquisition respectively). There were no writes to the disk during or between acquisitions. On the Windows host, the disk was disconnected between acquisitions so that the command blocks of all SCSI READ(10) commands going to the disk would get a new tag. Windows uses the same tag in the command block of every SCSI READ(10) command, and the tag seems to be generated randomly when a disk is connected via USB; Linux, conversely, assigns a new tag to the command block of every SCSI READ(10) command. Without the reconnection, the two images would have had the same hash value on the Windows host (the results of hashing the disk twice without reconnecting it are shown in the screenshot located at «windows7-ftkimager/ftk-imager-screenshot.png»).
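To see which sectors differ between two acquisitions, rather than only noting that the hashes differ, the images can be compared sector by sector. The Python sketch below is not part of the submission: the function names are made up for illustration, SHA-256 is an arbitrary choice (the hash recorded in the log files may be a different one), and the demo uses tiny in-memory stand-ins instead of the real image files.

```python
import hashlib
import io

SECTOR_SIZE = 512  # the post states that each sector contains 512 bytes


def image_digest(stream):
    """Hash a whole image in 1 MiB chunks (SHA-256 is an arbitrary choice here)."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(1 << 20), b""):
        digest.update(chunk)
    return digest.hexdigest()


def differing_sectors(first, second):
    """Yield zero-based indexes of sectors whose contents differ."""
    index = 0
    while True:
        a = first.read(SECTOR_SIZE)
        b = second.read(SECTOR_SIZE)
        if not a and not b:
            break
        if a != b:
            yield index
        index += 1


# Tiny in-memory stand-ins for two acquisitions; only sector 1 differs.
run1 = bytes(SECTOR_SIZE) + b"\x01" * SECTOR_SIZE + bytes(SECTOR_SIZE)
run2 = bytes(SECTOR_SIZE) + b"\x02" * SECTOR_SIZE + bytes(SECTOR_SIZE)

changed = list(differing_sectors(io.BytesIO(run1), io.BytesIO(run2)))
same_hash = image_digest(io.BytesIO(run1)) == image_digest(io.BytesIO(run2))
print(changed, same_hash)
```

With the real data, the two streams would be the image files opened in binary mode, for example the two files under «linux-dc3dd/».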

Let’s look at sector #100005 in the four acquired images (dd options: skip=100004 count=1).


«linux-dc3dd/flash-firstrun.dd» has the following data:
00000000 55 53 42 43 94 06 00 00 00 80 00 00 80 00 0a 28 |USBC...........(|
00000010 00 00 01 86 80 00 00 40 00 00 00 00 00 00 00 60 |.......@.......`|
00000020 00 60 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |.`..............|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00000200

«linux-dc3dd/flash-secondrun.dd» has the following data:
00000000 55 53 42 43 00 7f 00 00 00 80 00 00 80 00 0a 28 |USBC...........(|
00000010 00 00 01 86 80 00 00 40 00 00 00 00 00 00 00 44 |.......@.......D|
00000020 00 44 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |.D..............|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00000200

«windows7-ftkimager/flash-firstrun.001» has the following data:
00000000 55 53 42 43 d8 b1 6b 91 00 40 00 00 80 00 0a 28 |USBC..k..@.....(|
00000010 00 00 01 86 a0 00 00 20 00 00 00 00 00 00 00 5a |....... .......Z|
00000020 00 5a ff ff ff ff ff ff ff ff ff ff ff ff ff ff |.Z..............|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00000200

«windows7-ftkimager/flash-secondrun.001» has the following data:
00000000 55 53 42 43 20 6a 7f 83 00 40 00 00 80 00 0a 28 |USBC j...@.....(|
00000010 00 00 01 86 a0 00 00 20 00 00 00 00 00 00 00 79 |....... .......y|
00000020 00 79 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |.y..............|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00000200

As you can see, the sectors differ slightly. Now let's dissect the data in the first hexadecimal dump:

00000000 55 53 42 43 94 06 00 00 00 80 00 00 80 00 0a 28 |USBC...........(|
00000010 00 00 01 86 80 00 00 40 00 00 00 00 00 00 00 60 |.......@.......`|
00000020 00 60 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |.`..............|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00000200

The letters «USBC» point to a USB command block, which starts with the four-byte signature «USBC». The structure of the USB command block is shown below (taken from the Linux kernel source code):


struct bulk_cb_wrap {
    __le32 Signature; /* contains 'USBC' */
    __u32 Tag; /* unique per command id */
    __le32 DataTransferLength; /* size of data */
    __u8 Flags; /* direction in bit 0 */
    __u8 Lun; /* LUN normally 0 */
    __u8 Length; /* length of the CDB */
    __u8 CDB[16]; /* max command */
};
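As an illustration of how this layout maps onto the dump, the following Python sketch unpacks the first 31 bytes of the sector shown above. It is a sketch under assumptions, not the tool used in the study: the names `CBW` and `parse_cbw` are invented here, and the tag is decoded as a little-endian integer purely for display (the device treats it as an opaque value).

```python
import struct

# First 32 bytes of the sector from «linux-dc3dd/flash-firstrun.dd»,
# copied from the hexadecimal dump above.
SECTOR_HEAD = bytes.fromhex(
    "55534243 94060000 00800000 80000a28"
    "00000186 80000040 00000000 00000060"
)

# struct bulk_cb_wrap on the wire: little-endian, 31 bytes, no padding.
CBW = struct.Struct("<4sIIBBB16s")


def parse_cbw(data):
    """Unpack a command block wrapper from the start of `data`."""
    signature, tag, transfer_len, flags, lun, cdb_len, cdb = CBW.unpack_from(data)
    if signature != b"USBC":
        raise ValueError("no USBC signature at offset 0")
    return {
        "tag": tag,
        "data_transfer_length": transfer_len,
        "flags": flags,
        "lun": lun,
        "cdb": cdb[:cdb_len],  # only the first `Length` bytes are meaningful
    }


cbw = parse_cbw(SECTOR_HEAD)
print(hex(cbw["tag"]), cbw["data_transfer_length"], cbw["cdb"].hex())
```

For this sector the data transfer length decodes to 32768 bytes, i.e. 64 sectors of 512 bytes, and the CDB length to 10 bytes.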

The CDB field contains the actual command transmitted. The first byte of the CDB field is 0x28, the operation code of the SCSI READ(10) command (see Table 85 in the «SCSI Commands Reference Manual» by Seagate: www.seagate.com/staticfiles/support/disc/manuals/scsi/100293068a.pdf). A SCSI READ(10) command is exactly ten bytes long (not counting the preceding USB command block header).


struct read_10 {
    __u8 opCode; /* operational code (28h) */
    __u8 Flags; /* various flags */
    __u32 LBA; /* logical block address (MSB first) */
    __u8 Group; /* group number */
    __u16 TransferLength; /* transfer length (MSB first) */
    __u8 Control; /* control byte */
};

The contents of the SCSI READ(10) command are as follows: the logical block address is 18680 in hexadecimal, or 99968 in decimal; the transfer length is 40 logical blocks in hexadecimal, or 64 in decimal. Note that we were analyzing sector #100005 (read with skip=100004), which lies between 99968 and 100032 (99968+64). Now let’s check what data is present in sectors #99967 through #100032 (one-liner for bash: «for i in `seq 99967 100032`; do echo -n "$i: "; dd if=flash-firstrun.dd skip=$i count=1 2> /dev/null | md5sum; done»): sectors #99968 through #100031 have the same data as sector #99968, while sectors #99967 and #100032 differ from them. The conclusion is that sectors #99968 through #100031 contain non-deterministic data, which represents the contents of the SCSI READ(10) command used to read that sector range.
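The arithmetic above can be checked with a short Python sketch that decodes the ten CDB bytes taken from the dumps. The names `READ10` and `parse_read10` are hypothetical, and the index 100004 is the zero-based dd offset (skip=100004) used earlier.

```python
import struct

# struct read_10 on the wire: big-endian ("MSB first"), 10 bytes, no padding.
READ10 = struct.Struct(">BBIBHB")


def parse_read10(cdb):
    """Return (lba, transfer_length) of a READ(10) command descriptor block."""
    opcode, _flags, lba, _group, count, _control = READ10.unpack_from(cdb)
    if opcode != 0x28:
        raise ValueError("not a READ(10) command")
    return lba, count


# CDB bytes from the Linux and the Windows dumps of the same sector.
linux_cdb = bytes.fromhex("28 00 00018680 00 0040 00")
windows_cdb = bytes.fromhex("28 00 000186a0 00 0020 00")

lba, count = parse_read10(linux_cdb)
covers = lba <= 100004 < lba + count  # zero-based index used with dd skip=
print(lba, count, covers)
print(parse_read10(windows_cdb))
```

The Windows host read the same sector with a different request (a different starting address and a shorter transfer), which is one more reason the echoed bytes differ between the Linux and Windows images.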

A program was written to study all non-deterministic sectors the same way as described above, and the results are the same — every non-deterministic sector contains a corresponding SCSI READ(10) command.
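The program itself is not included in this post. A minimal Python sketch of such a check, written only from the description above, might look as follows; every name in it is an assumption, and it is demonstrated on a small synthetic image rather than on the real files.

```python
import io
import struct

SECTOR_SIZE = 512
CBW = struct.Struct("<4sIIBBB16s")   # USB command block wrapper, 31 bytes
READ10 = struct.Struct(">BBIBHB")    # SCSI READ(10) CDB, 10 bytes


def echoed_read10(sector, index):
    """Return (lba, count) if `sector` starts with a READ(10) command block
    whose range covers the zero-based sector `index`, otherwise None."""
    signature, _tag, _len, _flags, _lun, cdb_len, cdb = CBW.unpack_from(sector)
    if signature != b"USBC" or cdb_len != 10 or cdb[0] != 0x28:
        return None
    _op, _fl, lba, _grp, count, _ctl = READ10.unpack_from(cdb)
    return (lba, count) if lba <= index < lba + count else None


def scan_image(stream):
    """Yield (index, lba, count) for every sector that echoes its own READ(10)."""
    index = 0
    while True:
        sector = stream.read(SECTOR_SIZE)
        if len(sector) < SECTOR_SIZE:
            break
        hit = echoed_read10(sector, index)
        if hit is not None:
            yield (index, *hit)
        index += 1


def fake_echo(lba, count):
    """Build a synthetic sector that mimics the dumps shown earlier."""
    cdb = READ10.pack(0x28, 0, lba, 0, count, 0)
    head = CBW.pack(b"USBC", 1, count * SECTOR_SIZE, 0x80, 0, 10, cdb)
    return head + b"\xff" * (SECTOR_SIZE - len(head))


# Sector 0 is written (zeros), sectors 1-2 echo a READ(10) for LBA 1, length 2,
# and sector 3 carries the same command although the range does not cover it.
image = bytes(SECTOR_SIZE) + fake_echo(1, 2) * 2 + fake_echo(1, 2)
hits = list(scan_image(io.BytesIO(image)))
print(hits)
```

Requiring the decoded range to cover the sector's own index keeps ordinary data that merely starts with «USBC» from being counted.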

The data can be downloaded from:

http://digitalcorpora.org/corp/nps/drives/nps-2014-usb-non-deterministic/

Related links
1. http://www.forensicfocus.com/index.php?name=Content&pid=366 (Flash drives and acquisition by Dominik Weber)
2. http://www.forensicfocus.com/index.php?name=Forums&file=viewtopic&t=4707 (FAT32 strangeness by «Fab4»)


nps-2010-emails

February 10th, 2011 1 comment

nps-2010-emails is a test disk image that can be used for testing programs that find email addresses or perform string searches.

The disk image contains 30 different email addresses, each one stored in a different document with a different encoding scheme.

Below is a list of the email addresses and their encodings:

email address                             Application (Encoding)

plain_text@textedit.com                   Apple TextEdit  (UTF-8)
plain_text_pdf@textedit.com               Apple TextEdit print-to-PDF (/FlateDecode)
rtf_text@textedit.com                     Apple TextEdit (RTF)
rtf_text_pdf@textedit.com                 Apple TextEdit print-to-PDF (/FlateDecode)
plain_utf16@textedit.com                  Apple TextEdit (UTF-16)
plain_utf16_pdf@textedit.com              Apple TextEdit print-to-PDF (/FlateDecode)

pages@iwork09.com                         Apple Pages '09
pages_comment@iwork09.com                 Apple Pages '09 (comment)
keynote@iwork09.com                       Apple Keynote '09
keynote_comment@iwork09.com               Apple Keynote '09 (comment)
numbers@iwork09.com                       Apple Numbers '09
numbers_comment@iwork09.com               Apple Numbers '09 (comment)

user_doc@microsoftword.com                Microsoft Word 2008 (Mac) (.doc file)
user_doc_pdf@microsoftword.com            Microsoft Word 2008 (Mac) print-to-PDF
user_docx@microsoftword.com
user_docx_pdf@microsoftword.com           Microsoft Word 2008 (Mac) print-to-PDF (.docx file)
xls_cell@microsoft_excel.com
xls_comment@microsoft_excel.com           Microsoft Word 2008 (Mac)
xlsx_cell@microsoft_excel.com             Microsoft Word 2008 (Mac)
xlsx_cell_comment@microsoft_excel.com     Microsoft Word 2008 (Mac) (Comment)

doc_within_doc@document.com               Microsoft Word 2007 (OLE .doc file within .doc)
docx_within_docx@document.com             Microsoft Word 2007 (OLE .docx file within .docx)
ppt_within_doc@document.com               Microsoft PowerPoint and Word 2007 (OLE .ppt file within .doc)
pptx_within_docx@document.com             Microsoft PowerPoint and Word 2007 (OLE .pptx file within .docx)
xls_within_doc@document.com               Microsoft Excel and Word 2007 (OLE .xls file within .doc)
xlsx_within_docx@document.com             Microsoft Excel and Word 2007 (OLE .xlsx file within .docx)

email_in_zip@zipfile1.com                 text file within ZIP
email_in_zip_zip@zipfile2.com             ZIP'ed text file, ZIP'ed
email_in_gzip@gzipfile.com                text file within GZIP
email_in_gzip_gzip@gzipfile.com           GZIP'ed text file, GZIP'ed

The image can be downloaded from http://digitalcorpora.org/corp/drives/nps/nps-2010-emails/
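To illustrate why the encodings matter, here is a minimal Python sketch of a search that handles three of the cases in the table: plain UTF-8 text, UTF-16 text, and a text file that has been GZIP'ed twice. It is only a sketch under assumptions: the regular expressions are simplified, the function names are invented, and the samples are built in memory rather than read from the disk image. It also only unwraps GZIP data that starts at the beginning of a buffer.

```python
import gzip
import re
import zlib

# Simplified address patterns: raw ASCII/UTF-8 bytes, and UTF-16LE where
# every ASCII character is followed by a zero byte.
EMAIL_ASCII = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9._-]+\.[A-Za-z]{2,}")
EMAIL_UTF16LE = re.compile(
    rb"(?:[A-Za-z0-9._%+-]\x00)+@\x00(?:[A-Za-z0-9._-]\x00)+\.\x00(?:[A-Za-z]\x00){2,}"
)


def find_emails(data):
    """Search one buffer for addresses stored as ASCII/UTF-8 or UTF-16LE."""
    found = {m.group().decode("ascii") for m in EMAIL_ASCII.finditer(data)}
    found |= {m.group().decode("utf-16-le") for m in EMAIL_UTF16LE.finditer(data)}
    return found


def find_emails_deep(data, depth=2):
    """Like find_emails, but also unwraps up to `depth` layers of GZIP."""
    found = find_emails(data)
    if depth > 0 and data[:2] == b"\x1f\x8b":  # GZIP magic bytes
        try:
            found |= find_emails_deep(gzip.decompress(data), depth - 1)
        except (OSError, EOFError, zlib.error):
            pass  # looked like GZIP but could not be decompressed
    return found


utf8_doc = b"Contact: plain_text@textedit.com\n"
utf16_doc = "Contact: plain_utf16@textedit.com\n".encode("utf-16-le")
gz_gz_doc = gzip.compress(gzip.compress(b"Contact: email_in_gzip_gzip@gzipfile.com\n"))

for sample in (utf8_doc, utf16_doc, gz_gz_doc):
    print(sorted(find_emails_deep(sample)))
```

The other rows of the table (PDF streams, OLE containers, ZIP archives, iWork documents) would each need their own decoder in the same spirit.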


Collection of Hash Values of Forensically Uninteresting Files Available

July 30th, 2014 No comments

This file lists SHA-1 hash values of files that are uninteresting for
forensic investigations on a variety of criteria, including frequency
on drives of both hash value and path, time of creation within both
the minute and the week, file size, directory context both in path and
in sibling files, and file extension. Hash values are listed as 40
hexadecimal characters. This data is derived from the Real Drive
Corpus collected by the DEEP Project at the U.S. Naval Postgraduate
School, plus data from drives in classrooms and laboratories at NPS
and some other sources. Hash values in the January 2014
version of NSRL (the National Software Reference Library, nist.gov)
have been excluded.
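As a hedged illustration of how such a list might be used, the Python sketch
below loads SHA-1 values into a set and tests file contents against it. It
assumes one 40-character hexadecimal value per line, which is a guess about
the file layout; the function names are hypothetical, and the demo builds a
two-line list in memory instead of reading the real file.

```python
import hashlib

HEX_DIGITS = set("0123456789abcdef")


def load_hash_set(lines):
    """Collect 40-hex-character SHA-1 values, one per line, skipping other lines."""
    hashes = set()
    for line in lines:
        value = line.strip().lower()
        if len(value) == 40 and set(value) <= HEX_DIGITS:
            hashes.add(value)
    return hashes


def is_uninteresting(data, hashes):
    """True if the SHA-1 of `data` appears in the loaded hash set."""
    return hashlib.sha1(data).hexdigest() in hashes


# Stand-in for the downloaded list: one known value plus a line to be skipped.
listing = [hashlib.sha1(b"common system file").hexdigest().upper() + "\n",
           "# not a hash value\n"]
known = load_hash_set(listing)

print(is_uninteresting(b"common system file", known))
print(is_uninteresting(b"user document", known))
```

Since this set excludes the NSRL values, an examiner would normally check a
file against both sets.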

The criteria for selecting these hash values and the methods used to
obtain them are described in
http://faculty.nps.edu/ncrowe/uninteresting.htm but have now been
applied to significantly more files than the corpus used for the
paper. Our methods focus on cross-correlation of files in a large
corpus and are thus quite different from those used in collecting the
NSRL data. The hash values were obtained from images of 245 million
files on 3905 drives. Our set currently has 16 million hash values not
in NSRL, and NSRL currently has 36 million hash values, so this is a
significant supplement to NSRL.

This data was produced in July 2014 by Neil Rowe, ncrowe@nps.edu.
Please acknowledge us in publications if you use this data.

Data:
http://corp.digitalcorpora.org/corp/nps/hashes/nus-deidentified/uninteresting_sha1s_non_nsrl.txt.zip

Categories: Files, General, NIST Tags:

National Gallery DC 2012 Attack

June 10th, 2014 1 comment

We have a dataset for a hypothetical attack on the National Gallery DC, planned by Krasnovia terrorists, that almost took place in 2012.

Currently the dataset is not annotated.

http://digitalcorpora.org/corp/nps/scenarios/2012-ngdc/


“non-deterministic” USB image contributed

May 27th, 2014 No comments

We are happy to announce the contribution of four disk images of a non-deterministic USB drive.

Categories: General Tags:

Announcing New File Type Sample Files

February 5th, 2014 No comments

UT San Antonio has kindly provided digitalcorpora with open source, publicly releasable samples of 32 file types. These are the samples that were used by Dr. Nicole Beebe to develop the Sceadan File Type Classifier.

Included file types are ASP, AVI, B64, B85, BZ2, CSS, DLL, ELF, EXE, EXT3, FAT, FLV, JAR, JB2, JS, M4A, MOV, MP3, MP4, NTFS, PST, RPM, RTF, Random, SWF, TXT, Tbird, URL, WAV, WMA, XLSX, ZIP. Each file type sample can be downloaded from the website:
* http://digitalcorpora.org/corp/nps/files/filetypes1/

Also included is a _README directory that includes a list of every file downloaded and a copyright statement for the files that are covered under copyright. You can access that directory at:
* http://digitalcorpora.org/corp/nps/files/filetypes1/_README/

This “FILETYPES1” corpus supplements the files in the GOVDOCS1 corpus.

Please let us know if you use these by including this citation in your paper:

“FILETYPES1 File type samples,” Beebe, Nicole, University of Texas, San Antonio, hosted at http://digitalcorpora.org/corp/nps/files/filetypes1/. 2014

Categories: Files, General Tags:

Malware Scan of Govdocs1 now available

August 15th, 2013 No comments
Categories: General Tags:

Obtaining Solutions

May 16th, 2013 No comments

Solutions

Solution packets for these scenarios are available as encrypted PDF files:

The decryption password is provided to faculty members teaching courses in digital forensics at accredited educational institutions. To get the solutions, please contact us using the WordPress contact form and provide:

  • your full name
  • your phone number
  • an official web page that describes your course and clearly indicates your email address
  • how many students will be using the materials, and at what level (undergraduate, graduate)
  • whether or not we can put you on an announcement-only mailing list regarding new teaching materials we are developing

Thank you!


Bulk Extractor News and Downloads

April 3rd, 2013 No comments

File bulk_extractor-1.3.1.zip contains the source code for bulk_extractor v1.3.1.  bulk_extractor is a C++ program that scans a disk image, a file, or a directory of files and extracts useful information without parsing the file system or file system structures.  bulk_extractor is typically downloaded on a Fedora system and compiled or cross-compiled to Linux, Mac, or Windows using autotools.  Please see https://github.com/simsong/bulk_extractor/wiki/Introducing-bulk_extractor.

BEViewer.jar is an executable bulk_extractor viewer user interface.
Bulk Extractor Viewer (BEViewer) provides a graphical user interface for browsing features that have been extracted via the bulk extractor feature extraction tool.  Please see https://github.com/simsong/bulk_extractor/wiki/BEViewer.

be_installer-1.3.exe is a Windows installer for installing bulk_extractor and BEViewer v1.3 on a Windows system.

bulk_extractor.pdf, “Digital media triage with bulk data analysis and bulk-extractor,” discusses how the bulk_extractor tool is effective in providing bulk data analysis.

2012-08-08 bulk_extractor Tutorial.pdf describes how to use the BEViewer tool.  Although some of the parameters for running bulk_extractor have changed, the majority of the tutorial remains current.

Source: The information above and links were received from Bruce Allen <bdallen@nps.edu>, Naval Postgraduate School

See other bulk_extractor downloads here: http://digitalcorpora.org/downloads/bulk_extractor/

Categories: General Tags:

35GB of JPEGs ready for download

March 7th, 2012 2 comments

We have created a tar file and a ZIP file, each containing 109,223 JPEG files from the govdocs1 corpus. You can download them from:

http://digitalcorpora.org/corp/nps/files/govdocs1/files.jpeg.tar   [37.6 GB]

http://digitalcorpora.org/corp/nps/files/govdocs1/files.jpeg.zip   [36.8 GB]

Please note that the ZIP file is necessarily a ZIP-64 file and will not decompress with the ZIP implementations built into MacOS or Windows.
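One alternative is Python's standard zipfile module, which reads ZIP-64 archives. The sketch below is only an illustration under assumptions: the function name is invented, the member naming inside the real archive is not known here, and the demo builds a tiny ZIP-64 archive in memory instead of touching the 36.8 GB download.

```python
import io
import zipfile


def jpeg_members(archive):
    """List members whose names end in .jpg or .jpeg, without extracting them."""
    with zipfile.ZipFile(archive) as zf:
        return [info.filename for info in zf.infolist()
                if info.filename.lower().endswith((".jpg", ".jpeg"))]


# Build a tiny archive whose first member is written with ZIP-64 fields.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", allowZip64=True) as zf:
    with zf.open("000001.jpg", "w", force_zip64=True) as member:
        member.write(b"\xff\xd8\xff\xd9")  # JPEG start and end markers only
    zf.writestr("notes.txt", "not a JPEG")

buffer.seek(0)
names = jpeg_members(buffer)
print(names)
```

For the real archive, `jpeg_members` would be given the path of the downloaded file, and individual members could then be read with `ZipFile.open` rather than unpacking everything at once.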

Categories: Files Tags:
"This material is based upon work supported by the National Science Foundation under Grant No. 0919593. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation."