Search Results

Keyword: ‘nps’

nps-2014-usb-nondeterministic

May 27th, 2014 No comments

The submission contains four raw (dd) image files of the USB flash disk «Transcend JF V10 / 1GB, D33193», two packet capture (pcap) files and four log files. The disk is non-partitioned and contains no file systems; it contains many non-deterministic sectors (each sector contains 512 bytes).

Namely, each sector that doesn’t belong to a written block of flash memory cells contains non-deterministic data (instead of null bytes, as many forensic examiners tend to expect). The disk does function properly though. Several tests show that writing to a sector turns its contents to deterministic state (i.e. you will read exactly what you wrote).

Days were spent to understand why there are non-deterministic blocks of data. The study showed that each non-deterministic sector represents the contents of the SCSI READ(10) command related to reading that sector. In other words, when the disk receives SCSI READ command that covers non-written sectors it simply sends the contents of the command back to the host, and these contents appear as sector data to an operating system.

In the experiment two raw images of the USB flash disk were acquired on a Linux host using dc3dd (these image files together with corresponding dc3dd log files can be found in «linux-dc3dd/»), and two other raw images were acquired on a Windows 7 host using FTK Imager (image files and log files are located in «windows7-ftkimager/»); all images have different hash values. Windows host was also running capture software to intercept all USB commands and replies, this data was written to pcap files named «usb-1» and «usb-2» (for the first and the second acquisition accordingly). There were no writes to the disk during or between acquisitions. The disk was disconnected between acquisitions on a Windows host: this was done to assign a new tag to the command blocks of all SCSI READ(10) commands going to the disk (unlike Linux, Windows uses the same tag in the command block of all SCSI READ(10) commands, the tag seems to be generated randomly when a disk is connected via USB; Linux, conversely, assigns new tag to every command block of SCSI READ(10) command); otherwise, two images would have the same hash value on a Windows host (results of hashing the disk twice without reconnecting it are shown on the screenshot located at «windows7-ftkimager/ftk-imager-screenshot.png»).

Let’s look at the sector #100005 in four images acquired (dd options: skip=100004 count=1).


«linux-dc3dd/flash-firstrun.dd» has the following data:
00000000 55 53 42 43 94 06 00 00 00 80 00 00 80 00 0a 28 |USBC...........(|
00000010 00 00 01 86 80 00 00 40 00 00 00 00 00 00 00 60 |.......@.......`|
00000020 00 60 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |.`..............|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00000200

«linux-dc3dd/flash-secondrun.dd» has the following data:
00000000 55 53 42 43 00 7f 00 00 00 80 00 00 80 00 0a 28 |USBC...........(|
00000010 00 00 01 86 80 00 00 40 00 00 00 00 00 00 00 44 |.......@.......D|
00000020 00 44 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |.D..............|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00000200

«windows7-ftkimager/flash-firstrun.001» has the following data:
00000000 55 53 42 43 d8 b1 6b 91 00 40 00 00 80 00 0a 28 |USBC..k..@.....(|
00000010 00 00 01 86 a0 00 00 20 00 00 00 00 00 00 00 5a |....... .......Z|
00000020 00 5a ff ff ff ff ff ff ff ff ff ff ff ff ff ff |.Z..............|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00000200

«windows7-ftkimager/flash-secondrun.001» has the following data:
00000000 55 53 42 43 20 6a 7f 83 00 40 00 00 80 00 0a 28 |USBC j...@.....(|
00000010 00 00 01 86 a0 00 00 20 00 00 00 00 00 00 00 79 |....... .......y|
00000020 00 79 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |.y..............|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00000200

As you can see, sectors are slightly different. Now let's dissect the data in the first hexadecimal dump (using various colors to highlight the bytes):

00000000 55 53 42 43 94 06 00 00 00 80 00 00 80 00 0a 28 |USBC...........(|
00000010 00 00 01 86 80 00 00 40 00 00 00 00 00 00 00 60 |.......@.......`|
00000020 00 60 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |.`..............|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00000200

Letters «USBC» point us to the USB command block, which starts with four-byte signature «USBC». The structure of the USB command block is presented below (taken from Linux kernel source code):


struct bulk_cb_wrap {
    __le32 Signature; /* contains 'USBC' */
    __u32 Tag; /* unique per command id */
    __le32 DataTransferLength; /* size of data */
    __u8 Flags; /* direction in bit 0 */
    __u8 Lun; /* LUN normally 0 */
    __u8 Length; /* of of the CDB */
    __u8 CDB[16]; /* max command */
};

The CDB field contains an actual command transmitted. The first byte of the CDB field is 0x28, which refers us to the SCSI READ(10) command, which operational code is 0x28 (see Table 85 in the «SCSI Commands Reference Manual» by Seagate: www.seagate.com/staticfiles/support/disc/manuals/scsi/100293068a.pdf). SCSI READ(10) command is exactly ten bytes in length (not counting previous USB command block header).


struct read_10 {
    __u8 opCode; /* operational code (28h) */
    __u8 Flags; /* various flags */
    __u32 LBA; /* logical block address (MSB first) */
    __u8 Group; /* group number */
    __u16 TransferLength; /* transfer length (MSB first) */
    __u8 Control; /* control byte */
};

The contents of SCSI READ(10) command are: logical block address is 18680 in hexadecimal, or 99968 in decimal; transfer length is 40 logical blocks in hexadecimal, or 64 in decimal. Note that we were analyzing sector #100005, which is between 99968 and 100032 (99968+64). Now let’s check what data is present in the sectors #99967 till #100032 (one-liner for bash: «for i in `seq 99967 100032`; do echo -n “$i: “; dd if=flash-firstrun.dd skip=$i count=1 2> /dev/null | md5sum; done»): sectors #99968 till #100031 have the same data as sector #99968; sector #99967 differs from them, as well as sector #100032. The conclusion is that sectors #99968 till #100031 have non-deterministic data, which represents the contents of the SCSI READ(10) command used to read that sector range.

A program was written to study all non-deterministic sectors the same way as described above, and the results are the same — every non-deterministic sector contains a corresponding SCSI READ(10) command.

The data can be downloaded from:
http://downloads.digitalcorpora.org/corpora/drives/nps-2014-usb-non-deterministic/

Related links
1. http://www.forensicfocus.com/index.php?name=Content&pid=366 (Flash drives and acquisition by Dominik Weber)
2. http://www.forensicfocus.com/index.php?name=Forums&file=viewtopic&t=4707 (FAT32 strangeness by «Fab4»)

Categories: Tags:

nps-2010-emails

February 10th, 2011 1 comment

2010-nps-emails is a test disk that can be used for testing programs that find email addresses or perform string search.

The disk image consists of 30 different email addresses, each one stored in a different document with a different coding scheme.

Below are a list of the email addresses and their codings:

email address                             Application (Encoding)

plain_text@textedit.com                   Apple TextEdit  (UTF-8)
plain_text_pdf@textedit.com               Apple TextEdit print-to-PDF (/FlateDecode)
rtf_text@textedit.com                     Apple TextEdit (RTF)
rtf_text_pdf@textedit.com                 Apple TextEdit print-to-PDF (/FlateDecode)
plain_utf16@textedit.com                  Apple TextEdit (UTF-16)
plain_utf16_pdf@textedit.com              Apple TextEdit print-to-PDF (/FlateDecode)

pages@iwork09.com                         Apple Pages '09
pages_comment@iwork09.com                 Apple Pages (comment) '09
keynote@iwork09.com                       Apple Keynote '09
keynote_comment@iwork09.com               Apple Keynote '09 (comment)
numbers@iwork09.com                       Apple Numbers '09
numbers_comment@iwork09.com               Apple Numbers '09 (comment)

user_doc@microsoftword.com                Microsoft Word 2008 (Mac) (.doc file)
user_doc_pdf@microsoftword.com            Microsoft Word 2008 (Mac) print-to-PDF
user_docx@microsoftword.com
user_docx_pdf@microsoftword.com           Microsoft Word 2008 (Mac) print-to-PDF (.docx file)
xls_cell@microsoft_excel.com
xls_comment@microsoft_excel.com           Microsoft Word 2008 (Mac)
xlsx_cell@microsoft_excel.com             Microsoft Word 2008 (Mac)
xlsx_cell_comment@microsoft_excel.com     Microsoft Word 2008 (Mac) (Comment)

doc_within_doc@document.com               Microsoft Word 2007 (OLE .doc file within .doc)
docx_within_docx@document.com             Microsoft Word 2007 (OLE .doc file within .doc)
ppt_within_doc@document.com               Microsoft PowerPoint and Word 2007 (OLE .ppt file within .doc)
pptx_within_docx@document.com             Microsoft PowerPoint and Word 2007 (OLE .pptx file within .docx)
xls_within_doc@document.com               Microsoft Excel and Word 2007 (OLE .xls file within .doc)
xlsx_within_docx@document.com             Microsoft Excel and Word 2007 (OLE .xlsx file within .docx)

email_in_zip@zipfile1.com                 text file within ZIP
email_in_zip_zip@zipfile2.com             ZIP'ed text file, ZIP'ed
email_in_gzip@gzipfile.com                text file within GZIP
email_in_gzip_gzip@gzipfile.com           GZIP'ed text file, GZIP'ed

The image can be downloaded from http://downloads.digitalcorpora.org/corpora/drives/nps-2010-emails/

Categories: Tags:

“non-deterministic” USB image contributed

May 27th, 2014 No comments

We are happy to announce the contribution of four disk images of a non-deterministic USB drive. Read More.

Categories: General Tags:

Announcing New File Type Sample Files

February 5th, 2014 No comments

UT San Antonio has kindly provided digitalcorpora with open source, publicly releasable samples of 32 file types. These are the samples that were used by Dr. Nicole Beebe to develop the Sceadan File Type Classifier.

Included file types are ASP, AVI, B64, B85, BZ2, CSS, DLL, ELF, EXE, EXT3, FAT, FLV, JAR, JB2, JS, M4A, MOV, MP3, MP4, NTFS, PST, RPM, RTF, Random, SWF, TXT, Tbird, URL, WAV, WMA, XLSX, ZIP. Each file type sample can be downloaded from the website:
* http://digitalcorpora.org/corp/nps/files/filetypes1/

Also included is a _README directory that includes a list of every file downloaded and a copyright statement for the files that are covered under copyright. You can access that directory at:
* http://digitalcorpora.org/corp/nps/files/filetypes1/_README/

This “FLETYPES1” corpus supplements the files in the GOVDOCS1 corpus.

Please let us know if you use these by including this citation in your paper:

“FILETYPES1 File type samples,” Beebe, Nicole, University of Texas, San Antonio, hosted at http://digitalcorpora.org/corp/nps/files/filetypes1/. 2014

Categories: Files, General Tags:

Malware Scan of Govdocs1 now available

August 15th, 2013 No comments

A malware scan of thegovdocs1 corpus is now available at http://digitalcorpora.org/corp/nps/files/govdocs1/MetascanClientLog_201306281214.txt

 

Categories: General Tags:

Obtaining Solutions

May 16th, 2013 No comments

Solutions

Solution packets for these scenarios are available as encrypted PDF files:

The decrypt password is provided to faculty members teaching courses in digital forensics as accredited educational institutions. To get the solution please contact us with the WordPress contact form and provide

  • your full name
  • your phone number
  • an official web page that describes your course and clearly indicates your email address.
  • How many students and at what level (undergraduate, graduate) will be using the materials.
  • Whether or not we can put you on an announcement-only mailing list regarding new teaching materials we are developing.

Thank you!

Categories: Tags:

Bulk Extractor News and Downloads

April 3rd, 2013 No comments

File bulk_extractor-1.3.1.zip contains the source code for bulk_extractor v1.3.1.  bulk_extractor is a C++ program that scans a disk image, a file, or a directory of files and extracts useful information without parsing the file system or file system structures.  bulk_extractor is typically downloaded on a Fedora system and compiled or cross-compiled to Linux, Mac, or Windows using autotools.  Please see https://github.com/simsong/bulk_extractor/wiki/Introducing-bulk_extractor.

BEViewer.jar is an executable bulk_extractor viewer user interface.
Bulk Extractor Viewer (BEViewer) provides a graphical user interface for browsing features that have been extracted via the bulk extractor feature extraction tool.  Please see https://github.com/simsong/bulk_extractor/wiki/BEViewer.

be_installer-1.3.exe is a Windows installer for installing bulk_extractor and BEViewer v1.3 on a Windows system.

bulk_extractor.pdf, “Digital media triage with bulk data analysis and bulk-extractor,” discusses how the bulk_extractor tool is effective in providing bulk data analysis.

2012-08-08 bulk_extractor Tutorial.pdf describes how to use the BEViewer tool.  Although some of the parameters for running bulk_extractor have changed, the majority of the tutorial remains current..

Source: The information above and links were received from Bruce Allen <bdallen@nps.edu>, Naval Postgraduate School

See other bulk_extractor downloads here: http://digitalcorpora.org/downloads/bulk_extractor/

Categories: General Tags:

M57-Jean

February 8th, 2011 4 comments

The M57-Jean scenario is a single disk image scenario involving the exfiltration of corporate documents from the laptop of a senior executive. The scenario involves a small start-up company, M57.Biz. A few weeks into inception a confidential spreadsheet that contains the names and salaries of the company’s key employees was found posted to the “comments” section of one of the firm’s competitors. The spreadsheet only existed on one of M57’s officers—Jean.

Jean says that she has no idea how the data left her laptop and that she must have been hacked.

You have been given a disk image of Jean’s laptop. Your job is to figure out how the data was stolen—or if Jean isn’t as innocent as she claims.

Materials:

Solutions:

The solution is distributed as an encrypted PDF file:

Please see our note onobtaining solutions.

 

Categories: Tags:

test disk image of emails available

February 2nd, 2011 4 comments

I have created a new disk image called 2010-nps-emails that can be used for testing programs that find email addresses or perform string search.

The disk image consists of 30 different email addresses, each one stored in a different document with a different coding scheme.

Below are a list of the email addresses and their codings:

email address                             Application (Encoding)

plain_text@textedit.com                   Apple TextEdit  (UTF-8)
plain_text_pdf@textedit.com               Apple TextEdit print-to-PDF (/FlateDecode)
rtf_text@textedit.com                     Apple TextEdit (RTF)
rtf_text_pdf@textedit.com                 Apple TextEdit print-to-PDF (/FlateDecode)
plain_utf16@textedit.com                  Apple TextEdit (UTF-16)
plain_utf16_pdf@textedit.com              Apple TextEdit print-to-PDF (/FlateDecode)

pages@iwork09.com                         Apple Pages '09
pages_comment@iwork09.com                 Apple Pages (comment) '09
keynote@iwork09.com                       Apple Keynote '09
keynote_comment@iwork09.com               Apple Keynote '09 (comment)
numbers@iwork09.com                       Apple Numbers '09
numbers_comment@iwork09.com               Apple Numbers '09 (comment)

user_doc@microsoftword.com                Microsoft Word 2008 (Mac) (.doc file)
user_doc_pdf@microsoftword.com            Microsoft Word 2008 (Mac) print-to-PDF
user_docx@microsoftword.com
user_docx_pdf@microsoftword.com           Microsoft Word 2008 (Mac) print-to-PDF (.docx file)
xls_cell@microsoft_excel.com
xls_comment@microsoft_excel.com           Microsoft Word 2008 (Mac)
xlsx_cell@microsoft_excel.com             Microsoft Word 2008 (Mac)
xlsx_comment@microsoft_excel.com          Microsoft Word 2008 (Mac) (Comment)

doc_within_doc@document.com               Microsoft Word 2007 (OLE .doc file within .doc)
docx_within_docx@document.com             Microsoft Word 2007 (OLE .doc file within .doc)
ppt_within_doc@document.com               Microsoft PowerPoint and Word 2007 (OLE .ppt file within .doc)
pptx_within_docx@document.com             Microsoft PowerPoint and Word 2007 (OLE .pptx file within .docx)
xls_within_doc@document.com               Microsoft Excel and Word 2007 (OLE .xls file within .doc)
xlsx_within_docx@document.com             Microsoft Excel and Word 2007 (OLE .xlsx file within .docx)

email_in_zip@zipfile1.com                 text file within ZIP
email_in_zip_zip@zipfile2.com             ZIP'ed text file, ZIP'ed
email_in_gzip@gzipfile.com                text file within GZIP
email_in_gzip_gzip@gzipfile.com           GZIP'ed text file, GZIP'ed

The image can be downloaded from http://downloads.digitalcorpora.org/corpora/disk-images/nps-2010-emails/

Edit, 2011-11-26 19:32 PST: One email was incorrectly recorded above. xlsx_comment@microsoft_excel.com is within the disk image, but xlsx_cell_comment@microsoft_excel.com was recorded here. That is now corrected above.

Categories: Disk Images Tags:

Announcing GOVDOCS1.1

December 4th, 2010 No comments

As an artifact of the way that it was collected, many of the extensions for the files in the NPS GOVDOCS1 corpus did not reflect the type of the underlying file. For example, many files that were labeled ‘.xls’ did not contain Microsoft Excel spreadsheets, but instead contained HTML error messages from US government web servers indicating that the file was no longer available. In other cases file extensions chosen when the document was created no longer match current usage, as was the case with several files that had a ‘.doc’ extension but where actually WordPerfect files.

We have gone through the corpus and created a shell script that renames the files to current usage. The script contains 115,135 lines. Of these, the following renames are implemented:

  Rank     Count     Value(s):
  ============================
      1     77227      .text -> .txt  
      2      9290      .xml -> .html  
      3      3683      .pdf -> .html  
      4      3565      . -> .html  
      5      2602      . -> .unk  
      6      2601      .xls -> .dbase3  
      7      2082      .text -> .unk  
      8      1943      . -> .pdf  
      9      1942      .text -> .html  
     10      1857      .doc -> .html  
     11      1088      .doc -> .rtf  
     12       620      .xls -> .html  
     13       595      .text -> .f  
     14       533      .text -> .xml  
     15       459      .ppt -> .html  
     16       438      .xls -> .txt  
     17       435      .doc -> .txt  
     18       346      .doc -> .wp  
     19       283      .txt -> .html  
     20       269      .eps -> .html  
     21       256      .log -> .html  
     22       253      .doc -> .unk  
     23       228      .swf -> .html  
     24       218      .xls -> .unk  
     25       179      .text -> .fits  
     26       175      .dwf -> .html  
     27       166      .gz -> .html  
     28       163      .sql -> .html  
     29       161      .text -> .tex  
     30       155      .html -> .xml  
     31       107      .html -> .pdf  
     32        96      .text -> .troff  
     33        94      .ps -> .html  
     34        70      .js -> .html  
     35        66      . -> .xml  
     36        60      .xls -> .gls  
     37        59      .ttf -> .txt  
     38        53      .text -> .sgml  
     39        45      .jpg -> .html  
     40        36      .ppt -> .txt  
     41        35      .csv -> .html  
               35      .ttf -> .html  
     43        30      .ppt -> .unk  
     44        29      .text -> .pdf  
               29      .xbm -> .txt  
     46        26      .java -> .html  
               26      .zip -> .html  
     48        25      .doc -> .fm  
     49        22      .text -> .rtf  
     50        21      .pub -> .html  
     51        20      .js -> .txt  
     52        17      .jar -> .html  
               17      .jar -> .txt  
               17      .text -> .gz  
     55        16      .ps -> .pdf  
     56        15      .ppt -> .doc  
     57        14      .text -> .swf  
               14      .tmp -> .html  
               14      .xbm -> .html  
     60        13      .doc -> .pdf  
               13      .doc -> .troff  
     62         9      .pps -> .html  
                9      .xlsx -> .html  
     64         8      .log -> .txt  
     65         7      . -> .rtf  
                7      .dll -> .html  
                7      .kml -> .html  
                7      .xls -> .wk1  (Lotus Notes)  
     69         6      .doc -> .f  
                6      .kmz -> .html  
                6      .xml -> .txt  
     72         5      . -> .txt  
                5      .doc -> .sgml  
                5      .docx -> .html  
                5      .eps -> .pdf  
                5      .exe -> .html  
                5      .html -> .rtf  
     78         4      .doc -> .ileaf  (Interleaf)  
                4      .ppt -> .zip  
                4      .pptx -> .html  
                4      .text -> .doc  
                4      .text -> .kml  
                4      .xls -> .zip  
     84         3      .bmp -> .html  
                3      .jpeg -> .html  
                3      .ppt -> .sgml  
                3      .text -> .wp  
                3      .tif -> .html  
                3      .xls -> .doc  
                3      .xls -> .xml  
     91         2      .exported -> .html  
                2      .ppt -> .appledouble (AppleDouble encoded Macintosh file  )
                2      .ppt -> .odp
                2      .ppt -> .gd
                2      .tmp -> .xml  
                2      .xls -> .123
                2      .xls -> .lnk (MS Windows shortcut  )
                2      .xls -> .pdf  
     99         1      .csv -> .rtf  
                1      .doc -> .par 
                1      .doc -> .zip
                1      .doc -> .fits  
                1      .doc -> .gz  
                1      .doc -> .icns  
                1      .doc -> .tex  
                1      .doc -> .xls  
                1      .doc -> .xml  
                1      .docx -> .pdf  
                1      .hlp -> .html  
                1      .hmtl -> .html  
                1      .html -> .gif  
                1      .html -> .kml  
                1      .kml -> .xml  
                1      .pdf -> .xml  
                1      .ppt -> .pdf  
                1      .sql -> .txt  
                1      .sys -> .rtf  
                1      .wp -> .pdf  
                1      .wp -> .rtf  
                1      .xls -> .wk3
                1      .xls -> .bin  (mc68020 pure executable  )
                1      .xls -> .f  
                1      .xls -> .sgml  
                1      .xml -> .kml  

We will be remaking the ZIP files over the next few days and will replace the ZIP files and update the searchable database by 7 December 2010.

Categories: Files Tags:
"This material is based upon work supported by the National Science Foundation under Grant No. 0919593. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation."