Difference between pages "File Format Identification" and "Windows Prefetch File Format"

From ForensicsWiki
 
(Section A - Metrics)
 
Line 1: Line 1:
File Format Identification is the process of figuring out the format of a sequence of bytes. Operating systems typically do this by file extension or by embedded MIME information. Forensic applications need to identify file types by content.
+
{{expand}}
  
 +
A Windows Prefetch file consists of one file header and multiple file sections with different content. Not all content has an obvious forensic value.
  
=Tools=
+
As far as it has been possible to ascertain, there is no public description of the format. The description below has been synthesised from examination
==libmagic==
+
of multiple prefetch files.
* Written in C.
+
* Rules in /usr/share/file/magic and compiled at runtime.
+
* Powers the Unix “file” command, but you can also call the library directly from a C program.
+
* http://sourceforge.net/projects/libmagic
+
  
==Digital Preservation Efforts==
+
== Characteristics ==
PRONOM is  a project of the National Archives of the United Kingdom to develop a registry of file types. A similar project was started by JSTOR and Harvard as the JSTOR/Harvard Object Validation Environment. Attempts are now underway to merge these two efforts in the Global Digital Format Registry and the Universal Digital Format Registry.
+
{| class="wikitable"
 +
|-
 +
| <b>Integers</b>
 +
| stored in little-endian
 +
|-
 +
| <b>Strings</b>
 +
| Stored as [http://en.wikipedia.org/wiki/UTF-16/UCS-2 UTF-16 little-endian] without a byte-order-mark (BOM).
 +
|-
 +
| <b>Timestamps</b>
 +
| Stored as [http://msdn2.microsoft.com/en-us/library/ms724284.aspx Windows FILETIME] in UTC.
 +
|-
 +
|}
  
The UK National Archives developed the Digital Record Object Identification (DROID) tool, an "automatic file format identification tool." This tool is written in Java and can be downloaded from SourceForge.
+
== File header ==
 +
The file header is 84 bytes of size and consists of:
 +
{| class="wikitable"
 +
|-
 +
! Field
 +
! Offset
 +
! Length
 +
! Type
 +
! Notes
 +
|-
 +
| H1
 +
| 0x0000
 +
| 4
 +
| DWORD
 +
| Format version (see format version section below)
 +
|-
 +
| H2
 +
| 0x0004
 +
| 4
 +
| DWORD
 +
| Signature 'SCCA' (or in hexadecimal representation 0x53 0x43 0x43 0x41)
 +
|-
 +
| H3
 +
| 0x0008
 +
| 4
 +
| DWORD?
 +
| Unknown - Values observed: 0x0F - Windows XP, 0x11 - Windows 7, Windows 8.1
 +
|-
 +
| H4
 +
| 0x000C
 +
| 4
 +
| DWORD
 +
| Prefetch file size (or length) (sometimes referred to as End of File (EOF)).
 +
|-
 +
| H5
 +
|0x0010
 +
| 60
 +
| USTR
 +
| The name of the (original) executable as a Unicode (UTF-16 little-endian) string, up to 29 characters and terminated by an end-of-string character (U+0000). This name should correspond with the one in the prefetch file filename.
 +
|-
 +
| H6
 +
|0x004C
 +
|4
 +
|DWORD
 +
|The prefetch hash. This hash value should correspond with the one in the prefetch file filename.
 +
|-
 +
| H7
 +
|0x0050
 +
|4
 +
|?
 +
| Unknown (flags)? Values observed: 0 for almost all prefetch files (XP); 1 for NTOSBOOT-B00DFAAD.pf (XP)
 +
|-
 +
|}
  
See:
+
It's worth noting that the name of a carved prefetch file can be restored using the information in field H5 and H6, and its size can be determined by field H4.
* [http://www.nationalarchives.gov.uk/PRONOM/Default.aspx  PRONOM]
+
* [http://hul.harvard.edu/jhove/ JHOVE]
+
* [https://wiki.ucop.edu/display/JHOVE2Info/Home JHOVE2]
+
* [http://www.gdfr.info/  GDFR]
+
* [http://www.udfr.org/  UDFR]
+
* [http://droid.sourceforge.net DROID download]
+
  
==TrID==
+
=== Format version ===
* XML config file
+
* Closed source; free for non-commercial use
+
* http://mark0.net/soft-trid-e.html
+
  
==Forensic Innovations File Investigator TOOLS==
+
{| class="wikitable"
* Proprietary, but free trial available.
+
|-
* Available as consumer applications and OEM API.
+
! Value
* Identifies 3,000+ file types, using multiple methods to maintain high accuracy.
+
! Windows version
* Extracts metadata for many of the supported file types.
+
|-
* http://www.forensicinnovations.com/fitools.html
+
| 17 (0x11)
 +
| Windows XP, Windows 2003
 +
|-
 +
| 23 (0x17)
 +
| Windows Vista, Windows 7
 +
|-
 +
| 26 (0x1a)
 +
| Windows 8.1 (note this could be Windows 8 as well but has not been confirmed)
 +
|-
 +
|}
  
==Stellent/Oracle Outside-In==
+
=== File information ===
* Proprietary but free demo.
+
The format of the file information is version dependent.
* http://www.oracle.com/technology/products/content-management/oit/oit_all.html
+
  
==[[Forensic Assistant]]==
+
Note that some other format specifications consider the file information part of the file header.  
* Proprietary.
+
* Provides detection of password protected archives, some files of cryptographic programs, Pinch/Zeus binary reports, etc.
+
* http://nhtcu.ru/0xFA_eng.html
+
[[Category:Tools]]
+
  
=Data Sets=
+
==== File information - version 17 ====
If you are working in the field of file format identification, please consider reporting the results of your algorithm with one of these publicly available data sets:
+
The file information – version 17 is 68 bytes of size and consists of:
* NPS govdocs1m - a corpus of 1 million files that can be redistributed without concern of copyright or PII. Download from http://domex.nps.edu/corp/files/govdocs1/
+
{| class="wikitable"
* The NPS Disk Corpus - a corpus of realistic disk images that contain no PII. Information is at: http://digitalcorpora.org/?s=nps
+
|-
 +
! Field
 +
! Offset
 +
! Length
 +
! Type
 +
! Notes
 +
|-
 +
|
 +
| 0x0054
 +
| 4
 +
| DWORD
 +
| The offset to section A. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0058
 +
| 4
 +
| DWORD
 +
| The number of entries in section A.
 +
|-
 +
|
 +
| 0x005C
 +
| 4
 +
| DWORD
 +
| The offset to section B. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0060
 +
| 4
 +
| DWORD
 +
| The number of entries in section B.
 +
|-
 +
|
 +
| 0x0064
 +
| 4
 +
| DWORD
 +
| The offset to section C. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0068
 +
| 4
 +
| DWORD
 +
| Length of section C.
 +
|-
 +
|
 +
| 0x006C
 +
| 4
 +
| DWORD
 +
| Offset to section D. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0070
 +
| 4
 +
| DWORD
 +
| The number of entries in section D.
 +
|-
 +
|
 +
| 0x0074
 +
| 4
 +
| DWORD
 +
| Length of section D.
 +
|-
 +
|
 +
| 0x0078
 +
| 8
 +
| FILETIME
 +
| Latest execution time (or run time) of executable (FILETIME)
 +
|-
 +
|
 +
| 0x0080
 +
| 16
 +
| ?
 +
| Unknown ? Possibly structured as 4 DWORD. Observed values: /0x00000000 0x00000000 0x00000000 0x00000000/, /0x47868c00 0x00000000 0x47860c00 0x00000000/ (don't exclude the possibility here that this is remnant data)
 +
|-
 +
|
 +
| 0x0090
 +
| 4
 +
| DWORD
 +
| Execution counter (or run count)
 +
|-
 +
|
 +
| 0x0094
 +
| 4
 +
| DWORD?
 +
| Unknown ? Observed values: 1, 2, 3, 4, 5, 6 (XP)
 +
|-
 +
|}
  
=Bibliography=
+
==== File information - version 23 ====
Current research papers on the file format identification problem. Most of these papers concern themselves with identifying file format of a few file sectors, rather than an entire file. '''Please note that this bibliography is in chronological order!'''
+
The file information – version 23 is 156 bytes of size and consists of:
 +
{| class="wikitable"
 +
|-
 +
! Field
 +
! Offset
 +
! Length
 +
! Type
 +
! Notes
 +
|-
 +
|
 +
| 0x0054
 +
| 4
 +
| DWORD
 +
| The offset to section A. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0058
 +
| 4
 +
| DWORD
 +
| The number of entries in section A.
 +
|-
 +
|
 +
| 0x005C
 +
| 4
 +
| DWORD
 +
| The offset to section B. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0060
 +
| 4
 +
| DWORD
 +
| The number of entries in section B.
 +
|-
 +
|
 +
| 0x0064
 +
| 4
 +
| DWORD
 +
| The offset to section C. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0068
 +
| 4
 +
| DWORD
 +
| Length of section C.
 +
|-
 +
|
 +
| 0x006C
 +
| 4
 +
| DWORD
 +
| Offset to section D. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0070
 +
| 4
 +
| DWORD
 +
| The number of entries in section D.
 +
|-
 +
|
 +
| 0x0074
 +
| 4
 +
| DWORD
 +
| Length of section D.
 +
|-
 +
|
 +
| <b>0x0078</b>
 +
| <b>8</b>
 +
| <b>?</b>
 +
| <b>Unknown</b>
 +
|-
 +
|
 +
| 0x0080
 +
| 8
 +
| FILETIME
 +
| Latest execution time (or run time) of executable (FILETIME)
 +
|-
 +
|
 +
| 0x0088
 +
| 16
 +
| ?
 +
| Unknown
 +
|-
 +
|
 +
| 0x0098
 +
| 4
 +
| DWORD
 +
| Execution counter (or run count)
 +
|-
 +
|
 +
| 0x009C
 +
| 4
 +
| DWORD?
 +
| Unknown
 +
|-
 +
|
 +
| <b>0x00A0</b>
 +
| <b>80</b>
 +
| <b>?</b>
 +
| <b>Unknown</b>
 +
|-
 +
|}
  
 +
==== File information - version 26 ====
 +
The file information – version 26 is 224 bytes in size and consists of:
 +
{| class="wikitable"
 +
|-
 +
! Field
 +
! Offset
 +
! Length
 +
! Type
 +
! Notes
 +
|-
 +
|
 +
| 0x0054
 +
| 4
 +
| DWORD
 +
| The offset to section A. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0058
 +
| 4
 +
| DWORD
 +
| The number of entries in section A.
 +
|-
 +
|
 +
| 0x005C
 +
| 4
 +
| DWORD
 +
| The offset to section B. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0060
 +
| 4
 +
| DWORD
 +
| The number of entries in section B.
 +
|-
 +
|
 +
| 0x0064
 +
| 4
 +
| DWORD
 +
| The offset to section C. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0068
 +
| 4
 +
| DWORD
 +
| Length of section C.
 +
|-
 +
|
 +
| 0x006C
 +
| 4
 +
| DWORD
 +
| Offset to section D. The offset is relative from the start of the file.
 +
|-
 +
|
 +
| 0x0070
 +
| 4
 +
| DWORD
 +
| The number of entries in section D.
 +
|-
 +
|
 +
| 0x0074
 +
| 4
 +
| DWORD
 +
| Length of section D.
 +
|-
 +
|
 +
| 0x0078
 +
| 8
 +
| ?
 +
| Unknown
 +
|-
 +
|
 +
| 0x0080
 +
| 8
 +
| FILETIME
 +
| Latest execution time (or run time) of executable (FILETIME)
 +
|-
 +
|
 +
| <b>0x0088</b>
 +
| <b>7 x 8 = 56</b>
 +
| <b>FILETIME</b>
 +
| <b>Older execution times (or run times) of the executable: the 7 most recent ones before the latest (FILETIME each)</b>
 +
|-
 +
|
 +
| <b>0x00C0</b>
 +
| <b>16</b>
 +
| <b>?</b>
 +
| <b>Unknown</b>
 +
|-
 +
|
 +
| 0x00D0
 +
| 4
 +
| DWORD
 +
| Execution counter (or run count)
 +
|-
 +
|
 +
| <b>0x00D4</b>
 +
| <b>4</b>
 +
| <b>?</b>
 +
| <b>Unknown</b>
 +
|-
 +
|
 +
| <b>0x00D8</b>
 +
| <b>4</b>
 +
| <b>?</b>
 +
| <b>Unknown</b>
 +
|-
 +
|
 +
| <b>0x00DC</b>
 +
| <b>88</b>
 +
| <b>?</b>
 +
| <b>Unknown</b>
 +
|-
 +
|}
  
;2001
+
== Section A - Metrics ==
 +
This section contains an array with 20 byte (version 17) or 32 byte (version 23 and 26) metrics entry records.
  
* Mason McDaniel, [[Media:Mcdaniel01.pdf|Automatic File Type Detection Algorithm]], Masters Thesis, James Madison University,2001
+
A metrics entry record consists of:
 +
{| class="wikitable"
 +
|-
 +
! Field
 +
! Offset
 +
! Length
 +
! Type
 +
! Notes
 +
|-
 +
|
 +
| 0
 +
| 4
 +
| DWORD
 +
| Start time in ms
 +
|-
 +
|
 +
| 4
 +
| 4
 +
| DWORD
 +
| Duration in ms
 +
|-
 +
|
 +
| 8
 +
| 4
 +
| DWORD
 +
| Average duration in ms?
 +
|-
 +
|
 +
| 12
 +
| 4
 +
| DWORD
 +
| Filename string offset <br> The offset is relative to the start of the filename string section (section C)
 +
|-
 +
|
 +
| 16
 +
| 4
 +
| DWORD
 +
| Filename string number of characters without end-of-string character
 +
|-
 +
|
 +
| 20
 +
| 4
 +
| DWORD
 +
| Unknown, flags?
 +
|-
 +
|
 +
| 24
 +
| 8
 +
|
 +
| NTFS file reference
 +
|}
  
; 2003
+
== Section B ==
 +
This section contains an array with 12 byte (version 17, 23 and 26) entry records.
  
* [http://www2.computer.org/portal/web/csdl/abs/proceedings/hicss/2003/1874/09/187490332a.pdf Content Based File Type Detection Algorithms], Mason McDaniel and M. Hossain Heydari, 36th Annual Hawaii International Conference on System Sciences (HICSS'03) - Track 9, 2003.
+
The actual format and usage of these entry records is currently not known.
  
; 2005
+
== Section C - Filename strings ==
 +
This section contains filename strings. It consists of an array of UTF-16 little-endian formatted strings, each terminated by an end-of-string character (U+0000).
  
* Fileprints: identifying file types by n-gram analysis, Li Wei-Jen, Wang Ke, Stolfo SJ, Herzog B., Proceedings of the 2005 IEEE workshop on information assurance, 2005. ([http://www.itoc.usma.edu/workshop/2005/Papers/Follow%20ups/FilePrintPresentation-final.pdf Presentation Slides])  ([http://www1.cs.columbia.edu/ids/publications/FilePrintPaper-revised.pdf PDF])
+
At the end of the section there seems to be alignment padding that can contain remnant values.
  
* Douglas J. Hickok, Daine Richard Lesniak, Michael C. Rowe, File Type Detection Technology,  2005 Midwest Instruction and Computing Symposium.([http://www.micsymposium.org/mics_2005/papers/paper7.pdf PDF])
+
== Section D - Volumes information (block) ==
  
; 2006
+
Section D contains one or more subsections, each subsection refers to directories on a volume.
  
* Karresand Martin, Shahmehri Nahid [http://ieeexplore.ieee.org/iel5/10992/34632/01652088.pdf  File type identification of data fragments by their binary structure. ], Proceedings of the IEEE workshop on information assurance, pp.140–147, 2006.([http://www.itoc.usma.edu/workshop/2006/Program/Presentations/IAW2006-07-3.pdf Presentation Slides])
+
If all the executables and libraries referenced in section C are from a single disk volume, section D will contain only one subsection. If section C references multiple volumes, section D will contain multiple subsections. A simple way to force this situation is to copy, say, NOTEPAD.EXE to a USB drive and start it from that volume. The corresponding prefetch file will have one D header referring to, e.g., \DEVICE\HARDDISK1\DP(1)0-0+4 (the USB drive), and one to, e.g., \DEVICE\HARDDISKVOLUME1\ (where the .DLLs and other support files were found).
  
* Gregory A. Hall, Sliding Window Measurement for File Type Identification, Computer Forensics and Intrusion Analysis Group, ManTech Security and Mission Assurance, 2006. ([http://www.mantechcfia.com/SlidingWindowMeasurementforFileTypeIdentification.pdf PDF])
+
In this section, all offsets are assumed to be counted from the start of the D section.
  
* FORSIGS; Forensic Signature Analysis of the Hard Drive for Multimedia File Fingerprints, John Haggerty and Mark Taylor, IFIP TC11 International Information Security Conference, 2006, Sandton, South Africa.
+
=== Volume information ===
 +
The structure of the volume information is version dependent.
  
* Martin Karresand , Nahid Shahmehri, "Oscar -- Using Byte Pairs to Find File Type and Camera Make of Data Fragments," Annual Workshop on Digital Forensics and Incident Analysis, Pontypridd, Wales, UK, pp.85-94, Springer-Verlag, 2006.
+
==== Volume information - version 17 ====
 +
The volume information – version 17 is 40 bytes in size and consists of:
  
; 2007
+
{| class="wikitable"
 +
|-
 +
! Field
 +
! Offset
 +
! Length
 +
! Type
 +
! Notes
 +
|-
 +
| VI1
 +
| +0x0000
 +
| 4
 +
| DWORD
 +
| Offset to volume device path (Unicode, terminated by U+0000)
 +
|-
 +
| VI2
 +
| +0x0004
 +
| 4
 +
| DWORD
 +
| Length of volume device path (nr of characters, including terminating U+0000)
 +
|-
 +
| VI3
 +
| +0x0008
 +
| 8
 +
| FILETIME
 +
| Volume creation time.
 +
|-
 +
| VI4
 +
| +0x0010
 +
| 4
 +
| DWORD
 +
| Volume serial number of volume indicated by volume string
 +
|-
 +
| VI5
 +
| +0x0014
 +
| 4
 +
| DWORD
 +
| Offset to sub section E
 +
|-
 +
| VI6
 +
| +0x0018
 +
| 4
 +
| DWORD
 +
| Length of sub section E (in bytes)
 +
|-
 +
| VI7
 +
| +0x001C
 +
| 4
 +
| DWORD
 +
| Offset to sub section F
 +
|-
 +
| VI8
 +
| +0x0020
 +
| 4
 +
| DWORD
 +
| Number of strings in sub section F
 +
|-
 +
| VI9
 +
| +0x0024
 +
| 4
 +
| ?
 +
| Unknown
 +
|-
 +
|}
  
* Karresand M., Shahmehri N., [http://dx.doi.org/10.1007/0-387-33406-8_35 Oscar: File Type Identification of Binary Data in Disk Clusters and RAM Pages], Proceedings of IFIP International Information Security Conference: Security and Privacy in Dynamic Environments (SEC2006), Springer, ISBN 0-387-33405-x, pp.413-424, Karlstad, Sweden, May 2006.
+
==== Volume information - version 23 ====
 +
The volume information entry – version 23 is 104 bytes in size and consists of:
  
* Robert F. Erbacher and John Mulholland, "Identification and Localization of Data Types within Large-Scale File Systems," Proceedings of the 2nd International Workshop on Systematic Approaches to Digital Forensic Engineering, Seattle, WA, April 2007.
+
{| class="wikitable"
 +
|-
 +
! Field
 +
! Offset
 +
! Length
 +
! Type
 +
! Notes
 +
|-
 +
| VI1
 +
| +0x0000
 +
| 4
 +
| DWORD
 +
| Offset to volume device path (Unicode, terminated by U+0000)
 +
|-
 +
| VI2
 +
| +0x0004
 +
| 4
 +
| DWORD
 +
| Length of volume device path (nr of characters, including terminating U+0000)
 +
|-
 +
| VI3
 +
| +0x0008
 +
| 8
 +
| FILETIME
 +
| Volume creation time.
 +
|-
 +
| VI4
 +
| +0x0010
 +
| 4
 +
| DWORD
 +
| Volume serial number of volume indicated by volume string
 +
|-
 +
| VI5
 +
| +0x0014
 +
| 4
 +
| DWORD
 +
| Offset to sub section E
 +
|-
 +
| VI6
 +
| +0x0018
 +
| 4
 +
| DWORD
 +
| Length of sub section E (in bytes)
 +
|-
 +
| VI7
 +
| +0x001C
 +
| 4
 +
| DWORD
 +
| Offset to sub section F
 +
|-
 +
| VI8
 +
| +0x0020
 +
| 4
 +
| DWORD
 +
| Number of strings in sub section F
 +
|-
 +
| VI9
 +
| +0x0024
 +
| 4
 +
| ?
 +
| Unknown
 +
|-
 +
| <b>VI10</b>
 +
| <b>+0x0028</b>
 +
| <b>28</b>
 +
| <b>?</b>
 +
| <b>Unknown</b>
 +
|-
 +
| <b>VI11</b>
 +
| <b>+0x0044</b>
 +
| <b>4</b>
 +
| <b>?</b>
 +
| <b>Unknown</b>
 +
|-
 +
| <b>VI12</b>
 +
| <b>+0x0048</b>
 +
| <b>28</b>
 +
| <b>?</b>
 +
| <b>Unknown</b>
 +
|-
 +
| <b>VI13</b>
 +
| <b>+0x0064</b>
 +
| <b>4</b>
 +
| <b>?</b>
 +
| <b>Unknown</b>
 +
|-
 +
|}
  
* Ryan M. Harris, "Using Artificial Neural Networks for Forensic File Type Identification," Master's Thesis, Purdue University, May 2007. ([https://www.cerias.purdue.edu/tools_and_resources/bibtex_archive/archive/2007-19.pdf PDF])
+
==== Volume information - version 26 ====
 +
The volume information entry – version 26 appears to be similar to volume information – version 23.
  
* Predicting the Types of File Fragments, William Calhoun, Drue Coles, DFRWS 2008. ([http://www.dfrws.org/2008/proceedings/p14-calhoun_pres.pdf Presentation Slides])  ([http://www.dfrws.org/2008/proceedings/p14-calhoun.pdf PDF])
+
=== Sub section E - NTFS file references ===
 +
This sub section can contain NTFS file references.
  
* Sarah J. Moody and Robert F. Erbacher, [http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=04545366 SÁDI – Statistical Analysis for Data type Identification], 3rd International Workshop on Systematic Approaches to Digital Forensic Engineering, 2008.
+
For more information see [https://googledrive.com/host/0B3fBvzttpiiSbl9XZGZzQ05hZkU/Windows%20Prefetch%20File%20(PF)%20format.pdf Windows Prefetch File (PF) format].
  
; 2008
+
=== Sub section F - Directory strings ===
 +
This sub section contains directory strings. The number of strings is stored in the volume information.
  
* Mehdi Chehel Amirani, Mohsen Toorani, and Ali Asghar Beheshti Shirazi, [http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4625611 A New Approach to Content-based File Type Detection], Proceedings of the 13th IEEE Symposium on Computers and Communications (ISCC'08), pp.1103-1108, July 2008. ([http://arxiv.org/ftp/arxiv/papers/1002/1002.3174.pdf PDF])
+
A directory string is stored in the following structure:
 +
{| class="wikitable"
 +
|-
 +
! Field
 +
! Offset
 +
! Length
 +
! Type
 +
! Notes
 +
|-
 +
|
 +
| 0x0000
 +
| 2
 +
| WORD
 +
| Number of characters (WORDs) of the directory name. The value does not include the end-of-string character.
 +
|-
 +
|
 +
| 0x0002
 +
|
 +
| USTR
 +
| The directory name as a Unicode (UTF-16 little-endian) string terminated by an end-of-string character (U+0000).
 +
|-
 +
|}
  
; 2009
+
== See Also ==
* Roussev, Vassil, and Garfinkel, Simson, "File Classification Fragment-The Case for Specialized Approaches," Systematic Approaches to Digital Forensics Engineering (IEEE/SADFE 2009), Oakland, California. ([http://simson.net/clips/academic/2009.SADFE.Fragments.pdf PDF])
+
* [[Prefetch]]
  
* Irfan Ahmed, Kyung-suk Lhee, Hyunjung Shin and ManPyo Hong, [http://www.springerlink.com/content/g2655k2044615q75/ On Improving the Accuracy and Performance of Content-based File Type Identification], Proceedings of the 14th Australasian Conference on Information Security and Privacy (ACISP 2009), pp.44-59, LNCS (Springer), Brisbane, Australia, July 2009.
+
== External Links ==
 +
* [https://googledrive.com/host/0B3fBvzttpiiSbl9XZGZzQ05hZkU/Windows%20Prefetch%20File%20(PF)%20format.pdf Windows Prefetch File (PF) format], by the [[libssca|libssca project]]
 +
* [http://bitbucket.cassidiancybersecurity.com/prefetch-parser/wiki/Home Windows Prefetch file format], by the [http://bitbucket.cassidiancybersecurity.com/prefetch-parser prefetch-parser] project.
  
; 2010
+
[[Category:File Formats]]
*Irfan Ahmed, Kyung-suk Lhee, Hyunjung Shin and ManPyo Hong, [http://www.alphaminers.net/sub05/sub05_03.php?swf_pn=5&swf_sn=3&swf_pn2=3    Fast File-type Identification], Proceedings of the 25th ACM Symposium on Applied Computing (ACM SAC 2010), ACM, Sierre, Switzerland, March 2010.
+
 
+
;2011
+
*Irfan Ahmed, Kyung-Suk Lhee, Hyun-Jung Shin, Man-Pyo Hong, [http://link.springer.com/chapter/10.1007/978-3-642-24212-0_5 Fast Content-Based File Type Identification], Proceedings of the 7th Annual IFIP WG 11.9 International Conference on Digital Forensics, Orlando, FL, USA, February, 2011
+
[[Category:Bibliographies]]
+

Revision as of 11:45, 22 June 2014


A Windows Prefetch file consists of one file header and multiple file sections with different content. Not all content has an obvious forensic value.

As far as it has been possible to ascertain, there is no public description of the format. The description below has been synthesised from examination of multiple prefetch files.

Characteristics

Integers stored in little-endian
Strings Stored as UTF-16 little-endian without a byte-order-mark (BOM).
Timestamps Stored as Windows FILETIME in UTC.
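The three conventions above are enough to decode every typed field described on this page. The following is a minimal sketch, assuming only the Python standard library; the helper names `filetime_to_datetime` and `decode_utf16le` are hypothetical and chosen for illustration.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw: bytes) -> datetime:
    """Convert an 8-byte little-endian FILETIME into a UTC datetime."""
    (ticks,) = struct.unpack("<Q", raw)
    # datetime has microsecond resolution, so the last digit is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def decode_utf16le(raw: bytes) -> str:
    """Decode a UTF-16 little-endian string (no BOM), cut at the first U+0000."""
    text = raw.decode("utf-16-le", errors="replace")
    return text.split("\x00", 1)[0]
```

Cutting at the first U+0000 matters for fixed-size fields, where bytes after the terminator may hold remnant data.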

File header

The file header is 84 bytes of size and consists of:

Field Offset Length Type Notes
H1 0x0000 4 DWORD Format version (see format version section below)
H2 0x0004 4 DWORD Signature 'SCCA' (or in hexadecimal representation 0x53 0x43 0x43 0x41)
H3 0x0008 4 DWORD? Unknown - Values observed: 0x0F - Windows XP, 0x11 - Windows 7, Windows 8.1
H4 0x000C 4 DWORD Prefetch file size (or length) (sometimes referred to as End of File (EOF)).
H5 0x0010 60 USTR The name of the (original) executable as a Unicode (UTF-16 little-endian) string, up to 29 characters and terminated by an end-of-string character (U+0000). This name should correspond with the one in the prefetch file filename.
H6 0x004C 4 DWORD The prefetch hash. This hash value should correspond with the one in the prefetch file filename.
H7 0x0050 4 ? Unknown (flags)? Values observed: 0 for almost all prefetch files (XP); 1 for NTOSBOOT-B00DFAAD.pf (XP)

It's worth noting that the name of a carved prefetch file can be restored using the information in field H5 and H6, and its size can be determined by field H4.
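As an illustration of the header layout and of restoring the name of a carved file, here is a minimal sketch. The names `PrefetchHeader`, `parse_header` and `carved_filename` are hypothetical. The `NAME-HASH.pf` pattern with an 8-digit uppercase hexadecimal hash is an assumption inferred from the NTOSBOOT-B00DFAAD.pf example in the table above.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 84  # size of the file header as described above


class PrefetchHeader(NamedTuple):
    version: int          # H1
    unknown_h3: int       # H3
    file_size: int        # H4
    executable_name: str  # H5
    prefetch_hash: int    # H6
    unknown_h7: int       # H7


def parse_header(data: bytes) -> PrefetchHeader:
    """Parse the 84-byte file header; all integers are little-endian."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 84 bytes for the file header")
    version, signature, h3, size, raw_name, pf_hash, h7 = struct.unpack_from(
        "<I4sII60sII", data, 0
    )
    if signature != b"SCCA":  # H2
        raise ValueError("missing 'SCCA' signature")
    # H5 is UTF-16LE; keep only the part before the first U+0000.
    name = raw_name.decode("utf-16-le", errors="replace").split("\x00", 1)[0]
    return PrefetchHeader(version, h3, size, name, pf_hash, h7)


def carved_filename(header: PrefetchHeader) -> str:
    """Rebuild a file name from H5 and H6 (naming pattern is an assumption)."""
    return f"{header.executable_name}-{header.prefetch_hash:08X}.pf"
```

When carving, `file_size` (H4) gives the number of bytes to extract starting at the offset where the header was found.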

Format version

Value Windows version
17 (0x11) Windows XP, Windows 2003
23 (0x17) Windows Vista, Windows 7
26 (0x1a) Windows 8.1 (note this could be Windows 8 as well but has not been confirmed)
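Because the rest of the layout depends on the format version, a parser will typically dispatch on this value first. The sketch below is only an illustration: `FORMAT_VERSIONS` and `file_information_end` are hypothetical names, and the sizes are the file information sizes stated in the version-specific sections of this page.

```python
HEADER_SIZE = 84  # the file information starts right after the file header

# version -> (Windows versions, size of the file information in bytes)
FORMAT_VERSIONS = {
    17: ("Windows XP, Windows 2003", 68),
    23: ("Windows Vista, Windows 7", 156),
    26: ("Windows 8.1 (possibly Windows 8 as well)", 224),
}


def file_information_end(version: int) -> int:
    """Return the file offset just past the file information."""
    if version not in FORMAT_VERSIONS:
        raise ValueError(f"unsupported format version {version}")
    _, info_size = FORMAT_VERSIONS[version]
    return HEADER_SIZE + info_size
```

For version 17 this gives 84 + 68 = 152 (0x98), which matches the end of the last field in the version 17 table (0x0094 + 4).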

File information

The format of the file information is version dependent.

Note that some other format specifications consider the file information part of the file header.

File information - version 17

The file information – version 17 is 68 bytes of size and consists of:

Field Offset Length Type Notes
0x0054 4 DWORD The offset to section A. The offset is relative from the start of the file.
0x0058 4 DWORD The number of entries in section A.
0x005C 4 DWORD The offset to section B. The offset is relative from the start of the file.
0x0060 4 DWORD The number of entries in section B.
0x0064 4 DWORD The offset to section C. The offset is relative from the start of the file.
0x0068 4 DWORD Length of section C.
0x006C 4 DWORD Offset to section D. The offset is relative from the start of the file.
0x0070 4 DWORD The number of entries in section D.
0x0074 4 DWORD Length of section D.
0x0078 8 FILETIME Latest execution time (or run time) of executable (FILETIME)
0x0080 16  ? Unknown ? Possibly structured as 4 DWORD. Observed values: /0x00000000 0x00000000 0x00000000 0x00000000/, /0x47868c00 0x00000000 0x47860c00 0x00000000/ (don't exclude the possibility here that this is remnant data)
0x0090 4 DWORD Execution counter (or run count)
0x0094 4 DWORD? Unknown ? Observed values: 1, 2, 3, 4, 5, 6 (XP)
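The first nine DWORDs of the file information (0x0054 to 0x0077) locate sections A to D and, according to the tables on this page, have the same layout in versions 17, 23 and 26. A minimal sketch of reading them follows; `SectionTable` and `parse_section_table` are hypothetical names.

```python
import struct
from typing import NamedTuple


class SectionTable(NamedTuple):
    a_offset: int  # offset to section A, from the start of the file
    a_count: int   # number of entries in section A
    b_offset: int
    b_count: int
    c_offset: int
    c_length: int  # length of section C
    d_offset: int
    d_count: int   # number of entries in section D
    d_length: int  # length of section D


def parse_section_table(data: bytes) -> SectionTable:
    """Read the nine little-endian DWORDs that start at offset 0x54."""
    return SectionTable(*struct.unpack_from("<9I", data, 0x54))
```

A cautious parser would also check that every offset plus length stays within the file size given in header field H4 before following it.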

File information - version 23

The file information – version 23 is 156 bytes of size and consists of:

Field Offset Length Type Notes
0x0054 4 DWORD The offset to section A. The offset is relative from the start of the file.
0x0058 4 DWORD The number of entries in section A.
0x005C 4 DWORD The offset to section B. The offset is relative from the start of the file.
0x0060 4 DWORD The number of entries in section B.
0x0064 4 DWORD The offset to section C. The offset is relative from the start of the file.
0x0068 4 DWORD Length of section C.
0x006C 4 DWORD Offset to section D. The offset is relative from the start of the file.
0x0070 4 DWORD The number of entries in section D.
0x0074 4 DWORD Length of section D.
0x0078 8 ? Unknown
0x0080 8 FILETIME Latest execution time (or run time) of executable (FILETIME)
0x0088 16  ? Unknown
0x0098 4 DWORD Execution counter (or run count)
0x009C 4 DWORD? Unknown
0x00A0 80 ? Unknown

File information - version 26

The file information – version 26 is 224 bytes in size and consists of:

Field Offset Length Type Notes
0x0054 4 DWORD The offset to section A. The offset is relative from the start of the file.
0x0058 4 DWORD The number of entries in section A.
0x005C 4 DWORD The offset to section B. The offset is relative from the start of the file.
0x0060 4 DWORD The number of entries in section B.
0x0064 4 DWORD The offset to section C. The offset is relative from the start of the file.
0x0068 4 DWORD Length of section C.
0x006C 4 DWORD Offset to section D. The offset is relative from the start of the file.
0x0070 4 DWORD The number of entries in section D.
0x0074 4 DWORD Length of section D.
0x0078 8  ? Unknown
0x0080 8 FILETIME Latest execution time (or run time) of executable (FILETIME)
0x0088 7 x 8 = 56 FILETIME Older execution times (or run times) of the executable: the 7 most recent ones before the latest (FILETIME each)
0x00C0 16 ? Unknown
0x00D0 4 DWORD Execution counter (or run count)
0x00D4 4 ? Unknown
0x00D8 4 ? Unknown
0x00DC 88 ? Unknown
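The fields with the most obvious forensic value, the latest execution time and the execution counter, sit at different offsets in each version. The sketch below collects the offsets from the three tables above. `read_run_info` is a hypothetical helper, and treating all-zero slots in the version 26 array as unused is an assumption, not something the tables state.

```python
import struct
from datetime import datetime, timedelta, timezone

# version -> (offset of latest execution time, offset of execution counter)
RUN_INFO_OFFSETS = {17: (0x78, 0x90), 23: (0x80, 0x98), 26: (0x80, 0xD0)}
OLDER_RUN_TIMES_OFFSET_V26 = 0x88  # 7 consecutive FILETIME values


def filetime_to_datetime(ticks: int) -> datetime:
    """Convert a FILETIME tick count (100 ns units since 1601) to UTC."""
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=ticks // 10)


def read_run_info(data: bytes):
    """Return (latest run time, older run times, run count)."""
    (version,) = struct.unpack_from("<I", data, 0)
    if version not in RUN_INFO_OFFSETS:
        raise ValueError(f"unsupported format version {version}")
    time_offset, count_offset = RUN_INFO_OFFSETS[version]
    (latest,) = struct.unpack_from("<Q", data, time_offset)
    (run_count,) = struct.unpack_from("<I", data, count_offset)
    older = []
    if version == 26:
        for ticks in struct.unpack_from("<7Q", data, OLDER_RUN_TIMES_OFFSET_V26):
            if ticks:  # assumption: a zero value marks an unused slot
                older.append(filetime_to_datetime(ticks))
    return filetime_to_datetime(latest), older, run_count
```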

Section A - Metrics

This section contains an array with 20 byte (version 17) or 32 byte (version 23 and 26) metrics entry records.

A metrics entry record consists of:

Field Offset Length Type Notes
0 4 DWORD Start time in ms
4 4 DWORD Duration in ms
8 4 DWORD Average duration in ms?
12 4 DWORD Filename string offset. The offset is relative to the start of the filename string section (section C)
16 4 DWORD Filename string number of characters without end-of-string character
20 4 DWORD Unknown, flags?
24 8 NTFS file reference
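A sketch of walking the metrics array is given below. The table does not say which fields make up the 20-byte version 17 record, so the sketch assumes it is the first five DWORDs and that the fields at offsets 20 and 24 exist only in the 32-byte record; this assumption should be checked against real files. `parse_metrics` is a hypothetical name.

```python
import struct


def parse_metrics(data: bytes, offset: int, count: int, version: int):
    """Parse `count` metrics entry records starting at file offset `offset`."""
    if version == 17:
        entry_size = 20
    elif version in (23, 26):
        entry_size = 32
    else:
        raise ValueError(f"unsupported format version {version}")
    entries = []
    for index in range(count):
        base = offset + index * entry_size
        start_ms, duration_ms, average_ms, name_offset, name_chars = (
            struct.unpack_from("<5I", data, base)
        )
        entry = {
            "start_ms": start_ms,
            "duration_ms": duration_ms,
            "average_ms": average_ms,    # marked uncertain in the table
            "name_offset": name_offset,  # relative to the start of section C
            "name_chars": name_chars,    # excludes the end-of-string character
        }
        if entry_size == 32:
            # assumption: only the 32-byte record carries these two fields
            flags, file_reference = struct.unpack_from("<IQ", data, base + 20)
            entry["flags"] = flags
            entry["ntfs_file_reference"] = file_reference
        entries.append(entry)
    return entries
```

The `offset` and `count` arguments correspond to the section A offset and entry count stored at 0x0054 and 0x0058 of the file information.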

Section B

This section contains an array with 12 byte (version 17, 23 and 26) entry records.

The actual format and usage of these entry records is currently not known.

Section C - Filename strings

This section contains filename strings. It consists of an array of UTF-16 little-endian formatted strings, each terminated by an end-of-string character (U+0000).

At the end of the section there seems to be alignment padding that can contain remnant values.
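Two ways of reading the filename strings can be sketched as follows. `read_filename` looks up a single name through the offset and character count of a metrics entry, assuming the offset is counted in bytes (the table does not state the unit). `split_filename_strings` simply splits the whole section on U+0000. Both function names are hypothetical.

```python
from typing import List


def read_filename(data: bytes, c_offset: int, name_offset: int, name_chars: int) -> str:
    """Read one filename using the values from a metrics entry record."""
    start = c_offset + name_offset   # assumption: name_offset is in bytes
    end = start + 2 * name_chars     # UTF-16: two bytes per character
    return data[start:end].decode("utf-16-le", errors="replace")


def split_filename_strings(section_c: bytes) -> List[str]:
    """Split the raw bytes of section C on the end-of-string character."""
    text = section_c.decode("utf-16-le", errors="replace")
    # Empty pieces come from the terminators and from zero-filled padding.
    return [name for name in text.split("\x00") if name]
```

Because of the alignment padding noted above, the last item returned by `split_filename_strings` may be remnant data rather than a real filename; the lookup through metrics entries avoids that ambiguity.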

Section D - Volumes information (block)

Section D contains one or more subsections; each subsection refers to directories on a volume.

If all the executables and libraries referenced in section C are from a single disk volume, section D will contain only one subsection. If section C references multiple volumes, section D will contain multiple subsections. A simple way to force this situation is to copy, say, NOTEPAD.EXE to a USB drive and start it from that volume. The corresponding prefetch file will have one D header referring to, e.g., \DEVICE\HARDDISK1\DP(1)0-0+4 (the USB drive), and one to, e.g., \DEVICE\HARDDISKVOLUME1\ (where the .DLLs and other support files were found).

In this section, all offsets are assumed to be counted from the start of the D section.

Volume information

The structure of the volume information is version dependent.

Volume information - version 17

The volume information – version 17 is 40 bytes in size and consists of:

Field Offset Length Type Notes
VI1 +0x0000 4 DWORD Offset to volume device path (Unicode, terminated by U+0000)
VI2 +0x0004 4 DWORD Length of volume device path (nr of characters, including terminating U+0000)
VI3 +0x0008 8 FILETIME Volume creation time.
VI4 +0x0010 4 DWORD Volume serial number of volume indicated by volume string
VI5 +0x0014 4 DWORD Offset to sub section E
VI6 +0x0018 4 DWORD Length of sub section E (in bytes)
VI7 +0x001C 4 DWORD Offset to sub section F
VI8 +0x0020 4 DWORD Number of strings in sub section F
VI9 +0x0024 4  ? Unknown
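The first nine fields (VI1 to VI9) are common to the version 17 and version 23 tables, so one reader can cover both. The sketch below takes the raw bytes of section D and resolves the device path, with offsets counted from the start of section D as stated above. `VolumeInfo` and `parse_volume_info` are hypothetical names.

```python
import struct
from typing import NamedTuple


class VolumeInfo(NamedTuple):
    device_path: str        # resolved through VI1 and VI2
    creation_filetime: int  # VI3, raw FILETIME ticks
    serial_number: int      # VI4
    e_offset: int           # VI5
    e_length: int           # VI6, in bytes
    f_offset: int           # VI7
    f_count: int            # VI8, number of strings


def parse_volume_info(section_d: bytes, entry_offset: int = 0) -> VolumeInfo:
    """Parse fields VI1 to VI8 of one volume information entry."""
    path_offset, path_chars, created, serial, e_off, e_len, f_off, f_count = (
        struct.unpack_from("<IIQIIIII", section_d, entry_offset)
    )
    # Offsets are relative to the start of section D, not to the entry.
    raw_path = section_d[path_offset:path_offset + 2 * path_chars]
    path = raw_path.decode("utf-16-le", errors="replace").split("\x00", 1)[0]
    return VolumeInfo(path, created, serial, e_off, e_len, f_off, f_count)
```

With several volumes, `entry_offset` would advance by the version-specific entry size (40 bytes for version 17, 104 bytes for version 23), assuming the entries are stored back to back at the start of section D.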

Volume information - version 23

The volume information entry – version 23 is 104 bytes in size and consists of:

Field Offset Length Type Notes
VI1 +0x0000 4 DWORD Offset to volume device path (Unicode, terminated by U+0000)
VI2 +0x0004 4 DWORD Length of volume device path (nr of characters, including terminating U+0000)
VI3 +0x0008 8 FILETIME Volume creation time.
VI4 +0x0010 4 DWORD Volume serial number of volume indicated by volume string
VI5 +0x0014 4 DWORD Offset to sub section E
VI6 +0x0018 4 DWORD Length of sub section E (in bytes)
VI7 +0x001C 4 DWORD Offset to sub section F
VI8 +0x0020 4 DWORD Number of strings in sub section F
VI9 +0x0024 4  ? Unknown
VI10 +0x0028 28 ? Unknown
VI11 +0x0044 4 ? Unknown
VI12 +0x0048 28 ? Unknown
VI13 +0x0064 4 ? Unknown

Volume information - version 26

The volume information entry – version 26 appears to be similar to volume information – version 23.

Sub section E - NTFS file references

This sub section can contain NTFS file references.

For more information see Windows Prefetch File (PF) format.
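This page does not describe how an individual file reference is laid out. In NTFS generally, a file reference is a 64-bit value whose lower 48 bits hold the MFT entry number and whose upper 16 bits hold the sequence number. Assuming the references stored in prefetch files follow that standard layout, they can be split as in this sketch (`split_file_reference` is a hypothetical name):

```python
from typing import Tuple


def split_file_reference(reference: int) -> Tuple[int, int]:
    """Split a 64-bit NTFS file reference into (MFT entry, sequence number)."""
    mft_entry = reference & 0xFFFFFFFFFFFF  # lower 48 bits
    sequence = (reference >> 48) & 0xFFFF   # upper 16 bits
    return mft_entry, sequence
```

The same split applies to the 8-byte NTFS file reference at offset 24 of the 32-byte metrics entry record.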

Sub section F - Directory strings

This sub section contains directory strings. The number of strings is stored in the volume information.

A directory string is stored in the following structure:

Field Offset Length Type Notes
0x0000 2 WORD Number of characters (WORDs) of the directory name. The value does not include the end-of-string character.
0x0002 USTR The directory name as a Unicode (UTF-16 little-endian) string terminated by an end-of-string character (U+0000).
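The directory strings are length-prefixed and stored one after another, so they have to be read sequentially. A minimal sketch, assuming the 2-byte character count described above and no padding between entries (the page does not mention any); `parse_directory_strings` is a hypothetical name:

```python
import struct
from typing import List


def parse_directory_strings(data: bytes, offset: int, count: int) -> List[str]:
    """Read `count` directory strings starting at `offset` within `data`."""
    names = []
    position = offset
    for _ in range(count):
        (num_chars,) = struct.unpack_from("<H", data, position)
        position += 2
        raw = data[position:position + 2 * num_chars]
        names.append(raw.decode("utf-16-le", errors="replace"))
        # Skip the string itself plus its end-of-string character (U+0000).
        position += 2 * (num_chars + 1)
    return names
```

Here `data` would be the bytes of section D, `offset` the value of VI7 and `count` the value of VI8 from the volume information.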

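See Also

* Prefetch

External Links

* Windows Prefetch File (PF) format, by the libssca project
* Windows Prefetch file format, by the prefetch-parser project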