Difference between pages "Word Document (DOCX)" and "Personal Folder File (PAB, PST, OST)"

From ForensicsWiki
(Difference between pages)
(Relationship to OOXML: expanded information about OOXML)
 
(Personal Folder File (.PST, .OST, .PAB))
 
Line 1: Line 1:
DOCX is the word processing file format used by Microsoft Word 2007 and later.
+
[[Microsoft]] [[Outlook]] uses the '''Personal Folder File (PFF)''' to store e-mails, appointments, tasks, contacts, notes, etc.
  
DOCX should not be confused with [[DOC]], the format used by earlier versions of Microsoft Office.
+
Three different types of the PFF are known:
 +
* The '''Personal Address Book (PAB)''', which contains the address book of contacts. These files have the extension '''.pab'''.
 +
* The '''Personal Storage Table (PST)''', which contains items like e-mails, appointments, tasks, notes, etc. and is used for current and archived mailbox files. These files have the extension '''.pst'''. The PST format is also referred to as the '''Personal Folder File (PFF)''' format.
 +
* The '''Offline Storage Table (OST)''', which contains items like e-mails, appointments, tasks, notes, etc. and is used for offline mailbox files in conjunction with [[Microsoft]] [[Exchange]]. These files have the extension '''.ost'''. The OST format is also referred to as the '''Offline Folder File (OFF)''' format.
  
= Container Format =
+
All three file types share the same underlying file format. Its actual name is unknown, but it has been dubbed the '''Personal Folder File (PFF)''' format because of its most common usage.
  
DOCX is written in an XML format, which consists of a [[ZIP archive]] file containing [[XML]] and binaries. Content can be analysed without modification by unzipping the file (e.g. in WinZIP) and analysing the contents of the archive.
+
== MIME types ==
  
The file _rels/.rels contains information about the structure of the document.  It contains paths to the metadata information as well as the main XML document that contains the content of the document itself.
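The role of _rels/.rels described above can be illustrated with a minimal Python sketch. It assumes the relationships part uses the standard OPC relationships namespace; the function name read_package_relationships is hypothetical and chosen only for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace assumed for OPC relationship parts (not stated in the article).
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def read_package_relationships(docx):
    """Return (relationship type, target path) pairs from _rels/.rels.

    `docx` may be a file path or a binary file-like object.
    The archive is only read, never modified.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("_rels/.rels"))
    return [
        (rel.get("Type"), rel.get("Target"))
        for rel in root.iter(RELS_NS + "Relationship")
    ]
```

Each returned target is a path inside the archive, such as the main document part or one of the metadata parts.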
+
The actual MIME type of the PFF format is unspecified; however, some sources claim that the following [[MIME types]] apply to this [[file format]]:
 +
* application/vnd.ms-outlook (for PST files)
  
Metadata is usually stored in the docProps folder.  That folder contains two or more XML files: app.xml, which stores metadata extracted from the Word application itself, and core.xml, which stores metadata from the document itself, such as the author name, the last time it was printed, etc.
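As a rough illustration of reading the document metadata, the following Python sketch collects the child elements of docProps/core.xml into a dictionary. The function name read_core_properties is hypothetical, and the sketch assumes the file is present in the archive; it strips XML namespaces rather than validating them.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_core_properties(docx):
    """Return {property name: text} for the elements in docProps/core.xml.

    `docx` may be a file path or a binary file-like object.
    Namespace prefixes are dropped, so dc:creator becomes "creator".
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    properties = {}
    for child in root:
        # Tags look like "{namespace}localname"; keep only the local name.
        local_name = child.tag.rsplit("}", 1)[-1]
        properties[local_name] = child.text
    return properties
```

Because the archive is opened read-only, the evidence file itself is left unchanged.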
+
== File signature ==
  
Another folder contains the actual content of the document; in a Word (.docx) document this folder is named word.  An XML file called document.xml is the main document and contains most of the content of the document itself.
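A minimal Python sketch of pulling the visible text out of word/document.xml follows. It assumes the WordprocessingML main namespace and the usual w:p (paragraph) and w:t (text run) elements; the function name extract_paragraphs is hypothetical. Paragraphs nested inside other paragraphs (for example in text boxes) would be reported twice by this simple approach.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace assumed for WordprocessingML elements (not stated in the article).
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_paragraphs(docx):
    """Return the text of each w:p element in word/document.xml.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(W_NS + "p"):
        # Concatenate all text runs that belong to this paragraph.
        text = "".join(run.text or "" for run in paragraph.iter(W_NS + "t"))
        paragraphs.append(text)
    return paragraphs
```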
+
The PFF has the following file signature:
 +
hexadecimal: 21 42 44 4e
 +
ASCII: !BDN
  
= Relationship to OOXML =
+
== File types ==
  
Office Open XML (OOXML) is an open XML standard developed by Microsoft for word processing documents, spreadsheets, presentations and charts. The OOXML standard was submitted to the ISO for approval.  After initially rejecting it over technical concerns, the ISO approved a modified version as ISO/IEC 29500:2008. Microsoft intended to use the OOXML standard for its Office suite. However, Office does not support the standard that the ISO approved; it supports only the version that was originally [http://arstechnica.com/microsoft/news/2010/04/iso-ooxml-convener-microsofts-format-heading-for-failure.ars rejected] by the ISO.  As of Office 2010, Microsoft has still not brought its software into compliance with the standard.
+
The PFF exists in a 32-bit and a 64-bit version. Both have the same file signature but can be distinguished by the version value in the file header.
  
For most purposes OOXML may be considered a subset of DOCX (DOCX contains additional features, like OLE serialization).
+
== Contents ==
  
Documentation on OOXML may provide a guide to analysing a DOCX file.
+
The PFF contains a hierarchy of items. The attributes of these items are defined by the [[Microsoft]] [[Outlook]] [[Message API (MAPI)|Messaging API (MAPI)]].
  
= External Links =
+
== Encryption ==
  
* [http://msdn.microsoft.com/en-us/library/aa338205.aspx Information from Microsoft about the structure of OpenXML documents]
+
The PFF format allows the file to be encrypted. Two types of encryption are currently known; they are referred to as compressible encryption and high encryption.
 +
Compressible encryption is a basic substitution cypher, and high encryption is a slightly more complex substitution cypher.
 +
From a cryptographic point of view, this is more a form of obfuscation than a means of protecting confidentiality.
  
* [http://www.simson.net/clips/academic/2009.IEEE.DOCX.pdf The new XML Office Document Files: Implications For Forensics], [[Simson L. Garfinkel]] and James Migletz
+
== See also ==
  
* [http://blog.kiddaland.net/2009/07/antiword-for-office-2007/ Perl script that displays the content of a Docx document, similar to Antiword]
+
* A great deal of information about the format has been documented by the [http://libpff.sourceforge.net libpff project], including some of the [http://downloads.sourceforge.net/libpff/Personal_Folder_File_format.pdf Personal Folder File format specifications] and [http://downloads.sourceforge.net/libpff/MAPI_definitions.pdf MAPI definitions].
 +
* [http://www.five-ten-sg.com/libpst/ libpst]
  
* [http://blog.kiddaland.net/2009/06/office-2007-metadata/ Perl script that displays metadata information that is extracted from an OpenXML document]
 
 
[[Category:File Formats]]
 
[[Category:File Formats]]

Revision as of 04:15, 31 January 2009

Microsoft Outlook uses the Personal Folder File (PFF) to store e-mails, appointments, tasks, contacts, notes, etc.

Three different types of the PFF are known:

  • The Personal Address Book (PAB), which contains the address book of contacts. These files have the extension .pab.
  • The Personal Storage Table (PST), which contains items like e-mails, appointments, tasks, notes, etc. and is used for current and archived mailbox files. These files have the extension .pst. The PST format is also referred to as the Personal Folder File (PFF) format.
  • The Offline Storage Table (OST), which contains items like e-mails, appointments, tasks, notes, etc. and is used for offline mailbox files in conjunction with Microsoft Exchange. These files have the extension .ost. The OST format is also referred to as the Offline Folder File (OFF) format.

All three file types share the same underlying file format. Its actual name is unknown, but it has been dubbed the Personal Folder File (PFF) format because of its most common usage.

MIME types

The actual MIME type of the PFF format is unspecified; however, some sources claim that the following MIME types apply to this file format:

  • application/vnd.ms-outlook (for PST files)

File signature

The PFF has the following file signature:

  hexadecimal: 21 42 44 4e
  ASCII: !BDN
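A signature check of this kind can be sketched in Python as follows. The function names are hypothetical; the only value taken from the text above is the four-byte signature at the start of the file.

```python
# The four signature bytes 21 42 44 4e, i.e. ASCII "!BDN".
PFF_SIGNATURE = bytes.fromhex("2142444e")


def has_pff_signature(header):
    """Return True if the given bytes start with the PFF signature."""
    return header[:4] == PFF_SIGNATURE


def file_has_pff_signature(path):
    """Read the first four bytes of a file and compare them to the signature."""
    with open(path, "rb") as handle:
        return has_pff_signature(handle.read(4))
```

A matching signature only suggests a PAB, PST or OST file; it does not tell the three types apart.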

File types

The PFF exists in a 32-bit and a 64-bit version. Both have the same file signature but can be distinguished by the version value in the file header.
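The following Python sketch shows one way such a version check might look. The article does not give the header layout, so the field position and values here are assumptions taken from Microsoft's later published [MS-PST] specification: a 2-byte little-endian version at offset 10, with 14 or 15 indicating the 32-bit (ANSI) format and 23 or higher indicating the 64-bit (Unicode) format. The function name pff_variant is hypothetical.

```python
import struct

PFF_SIGNATURE = b"!BDN"
# Assumption (from [MS-PST], not from this article): the version field
# is a 2-byte little-endian integer located at byte offset 10.
VERSION_OFFSET = 10


def pff_variant(header):
    """Classify a PFF header as "32-bit", "64-bit" or "unknown".

    `header` must contain at least the first 12 bytes of the file.
    """
    if len(header) < VERSION_OFFSET + 2 or header[:4] != PFF_SIGNATURE:
        raise ValueError("not a PFF header")
    (version,) = struct.unpack_from("<H", header, VERSION_OFFSET)
    if version in (14, 15):   # assumed 32-bit (ANSI) values
        return "32-bit"
    if version >= 23:         # assumed 64-bit (Unicode) values
        return "64-bit"
    return "unknown"
```

Returning "unknown" rather than guessing keeps unexpected version values visible to the examiner.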

Contents

The PFF contains a hierarchy of items. The attributes of these items are defined by the Microsoft Outlook Messaging API (MAPI).

Encryption

The PFF format allows the file to be encrypted. Two types of encryption are currently known; they are referred to as compressible encryption and high encryption. Compressible encryption is a basic substitution cypher, and high encryption is a slightly more complex substitution cypher. From a cryptographic point of view, this is more a form of obfuscation than a means of protecting confidentiality.
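To illustrate why a plain byte substitution offers little protection, here is a toy Python sketch. The substitution table below is made up for this example and is NOT the table used by the PFF format; the function names are likewise hypothetical. The point is only that a fixed, keyless table can be inverted by anyone who knows it.

```python
# Made-up substitution table (NOT the PFF table): 7 is coprime to 256,
# so i -> (7 * i + 3) mod 256 maps every byte value to a distinct byte.
ENCODE_TABLE = bytes((i * 7 + 3) % 256 for i in range(256))

# The inverse table maps each substituted byte back to the original.
DECODE_TABLE = bytes(ENCODE_TABLE.index(i) for i in range(256))


def obfuscate(data):
    """Replace every byte with its entry in the substitution table."""
    return data.translate(ENCODE_TABLE)


def deobfuscate(data):
    """Undo obfuscate() using the inverse table; no key is involved."""
    return data.translate(DECODE_TABLE)
```

Since no secret key is involved, recovering the original data only requires knowing the table.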

See also