From ForensicsWiki

Latest revision as of 23:42, 30 September 2013

DOCX is the default file format for Word documents in Microsoft Office 2007 and later.

DOCX should not be confused with DOC, the format used by earlier versions of Microsoft Office.

Container Format

A DOCX file is a ZIP archive containing XML parts and binary resources. The content can be analysed without modifying the file by extracting the archive (e.g. with WinZIP) and examining its members.
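
The archive can also be inspected without extracting it to disk. The following is a minimal Python sketch, assuming only the standard zipfile module; the function name list_docx_members is illustrative and not part of any existing tool.

```python
import zipfile


def list_docx_members(source):
    """Return (name, uncompressed size) for every member of a DOCX archive.

    `source` may be a path or a binary file object. The archive is opened
    read-only, so the file under examination is not modified.
    """
    with zipfile.ZipFile(source, "r") as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]
```

Calling list_docx_members("sample.docx") (a hypothetical file name) would typically list entries such as [Content_Types].xml, _rels/.rels, docProps/core.xml and word/document.xml.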

The file _rels/.rels describes the structure of the package. It gives the paths to the metadata parts and to the main XML document that holds the content of the document itself.

Metadata is usually stored in the folder docProps. This folder holds two or more XML files: app.xml, which stores metadata recorded by the Word application itself, and core.xml, which stores metadata about the document itself, such as the author name and the time it was last printed.

Another folder contains the actual content of the document. In a Word (.docx) document this folder is named word. An XML file called document.xml is the main document and contains most of the content.
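
As a rough illustration of reading the main document part, the sketch below collects the text of each paragraph in word/document.xml. The WordprocessingML namespace URI and the w:p / w:t element names are not given on this page and are stated here as assumptions; the function name docx_paragraphs is illustrative. It is a simplification: it ignores tabs, line breaks, footnotes and headers, and text inside nested paragraphs (e.g. text boxes) may be reported more than once.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumption: main WordprocessingML namespace (not quoted on this page).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source, part="word/document.xml"):
    """Return the plain text of every w:p element in the main document part."""
    with zipfile.ZipFile(source, "r") as archive:
        root = ET.fromstring(archive.read(part))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # Text runs live in w:t elements; join them in document order.
        runs = [t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs
```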

Relationship to OOXML

Office Open XML (OOXML) is an open XML standard developed by Microsoft for word processing documents, spreadsheets, presentations and charts. The OOXML standard was submitted to the ISO for approval. After initially being rejected over technical concerns, a modified version was approved as ISO/IEC 29500:2008. Microsoft intended to use the OOXML standard for its Office suite. However, Office does not support the standard that the ISO approved; it only supports the version that was originally rejected by the ISO[1]. As of Office 2010, Microsoft had still not brought its software into compliance with the standard.

For most purposes OOXML may be considered a subset of DOCX (DOCX contains additional features, like OLE serialization).

Documentation on OOXML may provide a guide to analysing a DOCX file.

Metadata

Content types

[Content_Types].xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
<Default Extension="emf" ContentType="image/x-emf"/>
<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
<Default Extension="xml" ContentType="application/xml"/>
<Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
<Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml"/>
<Override PartName="/word/stylesWithEffects.xml" ContentType="application/vnd.ms-word.stylesWithEffects+xml"/>
<Override PartName="/word/settings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml"/>
<Override PartName="/word/webSettings.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.webSettings+xml"/>
<Override PartName="/word/fontTable.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml"/>
<Override PartName="/word/theme/theme1.xml" ContentType="application/vnd.openxmlformats-officedocument.theme+xml"/>
<Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml"/>
<Override PartName="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml"/>
</Types>
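
The listing above can be read programmatically to find the declared content type of any part. The sketch below assumes the usual lookup order (an Override for the exact part name first, then the Default for the file extension); the function names content_types and type_of are illustrative.

```python
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"


def content_types(source):
    """Return (defaults, overrides) dictionaries from [Content_Types].xml."""
    with zipfile.ZipFile(source, "r") as archive:
        root = ET.fromstring(archive.read("[Content_Types].xml"))
    defaults = {
        el.get("Extension").lower(): el.get("ContentType")
        for el in root.findall(f"{{{CT_NS}}}Default")
    }
    overrides = {
        el.get("PartName"): el.get("ContentType")
        for el in root.findall(f"{{{CT_NS}}}Override")
    }
    return defaults, overrides


def type_of(part_name, defaults, overrides):
    """Look up a part such as '/word/document.xml'; None if undeclared."""
    if part_name in overrides:
        return overrides[part_name]
    extension = part_name.rsplit(".", 1)[-1].lower() if "." in part_name else ""
    return defaults.get(extension)
```

A part present in the archive but with no matching Override or Default may be worth a closer look during an examination.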

Relationships

_rels/.rels
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId3" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml"/>
<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/thumbnail" Target="docProps/thumbnail.emf"/>
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
<Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml"/>
</Relationships>

Other relationship files:

word/_rels/document.xml.rels
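
A relationships file can be used to locate parts rather than assuming fixed paths. The following sketch reads a .rels part and looks up a target by the last segment of its Type URI (e.g. "officeDocument" or "core-properties", as in the listing above). The function names are illustrative, and the sketch simply returns the first match if several relationships share a type.

```python
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def relationships(source, rels_part="_rels/.rels"):
    """Return the Relationship entries of a .rels part as a list of dicts."""
    with zipfile.ZipFile(source, "r") as archive:
        root = ET.fromstring(archive.read(rels_part))
    return [
        {"id": el.get("Id"), "type": el.get("Type"), "target": el.get("Target")}
        for el in root.findall(f"{{{REL_NS}}}Relationship")
    ]


def find_target(rels, kind):
    """Return the Target of the first relationship whose Type ends in /kind."""
    return next((r["target"] for r in rels if r["type"].endswith("/" + kind)), None)
```

The same relationships() function can be pointed at word/_rels/document.xml.rels by passing that path as rels_part.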

Document properties - core

docProps/core.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dcmitype="http://purl.org/dc/dcmitype/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<dc:creator>User 1</dc:creator>
<cp:lastModifiedBy>User 2</cp:lastModifiedBy>
<cp:revision>3</cp:revision>
<dcterms:created xsi:type="dcterms:W3CDTF">2012-11-07T23:29:00Z</dcterms:created>
<dcterms:modified xsi:type="dcterms:W3CDTF">2013-08-25T22:18:00Z</dcterms:modified>
</cp:coreProperties>
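
The core properties shown above can be pulled out with a few lines of Python. This is a sketch only: it assumes the part is at docProps/core.xml (in practice the path should be taken from _rels/.rels), it reads just the five elements that appear in the example, and it returns timestamps as the raw W3CDTF strings. The function name core_properties is illustrative.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace prefixes as declared in the example core.xml above.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

FIELDS = {
    "creator": "dc:creator",
    "lastModifiedBy": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def core_properties(source, part="docProps/core.xml"):
    """Return selected core properties as strings; missing ones are None."""
    with zipfile.ZipFile(source, "r") as archive:
        root = ET.fromstring(archive.read(part))
    result = {}
    for key, path in FIELDS.items():
        element = root.find(path, NS)
        result[key] = element.text if element is not None else None
    return result
```

For the example above this would report "User 1" as creator and "User 2" as lastModifiedBy, which shows that the document was last saved under a different user name than the one that created it.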

Document properties - extended: application

docProps/app.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Properties
    xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
    xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
    <Template>Normal.dotm</Template>
    <TotalTime>1385</TotalTime>
    <Pages>1</Pages>
    <Words>2</Words>
    <Characters>13</Characters>
    <Application>Microsoft Office Word</Application>
    <DocSecurity>0</DocSecurity>
    <Lines>1</Lines>
    <Paragraphs>1</Paragraphs>
    <ScaleCrop>false</ScaleCrop>
    <HeadingPairs>
        <vt:vector size="2" baseType="variant">
            <vt:variant>
                <vt:lpstr>Title</vt:lpstr>
            </vt:variant>
            <vt:variant>
                <vt:i4>1</vt:i4>
            </vt:variant>
        </vt:vector>
    </HeadingPairs>
    <TitlesOfParts>
        <vt:vector size="1" baseType="lpstr">
            <vt:lpstr></vt:lpstr>
        </vt:vector>
    </TitlesOfParts>
    <Company></Company>
    <LinksUpToDate>false</LinksUpToDate>
    <CharactersWithSpaces>14</CharactersWithSpaces>
    <SharedDoc>false</SharedDoc>
    <HyperlinksChanged>false</HyperlinksChanged>
    <AppVersion>14.0000</AppVersion>
</Properties>
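
The simple (non-nested) extended properties can be read in the same way. The sketch below assumes the part is at docProps/app.xml and skips elements with children, such as HeadingPairs and TitlesOfParts; the function name app_properties is illustrative. Values are returned as strings, so any interpretation (for instance, TotalTime is commonly read as editing time in minutes) is left to the examiner.

```python
import zipfile
import xml.etree.ElementTree as ET


def app_properties(source, part="docProps/app.xml"):
    """Return the leaf elements of app.xml as a {local name: text} dict."""
    with zipfile.ZipFile(source, "r") as archive:
        root = ET.fromstring(archive.read(part))
    properties = {}
    for child in root:
        if len(child) == 0:  # skip vectors such as HeadingPairs
            local_name = child.tag.split("}", 1)[-1]  # drop the {namespace}
            properties[local_name] = child.text or ""
    return properties
```

For the example above, the result would include Application "Microsoft Office Word" and AppVersion "14.0000", which help identify the software that last saved the file.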

External Links