Difference between pages "Word Document (DOCX)" and "Executable"

From ForensicsWiki
(Difference between pages)
(Core (Document) Properties)
 
(External Links)
 
Line 1: Line 1:
DOCX is the default document format of Microsoft Word 2007 and later.
+
{{expand}}
  
DOCX should not be confused with [[DOC]], the format used by earlier versions of Microsoft Office.
+
An executable file is used to perform tasks according to encoded instructions. Executable files are sometimes also referred to as binaries, although binaries are technically a subclass of executable files.
  
= Container Format =
+
There are multiple families of executable files:
 +
* Scripts; e.g. shell scripts, batch scripts (.bat)
 +
* DOS and Windows executable files (.exe), which come in various formats such as MZ, PE/COFF and NE
 +
* ELF
 +
* Mach-O
  
DOCX is a container format: a [[ZIP archive]] file containing [[XML]] files and binaries. Content can be analysed without modification by unzipping the file (e.g. in WinZip) and analysing the contents of the archive.
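
A minimal sketch of that first step, listing the archive members without extracting or altering the file. It relies only on Python's standard zipfile module; the function name list_docx_parts is hypothetical and chosen for illustration.

```python
import zipfile


def list_docx_parts(docx):
    """Return (member name, uncompressed size) for each entry in the container.

    `docx` may be a path or a binary file-like object. The archive is opened
    in read mode, so the file under examination is not modified.
    """
    with zipfile.ZipFile(docx, "r") as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]
```

Calling list_docx_parts("sample.docx") (a hypothetical path) would typically show entries such as _rels/.rels and the docProps and word folders described below.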
+
== External Links ==
 +
* [http://en.wikipedia.org/wiki/Executable Wikipedia: Executable]
 +
* [ftp://ftp.cs.wisc.edu/paradyn/papers/Rosenblum10prov.pdf Extracting Compiler Provenance from Program Binaries], by Nathan E. Rosenblum, Barton P. Miller, Xiaojin Zhu, June 2010
  
The file _rels/.rels contains information about the structure of the document. It contains the paths to the metadata files as well as to the main XML document that holds the content of the document itself.
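
As a rough illustration, the relationship entries in _rels/.rels can be read with the standard library. This is a sketch, not a complete parser: the function name read_package_relationships is hypothetical, and it assumes the usual package-relationships namespace shown in the code.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of package relationship parts (assumed to be the standard one).
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def read_package_relationships(docx):
    """Return a list of (relationship type, target path) from _rels/.rels."""
    with zipfile.ZipFile(docx, "r") as archive:
        root = ET.fromstring(archive.read("_rels/.rels"))
    return [
        (rel.get("Type"), rel.get("Target"))
        for rel in root.iter(RELS_NS + "Relationship")
    ]
```

Each returned target is a path inside the archive, for example the main document or one of the metadata files.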
+
=== ELF ===
 +
* [http://robinhoksbergen.com/papers/howto_elf.html Manually Creating an ELF Executable], by Robin Hoksbergen
  
Metadata is usually stored in the folder docProps. That folder holds two or more XML files, including app.xml, which stores metadata from the Word application itself, and core.xml, which stores metadata about the document, such as the author name and the time it was last printed.
+
=== MZ, PE/COFF ===
 +
* [http://en.wikipedia.org/wiki/Portable_Executable Wikipedia: Portable Executable]
 +
* [http://msdn.microsoft.com/en-us/windows/hardware/gg463119.aspx Microsoft PE and COFF Specification]
 +
* [http://msdn.microsoft.com/en-us/magazine/ms809762.aspx Peering Inside the PE: A Tour of the Win32 Portable Executable File Format], by Matt Pietrek, March 1994
 +
* [http://www.microsoft.com/msj/0797/hood0797.aspx Under the Hood], by Matt Pietrek, July 1997
 +
* [http://msdn.microsoft.com/en-us/magazine/cc301805.aspx An In-Depth Look into the Win32 Portable Executable File Format], by Matt Pietrek, February 2002
 +
* [https://googledrive.com/host/0B3fBvzttpiiSd1dKQVU0WGVESlU/Executable%20(EXE)%20file%20format.pdf MZ, PE-COFF executable file format (EXE)], by the [[libexe|libexe project]], October 2011
 +
* [http://seclists.org/fulldisclosure/2013/Oct/157 The Internal of Reloc .text], Full Disclosure Mailing list, October 21, 2013
  
Another folder contains the actual content of the document; in a Word (.docx) document this folder is named word. An XML file called document.xml is the main document and contains most of the document's content.
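
The following sketch pulls the visible paragraph text out of word/document.xml. It assumes the WordprocessingML namespace shown in the code and the usual layout of paragraph (w:p) and text (w:t) elements; the function name extract_paragraphs is hypothetical. Paragraphs nested inside other paragraphs (for example in text boxes) would be reported more than once, so treat the result as a quick preview rather than a faithful rendering.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (assumed; used by w:p and w:t elements).
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_paragraphs(docx):
    """Return the text of each paragraph found in word/document.xml."""
    with zipfile.ZipFile(docx, "r") as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Join all text runs that belong to this paragraph.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        paragraphs.append(text)
    return paragraphs
```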
+
=== DBG, PDB ===
 +
* [http://en.wikipedia.org/wiki/Program_database Wikipedia: Program database]
 +
* [http://www.debuginfo.com/articles/debuginfomatch.html Matching Debug Information], by debuginfo.com
 +
* [http://support.microsoft.com/kb/121366 Description of the .PDB files and of the .DBG files], by [[Microsoft]]
 +
* [http://msdn.microsoft.com/en-us/library/ff553493(v=vs.85).aspx Public and Private Symbols], by [[Microsoft]]
 +
* [http://msdn.microsoft.com/en-us/library/windows/desktop/ms679293(v=vs.85).aspx DbgHelp Structures], by [[Microsoft]]
 +
* [http://web.archive.org/web/20070915060650/http://www.x86.org/ftp/manuals/tools/sym.pdf Internet Archive: Microsoft Symbol and Type Information], by [[Microsoft]]
 +
* [http://pierrelib.pagesperso-orange.fr/exec_formats/MS_Symbol_Type_v1.0.pdf Microsoft Symbol and Type Information]
 +
* [https://code.google.com/p/pdbparse/wiki/StreamDescriptions Stream Descriptions], [https://code.google.com/p/pdbparse/ pdbparse project]
 +
* [http://sourceforge.net/p/mingw-w64/code/HEAD/tree/experimental/tools/libmsdebug/ libmsdebug], by the [[MinGW|MinGW project]]
 +
* [http://moyix.blogspot.com/2007/10/types-stream.html The Types Stream], by [[Brendan Dolan-Gavitt]], October 4, 2007
  
= Relationship to OOXML =
+
=== Minidump ===
 +
* [http://msdn.microsoft.com/en-us/library/windows/desktop/ms680378(v=vs.85).aspx MSDN: MINIDUMP_HEADER structure]
 +
* [https://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/common/minidump_format.h minidump_format.h], by [[Google]], 2006
 +
* [http://moyix.blogspot.com/2008/05/parsing-windows-minidumps.html Parsing Windows Minidumps], by [[Brendan Dolan-Gavitt]], May 7, 2008
 +
* [http://web.archive.org/web/20110814041817/http://www.stackhash.com/blog/post/Format-of-a-minidump-(mdmp)-file.aspx Format of a minidump (mdmp) file], Internet Archive: StackHash blog, May 16, 2011
  
Office Open XML (OOXML) is an open XML standard developed by Microsoft for word processing documents, spreadsheets, presentations and charts. The OOXML standard was submitted to the ISO for approval. After the standard was initially rejected over technical concerns, the ISO approved a modified version as ISO/IEC 29500:2008. Microsoft intended to use the OOXML standard for its Office suite. However, Office does not support the standard that the ISO approved; it only supports the version that was originally rejected by the ISO[http://arstechnica.com/microsoft/news/2010/04/iso-ooxml-convener-microsofts-format-heading-for-failure.ars]. As of Office 2010, Microsoft had still not brought its software into compliance with the standard.
+
=== Mach-O ===
 +
* [http://en.wikipedia.org/wiki/Mach-O Wikipedia: Mach-O]
  
For most purposes OOXML may be considered a subset of DOCX (DOCX contains additional features, like OLE serialization).
+
=== Packers ===
 +
* [http://www.woodmann.com/crackz/Packers.htm Packers & Unpackers]
  
Documentation on OOXML may provide a guide to analysing a DOCX file.
+
== Tools ==
  
= Metadata =
+
=== MZ, PE/COFF ===
 +
* [https://code.google.com/p/pefile/ pefile], multi-platform Python module to read and work with Portable Executable (aka PE) files
  
== Core (Document) Properties ==
+
=== PDB ===
<pre>
+
* [https://code.google.com/p/pdbparse/ pdbparse], Open-source parser for Microsoft debug symbols (PDB files)
docProps/core.xml
+
</pre>
+
  
<pre>
+
=== Minidump ===
&lt;?xml version="1.0" encoding="UTF-8" standalone="yes"?&gt;
+
* [http://support.microsoft.com/kb/315271 Dumpchk.exe], by [[Microsoft]]
&lt;cp:coreProperties
+
* [http://amnesia.gtisc.gatech.edu/~moyix/minidump.py minidump.py], by [[Brendan Dolan-Gavitt]]
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
+
    xmlns:dc="http://purl.org/dc/elements/1.1/"
+
    xmlns:dcterms="http://purl.org/dc/terms/"
+
    xmlns:dcmitype="http://purl.org/dc/dcmitype/"
+
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"&gt;
+
&lt;dc:creator&gt;User 1&lt;/dc:creator&gt;
+
&lt;cp:lastModifiedBy&gt;User 2&lt;/cp:lastModifiedBy&gt;
+
&lt;cp:revision&gt;3&lt;/cp:revision&gt;
+
&lt;dcterms:created xsi:type="dcterms:W3CDTF"&gt;2012-11-07T23:29:00Z&lt;/dcterms:created&gt;
+
&lt;dcterms:modified xsi:type="dcterms:W3CDTF"&gt;2013-08-25T22:18:00Z&lt;/dcterms:modified&gt;
+
&lt;/cp:coreProperties&gt;
+
</pre>
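
A short sketch of reading the fields shown in the docProps/core.xml sample above. The namespace URIs are copied from that sample; the function name read_core_properties and the choice of fields are assumptions made for illustration, and properties that are absent from a given file are simply skipped.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace prefixes as declared in the core.xml sample.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Output label -> element to look up (an illustrative selection, not exhaustive).
FIELDS = {
    "creator": "dc:creator",
    "lastModifiedBy": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(docx):
    """Return a dict of selected core properties from docProps/core.xml."""
    with zipfile.ZipFile(docx, "r") as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    props = {}
    for label, path in FIELDS.items():
        node = root.find(path, NS)
        if node is not None:
            props[label] = node.text
    return props
```

The timestamps are returned as the W3CDTF strings stored in the file; converting them to another time zone is left to the examiner.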
+
 
+
= External Links =
+
 
+
* [http://msdn.microsoft.com/en-us/library/aa338205.aspx Introducing the Office (2007) Open XML File Formats], by [[Microsoft]], May 2006
+
* [http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=elements# DCMI Metadata Terms]
+
* [http://www.simson.net/clips/academic/2009.IEEE.DOCX.pdf The new XML Office Document Files: Implications For Forensics], [[Simson L. Garfinkel]] and James Migletz
+
* [http://blog.kiddaland.net/2009/06/office-2007-metadata/ Perl script that displays metadata information that is extracted from an OpenXML document], by [[Kristinn Gudjonsson]], June 2009
+
* [http://blog.kiddaland.net/2009/07/antiword-for-office-2007/ Perl script that displays the content of a Docx document, similar to Antiword], by [[Kristinn Gudjonsson]], July 2009
+
* [http://computer-forensics.sans.org/blog/2009/07/10/office-2007-metadata/ Office 2007 Metadata], by [[Kristinn Gudjonsson]], July 10, 2009
+
 
+
[[Category:File Formats]]
+

Revision as of 13:11, 18 May 2014


An executable file is used to perform tasks according to encoded instructions. Executable files are sometimes also referred to as binaries, although binaries are technically a subclass of executable files.

There are multiple families of executable files:

  • Scripts; e.g. shell scripts, batch scripts (.bat)
  • DOS and Windows executable files (.exe), which come in various formats such as MZ, PE/COFF and NE
  • ELF
  • Mach-O
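
The families listed above can often be told apart by their leading bytes. The sketch below is a simplified illustration, not a full identification tool: the function name classify_executable is hypothetical, batch scripts have no signature and are not detected, and fat (universal) Mach-O files are deliberately left out because their magic value collides with Java class files.

```python
import struct

# Mach-O magic numbers as they appear on disk, in both byte orders.
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",  # 32-bit
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",  # 64-bit
}


def classify_executable(data):
    """Guess the executable family from the first bytes of a file."""
    if data[:2] == b"#!":
        return "script"
    if data[:4] == b"\x7fELF":
        return "ELF"
    if data[:4] in MACHO_MAGICS:
        return "Mach-O"
    if data[:2] == b"MZ":
        # The MZ header stores the offset of the extended header at 0x3C.
        if len(data) >= 0x40:
            (offset,) = struct.unpack_from("<I", data, 0x3C)
            if data[offset:offset + 4] == b"PE\x00\x00":
                return "PE/COFF"
            if data[offset:offset + 2] == b"NE":
                return "NE"
        return "MZ"
    return "unknown"
```

In practice the function would be fed the first few kilobytes of a file, e.g. classify_executable(open(path, "rb").read(4096)), where path is supplied by the examiner.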

External Links

ELF

MZ, PE/COFF

DBG, PDB

Minidump

Mach-O

Packers

Tools

MZ, PE/COFF

  • pefile, multi-platform Python module to read and work with Portable Executable (aka PE) files

PDB

  • pdbparse, Open-source parser for Microsoft debug symbols (PDB files)

Minidump