Difference between pages "Word Document (DOC)" and "DFXML"

From Forensics Wiki
(Difference between pages)
Redirect page
(File Header)
 
m (Redirected page to Category:Digital Forensics XML)
 
Line 1: Line 1:
The '''DOC file format''' ('''document file format''') usually has the '''.doc''' extension. Most such files are [[Microsoft]] [[Word]] documents, although other word-processing software can also open them (including [[WordPad]], [[WordPerfect]], [[OpenOffice]] and others).
+
#REDIRECT [[Category:Digital Forensics XML]]
 
+
The DOC file format should not be confused with [[DOCX]].
+
 
+
== MIME types ==
+
 
+
The following [[MIME types]] apply to this [[file format]]:
+
 
+
* application/msword
+
* application/doc
+
* appl/text
+
* application/vnd.msword
+
* application/vnd.ms-word
+
* application/winword
+
* application/word
+
* application/x-msw6
+
* application/x-msword
+
* zz-application/zz-winassoc-doc
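
As a rough illustration, the MIME types listed above could be used to flag DOC content by its declared type. This is a minimal sketch: the names <code>DOC_MIME_TYPES</code> and <code>is_doc_mime</code> are hypothetical, and the set is copied from the list above without further validation.

```python
# MIME types copied from the list above (not independently validated).
DOC_MIME_TYPES = frozenset({
    "application/msword",
    "application/doc",
    "appl/text",
    "application/vnd.msword",
    "application/vnd.ms-word",
    "application/winword",
    "application/word",
    "application/x-msw6",
    "application/x-msword",
    "zz-application/zz-winassoc-doc",
})


def is_doc_mime(mime_type: str) -> bool:
    """Return True if mime_type matches one of the listed DOC MIME types."""
    # Drop parameters such as "; charset=..." and normalise case before lookup.
    return mime_type.split(";", 1)[0].strip().lower() in DOC_MIME_TYPES
```

A declared MIME type is only a hint; the file content should still be checked against the file signature.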
+
 
+
== File signature ==
+
 
+
[[Microsoft Word]] documents of versions 97-2003 are stored in the [[OLE Compound File]] (OLECF) format. These files therefore begin with the OLECF file signature.
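
The OLECF signature is the eight-byte sequence D0 CF 11 E0 A1 B1 1A E1 at the start of the file. The following sketch checks for it; the function names are hypothetical. A match only shows that the file is an OLECF container, not that it holds a Word document.

```python
# Eight-byte signature found at offset 0 of an OLE Compound File.
OLECF_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def has_olecf_signature(data: bytes) -> bool:
    """Return True if the buffer starts with the OLECF signature."""
    return data[:8] == OLECF_SIGNATURE


def file_has_olecf_signature(path: str) -> bool:
    """Read the first eight bytes of a file and compare them to the signature."""
    with open(path, "rb") as handle:
        return has_olecf_signature(handle.read(8))
```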
+
 
+
In an OLECF that holds a Word document, the object stream contains the string "Word.Document" followed by a version number.
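
A quick way to look for this marker is to scan the raw bytes of the file. The sketch below assumes the marker is stored as ASCII and that the version is a dot followed by digits; both are assumptions, and the function name is hypothetical. Because it does not parse the OLECF structure, it can miss a marker that is split across sectors.

```python
import re
from typing import Optional

# Assumption: the marker is ASCII "Word.Document" followed by ".<digits>".
WORD_MARKER = re.compile(rb"Word\.Document\.(\d+)")


def find_word_marker(data: bytes) -> Optional[str]:
    """Return the first "Word.Document.<version>" marker found, or None."""
    match = WORD_MARKER.search(data)
    return match.group(0).decode("ascii") if match else None
```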
+
 
+
== Word 97-2003 documents ==
+
 
+
The Word Binary File format is stored in the OLECF using multiple streams:
+
* WordDocument stream
+
* Table stream (0Table, 1Table)
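
Which of the two table streams is in use is recorded in the header at the start of the WordDocument stream. The sketch below reads that flag from the raw stream bytes. The offsets and bit mask reflect one reading of the Microsoft binary format specification (magic number 0xA5EC at offset 0, a 16-bit flag field at offset 10, table stream selector in bit 9) and should be verified against the specification before use. Extracting the stream from the OLECF is assumed to have been done already, and the function name is hypothetical.

```python
import struct

WORD_MAGIC = 0xA5EC        # assumed value of the identifier at offset 0
F_WHICH_TBL_STM = 0x0200   # assumed mask: bit 9 of the flag field at offset 10


def table_stream_name(word_document: bytes) -> str:
    """Return "0Table" or "1Table" based on the WordDocument stream header."""
    if len(word_document) < 12:
        raise ValueError("WordDocument stream is too short")
    (identifier,) = struct.unpack_from("<H", word_document, 0)
    if identifier != WORD_MAGIC:
        raise ValueError("unexpected identifier 0x%04X" % identifier)
    (flags,) = struct.unpack_from("<H", word_document, 10)
    return "1Table" if flags & F_WHICH_TBL_STM else "0Table"
```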
+
 
+
== Encryption ==
+
 
+
Word 97 and Word 2000 encrypt documents with a very weak algorithm. Several products can break this password scheme easily, and the contents can be decrypted without ever discovering the password by testing all 1,099,511,627,776 (2<sup>40</sup>) possible keys. Ultimate Zip Cracker by VDGSoftware is one utility that can perform this decryption.
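
The key count quoted above equals 2<sup>40</sup>, which suggests a 40-bit effective key. The sketch below does the arithmetic and estimates how long an exhaustive search would take at a given test rate. The rate is a parameter to be supplied by the reader, not a measured figure, and the function name is hypothetical.

```python
KEY_BITS = 40               # inferred from the key count quoted in the text
KEYSPACE = 2 ** KEY_BITS    # 1,099,511,627,776 possible keys


def exhaustive_search_hours(keys_per_second: float) -> float:
    """Worst-case hours needed to test every key at the given rate."""
    return KEYSPACE / keys_per_second / 3600


if __name__ == "__main__":
    # Hypothetical rate of one million keys per second, for illustration only.
    print("%.1f hours" % exhaustive_search_hours(1_000_000))
```

On average a search finds the key after testing half of the key space, so the expected time is half the worst case.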
+
== See Also ==
+
 
+
[http://download.microsoft.com/download/0/B/E/0BE8BDD7-E5E8-422A-ABFD-4342ED7AD886/Word97-2007BinaryFileFormat(doc)Specification.pdf Word 97-2007 Binary File Format by Microsoft]
+
 
+
== Extracting Strings ==
+
 
+
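On a Unix-like machine, the following command extracts strings from a .doc file. It deletes NUL bytes first so that text stored as 16-bit characters becomes visible to <code>strings</code>: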
+
 
+
<code>
+
tr -d '\0' < /tmp/test.doc | strings | more
+
</code>
+
 
+
(where /tmp/test.doc is the path to your .doc file)
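
Where those tools are not available, roughly the same result can be obtained in Python. This is a minimal sketch that mirrors the pipeline above: it deletes NUL bytes and then collects runs of printable ASCII. The function name and the default minimum length of four characters are assumptions. Like the shell command, it can garble text that is not plain Latin script.

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII found after deleting NUL bytes."""
    # Deleting NUL bytes collapses 16-bit Latin text into plain ASCII.
    stripped = data.replace(b"\x00", b"")
    # Printable ASCII (space through tilde) plus tab, at least min_len long.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(stripped)]


def extract_strings_from_file(path: str, min_len: int = 4) -> List[str]:
    """Read a file in binary mode and extract its printable strings."""
    with open(path, "rb") as handle:
        return extract_strings(handle.read(), min_len)
```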
+
 
+
[[Category:File Formats]]
+

Latest revision as of 11:57, 19 January 2012