Difference between revisions of "Word Document (DOC)"

From ForensicsWiki
(Adding simple command to extract strings from DOC file. Adding distinction between DOC and DOCX.)
Line 1: Line 1:
 
The '''DOC file format''' ('''document file format''') usually has the '''.doc''' extension. Mostly these documents belong to [[Microsoft]] [[Word]] software files. However, other text editing software can be used to display these files (including [[WordPad]], [[WordPerfect]], [[OpenOffice]] and others).
 
 +
 +
The DOC file format should not be confused with [[DOCX]].
  
 
== MIME types ==
 
Line 25: Line 27:
 
== See Also==
 
 
[[Media:Compdocfileformat.pdf|Microsoft Compound Document File Format]]
 
 +
 +
== Extracting Strings ==
 +
 +
On a unix-like machine try this command to extract strings from a .doc file:
 +
 +
<code>
 +
cat /tmp/test.doc | tr -d \\0  | strings | more
 +
</code>
 +
 +
(where /tmp/test.doc is the path to your .doc file)
  
 
[[Category:File Formats]]
 

Revision as of 17:44, 28 October 2008

The DOC file format (document file format) usually has the .doc extension. Most files with this extension are Microsoft Word documents. However, other word-processing software can also open them, including WordPad, WordPerfect, OpenOffice and others.

The DOC file format should not be confused with DOCX, the ZIP-based Office Open XML format that Word has used by default since Word 2007.

MIME types

The following MIME types apply to this file format:

  • application/msword
  • application/doc
  • appl/text
  • application/vnd.msword
  • application/vnd.ms-word
  • application/winword
  • application/word
  • application/x-msw6
  • application/x-msword
  • zz-application/zz-winassoc-doc

File Header

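MS Word documents begin with the 8-byte hex signature 0xd0cf11e0a1b11ae1. This is the compound document header, so other compound document files (for example XLS and PPT) begin with the same bytes. Word 97-2003 documents also typically contain the string "Word.Document.8" near the end of the file.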
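
The signature check described above can be sketched in shell. This is a minimal sketch, not a definitive test: the function name is_compound_doc is hypothetical, and a match only shows that the file is a compound document container, not that it is specifically a Word document.

```shell
# Hypothetical helper: succeed if the file named by $1 starts with the
# 8-byte compound document signature d0 cf 11 e0 a1 b1 1a e1.
is_compound_doc() {
    # Dump the first 8 bytes as hex, then strip spaces and newlines.
    sig=$(od -An -tx1 -N 8 "$1" | tr -d ' \n')
    [ "$sig" = "d0cf11e0a1b11ae1" ]
}

# Example use (the path is an assumption):
#   is_compound_doc /tmp/test.doc && echo "compound document signature found"
```

The function returns a shell exit status, so it can be used directly in an if statement or combined with && and ||.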

Encryption

Word 97 and Word 2000 encrypt documents with a very weak algorithm. Several products can break this password scheme easily, and the contents can be decrypted without ever recovering the password by testing all 1,099,511,627,776 (2^40) possible keys. Ultimate Zip Cracker by VDGSoftware is one utility that can perform this decryption.
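
The key count quoted above equals 2^40, which is why an exhaustive search is practical. As a rough sketch, the shell function below estimates how many whole days such a search would take. The function name keyspace_days is hypothetical, and the test rate passed to it is an assumed figure, not a measurement of any particular tool.

```shell
# Hypothetical helper: whole days needed to try every 40-bit key.
# $1 is an ASSUMED test rate in keys per second, not a measured value.
keyspace_days() {
    keys=$((1 << 40))              # 1,099,511,627,776 possible keys
    echo $(( keys / $1 / 86400 ))  # 86400 seconds per day, rounded down
}

# Example use with an assumed rate of one million keys per second:
#   keyspace_days 1000000
```

The result is a worst-case figure, since on average the correct key is found after searching about half of the key space.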

See Also

Microsoft Compound Document File Format

Extracting Strings

On a Unix-like machine, try this command to extract strings from a .doc file. It deletes NUL bytes first so that strings can recognize text stored as 16-bit characters:

tr -d '\0' < /tmp/test.doc | strings | more

(where /tmp/test.doc is the path to your .doc file)