Difference between pages "Word Document (DOC)" and "Windows SuperFetch Format"

From ForensicsWiki
(Difference between pages)
m (DOC moved to Word (DOC): There are multiple files with the doc extension)
 
(File header)
 
Line 1: Line 1:
The '''DOC file format''' ('''document file format''') usually has the '''.doc''' extension. Most of these documents are [[Microsoft]] [[Word]] files. However, other text-editing software can also open them (including [[WordPad]], [[WordPerfect]], [[OpenOffice]] and others).
+
{{expand}}
  
The DOC file format should not be confused with [[DOCX]].
+
== MEMO file ==
 +
Some of the <tt>Ag*.db</tt> files are MEMO files.
  
== MIME types ==
+
The MEMO file consists of:
 +
* file header
 +
* compressed blocks
  
The following [[MIME types]] apply to this [[file format]]:
+
=== File header ===
 +
The file header is 84 bytes in size and consists of:
 +
{| class="wikitable"
 +
|-
 +
! Offset
 +
! Size
 +
! Value
 +
! Description
 +
|-
 +
| 0
 +
| 4
 +
| 0x304D454D ("MEM0") or 0x4F4D454D ("MEMO")
 +
| Signature
 +
|-
 +
| 4
 +
| 4
 +
|
 +
| Uncompressed (total) data size
 +
|-
 +
|}
  
* application/msword
+
=== Compressed blocks ===
* application/doc
+
The file header is followed by compressed blocks:
* appl/text
+
{| class="wikitable"
* application/vnd.msword
+
|-
* application/vnd.ms-word
+
! Offset
* application/winword
+
! Size
* application/word
+
! Value
* application/x-msw6
+
! Description
* application/x-msword
+
|-
* zz-application/zz-winassoc-doc
+
| 0
 +
| 4
 +
|
 +
| Compressed data size
 +
|-
 +
| 4
 +
| ...
 +
|
 +
| Compressed data
 +
|-
 +
|}
  
== File Header ==
+
=== Uncompressed data ===
 +
<b>TODO</b>
  
MS Word documents of version 97 (and probably earlier) begin with the file signature (in hexadecimal) d0cf11e0a1b11ae1.
+
== TRX file ==
This signature identifies the file as an OLE Compound File (also known as a Compound Document File or Compound Binary File).
+
The <tt>Ag*.db.trx</tt> files are TRX files.
  
The OLE Compound File has no distinct footer and can be considered a file containing a FAT-like file system.
+
<b>Note that the following format specification is incomplete.</b>
  
The Word document format is placed on top of the OLE Compound File.
+
=== File header ===
 +
The file header is 84 bytes in size and consists of:
 +
{| class="wikitable"
 +
|-
 +
! Offset
 +
! Size
 +
! Value
 +
! Description
 +
|-
 +
| 0
 +
| 4
 +
| 1
 +
| Unknown (Version?)
 +
|-
 +
| 4
 +
| 4
 +
|
 +
| Unknown
 +
|-
 +
| 8
 +
| 4
 +
|
 +
| File size
 +
|-
 +
| 12
 +
| 4
 +
|
 +
| Unknown (Record count?)
 +
|-
 +
| 16
 +
| 4
 +
|
 +
| Unknown (Record count?)
 +
|-
 +
| 20
 +
| 4
 +
|
 +
| Unknown (Records offset or file header size)
 +
|-
 +
|}
  
The object stream of a Word document contains the string "Word.Document" followed by a version number.
+
== See Also ==
 +
* [[SuperFetch]]
  
== Encryption ==
+
== External Links ==
 
+
* [http://blog.rewolf.pl/blog/?p=214 Windows SuperFetch file format – partial specification], by ReWolf, October 5, 2011
Word versions 97 and 2000 encrypt documents with a very weak algorithm. Several products can break this password scheme easily, and the contents can be decrypted without recovering the password by testing all 1,099,511,627,776 (2^40) possible keys. Ultimate Zip Cracker by VDGSoftware is one utility that can perform this decryption.
+
== See Also ==
+
[[Media:Compdocfileformat.pdf|Microsoft Compound Document File Format]] (This is actually the OpenOffice specification)
+
 
+
[http://download.microsoft.com/download/0/B/E/0BE8BDD7-E5E8-422A-ABFD-4342ED7AD886/WindowsCompoundBinaryFileFormatSpecification.pdf Compound Binary File Specification by Microsoft]
+
 
+
Be warned: this file contains at least one error. The directory entry name length is a size in bytes, not in characters.
+
 
+
== Extracting Strings ==
+
 
+
On a Unix-like machine, try this command to extract strings from a .doc file:
+
 
+
<code>
+
tr -d '\000' < /tmp/test.doc | strings | more
+
</code>
+
 
+
(where /tmp/test.doc is the path to your .doc file)
+
  
 
[[Category:File Formats]]
 
[[Category:File Formats]]

Revision as of 01:28, 15 April 2014

MEMO file

Some of the Ag*.db files are MEMO files.

The MEMO file consists of:

  • file header
  • compressed blocks

File header

The file header is 84 bytes in size and consists of:

Offset  Size  Value                                        Description
0       4     0x304D454D ("MEM0") or 0x4F4D454D ("MEMO")   Signature
4       4                                                  Uncompressed (total) data size
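
The two documented fields can be read with a few lines of Python. This is a minimal sketch, not a reference parser: the function name `parse_memo_header` is hypothetical, and it assumes both fields are little-endian unsigned 32-bit integers, which is consistent with the signature values in the table but is not stated explicitly on this page.

```python
import struct

# Signature values from the table, interpreted as little-endian 32-bit integers.
MEMO_SIGNATURES = {0x304D454D: "MEM0", 0x4F4D454D: "MEMO"}


def parse_memo_header(data: bytes) -> dict:
    """Parse the two documented fields at the start of a MEMO file."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes for the documented fields")
    # Assumption: both fields are little-endian unsigned 32-bit integers.
    signature, uncompressed_size = struct.unpack_from("<II", data, 0)
    if signature not in MEMO_SIGNATURES:
        raise ValueError(f"unexpected signature 0x{signature:08X}")
    return {
        "signature": MEMO_SIGNATURES[signature],
        "uncompressed_size": uncompressed_size,
    }
```

A caller would typically pass the first bytes of an Ag*.db file and treat a `ValueError` as "not a MEMO file".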

Compressed blocks

The file header is followed by compressed blocks:

Offset  Size  Value  Description
0       4            Compressed data size
4       ...          Compressed data
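
The block layout above suggests a simple loop that reads a size and then that many bytes. The sketch below only splits the file into raw, still compressed blocks; it does not decompress them. The function name `iter_compressed_blocks` is hypothetical, the caller supplies the offset at which the first block starts, and the code assumes the size field is a little-endian unsigned 32-bit integer that counts only the data following it.

```python
import struct
from typing import Iterator


def iter_compressed_blocks(data: bytes, offset: int) -> Iterator[bytes]:
    """Yield the raw (still compressed) payload of each block.

    `offset` is where the first block starts, i.e. the end of the file header.
    """
    while offset + 4 <= len(data):
        # Assumption: the size is a little-endian unsigned 32-bit integer
        # that counts only the payload, not the 4-byte size field itself.
        (compressed_size,) = struct.unpack_from("<I", data, offset)
        offset += 4
        if compressed_size == 0 or offset + compressed_size > len(data):
            break  # empty or truncated block: stop rather than guess
        yield data[offset:offset + compressed_size]
        offset += compressed_size
```

Stopping on an empty or truncated block is a deliberate choice here: it keeps the loop from running off the end of a damaged file.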

Uncompressed data

TODO

TRX file

The Ag*.db.trx files are TRX files.

Note that the following format specification is incomplete.

File header

The file header is 84 bytes in size and consists of:

Offset  Size  Value  Description
0       4     1      Unknown (Version?)
4       4            Unknown
8       4            File size
12      4            Unknown (Record count?)
16      4            Unknown (Record count?)
20      4            Unknown (Records offset or file header size)
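
The six documented fields can be unpacked in one call, as in the sketch below. Because the specification is incomplete, the field names (`unknown1` and so on) and the function name `parse_trx_header` are placeholders, and the little-endian unsigned 32-bit interpretation is an assumption.

```python
import struct
from collections import namedtuple

# Placeholder names: only file_size has a confirmed meaning in the table.
TrxHeader = namedtuple(
    "TrxHeader",
    "unknown1 unknown2 file_size unknown3 unknown4 unknown5",
)


def parse_trx_header(data: bytes) -> TrxHeader:
    """Unpack the six documented 32-bit fields at the start of a TRX file."""
    if len(data) < 24:
        raise ValueError("need at least 24 bytes for the documented fields")
    # Assumption: all fields are little-endian unsigned 32-bit integers.
    return TrxHeader._make(struct.unpack_from("<6I", data, 0))
```

Comparing `file_size` with the actual length of the file is one plausible sanity check before trusting the remaining values.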

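See Also

  • SuperFetch

External Links

  • Windows SuperFetch file format – partial specification (http://blog.rewolf.pl/blog/?p=214), by ReWolf, October 5, 2011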