Difference between pages "List of Cyberspeak Podcast Interviews" and "Internet Explorer History File Format"

From ForensicsWiki
(Difference between pages)
m (Fixed typo)
 
 
Line 1: Line 1:
The [[Cyberspeak podcast]] usually features at least one interview per show. The guests on each show are listed below.
+
{{Expand}}
 +
[[Internet Explorer]] as of version 4 up to version 9 stores the web browsing history in files named <tt>index.dat</tt>. The files contain multiple records.
 +
MSIE version 3 probably uses similar records in its History (Cache) files.
  
=== 2005 ===
+
== File Locations ==
  
* 18 Dec 2005: [[Nick Harbour]], author of [[Dcfldd|dcfldd]]
+
Internet Explorer history files keep a record of URLs that the browser has visited, cookies that were created by these sites, and any temporary internet files that were downloaded by the site visit.  As a result, Internet Explorer history files are kept in several locations.  Regardless of the information stored in the file, the file is named index.dat.
* 31 Dec 2005: [[Jesse Kornblum]], author of [[foremost]] and [[md5deep]]
+
  
=== 2006 ===
+
On Windows 95/98 these files were located in the following locations:
 +
<pre>
 +
%systemdir%\Temporary Internet Files\Content.ie5
 +
%systemdir%\Cookies
 +
%systemdir%\History\History.ie5
 +
</pre>
  
* 7 Jan 2006: [[Drew Fahey]], author of [[Helix]]
+
On Windows 2000/XP the file locations have changed:
* 18 Jan 2006: [[Simple Nomad]]
+
<pre>
* 21 Jan 2006: [[Johnny Long]]
+
%systemdir%\Documents and Settings\%username%\Local Settings\Temporary Internet Files\Content.ie5
* 28 Jan 2006: [[Kevin Mandia]]
+
%systemdir%\Documents and Settings\%username%\Cookies
 +
%systemdir%\Documents and Settings\%username%\Local Settings\History\history.ie5
 +
</pre>
  
 +
Internet Explorer also keeps daily, weekly, and monthly history logs that will be located in subfolders of %systemdir%\Documents and Settings\%username%\Local Settings\History\history.ie5.  The folders will be named <tt>MSHist<two-digit number><starting four-digit year><starting two-digit month><starting two-digit day><ending four-digit year><ending two-digit month><ending two-digit day></tt>.  For example, the folder containing data from March 26, 2008 to March 27, 2008 might be named <tt>MSHist012008032620080327</tt>.
  
* 4 Feb 2006: [[Brian Carrier]]
+
Note that not every file named index.dat is an MSIE History (Cache) file.
* 11 Feb 2006: [[Jesse Kornblum]]
+
* 18 Feb 2006: [[Bruce Potter]] of the Shmoo Group
+
* 25 Feb 2006: [[Kris Kendall]] speaks about malware analysis
+
  
 +
== File Header ==
 +
Every version of Internet Explorer since Internet Explorer 5 has used the same structure for the file header and the individual records.  Internet Explorer history files begin with:
 +
43 6c 69 65 6e 74 20 55 72 6c 43 61 63 68 65 20 4d 4d 46 20 56 65 72 20 35 2e 32
 +
This represents the ASCII string "Client UrlCache MMF Ver 5.2".
  
* 4 Mar 2006: [[Dave Merkel]]
+
The next field in the file header starts at byte offset 28 and is a four-byte representation of the file size.  The number is stored in [[endianness | little-endian]] format, so the bytes must be reversed to calculate the value.
* 11 Mar 2006: [[James Wiebe]] of [[Wiebe Tech]]. Also [[Todd Bellows]] of [[LogiCube]] about [[CellDek]]
+
* 18 Mar 2006: [[Kris Kendall]]
+
* 25 Mar 2006: (No interview)
+
  
 +
Also of interest in the file header is the location of the cache directories.  In the URL records the cache directories are given as a number, with one representing the first cache directory, two representing the second and so on.  The names of the cache directories are kept at byte offset 64 in the file.  Each directory entry is 12 bytes long of which the first eight bytes contain the directory name.
  
* 1 Apr 2006: [[Harlan Carvey]], creator of the [[Forensic Server Project]]
+
== Allocation bitmap ==
* 8 Apr 2006: (No interview)
+
The IE History File contains an allocation bitmap starting from offset 0x250 to 0x4000.
* 15 Apr 2006: (No interview), but first to mention the [[Main_Page|Forensics Wiki]]!
+
* 22 Apr 2006: [[Jaime Florence]] about [[Mercury]], a text indexing product
+
  
 +
== Record Formats ==
  
* 6 May 2006: [[Mark Rache]] and [[Dave Merkel]]
+
Every record has a similar header that consists of 8 bytes.
* 13 May 2006: [[Steve Bunting]]
+
* 21 May 2006: [[Mike Younger]]
+
* 29 May 2006: [[Mike Younger]]
+
  
 +
<pre>typedef struct _RECORD_HEADER {
 +
  /* 000 */ char        Signature[4];
 +
  /* 004 */ uint32_t    NumberOfBlocksInRecord;
 +
} RECORD_HEADER;</pre>
  
* 3 Jun 2006: [[Jesse Kornblum]] about [[Windows Memory Analysis]]
+
The size of the record can be determined from the number of blocks in the record; by default the block size is 128 bytes. Therefore, a block count of <pre>05 00 00 00</pre> would indicate five blocks (the number is stored in little-endian format) of 128 bytes each, for a total record length of 640 bytes. Note that even for allocated records the block count cannot be fully relied upon.
* 10 Jun 2006: (No interview)
+
* 17 Jun 2006: [[Mike Younger]]
+
* 24 Jun 2006: (No interview)
+
  
 +
The blocks that make up a record can have slack space.
  
* 1 Jul 2006: (No interview)
+
Currently 4 types of records are known:
* 9 Jul 2006: [[Johnny Long]]
+
* URL
* 18 Jul 2006: [[Dark Tangent]]
+
* REDR
* 30 Jul 2006: [[Jesse Kornblum]] about [[Ssdeep|ssdeep]] and [[Context Triggered Piecewise Hashing|Fuzzy Hashing]]
+
* HASH
 +
* LEAK
  
 +
Note that the location and filename strings are stored in the local codepage. Normally these strings use only the ASCII character set, but Chinese versions of Windows are known to use extended characters as well.
  
* 10 Aug 2006: [[Brian Contos]] discusses his book ''Insider Threat: Enemy at the Watercooler''
+
=== URL Records ===
* 13 Aug 2006: [[Richard Bejtlich]] discusses his book ''Real Digital Forensics''
+
* 27 Aug 2006: [[David Farquhar]]
+
  
 +
These records indicate URIs that were actually requested. They contain the location and additional data like the web server's HTTP response. They begin with the header, in hexadecimal:
  
* 3 Sep 2006: [[Keith Jones]]
+
<pre>55 52 4C 20</pre>
* 10 Sep 2006: (No Interview)
+
This corresponds to the string <tt>URL</tt> followed by a space.
* 17 Sep 2006: (No Interview)
+
* 24 Sep 2006: (No Interview)
+
  
 +
The definition for the structure in C99 format:
  
* 1 Oct 2006: [[Brian Kaplan]], author of [[LiveView]]
+
<pre>typedef struct _URL_RECORD_HEADER {
* 8 Oct 2006: [[Tom Gallagher]] discusses his book ''Hunting Security Bugs''
+
  /* 000 */ char        Signature[4];
* 15 Oct 2006: (No Interview)
+
  /* 004 */ uint32_t    AmountOfBlocksInRecord;
* 29 Oct 2006: (No Interview)
+
  /* 008 */ FILETIME    LastModified;
 +
  /* 010 */ FILETIME    LastAccessed;
 +
  /* 018 */ FATTIME    Expires;
 +
  /* 01c */
 +
  // Not finished yet
 +
} URL_RECORD_HEADER;</pre>
  
 +
<pre>
 +
typedef struct _FILETIME {
 +
  /* 000 */ uint32_t    lower;
 +
  /* 004 */ uint32_t    upper;
 +
} FILETIME;</pre>
  
* 12 Nov 2006: [[Jesse Kornblum]] discusses his paper ''Exploiting the Rootkit Paradox with Windows Memory Analysis''
+
<pre>
* 19 Nov 2006: [[Kris Kendall]] discusses unpacking binaries when conducting malware analysis
+
typedef struct _FATTIME {
* 26 Nov 2006: (No Interview)
+
  /* 000 */ uint16_t    date;
 +
  /* 002 */ uint16_t    time;
 +
} FATTIME;</pre>
  
 +
The actual interpretation of the "LastModified" and "LastAccessed" fields depends on the type of history file in which the record is contained. As a matter of fact, Internet Explorer uses three different types of history files, namely Daily History, Weekly History, and Main History. Other "index.dat" files are used to store cached copies of visited pages and cookies.
 +
Information on how to interpret the dates in these different files can be found on Capt. Steve Bunting's web page at the University of Delaware Computer Forensics Lab (http://www.stevebunting.org/udpd4n6/forensics/index_dat2.htm).
 +
Be aware that most free and/or open source index.dat parsing programs, as well as quite a few commercial forensic tools, do not interpret these dates correctly: they treat all dates and times as if the records were contained in a Daily History file, regardless of the type of file they are actually stored in.
  
* 3 Dec 2006: [[Brian Dykstra]]
+
=== REDR Records ===
* 10 Dec 2006: [[Mike Younger]]
+
REDR records are simple: they indicate that the browser was redirected to another site.  A REDR record always starts with the string REDR (52 45 44 52 in hexadecimal).  The next four bytes hold the size of the record in little-endian format, expressed as the number of 128-byte blocks.
* 17 Dec 2006: [[Mike Younger]] and [[Geoff Michelli]]
+
  
=== 2007 ===
+
At offset 8 from the start of the REDR record is an unknown data field.  It has been confirmed that this is not a date field.
  
* 7 Jan 2007: [[Jamie Butler]]
+
16 bytes into the REDR record is the URL that was visited in a null-terminated string.  After the URL, the REDR record appears to be padded with zeros until the end of the 128 byte block.
* 17 Jan 2007: [[Chad McMillan]]
+
* 28 Jan 2007: [[Jesse Kornblum]]
+
  
 +
=== HASH Records ===
  
* 11 Feb 2007: [[Scott Moulton]]
+
=== LEAK Records ===
* 18 Feb 2007: [[Phil Zimmerman]], creator of [[PGP]] discussing his new [[Zfone]]
+
The exact purpose of LEAK records remains unknown; however, research performed by Mike Murr suggests that LEAK records are created when the machine attempts to delete records from the history file while a corresponding Temporary Internet File (TIF) is held open and cannot be deleted.
* 25 Feb 2007: [[Mark Menz]] and [[Jeff Moss]]
+
  
 +
== External Links ==
  
* 4 Mar 2007: No show due to technical difficulties
+
* [http://www.cqure.net/wp/iehist/ IEHist program for reading index.dat files]
* 12 Mar 2007: [[Trevor Fairchild]] of Ontario Provincial Police Department discussing [[C4P]] and [[C4M]], both add-ons to [[EnCase]]
+
* [http://www.milincorporated.com/a3_index.dat.html What is in Index.dat files]
* 18 Mar 2007: [[Tony Hogeveen]] of DeepSpar Data Recovery Systems
+
* [http://www.foundstone.com/us/pdf/wp_index_dat.pdf Detailed analysis of index.dat file format]
* 25 Mar 2007: Shmoocon broadcast
+
* [http://downloads.sourceforge.net/sourceforge/libmsiecf/MSIE_Cache_File_format.pdf MSIE Cache File (index.dat) format specification]  
 +
* [http://www.forensicblog.org/2009/09/10/the-meaning-of-leak-records/ The Meaning of LEAK records]
 +
* [http://www.tzworks.net/prototype_page.php?proto_id=6 Windows 'index.dat' Parser] Free tool that can be run on Windows, Linux or Mac OS-X.
  
 
+
[[Category:File Formats]]
* 1 Apr 2007: [[Kevin Smith]] from LTU Technologies about [[Image Seeker]]
+

Revision as of 03:43, 17 May 2012



From version 4 through version 9, Internet Explorer stores web browsing history in files named index.dat. Each file contains multiple records.

MSIE version 3 probably uses similar records in its History (Cache) files.

File Locations

Internet Explorer history files keep a record of URLs that the browser has visited, cookies that were created by these sites, and any temporary internet files that were downloaded by the site visit. As a result, Internet Explorer history files are kept in several locations. Regardless of the information stored in the file, the file is named index.dat.

On Windows 95/98 these files were stored in the following locations:

%systemdir%\Temporary Internet Files\Content.ie5
%systemdir%\Cookies
%systemdir%\History\History.ie5

On Windows 2000/XP the file locations have changed:

%systemdir%\Documents and Settings\%username%\Local Settings\Temporary Internet Files\Content.ie5
%systemdir%\Documents and Settings\%username%\Cookies
%systemdir%\Documents and Settings\%username%\Local Settings\History\history.ie5

Internet Explorer also keeps daily, weekly, and monthly history logs that will be located in subfolders of %systemdir%\Documents and Settings\%username%\Local Settings\History\history.ie5. The folders will be named MSHist<two-digit number><starting four-digit year><starting two-digit month><starting two-digit day><ending four-digit year><ending two-digit month><ending two-digit day>. For example, the folder containing data from March 26, 2008 to March 27, 2008 might be named MSHist012008032620080327.
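
The folder naming scheme described above can be taken apart mechanically. The following is a minimal sketch in C, assuming the name always has exactly the layout given in the text; the mshist_range type and the parse_mshist_name function are hypothetical names introduced only for this illustration, and no calendar validation is attempted.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: the fields mirror the naming scheme in the text. */
typedef struct {
    int index;                                /* the two-digit number */
    int start_year, start_month, start_day;   /* starting date */
    int end_year, end_month, end_day;         /* ending date */
} mshist_range;

/* Returns 1 and fills *out if name looks like MSHist<NN><YYYYMMDD><YYYYMMDD>,
 * otherwise returns 0. Dates are not checked for calendar validity. */
int parse_mshist_name(const char *name, mshist_range *out)
{
    static const char prefix[] = "MSHist";
    const size_t prefix_len = sizeof(prefix) - 1;      /* 6 characters */
    const size_t total_len = prefix_len + 2 + 8 + 8;   /* 24 characters */

    if (strlen(name) != total_len || strncmp(name, prefix, prefix_len) != 0)
        return 0;

    /* Everything after the prefix must be a decimal digit. */
    for (size_t i = prefix_len; i < total_len; i++)
        if (!isdigit((unsigned char)name[i]))
            return 0;

    return sscanf(name + prefix_len, "%2d%4d%2d%2d%4d%2d%2d",
                  &out->index,
                  &out->start_year, &out->start_month, &out->start_day,
                  &out->end_year, &out->end_month, &out->end_day) == 7;
}
```

Applied to the example name MSHist012008032620080327, this yields the number 1 with a starting date of 2008-03-26 and an ending date of 2008-03-27.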

Note that not every file named index.dat is an MSIE History (Cache) file.

File Header

Every version of Internet Explorer since Internet Explorer 5 has used the same structure for the file header and the individual records. Internet Explorer history files begin with:

43 6c 69 65 6e 74 20 55 72 6c 43 61 63 68 65 20 4d 4d 46 20 56 65 72 20 35 2e 32

This represents the ASCII string "Client UrlCache MMF Ver 5.2".
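
Because not every file named index.dat is a history file, a tool might check for this string before parsing anything else. A minimal sketch, assuming the first bytes of the file have already been read into memory; has_urlcache_signature is a hypothetical name, and only the 27 bytes listed above are compared.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf starts with the signature string quoted in the text,
 * otherwise 0. Only the 27 listed bytes are compared. */
int has_urlcache_signature(const unsigned char *buf, size_t len)
{
    static const char sig[] = "Client UrlCache MMF Ver 5.2";
    const size_t sig_len = sizeof(sig) - 1;   /* 27 bytes, terminator excluded */

    return len >= sig_len && memcmp(buf, sig, sig_len) == 0;
}
```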

The next field in the file header starts at byte offset 28 and is a four-byte representation of the file size. The number is stored in little-endian format, so the bytes must be reversed to calculate the value.
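
Reading a little-endian value byte by byte avoids any dependence on the byte order of the machine running the parser. The sketch below assumes the header bytes are already in memory; read_le32, header_file_size and FILE_SIZE_OFFSET are hypothetical names, with the offset taken from the text above.

```c
#include <stddef.h>
#include <stdint.h>

#define FILE_SIZE_OFFSET 28u   /* offset of the file size field, per the text */

/* Assemble a 32-bit value from four little-endian bytes. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Returns 1 and stores the file size field in *size_out, or returns 0 if
 * fewer than 32 header bytes are available. */
int header_file_size(const uint8_t *hdr, size_t len, uint32_t *size_out)
{
    if (len < FILE_SIZE_OFFSET + 4)
        return 0;
    *size_out = read_le32(hdr + FILE_SIZE_OFFSET);
    return 1;
}
```

For instance, the bytes 00 80 00 00 at offset 28 decode to 0x00008000, that is 32768 bytes.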

Also of interest in the file header is the location of the cache directories. In the URL records the cache directories are given as a number, with one representing the first cache directory, two representing the second and so on. The names of the cache directories are kept at byte offset 64 in the file. Each directory entry is 12 bytes long of which the first eight bytes contain the directory name.

Allocation bitmap

The IE History File contains an allocation bitmap that spans offsets 0x250 to 0x4000.

Record Formats

Every record has a similar header that consists of 8 bytes.

typedef struct _RECORD_HEADER {
  /* 000 */ char        Signature[4];
  /* 004 */ uint32_t    NumberOfBlocksInRecord;
} RECORD_HEADER;
The size of the record can be determined from the number of blocks in the record; by default the block size is 128 bytes. Therefore, a block count of
05 00 00 00
would indicate five blocks (the number is stored in little-endian format) of 128 bytes each, for a total record length of 640 bytes. Note that even for allocated records the block count cannot be fully relied upon.
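
The size calculation can be sketched in C as follows. This assumes the default block size of 128 bytes and a record that is already in memory; record_header, parse_record_header and record_size_bytes are hypothetical names. Since the block count cannot be fully relied upon, a real parser would presumably also check the result against the file size before using it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 128u   /* default block size, per the text */

/* In-memory copy of the 8-byte record header. */
typedef struct {
    char     signature[4];   /* not NUL-terminated */
    uint32_t blocks;         /* number of blocks in the record */
} record_header;

/* Decode the header from raw bytes; returns 0 if fewer than 8 are available. */
int parse_record_header(const uint8_t *rec, size_t avail, record_header *out)
{
    if (avail < 8)
        return 0;
    memcpy(out->signature, rec, 4);
    /* The block count is stored in little-endian byte order. */
    out->blocks = (uint32_t)rec[4]
                | (uint32_t)rec[5] << 8
                | (uint32_t)rec[6] << 16
                | (uint32_t)rec[7] << 24;
    return 1;
}

/* Record length in bytes; 64-bit so a corrupt block count cannot overflow. */
uint64_t record_size_bytes(const record_header *h)
{
    return (uint64_t)h->blocks * BLOCK_SIZE;
}
```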

The blocks that make up a record can have slack space.

Currently, four types of records are known:

  • URL
  • REDR
  • HASH
  • LEAK
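
A parser walking the file might dispatch on the four-byte signature of each record. A minimal sketch, assuming the signatures are the four type names as listed, with URL padded by a trailing space to fill four bytes; record_type and classify_record are hypothetical names.

```c
#include <string.h>

typedef enum {
    REC_UNKNOWN = 0,
    REC_URL,
    REC_REDR,
    REC_HASH,
    REC_LEAK
} record_type;

/* Map the first four bytes of a record to one of the known record types. */
record_type classify_record(const unsigned char *sig)
{
    if (memcmp(sig, "URL ", 4) == 0) return REC_URL;   /* note the space */
    if (memcmp(sig, "REDR", 4) == 0) return REC_REDR;
    if (memcmp(sig, "HASH", 4) == 0) return REC_HASH;
    if (memcmp(sig, "LEAK", 4) == 0) return REC_LEAK;
    return REC_UNKNOWN;
}
```

Anything that does not match is reported as unknown rather than rejected, since the list above only covers the record types known so far.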

Note that the location and filename strings are stored in the local codepage. Normally these strings use only the ASCII character set, but Chinese versions of Windows are known to use extended characters as well.

URL Records

These records indicate URIs that were actually requested. They contain the location and additional data like the web server's HTTP response. They begin with the following signature, in hexadecimal:

55 52 4C 20

This corresponds to the string URL followed by a space.

The definition for the structure in C99 format:

#include <stdint.h>

typedef struct _FILETIME {
  /* 000 */ uint32_t    lower;
  /* 004 */ uint32_t    upper;
} FILETIME;

typedef struct _FATTIME {
  /* 000 */ uint16_t    date;
  /* 002 */ uint16_t    time;
} FATTIME;

/* Declared after FILETIME and FATTIME so that both types are known here.
   Offsets in the comments are hexadecimal. */
typedef struct _URL_RECORD_HEADER {
  /* 000 */ char        Signature[4];
  /* 004 */ uint32_t    AmountOfBlocksInRecord;
  /* 008 */ FILETIME    LastModified;
  /* 010 */ FILETIME    LastAccessed;
  /* 018 */ FATTIME     Expires;
  /* 01c */
  // Not finished yet
} URL_RECORD_HEADER;
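
The raw FILETIME and FATTIME values can be decoded with the standard Windows definitions: a FILETIME counts 100-nanosecond intervals since 1601-01-01, and a FAT date and time pack their fields into two 16-bit words. The following is a sketch under those assumptions only; filetime_to_unix, fat_datetime and decode_fattime are hypothetical names, and the code says nothing about what a decoded value means for a particular type of history file.

```c
#include <stdint.h>

/* Seconds between 1601-01-01 and 1970-01-01. */
#define FILETIME_UNIX_EPOCH_DIFF 11644473600LL

/* Convert the two 32-bit halves of a FILETIME to seconds since 1970-01-01.
 * Sub-second precision is discarded. */
int64_t filetime_to_unix(uint32_t lower, uint32_t upper)
{
    uint64_t ticks = ((uint64_t)upper << 32) | lower;   /* 100 ns units */
    return (int64_t)(ticks / 10000000u) - FILETIME_UNIX_EPOCH_DIFF;
}

/* Hypothetical broken-down form of a FAT date and time pair. */
typedef struct {
    int year, month, day;
    int hour, minute, second;
} fat_datetime;

/* Unpack the FAT bit fields: the date holds year-1980 (7 bits), month
 * (4 bits) and day (5 bits); the time holds hours (5 bits), minutes
 * (6 bits) and seconds divided by two (5 bits). */
fat_datetime decode_fattime(uint16_t date, uint16_t time)
{
    fat_datetime d;
    d.year   = 1980 + (date >> 9);
    d.month  = (date >> 5) & 0x0F;
    d.day    = date & 0x1F;
    d.hour   = time >> 11;
    d.minute = (time >> 5) & 0x3F;
    d.second = (time & 0x1F) * 2;
    return d;
}
```

FAT times have a resolution of two seconds, so odd second values cannot be represented.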

The actual interpretation of the "LastModified" and "LastAccessed" fields depends on the type of history file in which the record is contained. Internet Explorer uses three different types of history files: Daily History, Weekly History, and Main History. Other "index.dat" files are used to store cached copies of visited pages and cookies. Information on how to interpret the dates in these different files can be found on Capt. Steve Bunting's web page at the University of Delaware Computer Forensics Lab (http://www.stevebunting.org/udpd4n6/forensics/index_dat2.htm). Be aware that most free and/or open source index.dat parsing programs, as well as quite a few commercial forensic tools, do not interpret these dates correctly: they treat all dates and times as if the records were contained in a Daily History file, regardless of the type of file they are actually stored in.

REDR Records

REDR records are simple: they indicate that the browser was redirected to another site. A REDR record always starts with the string REDR (52 45 44 52 in hexadecimal). The next four bytes hold the size of the record in little-endian format, expressed as the number of 128-byte blocks.

At offset 8 from the start of the REDR record is an unknown data field. It has been confirmed that this is not a date field.

The URL that was visited is stored as a null-terminated string starting 16 bytes into the REDR record. After the URL, the REDR record appears to be padded with zeros until the end of the 128-byte block.
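
Extracting the URL from a REDR record then comes down to a signature check and a bounded search for the terminator. A minimal sketch, assuming the whole record is in memory and its length is known; redr_url and REDR_URL_OFFSET are hypothetical names, with the offset taken from the text above.

```c
#include <stddef.h>
#include <string.h>

#define REDR_URL_OFFSET 16u   /* start of the URL string, per the text */

/* Returns a pointer to the URL inside the record, or NULL if the record is
 * not a REDR record or no terminator is found within len bytes. */
const char *redr_url(const unsigned char *rec, size_t len)
{
    if (len <= REDR_URL_OFFSET || memcmp(rec, "REDR", 4) != 0)
        return NULL;

    /* Refuse to return a string that would run past the end of the record. */
    if (memchr(rec + REDR_URL_OFFSET, '\0', len - REDR_URL_OFFSET) == NULL)
        return NULL;

    return (const char *)(rec + REDR_URL_OFFSET);
}
```

The returned pointer refers to the caller's buffer, so the URL has to be copied if the buffer is reused.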

HASH Records

LEAK Records

The exact purpose of LEAK records remains unknown; however, research performed by Mike Murr suggests that LEAK records are created when the machine attempts to delete records from the history file while a corresponding Temporary Internet File (TIF) is held open and cannot be deleted.

External Links