Difference between pages "Document Metadata Extraction" and "Chrome Disk Cache Format"

From ForensicsWiki
(Difference between pages)
(Office Files: Evidence Center can handle metadata for MS/Open office files)
 
(See Also)
 
Line 1: Line 1:
Here are tools that will extract metadata from document files.
+
{{expand}}
  
=Office Files=
+
== Cache files ==
 +
The cache is stored in multiple files:
 +
{| class="wikitable"
 +
|-
 +
! Filename
 +
! Description
 +
|-
 +
| index
 +
| The index file
 +
|-
 +
| data_#
 +
| Data block files
 +
|-
 +
| f_######
 +
| (Separate) data stream file
 +
|}
  
; [[antiword]]
+
== Cache address ==
: http://www.winfield.demon.nl/
+
The cache address is 4 bytes in size and consists of:  
 +
{| class="wikitable"
 +
|-
 +
! offset
 +
! size
 +
! value
 +
! description
 +
|-
 +
| <i>If file type is 0 (Separate file)</i>
 +
|
 +
|
 +
|
 +
|-
 +
| 0.0
 +
| 28 bits
 +
|
 +
| File number <br> The value represents the value of # in f_######
 +
|-
 +
| <i>Else</i>
 +
|
 +
|
 +
|
 +
|-
 +
| 0.0
 +
| 16 bits
 +
|
 +
| Block number
 +
|-
 +
| 2.0
 +
| 8 bits
 +
|
 +
| File number (or file selector) <br> The value represents the value of # in data_#
 +
|-
 +
| 3.0
 +
| 2 bits
 +
|
 +
| Block size <br> The number of contiguous blocks where 0 represents 1 block and 3 represents 4 blocks.
 +
|-
 +
| 3.2
 +
| 2 bits
 +
|
 +
| Reserved
 +
|-
 +
| <i>Common</i>
 +
|
 +
|
 +
|
 +
|-
 +
| 3.4
 +
| 3 bits
 +
|
 +
| File type
 +
|-
 +
| 3.7
 +
| 1 bit
 +
|
 +
| Initialized flag
 +
|}
  
; [[Belkasoft]] Evidence Center
+
=== File types ===
: http://belkasoft.com/
+
{| class="wikitable"
: Extracts metadata from various [[Microsoft]] Office files (both 97-2003 and 2007-2013 formats), as well as Open Office documents. It can also extract plain text (combining the text from all XLS/XLSX/ODS pages and PPT/PPTX/ODP slides) and embedded objects. Pictures embedded in a document can be displayed directly in its user interface.
+
|-
 +
! Value
 +
! Description
 +
|-
 +
| 0
 +
| (Separate) data stream file
 +
|-
 +
| 1
 +
| (Rankings) block data file (36 byte block data file)
 +
|-
 +
| 2
 +
| 256 byte block data file
 +
|-
 +
| 3
 +
| 1024 byte block data file
 +
|-
 +
| 4
 +
| 4096 byte block data file
 +
|-
 +
|
 +
|
 +
|-
 +
| 6
 +
| Unknown; seen on Mac OS X 0x6f430074
 +
|}
  
; [[catdoc]]
+
==== Examples ====
: http://www.45.free.net/~vitus/software/catdoc/
+
{| class="wikitable"
 +
|-
 +
! Value
 +
! Description
 +
|-
 +
| 0x00000000
 +
| Not initialized
 +
|-
 +
| 0x8000002a
 +
| Data stream file: f_00002a
 +
|-
 +
| 0xa0010003
 +
| Block data file: data_1, block number 3, 1 block of 256 bytes
 +
|}
  
; [[laola]]
+
== Index file format (index) ==
: http://user.cs.tu-berlin.de/~schwartz/pmh/index.html
+
Overview:
 +
* File header
 +
* least recently used (LRU) data (or eviction control data)
 +
* index table
  
; [[word2x]]
+
=== File header ===
: http://word2x.sourceforge.net/
+
*TODO*
  
; [[wvWare]]
+
== Data block file format (data_#) ==
: http://wvware.sourceforge.net/
+
Overview:
: Extracts metadata from various [[Microsoft]] Word files ([[doc]]). Can also convert doc files to other formats such as HTML or plain text.
+
* File header
 +
* array of blocks
  
; [[Outside In]]
+
=== File header ===
: http://www.oracle.com/technology/products/content-management/oit/oit_all.html
+
*TODO*
: Originally developed by Stellant, supports hundreds of file types.
+
  
; [[FI Tools]]
+
== Data stream ==
: http://forensicinnovations.com/
+
See: [[gzip]]
: More than 100 file types.
+
  
=StickyNotes=
+
== See Also ==
; StickyNotes Parser
+
* [[Google Chrome]]
Windows 7 StickyNotes follow the [http://msdn.microsoft.com/en-us/library/dd942138%28v=prot.13%29.aspx MS Compound Document binary format]; the StickyNotes Parser extracts metadata (time stamps) from the OLE format, including the text content (not the RTF contents) of the notes themselves. Sn.exe also extracts the modified time of the Root Entry of the Compound Document; all times are displayed in UTC.
+
* [[gzip]]
:http://code.google.com/p/winforensicaanalysis/downloads/list
+
  
=PDF Files=
+
== External Links ==
  
; [[xpdf]]
+
[[Category:File Formats]]
: http://www.foolabs.com/xpdf/
+
: [[pdfinfo]] (part of the [[xpdf]] package) displays some metadata of [[PDF]] files.
+
 
+
 
+
(See [[PDF]])
+
 
+
=Images=
+
 
+
; [[Exiftool]]
+
: http://www.sno.phy.queensu.ca/~phil/exiftool/
+
: Free, cross-platform tool to extract metadata from many different file formats. Also supports writing metadata.
+
 
+
; [[jhead]]
+
: http://www.sentex.net/~mwandel/jhead/
+
: Displays or modifies [[Exif]] data in [[JPEG]] files.
+
 
+
; [[vinetto]]
+
: http://vinetto.sourceforge.net/
+
: Examines [[Thumbs.db]] files.
+
 
+
;[[libexif]]
+
: http://sourceforge.net/projects/libexif EXIF tag Parsing Library
+
 
+
; [[Adroit Photo Forensics]]
+
: http://digital-assembly.com/products/adroit-photo-forensics/
+
: Displays metadata and uses date and camera metadata for grouping, timelines, etc.
+
 
+
; Exif Viewer
+
: http://araskin.webs.com/exif/exif.html
+
: Add-on for Firefox and Thunderbird that displays various [[JPEG]]/JPG metadata in local and remote images.
+
 
+
; exiftags
+
: http://johnst.org/sw/exiftags/
+
: Open source utility to parse and edit [[exif]] data in [[JPEG]] images. Found in many Debian-based distributions.
+
 
+
; exifprobe
+
: http://www.virtual-cafe.com/~dhh/tools.d/exifprobe.d/exifprobe.html
+
: Open source utility that reads [[exif]] data in [[JPEG]] and some "RAW" image formats. Found in many Debian based distributions.
+
 
+
; Exiv2
+
: http://www.exiv2.org
+
: Open source C++ library and command line tool for reading and writing metadata in various image formats. Found in almost every GNU/Linux distribution
+
 
+
; pngtools
+
: http://www.stillhq.com/pngtools/
+
: Open source suite of commands (pnginfo, pngchunks, pngchunksdesc) that reads metadata found in [[PNG]] files. Found in many Debian based distributions.
+
 
+
; pngmeta
+
: http://sourceforge.net/projects/pmt/files/
+
: Open source command line tool that extracts metadata from [[PNG]] images. Found in many Debian based distributions.
+
 
+
=General=
+
These general-purpose programs frequently work when the special-purpose programs fail, but they generally provide less detailed information.
+
 
+
; [[Metadact-e]]
+
: "Patented server-based metadata cleaning software that previews, cleans and converts documents in Microsoft Outlook, Web Access email, tablets and smartphones, as well as desktop-based documents."
+
: http://www.litera.com/Products/Metadact-e.aspx
+
 
+
; [[Metadata Extraction Tool]]
+
: "Developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats like PDF documents, image files, sound files Microsoft office documents, and many others."
+
: http://meta-extractor.sourceforge.net/
+
 
+
; [[Metadata Assistant]]
+
: http://www.payneconsulting.com/products/metadataent/
+
 
+
; [[hachoir|hachoir-metadata]]
+
: Extraction tool, part of '''[[Hachoir]]''' project
+
 
+
; [[file]]
+
: The UNIX '''file''' program can extract some metadata
+
 
+
; [[GNU libextractor]]
+
: http://gnunet.org/libextractor/ The libextractor library is a pluggable system for extracting metadata.
+
 
+
; [[Directory Lister Pro]]
+
: Directory Lister Pro is a Windows tool that creates listings of files from selected directories on hard disks, CD-ROMs, DVD-ROMs, floppies, USB storage devices and network shares. Listings can be in HTML, text or CSV format (for easy import into Excel). A listing can contain standard file information such as file name, extension, type, owner and date created. For forensic analysis, file metadata can also be extracted from various formats: 1) executable file information (EXE, DLL, OCX) such as file version, description, company and product name; 2) multimedia properties (MP3, AVI, WAV, JPG, GIF, BMP, MKV, MKA, MPEG) such as track, title, artist, album, genre, video format, bits per pixel, frames per second, audio format and bits per channel; 3) Microsoft Office files (DOC, DOCX, XLS, XLSX, PPT, PPTX) properties such as document title, author, keywords and word count. For each file and folder it is also possible to obtain its CRC32, MD5, SHA-1 and Whirlpool hash sum. An extensive set of options allows the visual look of the output to be fully customized. Filters on file name, date, size or attributes can be applied to limit the files listed.
+
: http://www.krksoft.com
+
 
+
[[Category:Tools]]
+

Revision as of 14:29, 21 June 2014


Please help to improve this article by expanding it.
Further information might be found on the discussion page.

Cache files

The cache is stored in multiple files:

Filename    Description
index       The index file
data_#      Data block files
f_######    (Separate) data stream file
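
As an illustration of the naming scheme in the table above, the following minimal sketch sorts file names found in a cache directory into the three kinds. The function name and the exact patterns are assumptions made for this example: it assumes the # in data_# is a decimal number and the ###### in f_###### is six lowercase hexadecimal digits.

```python
import re

# Hypothetical helper; the patterns mirror the file name table above.
# Assumption: '#' in data_# is a decimal number and '######' in f_######
# is six lowercase hexadecimal digits.
DATA_BLOCK_RE = re.compile(r"data_(\d+)")
DATA_STREAM_RE = re.compile(r"f_([0-9a-f]{6})")


def classify_cache_file(name):
    """Return (kind, number) for a cache file name, or ("unknown", None)."""
    if name == "index":
        return ("index", None)
    match = DATA_BLOCK_RE.fullmatch(name)
    if match:
        return ("data block", int(match.group(1)))
    match = DATA_STREAM_RE.fullmatch(name)
    if match:
        return ("data stream", int(match.group(1), 16))
    return ("unknown", None)
```

Any name that does not fit one of the three patterns is reported as unknown rather than guessed at.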

Cache address

The cache address is 4 bytes in size and consists of:

offset  size     value  description
If file type is 0 (Separate file)
0.0     28 bits         File number: the value represents the value of # in f_######
Else
0.0     16 bits         Block number
2.0     8 bits          File number (or file selector): the value represents the value of # in data_#
3.0     2 bits          Block size: the number of contiguous blocks, where 0 represents 1 block and 3 represents 4 blocks
3.2     2 bits          Reserved
Common
3.4     3 bits          File type
3.7     1 bit           Initialized flag
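
The layout above can be sketched as a small decoder. This is a minimal illustration, not a reference implementation. It assumes the offsets are given as byte.bit with byte 0 the least significant, so the 4 bytes are read as a little-endian 32-bit integer. The names CacheAddress, decode_cache_address and read_cache_address are hypothetical.

```python
import struct
from collections import namedtuple

# Hypothetical container for the decoded fields of a cache address.
CacheAddress = namedtuple(
    "CacheAddress",
    "initialized file_type file_number block_number block_count",
)


def decode_cache_address(value):
    """Split a 32-bit cache address into the fields of the table above."""
    initialized = bool((value >> 31) & 0x1)   # offset 3.7, 1 bit
    file_type = (value >> 28) & 0x7           # offset 3.4, 3 bits
    if file_type == 0:
        # Separate file: the lower 28 bits are the file number.
        return CacheAddress(initialized, file_type, value & 0x0FFFFFFF, None, None)
    block_number = value & 0xFFFF             # offset 0.0, 16 bits
    file_number = (value >> 16) & 0xFF        # offset 2.0, 8 bits
    # Offset 3.0, 2 bits: stored value 0 means 1 block, 3 means 4 blocks.
    block_count = ((value >> 24) & 0x3) + 1
    return CacheAddress(initialized, file_type, file_number, block_number, block_count)


def read_cache_address(data, offset=0):
    """Decode a cache address from raw bytes (assumed little-endian)."""
    (value,) = struct.unpack_from("<I", data, offset)
    return decode_cache_address(value)
```

For the value 0xa0010003 this yields file type 2, file number 1, block number 3 and a block count of 1.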

File types

Value  Description
0      (Separate) data stream file
1      (Rankings) block data file (36 byte block data file)
2      256 byte block data file
3      1024 byte block data file
4      4096 byte block data file
6      Unknown; seen on Mac OS X 0x6f430074
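
Combining the file type with the 2-bit block size field gives the number of bytes an entry spans in a block data file. The sketch below uses only the sizes listed in the table above. The names BLOCK_SIZES and span_size are hypothetical, and file types without a known block size (0 and 6) return None.

```python
# Block sizes in bytes, keyed by file type, taken from the table above.
BLOCK_SIZES = {1: 36, 2: 256, 3: 1024, 4: 4096}


def span_size(file_type, block_size_field):
    """Return the bytes covered by an address, or None if the type has no block size.

    block_size_field is the raw 2-bit value: 0 means 1 block, 3 means 4 blocks.
    """
    if not 0 <= block_size_field <= 3:
        raise ValueError("block size field must fit in 2 bits")
    size = BLOCK_SIZES.get(file_type)
    if size is None:
        return None
    return size * (block_size_field + 1)
```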

Examples

Value       Description
0x00000000  Not initialized
0x8000002a  Data stream file: f_00002a
0xa0010003  Block data file: data_1, block number 3, 1 block of 256 bytes
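
The examples can be reproduced with a short sketch that maps a cache address to the name of the file it points into. The function name is hypothetical. The six-digit lowercase hexadecimal form follows the f_00002a example, while the decimal form used for data_# is an assumption.

```python
def cache_address_to_filename(value):
    """Return the cache file name an address refers to, or None if not initialized."""
    if not value & 0x80000000:          # initialized flag is the top bit
        return None
    file_type = (value >> 28) & 0x7
    if file_type == 0:
        # Separate data stream file: lower 28 bits, formatted as in f_00002a.
        return "f_%06x" % (value & 0x0FFFFFFF)
    # Block data file: 8-bit file selector (decimal formatting is an assumption).
    return "data_%d" % ((value >> 16) & 0xFF)
```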

Index file format (index)

Overview:

  • File header
  • least recently used (LRU) data (or eviction control data)
  • index table

File header

  • TODO

Data block file format (data_#)

Overview:

  • File header
  • array of blocks

File header

  • TODO

Data stream

See: gzip
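
The page only points to gzip here, so the following is a sketch under the assumption that a data stream payload may be gzip-compressed and can be recognized by the gzip magic bytes 0x1f 0x8b. The function name is hypothetical. Payloads that do not start with the magic bytes are returned unchanged.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip member


def read_data_stream(raw):
    """Return the decompressed payload if raw looks like gzip, else raw as-is."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw
```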

See Also

  • Google Chrome
  • gzip

External Links