Difference between pages "Virtual machine" and "Document Metadata Extraction"

From ForensicsWiki
(Difference between pages)
(From the linux command prompt)
 
(Office Files: Evidence Center can handle metadata for MS/Open office files)
 
Line 1: Line 1:
= Creating a VM instance file from a forensic image =
+
Here are tools that will extract metadata from document files.
  
There are a number of ways to convert a forensic image to a VM instance. At present, this article lists tools that can convert images to VMDK files.
+
=Office Files=
+
== Creating a VMDK file from a forensic image ==  
+
  
=== Linux tools as included in SIFT ===
+
; [[antiword]]
 +
: http://www.winfield.demon.nl/
  
Via the SIFT workstation (free), use the following steps:
+
; [[Belkasoft]] Evidence Center
 +
: http://belkasoft.com/
 +
: Extracts metadata from various [[Microsoft]] Office files (both 97-2003 and 2007-2013 formats) as well as Open Office documents. It can also extract plain text (combining the text from all XLS/XLSX/ODS pages and PPT/PPTX/ODP slides) and embedded objects. Pictures embedded in a document can be viewed directly in the tool's user interface.
  
1.open a terminal window
+
; [[catdoc]]
2.sudo su
+
: http://www.45.free.net/~vitus/software/catdoc/
3.mkdir /mnt/ewf1
+
4.mount_ewf.py (Encase Image file path) /mnt/ewf1
+
5.qemu-img convert /mnt/ewf1/(encase image file name) -O vmdk (give_a_name).vmdk
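Steps 3 to 5 above can be wrapped in one function. This is a minimal sketch, not a tested SIFT script: the names e01_to_vmdk, run and DRY_RUN are made up for illustration, the mount point is the /mnt/ewf1 used in step 3, and the qemu-img options are placed before the file names, which is the more common ordering.

```shell
#!/bin/sh
# Sketch of steps 3-5 as one function. Run it as root, as in step 2.
# Set DRY_RUN=1 to print the commands instead of executing them.

# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: e01_to_vmdk IMAGE.E01 NAME_INSIDE_MOUNT OUTPUT.vmdk
# NAME_INSIDE_MOUNT is the raw image file that appears under the mount point.
e01_to_vmdk() {
    image=$1
    raw_name=$2
    out=$3
    mnt=/mnt/ewf1
    run mkdir -p "$mnt" &&
    run mount_ewf.py "$image" "$mnt" &&
    run qemu-img convert -O vmdk "$mnt/$raw_name" "$out"
}
```

Calling it with DRY_RUN=1 first is a cheap way to check the paths before anything is mounted or written.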
+
  
=== Paladin 4 ===
+
; [[laola]]
 +
: http://user.cs.tu-berlin.de/~schwartz/pmh/index.html
  
- Paladin 4 (free) can also convert DD and E01 images to VMDK.
+
; [[word2x]]
 +
: http://word2x.sourceforge.net/
  
=== Live View ===
+
; [[wvWare]]
 +
: http://wvware.sourceforge.net/
 +
: Extracts metadata from various [[Microsoft]] Word files ([[doc]]). Can also convert doc files to other formats such as HTML or plain text.
  
[http://liveview.sourceforge.net/ Live View] (open source) is reported to be unreliable, but it does work with some images.
+
; [[Outside In]]
 +
: http://www.oracle.com/technology/products/content-management/oit/oit_all.html
 +
: Originally developed by Stellent, supports hundreds of file types.
  
=== EnCase ===
+
; [[FI Tools]]
 +
: http://forensicinnovations.com/
 +
: More than 100 file types.
  
Use EnCase (commercial) to mount the E01 image as an emulated disk (this requires the Physical Disk Emulator (“PDE”) module), then use VMware to create a virtual machine from the emulated physical disk. Guidance Software has a good guide on how to do this in its support portal.
+
=StickyNotes=
 +
; StickyNotes Parser
 +
Windows 7 StickyNotes follow the [http://msdn.microsoft.com/en-us/library/dd942138%28v=prot.13%29.aspx MS Compound Document binary format]. The StickyNotes Parser extracts metadata (time stamps) from the OLE format, along with the text content (not the RTF contents) of the notes themselves. Sn.exe also extracts the modified time of the Root Entry of the Compound Document. All times are displayed in UTC.
 +
:http://code.google.com/p/winforensicaanalysis/downloads/list
  
Note: this has only been confirmed with EnCase 6; EnCase v7 has not been shown to support it.
+
=PDF Files=
  
=== VFC - Virtual Forensic Computing ===
+
; [[xpdf]]
 +
: http://www.foolabs.com/xpdf/
 +
: [[pdfinfo]] (part of the [[xpdf]] package) displays some metadata of [[PDF]] files.
  
VFC (commercial) is reportedly very good, although trouble booting Windows 2003 servers has been reported. It is somewhat pricey ($1350 for a corporate license), but according to one user it works the vast majority of the time and the developer provides excellent support.
 
  
== Creating a KVM image ==
+
(See [[PDF]])
  
=== From the linux command prompt ===
+
=Images=
kvm -hda myimage.dd
+
  
Memory can be set as an option, CD drives can be presented, etc., and there is an option equivalent to the VMware non-persistent mode.
+
; [[Exiftool]]
 +
: http://www.sno.phy.queensu.ca/~phil/exiftool/
 +
: Free, cross-platform tool to extract metadata from many different file formats. Also supports writing metadata.
  
Warning: It has been determined that using kvm's non-persistent mode can still result in an altered image. Always, always, always work from a copy.
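One way to follow the advice about working from a copy is to hash the image, copy it, and confirm the copy matches before booting it. The sketch below assumes GNU coreutils (sha256sum, cp, cut); the function names sha256_of and make_working_copy are illustrative and do not come from any tool mentioned here.

```shell
#!/bin/sh
# Sketch: make a working copy of an image and confirm it matches the original.

# Print only the SHA-256 digest of a file.
sha256_of() {
    sha256sum < "$1" | cut -d ' ' -f 1
}

# Usage: make_working_copy SOURCE_IMAGE WORKING_COPY
# Copies the image and fails if the digests of source and copy differ.
make_working_copy() {
    src=$1
    dst=$2
    before=$(sha256_of "$src") || return 1
    cp -- "$src" "$dst" || return 1
    if [ "$(sha256_of "$dst")" = "$before" ]; then
        echo "OK $before"
    else
        echo "MISMATCH: $dst differs from $src" >&2
        return 1
    fi
}
```

A typical use would be `make_working_copy myimage.dd work.dd && kvm -hda work.dd`, keeping the printed digest so the original can be re-hashed later and compared.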
+
; [[jhead]]
 +
: http://www.sentex.net/~mwandel/jhead/
 +
: Displays or modifies [[Exif]] data in [[JPEG]] files.
  
= Using the VMDK file =
+
; [[vinetto]]
 +
: http://vinetto.sourceforge.net/
 +
: Examines [[Thumbs.db]] files.
  
Once you have the VMDK file, you can create a virtual machine in
+
;[[libexif]]
VirtualBox or VMware Workstation and use the VMDK as an existing hard
+
: http://sourceforge.net/projects/libexif
: EXIF tag parsing library.
disk for the virtual machine. I prefer to use VMware Workstation
+
because it has a non-persistent mode which allows you to write changes
+
to a cache file rather than the forensic image itself thus maintaining
+
integrity.
+
  
= External Links =
+
; [[Adroit Photo Forensics]]
* [http://www.myfixlog.com/fix.php?fid=35 How to Create a Virtual Machine from a Raw Hard Drive Image]
+
: http://digital-assembly.com/products/adroit-photo-forensics/
 +
: Displays metadata and uses date and camera metadata for grouping, timelines, etc.
 +
 
 +
; Exif Viewer
 +
: http://araskin.webs.com/exif/exif.html
 +
: Add-on for Firefox and Thunderbird that displays various [[JPEG]]/JPG metadata in local and remote images.
 +
 
 +
; exiftags
 +
: http://johnst.org/sw/exiftags/
 +
: Open source utility to parse and edit [[exif]] data in [[JPEG]] images. Found in many Debian-based distributions.
 +
 
 +
; exifprobe
 +
: http://www.virtual-cafe.com/~dhh/tools.d/exifprobe.d/exifprobe.html
 +
: Open source utility that reads [[exif]] data in [[JPEG]] and some "RAW" image formats. Found in many Debian based distributions.
 +
 
 +
; Exiv2
 +
: http://www.exiv2.org
 +
: Open source C++ library and command line tool for reading and writing metadata in various image formats. Found in almost every GNU/Linux distribution.
 +
 
 +
; pngtools
 +
: http://www.stillhq.com/pngtools/
 +
: Open source suite of commands (pnginfo, pngchunks, pngchunksdesc) that reads metadata found in [[PNG]] files. Found in many Debian based distributions.
 +
 
 +
; pngmeta
 +
: http://sourceforge.net/projects/pmt/files/
 +
: Open source command line tool that extracts metadata from [[PNG]] images. Found in many Debian based distributions.
 +
 
 +
=General=
 +
These general-purpose programs frequently work when the special-purpose programs fail, but they generally provide less detailed information.
 +
 
 +
; [[Metadact-e]]
 +
: "Patented server-based metadata cleaning software that previews, cleans and converts documents in Microsoft Outlook, Web Access email, tablets and smartphones, as well as desktop-based documents."
 +
: http://www.litera.com/Products/Metadact-e.aspx
 +
 
 +
; [[Metadata Extraction Tool]]
 +
: "Developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats like PDF documents, image files, sound files Microsoft office documents, and many others."
 +
: http://meta-extractor.sourceforge.net/
 +
 
 +
; [[Metadata Assistant]]
 +
: http://www.payneconsulting.com/products/metadataent/
 +
 
 +
; [[hachoir|hachoir-metadata]]
 +
: Extraction tool, part of '''[[Hachoir]]''' project
 +
 
 +
; [[file]]
 +
: The UNIX '''file''' program can extract some metadata
 +
 
 +
; [[GNU libextractor]]
 +
: http://gnunet.org/libextractor/
: The libextractor library is a pluggable system for extracting metadata.
 +
 
 +
; [[Directory Lister Pro]]
 +
: Directory Lister Pro is a Windows tool that creates listings of files from selected directories on hard disks, CD-ROMs, DVD-ROMs, floppies, USB storage devices and network shares. Listings can be produced in HTML, text or CSV format (for easy import into Excel). A listing can contain standard file information such as file name, extension, type, owner and date created. For forensic analysis in particular, file metadata can be extracted from various formats: 1) executable file information (EXE, DLL, OCX) such as file version, description, company and product name; 2) multimedia properties (MP3, AVI, WAV, JPG, GIF, BMP, MKV, MKA, MPEG) such as track, title, artist, album, genre, video format, bits per pixel, frames per second, audio format and bits per channel; 3) Microsoft Office file properties (DOC, DOCX, XLS, XLSX, PPT, PPTX) such as document title, author, keywords and word count. The CRC32, MD5, SHA-1 and Whirlpool hashes of each file and folder can also be computed. An extensive set of options allows the look of the output to be fully customized, and filters on file name, date, size or attributes can limit which files are listed.
 +
: http://www.krksoft.com
 +
 
 +
[[Category:Tools]]

Revision as of 15:20, 2 July 2013

Here are tools that will extract metadata from document files.

Office Files

antiword
http://www.winfield.demon.nl/
Belkasoft Evidence Center
http://belkasoft.com/
Extracts metadata from various Microsoft Office files (both 97-2003 and 2007-2013 formats) as well as Open Office documents. It can also extract plain text (combining the text from all XLS/XLSX/ODS pages and PPT/PPTX/ODP slides) and embedded objects. Pictures embedded in a document can be viewed directly in the tool's user interface.
catdoc
http://www.45.free.net/~vitus/software/catdoc/
laola
http://user.cs.tu-berlin.de/~schwartz/pmh/index.html
word2x
http://word2x.sourceforge.net/
wvWare
http://wvware.sourceforge.net/
Extracts metadata from various Microsoft Word files (doc). Can also convert doc files to other formats such as HTML or plain text.
Outside In
http://www.oracle.com/technology/products/content-management/oit/oit_all.html
Originally developed by Stellent, supports hundreds of file types.
FI Tools
http://forensicinnovations.com/
More than 100 file types.

StickyNotes

StickyNotes Parser

Windows 7 StickyNotes follow the MS Compound Document binary format. The StickyNotes Parser extracts metadata (time stamps) from the OLE format, along with the text content (not the RTF contents) of the notes themselves. Sn.exe also extracts the modified time of the Root Entry of the Compound Document. All times are displayed in UTC.

http://code.google.com/p/winforensicaanalysis/downloads/list

PDF Files

xpdf
http://www.foolabs.com/xpdf/
pdfinfo (part of the xpdf package) displays some metadata of PDF files.


(See PDF)

Images

Exiftool
http://www.sno.phy.queensu.ca/~phil/exiftool/
Free, cross-platform tool to extract metadata from many different file formats. Also supports writing metadata.
jhead
http://www.sentex.net/~mwandel/jhead/
Displays or modifies Exif data in JPEG files.
vinetto
http://vinetto.sourceforge.net/
Examines Thumbs.db files.
libexif
http://sourceforge.net/projects/libexif
EXIF tag parsing library.
Adroit Photo Forensics
http://digital-assembly.com/products/adroit-photo-forensics/
Displays metadata and uses date and camera metadata for grouping, timelines, etc.
Exif Viewer
http://araskin.webs.com/exif/exif.html
Add-on for Firefox and Thunderbird that displays various JPEG/JPG metadata in local and remote images.
exiftags
http://johnst.org/sw/exiftags/
Open source utility to parse and edit exif data in JPEG images. Found in many Debian-based distributions.
exifprobe
http://www.virtual-cafe.com/~dhh/tools.d/exifprobe.d/exifprobe.html
Open source utility that reads exif data in JPEG and some "RAW" image formats. Found in many Debian based distributions.
Exiv2
http://www.exiv2.org
Open source C++ library and command line tool for reading and writing metadata in various image formats. Found in almost every GNU/Linux distribution.
pngtools
http://www.stillhq.com/pngtools/
Open source suite of commands (pnginfo, pngchunks, pngchunksdesc) that reads metadata found in PNG files. Found in many Debian based distributions.
pngmeta
http://sourceforge.net/projects/pmt/files/
Open source command line tool that extracts metadata from PNG images. Found in many Debian based distributions.

General

These general-purpose programs frequently work when the special-purpose programs fail, but they generally provide less detailed information.

Metadact-e
"Patented server-based metadata cleaning software that previews, cleans and converts documents in Microsoft Outlook, Web Access email, tablets and smartphones, as well as desktop-based documents."
http://www.litera.com/Products/Metadact-e.aspx
Metadata Extraction Tool
"Developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats like PDF documents, image files, sound files Microsoft office documents, and many others."
http://meta-extractor.sourceforge.net/
Metadata Assistant
http://www.payneconsulting.com/products/metadataent/
hachoir-metadata
Extraction tool, part of Hachoir project
file
The UNIX file program can extract some metadata
GNU libextractor
http://gnunet.org/libextractor/
The libextractor library is a pluggable system for extracting metadata.
Directory Lister Pro
Directory Lister Pro is a Windows tool that creates listings of files from selected directories on hard disks, CD-ROMs, DVD-ROMs, floppies, USB storage devices and network shares. Listings can be produced in HTML, text or CSV format (for easy import into Excel). A listing can contain standard file information such as file name, extension, type, owner and date created. For forensic analysis in particular, file metadata can be extracted from various formats: 1) executable file information (EXE, DLL, OCX) such as file version, description, company and product name; 2) multimedia properties (MP3, AVI, WAV, JPG, GIF, BMP, MKV, MKA, MPEG) such as track, title, artist, album, genre, video format, bits per pixel, frames per second, audio format and bits per channel; 3) Microsoft Office file properties (DOC, DOCX, XLS, XLSX, PPT, PPTX) such as document title, author, keywords and word count. The CRC32, MD5, SHA-1 and Whirlpool hashes of each file and folder can also be computed. An extensive set of options allows the look of the output to be fully customized, and filters on file name, date, size or attributes can limit which files are listed.
http://www.krksoft.com
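
The idea that a general-purpose program can serve as a fallback when a special-purpose one fails can be sketched as a small wrapper. This is only an illustration: the function names special_tool_for and extract_metadata are made up, the extension-to-tool mapping is an assumption covering just pdfinfo and exiftool from the lists above, and file(1) is used as the general fallback.

```shell
#!/bin/sh
# Sketch: try a format-specific tool first, then fall back to file(1).

# Print the name of the specialised tool suggested for a file, or nothing.
special_tool_for() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        *.pdf)               echo pdfinfo ;;
        *.jpg|*.jpeg|*.png)  echo exiftool ;;
        *)                   : ;;
    esac
}

# Run the specialised tool if it is installed and succeeds;
# otherwise report what file(1) can determine.
extract_metadata() {
    target=$1
    tool=$(special_tool_for "$target")
    if [ -n "$tool" ] && command -v "$tool" >/dev/null 2>&1; then
        "$tool" "$target" && return 0
    fi
    file -- "$target"
}
```

For example, `extract_metadata report.pdf` would run pdfinfo when it is installed and otherwise print the less detailed description from file.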