Difference between pages "Document Metadata Extraction" and "Forensic Live CD issues"

From ForensicsWiki
Here are tools that will extract metadata from document files.

=Office Files=

; [[antiword]]
: http://www.winfield.demon.nl/

; [[catdoc]]
: http://www.45.free.net/~vitus/software/catdoc/

; [[laola]]
: http://user.cs.tu-berlin.de/~schwartz/pmh/index.html

; [[word2x]]
: http://word2x.sourceforge.net/

; [[wvWare]]
: http://wvware.sourceforge.net/
: Extracts metadata from various [[Microsoft]] Word files ([[doc]]). Can also convert doc files to other formats such as HTML or plain text.

=PDF Files=

; [[xpdf]]
: http://www.foolabs.com/xpdf/
: [[pdfinfo]] (part of the [[xpdf]] package) displays some metadata of [[PDF]] files.

(See [[PDF]])

=Images=

; [[jhead]]
: http://www.sentex.net/~mwandel/jhead/
: Displays or modifies [[Exif]] data in [[JPEG]] files.

; [[vinetto]]
: http://vinetto.sourceforge.net/
: Examines [[Thumbs.db]] files.

; [[libexif]]
: http://sourceforge.net/projects/libexif
: EXIF tag parsing library.

; [[Adroit Photo Forensics]]
: http://digital-assembly.com/products/adroit-photo-forensics/
: Displays metadata and uses date and camera metadata for grouping, timelines etc.

=General=

These general-purpose programs frequently work when the special-purpose programs fail, but they generally provide less detailed information.

; [[Metadata Extraction Tool]]
: "Developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats like PDF documents, image files, sound files Microsoft office documents, and many others."
: http://meta-extractor.sourceforge.net/

; [[Metadata Assistant]]
: http://www.payneconsulting.com/products/metadataent/

; [[hachoir|hachoir-metadata]]
: Extraction tool, part of the '''[[Hachoir]]''' project.

; [[file]]
: The UNIX '''file''' program can extract some metadata.

; [[GNU libextractor]]
: http://gnunet.org/libextractor/
: The libextractor library is a pluggable system for extracting metadata.

; [[Directory Lister Pro]]
: Directory Lister Pro is a Windows tool that creates listings of files from selected directories on hard disks, CD-ROMs, DVD-ROMs, floppies, USB storage devices and network shares. Listings can be in HTML, text or CSV format (for easy import into Excel). A listing can contain standard file information such as file name, extension, type, owner and date created. For forensic analysis in particular, file metadata can be extracted from various formats: 1) executable file information (EXE, DLL, OCX) such as file version, description, company and product name; 2) multimedia properties (MP3, AVI, WAV, JPG, GIF, BMP, MKV, MKA, MPEG) such as track, title, artist, album, genre, video format, bits per pixel, frames per second, audio format and bits per channel; 3) Microsoft Office files (DOC, DOCX, XLS, XLSX, PPT, PPTX) such as document title, author, keywords and word count. For each file and folder it is also possible to obtain its CRC32, MD5, SHA-1 and Whirlpool hash sum. An extensive set of options allows the visual look of the output to be customized completely. Filters on file name, date, size or attributes can be applied to limit the files listed.
: http://www.krksoft.com

[[Category:Tools]]

Revision as of 18:30, 1 February 2010

The problem

Forensic Linux Live CD distributions are widely used during computer forensic investigations. Many vendors of such distributions currently make false claims that their distributions "do not touch anything", "write protect everything" and so on. Unfortunately, community-developed distributions are no exception. In practice, many forensic Linux Live CD distributions are not tested properly, and no suitable test cases have been developed.

Another side of the problem

The other side of the problem of insufficient testing is that many users do not know what happens "under the hood" of such distributions and therefore cannot test them adequately.

Example

For example, Forensic Cop Journal (Volume 1(3), Oct 2009) describes a test case in which an Ext3 file system was mounted with the "-o ro" mount flag as a way to write protect the data. The article says that all tests were successful (i.e. no data modification was found after unmounting the file system). However, it is known that a damaged (i.e. not properly unmounted) Ext3 file system cannot be write protected with the "-o ro" mount flag alone: write access is enabled during file system recovery.

So the question is: will many users test a damaged Ext3 file system (as well as a clean one) when validating their favourite forensic Live CD distribution? My answer is "no", because many users are unaware of such behaviour.
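
A test of this kind can be reduced to comparing a hash of the test image before and after the step being validated. The following is only a minimal sketch: the function name unchanged_after is hypothetical, sha256sum from GNU coreutils is assumed to be available, and it should be run against a disposable copy of a test image, never against evidence.

```shell
#!/bin/sh
# Hypothetical helper: run a command and report whether IMAGE changed.
# Usage: unchanged_after IMAGE COMMAND [ARGS...]
# Returns 0 if the SHA-256 of IMAGE is identical before and after COMMAND,
# 1 if it differs, 2 if IMAGE cannot be read.
unchanged_after() {
    image="$1"
    shift
    [ -r "$image" ] || return 2
    before=$(sha256sum "$image" | cut -d ' ' -f 1)
    "$@"    # the step under test, e.g. a mount/umount cycle
    after=$(sha256sum "$image" | cut -d ' ' -f 1)
    [ "$before" = "$after" ]
}
```

For instance, unchanged_after test.img sh -c 'mount -o ro,loop test.img /mnt && umount /mnt' would exercise one mount cycle (root privileges and the /mnt mount point are assumptions). Repeating the check with both a cleanly unmounted image and one that was not properly unmounted covers the case discussed above.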

Problems

Here is a list of common problems of forensic Linux Live CD distributions that developers and users can use for testing purposes. Each problem is followed by an up-to-date list of affected distributions.

Journaling file systems updates

When several journaling file systems are mounted (and unmounted) with only the "-o ro" mount flag, data writes may still occur; the conditions differ between file systems. Here is a list of such file systems:

{| class="wikitable" border="1"
|-
! File system
! When data writes happen
! Notes
|-
| Ext3
| File system requires journal recovery
| To disable recovery: use "noload" flag, or use "ro,loop" flags, or use "ext2" file system type
|-
| Ext4
| File system requires journal recovery
| To disable recovery: use "noload" flag, or use "ro,loop" flags, or use "ext2" file system type
|-
| ReiserFS
| Always
| "nolog" flag does not work (see ''man mount''). To disable journal updates: use "ro,loop" flags
|-
| XFS
| Always
| "norecovery" flag does not help. To disable data writes: use "ro,loop" flags. The bug was fixed in recent 2.6 kernels.
|}
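
The table can be captured as a small lookup for use in test scripts. This is a sketch only: the function name suggested_ro_opts is hypothetical, and the options simply restate the table above. They have not been re-verified against any particular kernel, so the result should still be confirmed by hashing the media before and after mounting.

```shell
#!/bin/sh
# Hypothetical helper: print the read-only mount options that the table
# above suggests for a given file system type.
# Returns 1 for file system types the table does not cover.
suggested_ro_opts() {
    case "$1" in
        ext3|ext4)    echo "ro,noload" ;;  # skip journal recovery
        reiserfs|xfs) echo "ro,loop" ;;    # mount through a read-only loop device
        *)            return 1 ;;
    esac
}
```

A call might look like mount -t ext3 -o "$(suggested_ro_opts ext3)" /dev/sdb1 /mnt, where the device and mount point are placeholders.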

Incorrect mount flags can be used to mount a file system on evidentiary media during the boot process or during the file system preview process. As described above, this may modify the file system's data. For example, several Ubuntu-based forensic Linux Live CD distributions mount Ext3/4 file systems on fixed media (e.g. hard drives) while executing initrd scripts: in order to find a root file system image, these scripts mount every supported file system type on every supported media type using only the "-o ro" flag.

List of distributions that recover Ext3 (and sometimes Ext4) file systems during boot:
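
Whether an Ext3/4 volume would trigger journal recovery when mounted can be checked without mounting it, by reading the superblock directly. The sketch below assumes the standard on-disk layout: the superblock starts at byte 1024, the magic number 0xEF53 is at offset 0x38, and the "needs recovery" bit (0x0004) is in the incompatible-features field at offset 0x60. The function name ext_needs_recovery is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: inspect an Ext2/3/4 superblock without mounting.
# Returns 0 if the "needs recovery" flag is set, 1 if it is clear,
# 2 if no Ext2/3/4 magic number is found.
ext_needs_recovery() {
    dev="$1"
    # s_magic: bytes 1080-1081 (1024 + 0x38), stored little-endian as 53 ef
    magic=$(dd if="$dev" bs=1 skip=1080 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "53ef" ] || return 2
    # s_feature_incompat: low byte at 1120 (1024 + 0x60); bit 0x04 = needs recovery
    byte=$(dd if="$dev" bs=1 skip=1120 count=1 2>/dev/null | od -An -tu1 | tr -d ' \n')
    [ $(( ${byte:-0} & 4 )) -ne 0 ]
}
```

Running it against each test image before a boot test shows which images actually represent the not-properly-unmounted case. The function only reads from the device.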

{| class="wikitable" border="1"
|-
! Distribution
! Version
|-
| Helix3
| 2009R1
|-
| SMART Linux (Ubuntu)
| 2010-01-20
|-
| FCCU GNU/Linux Forensic Boot CD
| 12.1
|-
| SPADA
| 4
|}

Root file system spoofing

Let's look at the Casper scripts (which are used by initrd to search for a root file system image). Here is the function (see scripts/casper in the initrd image of Ubuntu-based forensic Linux Live CD distributions) that searches for a root file system image during the early stage of the boot process:

find_livefs() {
    timeout="${1}"
    # first look at the one specified in the command line
    if [ ! -z "${LIVEMEDIA}" ]; then
        if check_dev "null" "${LIVEMEDIA}" "skip_uuid_check"; then
            return 0
        fi
    fi
    # don't start autodetection before timeout has expired
    if [ -n "${LIVEMEDIA_TIMEOUT}" ]; then
        if [ "${timeout}" -lt "${LIVEMEDIA_TIMEOUT}" ]; then
            return 1
        fi
    fi
    # or do the scan of block devices
    for sysblock in $(echo /sys/block/* | tr ' ' '\n' | grep -v loop | grep -v ram); do
        devname=$(sys2dev "${sysblock}")
        fstype=$(get_fstype "${devname}")
        if /lib/udev/cdrom_id ${devname} > /dev/null; then
            if check_dev "null" "${devname}" ; then
                return 0
            fi
        elif is_nice_device "${sysblock}" ; then
            for dev in $(subdevices "${sysblock}"); do
                if check_dev "${dev}" ; then
                    return 0
                fi
            done
        elif [ "${fstype}" = "squashfs" -o \
                "${fstype}" = "ext3" -o \
                "${fstype}" = "ext2" ]; then
            # This is an ugly hack situation, the block device has
            # an image directly on it.  It's hopefully
            # casper, so take it and run with it.
            ln -s "${devname}" "${devname}.${fstype}"
            echo "${devname}.${fstype}"
            return 0
        fi
    done
    return 1
}

Swap space activation

Incorrect automount policy for removable media

Incorrect write-blocking approach

Software RAID (Linux RAID) activation