Difference between revisions of "Metadata"

From ForensicsWiki
m (File types that support metadata and extraction tools)
(Various fixes.)
Line 1: Line 1:
Metadata is data about data. Metadata plays a number of important roles in computer forensics:
+
'''Metadata''' is data about data. Metadata plays a number of important roles in [[computer forensics]]:
 
* It can provide corroborating information about the document data itself.
 
* It can provide corroborating information about the document data itself.
 
* It can reveal information that someone tried to hide, delete, or obscure.
 
* It can reveal information that someone tried to hide, delete, or obscure.
Line 7: Line 7:
  
 
=Kinds of Metadata=
 
=Kinds of Metadata=
 +
 
Here are some kinds of metadata that are interesting in computer forensics:
 
Here are some kinds of metadata that are interesting in computer forensics:
* File system metadata (e.g. MAC times, access control lists, etc.)
+
* [[File system]] metadata (e.g. [[MAC times]], [[access control lists]], etc.)
* Digital image metadata. Although information such as the image size and number of colors are techncially metadata, JPEG and file formats store additional data about the photo or the device that acquired it.
+
* Digital image metadata. Although information such as the image size and number of colors are technically metadata, [[JPEG]] and other file formats store additional data about the photo or the device that acquired it.
  
 
=File types that support metadata and extraction tools=
 
=File types that support metadata and extraction tools=
 +
 
Below are some common data and metadata formats, the files in which they are found, and a collection of tools that can be used to extract information.
 
Below are some common data and metadata formats, the files in which they are found, and a collection of tools that can be used to extract information.
  
 
+
; [[EXIF]] ([[JPEG]] and [[TIFF]] image files; Music Files)
;EXIF (JPEG and TIFF Image files; Music Files)
+
: The [[Exchangeable Image File]] format describes a format for a block of data that can be embedded into JPEG and TIFF image files, as well as [[RIFF WAVE]] audio files. Information includes date and time information, camera settings, location information, textual descriptions, and copyright information.
: The Exchangeable Image File format describes a format for a block of data that can be embedded into JPEG and TIFF image files, as well as RIFF WAVE audio files. Information includes date and time information, camera settings, locaiton information, textual descriptions, and copyright information. For more information, see [http://www.exif.org] and the [http://en.wikipedia.org/wiki/Exchangeable_image_file_format Wikipedia entry.]
+
 
:* [http://pel.sourceforge.net/ PEL: PHP Exif Library]
 
:* [http://pel.sourceforge.net/ PEL: PHP Exif Library]
 
:* [http://libexif.sourceforge.net/ LibExif] (C)
 
:* [http://libexif.sourceforge.net/ LibExif] (C)
 +
:* [http://www.drewnoakes.com/code/exif/ Metadata extraction in Java]
  
 
+
; [[ID3]] ([[MP3]] files)
;ID3 (MP3 files)
+
: Implemented as a small block of data stored at the end of MP3 files. [[ID3v1]] is a 128-byte block in a specified format allowing 30 bytes for song, artist and album, 4 bytes for year, 30 bytes for comment, and 1 byte for genre. [[ID3v1.1]] adds a track number. [[ID3v2]] is a general container structure. For more information, see [http://www.id3.org/].
: Implemented as a small block of data stored at the end of MP3 files. ID3v1 is a 128-byte block in a specified format allowing 30 bytes for slong, artist and album, 4 bytes for year, 30 bytes for comment, and 1 byte for genre. ID3v1.1 adds a track number. ID3v2 is a general container structure. For more information, see [http://www.id3.org/].
+
 
:* [http://id3lib.sourceforge.net/ id3lib], a widely-used open source C/C++ ID3 implementation.
 
:* [http://id3lib.sourceforge.net/ id3lib], a widely-used open source C/C++ ID3 implementation.
 
:* [http://www.vdheide.de/projects.html Java library MP3]
 
:* [http://www.vdheide.de/projects.html Java library MP3]
Line 28: Line 29:
 
:* [http://search.cpan.org/dist/MPEG-ID3v2Tag/ MPEG::ID3v2Tag] (Perl)
 
:* [http://search.cpan.org/dist/MPEG-ID3v2Tag/ MPEG::ID3v2Tag] (Perl)
  
;Microsoft OLE 2
+
; [[Microsoft]] [[OLE 2]]
:Microsoft Office document files contain a huge amount of metadata. They are created as OLE 2 files. Here are some tools for processing them:  
+
: Microsoft Office document files contain a huge amount of metadata. They are created as OLE 2 files. Here are some tools for processing them:  
 
:* [http://jakarta.apache.org/poi/index.html Jakarta POI] Open Source implementation in Java.
 
:* [http://jakarta.apache.org/poi/index.html Jakarta POI] Open Source implementation in Java.
  
 
+
; [[TIFF]]
;TIFF
+
: The [[Tagged Image File Format]] allows one or more images to be bundled in a single file. Multiple [[compression]] formats are supported. [[EXIF]] files can be stored inside TIFFs.
: The Tagged Image File Format allows one or more images to be bundled in a single file. Multiple compression formats are supported. EXIF files can be stored inside TIFFs.
+
:* [http://www.remotesensing.org/libtiff/ LibTIFF]
:* [http://www.remotesensing.org/libtiff/ LibTIFF]  
+
 
:* [http://www.awaresystems.be/imaging/tiff/faq.html TIFF FAQ]
 
:* [http://www.awaresystems.be/imaging/tiff/faq.html TIFF FAQ]
  
 
=External Links=
 
=External Links=
Wikipedia has a nice [http://en.wikipedia.org/wiki/Metadata entry on metadata].
 
  
[http://www.drewnoakes.com/code/exif/ Metadata extraction in Java]
+
* [http://en.wikipedia.org/wiki/Metadata Wikipedia: Metadata].

Revision as of 18:40, 24 March 2006

Metadata is data about data. Metadata plays a number of important roles in computer forensics:

  • It can provide corroborating information about the document data itself.
  • It can reveal information that someone tried to hide, delete, or obscure.
  • It can be used to automatically correlate documents from different sources.

Since metadata is fundamentally data, it suffers from the same data quality and pedigree issues as any other form of data. Nevertheless, because metadata isn't generally visible unless you use a special tool, more skill is required to alter or otherwise manipulate it.

Kinds of Metadata

Here are some kinds of metadata that are interesting in computer forensics:

  • File system metadata (e.g. MAC times, access control lists, etc.)
  • Digital image metadata. Although information such as the image size and number of colors are technically metadata, JPEG and other file formats store additional data about the photo or the device that acquired it.
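
As a small illustration of file system metadata, the sketch below reads the MAC times of a file with Python's standard `os.stat` call. The function name `mac_times` and the dictionary keys are illustrative choices, not part of any forensic tool. Note that the meaning of `st_ctime` depends on the platform, and that a real examination would normally work on a read-only copy so that the timestamps are not disturbed.

```python
import os
import tempfile
from datetime import datetime, timezone


def mac_times(path):
    """Return the modified, accessed, and changed/created times of path as UTC datetimes."""
    st = os.stat(path)

    def to_utc(timestamp):
        return datetime.fromtimestamp(timestamp, tz=timezone.utc)

    return {
        "modified": to_utc(st.st_mtime),
        "accessed": to_utc(st.st_atime),
        # st_ctime is the inode change time on Unix but the creation time on Windows.
        "changed_or_created": to_utc(st.st_ctime),
    }


if __name__ == "__main__":
    # Demonstrate on a throwaway temporary file.
    with tempfile.NamedTemporaryFile() as tmp:
        for name, value in mac_times(tmp.name).items():
            print(name, value.isoformat())
```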

File types that support metadata and extraction tools

Below are some common data and metadata formats, the files in which they are found, and a collection of tools that can be used to extract information.

EXIF (JPEG and TIFF image files; Music Files)
The Exchangeable Image File format describes a format for a block of data that can be embedded into JPEG and TIFF image files, as well as RIFF WAVE audio files. Information includes date and time information, camera settings, location information, textual descriptions, and copyright information.
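
To make the idea of an embedded block concrete, here is a minimal sketch that locates the EXIF block inside a JPEG file. It relies on the fact that JPEG files carry EXIF data in an APP1 segment (marker `0xFFE1`) whose payload begins with the identifier `Exif\0\0`. The function name `find_exif_payload` is hypothetical, and the scanner is deliberately simplified: it does not handle fill bytes or other rarely used marker layouts, and it only returns the raw block without decoding the tags inside it.

```python
import struct


def find_exif_payload(jpeg_bytes):
    """Return the raw EXIF block (TIFF-structured bytes) from a JPEG, or None if absent."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            return None  # not positioned on a marker; give up rather than guess
        marker = jpeg_bytes[pos + 1]
        if marker in (0xDA, 0xD9):  # SOS or EOI: the header segments are over
            return None
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        if length < 2:
            return None
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return payload[6:]  # strip the 6-byte identifier
        pos += 2 + length
    return None
```

The returned bytes start with a TIFF header, so a TIFF-aware parser is needed to read the individual tags such as timestamps or camera settings.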
ID3 (MP3 files)
Implemented as a small block of data stored in MP3 files. ID3v1 is a 128-byte block at the end of the file in a fixed format: a 3-byte "TAG" identifier, followed by 30 bytes each for song title, artist, and album, 4 bytes for year, 30 bytes for comment, and 1 byte for genre. ID3v1.1 adds a track number. ID3v2 is a general container structure, usually stored at the start of the file. For more information, see http://www.id3.org/.
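
The fixed ID3v1 layout is simple enough to decode by hand. The sketch below reads the last 128 bytes of a file's contents, checks for the `TAG` identifier that precedes the fields, and unpacks them. The function name `parse_id3v1`, the dictionary keys, and the choice of Latin-1 decoding are assumptions made for illustration. The ID3v1.1 track number is detected by the usual convention of a zero byte followed by a non-zero byte at the end of the comment field.

```python
import struct


def _text(raw):
    """Decode a fixed-width field, dropping NUL padding and trailing spaces."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()


def parse_id3v1(data):
    """Parse an ID3v1 or ID3v1.1 tag from the end of data; return a dict or None."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None
    # 30 + 30 + 30 + 4 + 30 + 1 = 125 bytes after the 3-byte identifier.
    title, artist, album, year, comment, genre = struct.unpack(
        "<30s30s30s4s30sB", tag[3:]
    )
    track = None
    # ID3v1.1: byte 28 of the comment is zero and byte 29 holds the track number.
    if comment[28] == 0 and comment[29] != 0:
        track = comment[29]
        comment = comment[:28]
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "track": track,
        "genre": genre,
    }
```

A caller would typically pass the full contents of an MP3 file, for example `parse_id3v1(open(path, "rb").read())`, and treat a `None` result as "no ID3v1 tag present".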
Microsoft OLE 2
Microsoft Office document files contain a huge amount of metadata. They are created as OLE 2 files. Tools for processing them include Jakarta POI (http://jakarta.apache.org/poi/index.html), an open source implementation in Java.
TIFF
The Tagged Image File Format allows one or more images to be bundled in a single file. Multiple compression formats are supported. EXIF data can be stored inside TIFF files.
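
As a rough sketch of how several images are bundled in one TIFF file, the code below walks the chain of image file directories (IFDs). A TIFF file begins with a byte-order mark (`II` or `MM`), the number 42, and the offset of the first IFD. Each IFD ends with the offset of the next one, with zero marking the end of the chain. The function name `count_tiff_directories` is hypothetical. The count is only an approximation of the number of images, since this sketch ignores sub-IFDs and does not inspect the tags themselves.

```python
import struct


def count_tiff_directories(data):
    """Count the IFDs in the main directory chain of a TIFF byte string."""
    if len(data) < 8:
        raise ValueError("too short to be a TIFF file")
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("missing TIFF byte-order mark")
    magic, offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic number")

    count = 0
    seen = set()  # guards against loops in a corrupt or hostile file
    while offset != 0:
        if offset in seen or offset + 2 > len(data):
            raise ValueError("corrupt IFD chain")
        seen.add(offset)
        (entries,) = struct.unpack(order + "H", data[offset:offset + 2])
        next_pos = offset + 2 + entries * 12  # each directory entry is 12 bytes
        if next_pos + 4 > len(data):
            raise ValueError("truncated IFD")
        (offset,) = struct.unpack(order + "I", data[next_pos:next_pos + 4])
        count += 1
    return count
```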

External Links

  • Wikipedia: Metadata (http://en.wikipedia.org/wiki/Metadata)