<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://www.forensicswiki.org/w/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>http://www.forensicswiki.org/w/api.php?action=feedcontributions&amp;user=Hypertex&amp;feedformat=atom</id>
		<title>Forensics Wiki - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="http://www.forensicswiki.org/w/api.php?action=feedcontributions&amp;user=Hypertex&amp;feedformat=atom"/>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Special:Contributions/Hypertex"/>
		<updated>2013-05-25T09:22:12Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.20.3</generator>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Document_Metadata_Extraction</id>
		<title>Document Metadata Extraction</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Document_Metadata_Extraction"/>
				<updated>2010-10-25T18:39:20Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: /* Images */ added Exiftool&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Here are tools that will extract metadata from document files.&lt;br /&gt;
&lt;br /&gt;
=Office Files=&lt;br /&gt;
&lt;br /&gt;
; [[antiword]]&lt;br /&gt;
: http://www.winfield.demon.nl/&lt;br /&gt;
&lt;br /&gt;
; [[catdoc]]&lt;br /&gt;
: http://www.45.free.net/~vitus/software/catdoc/&lt;br /&gt;
&lt;br /&gt;
; [[laola]]&lt;br /&gt;
: http://user.cs.tu-berlin.de/~schwartz/pmh/index.html&lt;br /&gt;
&lt;br /&gt;
; [[word2x]]&lt;br /&gt;
: http://word2x.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[wvWare]]&lt;br /&gt;
: http://wvware.sourceforge.net/&lt;br /&gt;
: Extracts metadata from various [[Microsoft]] Word files ([[doc]]). Can also convert doc files to other formats such as HTML or plain text.&lt;br /&gt;
&lt;br /&gt;
; [[Outside In]]&lt;br /&gt;
: http://www.oracle.com/technology/products/content-management/oit/oit_all.html&lt;br /&gt;
: Originally developed by Stellent, supports hundreds of file types.&lt;br /&gt;
&lt;br /&gt;
; [[FI Tools]]&lt;br /&gt;
: http://forensicinnovations.com/&lt;br /&gt;
: More than 100 file types.&lt;br /&gt;
&lt;br /&gt;
=PDF Files=&lt;br /&gt;
&lt;br /&gt;
; [[xpdf]]&lt;br /&gt;
: http://www.foolabs.com/xpdf/&lt;br /&gt;
: [[pdfinfo]] (part of the [[xpdf]] package) displays some metadata of [[PDF]] files.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(See [[PDF]])&lt;br /&gt;
&lt;br /&gt;
=Images=&lt;br /&gt;
&lt;br /&gt;
; [[Exiftool]]&lt;br /&gt;
: http://www.sno.phy.queensu.ca/~phil/exiftool/&lt;br /&gt;
: Free, cross-platform tool to extract metadata from many different file formats. Also supports writing metadata to many of them.&lt;br /&gt;
&lt;br /&gt;
; [[jhead]]&lt;br /&gt;
: http://www.sentex.net/~mwandel/jhead/&lt;br /&gt;
: Displays or modifies [[Exif]] data in [[JPEG]] files.&lt;br /&gt;
&lt;br /&gt;
; [[vinetto]]&lt;br /&gt;
: http://vinetto.sourceforge.net/&lt;br /&gt;
: Examines [[Thumbs.db]] files.&lt;br /&gt;
&lt;br /&gt;
; [[libexif]]&lt;br /&gt;
: http://sourceforge.net/projects/libexif&lt;br /&gt;
: EXIF tag parsing library.&lt;br /&gt;
&lt;br /&gt;
; [[Adroit Photo Forensics]]&lt;br /&gt;
: http://digital-assembly.com/products/adroit-photo-forensics/&lt;br /&gt;
: Displays metadata and uses date and camera metadata for grouping, timelines, etc.&lt;br /&gt;
&lt;br /&gt;
; Exif Viewer&lt;br /&gt;
: http://araskin.webs.com/exif/exif.html&lt;br /&gt;
: Add-on for Firefox and Thunderbird that displays various [[JPEG]]/JPG metadata in local and remote images.&lt;br /&gt;
&lt;br /&gt;
; exiftags&lt;br /&gt;
: http://johnst.org/sw/exiftags/&lt;br /&gt;
: Open source utility to parse and edit [[exif]] data in [[JPEG]] images. Found in many Debian-based distributions.&lt;br /&gt;
&lt;br /&gt;
; exifprobe&lt;br /&gt;
: http://www.virtual-cafe.com/~dhh/tools.d/exifprobe.d/exifprobe.html&lt;br /&gt;
: Open source utility that reads [[exif]] data in [[JPEG]] and some &amp;quot;RAW&amp;quot; image formats. Found in many Debian-based distributions.&lt;br /&gt;
&lt;br /&gt;
; Exiv2&lt;br /&gt;
: http://www.exiv2.org&lt;br /&gt;
: Open source C++ library and command line tool for reading and writing metadata in various image formats. Found in almost every GNU/Linux distribution.&lt;br /&gt;
&lt;br /&gt;
; pngtools&lt;br /&gt;
: http://www.stillhq.com/pngtools/&lt;br /&gt;
: Open source suite of commands (pnginfo, pngchunks, pngchunksdesc) that read metadata found in [[PNG]] files. Found in many Debian-based distributions.&lt;br /&gt;
&lt;br /&gt;
; pngmeta&lt;br /&gt;
: http://sourceforge.net/projects/pmt/files/&lt;br /&gt;
: Open source command line tool that extracts metadata from [[PNG]] images. Found in many Debian-based distributions.&lt;br /&gt;
&lt;br /&gt;
=General=&lt;br /&gt;
These general-purpose programs frequently work when the special-purpose programs fail, but they generally provide less detailed information.&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Extraction Tool]]&lt;br /&gt;
: &amp;quot;Developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats like PDF documents, image files, sound files Microsoft office documents, and many others.&amp;quot;&lt;br /&gt;
: http://meta-extractor.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Assistant]]&lt;br /&gt;
: http://www.payneconsulting.com/products/metadataent/&lt;br /&gt;
&lt;br /&gt;
; [[hachoir|hachoir-metadata]]&lt;br /&gt;
: Extraction tool, part of the '''[[Hachoir]]''' project.&lt;br /&gt;
&lt;br /&gt;
; [[file]]&lt;br /&gt;
: The UNIX '''file''' program can extract some metadata.&lt;br /&gt;
&lt;br /&gt;
; [[GNU libextractor]]&lt;br /&gt;
: http://gnunet.org/libextractor/&lt;br /&gt;
: The libextractor library is a pluggable system for extracting metadata.&lt;br /&gt;
&lt;br /&gt;
; [[Directory Lister Pro]]&lt;br /&gt;
: Directory Lister Pro is a Windows tool that creates listings of files from selected directories on hard disks, CD-ROMs, DVD-ROMs, floppies, USB storage devices and network shares. Listings can be in HTML, text or CSV format (for easy import into Excel). A listing can contain standard file information such as file name, extension, type, owner and date created. For forensic analysis in particular, file metadata can be extracted from various formats: 1) executable file information (EXE, DLL, OCX) such as file version, description, company and product name; 2) multimedia properties (MP3, AVI, WAV, JPG, GIF, BMP, MKV, MKA, MPEG) such as track, title, artist, album, genre, video format, bits per pixel, frames per second, audio format and bits per channel; 3) Microsoft Office files (DOC, DOCX, XLS, XLSX, PPT, PPTX) such as document title, author, keywords and word count. For each file and folder it is also possible to obtain its CRC32, MD5, SHA-1 and Whirlpool hash sum. Numerous options allow the visual look of the output to be customized. Filters on file name, date, size or attributes can be applied to limit the files listed.&lt;br /&gt;
: http://www.krksoft.com&lt;br /&gt;
&lt;br /&gt;
[[Category:Tools]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Exiftool</id>
		<title>Exiftool</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Exiftool"/>
				<updated>2010-10-25T18:36:54Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: Created page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox_Software |&lt;br /&gt;
  name = Exiftool |&lt;br /&gt;
  maintainer = Phil Harvey |&lt;br /&gt;
  os = {{Cross-platform}} |&lt;br /&gt;
  genre = {{Analysis}} |&lt;br /&gt;
  license = multiple |&lt;br /&gt;
  website = [http://www.sno.phy.queensu.ca/~phil/exiftool/] |&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Exiftool is a Perl library and a command-line tool that can be used for reading and writing metadata in files. In addition to Exif data in images, Exiftool also supports many other file formats, such as video and audio files as well as Word documents. Note that Exiftool does not support writing metadata to some formats. While Exiftool can be helpful in a forensic analysis, it is not a forensic tool, nor is it an anti-forensic tool.&lt;br /&gt;
&lt;br /&gt;
=License=&lt;br /&gt;
Exiftool is free and open source software, released under the same terms as Perl.&lt;br /&gt;
=Supported Files=&lt;br /&gt;
The Exiftool [http://www.sno.phy.queensu.ca/~phil/exiftool/#supported website] has a comprehensive list of the supported file types, and indicates if Exiftool supports reading, writing, and/or creating metadata for each file type.&lt;br /&gt;
Exiftool's specialty is image file formats.  It can extract Exif and GPS metadata, and it can extract Maker Notes written by cameras from several popular manufacturers.&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Open_Document_Format</id>
		<title>Open Document Format</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Open_Document_Format"/>
				<updated>2010-04-14T00:40:47Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: /* Metadata */  added info about annotations and tracked changes&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;'''Open Document Format''' (ODF) is an open, XML-based file format standard for word processing documents, spreadsheets, charts, and presentations. The specification was originally developed by Sun Microsystems, but has been standardized by the Organization for the Advancement of Structured Information Standards (OASIS). ODF version 1.0 has been standardized as ISO/IEC 26300:2006. ODF is the primary format for the OpenOffice.org office suite.&lt;br /&gt;
&lt;br /&gt;
=File Extensions=&lt;br /&gt;
The main file extensions for ODF documents are&lt;br /&gt;
* .odt for word processing documents&lt;br /&gt;
* .ods for spreadsheet documents&lt;br /&gt;
* .odp for presentation documents&lt;br /&gt;
* .odb for database documents&lt;br /&gt;
* .odg for graphical documents&lt;br /&gt;
* .odf for mathematical formulae&lt;br /&gt;
&lt;br /&gt;
ODF also supports template files for each type of document.  The 'd' in the file extension is replaced by a 't' for template files (for example, .ott for a word processing template).&lt;br /&gt;
&lt;br /&gt;
=File Structure=&lt;br /&gt;
An ODF document can be as simple as a single XML file.  However, this is rarely practical. The standard specifies that an ODF file can also be stored as a collection of several subdocuments.  The latter is the most common implementation.&lt;br /&gt;
&lt;br /&gt;
A packaged ODF file typically contains at least seven files and two directories archived into a modified ZIP file. The structure of the basic package is as follows:&lt;br /&gt;
&lt;br /&gt;
 |-- META-INF&lt;br /&gt;
 |   `-- manifest.xml&lt;br /&gt;
 |-- Thumbnails&lt;br /&gt;
 |   `-- thumbnail.png&lt;br /&gt;
 |-- content.xml&lt;br /&gt;
 |-- meta.xml&lt;br /&gt;
 |-- mimetype&lt;br /&gt;
 |-- settings.xml&lt;br /&gt;
 `-- styles.xml&lt;br /&gt;
&lt;br /&gt;
Again, this represents a minimal ODF file. The structure can become much more complicated as directories can be added that contain embedded images, macros, and the like. &lt;br /&gt;
&lt;br /&gt;
An important caveat in the structure of the ZIP file is that the first file must be the &amp;quot;mimetype&amp;quot; file and it must not be compressed. [http://www.jejik.com/articles/2010/03/how_to_correctly_create_odf_documents_using_zip/]  Because a ZIP local file header is 30 bytes long, the string &amp;quot;mimetype&amp;quot; should then appear at byte offset 30 and the actual MIME type at byte offset 38 (provided the entry has no extra field).  This adaptation makes it possible for operating systems to determine the MIME type of a file without relying on the file extension.&lt;br /&gt;
&lt;br /&gt;
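The layout described above can be checked from the first bytes of a file. The following Python sketch is illustrative only: the names '''check_odf_mimetype''' and '''build_sample''' are hypothetical, and the sample package is built in memory rather than taken from a real document. It reads the ZIP local file header and confirms that the first entry is named mimetype, is stored uncompressed, and has no extra field.&lt;br /&gt;
&lt;br /&gt;
```python
import io
import zipfile

# MIME type of an ODF text document, used here only for the in-memory sample.
ODT_MIME = b"application/vnd.oasis.opendocument.text"


def check_odf_mimetype(data):
    """Return the MIME type stored at byte offset 38, or None if the layout differs."""
    # A ZIP local file header starts with the signature PK 03 04 and is 30 bytes long.
    if data[:4] != b"PK\x03\x04":
        return None
    method = int.from_bytes(data[8:10], "little")      # 0 means stored (no compression)
    size = int.from_bytes(data[18:22], "little")       # compressed size of the entry
    name_len = int.from_bytes(data[26:28], "little")   # length of the file name
    extra_len = int.from_bytes(data[28:30], "little")  # length of the extra field
    name = data[30:30 + name_len]
    if name != b"mimetype" or method != 0 or extra_len != 0:
        return None
    # With an 8-byte name and no extra field, the content begins at offset 38.
    return data[38:38 + size].decode("ascii")


def build_sample():
    """Build a tiny package in memory with mimetype as the first, stored entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", ODT_MIME, compress_type=zipfile.ZIP_STORED)
        zf.writestr("content.xml", b"placeholder", compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


if __name__ == "__main__":
    print(check_odf_mimetype(build_sample()))
```
&lt;br /&gt;
Passing the raw bytes of a real ODF file to '''check_odf_mimetype''' should work the same way; a result of None would suggest that the package was re-zipped by a generic tool or is not an ODF package at all.&lt;br /&gt;
&lt;br /&gt;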
==Main Sub-Files==&lt;br /&gt;
&lt;br /&gt;
The '''manifest.xml''' file contains a list of all files in the package, as well as their media type, path, and any information required for decryption.  The '''content.xml''' file contains the content of the document (e.g., the text in a word processing document), while the '''styles.xml''' file contains the information on how the content is to be styled.  The '''settings.xml''' file contains application-specific settings, such as the window size or printer information.&lt;br /&gt;
&lt;br /&gt;
==Metadata==&lt;br /&gt;
&lt;br /&gt;
Because ODF files are basically ZIP files, they contain the same meta-information about each sub-file as a standard ZIP archive, namely the name and size of each sub-file, compression information, and the last-modification date and time of each sub-file.  In addition, much metadata is contained within the XML files themselves.  The '''meta.xml''' file contains metadata for the entire document.  The types of metadata contained in the file can comprise pre-defined metadata, user-defined metadata, as well as custom metadata:&lt;br /&gt;
&lt;br /&gt;
* which version of ODF is used by the document&lt;br /&gt;
* the document generator, that is, the user-agent software that generated or last modified the ODF document. This string is similar to the HTTP user agent string as described in RFC 2616. It can contain the name and version of the software as well as the name of the operating system.&lt;br /&gt;
* document title&lt;br /&gt;
* document description&lt;br /&gt;
* document subject&lt;br /&gt;
* keywords&lt;br /&gt;
* initial creator&lt;br /&gt;
* creator (the person who last modified the document)&lt;br /&gt;
* printed by&lt;br /&gt;
* creation date/time&lt;br /&gt;
* modification date/time&lt;br /&gt;
* print date/time&lt;br /&gt;
* document template, the path of the document template if one was used to generate the current document&lt;br /&gt;
* automatic reload&lt;br /&gt;
* hyperlink behavior&lt;br /&gt;
* language&lt;br /&gt;
* number of editing cycles stored as a string. The number is incremented each time the document is saved.&lt;br /&gt;
* editing duration -- amount of time spent editing the document. The specification is not clear as to how this value is to be calculated.&lt;br /&gt;
* document statistics -- this field varies by file type, but includes information such as page count, object count, paragraph count, cell count, etc.&lt;br /&gt;
* user-defined metadata -- allowable types: string, integer, float, boolean&lt;br /&gt;
&lt;br /&gt;
Conforming applications are permitted to store non-standard fields in this file, and the software should preserve any custom fields.&lt;br /&gt;
&lt;br /&gt;
Not all metadata is stored in the meta.xml file.  The content.xml file can contain meta-information such as annotations and tracked changes, as well as the creator and the creation date and time of those annotations or tracked changes.&lt;br /&gt;
&lt;br /&gt;
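As a rough illustration of the two layers of metadata described above, the following Python sketch reads a handful of fields from meta.xml together with the per-entry timestamps kept by the ZIP container. The function names, the choice of fields, and the in-memory sample are assumptions made for this example; the element names and namespace URIs are those used by ODF 1.x, and real documents may carry more or fewer fields.&lt;br /&gt;
&lt;br /&gt;
```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace prefixes used by meta.xml in ODF 1.x packages.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A few of the pre-defined fields; the keys on the left are labels chosen for this sketch.
FIELDS = {
    "generator": "meta:generator",
    "title": "dc:title",
    "initial_creator": "meta:initial-creator",
    "creator": "dc:creator",
    "creation_date": "meta:creation-date",
    "modification_date": "dc:date",
    "editing_cycles": "meta:editing-cycles",
    "editing_duration": "meta:editing-duration",
}


def read_odf_metadata(source):
    """Return (fields, zip_times) for an ODF package given a path or file object."""
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("meta.xml"))
        # Each ZIP entry carries its own last-modification timestamp.
        zip_times = {info.filename: info.date_time for info in zf.infolist()}
    fields = {}
    for key, path in FIELDS.items():
        node = root.find("office:meta/" + path, NS)
        if node is not None:
            fields[key] = node.text
    return fields, zip_times


def build_sample():
    """Build a minimal package in memory so the sketch can run without a real file."""
    def qname(prefix, tag):
        return "{" + NS[prefix] + "}" + tag

    root = ET.Element(qname("office", "document-meta"))
    meta = ET.SubElement(root, qname("office", "meta"))
    ET.SubElement(meta, qname("dc", "title")).text = "Example"
    ET.SubElement(meta, qname("meta", "editing-cycles")).text = "3"
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        zf.writestr("meta.xml", ET.tostring(root))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    fields, zip_times = read_odf_metadata(build_sample())
    print(fields)
    print(zip_times)
```
&lt;br /&gt;
The same function accepts a path to a document on disk. Annotations and tracked changes live in content.xml and are not covered by this sketch.&lt;br /&gt;
&lt;br /&gt;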
=External Links=&lt;br /&gt;
[http://docs.oasis-open.org/office/v1.1/OS/OpenDocument-v1.1-html/OpenDocument-v1.1.html ODF specification]&lt;br /&gt;
&lt;br /&gt;
[[Category:File Formats]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Open_Document_Format</id>
		<title>Open Document Format</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Open_Document_Format"/>
				<updated>2010-04-14T00:36:35Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: Expanded information; added sections: file structure, metadata&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;'''Open Document Format''' (ODF) is an open, XML-based file format standard for word processing documents, spreadsheets, charts, and presentations. The specification was originally developed by Sun Microsystems, but has been standardized by the Organization for the Advancement of Structured Information Standards (OASIS). ODF version 1.0 has been standardized as ISO/IEC 26300:2006. ODF is the primary format for the OpenOffice.org office suite.&lt;br /&gt;
&lt;br /&gt;
=File Extensions=&lt;br /&gt;
The main file extensions for ODF documents are&lt;br /&gt;
* .odt for word processing documents&lt;br /&gt;
* .ods for spreadsheet documents&lt;br /&gt;
* .odp for presentation documents&lt;br /&gt;
* .odb for database documents&lt;br /&gt;
* .odg for graphical documents&lt;br /&gt;
* .odf for mathematical formulae&lt;br /&gt;
&lt;br /&gt;
ODF also supports template files for each type of document.  The 'd' in the file extension is replaced by a 't' for template files (for example, .ott for a word processing template).&lt;br /&gt;
&lt;br /&gt;
=File Structure=&lt;br /&gt;
An ODF document can be as simple as a single XML file.  However, this is rarely practical. The standard specifies that an ODF file can also be stored as a collection of several subdocuments.  The latter is the most common implementation.&lt;br /&gt;
&lt;br /&gt;
A packaged ODF file typically contains at least seven files and two directories archived into a modified ZIP file. The structure of the basic package is as follows:&lt;br /&gt;
&lt;br /&gt;
 |-- META-INF&lt;br /&gt;
 |   `-- manifest.xml&lt;br /&gt;
 |-- Thumbnails&lt;br /&gt;
 |   `-- thumbnail.png&lt;br /&gt;
 |-- content.xml&lt;br /&gt;
 |-- meta.xml&lt;br /&gt;
 |-- mimetype&lt;br /&gt;
 |-- settings.xml&lt;br /&gt;
 `-- styles.xml&lt;br /&gt;
&lt;br /&gt;
Again, this represents a minimal ODF file. The structure can become much more complicated as directories can be added that contain embedded images, macros, and the like. &lt;br /&gt;
&lt;br /&gt;
An important caveat in the structure of the ZIP file is that the first file must be the &amp;quot;mimetype&amp;quot; file and it must not be compressed. [http://www.jejik.com/articles/2010/03/how_to_correctly_create_odf_documents_using_zip/]  Because a ZIP local file header is 30 bytes long, the string &amp;quot;mimetype&amp;quot; should then appear at byte offset 30 and the actual MIME type at byte offset 38 (provided the entry has no extra field).  This adaptation makes it possible for operating systems to determine the MIME type of a file without relying on the file extension.&lt;br /&gt;
&lt;br /&gt;
==Main Sub-Files==&lt;br /&gt;
&lt;br /&gt;
The '''manifest.xml''' file contains a list of all files in the package, as well as their media type, path, and any information required for decryption.  The '''content.xml''' file contains the content of the document (e.g., the text in a word processing document), while the '''styles.xml''' file contains the information on how the content is to be styled.  The '''settings.xml''' file contains application-specific settings, such as the window size or printer information.&lt;br /&gt;
&lt;br /&gt;
==Metadata==&lt;br /&gt;
&lt;br /&gt;
Because ODF files are basically ZIP files, they contain the same meta-information about each sub-file as a standard ZIP archive, namely the name and size of each sub-file, compression information, and the last-modification date and time of each sub-file.  In addition, much metadata is contained within the XML files themselves.  The '''meta.xml''' file contains metadata for the entire document.  The types of metadata contained in the file can comprise pre-defined metadata, user-defined metadata, as well as custom metadata:&lt;br /&gt;
&lt;br /&gt;
* which version of ODF is used by the document&lt;br /&gt;
* the document generator, that is, the user-agent software that generated or last modified the ODF document. This string is similar to the HTTP user agent string as described in RFC 2616. It can contain the name and version of the software as well as the name of the operating system.&lt;br /&gt;
* document title&lt;br /&gt;
* document description&lt;br /&gt;
* document subject&lt;br /&gt;
* keywords&lt;br /&gt;
* initial creator&lt;br /&gt;
* creator (the person who last modified the document)&lt;br /&gt;
* printed by&lt;br /&gt;
* creation date/time&lt;br /&gt;
* modification date/time&lt;br /&gt;
* print date/time&lt;br /&gt;
* document template, the path of the document template if one was used to generate the current document&lt;br /&gt;
* automatic reload&lt;br /&gt;
* hyperlink behavior&lt;br /&gt;
* language&lt;br /&gt;
* number of editing cycles stored as a string. The number is incremented each time the document is saved.&lt;br /&gt;
* editing duration -- amount of time spent editing the document. The specification is not clear as to how this value is to be calculated.&lt;br /&gt;
* document statistics -- this field varies by file type, but includes information such as page count, object count, paragraph count, cell count, etc.&lt;br /&gt;
* user-defined metadata -- allowable types: string, integer, float, boolean&lt;br /&gt;
&lt;br /&gt;
Conforming applications are permitted to store non-standard fields in this file, and the software should preserve any custom fields.&lt;br /&gt;
&lt;br /&gt;
=External Links=&lt;br /&gt;
[http://docs.oasis-open.org/office/v1.1/OS/OpenDocument-v1.1-html/OpenDocument-v1.1.html ODF specification]&lt;br /&gt;
&lt;br /&gt;
[[Category:File Formats]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Open_Document_Format</id>
		<title>Open Document Format</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Open_Document_Format"/>
				<updated>2010-04-13T22:45:04Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: Created ODF page, more to come.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;'''Open Document Format''' (ODF) is an open, XML-based file format standard for word processing documents, spreadsheets, charts, and presentations. The specification was originally developed by Sun Microsystems, but has been standardized by the Organization for the Advancement of Structured Information Standards (OASIS). ODF version 1.0 has been standardized as ISO/IEC 26300:2006. ODF is the primary format for the OpenOffice.org office suite.&lt;br /&gt;
&lt;br /&gt;
=File Extensions=&lt;br /&gt;
The main file extensions for ODF documents are&lt;br /&gt;
* .odt for word processing documents&lt;br /&gt;
* .ods for spreadsheet documents&lt;br /&gt;
* .odp for presentation documents&lt;br /&gt;
* .odb for database documents&lt;br /&gt;
* .odg for graphical documents&lt;br /&gt;
* .odf for mathematical formulae&lt;br /&gt;
&lt;br /&gt;
ODF also supports template files for each type of document.  The 'd' in the file extension is replaced by a 't' for template files (for example, .ott for a word processing template).&lt;br /&gt;
&lt;br /&gt;
=File Structure=&lt;br /&gt;
An ODF document can be as simple as a single XML file.  However, this is rarely practical. The standard specifies that an ODF file can also be stored as a collection of several subdocuments.  The latter is the most common implementation.&lt;br /&gt;
&lt;br /&gt;
[[Category:File Formats]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Word_Document_(DOCX)</id>
		<title>Word Document (DOCX)</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Word_Document_(DOCX)"/>
				<updated>2010-04-13T21:36:46Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: /* Relationship to OOXML */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;DOCX is the word processing file format used by Microsoft Word 2007 and later. &lt;br /&gt;
&lt;br /&gt;
DOCX should not be confused with [[DOC]], the format used by earlier versions of Microsoft Word.&lt;br /&gt;
&lt;br /&gt;
= Container Format =&lt;br /&gt;
&lt;br /&gt;
A DOCX file is a [[ZIP archive]] containing [[XML]] files and binary data. The content can be analysed without modifying the file by unzipping it (e.g. with WinZip) and examining the contents of the archive.&lt;br /&gt;
&lt;br /&gt;
The file _rels/.rels contains information about the structure of the document.  It contains paths to the metadata as well as to the main XML document that holds the content of the document itself.&lt;br /&gt;
&lt;br /&gt;
Metadata is usually stored in the docProps folder.  Two or more XML files are stored inside that folder: app.xml, which stores metadata supplied by the Word application itself, and core.xml, which stores metadata about the document, such as the author name, the last time it was printed, etc.&lt;br /&gt;
&lt;br /&gt;
Another folder contains the actual content of the document; for a Word (.docx) document the folder is named word.  An XML file called document.xml is the main document, containing most of the content of the document itself.&lt;br /&gt;
&lt;br /&gt;
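The lookup described above, from _rels/.rels to the metadata in docProps, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names and the in-memory sample are hypothetical, and only the core properties relationship is followed. The relationship namespace and type URI are the ones defined for Open Packaging Conventions packages.&lt;br /&gt;
&lt;br /&gt;
```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the package relationships file (_rels/.rels).
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Tail of the relationship type that points at the core properties part.
CORE_SUFFIX = "/metadata/core-properties"


def read_core_properties(source):
    """Follow _rels/.rels to the core properties part and return its fields."""
    with zipfile.ZipFile(source) as zf:
        rels = ET.fromstring(zf.read("_rels/.rels"))
        target = None
        for rel in rels.findall("{" + REL_NS + "}Relationship"):
            if rel.get("Type", "").endswith(CORE_SUFFIX):
                # Targets may be written with a leading slash; ZIP names have none.
                target = rel.get("Target").lstrip("/")
        if target is None:
            return {}
        core = ET.fromstring(zf.read(target))
    # Strip the namespace from each tag, e.g. {uri}creator becomes creator.
    return {child.tag.split("}")[-1]: child.text for child in core}


def build_sample():
    """Build a minimal package in memory so the sketch can run without a real file."""
    dc = "{http://purl.org/dc/elements/1.1/}"
    cp = "{http://schemas.openxmlformats.org/package/2006/metadata/core-properties}"
    core = ET.Element(cp + "coreProperties")
    ET.SubElement(core, dc + "creator").text = "Alice"
    ET.SubElement(core, cp + "lastPrinted").text = "2010-04-13T20:58:00Z"
    rels = ET.Element("{" + REL_NS + "}Relationships")
    ET.SubElement(rels, "{" + REL_NS + "}Relationship", {
        "Id": "rId1",
        "Type": REL_NS + CORE_SUFFIX,
        "Target": "docProps/core.xml",
    })
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("_rels/.rels", ET.tostring(rels))
        zf.writestr("docProps/core.xml", ET.tostring(core))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(read_core_properties(build_sample()))
```
&lt;br /&gt;
Because the archive is only opened for reading, the file under examination is not modified. A path to a .docx file on disk can be passed in place of the in-memory sample.&lt;br /&gt;
&lt;br /&gt;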
= Relationship to OOXML =&lt;br /&gt;
&lt;br /&gt;
Office Open XML (OOXML) is an open XML standard developed by Microsoft for word processing documents, spreadsheets, presentations and charts. The OOXML standard was submitted to the ISO for approval.  After initially rejecting it over technical concerns, the ISO approved a modified version as ISO/IEC 29500:2008. Microsoft intended to use the OOXML standard for its Office suite. However, Office does not support the standard that the ISO approved; it only supports the version that was originally rejected by the ISO [http://arstechnica.com/microsoft/news/2010/04/iso-ooxml-convener-microsofts-format-heading-for-failure.ars]. As of Office 2010, Microsoft has still not brought its software into compliance with the standard.&lt;br /&gt;
&lt;br /&gt;
For most purposes OOXML may be considered a subset of DOCX (DOCX contains additional features, like OLE serialization).&lt;br /&gt;
&lt;br /&gt;
Documentation on OOXML may provide a guide to analysing a DOCX file.&lt;br /&gt;
&lt;br /&gt;
= External Links =&lt;br /&gt;
&lt;br /&gt;
* [http://msdn.microsoft.com/en-us/library/aa338205.aspx Information from Microsoft about the structure of OpenXML documents]&lt;br /&gt;
&lt;br /&gt;
* [http://www.simson.net/clips/academic/2009.IEEE.DOCX.pdf The new XML Office Document Files: Implications For Forensics], [[Simson L. Garfinkel]] and James Migletz&lt;br /&gt;
&lt;br /&gt;
* [http://blog.kiddaland.net/2009/07/antiword-for-office-2007/ Perl script that displays the content of a Docx document, similar to Antiword]&lt;br /&gt;
&lt;br /&gt;
* [http://blog.kiddaland.net/2009/06/office-2007-metadata/ Perl script that displays metadata information that is extracted from an OpenXML document] &lt;br /&gt;
[[Category:File Formats]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Word_Document_(DOCX)</id>
		<title>Word Document (DOCX)</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Word_Document_(DOCX)"/>
				<updated>2010-04-13T21:20:03Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: /* Relationship to OOXML */  expanded information about OOXML&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;DOCX is the word processing file format used by Microsoft Word 2007 and later. &lt;br /&gt;
&lt;br /&gt;
DOCX should not be confused with [[DOC]], the format used by earlier versions of Microsoft Word.&lt;br /&gt;
&lt;br /&gt;
= Container Format =&lt;br /&gt;
&lt;br /&gt;
A DOCX file is a [[ZIP archive]] containing [[XML]] files and binary data. The content can be analysed without modifying the file by unzipping it (e.g. with WinZip) and examining the contents of the archive.&lt;br /&gt;
&lt;br /&gt;
The file _rels/.rels contains information about the structure of the document.  It contains paths to the metadata as well as to the main XML document that holds the content of the document itself.&lt;br /&gt;
&lt;br /&gt;
Metadata is usually stored in the docProps folder.  Two or more XML files are stored inside that folder: app.xml, which stores metadata supplied by the Word application itself, and core.xml, which stores metadata about the document, such as the author name, the last time it was printed, etc.&lt;br /&gt;
&lt;br /&gt;
Another folder contains the actual content of the document; for a Word (.docx) document the folder is named word.  An XML file called document.xml is the main document, containing most of the content of the document itself.&lt;br /&gt;
&lt;br /&gt;
= Relationship to OOXML =&lt;br /&gt;
&lt;br /&gt;
Office Open XML (OOXML) is an open XML standard developed by Microsoft for word processing documents, spreadsheets, presentations and charts. The OOXML standard was submitted to the ISO for approval.  After initially rejecting it over technical concerns, the ISO approved a modified version as ISO/IEC 29500:2008. Microsoft intended to use the OOXML standard for its Office suite. However, Office does not support the standard that the ISO approved; it only supports the version that was originally [http://arstechnica.com/microsoft/news/2010/04/iso-ooxml-convener-microsofts-format-heading-for-failure.ars rejected] by the ISO.  As of Office 2010, Microsoft has still not brought its software into compliance with the standard.&lt;br /&gt;
&lt;br /&gt;
For most purposes OOXML may be considered a subset of DOCX (DOCX contains additional features, like OLE serialization).&lt;br /&gt;
&lt;br /&gt;
Documentation on OOXML may provide a guide to analysing a DOCX file.&lt;br /&gt;
&lt;br /&gt;
= External Links =&lt;br /&gt;
&lt;br /&gt;
* [http://msdn.microsoft.com/en-us/library/aa338205.aspx Information from Microsoft about the structure of OpenXML documents]&lt;br /&gt;
&lt;br /&gt;
* [http://www.simson.net/clips/academic/2009.IEEE.DOCX.pdf The new XML Office Document Files: Implications For Forensics], [[Simson L. Garfinkel]] and James Migletz&lt;br /&gt;
&lt;br /&gt;
* [http://blog.kiddaland.net/2009/07/antiword-for-office-2007/ Perl script that displays the content of a Docx document, similar to Antiword]&lt;br /&gt;
&lt;br /&gt;
* [http://blog.kiddaland.net/2009/06/office-2007-metadata/ Perl script that displays metadata information that is extracted from an OpenXML document] &lt;br /&gt;
[[Category:File Formats]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Word_Document_(DOCX)</id>
		<title>Word Document (DOCX)</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Word_Document_(DOCX)"/>
				<updated>2010-04-13T20:58:56Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: /* Container Format */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;DOCX is the word processing file format used by Microsoft Word 2007 and later. &lt;br /&gt;
&lt;br /&gt;
DOCX should not be confused with [[DOC]], the format used by earlier versions of Microsoft Word.&lt;br /&gt;
&lt;br /&gt;
= Container Format =&lt;br /&gt;
&lt;br /&gt;
A DOCX file is a [[ZIP archive]] containing [[XML]] files and binary data. The content can be analysed without modifying the file by unzipping it (e.g. with WinZip) and examining the contents of the archive.&lt;br /&gt;
&lt;br /&gt;
The file _rels/.rels contains information about the structure of the document.  It contains paths to the metadata as well as to the main XML document that holds the content of the document itself.&lt;br /&gt;
&lt;br /&gt;
Metadata is usually stored in the docProps folder.  Two or more XML files are stored inside that folder: app.xml, which stores metadata supplied by the Word application itself, and core.xml, which stores metadata about the document, such as the author name, the last time it was printed, etc.&lt;br /&gt;
&lt;br /&gt;
Another folder contains the actual content of the document; for a Word (.docx) document the folder is named word.  An XML file called document.xml is the main document, containing most of the content of the document itself.&lt;br /&gt;
&lt;br /&gt;
= Relationship to OOXML =&lt;br /&gt;
&lt;br /&gt;
For most purposes OOXML may be considered a subset of DOCX (DOCX contains additional features, like OLE serialization).&lt;br /&gt;
&lt;br /&gt;
Documentation on OOXML may provide a guide to analysing a DOCX file.&lt;br /&gt;
&lt;br /&gt;
= External Links =&lt;br /&gt;
&lt;br /&gt;
* [http://msdn.microsoft.com/en-us/library/aa338205.aspx Information from Microsoft about the structure of OpenXML documents]&lt;br /&gt;
&lt;br /&gt;
* [http://www.simson.net/clips/academic/2009.IEEE.DOCX.pdf The new XML Office Document Files: Implications For Forensics], [[Simson L. Garfinkel]] and James Migletz&lt;br /&gt;
&lt;br /&gt;
* [http://blog.kiddaland.net/2009/07/antiword-for-office-2007/ Perl script that displays the content of a Docx document, similar to Antiword]&lt;br /&gt;
&lt;br /&gt;
* [http://blog.kiddaland.net/2009/06/office-2007-metadata/ Perl script that displays metadata information that is extracted from an OpenXML document] &lt;br /&gt;
[[Category:File Formats]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Metadata</id>
		<title>Metadata</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Metadata"/>
				<updated>2010-04-13T17:33:03Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;'''Metadata''' is data about data. Metadata plays a number of important roles in [[computer forensics]]:&lt;br /&gt;
* It can provide corroborating information about the document data itself.&lt;br /&gt;
* It can reveal information that someone tried to hide, delete, or obscure.&lt;br /&gt;
* It can be used to automatically correlate documents from different sources.&lt;br /&gt;
&lt;br /&gt;
Since metadata is fundamentally data, it suffers from the same data quality and pedigree issues as any other form of data. Nevertheless, because metadata isn't generally visible unless you use a special tool, more skill is required to alter or otherwise manipulate it.&lt;br /&gt;
&lt;br /&gt;
==Kinds of Metadata==&lt;br /&gt;
Some kinds of metadata that are interesting in computer forensics:&lt;br /&gt;
* [[File system]] metadata (e.g. [[MAC times]], [[access control lists]], etc.)&lt;br /&gt;
* Digital image metadata. Although information such as the image size and number of colors is technically metadata, [[JPEG]] and other file formats store additional data about the photo or the device that acquired it.&lt;br /&gt;
* Document metadata, such as the creator of a document, its last print time, etc.&lt;br /&gt;
&lt;br /&gt;
==File types that support metadata and extraction tools==&lt;br /&gt;
&lt;br /&gt;
Below are some common data and metadata formats, the files in which they are found, and a collection of tools that can be used to extract information.&lt;br /&gt;
&lt;br /&gt;
; [[EXIF]] ([[JPEG]] and [[TIFF]] image files; Music Files)&lt;br /&gt;
: The [[Exchangeable Image File]] format describes a format for a block of data that can be embedded into JPEG and TIFF image files, as well as [[RIFF WAVE]] audio files. Information includes date and time information, camera settings, location information, textual descriptions, and copyright information.&lt;br /&gt;
:* [http://pel.sourceforge.net/ PEL: PHP Exif Library]&lt;br /&gt;
:* [http://libexif.sourceforge.net/ LibExif] (C)&lt;br /&gt;
:* [http://www.drewnoakes.com/code/exif/ Metadata extraction in Java]&lt;br /&gt;
:* [http://digital-assembly.com/products/adroit-photo-forensics/ Adroit Photo Forensics]&lt;br /&gt;
&lt;br /&gt;
; [[ID3]] ([[MP3]] files)&lt;br /&gt;
: [[ID3v1]] is a 128-byte block stored at the end of an MP3 file. After a 3-byte &amp;quot;TAG&amp;quot; identifier, its fixed layout allows 30 bytes each for song title, artist and album, 4 bytes for year, 30 bytes for comment, and 1 byte for genre. [[ID3v1.1]] adds a track number. [[ID3v2]] is a general container structure, usually placed at the start of the file. For more information, see [http://www.id3.org/].&lt;br /&gt;
:* [http://id3lib.sourceforge.net/ id3lib], a widely-used open source C/C++ ID3 implementation.&lt;br /&gt;
:* [http://www.vdheide.de/projects.html Java library MP3]&lt;br /&gt;
:* [http://search.cpan.org/dist/MP3-Info/ MP3::Info] (Perl)&lt;br /&gt;
:* [http://search.cpan.org/dist/MPEG-ID3v2Tag/ MPEG::ID3v2Tag] (Perl)&lt;br /&gt;
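&lt;br /&gt;
The fixed ID3v1 layout is simple enough to decode by hand. The following Python sketch assumes the tag occupies the last 128 bytes of the file and begins with the three bytes TAG; it also assumes the common ID3v1.1 convention that a zero byte followed by a non-zero byte at the end of the comment field holds the track number. The function names are illustrative only.&lt;br /&gt;

```python
import struct

ID3V1_SIZE = 128  # fixed size of an ID3v1 tag in bytes


def parse_id3v1(data):
    """Parse an ID3v1 tag from the end of `data` (bytes).

    Returns a dict of fields, or None if no tag is present.
    """
    tag = data[-ID3V1_SIZE:]
    if len(tag) != ID3V1_SIZE or not tag.startswith(b"TAG"):
        return None
    # 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes
    ident, title, artist, album, year, comment, genre = struct.unpack(
        "3s30s30s30s4s30sB", tag)

    def text(raw):
        # Fields are padded with NUL bytes or spaces; Latin-1 always decodes.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    fields = {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,
    }
    # ID3v1.1: zero byte at comment[28], track number at comment[29].
    if comment[28] == 0 and comment[29] != 0:
        fields["track"] = comment[29]
    return fields


def read_id3v1(path):
    """Read only the tail of the file at `path` and parse it."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)  # 2 means: seek relative to the end of the file
        size = handle.tell()
        handle.seek(max(0, size - ID3V1_SIZE))
        return parse_id3v1(handle.read())
```

This sketch does not handle ID3v2, which is a separate container with its own header and frame structure.&lt;br /&gt;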
&lt;br /&gt;
; [[Microsoft]] [[OLE Compound File]]&lt;br /&gt;
: Microsoft Office document files contain a large amount of metadata. The files are created as OLE Compound Files, and the metadata is mainly stored in the so-called property set streams. Here are some tools for processing them:&lt;br /&gt;
:* [http://jakarta.apache.org/poi/index.html Jakarta POI] Open Source implementation in Java.&lt;br /&gt;
:* [http://www.payneconsulting.com/ Payne Consulting] Metadata Analysis and cleanup.&lt;br /&gt;
:* [http://www.inforenz.com/software/forager.html Inforenz Forager] Inforenz Forager &lt;br /&gt;
&lt;br /&gt;
; [[TIFF]]&lt;br /&gt;
: The [[Tagged Image File Format]] allows one or more images to be bundled in a single file. Multiple [[compression]] formats are supported. [[EXIF]] data can be stored inside TIFFs.&lt;br /&gt;
:* [http://www.remotesensing.org/libtiff/ LibTIFF]&lt;br /&gt;
:* [http://www.awaresystems.be/imaging/tiff/faq.html TIFF FAQ]&lt;br /&gt;
&lt;br /&gt;
=See Also=&lt;br /&gt;
[[Document_Metadata_Extraction]]&lt;br /&gt;
&lt;br /&gt;
=External links=&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Metadata Wikipedia: Metadata]&lt;br /&gt;
* [http://theses.nps.navy.mil/08Jun_Migletz.pdf Automated Metadata Extraction], James Migletz, Master's Thesis, Naval Postgraduate School, June 2008&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Document_Metadata_Extraction</id>
		<title>Document Metadata Extraction</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Document_Metadata_Extraction"/>
				<updated>2010-04-13T17:23:52Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: /* Images */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Here are tools that will extract metadata from document files.&lt;br /&gt;
&lt;br /&gt;
=Office Files=&lt;br /&gt;
&lt;br /&gt;
; [[antiword]]&lt;br /&gt;
: http://www.winfield.demon.nl/&lt;br /&gt;
&lt;br /&gt;
; [[catdoc]]&lt;br /&gt;
: http://www.45.free.net/~vitus/software/catdoc/&lt;br /&gt;
&lt;br /&gt;
; [[laola]]&lt;br /&gt;
: http://user.cs.tu-berlin.de/~schwartz/pmh/index.html&lt;br /&gt;
&lt;br /&gt;
; [[word2x]]&lt;br /&gt;
: http://word2x.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[wvWare]]&lt;br /&gt;
: http://wvware.sourceforge.net/&lt;br /&gt;
: Extracts metadata from various [[Microsoft]] Word files ([[doc]]). Can also convert doc files to other formats such as HTML or plain text.&lt;br /&gt;
&lt;br /&gt;
; [[Outside In]]&lt;br /&gt;
: http://www.oracle.com/technology/products/content-management/oit/oit_all.html&lt;br /&gt;
: Originally developed by Stellant, supports hundreds of file types.&lt;br /&gt;
&lt;br /&gt;
; [[FI Tools]]&lt;br /&gt;
: http://forensicinnovations.com/&lt;br /&gt;
: More than 100 file types.&lt;br /&gt;
&lt;br /&gt;
=PDF Files=&lt;br /&gt;
&lt;br /&gt;
; [[xpdf]]&lt;br /&gt;
: http://www.foolabs.com/xpdf/&lt;br /&gt;
: [[pdfinfo]] (part of the [[xpdf]] package) displays some metadata of [[PDF]] files.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(See [[PDF]])&lt;br /&gt;
&lt;br /&gt;
=Images=&lt;br /&gt;
&lt;br /&gt;
; [[jhead]]&lt;br /&gt;
: http://www.sentex.net/~mwandel/jhead/&lt;br /&gt;
: Displays or modifies [[Exif]] data in [[JPEG]] files.&lt;br /&gt;
&lt;br /&gt;
; [[vinetto]]&lt;br /&gt;
: http://vinetto.sourceforge.net/&lt;br /&gt;
: Examines [[Thumbs.db]] files.&lt;br /&gt;
&lt;br /&gt;
;[[libexif]]&lt;br /&gt;
: http://sourceforge.net/projects/libexif EXIF tag Parsing Library&lt;br /&gt;
&lt;br /&gt;
; [[Adroit Photo Forensics]]&lt;br /&gt;
: http://digital-assembly.com/products/adroit-photo-forensics/&lt;br /&gt;
: Displays metadata and uses date and camera metadata for grouping, timelines, etc.&lt;br /&gt;
&lt;br /&gt;
; Exif Viewer&lt;br /&gt;
: http://araskin.webs.com/exif/exif.html&lt;br /&gt;
: Add-on for Firefox and Thunderbird that displays various [[JPEG]]/JPG metadata in local and remote images.&lt;br /&gt;
&lt;br /&gt;
; exiftags&lt;br /&gt;
: http://johnst.org/sw/exiftags/&lt;br /&gt;
: Open source utility to parse and edit [[exif]] data in [[JPEG]] images. Found in many Debian-based distributions.&lt;br /&gt;
&lt;br /&gt;
; exifprobe&lt;br /&gt;
: http://www.virtual-cafe.com/~dhh/tools.d/exifprobe.d/exifprobe.html&lt;br /&gt;
: Open source utility that reads [[exif]] data in [[JPEG]] and some &amp;quot;RAW&amp;quot; image formats. Found in many Debian based distributions.&lt;br /&gt;
&lt;br /&gt;
; Exiv2&lt;br /&gt;
: http://www.exiv2.org&lt;br /&gt;
: Open source C++ library and command line tool for reading and writing metadata in various image formats. Found in almost every GNU/Linux distribution.&lt;br /&gt;
&lt;br /&gt;
; pngtools&lt;br /&gt;
: http://www.stillhq.com/pngtools/&lt;br /&gt;
: Open source suite of commands (pnginfo, pngchunks, pngchunksdesc) that reads metadata found in [[PNG]] files. Found in many Debian based distributions.&lt;br /&gt;
&lt;br /&gt;
; pngmeta&lt;br /&gt;
: http://sourceforge.net/projects/pmt/files/&lt;br /&gt;
: Open source command line tool that extracts metadata from [[PNG]] images. Found in many Debian based distributions.&lt;br /&gt;
&lt;br /&gt;
=General=&lt;br /&gt;
These general-purpose programs frequently work when the special-purpose programs fail, but they generally provide less detailed information.&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Extraction Tool]]&lt;br /&gt;
: &amp;quot;Developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats like PDF documents, image files, sound files Microsoft office documents, and many others.&amp;quot;&lt;br /&gt;
: http://meta-extractor.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Assistant]]&lt;br /&gt;
: http://www.payneconsulting.com/products/metadataent/&lt;br /&gt;
&lt;br /&gt;
; [[hachoir|hachoir-metadata]]&lt;br /&gt;
: Extraction tool, part of '''[[Hachoir]]''' project&lt;br /&gt;
&lt;br /&gt;
; [[file]]&lt;br /&gt;
: The UNIX '''file''' program can extract some metadata&lt;br /&gt;
&lt;br /&gt;
; [[GNU libextractor]]&lt;br /&gt;
: http://gnunet.org/libextractor/ The libextractor library is a pluggable system for extracting metadata.&lt;br /&gt;
&lt;br /&gt;
; [[Directory Lister Pro]]&lt;br /&gt;
: Directory Lister Pro is a Windows tool that creates listings of files from selected directories on hard disks, CD-ROMs, DVD-ROMs, floppies, USB storage devices and network shares. Listings can be produced in HTML, text or CSV format (for easy import into Excel). A listing can contain standard file information such as file name, extension, type, owner and creation date. For forensic analysis, file metadata can also be extracted from various formats: 1) executable files (EXE, DLL, OCX): file version, description, company, product name; 2) multimedia files (MP3, AVI, WAV, JPG, GIF, BMP, MKV, MKA, MPEG): track, title, artist, album, genre, video format, bits per pixel, frames per second, audio format, bits per channel; 3) Microsoft Office files (DOC, DOCX, XLS, XLSX, PPT, PPTX): document title, author, keywords, word count. For each file and folder it is also possible to obtain its CRC32, MD5, SHA-1 and Whirlpool hash. Many options are available to customize the look of the output, and filters on file name, date, size or attributes can limit which files are listed.&lt;br /&gt;
: http://www.krksoft.com&lt;br /&gt;
&lt;br /&gt;
[[Category:Tools]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Portable_Network_Graphics_(PNG)</id>
		<title>Portable Network Graphics (PNG)</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Portable_Network_Graphics_(PNG)"/>
				<updated>2010-04-09T18:45:26Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;'''Portable Network Graphics''' (aka PNG) is an image file format developed to replace GIF images on the web.  It uses lossless compression and is thus not ideal for photographs. The types of images ideally suited for PNG are line art, text, and other graphics that have sharp transitions.&lt;br /&gt;
&lt;br /&gt;
=Format=&lt;br /&gt;
A PNG file comprises &amp;quot;chunks&amp;quot; of data, some of which are mandatory and others of which are ancillary. Some ancillary chunks can contain metadata text or timestamps, but PNG files do not contain nearly as much metadata as [[exif]] images.  All PNG files begin with an 8-byte signature: &amp;lt;tt&amp;gt;89 50 4E 47 0D 0A 1A 0A&amp;lt;/tt&amp;gt; (hexadecimal).&lt;br /&gt;
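&lt;br /&gt;
The chunk structure can be walked with a few lines of code. The Python sketch below checks the 8-byte signature given above and then iterates over the chunks. The chunk layout (4-byte big-endian length, 4-byte type, data, 4-byte CRC) and the chunk names tEXt and tIME are taken from the PNG specification rather than from this page; compressed and international text chunks (zTXt, iTXt) are ignored, and CRCs are not verified. The function names are illustrative only.&lt;br /&gt;

```python
import struct

PNG_SIGNATURE = bytes.fromhex("89504E470D0A1A0A")


def iter_png_chunks(data):
    """Yield (chunk_type, chunk_data) for each chunk in PNG file bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file: signature mismatch")
    offset = len(PNG_SIGNATURE)
    while True:
        header = data[offset:offset + 8]
        if len(header) != 8:
            break  # no more complete chunk headers
        length, chunk_type = struct.unpack("!I4s", header)
        body = data[offset + 8:offset + 8 + length]
        if len(body) != length:
            break  # truncated file
        yield chunk_type.decode("latin-1"), body
        if chunk_type == b"IEND":
            break
        offset += 8 + length + 4  # header, data, and trailing CRC


def png_text_metadata(data):
    """Return (keyword, value) pairs from tEXt and tIME chunks."""
    found = []
    for chunk_type, body in iter_png_chunks(data):
        if chunk_type == "tEXt":
            # Keyword and text are separated by a single NUL byte.
            keyword, _, text = body.partition(b"\x00")
            found.append((keyword.decode("latin-1"), text.decode("latin-1")))
        elif chunk_type == "tIME" and len(body) == 7:
            stamp = struct.unpack("!H5B", body)
            found.append(("tIME", "%04d-%02d-%02d %02d:%02d:%02d" % stamp))
    return found
```

Data that follows the IEND chunk is not reported by this sketch, so appended content would need to be checked separately.&lt;br /&gt;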
&lt;br /&gt;
=See Also=&lt;br /&gt;
* [http://www.forensicswiki.org/wiki/Tools:Document_Metadata_Extraction#Images Tools for extracting image metadata]&lt;br /&gt;
&lt;br /&gt;
=External Links=&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Portable_Network_Graphics Wikipedia article]&lt;br /&gt;
* [http://www.w3.org/TR/2003/REC-PNG-20031110/ W3C Recommendation]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:File Formats]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/PDF</id>
		<title>PDF</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/PDF"/>
				<updated>2010-04-09T18:43:59Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The '''Portable Document Format''' ('''PDF''') is a document format from [[Adobe]] Inc. It is widely available on the web. Originally developed as a proprietary format, version 1.7 was released as an open standard in 2008. The standard is published as ISO/IEC 32000-1:2008. Although PDF is an open standard, Adobe still owns patents and copyrights related to it. Adobe has granted a worldwide royalty-free license to produce PDF software, but only if the software complies with the PDF standard.&lt;br /&gt;
&lt;br /&gt;
== Format ==&lt;br /&gt;
&lt;br /&gt;
It is a common misconception that PDF files are simply a collection of images, one per page.  Certainly a PDF can be formed that way (which is typical of document scanners), but in reality the document structure is much more complex.  A PDF file can contain text streams (which can be encoded and/or compressed in dozens of ways), vector and raster images, fonts, and various interactive elements.&lt;br /&gt;
A PDF file comprises sections called &amp;quot;objects.&amp;quot; Each object is numbered and can represent a page, a font, a data stream, etc. Each file begins with the string &amp;lt;tt&amp;gt;%PDF&amp;lt;/tt&amp;gt;. Each file ends with the marker &amp;lt;tt&amp;gt;%%EOF&amp;lt;/tt&amp;gt;, but there can be multiple &amp;lt;tt&amp;gt;%%EOF&amp;lt;/tt&amp;gt; markers in a single file (this often confuses programs like [[foremost]] that search for footers).&lt;br /&gt;
&lt;br /&gt;
Adobe's Acrobat software supports &amp;quot;incremental updates.&amp;quot;  The standard allows this so that modifications can simply be appended to the file, leaving the original data intact.  Any new or altered object is simply appended to the end of the original file.  Deleted objects are left intact and simply marked deleted. This can potentially cause inadvertent disclosure of sensitive information.&lt;br /&gt;
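&lt;br /&gt;
Because each incremental update normally ends with its own end-of-file marker, counting those markers gives a rough indication of how many revisions a file may contain. The Python sketch below is a heuristic only: it assumes every occurrence of the marker bytes is a real end-of-file marker, which is not guaranteed, since the same bytes could appear inside a stream. The function names are illustrative and not taken from any existing tool.&lt;br /&gt;

```python
EOF_MARKER = b"%%EOF"


def pdf_revision_markers(data):
    """Return (has_pdf_header, eof_offsets) for PDF file bytes."""
    has_header = data.startswith(b"%PDF")
    offsets = []
    position = data.find(EOF_MARKER)
    while position != -1:
        offsets.append(position)
        position = data.find(EOF_MARKER, position + 1)
    return has_header, offsets


def candidate_revisions(data):
    """Return the byte prefixes that end at each end-of-file marker.

    Each prefix is a candidate earlier version of the document.
    """
    _, offsets = pdf_revision_markers(data)
    return [data[:offset + len(EOF_MARKER)] for offset in offsets]
```

A file with more than one marker is a candidate for closer inspection with a dedicated tool; this sketch does not validate the cross-reference tables that a PDF reader would rely on.&lt;br /&gt;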
&lt;br /&gt;
== Metadata ==&lt;br /&gt;
&lt;br /&gt;
PDF metadata can be stored in a document information dictionary or as a metadata stream, sometimes both. A metadata stream can describe the entire document or an individual component of a document. Thus, multiple metadata streams may exist in a single document, making it difficult to find all of them. Metadata streams are stored in Adobe's XML-based XMP (Extensible Metadata Platform) format. Even if a PDF document is encrypted, the accompanying metadata is not required to be, and often is not, encrypted.&lt;br /&gt;
&lt;br /&gt;
The metadata (or parts of it) can be extracted with [[pdfinfo]], a utility which is part of the [[xpdf]] package.&lt;br /&gt;
&lt;br /&gt;
== Embedded Objects==&lt;br /&gt;
&lt;br /&gt;
The PDF standard supports embedding many types of files such as images. Embedded files may contain their own metadata. You can use [[pdfimages]], part of the [[xpdf]] package, to extract all of the images out of a PDF file and put each in its own file.&lt;br /&gt;
&lt;br /&gt;
== Subformats ==&lt;br /&gt;
&lt;br /&gt;
Several related standards exist that contain subsets or supersets of the PDF standard features. These standards include:&lt;br /&gt;
&lt;br /&gt;
* PDF/A, a simpler set of features for archiving documents, allowing for long-term reproducibility. Some scanning software saves documents in PDF/A by default.&lt;br /&gt;
* PDF/X for graphic arts.&lt;br /&gt;
* PDF/UA for universal accessibility.&lt;br /&gt;
* PDF/E for engineering drawings.&lt;br /&gt;
&lt;br /&gt;
==PDF Software==&lt;br /&gt;
&lt;br /&gt;
Due to the popularity of the PDF format, there is much software available for viewing and creating PDF documents. However, Adobe maintains a de facto monopoly on software capable of editing PDF documents.  There are quite a few tools that merge or split pdf documents, but few that can make meaningful edits.  Software such as OpenOffice.org and Inkscape can import PDF files into their native formats, where the documents can be edited and then exported back to PDF. Unfortunately, this option can be quite cumbersome. &lt;br /&gt;
&lt;br /&gt;
=== PDF Tools ===&lt;br /&gt;
&lt;br /&gt;
; Origami&lt;br /&gt;
: http://security-labs.org/origami/&lt;br /&gt;
: A powerful open source framework and GUI written in Ruby. It allows for parsing and exploring PDF files and graphically browsing their contents.&lt;br /&gt;
&lt;br /&gt;
; PDF Tools&lt;br /&gt;
: [http://blog.didierstevens.com/programs/pdf-tools/]&lt;br /&gt;
: Didier Stevens' pdf-parser and pdfid, written in Python&lt;br /&gt;
&lt;br /&gt;
; pdfresurrect&lt;br /&gt;
: http://www.757labs.com/projects/pdfresurrect/#downloads&lt;br /&gt;
: Retrieves previous versions of PDF files that have changes appended with &amp;quot;incremental updates&amp;quot;&lt;br /&gt;
&lt;br /&gt;
; QPDF&lt;br /&gt;
: http://sourceforge.net/projects/qpdf/&lt;br /&gt;
: Open source, cross-platform library and set of programs to inspect and manipulate PDF files. Packaged in recent Debian based distributions.&lt;br /&gt;
&lt;br /&gt;
== External Links == &lt;br /&gt;
&lt;br /&gt;
* [http://partners.adobe.com/public/developer/pdf/index_reference.html Adobe PDF Reference]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/PDF Wikipedia: PDF]&lt;br /&gt;
* [http://www.mactech.com/articles/mactech/Vol.15/15.09/PDFIntro/ Portable Document Format: An Introduction for Programmers], MacTech Magazine, Volume 15, (1999), Issue 9&lt;br /&gt;
* [http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=51502 ISO Standard]&lt;br /&gt;
* [http://partners.adobe.com/public/developer/support/topic_legal_notices.html Patent Licenses]&lt;br /&gt;
&lt;br /&gt;
=See Also=&lt;br /&gt;
* [[Arabic PDFs]]&lt;br /&gt;
* [[Tools:Document Metadata Extraction]]&lt;br /&gt;
&lt;br /&gt;
[[Category:File Formats]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/PDF</id>
		<title>PDF</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/PDF"/>
				<updated>2010-04-09T18:43:16Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: Added additional information about metadata and file format; inserted list of pdf tools&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The '''Portable Document Format''' ('''PDF''') is a document format from [[Adobe]] Inc. It is widely available on the web. Originally developed as a proprietary format, version 1.7 was released as an open standard in 2008. The standard is published as ISO/IEC 32000-1:2008. Although PDF is an open standard, Adobe still owns patents and copyrights related to it. Adobe has granted a worldwide royalty-free license to produce PDF software, but only if the software complies with the PDF standard.&lt;br /&gt;
&lt;br /&gt;
== Format ==&lt;br /&gt;
&lt;br /&gt;
It is a common misconception that PDF files are simply a collection of images, one per page.  Certainly a PDF can be formed that way (which is typical of document scanners), but in reality the document structure is much more complex.  A PDF file can contain text streams (which can be encoded and/or compressed in dozens of ways), vector and raster images, fonts, and various interactive elements.&lt;br /&gt;
A PDF file comprises sections called &amp;quot;objects.&amp;quot; Each object is numbered and can represent a page, a font, a data stream, etc. Each file begins with the string &amp;lt;tt&amp;gt;%PDF&amp;lt;/tt&amp;gt;. Each file ends with the marker &amp;lt;tt&amp;gt;%%EOF&amp;lt;/tt&amp;gt;, but there can be multiple &amp;lt;tt&amp;gt;%%EOF&amp;lt;/tt&amp;gt; markers in a single file (this often confuses programs like [[foremost]] that search for footers).&lt;br /&gt;
&lt;br /&gt;
Adobe's Acrobat software supports &amp;quot;incremental updates.&amp;quot;  The standard allows this so that modifications can simply be appended to the file, leaving the original data intact.  Any new or altered object is simply appended to the end of the original file.  Deleted objects are left intact and simply marked deleted. This can potentially cause inadvertent disclosure of sensitive information.&lt;br /&gt;
&lt;br /&gt;
== Metadata ==&lt;br /&gt;
&lt;br /&gt;
PDF metadata can be stored in a document information dictionary or as a metadata stream, sometimes both. A metadata stream can describe the entire document or an individual component of a document. Thus, multiple metadata streams may exist in a single document, making it difficult to find all of them. Metadata streams are stored in Adobe's XML-based XMP (Extensible Metadata Platform) format. Even if a PDF document is encrypted, the accompanying metadata is not required to be, and often is not, encrypted.&lt;br /&gt;
&lt;br /&gt;
The metadata (or parts of it) can be extracted with [[pdfinfo]], a utility which is part of the [[xpdf]] package.&lt;br /&gt;
&lt;br /&gt;
== Embedded Objects==&lt;br /&gt;
&lt;br /&gt;
The PDF standard supports embedding many types of files such as images. Embedded files may contain their own metadata. You can use [[pdfimages]], part of the [[xpdf]] package, to extract all of the images out of a PDF file and put each in its own file.&lt;br /&gt;
&lt;br /&gt;
== Subformats ==&lt;br /&gt;
&lt;br /&gt;
Several related standards exist that contain subsets or supersets of the PDF standard features. These standards include:&lt;br /&gt;
&lt;br /&gt;
* PDF/A, a simpler set of features for archiving documents, allowing for long-term reproducibility. Some scanning software saves documents in PDF/A by default.&lt;br /&gt;
* PDF/X for graphic arts.&lt;br /&gt;
* PDF/UA for universal accessibility.&lt;br /&gt;
* PDF/E for engineering drawings.&lt;br /&gt;
&lt;br /&gt;
==PDF Software==&lt;br /&gt;
&lt;br /&gt;
Due to the popularity of the PDF format, there is much software available for viewing and creating PDF documents. However, Adobe maintains a de facto monopoly on software capable of editing PDF documents.  There are quite a few tools that merge or split pdf documents, but few that can make meaningful edits.  Software such as OpenOffice.org and Inkscape can import PDF files into their native formats, where the documents can be edited and then exported back to PDF. Unfortunately, this option can be quite cumbersome. &lt;br /&gt;
&lt;br /&gt;
=== PDF Tools ===&lt;br /&gt;
&lt;br /&gt;
; Origami&lt;br /&gt;
: http://security-labs.org/origami/&lt;br /&gt;
: A powerful open source framework and GUI written in Ruby. It allows for parsing and exploring PDF files and graphically browsing their contents.&lt;br /&gt;
&lt;br /&gt;
; PDF Tools&lt;br /&gt;
: [http://blog.didierstevens.com/programs/pdf-tools/]&lt;br /&gt;
: Didier Stevens' pdf-parser and pdfid, written in Python&lt;br /&gt;
&lt;br /&gt;
; pdfresurrect&lt;br /&gt;
: http://www.757labs.com/projects/pdfresurrect/#downloads&lt;br /&gt;
: Retrieves previous versions of PDF files that have changes appended with &amp;quot;incremental updates&amp;quot;&lt;br /&gt;
&lt;br /&gt;
; QPDF&lt;br /&gt;
: http://sourceforge.net/projects/qpdf/&lt;br /&gt;
: Open source, cross-platform library and set of programs to inspect and manipulate PDF files. Packaged in recent Debian based distributions.&lt;br /&gt;
&lt;br /&gt;
== External Links == &lt;br /&gt;
&lt;br /&gt;
* [http://partners.adobe.com/public/developer/pdf/index_reference.html Adobe PDF Reference]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/PDF Wikipedia: PDF]&lt;br /&gt;
* [http://www.mactech.com/articles/mactech/Vol.15/15.09/PDFIntro/ Portable Document Format: An Introduction for Programmers], MacTech Magazine, Volume 15, (1999), Issue 9&lt;br /&gt;
* [http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=51502 ISO Standard]&lt;br /&gt;
* [http://partners.adobe.com/public/developer/support/topic_legal_notices.html Patent Licenses]&lt;br /&gt;
&lt;br /&gt;
=See Also=&lt;br /&gt;
* [[Arabic PDFs]]&lt;br /&gt;
* [[Tools:Document Metadata Extraction]]&lt;br /&gt;
* [[Category:File Formats]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/PDF</id>
		<title>PDF</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/PDF"/>
				<updated>2010-04-09T15:40:17Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: Updated format info--open standard.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The '''Portable Document Format''' ('''PDF''') is a document format from [[Adobe]] Inc. It is widely available on the web. Originally developed as a proprietary format, it was released as an open standard in 2008. The standard is published as ISO/IEC 32000-1:2008.&lt;br /&gt;
&lt;br /&gt;
== Format ==&lt;br /&gt;
&lt;br /&gt;
Each file begins with the string &amp;lt;tt&amp;gt;%PDF&amp;lt;/tt&amp;gt;. Each file ends with the marker &amp;lt;tt&amp;gt;%%EOF&amp;lt;/tt&amp;gt;, but there can be multiple &amp;lt;tt&amp;gt;%%EOF&amp;lt;/tt&amp;gt; markers in a single file (this often confuses programs like [[foremost]] that search for footers).&lt;br /&gt;
&lt;br /&gt;
== Metadata ==&lt;br /&gt;
&lt;br /&gt;
The metadata (or parts of it) can be extracted with [[pdfinfo]], a utility which is part of the [[xpdf]] package.&lt;br /&gt;
&lt;br /&gt;
== Embedded Objects==&lt;br /&gt;
&lt;br /&gt;
You can use [[pdfimages]], part of the [[xpdf]] package, to extract all of the images out of a PDF file and put each in its own file.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== External Links == &lt;br /&gt;
&lt;br /&gt;
* [http://partners.adobe.com/public/developer/pdf/index_reference.html Adobe PDF Reference]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/PDF Wikipedia: PDF]&lt;br /&gt;
* [http://www.mactech.com/articles/mactech/Vol.15/15.09/PDFIntro/ Portable Document Format: An Introduction for Programmers], MacTech Magazine, Volume 15, (1999), Issue 9&lt;br /&gt;
&lt;br /&gt;
=See Also=&lt;br /&gt;
* [[Arabic PDFs]]&lt;br /&gt;
* [[Tools:Document Metadata Extraction]]&lt;br /&gt;
* [[Category:File Formats]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Portable_Network_Graphics_(PNG)</id>
		<title>Portable Network Graphics (PNG)</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Portable_Network_Graphics_(PNG)"/>
				<updated>2010-04-09T14:58:38Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: Created page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;'''Portable Network Graphics''' (aka PNG) is an image file format developed to replace GIF images on the web.  It uses lossless compression and is thus not ideal for photographs. The types of images ideally suited for PNG are line art, text, and other graphics that have sharp transitions.&lt;br /&gt;
&lt;br /&gt;
=Format=&lt;br /&gt;
A PNG file comprises &amp;quot;chunks&amp;quot; of data, some of which are mandatory and others of which are ancillary. Some ancillary chunks can contain metadata text or timestamps, but PNG files do not contain nearly as much metadata as [[exif]] images.  All PNG files begin with an 8-byte signature: &amp;lt;tt&amp;gt;89 50 4E 47 0D 0A 1A 0A&amp;lt;/tt&amp;gt; (hexadecimal).&lt;br /&gt;
&lt;br /&gt;
=See Also=&lt;br /&gt;
* [http://www.forensicswiki.org/wiki/Tools:Document_Metadata_Extraction#Images Tools for extracting image metadata]&lt;br /&gt;
&lt;br /&gt;
=External Links=&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Portable_Network_Graphics Wikipedia article]&lt;br /&gt;
* [http://www.w3.org/TR/2003/REC-PNG-20031110/ W3C Recommendation]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:File Formats]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Document_Metadata_Extraction</id>
		<title>Document Metadata Extraction</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Document_Metadata_Extraction"/>
				<updated>2010-04-09T14:40:25Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: /* Images */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Here are tools that will extract metadata from document files.&lt;br /&gt;
&lt;br /&gt;
=Office Files=&lt;br /&gt;
&lt;br /&gt;
; [[antiword]]&lt;br /&gt;
: http://www.winfield.demon.nl/&lt;br /&gt;
&lt;br /&gt;
; [[catdoc]]&lt;br /&gt;
: http://www.45.free.net/~vitus/software/catdoc/&lt;br /&gt;
&lt;br /&gt;
; [[laola]]&lt;br /&gt;
: http://user.cs.tu-berlin.de/~schwartz/pmh/index.html&lt;br /&gt;
&lt;br /&gt;
; [[word2x]]&lt;br /&gt;
: http://word2x.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[wvWare]]&lt;br /&gt;
: http://wvware.sourceforge.net/&lt;br /&gt;
: Extracts metadata from various [[Microsoft]] Word files ([[doc]]). Can also convert doc files to other formats such as HTML or plain text.&lt;br /&gt;
&lt;br /&gt;
; [[Outside In]]&lt;br /&gt;
: http://www.oracle.com/technology/products/content-management/oit/oit_all.html&lt;br /&gt;
: Originally developed by Stellant, supports hundreds of file types.&lt;br /&gt;
&lt;br /&gt;
; [[FI Tools]]&lt;br /&gt;
: http://forensicinnovations.com/&lt;br /&gt;
: More than 100 file types.&lt;br /&gt;
&lt;br /&gt;
=PDF Files=&lt;br /&gt;
&lt;br /&gt;
; [[xpdf]]&lt;br /&gt;
: http://www.foolabs.com/xpdf/&lt;br /&gt;
: [[pdfinfo]] (part of the [[xpdf]] package) displays some metadata of [[PDF]] files.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(See [[PDF]])&lt;br /&gt;
&lt;br /&gt;
=Images=&lt;br /&gt;
&lt;br /&gt;
; [[jhead]]&lt;br /&gt;
: http://www.sentex.net/~mwandel/jhead/&lt;br /&gt;
: Displays or modifies [[Exif]] data in [[JPEG]] files.&lt;br /&gt;
&lt;br /&gt;
; [[vinetto]]&lt;br /&gt;
: http://vinetto.sourceforge.net/&lt;br /&gt;
: Examines [[Thumbs.db]] files.&lt;br /&gt;
&lt;br /&gt;
;[[libexif]]&lt;br /&gt;
: http://sourceforge.net/projects/libexif EXIF tag Parsing Library&lt;br /&gt;
&lt;br /&gt;
; [[Adroit Photo Forensics]]&lt;br /&gt;
: http://digital-assembly.com/products/adroit-photo-forensics/&lt;br /&gt;
: Displays metadata and uses date and camera metadata for grouping, timelines, etc.&lt;br /&gt;
&lt;br /&gt;
; Exif Viewer&lt;br /&gt;
: http://araskin.webs.com/exif/exif.html&lt;br /&gt;
: Add-on for Firefox and Thunderbird that displays various [[JPEG]]/JPG metadata in local and remote images.&lt;br /&gt;
&lt;br /&gt;
; exiftags&lt;br /&gt;
: http://johnst.org/sw/exiftags/&lt;br /&gt;
: Open source utility to parse and edit [[exif]] data in [[JPEG]] images. Found in many Debian-based distributions.&lt;br /&gt;
&lt;br /&gt;
; exifprobe&lt;br /&gt;
: http://www.virtual-cafe.com/~dhh/tools.d/exifprobe.d/exifprobe.html&lt;br /&gt;
: Open source utility that reads [[exif]] data in [[JPEG]] and some &amp;quot;RAW&amp;quot; image formats. Found in many Debian based distributions.&lt;br /&gt;
&lt;br /&gt;
; pngtools&lt;br /&gt;
: http://www.stillhq.com/pngtools/&lt;br /&gt;
: Open source suite of commands (pnginfo, pngchunks, pngchunksdesc) that reads metadata found in [[PNG]] files. Found in many Debian based distributions.&lt;br /&gt;
&lt;br /&gt;
; pngmeta&lt;br /&gt;
: http://sourceforge.net/projects/pmt/files/&lt;br /&gt;
: Open source command line tool that extracts metadata from [[PNG]] images. Found in many Debian based distributions.&lt;br /&gt;
&lt;br /&gt;
=General=&lt;br /&gt;
These general-purpose programs frequently work when the special-purpose programs fail, but they generally provide less detailed information.&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Extraction Tool]]&lt;br /&gt;
: &amp;quot;Developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats like PDF documents, image files, sound files Microsoft office documents, and many others.&amp;quot;&lt;br /&gt;
: http://meta-extractor.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Assistant]]&lt;br /&gt;
: http://www.payneconsulting.com/products/metadataent/&lt;br /&gt;
&lt;br /&gt;
; [[hachoir|hachoir-metadata]]&lt;br /&gt;
: Extraction tool, part of '''[[Hachoir]]''' project&lt;br /&gt;
&lt;br /&gt;
; [[file]]&lt;br /&gt;
: The UNIX '''file''' program can extract some metadata&lt;br /&gt;
&lt;br /&gt;
; [[GNU libextractor]]&lt;br /&gt;
: http://gnunet.org/libextractor/ The libextractor library is a pluggable system for extracting metadata.&lt;br /&gt;
&lt;br /&gt;
; [[Directory Lister Pro]]&lt;br /&gt;
: Directory Lister Pro is a Windows tool that creates listings of files from selected directories on hard disks, CD-ROMs, DVD-ROMs, floppies, USB storage devices and network shares. Listings can be produced in HTML, text or CSV format (for easy import into Excel). A listing can contain standard file information such as file name, extension, type, owner and creation date. For forensic analysis, file metadata can also be extracted from various formats: 1) executable files (EXE, DLL, OCX): file version, description, company, product name; 2) multimedia files (MP3, AVI, WAV, JPG, GIF, BMP, MKV, MKA, MPEG): track, title, artist, album, genre, video format, bits per pixel, frames per second, audio format, bits per channel; 3) Microsoft Office files (DOC, DOCX, XLS, XLSX, PPT, PPTX): document title, author, keywords, word count. CRC32, MD5, SHA-1 and Whirlpool hashes can be computed for each file and folder. Numerous options customize the appearance of the output, and filters on file name, date, size or attributes limit which files are listed.&lt;br /&gt;
: http://www.krksoft.com&lt;br /&gt;
&lt;br /&gt;
[[Category:Tools]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Document_Metadata_Extraction</id>
		<title>Document Metadata Extraction</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Document_Metadata_Extraction"/>
				<updated>2010-04-09T14:23:21Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: /* Images */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Here are tools that will extract metadata from document files.&lt;br /&gt;
&lt;br /&gt;
=Office Files=&lt;br /&gt;
&lt;br /&gt;
; [[antiword]]&lt;br /&gt;
: http://www.winfield.demon.nl/&lt;br /&gt;
&lt;br /&gt;
; [[catdoc]]&lt;br /&gt;
: http://www.45.free.net/~vitus/software/catdoc/&lt;br /&gt;
&lt;br /&gt;
; [[laola]]&lt;br /&gt;
: http://user.cs.tu-berlin.de/~schwartz/pmh/index.html&lt;br /&gt;
&lt;br /&gt;
; [[word2x]]&lt;br /&gt;
: http://word2x.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[wvWare]]&lt;br /&gt;
: http://wvware.sourceforge.net/&lt;br /&gt;
: Extracts metadata from various [[Microsoft]] Word files ([[doc]]). Can also convert doc files to other formats such as HTML or plain text.&lt;br /&gt;
&lt;br /&gt;
; [[Outside In]]&lt;br /&gt;
: http://www.oracle.com/technology/products/content-management/oit/oit_all.html&lt;br /&gt;
: Originally developed by Stellant, supports hundreds of file types.&lt;br /&gt;
&lt;br /&gt;
; [[FI Tools]]&lt;br /&gt;
: http://forensicinnovations.com/&lt;br /&gt;
: More than 100 file types.&lt;br /&gt;
&lt;br /&gt;
=PDF Files=&lt;br /&gt;
&lt;br /&gt;
; [[xpdf]]&lt;br /&gt;
: http://www.foolabs.com/xpdf/&lt;br /&gt;
: [[pdfinfo]] (part of the [[xpdf]] package) displays some metadata of [[PDF]] files.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(See [[PDF]])&lt;br /&gt;
&lt;br /&gt;
=Images=&lt;br /&gt;
&lt;br /&gt;
; [[jhead]]&lt;br /&gt;
: http://www.sentex.net/~mwandel/jhead/&lt;br /&gt;
: Displays or modifies [[Exif]] data in [[JPEG]] files.&lt;br /&gt;
&lt;br /&gt;
; [[vinetto]]&lt;br /&gt;
: http://vinetto.sourceforge.net/&lt;br /&gt;
: Examines [[Thumbs.db]] files.&lt;br /&gt;
&lt;br /&gt;
; [[libexif]]&lt;br /&gt;
: http://sourceforge.net/projects/libexif&lt;br /&gt;
: Exif tag parsing library.&lt;br /&gt;
&lt;br /&gt;
; [[Adroit Photo Forensics]]&lt;br /&gt;
: http://digital-assembly.com/products/adroit-photo-forensics/&lt;br /&gt;
: Displays metadata and uses date and camera metadata for grouping, timelines, etc.&lt;br /&gt;
&lt;br /&gt;
; Exif Viewer&lt;br /&gt;
: http://araskin.webs.com/exif/exif.html&lt;br /&gt;
: Add-on for Firefox and Thunderbird that displays various [[JPEG]]/JPG metadata in local and remote images.&lt;br /&gt;
&lt;br /&gt;
; exiftags&lt;br /&gt;
: http://johnst.org/sw/exiftags/&lt;br /&gt;
: Open source utility to parse and edit [[exif]] data in [[JPEG]] images. Found in many Debian based distributions.&lt;br /&gt;
&lt;br /&gt;
; exifprobe&lt;br /&gt;
: http://www.virtual-cafe.com/~dhh/tools.d/exifprobe.d/exifprobe.html&lt;br /&gt;
: Open source utility that reads [[exif]] data in [[JPEG]] and some &amp;quot;RAW&amp;quot; image formats. Found in many Debian based distributions.&lt;br /&gt;
&lt;br /&gt;
=General=&lt;br /&gt;
These general-purpose programs frequently work when the special-purpose programs fail, but they generally provide less detailed information.&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Extraction Tool]]&lt;br /&gt;
: &amp;quot;Developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats like PDF documents, image files, sound files Microsoft office documents, and many others.&amp;quot;&lt;br /&gt;
: http://meta-extractor.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Assistant]]&lt;br /&gt;
: http://www.payneconsulting.com/products/metadataent/&lt;br /&gt;
&lt;br /&gt;
; [[hachoir|hachoir-metadata]]&lt;br /&gt;
: Extraction tool, part of '''[[Hachoir]]''' project&lt;br /&gt;
&lt;br /&gt;
; [[file]]&lt;br /&gt;
: The UNIX '''file''' program can extract some metadata&lt;br /&gt;
&lt;br /&gt;
; [[GNU libextractor]]&lt;br /&gt;
: http://gnunet.org/libextractor/ The libextractor library is a pluggable system for extracting metadata.&lt;br /&gt;
&lt;br /&gt;
; [[Directory Lister Pro]]&lt;br /&gt;
: Directory Lister Pro is a Windows tool that creates listings of files from selected directories on hard disks, CD-ROMs, DVD-ROMs, floppies, USB storage devices and network shares. Listings can be produced in HTML, text or CSV format (for easy import into Excel). A listing can contain standard file information such as file name, extension, type, owner and creation date. For forensic analysis, file metadata can also be extracted from various formats: 1) executable files (EXE, DLL, OCX): file version, description, company, product name; 2) multimedia files (MP3, AVI, WAV, JPG, GIF, BMP, MKV, MKA, MPEG): track, title, artist, album, genre, video format, bits per pixel, frames per second, audio format, bits per channel; 3) Microsoft Office files (DOC, DOCX, XLS, XLSX, PPT, PPTX): document title, author, keywords, word count. CRC32, MD5, SHA-1 and Whirlpool hashes can be computed for each file and folder. Numerous options customize the appearance of the output, and filters on file name, date, size or attributes limit which files are listed.&lt;br /&gt;
: http://www.krksoft.com&lt;br /&gt;
&lt;br /&gt;
[[Category:Tools]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Document_Metadata_Extraction</id>
		<title>Document Metadata Extraction</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Document_Metadata_Extraction"/>
				<updated>2010-04-09T13:43:35Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: /* Images */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Here are tools that will extract metadata from document files.&lt;br /&gt;
&lt;br /&gt;
=Office Files=&lt;br /&gt;
&lt;br /&gt;
; [[antiword]]&lt;br /&gt;
: http://www.winfield.demon.nl/&lt;br /&gt;
&lt;br /&gt;
; [[catdoc]]&lt;br /&gt;
: http://www.45.free.net/~vitus/software/catdoc/&lt;br /&gt;
&lt;br /&gt;
; [[laola]]&lt;br /&gt;
: http://user.cs.tu-berlin.de/~schwartz/pmh/index.html&lt;br /&gt;
&lt;br /&gt;
; [[word2x]]&lt;br /&gt;
: http://word2x.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[wvWare]]&lt;br /&gt;
: http://wvware.sourceforge.net/&lt;br /&gt;
: Extracts metadata from various [[Microsoft]] Word files ([[doc]]). Can also convert doc files to other formats such as HTML or plain text.&lt;br /&gt;
&lt;br /&gt;
; [[Outside In]]&lt;br /&gt;
: http://www.oracle.com/technology/products/content-management/oit/oit_all.html&lt;br /&gt;
: Originally developed by Stellant, supports hundreds of file types.&lt;br /&gt;
&lt;br /&gt;
; [[FI Tools]]&lt;br /&gt;
: http://forensicinnovations.com/&lt;br /&gt;
: More than 100 file types.&lt;br /&gt;
&lt;br /&gt;
=PDF Files=&lt;br /&gt;
&lt;br /&gt;
; [[xpdf]]&lt;br /&gt;
: http://www.foolabs.com/xpdf/&lt;br /&gt;
: [[pdfinfo]] (part of the [[xpdf]] package) displays some metadata of [[PDF]] files.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(See [[PDF]])&lt;br /&gt;
&lt;br /&gt;
=Images=&lt;br /&gt;
&lt;br /&gt;
; [[jhead]]&lt;br /&gt;
: http://www.sentex.net/~mwandel/jhead/&lt;br /&gt;
: Displays or modifies [[Exif]] data in [[JPEG]] files.&lt;br /&gt;
&lt;br /&gt;
; [[vinetto]]&lt;br /&gt;
: http://vinetto.sourceforge.net/&lt;br /&gt;
: Examines [[Thumbs.db]] files.&lt;br /&gt;
&lt;br /&gt;
; [[libexif]]&lt;br /&gt;
: http://sourceforge.net/projects/libexif&lt;br /&gt;
: Exif tag parsing library.&lt;br /&gt;
&lt;br /&gt;
; [[Adroit Photo Forensics]]&lt;br /&gt;
: http://digital-assembly.com/products/adroit-photo-forensics/&lt;br /&gt;
: Displays metadata and uses date and camera metadata for grouping, timelines, etc.&lt;br /&gt;
&lt;br /&gt;
; Exif Viewer&lt;br /&gt;
: http://araskin.webs.com/exif/exif.html&lt;br /&gt;
: Add-on for Firefox and Thunderbird that displays various [[JPEG]]/JPG metadata in local and remote images.&lt;br /&gt;
&lt;br /&gt;
=General=&lt;br /&gt;
These general-purpose programs frequently work when the special-purpose programs fail, but they generally provide less detailed information.&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Extraction Tool]]&lt;br /&gt;
: &amp;quot;Developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats like PDF documents, image files, sound files Microsoft office documents, and many others.&amp;quot;&lt;br /&gt;
: http://meta-extractor.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Assistant]]&lt;br /&gt;
: http://www.payneconsulting.com/products/metadataent/&lt;br /&gt;
&lt;br /&gt;
; [[hachoir|hachoir-metadata]]&lt;br /&gt;
: Extraction tool, part of '''[[Hachoir]]''' project&lt;br /&gt;
&lt;br /&gt;
; [[file]]&lt;br /&gt;
: The UNIX '''file''' program can extract some metadata&lt;br /&gt;
&lt;br /&gt;
; [[GNU libextractor]]&lt;br /&gt;
: http://gnunet.org/libextractor/ The libextractor library is a pluggable system for extracting metadata.&lt;br /&gt;
&lt;br /&gt;
; [[Directory Lister Pro]]&lt;br /&gt;
: Directory Lister Pro is a Windows tool that creates listings of files from selected directories on hard disks, CD-ROMs, DVD-ROMs, floppies, USB storage devices and network shares. Listings can be produced in HTML, text or CSV format (for easy import into Excel). A listing can contain standard file information such as file name, extension, type, owner and creation date. For forensic analysis, file metadata can also be extracted from various formats: 1) executable files (EXE, DLL, OCX): file version, description, company, product name; 2) multimedia files (MP3, AVI, WAV, JPG, GIF, BMP, MKV, MKA, MPEG): track, title, artist, album, genre, video format, bits per pixel, frames per second, audio format, bits per channel; 3) Microsoft Office files (DOC, DOCX, XLS, XLSX, PPT, PPTX): document title, author, keywords, word count. CRC32, MD5, SHA-1 and Whirlpool hashes can be computed for each file and folder. Numerous options customize the appearance of the output, and filters on file name, date, size or attributes limit which files are listed.&lt;br /&gt;
: http://www.krksoft.com&lt;br /&gt;
&lt;br /&gt;
[[Category:Tools]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Document_Metadata_Extraction</id>
		<title>Document Metadata Extraction</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Document_Metadata_Extraction"/>
				<updated>2010-04-09T12:28:20Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: /* Images */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Here are tools that will extract metadata from document files.&lt;br /&gt;
&lt;br /&gt;
=Office Files=&lt;br /&gt;
&lt;br /&gt;
; [[antiword]]&lt;br /&gt;
: http://www.winfield.demon.nl/&lt;br /&gt;
&lt;br /&gt;
; [[catdoc]]&lt;br /&gt;
: http://www.45.free.net/~vitus/software/catdoc/&lt;br /&gt;
&lt;br /&gt;
; [[laola]]&lt;br /&gt;
: http://user.cs.tu-berlin.de/~schwartz/pmh/index.html&lt;br /&gt;
&lt;br /&gt;
; [[word2x]]&lt;br /&gt;
: http://word2x.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[wvWare]]&lt;br /&gt;
: http://wvware.sourceforge.net/&lt;br /&gt;
: Extracts metadata from various [[Microsoft]] Word files ([[doc]]). Can also convert doc files to other formats such as HTML or plain text.&lt;br /&gt;
&lt;br /&gt;
; [[Outside In]]&lt;br /&gt;
: http://www.oracle.com/technology/products/content-management/oit/oit_all.html&lt;br /&gt;
: Originally developed by Stellant, supports hundreds of file types.&lt;br /&gt;
&lt;br /&gt;
; [[FI Tools]]&lt;br /&gt;
: http://forensicinnovations.com/&lt;br /&gt;
: More than 100 file types.&lt;br /&gt;
&lt;br /&gt;
=PDF Files=&lt;br /&gt;
&lt;br /&gt;
; [[xpdf]]&lt;br /&gt;
: http://www.foolabs.com/xpdf/&lt;br /&gt;
: [[pdfinfo]] (part of the [[xpdf]] package) displays some metadata of [[PDF]] files.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(See [[PDF]])&lt;br /&gt;
&lt;br /&gt;
=Images=&lt;br /&gt;
&lt;br /&gt;
; [[jhead]]&lt;br /&gt;
: http://www.sentex.net/~mwandel/jhead/&lt;br /&gt;
: Displays or modifies [[Exif]] data in [[JPEG]] files.&lt;br /&gt;
&lt;br /&gt;
; [[vinetto]]&lt;br /&gt;
: http://vinetto.sourceforge.net/&lt;br /&gt;
: Examines [[Thumbs.db]] files.&lt;br /&gt;
&lt;br /&gt;
; [[libexif]]&lt;br /&gt;
: http://sourceforge.net/projects/libexif&lt;br /&gt;
: Exif tag parsing library.&lt;br /&gt;
&lt;br /&gt;
; [[Adroit Photo Forensics]]&lt;br /&gt;
: http://digital-assembly.com/products/adroit-photo-forensics/&lt;br /&gt;
: Displays metadata and uses date and camera metadata for grouping, timelines, etc.&lt;br /&gt;
&lt;br /&gt;
; Exif Viewer&lt;br /&gt;
: http://araskin.webs.com/exif/exif.html&lt;br /&gt;
: Add-on for Firefox and Thunderbird that displays various JPEG/JPG metadata in local and remote images.&lt;br /&gt;
&lt;br /&gt;
=General=&lt;br /&gt;
These general-purpose programs frequently work when the special-purpose programs fail, but they generally provide less detailed information.&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Extraction Tool]]&lt;br /&gt;
: &amp;quot;Developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats like PDF documents, image files, sound files Microsoft office documents, and many others.&amp;quot;&lt;br /&gt;
: http://meta-extractor.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
; [[Metadata Assistant]]&lt;br /&gt;
: http://www.payneconsulting.com/products/metadataent/&lt;br /&gt;
&lt;br /&gt;
; [[hachoir|hachoir-metadata]]&lt;br /&gt;
: Extraction tool, part of '''[[Hachoir]]''' project&lt;br /&gt;
&lt;br /&gt;
; [[file]]&lt;br /&gt;
: The UNIX '''file''' program can extract some metadata&lt;br /&gt;
&lt;br /&gt;
; [[GNU libextractor]]&lt;br /&gt;
: http://gnunet.org/libextractor/ The libextractor library is a pluggable system for extracting metadata.&lt;br /&gt;
&lt;br /&gt;
; [[Directory Lister Pro]]&lt;br /&gt;
: Directory Lister Pro is a Windows tool that creates listings of files from selected directories on hard disks, CD-ROMs, DVD-ROMs, floppies, USB storage devices and network shares. Listings can be produced in HTML, text or CSV format (for easy import into Excel). A listing can contain standard file information such as file name, extension, type, owner and creation date. For forensic analysis, file metadata can also be extracted from various formats: 1) executable files (EXE, DLL, OCX): file version, description, company, product name; 2) multimedia files (MP3, AVI, WAV, JPG, GIF, BMP, MKV, MKA, MPEG): track, title, artist, album, genre, video format, bits per pixel, frames per second, audio format, bits per channel; 3) Microsoft Office files (DOC, DOCX, XLS, XLSX, PPT, PPTX): document title, author, keywords, word count. CRC32, MD5, SHA-1 and Whirlpool hashes can be computed for each file and folder. Numerous options customize the appearance of the output, and filters on file name, date, size or attributes limit which files are listed.&lt;br /&gt;
: http://www.krksoft.com&lt;br /&gt;
&lt;br /&gt;
[[Category:Tools]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Libexif</id>
		<title>Libexif</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Libexif"/>
				<updated>2010-04-08T18:16:41Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: Created page; added basic info, a few examples and links&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox_Software |&lt;br /&gt;
  name = libexif |&lt;br /&gt;
  maintainer = Mueller, Patera, Ulrich, Figuiere |&lt;br /&gt;
  os = Posix, Win32 |&lt;br /&gt;
  genre = [[Document Metadata Extraction]] |&lt;br /&gt;
  license = {{LGPL}} |&lt;br /&gt;
  website = [http://libexif.sourceforge.net/ libexif.sourceforge.net/] |&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
'''libexif''' is an [[open source]] library for reading and editing [[Exif]] metadata, commonly found in digital pictures.  It is written in C and released under the [[LGPL]].  The library runs under GNU/Linux, BSD, Mac OS X, and Win32. The library is developed by the &amp;quot;libexif project.&amp;quot; The project also includes command-line and graphical front-ends.&lt;br /&gt;
&lt;br /&gt;
==Examples==&lt;br /&gt;
&lt;br /&gt;
The command-line front end, exif, is used like most Unix or Linux commands:&lt;br /&gt;
  exif example.jpg&lt;br /&gt;
&lt;br /&gt;
This command will write all tags and values contained in the file to standard output.  It will also indicate if the file contains an embedded thumbnail.&lt;br /&gt;
&lt;br /&gt;
Other common options for the exif command:&lt;br /&gt;
&lt;br /&gt;
* -e extracts the thumbnail. Use the --output option to write it to a specified file.&lt;br /&gt;
* -r removes the thumbnail from the image and writes a new image. Use the --output option to specify the output file.&lt;br /&gt;
* -l lists all known tags.&lt;br /&gt;
* --remove removes a specified tag, or the entire IFD (Image File Directory) if no tag is given.&lt;br /&gt;
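&lt;br /&gt;
As a rough illustration of the options above, the following Python sketch assembles exif command lines for scripted use. The helper names exif_command and run_exif are hypothetical and are not part of libexif; the sketch assumes only the -e, -r and --output options described above.&lt;br /&gt;

```python
import shutil
import subprocess


def exif_command(image, extract_thumbnail=False, remove_thumbnail=False, output=None):
    """Build the argument list for one call to the exif front end.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    argv = ["exif"]
    if extract_thumbnail:
        argv.append("-e")  # extract the embedded thumbnail
    if remove_thumbnail:
        argv.append("-r")  # write a new image without the thumbnail
    if output is not None:
        argv.append("--output=" + output)  # file that receives the result
    argv.append(image)
    return argv


def run_exif(argv):
    """Run a prepared command and return its standard output.

    Returns None when the exif program is not installed.
    """
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # Dump all tags of example.jpg, as in the basic example above.
    print(run_exif(exif_command("example.jpg")))
```

Building the argument list separately from running it allows each command to be logged or reviewed before it is executed against evidence files.&lt;br /&gt;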
&lt;br /&gt;
==External Links==&lt;br /&gt;
&lt;br /&gt;
* [http://libexif.sourceforge.net/docs.html libexif project documentation]&lt;br /&gt;
* [http://libexif.sourceforge.net/man/exif.html man page for exif command line tool]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Hachoir</id>
		<title>Hachoir</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Hachoir"/>
				<updated>2010-04-07T18:16:51Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: updated website; updated release info&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox_Software |&lt;br /&gt;
  name = Hachoir |&lt;br /&gt;
  maintainer = Victor Stinner |&lt;br /&gt;
  os = {{Cross-platform}} |&lt;br /&gt;
  genre = {{Analysis}} |&lt;br /&gt;
  license = {{GPL}} |&lt;br /&gt;
  website = [http://bitbucket.org/haypo/hachoir/wiki/Home] |&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
'''Hachoir''' is a generic framework for binary file manipulation. Written in Python, it is operating system independent and has several text and graphical user interfaces (ncurses, wxWidgets, Gtk+). Although it contains a few functions to modify files, it is generally intended for examining existing files. Hachoir currently supports more than sixty file formats. File format recognition is based on the headers and footers in a disk image or file. It has a fault-tolerant parser designed to handle truncated or buggy files. The framework also automatically adjusts for endianness and character set issues. The framework can be scripted and extended.&lt;br /&gt;
&lt;br /&gt;
The package includes several sample programs based on the core framework and parser:&lt;br /&gt;
&lt;br /&gt;
* hachoir-metadata: extract metadata&lt;br /&gt;
* hachoir-strip: remove metadata and other &amp;quot;useless&amp;quot; information&lt;br /&gt;
* hachoir-grep: find substring in a binary file (using hachoir parsers: so search is Unicode aware)&lt;br /&gt;
* hachoir-subfile: find all subfiles in a file&lt;br /&gt;
&lt;br /&gt;
The current version of hachoir-core is 1.3.4 and was released in February 2010. Precompiled packages are available for the Debian, Gentoo, Mandriva, and Arch [[Linux]] distributions along with FreeBSD.&lt;br /&gt;
&lt;br /&gt;
== External Links ==&lt;br /&gt;
&lt;br /&gt;
* [http://bitbucket.org/haypo/hachoir/wiki/Home/ Official website]&lt;br /&gt;
&lt;br /&gt;
[[Category:Metadata]]&lt;br /&gt;
[[Category:Windows]]&lt;br /&gt;
[[Category:Linux]]&lt;br /&gt;
[[Category:FreeBSD]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Hachoir</id>
		<title>Hachoir</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Hachoir"/>
				<updated>2010-04-07T18:07:57Z</updated>
		
		<summary type="html">&lt;p&gt;Hypertex: /* External Links */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox_Software |&lt;br /&gt;
  name = Hachoir |&lt;br /&gt;
  maintainer = Victor Stinner |&lt;br /&gt;
  os = {{Cross-platform}} |&lt;br /&gt;
  genre = {{Analysis}} |&lt;br /&gt;
  license = {{GPL}} |&lt;br /&gt;
  website = [http://hachoir.org/ hachoir.org] |&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
'''Hachoir''' is a generic framework for binary file manipulation. Written in Python, it is operating system independent and has several text and graphical user interfaces (ncurses, wxWidgets, Gtk+). Although it contains a few functions to modify files, it is generally intended for examining existing files. Hachoir currently supports more than sixty file formats. File format recognition is based on the headers and footers in a disk image or file. It has a fault-tolerant parser designed to handle truncated or buggy files. The framework also automatically adjusts for endianness and character set issues. The framework can be scripted and extended.&lt;br /&gt;
&lt;br /&gt;
The package includes several sample programs based on the core framework and parser:&lt;br /&gt;
&lt;br /&gt;
* hachoir-metadata: extract metadata&lt;br /&gt;
* hachoir-strip: remove metadata and other &amp;quot;useless&amp;quot; information&lt;br /&gt;
* hachoir-grep: find substring in a binary file (using hachoir parsers: so search is Unicode aware)&lt;br /&gt;
* hachoir-subfile: find all subfiles in a file&lt;br /&gt;
&lt;br /&gt;
The current version is 1.2 and was released in September 2008. Precompiled packages are available for the Debian, Gentoo, Mandriva, and Arch [[Linux]] distributions along with FreeBSD.&lt;br /&gt;
&lt;br /&gt;
== External Links ==&lt;br /&gt;
&lt;br /&gt;
* [http://bitbucket.org/haypo/hachoir/wiki/Home/ Official website]&lt;br /&gt;
&lt;br /&gt;
[[Category:Metadata]]&lt;br /&gt;
[[Category:Windows]]&lt;br /&gt;
[[Category:Linux]]&lt;br /&gt;
[[Category:FreeBSD]]&lt;/div&gt;</summary>
		<author><name>Hypertex</name></author>	</entry>

	</feed>