Here are tools that will extract metadata from document files.
=Office Files=
; [[antiword]]
: http://www.winfield.demon.nl/
; [[catdoc]]
: http://www.45.free.net/~vitus/software/catdoc/
; [[laola]]
: http://user.cs.tu-berlin.de/~schwartz/pmh/index.html
; [[word2x]]
: http://word2x.sourceforge.net/
; [[wvWare]]
: http://wvware.sourceforge.net/
: Extracts metadata from various [[Microsoft]] Word files ([[doc]]). Can also convert doc files to other formats such as HTML or plain text.
=PDF Files=
; [[xpdf]]
: http://www.foolabs.com/xpdf/
: [[pdfinfo]] (part of the [[xpdf]] package) displays some metadata of [[PDF]] files.
(See [[PDF]])
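As an illustration of using pdfinfo from a script, the sketch below parses its line-oriented "Key: value" output into a dictionary. It is a minimal example, not part of xpdf: the function names <code>parse_pdfinfo_output</code> and <code>pdf_metadata</code> and the sample text are hypothetical, and it assumes a <code>pdfinfo</code> binary is on the PATH and that each output line carries one key and one value.

```python
import subprocess


def parse_pdfinfo_output(text):
    """Turn pdfinfo-style 'Key: value' lines into a dict (hypothetical helper)."""
    metadata = {}
    for line in text.splitlines():
        # Split on the first colon only, because values such as dates contain colons.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            metadata[key.strip()] = value.strip()
    return metadata


def pdf_metadata(path):
    """Run pdfinfo on a file and parse its output (assumes pdfinfo is installed)."""
    result = subprocess.run(
        ["pdfinfo", path], capture_output=True, text=True, check=True
    )
    return parse_pdfinfo_output(result.stdout)


# Made-up sample in the general shape of pdfinfo output, for illustration only.
sample = (
    "Title:          Annual Report\n"
    "CreationDate:   Tue Mar  4 10:15:00 2008\n"
    "Pages:          12\n"
)
info = parse_pdfinfo_output(sample)
```

All values are kept as strings; converting dates or page counts is left to the caller, since the exact fields vary between files.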
=Images=
; [[jhead]]
: http://www.sentex.net/~mwandel/jhead/
: Displays or modifies [[Exif]] data in [[JPEG]] files.
; [[vinetto]]
: http://vinetto.sourceforge.net/
: Examines [[Thumbs.db]] files.
; libexif
: http://sourceforge.net/projects/libexif
: Exif tag parsing library.
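To show where the Exif data read by the tools above sits inside a JPEG file, here is a rough sketch that walks the JPEG header segments and returns the payload of the APP1 segment tagged <code>Exif</code>. It is not how jhead or libexif are implemented; the function name <code>find_exif_segment</code> and the synthetic sample bytes are hypothetical, and the code assumes every marker before the image data carries a two-byte big-endian length, which holds for typical files but ignores padding bytes and standalone markers.

```python
import struct


def find_exif_segment(data):
    """Return the raw Exif payload from JPEG bytes, or None if absent."""
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None  # not at a marker; give up rather than guess
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: image data follows, headers are over
            return None
        # The length field counts itself (2 bytes) plus the payload.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return payload[6:]  # TIFF-structured Exif data follows the tag
        pos += 2 + length
    return None


# Tiny synthetic example: SOI, one APP1 segment with a fake Exif body, EOI.
body = b"Exif\x00\x00" + b"II*\x00"
sample = (
    b"\xff\xd8"
    + b"\xff\xe1" + struct.pack(">H", len(body) + 2) + body
    + b"\xff\xd9"
)
exif = find_exif_segment(sample)
```

Decoding the individual tags from the returned bytes is the part that a dedicated parser handles.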
=General=
These general-purpose programs often work when the special-purpose programs fail, but they generally provide less detailed information.
; [[Metadata Extraction Tool]]
: "Developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats like PDF documents, image files, sound files, Microsoft office documents, and many others."
: http://meta-extractor.sourceforge.net/
; [[Metadata Assistant]]
: http://www.payneconsulting.com/products/metadataent/
; [[hachoir|hachoir-metadata]]
: Metadata extraction tool, part of the '''[[Hachoir]]''' project.
; [[file]]
: The UNIX '''file''' program can extract some metadata.
; [[GNU libextractor]]
: http://gnunet.org/libextractor/
: The libextractor library is a pluggable system for extracting metadata.

