Document Metadata Extraction
From Forensics Wiki
The following tools extract metadata from document files.
Office Files
- wvWare
- http://wvware.sourceforge.net/
- Extracts metadata from various Microsoft Word files (doc). Can also convert doc files to other formats such as HTML or plain text.
PDF Files
- xpdf
- http://www.foolabs.com/xpdf/
- pdfinfo (part of the xpdf package) displays some metadata of PDF files.
- pdfimages
- Part of xpdf, this program extracts every image from a PDF file and saves each one to its own file.
(See PDF)
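As a rough illustration of working with pdfinfo, the sketch below runs it on a file and turns its "Key: value" output lines into a Python dictionary. The helper names (parse_pdfinfo, pdf_metadata) are invented for this example, and the sample text only imitates the general shape of pdfinfo output; the fields actually reported vary by file and by xpdf version.

```python
import subprocess


def parse_pdfinfo(text):
    """Turn "Key: value" lines into a dict, splitting on the first colon only."""
    info = {}
    for line in text.splitlines():
        # partition() keeps later colons (e.g. in timestamps) inside the value
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info


def pdf_metadata(path):
    """Run pdfinfo (assumed to be installed and on PATH) and parse its output."""
    result = subprocess.run(
        ["pdfinfo", path], capture_output=True, text=True, check=True
    )
    return parse_pdfinfo(result.stdout)


# Illustrative sample only; not taken from a real file.
SAMPLE = """Title:          Quarterly Report
Author:         J. Doe
CreationDate:   Tue Jan  1 12:00:00 2008
Pages:          12
"""

if __name__ == "__main__":
    for key, value in parse_pdfinfo(SAMPLE).items():
        print(f"{key}: {value}")
```

Splitting on the first colon only is a deliberate choice here, since date values contain colons of their own.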
Images
- jhead
- http://www.sentex.net/~mwandel/jhead/
- Displays or modifies Exif data in JPEG files.
- vinetto
- http://vinetto.sourceforge.net/
- Examines Thumbs.db files.
- libexif
- http://sourceforge.net/projects/libexif
- Exif tag parsing library.
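To give a sense of what Exif tools have to locate before they can display anything, here is a minimal sketch that walks the header segments of a JPEG and returns the raw Exif block from the APP1 segment, if one is present. It is not how jhead or libexif are implemented, the function name find_exif_segment is hypothetical, and it does no tag decoding; it only assumes the usual JPEG layout of marker-plus-length segments before the image data.

```python
import struct


def find_exif_segment(data):
    """Return the raw Exif payload (the TIFF block) from JPEG bytes, or None."""
    if data[:2] != b"\xff\xd8":           # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None                    # not positioned on a marker: give up
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):         # SOS or EOI: header segments are over
            return None
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return payload[6:]             # drop the "Exif\0\0" identifier
        pos += 2 + length
    return None


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as handle:
            block = find_exif_segment(handle.read())
        print("Exif block found" if block is not None else "no Exif block")
```

For real casework the dedicated tools listed above remain the better choice, since they also decode the individual tags.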
General
These general-purpose programs frequently work when the special-purpose programs fail, but they usually provide less detailed information.
- hachoir-metadata
- Metadata extraction tool, part of the Hachoir project.
- file
- The UNIX file program can extract some metadata.
- GNU libextractor
- http://gnunet.org/libextractor/
- The libextractor library is a pluggable system for extracting metadata.
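The fallback idea described for the general-purpose programs can be sketched as a small helper that tries a special-purpose tool first and moves on to a general one if that fails. The function names (first_successful, describe) are hypothetical, the choice of pdfinfo followed by file is only an example, and treating a zero exit status with non-empty output as success is a simplifying assumption that may not suit every tool.

```python
import subprocess
import sys


def first_successful(commands):
    """Run each command in order; return (command, stdout) for the first success."""
    for cmd in commands:
        try:
            result = subprocess.run(cmd, capture_output=True, text=True)
        except FileNotFoundError:
            continue                       # tool not installed: try the next one
        if result.returncode == 0 and result.stdout.strip():
            return cmd, result.stdout
    return None                            # nothing worked


def describe(path):
    """Try a special-purpose tool, then fall back to the general file program."""
    return first_successful([
        ["pdfinfo", path],   # special-purpose: more detail, PDF only
        ["file", path],      # general-purpose: broad coverage, less detail
    ])


if __name__ == "__main__" and len(sys.argv) > 1:
    outcome = describe(sys.argv[1])
    if outcome is None:
        print("no tool produced output")
    else:
        print(" ".join(outcome[0]))
        print(outcome[1], end="")
```

Returning the command along with its output records which tool actually produced the result, which is useful when documenting an examination.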