File Format Identification

From ForensicsWiki
Revision as of 00:54, 21 October 2008 by Kent (Talk | contribs) (Bibliography)


File format identification is the process of determining the format of a sequence of bytes. Operating systems typically rely on the file extension or on embedded MIME information. Forensic applications cannot trust either, since both can be changed or missing, and so need to identify file types by content.
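Content-based identification can be sketched as a lookup of known byte signatures ("magic numbers") at fixed offsets. The example below is a minimal illustration, not the method of any particular forensic tool: the `SIGNATURES` table, `identify_bytes`, and `identify_file` are hypothetical names introduced here, and the handful of signatures shown stands in for the much larger rule sets that real tools carry.

```python
from typing import Optional

# A small, illustrative signature table: (offset, magic bytes, label).
# Hypothetical and far from complete; real rule sets are much larger.
SIGNATURES = [
    (0, b"\x89PNG\r\n\x1a\n", "PNG image"),
    (0, b"\xff\xd8\xff", "JPEG image"),
    (0, b"GIF87a", "GIF image"),
    (0, b"GIF89a", "GIF image"),
    (0, b"%PDF-", "PDF document"),
    (0, b"PK\x03\x04", "ZIP archive"),
    (257, b"ustar", "tar archive"),
]


def identify_bytes(data: bytes) -> Optional[str]:
    """Return the label of the first matching signature, else None."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return None


def identify_file(path: str, probe_size: int = 512) -> Optional[str]:
    """Identify a file from its leading bytes, ignoring its name."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(probe_size))
```

Because only the bytes are inspected, a PNG image renamed to `report.txt` would still be labelled as a PNG image. A fixed-offset table like this only works when the start of the file is available; it says nothing about fragments taken from the middle of a file.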



libmagic

  • Written in C.
  • Rules are stored in /usr/share/file/magic and compiled at runtime.
  • Powers the Unix “file” command, but you can also call the library directly from a C program.
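The library behind the “file” command is libmagic, and its C API can be called without going through the command line. The sketch below uses Python's `ctypes` rather than a C program, so it is a swapped-in illustration of the same C calls (`magic_open`, `magic_load`, `magic_buffer`, `magic_close`), not C code. The helper name `describe_buffer` is hypothetical, and the function assumes a shared libmagic library and its default rule database are installed; it returns `None` when they are not.

```python
import ctypes
import ctypes.util
from typing import Optional

MAGIC_NONE = 0  # default flags, as defined in magic.h


def describe_buffer(data: bytes) -> Optional[str]:
    """Return libmagic's description of data, or None if unavailable."""
    libname = ctypes.util.find_library("magic")
    if libname is None:
        return None  # libmagic is not installed on this system
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None

    # Declare the C prototypes so pointers are not truncated to int.
    lib.magic_open.argtypes = [ctypes.c_int]
    lib.magic_open.restype = ctypes.c_void_p
    lib.magic_load.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    lib.magic_load.restype = ctypes.c_int
    lib.magic_buffer.argtypes = [ctypes.c_void_p, ctypes.c_char_p,
                                 ctypes.c_size_t]
    lib.magic_buffer.restype = ctypes.c_char_p
    lib.magic_close.argtypes = [ctypes.c_void_p]
    lib.magic_close.restype = None

    cookie = lib.magic_open(MAGIC_NONE)
    if not cookie:
        return None
    try:
        # Passing NULL asks libmagic to load its default rule database.
        if lib.magic_load(cookie, None) != 0:
            return None
        result = lib.magic_buffer(cookie, data, len(data))
        if result is None:
            return None
        return result.decode("utf-8", "replace")
    finally:
        lib.magic_close(cookie)
```

Calling `describe_buffer(b"%PDF-1.4\n")` on a system with libmagic installed should return a short textual description of the buffer; the exact wording depends on the installed rule database. A C program would make the same four calls after including `magic.h` and linking against the library.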



Stellent/Oracle Outside-In


Bibliography

The following research papers address the file format identification problem. Most of them concern identifying the format of a few file sectors rather than of an entire file.

  • Mason McDaniel, Automatic File Type Detection Algorithm, Master's Thesis, James Madison University, 2001.
  • John Haggerty and Mark Taylor, FORSIGS: Forensic Signature Analysis of the Hard Drive for Multimedia File Fingerprints, IFIP TC11 International Information Security Conference, 2006, Sandton, South Africa.
  • Martin Karresand and Nahid Shahmehri, Oscar – file type identification of binary data in disk clusters and RAM pages, IFIP Security and Privacy in Dynamic Environments, vol. 201 (2006), pp. 413–424.