File Format Identification
File Format Identification is the process of determining the format of a sequence of bytes. Operating systems typically do this by file extension or by embedded MIME information. Forensic applications need to identify file types by content, because an extension or embedded label can be missing, renamed, or deliberately misleading.
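As a rough illustration of identification by content, the sketch below checks a byte sequence against a small table of well-known leading signatures ("magic numbers"). The table `SIGNATURES` and the function `identify_by_content` are hypothetical names invented for this example, and the table is deliberately tiny; real tools use far larger rule sets and more elaborate tests.

```python
from typing import Optional

# Hypothetical, deliberately small signature table for illustration only.
# Each entry is (offset, magic bytes expected at that offset, label).
SIGNATURES = [
    (0, b"\x89PNG\r\n\x1a\n", "image/png"),
    (0, b"\xff\xd8\xff", "image/jpeg"),
    (0, b"GIF87a", "image/gif"),
    (0, b"GIF89a", "image/gif"),
    (0, b"%PDF-", "application/pdf"),
    (0, b"PK\x03\x04", "application/zip"),
    (257, b"ustar", "application/x-tar"),
]


def identify_by_content(data: bytes) -> Optional[str]:
    """Return the label of the first matching signature, or None if none match."""
    for offset, magic, label in SIGNATURES:
        # Slicing past the end of data yields a shorter bytes object,
        # so short inputs simply fail the comparison.
        if data[offset:offset + len(magic)] == magic:
            return label
    return None
```

A call such as `identify_by_content(open(path, "rb").read(512))` ignores the file name entirely, so a renamed file is still matched by its leading bytes. Note that a ZIP signature also matches container formats built on ZIP, which is one reason production tools need more than a single leading-bytes test.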
- Written in C.
- Rules are stored in /usr/share/file/magic and compiled at runtime.
- Powers the Unix “file” command, but you can also call the library directly from a C program.
- Written in Java.
- Developed by National Archives of the United Kingdom.
- XML config file
- Closed source; free for non-commercial use
- Proprietary but free demo.
The following research papers address the file format identification problem. Most of them focus on identifying the format of a few file sectors rather than of an entire file.
- Mason McDaniel, Automatic File Type Detection Algorithm, Master's Thesis, James Madison University, 2001
- Content Based File Type Detection Algorithms, 36th Annual Hawaii International Conference on System Sciences (HICSS'03), Track 9, p. 332a, 2003.
- Fileprints: identifying file types by n-gram analysis, Li Wei-Jen, Wang Ke, Stolfo SJ, Herzog B., Proceedings of the 2005 IEEE Workshop on Information Assurance; 2005 [slides]
- File type identification of data fragments by their binary structure, Karresand Martin, Shahmehri Nahid. Proceedings of the IEEE Workshop on Information Assurance; 2006. p. 140–7. [slides]
- Using Artificial Neural Networks for Forensic File Type Identification, Ryan M. Harris, Master's Thesis, Purdue University, May 2007
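To give a feel for fragment-level identification, the sketch below classifies a block of bytes by its byte-value frequency distribution (a 1-gram histogram), loosely in the spirit of the n-gram analysis named in the Fileprints title. It is a simplified illustration, not the method of any paper listed above: the function names, the use of Manhattan distance, and the idea of one reference histogram per type are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List


def byte_histogram(fragment: bytes) -> List[float]:
    """Return the normalized frequency of each of the 256 possible byte values."""
    if not fragment:
        return [0.0] * 256
    counts = Counter(fragment)
    total = len(fragment)
    return [counts.get(value, 0) / total for value in range(256)]


def manhattan_distance(a: List[float], b: List[float]) -> float:
    """Sum of absolute differences; an assumed, deliberately simple metric."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify_fragment(fragment: bytes, models: Dict[str, List[float]]) -> str:
    """Return the label whose reference histogram is closest to the fragment's."""
    hist = byte_histogram(fragment)
    return min(models, key=lambda label: manhattan_distance(hist, models[label]))
```

In use, `models` would map each file type to a histogram built from known samples of that type, for example `{"text": byte_histogram(sample_text), "jpeg": byte_histogram(sample_jpeg)}`, where the sample variables are hypothetical training data. Because only byte statistics are used, the fragment does not need to contain a file header.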