Difference between revisions of "File Format Identification"
From Forensics Wiki
Revision as of 01:54, 19 December 2009
File Format Identification is the process of figuring out the format of a sequence of bytes. Operating systems typically do this by file extension or by embedded MIME information. Forensic applications need to identify file types by content.
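The contrast between extension-based and content-based identification can be sketched in a few lines of Python. This is only an illustration: the `SIGNATURES` table and both function names are made up for this example, and the table holds a handful of well-known magic numbers rather than the full rule sets that real tools ship.

```python
import mimetypes
from typing import Optional

# Hypothetical, deliberately tiny signature table: (offset, magic bytes, MIME type).
# Real identification tools use far larger and more detailed rule sets.
SIGNATURES = [
    (0, b"\x89PNG\r\n\x1a\n", "image/png"),
    (0, b"\xff\xd8\xff", "image/jpeg"),
    (0, b"%PDF-", "application/pdf"),
    (0, b"PK\x03\x04", "application/zip"),
    (0, b"GIF87a", "image/gif"),
    (0, b"GIF89a", "image/gif"),
]


def identify_by_content(data: bytes) -> Optional[str]:
    """Return the MIME type of the first signature found in the data, or None."""
    for offset, magic, mime in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return mime
    return None


def identify_by_extension(name: str) -> Optional[str]:
    """Return the MIME type inferred from the file name alone (no content check)."""
    return mimetypes.guess_type(name)[0]
```

A PDF renamed to `holiday.jpg` would be reported as a JPEG by `identify_by_extension`, while `identify_by_content` still sees the `%PDF-` header. That mismatch is why forensic tools examine the bytes rather than trusting the name.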
Tools
libmagic
- Written in C.
- Rules are stored in /usr/share/file/magic and compiled at runtime.
- Powers the Unix “file” command, but you can also call the library directly from a C program.
- http://sourceforge.net/projects/libmagic
DROID
- Written in Java.
- Developed by the National Archives of the United Kingdom.
- http://droid.sourceforge.net
TrID
- XML config file
- Closed source; free for non-commercial use
- http://mark0.net/soft-trid-e.html
Forensic Innovations File Investigator TOOLS
- Proprietary, but free trial available.
- Available as consumer applications and OEM API.
- Identifies 3,000+ file types, using multiple methods to maintain high accuracy.
- Extracts metadata for many of the supported file types.
- http://www.forensicinnovations.com/fitools.html
Stellent/Oracle Outside-In
- Proprietary but free demo.
- http://www.oracle.com/technology/products/content-management/oit/oit_all.html
Forensic Assistant
- Proprietary.
- Provides detection of password protected archives, some files of cryptographic programs, Pinch/Zeus binary reports, etc.
Bibliography
Current research papers on the file format identification problem. Most of these papers address identifying the format of a few file sectors rather than an entire file. Note that this bibliography is in chronological order.
- 2001
- Mason McDaniel, Automatic File Type Detection Algorithm, Master's Thesis, James Madison University, 2001.
- 2003
- Content Based File Type Detection Algorithms, Mason McDaniel and M. Hossain Heydari, 36th Annual Hawaii International Conference on System Sciences (HICSS'03) - Track 9, 2003.
- 2005
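- Fileprints: identifying file types by n-gram analysis, Li Wei-Jen, Wang Ke, Stolfo SJ, Herzog B., Proceedings of the 2005 IEEE Workshop on Information Assurance, 2005. Slides: http://www.itoc.usma.edu/workshop/2005/Papers/Follow%20ups/FilePrintPresentation-final.pdf PDF: http://www1.cs.columbia.edu/ids/publications/FilePrintPaper-revised.pdf
- Douglas J. Hickok, Daine Richard Lesniak, Michael C. Rowe, File Type Detection Technology, 2005 Midwest Instruction and Computing Symposium. PDF: http://www.micsymposium.org/mics_2005/papers/paper7.pdf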
- 2006
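- Karresand Martin, Shahmehri Nahid, File type identification of data fragments by their binary structure, Proceedings of the IEEE Workshop on Information Assurance, pp. 140–147, 2006. http://ieeexplore.ieee.org/iel5/10992/34632/01652088.pdf Slides: http://www.itoc.usma.edu/workshop/2006/Program/Presentations/IAW2006-07-3.pdf
- Gregory A. Hall, Sliding Window Measurement for File Type Identification, Computer Forensics and Intrusion Analysis Group, ManTech Security and Mission Assurance, 2006. PDF: http://www.mantechcfia.com/SlidingWindowMeasurementforFileTypeIdentification.pdf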
- FORSIGS; Forensic Signature Analysis of the Hard Drive for Multimedia File Fingerprints, John Haggerty and Mark Taylor, IFIP TC11 International Information Security Conference, 2006, Sandton, South Africa.
- Martin Karresand, Nahid Shahmehri, "Oscar -- Using Byte Pairs to Find File Type and Camera Make of Data Fragments," Annual Workshop on Digital Forensics and Incident Analysis, Pontypridd, Wales, UK, pp. 85-94, Springer-Verlag, 2006.
- Karresand M., Shahmehri N., Oscar: File Type Identification of Binary Data in Disk Clusters and RAM Pages, Proceedings of IFIP International Information Security Conference: Security and Privacy in Dynamic Environments (SEC2006), Springer, ISBN 0-387-33405-x, pp. 413-424, May 22-24, 2006, Karlstad, Sweden. http://dx.doi.org/10.1007/0-387-33406-8_35
- 2007
- Robert F. Erbacher and John Mulholland, "Identification and Localization of Data Types within Large-Scale File Systems," Proceedings of the 2nd International Workshop on Systematic Approaches to Digital Forensic Engineering, Seattle, WA, April 2007.
- Ryan M. Harris, "Using Artificial Neural Networks for Forensic File Type Identification," Master's Thesis, Purdue University, May 2007. PDF: https://www.cerias.purdue.edu/tools_and_resources/bibtex_archive/archive/2007-19.pdf
- 2008
- Sarah J. Moody and Robert F. Erbacher, SÁDI – Statistical Analysis for Data type Identification, 3rd International Workshop on Systematic Approaches to Digital Forensic Engineering, 2008. http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=04545366
- Predicting the Types of File Fragments, William Calhoun, Drue Coles, DFRWS 2008. Slides: http://www.dfrws.org/2008/proceedings/p14-calhoun_pres.pdf PDF: http://www.dfrws.org/2008/proceedings/p14-calhoun.pdf
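- Mehdi Chehel Amirani, Mohsen Toorani, and Ali Asghar Beheshti Shirazi, A New Approach to Content-based File Type Detection, Proceedings of the 13th IEEE Symposium on Computers and Communications (ISCC'08), pp. 1103-1108, IEEE ComSoc, Marrakech, Morocco, July 2008. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4625611 Slides: http://webpages.iust.ac.ir/mtoorani/FTD.pdf PDF: http://webpages.iust.ac.ir/mtoorani/C2.pdf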
- 2009
- Roussev, Vassil, and Garfinkel, Simson, "File Fragment Classification-The Case for Specialized Approaches," Systematic Approaches to Digital Forensics Engineering (IEEE/SADFE 2009), Oakland, California. PDF: http://simson.net/clips/academic/2009.SADFE.Fragments.pdf
- Irfan Ahmed, Kyung-suk Lhee, Hyunjung Shin and ManPyo Hong, On Improving the Accuracy and Performance of Content-based File Type Identification, Proceedings of the 14th Australasian Conference on Information Security and Privacy (ACISP 2009), pp. 44-59, LNCS (Springer), Brisbane, Australia, July 2009. http://www.springerlink.com/content/g2655k2044615q75/
- 2010
- Irfan Ahmed, Kyung-suk Lhee, Hyunjung Shin and ManPyo Hong, Fast File-type Identification, Proceedings of the 25th ACM Symposium on Applied Computing (ACM SAC 2010), ACM, Sierre, Switzerland, March 2010. http://www.alphaminers.net/sub05/sub05_03.php?swf_pn=5&swf_sn=3&swf_pn2=3
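
Several of the papers listed above work on isolated fragments, where no header or file name is available, and rely on byte-level statistics instead. The following Python sketch illustrates that general idea with a single-byte (1-gram) histogram, Shannon entropy, and a nearest-reference comparison. It is not the algorithm of any specific paper in this bibliography; the function names, the Manhattan distance, and the notion of a per-type reference histogram are assumptions chosen for this illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def byte_histogram(fragment: bytes) -> List[float]:
    """Return the relative frequency of each of the 256 byte values."""
    total = len(fragment)
    if total == 0:
        return [0.0] * 256
    counts = Counter(fragment)
    return [counts.get(value, 0) / total for value in range(256)]


def shannon_entropy(fragment: bytes) -> float:
    """Return the entropy of the fragment in bits per byte (0.0 to 8.0)."""
    return -sum(p * math.log2(p) for p in byte_histogram(fragment) if p > 0)


def manhattan_distance(a: List[float], b: List[float]) -> float:
    """Sum of absolute differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(fragment: bytes, references: Dict[str, List[float]]) -> str:
    """Return the label whose reference histogram is closest to the fragment's."""
    hist = byte_histogram(fragment)
    return min(references, key=lambda label: manhattan_distance(hist, references[label]))
```

In practice the reference histograms would be averaged over many known samples of each file type. Compressed and encrypted data all sit near 8 bits per byte of entropy, so a single-byte histogram alone separates such types poorly.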