Difference between pages "Linux" and "File Carving"

From ForensicsWiki
(Difference between pages)
 
(Memory Carving)
 
Line 1: Line 1:
{{Expand}}
+
'''File Carving,''' or sometimes simply '''Carving,''' is the practice of searching an input for files or other kinds of objects based on content, rather than on metadata. File carving is a powerful tool for recovering files and fragments of files when directory entries are corrupt or missing, as may be the case with old files that have been deleted or when performing an analysis on damaged media. Memory carving is a useful tool for analyzing physical and virtual memory dumps when the memory structures are unknown or have been overwritten.
  
Linux refers to the family of Unix-like computer operating systems that use the Linux kernel. A wide range of forensic tools is available for Linux.
 
  
The wide variety of useful Linux utilities that exist for desktop computers can also be used on Linux-based PDAs. These utilities can often be used as part of the [[forensics investigation]] process.
+
Most file carvers operate by looking for file headers and/or footers, and then "carving out" the blocks between these two boundaries. [[Semantic Carving]] performs carving based on an analysis of the contents of the proposed files.  
  
Software for Linux systems is not only targeted at personal computers (desktops, laptops, etc.); server-based tools also exist for accessing, monitoring, and analysing servers.
+
File carving should be done on a [[disk image]], rather than on the original disk.
  
== Specialist Software ==
+
File carving tools are listed on the [[Tools:Data_Recovery]] wiki page.
  
=== Helix ===
+
Many carving programs have an option to look for headers only at or near sector boundaries. However, searching the entire input can find files that have been embedded into other files, such as [[JPEG]]s embedded in [[Microsoft]] [[DOC|Word documents]]. This may be considered an advantage or a disadvantage, depending on the circumstances.
  
[http://www.e-fense.com/h3-enterprise.php Helix] is a live Linux CD designed for live incident response. Helix is targeted towards the more experienced users and forensic investigators.  
+
The majority of file carving programs will only recover files that are contiguous on the media (in other words files that are not fragmented).
  
The latest version of Helix, Helix 3, is based on the Ubuntu distribution of Linux, which allows for greater stability and ease of use.
+
== Fragmented File Recovery ==
 +
[[Simson Garfinkel]] estimated that up to 58% of Outlook files, 17% of JPEG files, and 16% of MS Word files are fragmented and, therefore, appear corrupted or missing to a user using traditional data carving. The first file carving programs that can handle fragmented files automatically have now arrived.
 +
[[User:PashaPal|A. Pal]], [[User:NasirMemon|N. Memon]], T. Sencar, and K. Shanmugasundaram have introduced a technique called [[File_Carving:SmartCarving|SmartCarving]] that can recover fragmented files.
  
Because Helix is a live disc, it can be run on a "suspect" machine while the installed operating system remains inactive. Live network forensics is also possible when running the Helix live disc, allowing users to perform checks on the networks their machines are attached to.
+
== File Carving Taxonomy==
 +
[[Simson Garfinkel]] and [[Joachim Metz]] have proposed the following file carving taxonomy:
  
== Tools ==
+
;Carving
 +
:General term for extracting data (files) out of undifferentiated blocks (raw data), like "carving" a sculpture out of soapstone.
  
=== dd ===
+
;Block-Based Carving
 +
:Any carving method (algorithm) that analyzes the input on a block-by-block basis to determine if a block is part of a possible output file. This method assumes that each block can only be part of a single file (or embedded file).
  
'''[[dd]]''' (sometimes expanded as "duplicate disk") is a Unix and Linux utility that allows the user to create a bitstream image of a disk or device. Once the Linux-based PDA is connected to another device and the dd utility is run, the mirror image can be uploaded onto [[memory card]]s or even an external desktop workstation connected via a network. Images created by dd are readable by [[forensics software]] tools such as [[EnCase]] and [[Forensic Toolkit]]. Since the device uses a Linux [[filesystem]], the image may also be mounted and examined on a Linux workstation.
+
;Statistical Carving
 +
:Any carving method (algorithm) that analyzes a characteristic or statistic of the input (for example, entropy) to determine if the input is part of a possible output file.
  
=== foremost ===
+
;Header/Footer Carving
 +
:A method for carving files out of raw data using a distinct header (start of file marker) and footer (end of file marker).
  
'''[[foremost]]''' is a Linux-based program for [[Recovering_deleted_data|recovering deleted files]] and served as the basis for the more modern [[Scalpel]]. The program uses a configuration file to specify [[File_Formats|headers and footers]] to search for. Intended to be run on disk images, foremost can search through almost any kind of data without worrying about the format.
+
;Header/Maximum (file) size Carving
 +
:A method for carving files out of raw data using a distinct header (start of file marker) and a maximum (file) size. This approach works because many file formats (e.g. JPEG, MP3) do not care if additional junk is appended to the end of a valid file.
  
=== EtherApe ===
+
;Header/Embedded Length Carving
 +
:A method for carving files out of raw data using a distinct header and a file length (size) that is embedded in the file format.
  
[http://etherape.sourceforge.net/ EtherApe] is a free program modeled on Etherman. It is designed as a high-level network monitoring tool that gives the user a graphical display of packet information. Although EtherApe might be seen as a security-oriented tool, it does have forensic applications.
+
;File structure based Carving
 +
:A method for carving files out of raw data using a certain level of knowledge of the internal structure of file types. Garfinkel called this approach "Semantic Carving" in his DFRWS2006 carving challenge submission, while Metz and Mora called the approach "Deep Carving."
  
EtherApe has two main modes. The first is live monitoring, which can be run on a server machine to map any packets passing to and from that machine, showing the type of packet by colour and the amount of traffic of that type by diameter. It is also possible to see the different nodes attached, by IP and IPv6 address.
+
;Semantic Carving
 +
:A method for carving files based on a linguistic analysis of the file's content. For example, a semantic carver might conclude that six blocks of French in the middle of a long HTML file written in English are a fragment left over from a previously allocated file, and not part of the English-language HTML file.
  
EtherApe's second mode is review: it reads a file of packets captured by tcpdump or another piece of capture software and displays the same information as it does with a live capture, but from the imported data file instead of the live network. Files can be reviewed on any machine, regardless of network connectivity.
+
;Carving with Validation
 +
:A method for carving files out of raw data where the carved files are validated using a file type specific validator.
  
=References=
+
;Fragment Recovery Carving
 +
:A carving method in which two or more fragments are reassembled to form the original file or object. Garfinkel previously called this approach "Split Carving."
  
* http://en.wikipedia.org/wiki/Linux
+
;Repackaging Carving
* http://en.wikipedia.org/wiki/Android_(mobile_device_platform)
+
:A carving method that modifies the extracted data by adding new headers, footers, or other information so that it can be viewed with standard utilities. For example, Garfinkel's [[ZIP Carver]] looks for individual components of a ZIP file and repackages them with a new Central Directory so that they can be opened with a standard unzip utility.
* http://www.android-freeware.org/
+
  
[[Category:Operating systems]]
+
== File Carving challenges and test images ==
 +
 
 +
[http://www.dfrws.org/2006/challenge/ File Carving Challenge] - [[Digital Forensic Research Workshop|DFRWS]] 2006
 +
 
 +
[http://www.dfrws.org/2007/challenge/ File Carving Challenge] - [[Digital Forensic Research Workshop|DFRWS]] 2007
 +
 
 +
[http://dftt.sourceforge.net/test6/index.html FAT Undelete Test #1] - Digital Forensics Tool Testing Image (dftt #6)
 +
 
 +
[http://dftt.sourceforge.net/test7/index.html NTFS Undelete (and leap year) Test #1] - Digital Forensics Tool Testing Image (dftt #7)
 +
 
 +
[http://dftt.sourceforge.net/test11/index.html Basic Data Carving Test - fat32], Nick Mikus - Digital Forensics Tool Testing Image (dftt #11)
 +
 
 +
[http://dftt.sourceforge.net/test12/index.html Basic Data Carving Test - ext2],  Nick Mikus - Digital Forensics Tool Testing Image (dftt #12)
 +
 
 +
== See also ==
 +
* [[Tools:Data_Recovery#Carving | File Carving Tools]]
 +
* [[File Carving Bibliography]]
 +
* [[Carver 2.0 Planning Page]]
 +
* [[File Carving:SmartCarving|SmartCarving]]
 +
 
 +
=Memory Carving=
 +
 
 +
== External Links ==
 +
* [http://sourceforge.net/projects/revit/files/Documentation/Master%20Thesis%20-%20Advanced%20File%20Carving/ Measuring and Improving the Quality of File Carving Methods], by [[Bas Kloet]]

Revision as of 04:45, 31 July 2012

File Carving, or sometimes simply Carving, is the practice of searching an input for files or other kinds of objects based on content, rather than on metadata. File carving is a powerful tool for recovering files and fragments of files when directory entries are corrupt or missing, as may be the case with old files that have been deleted or when performing an analysis on damaged media. Memory carving is a useful tool for analyzing physical and virtual memory dumps when the memory structures are unknown or have been overwritten.


Most file carvers operate by looking for file headers and/or footers, and then "carving out" the blocks between these two boundaries. Semantic Carving performs carving based on an analysis of the contents of the proposed files.
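
As a rough illustration of the header/footer approach described above, the following Python sketch scans a byte string for a header and then for the next footer after it. The function name `carve_header_footer`, the `max_size` limit, and the choice of JPEG markers are assumptions made for this example, not taken from any particular carving tool. Real carvers handle many more cases, such as JPEGs that contain embedded thumbnails with their own end markers.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # JPEG start-of-image marker plus the first byte of the next marker
JPEG_FOOTER = b"\xff\xd9"      # JPEG end-of-image marker


def carve_header_footer(data, header=JPEG_HEADER, footer=JPEG_FOOTER,
                        max_size=10 * 1024 * 1024):
    """Return (start, end) byte ranges, one per header that is followed by a footer.

    max_size is an assumed upper bound on how far past the header to look.
    """
    ranges = []
    pos = 0
    while True:
        start = data.find(header, pos)
        if start == -1:
            break
        # Look for the footer only within max_size bytes of the header.
        end = data.find(footer, start + len(header), start + max_size)
        if end == -1:
            pos = start + 1  # no footer in range; continue scanning after this header
            continue
        end += len(footer)   # include the footer bytes in the carved range
        ranges.append((start, end))
        pos = end
    return ranges


if __name__ == "__main__":
    # Synthetic input: padding, a tiny fake "JPEG", more padding.
    fake_jpeg = JPEG_HEADER + b"\xe0" + b"payload" + JPEG_FOOTER
    image = b"\x00" * 512 + fake_jpeg + b"\x00" * 100
    for start, end in carve_header_footer(image):
        print(start, end, image[start:end] == fake_jpeg)
```

In practice the input would be read from a disk image file rather than built in memory, and each carved range would be written out to its own file.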

File carving should be done on a disk image, rather than on the original disk.

File carving tools are listed on the Tools:Data_Recovery wiki page.

Many carving programs have an option to look for headers only at or near sector boundaries. However, searching the entire input can find files that have been embedded into other files, such as JPEGs embedded in Microsoft Word documents. This may be considered an advantage or a disadvantage, depending on the circumstances.
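
The difference between the two search strategies can be sketched in Python as follows. The helper `find_headers` and the 512-byte sector size are assumptions for illustration only; actual tools expose this choice through their own options.

```python
SECTOR_SIZE = 512  # assumed sector size for this sketch


def find_headers(data, header, sector_aligned=True, sector_size=SECTOR_SIZE):
    """Return the offsets at which header occurs in data.

    With sector_aligned=True only offsets that start a sector are checked,
    which is faster but misses files embedded inside other files.
    """
    if sector_aligned:
        return [off for off in range(0, len(data), sector_size)
                if data.startswith(header, off)]
    offsets = []
    pos = data.find(header)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(header, pos + 1)
    return offsets


if __name__ == "__main__":
    header = b"\xff\xd8\xff"
    # One header at the start of sector 0 and one embedded at offset 700.
    data = header + b"\x00" * 697 + header + b"\x00" * 321
    print(find_headers(data, header, sector_aligned=True))
    print(find_headers(data, header, sector_aligned=False))
```

The aligned search reports only the header at the sector start, while the full search also reports the embedded one.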

The majority of file carving programs will only recover files that are contiguous on the media (in other words files that are not fragmented).

Fragmented File Recovery

Simson Garfinkel estimated that up to 58% of Outlook files, 17% of JPEG files, and 16% of MS Word files are fragmented and, therefore, appear corrupted or missing to a user using traditional data carving. The first file carving programs that can handle fragmented files automatically have now arrived. A. Pal, N. Memon, T. Sencar, and K. Shanmugasundaram have introduced a technique called SmartCarving that can recover fragmented files.

File Carving Taxonomy

Simson Garfinkel and Joachim Metz have proposed the following file carving taxonomy:

Carving
General term for extracting data (files) out of undifferentiated blocks (raw data), like "carving" a sculpture out of soapstone.
Block-Based Carving
Any carving method (algorithm) that analyzes the input on a block-by-block basis to determine if a block is part of a possible output file. This method assumes that each block can only be part of a single file (or embedded file).
Statistical Carving
Any carving method (algorithm) that analyzes a characteristic or statistic of the input (for example, entropy) to determine if the input is part of a possible output file.
Header/Footer Carving
A method for carving files out of raw data using a distinct header (start of file marker) and footer (end of file marker).
Header/Maximum (file) size Carving
A method for carving files out of raw data using a distinct header (start of file marker) and a maximum (file) size. This approach works because many file formats (e.g. JPEG, MP3) do not care if additional junk is appended to the end of a valid file.
Header/Embedded Length Carving
A method for carving files out of raw data using a distinct header and a file length (size) that is embedded in the file format.
File structure based Carving
A method for carving files out of raw data using a certain level of knowledge of the internal structure of file types. Garfinkel called this approach "Semantic Carving" in his DFRWS2006 carving challenge submission, while Metz and Mora called the approach "Deep Carving."
Semantic Carving
A method for carving files based on a linguistic analysis of the file's content. For example, a semantic carver might conclude that six blocks of French in the middle of a long HTML file written in English are a fragment left over from a previously allocated file, and not part of the English-language HTML file.
Carving with Validation
A method for carving files out of raw data where the carved files are validated using a file type specific validator.
Fragment Recovery Carving
A carving method in which two or more fragments are reassembled to form the original file or object. Garfinkel previously called this approach "Split Carving."
Repackaging Carving
A carving method that modifies the extracted data by adding new headers, footers, or other information so that it can be viewed with standard utilities. For example, Garfinkel's ZIP Carver looks for individual components of a ZIP file and repackages them with a new Central Directory so that they can be opened with a standard unzip utility.
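
To make the Statistical Carving entry in the taxonomy above more concrete, here is a minimal Python sketch that computes the Shannon entropy of each block of an input. The function names, the 512-byte block size, and the 7.0 bits-per-byte threshold are all assumptions chosen for illustration. High entropy only suggests that a block may hold compressed or encrypted data; it does not identify a file type on its own.

```python
import math
from collections import Counter


def block_entropy(block):
    """Shannon entropy of a block in bits per byte (between 0.0 and 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)  # frequency of each byte value in the block
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def classify_blocks(data, block_size=512, threshold=7.0):
    """Yield (offset, entropy, is_high_entropy) for each block of data.

    block_size and threshold are assumed values, not recommendations.
    """
    for off in range(0, len(data), block_size):
        h = block_entropy(data[off:off + block_size])
        yield off, h, h >= threshold


if __name__ == "__main__":
    # One block of zeros (low entropy) and one block with every byte value
    # occurring equally often (maximum entropy).
    data = b"\x00" * 512 + bytes(range(256)) * 2
    for off, h, is_high in classify_blocks(data):
        print(off, round(h, 3), is_high)
```

A statistical carver could use such per-block measurements to decide which neighbouring blocks plausibly belong to the same file.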

File Carving challenges and test images

File Carving Challenge - DFRWS 2006

File Carving Challenge - DFRWS 2007

FAT Undelete Test #1 - Digital Forensics Tool Testing Image (dftt #6)

NTFS Undelete (and leap year) Test #1 - Digital Forensics Tool Testing Image (dftt #7)

Basic Data Carving Test - fat32, Nick Mikus - Digital Forensics Tool Testing Image (dftt #11)

Basic Data Carving Test - ext2, Nick Mikus - Digital Forensics Tool Testing Image (dftt #12)

See also

Memory Carving

External Links