Difference between pages "File Carving" and "Forensic Disk Differencing"

From ForensicsWiki
'''Carving''' is the practice of searching an input for files or other kinds of objects based on content, rather than on metadata. File carving is a powerful tool for recovering files and fragments of files when directory entries are corrupt or missing, as may be the case with old files that have been deleted or when performing an analysis on damaged media. Memory carving is a useful tool for analyzing physical and virtual memory dumps when the memory structures are unknown or have been overwritten.

=File Carving=

Most file carvers operate by looking for file headers and/or footers, and then "carving out" the blocks between these two boundaries. [[Semantic Carving]] performs carving based on an analysis of the contents of the proposed files.

File carving should be done on a [[disk image]], rather than on the original disk.

File carving tools are listed on the [[Tools:Data_Recovery]] wiki page.

Many carving programs have an option to look for headers only at or near sector boundaries. However, searching the entire input can find files that have been embedded into other files, such as [[JPEG]]s embedded into [[Microsoft]] [[DOC|Word documents]]. This may be considered an advantage or a disadvantage, depending on the circumstances.

Today most file carving programs will only recover files that are contiguous on the media.
  
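The header/footer approach described above can be illustrated with a minimal Python sketch. It is not taken from any particular carving tool: the function name `carve_jpegs` and the parameters `sector_size` and `max_size` are hypothetical, chosen only for this illustration. It relies on the JPEG start-of-image marker (FF D8) and end-of-image marker (FF D9), and like most carvers it recovers only contiguous data.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # start-of-image marker plus first byte of the next marker
JPEG_FOOTER = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(data, sector_size=None, max_size=20 * 1024 * 1024):
    """Yield (offset, carved_bytes) for each header..footer span in data.

    If sector_size is given, only headers that start on a sector
    boundary are accepted; otherwise every offset is considered, which
    also finds JPEGs embedded inside other files.
    """
    pos = 0
    while True:
        start = data.find(JPEG_HEADER, pos)
        if start == -1:
            return
        pos = start + 1  # resume the search just past this header
        if sector_size and start % sector_size != 0:
            continue
        # Look for the first footer within max_size bytes of the header.
        end = data.find(JPEG_FOOTER, start + len(JPEG_HEADER), start + max_size)
        if end == -1:
            continue
        yield start, data[start:end + len(JPEG_FOOTER)]
```

Calling `list(carve_jpegs(image_bytes, sector_size=512))` restricts the search to sector-aligned headers, while omitting `sector_size` searches the entire input. Because the sketch stops at the first footer it sees, a JPEG that contains an embedded thumbnail may be truncated; real carvers typically add validation to handle such cases.
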
== File Carving Taxonomy ==

[[Simson Garfinkel]] and [[Joachim Metz]] have proposed the following file carving taxonomy:

;Carving
:General term for extracting data (files) out of undifferentiated blocks (raw data), like "carving" a sculpture out of soapstone.

;Block Based Carving
:Any carving method (algorithm) that analyzes the input on a block-by-block basis to determine if a block is part of a possible output file. This method assumes that each block can only be part of a single file (or embedded file).

;Characteristic Based Carving
:Any carving method (algorithm) that analyzes the input on the basis of a characteristic (for example, entropy) to determine if the input is part of a possible output file.

;Header/Footer Carving
:A method for carving files out of raw data using a distinct header (start of file marker) and footer (end of file marker).

;Header/Maximum (file) size Carving
:A method for carving files out of raw data using a distinct header (start of file marker) and a maximum (file) size. This approach works because many file formats (e.g. JPEG, MP3) do not care if additional junk is appended to the end of a valid file.

;Header/Embedded Length Carving
:A method for carving files out of raw data using a distinct header and a file length (size) which is embedded in the file format.

;File structure based Carving
:A method for carving files out of raw data using a certain level of knowledge of the internal structure of file types. Garfinkel called this approach "Semantic Carving" in his DFRWS2006 carving challenge submission, while Metz and Mora called the approach "Deep Carving."

;Semantic Carving
:A method for carving files based on a linguistic analysis of the file's content. For example, a semantic carver might conclude that six blocks of French in the middle of a long HTML file written in English are a fragment left over from a previously allocated file, and not part of the English-language HTML file.

;Carving with Validation
:A method for carving files out of raw data where the carved files are validated using a file type specific validator.

;Fragment Recovery Carving
:A carving method in which two or more fragments are reassembled to form the original file or object. Garfinkel previously called this approach "Split Carving."
 
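As an illustration of the Header/Embedded Length entry in the taxonomy, the following Python sketch carves BMP images. A BMP file begins with the two bytes "BM", followed by a four-byte little-endian field holding the total file size. The function name `carve_bmp` and the `max_size` sanity limit are hypothetical, and the sketch assumes the file is stored contiguously.

```python
import struct

BMP_MAGIC = b"BM"        # BMP signature at offset 0
BMP_FILE_HEADER_LEN = 14  # size of the fixed BMP file header


def carve_bmp(data, max_size=50 * 1024 * 1024):
    """Return a list of (offset, carved_bytes) using the embedded length field."""
    results = []
    pos = 0
    while True:
        start = data.find(BMP_MAGIC, pos)
        if start == -1:
            break
        pos = start + 1
        if start + 6 > len(data):
            break  # not enough bytes left to hold the length field
        # Bytes 2..5 of the header hold the total file size (little-endian).
        (length,) = struct.unpack_from("<I", data, start + 2)
        # Reject implausible lengths; "BM" occurs by chance in random data.
        if BMP_FILE_HEADER_LEN <= length <= max_size and start + length <= len(data):
            results.append((start, data[start:start + length]))
    return results
```

The two-byte signature is weak, so false positives are likely on real media; combining this method with a file type specific validator (Carving with Validation) is the usual remedy.
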
== File Carving challenges and test images ==

[http://www.dfrws.org/2006/challenge/]
File Carving Challenge - [[Digital Forensic Research Workshop|DFRWS]] 2006

[http://dftt.sourceforge.net/test6/index.html]
FAT Undelete Test #1 - Digital Forensics Tool Testing Image (dftt #6)

[http://dftt.sourceforge.net/test7/index.html]
NTFS Undelete (and leap year) Test #1 - Digital Forensics Tool Testing Image (dftt #7)

[http://dftt.sourceforge.net/test11/index.html]
Basic Data Carving Test - fat32 (by Nick Mikus) - Digital Forensics Tool Testing Image (dftt #11)

[http://dftt.sourceforge.net/test12/index.html]
Basic Data Carving Test - ext2 (by Nick Mikus) - Digital Forensics Tool Testing Image (dftt #12)

==File Carving Bibliography==

Mikus, Nicholas A. "An analysis of disc carving techniques," Master's Thesis, Naval Postgraduate School. March 2005. http://handle.dtic.mil/100.2/ADA432468

Garfinkel, S., "Carving Contiguous and Fragmented Files with Fast Object Validation", Digital Forensics Workshop (DFRWS 2007), Pittsburgh, PA, August 2007. http://www.simson.net/clips/academic/2007.DFRWS.pdf

== See also ==
* [[Tools:Data_Recovery#Carving | File Carving Tools]]
* [[File Carving Bibliography]]

=Memory Carving=

Latest revision as of 20:21, 21 October 2013

Forensic Disk Differencing is the process of comparing two or more disk images taken from the same computer at different times and determining what changes to the earlier image account for the state observed in the later one. One common use of differencing is to determine what an attacker did during a break-in. This requires forensic disk images of the computer taken both before and after the break-in.

Differencing Tools

idifference.py

idifference.py is part of the Digital Forensics XML Python Toolkit distributed with fiwalk. This tool will compare two different disk images and report changes in files between the first and the second. It also produces a timeline of changes.

For example, using the nps-2009-canon2 series of disk images:

$ python idifference.py /nps-2009-canon2-gen2.raw nps-2009-canon2-gen3.raw 
>>> Reading nps-2009-canon2-gen2.raw
>>> Reading nps-2009-canon2-gen3.raw

Disk image:/corp/drives/nps/nps-2009-canon2/nps-2009-canon2-gen3.raw 

New Files: 

2008-12-23 14:26:12	1315993	DCIM/100CANON/IMG_0041.JPG

Deleted Files: 

2008-12-23 14:12:38	855935	DCIM/100CANON/IMG_0001.JPG
2008-12-23 14:22:38	1347778	DCIM/100CANON/IMG_0037.JPG

Files with modified content (but size unchanged): 

Files with changed file properties: 

DCIM/CANONMSC/M0100.CTG	SHA1 changed	69b30c352ee802f49b1ea25325af9fa05c3ffca1	->	baa42c03a917b01b212fb7e538e5deb525995f31
DCIM/CANONMSC/M0100.CTG	crtime changed to	1230070924	->	1230071142
DCIM/CANONMSC/M0100.CTG	mtime changed to	1230070924	->	1230071142
DCIM/CANONMSC/M0100.CTG	resized	180	->	188

Timeline 

2008-12-23 14:25:42	DCIM/CANONMSC/M0100.CTG	SHA1 changed	69b30c352ee802f49b1ea25325af9fa05c3ffca1	->	baa42c03a917b01b212fb7e538e5deb525995f31
2008-12-23 14:25:42	DCIM/CANONMSC/M0100.CTG	crtime changed	1230070924	->	1230071142
2008-12-23 14:25:42	DCIM/CANONMSC/M0100.CTG	mtime changed	1230070924	->	1230071142
2008-12-23 14:25:42	DCIM/CANONMSC/M0100.CTG	resized	180	->	188
2008-12-23 14:26:12	DCIM/100CANON/IMG_0041.JPG	created
$

Here are some more examples:

  • File:Idifference-demo1.txt: idifference.py run on two disk images from the 2009-M57 Patents scenario (Jo's disk on November 23 vs. November 24)

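The kind of comparison reported above (new files, deleted files, and changed properties) can be sketched in a few lines of Python. This is not idifference.py's actual implementation: the `FileRecord` type and the `diff_listings` function are hypothetical, and the sketch assumes each disk image has already been reduced to a dictionary mapping file paths to their size, modification time, and SHA1 hash.

```python
from collections import namedtuple

# Hypothetical per-file metadata extracted from one disk image.
FileRecord = namedtuple("FileRecord", ["size", "mtime", "sha1"])


def diff_listings(before, after):
    """Compare two {path: FileRecord} listings.

    Returns (new_files, deleted_files, changed), where changed is a list
    of (path, field, old_value, new_value) tuples.
    """
    new_files = sorted(set(after) - set(before))
    deleted_files = sorted(set(before) - set(after))
    changed = []
    for path in sorted(set(before) & set(after)):
        old, new = before[path], after[path]
        for field in FileRecord._fields:
            old_value, new_value = getattr(old, field), getattr(new, field)
            if old_value != new_value:
                changed.append((path, field, old_value, new_value))
    return new_files, deleted_files, changed
```

Sorting the `changed` tuples by the new modification time would give a rough timeline similar to the one in the output above. Matching files by path alone is a simplification; a renamed file shows up here as one deletion and one creation.
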
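See Also

  • A general strategy for differential forensic analysis (http://dfrws.org/2012/proceedings/DFRWS2012-6.pdf)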