Difference between pages "Tools:Visualization" and "File Carving"

From ForensicsWiki
Although not strictly for forensic purposes, '''visualization tools''' such as the ones discussed here can be very useful for visualizing large data sets. As forensic practitioners need to process more and more data, it is likely that some of the techniques implemented by these tools will need to be adopted.
+
'''Carving''' is the practice of searching an input for files or other kinds of objects based on content, rather than on metadata. File carving is a powerful tool for recovering files and fragments of files when directory entries are corrupt or missing, as may be the case with old files that have been deleted or when performing an analysis on damaged media. Memory carving is a useful tool for analyzing physical and virtual memory dumps when the memory structures are unknown or have been overwritten.
  
==Programming Languages and Developer Toolkits==
 
If you are building forensic tools, you probably want to start with one of these:
 
; Java and Swing
 
: Advantage: Portable and lots of good documentation out there.
 
: Disadvantage: Programs are somewhat verbose and offer only about half the performance of C.
 
  
; Python with tkinter
+
=File Carving=
: Advantage: Portable
+
: Disadvantage: Python is one of the slowest modern languages around.
+
  
; Python with wxWidgets
+
Most file carvers operate by looking for file headers and/or footers, and then "carving out" the blocks between these two boundaries. [[Semantic Carving]] performs carving based on an analysis of the contents of the proposed files.  
: Advantage: Portable and a better development environment than Tkinter.
+
: Disadvantage: wxWidgets is not installed by default, so you will need to install it yourself. It is also not as well documented as Tkinter.
+
  
; [http://processing.org processing.org]
+
File carving should be done on a [[disk image]], rather than on the original disk.
: Advantage: Programming language specifically developed for visualization; compiles to Java bytecode.
+
: Disadvantage: Very oddball
+
  
; JavaFX - Java's version of Flash
+
File carving tools are listed on the [[Tools:Data_Recovery]] wiki page.
  
; Flash
+
Many carving programs have an option to only look at or near sector boundaries where headers are found. However, searching the entire input can find files that have been embedded into other files, such as [[JPEG]]s being embedded into [[Microsoft]] [[DOC|Word documents]]. This may be considered an advantage or a disadvantage, depending on the circumstances.
  
== Applications ==
+
Today most file carving programs will only recover files that are contiguous on the media.  
Most of these are scriptable.
+
===Open Source===
+
====Data Plotting====
+
* http://ploticus.sourceforge.net
+
* http://www.gnuplot.info/
+
====Graph and Network Visualization====
+
* [http://www.graphviz.org/ Graphviz] - Originally developed by the [http://public.research.att.com/areas/visualization/ AT&T Information Visualization Group], designed for drawing connected graphs of nodes and edges. Neato is a similar system but does layout based on a spring model. Can produce output as [[PostScript]], [[PNG]], [[GIF]], or as an annotated graph file with the locations of all of the objects, which is ideal for drawing in a GUI. Runs from the command line on [[Unix]], [[Windows]] and [[Mac]], although there is also a [http://www.pixelglow.com/graphviz/ MacOS GUI version].
+
* [http://graphexploration.cond.org/ Guess: The Graph Exploration System] - Originally developed at HP, this is a large Jython/Java-based system that you can use for building your own applications. Distributed under GPL.
+
* [http://sourceforge.net/projects/ivc/ InfoVis Cyberinfrastructure] - Another graph drawing system written in Java.
+
* [http://jung.sourceforge.net/ Java Universal Network/Graph Framework (JUNG)] - Graphing, [[data mining]], [[social network]] analysis, and other stuff.
+
* [http://www.andrew.cmu.edu/user/krack/krackplot.shtml Krackplot] - "KrackPlot is a program for network visualization designed for social network analysts."
+
* [http://bioinformatics.icmb.utexas.edu/lgl/ Large Graph Layout (LGL)] - A bioinformatics system from University of Texas. They really mean Large.
+
* [http://www.sfu.ca/~richards/Multinet/Pages/multinet.htm MultiNet] - A data analysis package for drawing conventional data and graph data.
+
* [http://www.analytictech.com/netdraw.htm NetDraw] - "a free program written by Steve Borgatti for visualizing both 1-mode and 2-mode social network data."
+
* [http://web.mit.edu/bshi/Public/nv2d/ NetVis 2D] - Another graph visualization and layout tool written in Java.
+
* [http://www.opendx.org/ OpenDX] - Based on [[IBM]]'s Visualization Data Explorer, runs on [[Unix]]/X11/Motif.
+
* [http://vlado.fmf.uni-lj.si/pub/networks/pajek/ Pajek] - Windows program for drawing large networks.
+
* [http://sourceforge.net/projects/sonia/ Social Network Image Animator (SoNIA)] - Originally developed at Stanford. Written in Java. Makes movies.
+
* [http://www.informatik.uni-bremen.de/uDrawGraph/en/uDrawGraph/uDrawGraph.html uDrawGraph]
+
* [http://wilma.sourceforge.net/ WilmaScope] - Real-time animations of dynamic graph structures. Written in Java. Sophisticated force model with springs and attraction.
+
* [http://www.caida.org/tools/visualization/walrus/ Walrus] - A 3-d graph network exploration tool. Employs 3D hyperbolic displays and layout based on a user-supplied spanning tree.
+
  
=== Commercial Graphic Applications and Tools===
+
== File Carving Taxonomy==
 +
[[Simson Garfinkel]] and [[Joachim Metz]] have proposed the following file carving taxonomy:
  
* [http://www.aisee.com/ aiSee Graph Layout Software] - Supports 15 layout algorithms, recursive graph nesting, and easy printing. Runs on [[Windows]], [[Linux]], [[Solaris]], [[NetBSD]], and [[MacOS]]. 30-day trial and free registered versions available. Academic pricing available.
+
;Carving
*  [http://www.geomantics.com/ Geomantics] - Geographical, Visualization and Graphics software. Runs on [[Windows]].
+
:General term for extracting data (files) out of undifferentiated blocks (raw data), like "carving" a sculpture out of soapstone.
* [http://www.kylebank.com/ Graphis 2D and 3D graphing software] - Runs on [[Windows]]. Free 30-day evaluation copy available.
+
* [http://www.openviz.com/ OpenViz] and  [http://www.powerviz.com/ PowerViz] - Both from Advanced Visual Systems, super high-end visualization toolkits. $$$$
+
* [http://www.tomsawyer.com/ Tom Sawyer Software] - Analysis, visualization, and layout programs. Heavy support for drawing graphs. Beautiful gallery. ActiveX, Java, C++ and .NET editions.
+
* [http://www.netminer.com/ NetMiner] - A comprehensive tool for Social Network Analysis. Runs on Windows, with a Linux version under development. $35 for "Express" student version, $250 for "Professional" student version, $950 for "Normal" "Professional" version.
+
* [http://www.analytictech.com/ucinet.htm UCINET] - A comprehensive package for the analysis of social network data as well as other 1-mode and 2-mode data.
+
* [http://www.clarifiednetworks.com/logster Logster] - an ultra-easy software tool to visualize Apache-style logs on a world map.
+
* [http://www.clarifiednetworks.com/Clarified%20Analyzer Clarified Analyzer] - Visualizes network traffic and allows drilling down from visualizations to the packet level.
+
  
== Visualization Toolkits and Libraries ==
+
;Block Based Carving
===C/C++===
+
:Any carving method (algorithm) that analyzes the input on a block-by-block basis to determine if a block is part of a possible output file. This method assumes that each block can only be part of a single file (or embedded file).
* [http://public.kitware.com/VTK/ The Visualization Toolkit] - C++ multi-platform with interfaces available for Tcl/Tk, Java and Python. Professional support provided by [http://www.kitware.com/ Kitware].
+
* [http://kdirstat.sourceforge.net/ KDirStat], an open source implementation of [http://www.cs.umd.edu/hcil/treemap-history/index.shtml Treemaps] written in C. (Treemaps are a visualization technique developed at the University of Maryland for visualizing large amounts of multi-dimensional data.)  You can find a copy of it in [http://www.derlien.com/ Disk Inventory X] and
+
===Java===
+
* [http://csbi.sourceforge.net/index.html Graph Interface Library (GINY)] - Java
+
* [http://hypergraph.sourceforge.net/ HyperGraph] - Hyperbolic trees, in Java. Check out the home page. Try clicking on the logo...
+
* [http://ivtk.sourceforge.net/ InfoVis Toolkit] - Java, originally developed at [[INRIA]].
+
* [https://jdigraph.dev.java.net/ Jdigraph] - Java Directed Graphs.
+
* [http://jgrapht.sourceforge.net/ JGraphT] - A Java visualization kit designed to be simple and extensible.
+
* [http://prefuse.sourceforge.net/ Prefuse] - A Java-based toolkit for building interactive information visualization applications.
+
* [http://www.ssec.wisc.edu/~billh/visad.html#intro VisAD] - A Java component library for interactive and collaborative visualization.
+
* [http://www.softwaresecretweapons.com/jspwiki/Wiki.jsp?page=LinguineMaps Linguine Maps] - An open-source Java-based system for visualizing software call maps.
+
* [http://zvtm.sourceforge.net/index.html Zoomable Visual Transformation Machine] - Java. Originally started at Xerox Research Europe.
+
* [http://openmap.bbn.com/ OpenMap] A Java-based Geographical Information System framework, from [[BBN]].
+
  
===Unclassified===
+
;Characteristic Based Carving
* [http://gravisto.fim.uni-passau.de/ Gravisto: Graph Visualization Toolkit] - An editor and toolkit for developing graph visualization algorithms.
+
:Any carving method (algorithm) that analyzes the input on the basis of a characteristic (for example, entropy) to determine if the input is part of a possible output file.
* [http://www.gnu.frb.br:8080/rox Rox Graph Theory Framework] - An open-source plug-in framework for graph theory visualization.
+
* [http://touchgraph.sourceforge.net/ TouchGraph] - Library for building graph-based interfaces.
+
  
==Journals and Conferences==
+
;Header/Footer Carving
* [http://www.palgrave-journals.com/ivs/index.html Information Visualization Journal]
+
:A method for carving files out of raw data using a distinct header (start of file marker) and footer (end of file marker).
* [http://rw4.cs.uni-sb.de/~diehl/softvis/seminar/index.php?goto=seminar ACM Symposium on Software Visualization]
+
 
==Research Groups==
+
;Header/Maximum (file) size Carving
===Berkeley===
+
:A method for carving files out of raw data using a distinct header (start of file marker) and a maximum (file) size. This approach works because many file formats (e.g. JPEG, MP3) do not care if additional junk is appended to the end of a valid file.
* [http://bailando.sims.berkeley.edu/infovis.html Bailando Visualization]
+
 
* [http://vis.berkeley.edu/ Berkeley Visualization Lab]
+
;Header/Embedded Length Carving
===Brown===
+
:A method for carving files out of raw data using a distinct header and a file length (size) that is embedded in the file format.
* [http://www.cs.brown.edu/people/rt/gd.html Roberto Tamassia's resources on Graph Drawing]
+
 
===Stanford===
+
;File structure based Carving
* [http://window.stanford.edu/projects/rivet/ Rivet Project] (Visualization of complex systems)
+
:A method for carving files out of raw data using a certain level of knowledge of the internal structure of file types. Garfinkel called this approach "Semantic Carving" in his DFRWS2006 carving challenge submission, while Metz and Mora called the approach "Deep Carving."
===UMN===
+
 
* [http://www.msi.umn.edu/user_support/scivis/scivis-list.html Scientific Visualization at the Supercomputing Institute]
+
;Semantic Carving
===Wattenberg===
+
:A method for carving files based on a linguistic analysis of the file's content. For example, a semantic carver might conclude that six blocks of French in the middle of a long HTML file written in English are a fragment left over from a previously allocated file, and not part of the English-language HTML file.
* [http://www.bewitched.com/ Bewitched], a one-man research group.
+
 
==See Also==
+
;Carving with Validation
* [http://www-static.cc.gatech.edu/gvu/ii/resources/infovis.html GVU's Information Visualization Resources link farm]
+
:A method for carving files out of raw data where the carved files are validated using a file type specific validator.
* [http://directory.google.com/Top/Science/Math/Combinatorics/Software/Graph_Drawing/ Google Directory of Graph Drawing Software]
+
 
* [http://directory.fsf.org/science/visual/ GNU Free Software directory of scientific visualization software]
+
;Fragment Recovery Carving
* [http://www.manageability.org/blog/stuff/open-source-graph-network-visualization-in-java/view Open Source Graph Network Visualization in Java]
+
:A carving method in which two or more fragments are reassembled to form the original file or object. Garfinkel previously called this approach "Split Carving."
* [http://www.insna.org/INSNA/soft_inf.html INSNA's web page of Computer Programs for Social Network Analysis]
+
 
 +
== File Carving challenges and test images ==
 +
 
 +
[http://www.dfrws.org/2006/challenge/]
 +
File Carving Challenge - [[Digital Forensic Research Workshop|DFRWS]] 2006
 +
 
 +
[http://dftt.sourceforge.net/test6/index.html]
 +
FAT Undelete Test #1 - Digital Forensics Tool Testing Image (dftt #6)
 +
 
 +
[http://dftt.sourceforge.net/test7/index.html]
 +
NTFS Undelete (and leap year) Test #1 - Digital Forensics Tool Testing Image (dftt #7)
 +
 
 +
[http://dftt.sourceforge.net/test11/index.html]
 +
Basic Data Carving Test - fat32 (by Nick Mikus) - Digital Forensics Tool Testing Image (dftt #11)
 +
 
 +
[http://dftt.sourceforge.net/test12/index.html]
 +
Basic Data Carving Test - ext2 (by Nick Mikus) - Digital Forensics Tool Testing Image (dftt #12)
 +
 
 +
 
 +
 
 +
== See also ==
 +
* [[Tools:Data_Recovery#Carving | File Carving Tools]]
 +
* [[File Carving Bibliography]]
 +
 
 +
=Memory Carving=

Revision as of 23:56, 20 October 2008

Carving is the practice of searching an input for files or other kinds of objects based on content, rather than on metadata. File carving is a powerful tool for recovering files and fragments of files when directory entries are corrupt or missing, as may be the case with old files that have been deleted or when performing an analysis on damaged media. Memory carving is a useful tool for analyzing physical and virtual memory dumps when the memory structures are unknown or have been overwritten.


File Carving

Most file carvers operate by looking for file headers and/or footers, and then "carving out" the blocks between these two boundaries. Semantic Carving performs carving based on an analysis of the contents of the proposed files.
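
The header/footer approach described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular carver is implemented: the function name carve_header_footer, the max_size limit, and the synthetic image are assumptions made for the example. It uses the JPEG start-of-image and end-of-image markers as header and footer, and it assumes the file is contiguous.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # start-of-image marker plus first byte of the next marker
JPEG_FOOTER = b"\xff\xd9"      # end-of-image marker


def carve_header_footer(data, header=JPEG_HEADER, footer=JPEG_FOOTER,
                        max_size=10 * 1024 * 1024):
    """Return (start, end) offsets of candidate files found in data.

    Hypothetical helper: pairs each header with the first footer that
    follows it within max_size bytes, assuming contiguous files.
    """
    results = []
    pos = 0
    while True:
        start = data.find(header, pos)
        if start == -1:
            break
        # Search for the first footer after the header, bounded by max_size.
        end = data.find(footer, start + len(header), start + max_size)
        if end == -1:
            pos = start + 1  # no footer in range; try the next header
            continue
        end += len(footer)
        results.append((start, end))
        pos = end
    return results


# Synthetic "disk image": junk, a fake JPEG, more junk.
image = b"\x00" * 16 + JPEG_HEADER + b"\xe0payload" + JPEG_FOOTER + b"\x11" * 8
hits = carve_header_footer(image)
```

Pairing a header with the first footer is the simplest choice, but it can truncate files that contain the footer bytes internally, for example a JPEG with an embedded thumbnail.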

File carving should be done on a disk image, rather than on the original disk.

File carving tools are listed on the Tools:Data_Recovery wiki page.

Many carving programs have an option to only look at or near sector boundaries where headers are found. However, searching the entire input can find files that have been embedded into other files, such as JPEGs being embedded into Microsoft Word documents. This may be considered an advantage or a disadvantage, depending on the circumstances.
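
The trade-off between sector-aligned and exhaustive searching can be illustrated with a small sketch. The function name find_headers and the 512-byte sector size are assumptions for the example, and the aligned mode here checks exact sector boundaries only, whereas real tools may also look near them.

```python
SECTOR_SIZE = 512  # assumed sector size for this example


def find_headers(data, header, sector_aligned=True, sector_size=SECTOR_SIZE):
    """Return offsets where header occurs (hypothetical helper).

    With sector_aligned=True only offsets on a sector boundary are tested;
    otherwise every byte offset in the input is considered.
    """
    offsets = []
    if sector_aligned:
        for off in range(0, len(data), sector_size):
            if data.startswith(header, off):
                offsets.append(off)
    else:
        pos = data.find(header)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(header, pos + 1)
    return offsets


header = b"\xff\xd8\xff"
# One header at the start of sector 1, one embedded mid-sector in sector 2.
buf = bytearray(4 * SECTOR_SIZE)
buf[512:515] = header
buf[1124:1127] = header
image = bytes(buf)

aligned = find_headers(image, header, sector_aligned=True)
everywhere = find_headers(image, header, sector_aligned=False)
```

The aligned search reports only the header at the sector boundary, while the exhaustive search also reports the embedded one. Whether that extra hit is useful or noise depends on the investigation.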

Today most file carving programs will only recover files that are contiguous on the media.

File Carving Taxonomy

Simson Garfinkel and Joachim Metz have proposed the following file carving taxonomy:

Carving
General term for extracting data (files) out of undifferentiated blocks (raw data), like "carving" a sculpture out of soapstone.
Block Based Carving
Any carving method (algorithm) that analyzes the input on a block-by-block basis to determine if a block is part of a possible output file. This method assumes that each block can only be part of a single file (or embedded file).
Characteristic Based Carving
Any carving method (algorithm) that analyzes the input on the basis of a characteristic (for example, entropy) to determine if the input is part of a possible output file.
Header/Footer Carving
A method for carving files out of raw data using a distinct header (start of file marker) and footer (end of file marker).
Header/Maximum (file) size Carving
A method for carving files out of raw data using a distinct header (start of file marker) and a maximum (file) size. This approach works because many file formats (e.g. JPEG, MP3) do not care if additional junk is appended to the end of a valid file.
Header/Embedded Length Carving
A method for carving files out of raw data using a distinct header and a file length (size) that is embedded in the file format.
File structure based Carving
A method for carving files out of raw data using a certain level of knowledge of the internal structure of file types. Garfinkel called this approach "Semantic Carving" in his DFRWS2006 carving challenge submission, while Metz and Mora called the approach "Deep Carving."
Semantic Carving
A method for carving files based on a linguistic analysis of the file's content. For example, a semantic carver might conclude that six blocks of French in the middle of a long HTML file written in English are a fragment left over from a previously allocated file, and not part of the English-language HTML file.
Carving with Validation
A method for carving files out of raw data where the carved files are validated using a file type specific validator.
Fragment Recovery Carving
A carving method in which two or more fragments are reassembled to form the original file or object. Garfinkel previously called this approach "Split Carving."
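
As an illustration of two entries in the taxonomy above, the following sketch combines header/embedded length carving with a simple validation step, using the BMP format as the example. BMP files begin with the bytes "BM" followed by a little-endian 32-bit file size. The function name carve_bmp, the max_size limit, and the specific sanity checks are assumptions chosen for this example, not the method of any particular tool.

```python
import struct

BMP_MAGIC = b"BM"
# BMP file header: magic, file size, two reserved fields, pixel data offset.
BMP_FILE_HEADER = struct.Struct("<2sIHHI")


def carve_bmp(data, max_size=50 * 1024 * 1024):
    """Return (start, end) offsets of plausible BMP files (hypothetical helper)."""
    carved = []
    pos = data.find(BMP_MAGIC)
    while pos != -1:
        if pos + BMP_FILE_HEADER.size <= len(data):
            _, size, res1, res2, pix_off = BMP_FILE_HEADER.unpack_from(data, pos)
            # Validation heuristics (assumed for this sketch): plausible size,
            # file fits in the input, reserved fields zero, sane pixel offset.
            valid = (
                BMP_FILE_HEADER.size < size <= max_size
                and pos + size <= len(data)
                and res1 == 0 and res2 == 0
                and BMP_FILE_HEADER.size <= pix_off < size
            )
            if valid:
                carved.append((pos, pos + size))
                pos = data.find(BMP_MAGIC, pos + size)
                continue
        pos = data.find(BMP_MAGIC, pos + 1)
    return carved


# Synthetic input: a false "BM" hit, a 64-byte fake BMP, trailing junk.
fake_bmp = BMP_FILE_HEADER.pack(BMP_MAGIC, 64, 0, 0, 54) + b"\x00" * 50
image = b"xxBMxx" + b"\x00" * 20 + fake_bmp + b"\xff" * 10
found = carve_bmp(image)
```

The false "BM" at offset 2 is rejected because its implied length runs past the end of the input, which shows why a validation step reduces false positives compared with matching on the header alone.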

File Carving challenges and test images

[1] File Carving Challenge - DFRWS 2006

[2] FAT Undelete Test #1 - Digital Forensics Tool Testing Image (dftt #6)

[3] NTFS Undelete (and leap year) Test #1 - Digital Forensics Tool Testing Image (dftt #7)

[4] Basic Data Carving Test - fat32 (by Nick Mikus) - Digital Forensics Tool Testing Image (dftt #11)

[5] Basic Data Carving Test - ext2 (by Nick Mikus) - Digital Forensics Tool Testing Image (dftt #12)


See also

Memory Carving