<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://www.forensicswiki.org/w/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>http://www.forensicswiki.org/w/api.php?action=feedcontributions&amp;user=Capibara&amp;feedformat=atom</id>
		<title>Forensics Wiki - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="http://www.forensicswiki.org/w/api.php?action=feedcontributions&amp;user=Capibara&amp;feedformat=atom"/>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Special:Contributions/Capibara"/>
		<updated>2013-06-19T16:40:14Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.21.1</generator>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Talk:Fileobject</id>
		<title>Talk:Fileobject</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Talk:Fileobject"/>
				<updated>2011-08-23T10:21:18Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===Suggestion: Use CarvPath annotations===&lt;br /&gt;
I see the use of colons in byte_runs content. Earlier versions of libcarvpath used such a notation, but cross-platform concerns resulted in the colon being replaced by the plus sign.&lt;br /&gt;
I would suggest that the [http://sourceforge.net/apps/mediawiki/carvpath/index.php?title=CarvPath_annotations CarvPath annotation format] may be a suitable alternative for annotating&lt;br /&gt;
byte_runs content. This is a straightforward and simple format, for example:&lt;br /&gt;
 &lt;br /&gt;
  4096+53248_258048+28672_S1047552&lt;br /&gt;
&lt;br /&gt;
This would be the way to annotate a byte_runs consisting of:&lt;br /&gt;
&lt;br /&gt;
1) A 53248-byte fragment at offset 4096&lt;br /&gt;
&lt;br /&gt;
2) A 28672-byte fragment at offset 258048&lt;br /&gt;
&lt;br /&gt;
3) 1047552 bytes of sparse data.&lt;br /&gt;
&lt;br /&gt;
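As a rough illustration, the flat notation above could be parsed along these lines in Python. This is only a sketch that assumes the '+', '_' and 'S' tokens as described here; the names Run and parse_carvpath are made up for the example and are not part of libcarvpath, and nesting with '/' is not handled.&lt;br /&gt;
&lt;br /&gt;
```python
from typing import NamedTuple, Optional


class Run(NamedTuple):
    """One piece of a byte_runs list (hypothetical helper type)."""
    offset: Optional[int]  # None marks a sparse run with no backing data
    size: int


def parse_carvpath(annotation):
    """Split a flat CarvPath-style annotation into a list of Run tuples."""
    runs = []
    for token in annotation.split("_"):
        if token.startswith("S"):
            # 'S' followed by a size: sparse data without an offset.
            runs.append(Run(None, int(token[1:])))
        else:
            # 'offset+size': a fragment of the underlying image.
            offset, size = token.split("+")
            runs.append(Run(int(offset), int(size)))
    return runs


runs = parse_carvpath("4096+53248_258048+28672_S1047552")
total_size = sum(run.size for run in runs)
print(runs)
print(total_size)
```
&lt;br /&gt;
For the example annotation this yields the two fragments and the sparse run listed above, with a total size of 1129472 bytes.&lt;br /&gt;
&lt;br /&gt;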
I don't suggest adopting the long token notations, just the '+', '_' and 'S' tokens, and possibly '/' for nesting purposes.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Talk:Fileobject</id>
		<title>Talk:Fileobject</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Talk:Fileobject"/>
				<updated>2011-08-23T10:18:15Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===Suggestion: Use CarvPath annotations===&lt;br /&gt;
I see the use of colons in byte_runs content. Earlier versions of libcarvpath used such a notation, but cross-platform concerns resulted in the colon being replaced by the plus sign.&lt;br /&gt;
I would suggest that the [http://sourceforge.net/apps/mediawiki/carvpath/index.php?title=CarvPath_annotations CarvPath annotation format] may be a suitable alternative for annotating&lt;br /&gt;
byte_runs content. This is a straightforward and simple format, for example:&lt;br /&gt;
 &lt;br /&gt;
  4096+53248_258048+28672_S1047552&lt;br /&gt;
&lt;br /&gt;
This would be the way to annotate a byte_runs consisting of:&lt;br /&gt;
&lt;br /&gt;
1) A 53248-byte fragment at offset 4096&lt;br /&gt;
&lt;br /&gt;
2) A 28672-byte fragment at offset 258048&lt;br /&gt;
&lt;br /&gt;
3) 1047552 bytes of sparse data.&lt;br /&gt;
&lt;br /&gt;
I don't suggest adopting the long token notations, just the '+', '_' and 'S' tokens.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Talk:Fileobject</id>
		<title>Talk:Fileobject</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Talk:Fileobject"/>
				<updated>2011-08-23T09:25:31Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* sugestion: Use carvpath anotations */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===Suggestion: Use CarvPath annotations===&lt;br /&gt;
I see the use of colons in byte_runs content. Earlier versions of libcarvpath used such a notation, but cross-platform concerns resulted in the colon being replaced by the plus sign.&lt;br /&gt;
I would suggest that the [http://sourceforge.net/apps/mediawiki/carvpath/index.php?title=CarvPath_annotations CarvPath annotation format] may be a suitable alternative for annotating&lt;br /&gt;
byte_runs content. This is a straightforward and simple format, for example:&lt;br /&gt;
 &lt;br /&gt;
  4096+53248_258048+28672_S1047552&lt;br /&gt;
&lt;br /&gt;
This would be the way to annotate a byte_runs consisting of:&lt;br /&gt;
&lt;br /&gt;
1) A 53248-byte fragment at offset 4096&lt;br /&gt;
2) A 28672-byte fragment at offset 258048&lt;br /&gt;
3) 1047552 bytes of sparse data.&lt;br /&gt;
&lt;br /&gt;
I don't suggest adopting the long token notations, just the '+', '_' and 'S' tokens.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Talk:Fileobject</id>
		<title>Talk:Fileobject</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Talk:Fileobject"/>
				<updated>2011-08-23T09:24:42Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===Suggestion: Use CarvPath annotations===&lt;br /&gt;
I see the use of colons in byte_runs content. Earlier versions of libcarvpath used such a notation, but cross-platform concerns resulted in the colon being replaced by the plus sign.&lt;br /&gt;
I would suggest that the [http://sourceforge.net/apps/mediawiki/carvpath/index.php?title=CarvPath_annotations CarvPath annotation format] may be a suitable alternative for annotating&lt;br /&gt;
byte_runs content. This is a straightforward and simple format, for example:&lt;br /&gt;
 &lt;br /&gt;
  4096+53248_258048+28672_S1047552&lt;br /&gt;
&lt;br /&gt;
This would be the way to annotate a byte_runs consisting of:&lt;br /&gt;
&lt;br /&gt;
1) A 53248-byte fragment at offset 4096&lt;br /&gt;
2) A 28672-byte fragment at offset 258048&lt;br /&gt;
3) 1047552 bytes of sparse data.&lt;br /&gt;
&lt;br /&gt;
I don't suggest adopting the long token notations, just the '+', '_' and 'S' tokens.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Talk:Fileobject</id>
		<title>Talk:Fileobject</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Talk:Fileobject"/>
				<updated>2011-08-23T09:15:14Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: Created page with &amp;quot;===sugestion: Use carvpath anotations=== I see the use of colons in byte_runs content. Earlier versions of libcarvpath used such a notation, but cross platform concerns resulted ...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;===Suggestion: Use CarvPath annotations===&lt;br /&gt;
I see the use of colons in byte_runs content. Earlier versions of libcarvpath used such a notation, but cross-platform concerns resulted in the colon being replaced by the plus sign.&lt;br /&gt;
I would suggest that the [http://sourceforge.net/apps/mediawiki/carvpath/index.php?title=CarvPath_annotations CarvPath annotation format] may be a suitable alternative for annotating&lt;br /&gt;
byte_runs content.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/LibCarvPath</id>
		<title>LibCarvPath</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/LibCarvPath"/>
				<updated>2010-03-12T14:31:41Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;LibCarvPath is a library designed to be used by carving and file system analysis tools.&lt;br /&gt;
LibCarvPath allows fragments represented by offset and size to be combined into CarvPath&lt;br /&gt;
annotations that take the form of file system paths. LibCarvPath addresses the length limits of&lt;br /&gt;
file system paths by mapping extremely fragmented files to a uniquely identifying key in&lt;br /&gt;
a long-path database.&lt;br /&gt;
&lt;br /&gt;
The following tools use LibCarvPath and/or CarvPath Annotations:&lt;br /&gt;
&lt;br /&gt;
* [[CarvFs]]&lt;br /&gt;
* [[tsk-cp]]&lt;br /&gt;
* [[scalpelcp]]&lt;br /&gt;
* Modules for the [[Open Computer Forensics Architecture]] that use the [[OCFA treegraph API]].&lt;br /&gt;
&lt;br /&gt;
In addition, work has started on adding LibCarvPath support to [[Photorec]].&lt;br /&gt;
&lt;br /&gt;
[http://sourceforge.net/apps/mediawiki/carvpath/ CarvPath wiki]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/CarvFs</id>
		<title>CarvFs</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/CarvFs"/>
				<updated>2010-03-12T14:30:58Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;CarvFs is a modular [[Fuse]]-based user space file system on top of [[LibCarvPath]].&lt;br /&gt;
CarvFs makes CarvPath-style annotations, as used by LibCarvPath, available as files.&lt;br /&gt;
Using CarvFs makes it possible to process carved entities as files without the need for copy-out.&lt;br /&gt;
&lt;br /&gt;
CarvFs is modular with respect to access to image files.&lt;br /&gt;
The CarvFs distribution comes with a default module for access to (split) raw files.&lt;br /&gt;
&lt;br /&gt;
A separate [[LibEwf]] module is available for access to EWF images.&lt;br /&gt;
&lt;br /&gt;
[http://sourceforge.net/apps/mediawiki/carvpath/ CarvPath wiki]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/CarvFs</id>
		<title>CarvFs</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/CarvFs"/>
				<updated>2010-03-12T14:30:15Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;CarvFs is a modular [[Fuse]]-based user space file system on top of [[LibCarvPath]].&lt;br /&gt;
CarvFs makes CarvPath-style annotations, as used by LibCarvPath, available as files.&lt;br /&gt;
Using CarvFs makes it possible to process carved entities as files without the need for copy-out.&lt;br /&gt;
&lt;br /&gt;
CarvFs is modular with respect to access to image files.&lt;br /&gt;
The CarvFs distribution comes with a default module for access to (split) raw files.&lt;br /&gt;
&lt;br /&gt;
A separate [[LibEwf]] module is available for access to EWF images.&lt;br /&gt;
&lt;br /&gt;
[https://sourceforge.net/apps/mediawiki/carvpath/index.php?title=Main_Page CarvPath wiki]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/LibCarvPath</id>
		<title>LibCarvPath</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/LibCarvPath"/>
				<updated>2009-10-28T09:06:46Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;LibCarvPath is a library designed to be used by carving and file system analysis tools.&lt;br /&gt;
LibCarvPath allows fragments represented by offset and size to be combined into CarvPath&lt;br /&gt;
annotations that take the form of file system paths. LibCarvPath addresses the length limits of&lt;br /&gt;
file system paths by mapping extremely fragmented files to a uniquely identifying key in&lt;br /&gt;
a long-path database.&lt;br /&gt;
&lt;br /&gt;
The following tools use LibCarvPath and/or CarvPath Annotations:&lt;br /&gt;
&lt;br /&gt;
* [[CarvFs]]&lt;br /&gt;
* [[tsk-cp]]&lt;br /&gt;
* [[scalpelcp]]&lt;br /&gt;
* Modules for the [[Open Computer Forensics Architecture]] that use the [[OCFA treegraph API]].&lt;br /&gt;
&lt;br /&gt;
In addition, work has started on adding LibCarvPath support to [[Photorec]].&lt;br /&gt;
&lt;br /&gt;
[http://sourceforge.net/projects/ocfa/ http://sourceforge.net/projects/ocfa/]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page</id>
		<title>Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page"/>
				<updated>2009-10-09T05:33:07Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* Existing Code that we have */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is for planning Carver 2.0.&lt;br /&gt;
&lt;br /&gt;
Please, do not delete text (ideas) here. Use something like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&lt;br /&gt;
= License =&lt;br /&gt;
&lt;br /&gt;
BSD-3.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] library based validators could require other licenses&lt;br /&gt;
::: Make the other libraries pluggable. If you have them, you use them. [[User:Simsong|Simsong]] 06:34, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= OS =&lt;br /&gt;
&lt;br /&gt;
Linux/FreeBSD/MacOS&lt;br /&gt;
: Shouldn't this just match what the underlying afflib &amp;amp; sleuthkit cover? [[User:RB|RB]]&lt;br /&gt;
:: Yes, but you need to test and validate on each. Question: Do we want to support windows? [[User:Simsong|Simsong]] 21:09, 30 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I think we would be wise to design with Windows support from the start; this will improve platform independence.&lt;br /&gt;
:::: Agreed; I would even settle at first for being able to run against Cygwin.  Note that I don't even own or use a copy of Windows, but the vast majority of forensic investigators do. [[User:RB|RB]] 14:01, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Capibara|Rob J Meijer]] Leaning heavily on the autotools might be the way to go. I do however feel that Windows support would not be essential. Being able to run from a virtual machine with the main storage mounted over CIFS should however be tested and, if possible, tuned extensively.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] You'll need more than autotools for native Windows support, i.e. file access, UTF-16 support, and wrapping some basic system functions or making them available otherwise.&lt;br /&gt;
::::::[[User:Capibara|Rob J Meijer]] That's exactly my point: Windows support, as in being able to build and run on Windows natively, is much more trouble than it's worth. Better to make a lean and mean autotools-based build with few dependencies and little or no recursion, and to spend effort on a lean POLA design on POSIX-based systems rather than on supporting building and running on non-POSIX systems.&lt;br /&gt;
&lt;br /&gt;
= Name tooling =&lt;br /&gt;
&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] As a name for the tooling I propose coldcut&lt;br /&gt;
:: How about 'butcher'?  ;)  [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] cleaver ( scalpel on steroids ;-) )&lt;br /&gt;
* I would like to propose Gouge or Chisel :-) [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
&lt;br /&gt;
= Requirements =&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Could we do a MoSCoW evaluation of these?&lt;br /&gt;
* AFF and EWF file images supported from scratch. ([[User:Joachim Metz|Joachim]] I would like to have raw/split raw and device access as well)&lt;br /&gt;
:: If we base our image i/o on afflib, we get all three with one interface. [[User:RB|RB]] Instead of letting the tools use afflib, it would be better to write an afflib module for carvfs and update the libewf module. The tool could then be oblivious of the file format. [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
:::: [[User:Simsong|Simsong]] 06:29, 3 November 2008 (UTC) The problem with using carvfs is that this adds another dependency. Do you really want to require that people install carvfs in order to run the carver? What about having the thing ported to Windows?&lt;br /&gt;
:::::: [[User:Capibara|Rob J Meijer]] I would support adding one build dependency (libcarvpath) and removing two (libewf/libaff) by moving them to a layer more suited for them (carvfs) that would possibly allow some form of file handle (as cap) based POLA design. I am a proponent of making small things that do one thing and do it right, and of stacking those to do what you need. In my view that would ideally lead to the following (simplified) chain:&lt;br /&gt;
::::::* recursive [[computer forensics framework]] (ocfa/pyflag)&lt;br /&gt;
::::::** &amp;lt;b&amp;gt;The-(pola-based)-carving-tool&amp;lt;/b&amp;gt; &lt;br /&gt;
::::::*** &amp;lt;b&amp;gt;The-carving-lib&amp;lt;/b&amp;gt; working on open fd's.  &lt;br /&gt;
::::::**** libcarvpath&lt;br /&gt;
::::::***** carvfs (Over cifs/nfs-v4 on platforms that don't support Fuse).&lt;br /&gt;
::::::****** libewf&lt;br /&gt;
::::::****** libaff&lt;br /&gt;
::::::*** AppArmor (on supporting platforms)&lt;br /&gt;
::::::*** suid (on supporting platforms)&lt;br /&gt;
::::::*** iptables/ipfw (on supporting platforms)&lt;br /&gt;
:::::: As for Windows support, I would imagine making carvfs run over SMB would go a long way, insofar as Windows support is all that relevant.&lt;br /&gt;
:::::: There are two advantages to using libcarvpath and carvfs instead of libaff/libewf at this layer:&lt;br /&gt;
::::::* Storage requirements for carving. Beyond what sleuthkit or alternatives provide, I have seen many situations where carving was not done due to storage limitations.&lt;br /&gt;
::::::* File handles are like object capabilities. You can often do pretty simple POLA-based implementations using file handles and something like AppArmor. POLA could IMHO be a strong weapon against the more nasty forms of anti-forensics.&lt;br /&gt;
::::::Next to this, I would consider making different tools for different stages instead of one semi-recursive one, and looking at how to integrate these tools into existing frameworks (ocfa/pyflag).&lt;br /&gt;
::::::I would suggest keeping things simple but rigid and integrating them into existing frameworks as effectively as possible.&lt;br /&gt;
::::::Please note, I am not proposing that the lib/tool should be useless without libcarvpath, only that usage without carvfs should limit the&lt;br /&gt;
::::::supported image formats to raw images, and that libewf/libaff should be abstracted at the Fuse level or below and not at the tool level.&lt;br /&gt;
:::::::[[User:Joachim Metz|Joachim]] do you have an idea what the performance impact of this approach would be? It might be wise to do a proof of concept for this approach first.&lt;br /&gt;
::::::::[[User:Capibara|Rob J Meijer]] It would I think depend greatly on the behavior of the carving lib/tool. Small 512-byte reads are relatively very expensive; 128 KB reads have negligible impact. Here are some numbers from ntfs-3g:  http://article.gmane.org/gmane.comp.file-systems.fuse.devel/6397/match=ntfs+3g+performance+ext3 that might be relevant. More relevant than performance might be library footprint. For example, using OCFA, we often would want to keep a few hundred images, totalling tens of TB of projected storage size, fuse-mounted. If libewf/libaff have a big combined memory footprint in such cases, this can be a major issue for this approach.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] this layer should support multi-threaded decompression of compressed image types, which speeds up IO&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] volume/partition aware layer (what about carving unpartitioned space)&lt;br /&gt;
* File system aware layer. This could be or make use of tsk-cp.&lt;br /&gt;
** By default, files are not carved. (clarify: only identified? [[User:RB|RB]]; I guess that it operates like [[Selective file dumper]] [[User:.FUF|.FUF]] 07:00, 29 October 2008 (UTC)). Alternatively, the tool could use libcarvpath and output carvpaths or create a directory with symlinks to carvpaths that point into a carvfs mountpoint [[User:Capibara|Rob J Meijer]].&lt;br /&gt;
* Plug-in architecture for identification/validation.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] support for multiple types of validators&lt;br /&gt;
*** dedicated validator&lt;br /&gt;
*** validator based on file library (i.e. we could specify/implement a file structure API for these)&lt;br /&gt;
*** configuration-based validator (can handle config files, like Revit07, to define the different file formats used by the carver)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Moderator: Could we limit the requirements for prototype version 1 of the tool to get a working version up and running ASAP?&lt;br /&gt;
And keep discussing future options?&lt;br /&gt;
&lt;br /&gt;
I think the following set will be large enough to handle:&lt;br /&gt;
Input facilities&lt;br /&gt;
* IO support (AFF, device, EWF, RAW and split RAW)&lt;br /&gt;
:: Abstraction of input format and multi-threaded decompression (spin-off code out of afflib?)&lt;br /&gt;
* Volume/Partitions support&lt;br /&gt;
:: at least for DOS based layout and GPT (spin-off code out of TSK/Photorec?)&lt;br /&gt;
* File system support&lt;br /&gt;
:: VFAT/NTFS (spin-off code out of TSK/Photorec?)&lt;br /&gt;
&lt;br /&gt;
Carving facilities&lt;br /&gt;
* File format support using pluggable validator model (use dedicated validators Photorec/Scarve and/or wrap revit07 file format as validator?)&lt;br /&gt;
* Content support using pluggable validator model (to handle text/mbox base64)&lt;br /&gt;
* File system carving support (to handle file system fragments, could be linked to file system support layer?)&lt;br /&gt;
* Basic fragment handling&lt;br /&gt;
&lt;br /&gt;
Output facilities&lt;br /&gt;
* audit/analysis/debug log&lt;br /&gt;
* extraction of result files&lt;br /&gt;
&lt;br /&gt;
==Supported File Formats==&lt;br /&gt;
Ship with validators for&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I think we should distinguish between file format validators and content validators&lt;br /&gt;
&lt;br /&gt;
* Graphical Images&lt;br /&gt;
** JPEG (the 3 different types with JFIF/EXIF support)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] How different is JPEG 2000?&lt;br /&gt;
** PNG&lt;br /&gt;
** GIF&lt;br /&gt;
** BMP&lt;br /&gt;
** TIFF&lt;br /&gt;
* Office documents&lt;br /&gt;
** Microsoft Office 97 - 2003 [[OLE Compound File]] format based with [[Word Document (DOC)]] and [[Excel Spreadsheet (XLS)]] file format support&lt;br /&gt;
** [[PDF]]&lt;br /&gt;
** Open Office and Microsoft Office 2007 [[ZIP archive]] based file formats&lt;br /&gt;
:: Extension validation? AFAIK, MS Office 2007 [[Word Document (DOCX)]] format uses plain ZIP (or not?), and carved files will (or not?) have .zip extension instead of DOCX. Is there any way to fix this (may be using the file list in zip)? [[User:.FUF|.FUF]] 20:25, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Addition: Office 2007 also has a binary file format, [[Excel Spreadsheet (XLSB)]], which is also ZIP-ed data&lt;br /&gt;
* Archive files&lt;br /&gt;
** [[ZIP archive]] file format&lt;br /&gt;
** 7z&lt;br /&gt;
** tar, gzip, bzip2&lt;br /&gt;
** RAR&lt;br /&gt;
* E-mail files&lt;br /&gt;
** [[Personal Folder File (PAB, PST, OST)]]&lt;br /&gt;
** MBOX (text based format, base64 content support)&lt;br /&gt;
* Audio/Video files&lt;br /&gt;
** MPEG&lt;br /&gt;
** MP2/MP3&lt;br /&gt;
** AVI&lt;br /&gt;
** ASF/WMV&lt;br /&gt;
** QuickTime&lt;br /&gt;
** MKV&lt;br /&gt;
* Printer spool files&lt;br /&gt;
** EMF (if I remember correctly)&lt;br /&gt;
* Internet history files&lt;br /&gt;
** index.dat&lt;br /&gt;
** Firefox (SQLite 3)&lt;br /&gt;
* Other files&lt;br /&gt;
** thumbs.db which also is an [[OLE Compound File]] based format&lt;br /&gt;
** pagefile?&lt;br /&gt;
&lt;br /&gt;
==Carving Strategies==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Note to moderator: could this section be merged with the carving algorithm section?&lt;br /&gt;
&lt;br /&gt;
* Simple fragment recovery carving using gap carving.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have hook in for more advanced fragment recovery?&lt;br /&gt;
* Recovering of individual ZIP sections and JPEG icons that are not sector aligned.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] I would propose a generic fragment detection and recovery&lt;br /&gt;
* Autonomous operation (some mode of operation should be completely non-interactive, requiring no human intervention to complete [[User:RB|RB]])&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] as much as possible, but allow this to be overridden by the user&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] When the tool outputs files, the filenames should contain the offset in the input data (in hexadecimal?)&lt;br /&gt;
:: [[User:Mark Stam|Mark]] I really like the fact carved files are named after the physical or logical sector in which the file is found (photorec)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This naming scheme might cause duplicate-name problems when extracting embedded files and when extracting files from non-sector-aligned file systems.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow exporting embedded files?&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow exporting fragments separately?&lt;br /&gt;
* [[User:Mark Stam|Mark]] I personally use photorec often for carving files in the whole volume (not only unallocated clusters), so I can store information about all potential interesting files in MySQL&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Interesting. Bas Kloet and I have been discussing using information about allocated files in the recovery process, i.e. recovered fragments could be part of allocated files. Do we want to be able to extract them? Or could we rebuild the file from the fragments and the allocated files?&lt;br /&gt;
* [[User:Mark Stam|Mark]] It would also be nice if the files can be hashed immediately (MD5) so looking for them in other tools (for example Encase) is a snap&lt;br /&gt;
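The two output ideas above (offset in the filename, immediate MD5 hashing) could be combined roughly as follows. This is only a Python sketch under assumptions: the function name carved_name, the 16-digit zero padding and the name layout are made up for illustration and do not reflect any existing carver. Including the digest in the name is one possible way to reduce the duplicate-name problem raised for embedded files that share an offset.&lt;br /&gt;
&lt;br /&gt;
```python
import hashlib


def carved_name(offset, data, extension):
    """Build an output name from the hex offset and the MD5 of the content."""
    digest = hashlib.md5(data).hexdigest()
    # A zero-padded hex offset keeps names sortable by position in the image.
    return f"{offset:016x}_{digest}.{extension}", digest


# Hypothetical carved object found at byte offset 0x1F400 of the input.
name, digest = carved_name(0x1F400, b"hello", "jpg")
print(name)
```
&lt;br /&gt;
The returned digest can be stored alongside the name (for example in the audit log or a database) so the file can be looked up by hash in other tools.&lt;br /&gt;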
&lt;br /&gt;
==Performance Requirements==&lt;br /&gt;
* Tested on 500GB-sized images. Should be able to carve a 500GB image in roughly 1.5 times the time it takes to read the image.&lt;br /&gt;
** Perhaps allocate a percentage budget per-validator (i.e. each validator adds N% to the carving time) [[User:RB|RB]]&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have multiple carving phases for precision/speed trade off?&lt;br /&gt;
* Parallelizable&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] tunable for different architectures&lt;br /&gt;
* Configuration:&lt;br /&gt;
** Capability to parse some existing carvers' configuration files, either on-the-fly or as a one-way converter.&lt;br /&gt;
** Disengage internal configuration structure from configuration files, create parsers that present the expected structure&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] The validator should deal with the file structure; the carving algorithm should not know anything about the file structure (as in the revit07 design)&lt;br /&gt;
** Either extend Scalpel/Foremost syntaxes for extended features or use a tertiary syntax ([[User:Joachim Metz|Joachim]] I would prefer a derivative of the revit07 configuration syntax, which has already run into some of the problems of defining file structure in a configuration file)&lt;br /&gt;
&lt;br /&gt;
==Output==&lt;br /&gt;
* Can output audit.txt file.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output database with offset analysis values i.e. for visualization tooling&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output debug log for debugging the algorithm/validation&lt;br /&gt;
* Easy integration into ascription software.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not a native speaker; what do you mean by &amp;quot;ascription software&amp;quot;?&lt;br /&gt;
::: I think this was another non-native requesting easy scriptability. [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] that makes sense ;-)&lt;br /&gt;
::::: Incorrect. Ascription software is software that determines who the owner of a file is. [[User:Simsong|Simsong]] 06:36, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= Ideas =&lt;br /&gt;
* Use as much of TSK as possible. Don't carry your own FS implementation the way photorec does.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Using TSK as much as possible would not allow adding your own file system support (i.e. mobile phones, memory structures, cap files). I would propose wrapping TSK and using it as much as possible, but allowing integration of our own FS implementations.&lt;br /&gt;
::::[[User:Capibara|Rob J Meijer]]  I'm currently working on wrapping TSK filesystems as several loadable modules for OCFA. In OCFA a loadable (tree-graph) module implements the OCFA tree-graph API, which basically states 'everything is a tree-graph node'. Possibly you could look at the OCFA treegraph API and module loading interface as an example, or we could work together on changing the API and module loading interface in such a way that it doesn't break OCFA and is useful for the standalone carver, while allowing both to use exactly the same tree-graph loadable modules.&lt;br /&gt;
* Extracting/carving data from [[Thumbs.db]]? I've used [[foremost]] for it with some success. [[Vinetto]] has some critical bugs :( [[User:.FUF|.FUF]] 19:18, 28 October 2008 (UTC)&lt;br /&gt;
==Recursive Carving==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Do we want to support (let's call it) 'recursive in-file carving' (for now)? This is different from embedded files because there is a file system structure in the file and not just another file structure.&lt;br /&gt;
* Is it just me, or do a lot of the above (and below) ideas somewhat skirt around the fact that many of us want recursive carving?  Can we bend back to that instead of discussing object particulars?  I think this can be distilled down to three requirements:&lt;br /&gt;
** Simple recursion: once an object is identified, have the ability to re-carve it for internal structures&lt;br /&gt;
** Directed recursion: the carver should be able to be directed at arbitrary blobs and told to carve them as a specified type.  This allows programmatically simpler methods of dealing with unidentifiably compressed or encrypted data, or with filesystem fragments.&lt;br /&gt;
** Export: the ability to export an object (recognized or not) for later or external &amp;quot;recursion&amp;quot;.  Should go without saying for a carver, but...&lt;br /&gt;
:--[[User:RB|RB]] 18:45, 2 November 2008 (UTC)&lt;br /&gt;
:: [[User:Simsong|Simsong]] 06:30, 3 November 2008 (UTC) pyflag already does recursive carving. Are we just going to reimplement pyflag as a single executable?&lt;br /&gt;
::: [[User:Joachim Metz|Joachim]] that could be useful ;-)&lt;br /&gt;
&lt;br /&gt;
==Library Dependencies==&lt;br /&gt;
[[User:Capibara|Rob J Meijer]] :&lt;br /&gt;
* Use libcarvpath whenever possible and by default to avoid high storage requirements.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] For easy deployment I would not opt for making an integral part of the tool solely dependent on a single external library; otherwise the library must be integrated in the package&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] Integrating libraries (libtsk, libaff, libewf, libcarvpath, etc.) is bad practice; autotools are your friend IMO.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not talking about integrating (shared) libraries. I'm saying that an integral part of a tool should be part of its package. Why can't the tool package contain shared or static libraries for local use? A far worse thing to do is to have a large set of dependencies, making the tool difficult to install for most users. The tool package should contain the most necessary code. afflib/libewf support could be detected by the autotools: a neat separation of functionality.&lt;br /&gt;
::: From a packager's standpoint, [[User:Joachim Metz|Joachim]]'s other libraries do a really good job of this, carrying around what they need but using a system-global version if available.  [[User:RB|RB]]&lt;br /&gt;
* libtsk&lt;br /&gt;
* libaff ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* libewf ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* posix ? : Can we depend especially on the availability of UNIX domain sockets and the possibility to use msg_accrights for passing open file handles as ocaps?&lt;br /&gt;
&lt;br /&gt;
==Filesystem Detection==&lt;br /&gt;
* Don't stop filesystem detection after the first match. Often, if a partition is reused with a new FS and is not all that full yet, much of the old FS can still be valid. I have seen this with ext2/fat. The fact that you have identified a valid FS on a partition doesn't mean there isn't an (almost) valid second FS that would yield additional files. Identifying doubly allocated space might in some cases also be relevant.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] What you're saying is that dealing with file system fragments should be part of the carving algorithm&lt;br /&gt;
* Allow use where filesystem based carving is done by other tool, and the tool is used as second stage on (sets of) unallocated block (pseudo) files and/or non FS partition (pseudo) files.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I would not opt for this. The tool would be dependent on other tools and their data format, which makes the tool difficult to maintain. I would opt to integrate the functionality of having multiple recovery phases (stages) and allow the tooling to run the phases one after another or separately.&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] More generically, I feel a way should exist to communicate the 'left overs' a previous (non open, for example LE-only) tool left.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess if the tool is designed to handle multiple phases it should store its data somewhere. So it should be possible to convert results of such non open tooling to the format required. However I would opt to design the recovery functionality of these non-open tools into open tools. And not to limit ourselves making translators due to the design of these non-open tools.&lt;br /&gt;
* Ability to be used as a library instead of a tool. Ability to access metadata through the library, and thus the ability to set metadata from the carving modules. This would be extremely useful for integrating the project into a [[computer forensics framework]].&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess most of the code could be integrated into libraries, but I would not opt integrating tool functionality into a library&lt;br /&gt;
* [[User:Mark Stam|Mark]] I think it would be very handy to have a CSV, TSV, XML or other delimited output (log)file with information about carved files. This output file can then be stored in a database or Excel sheet (report function)&lt;br /&gt;
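:: A minimal sketch of such a delimited report, assuming Python and its standard csv module. The column names (offset, length, type, validated) and the function name are hypothetical; which fields the log should carry has not been agreed on here.&lt;br /&gt;

```python
import csv
import io

# Hypothetical column set for the carving report; not an agreed format.
FIELDS = ["offset", "length", "type", "validated"]


def write_carve_log(results, stream):
    """Write one CSV row per carved file to an open text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for row in results:
        writer.writerow(row)


# Example: build the report in memory; a real run would pass an open file.
report = io.StringIO()
write_carve_log(
    [{"offset": 4096, "length": 53248, "type": "jpeg", "validated": True}],
    report,
)
```

:: The resulting text can be loaded into a database or a spreadsheet as is; switching to TSV would only need a different delimiter argument.&lt;br /&gt;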
&lt;br /&gt;
==Anti forensics and system integrity concerns==&lt;br /&gt;
* It might be very interesting to look at the possibilities of using a multi process style of module support and combine it with a least authority design. On platforms that support AppArmor (or similar) and uid based firewall rules, this could make for the first true POLA (principle of least authority) based forensic tool ever. POLA based forensics tools should make for a strong integrity guard against many anti forensics. Alternatively we could look at integrating a capability secure language (E?) for implementation of at least validation modules. I don't expect this idea to make it, but I hope that mentioning it might spark off less strong alternatives that at least partially address the integrity + anti-forensics problem. If we can in some way introduce POLA to a wider forensics public, other tools might also pick up on it, which would be great.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Could you give an example of how you see this in action?&lt;br /&gt;
::::[[User:Capibara|Rob J Meijer]] I see two layers where POLA could be applied. The best one would require one of the following as prerequisites:&lt;br /&gt;
::::* The libaff/libewf layer is moved to a fuse implementation (for example carvfs).&lt;br /&gt;
::::* Libewf/Libaff are updated to accept opened file handles instead of demanding to open their own files. &lt;br /&gt;
::::If one of these is fulfilled, then the tool running as some user can just have the simple task of opening the image files, starting up the 'real' tool and handing over the appropriate file handles. If the real tool runs with a restrictive AppArmor profile, and is started suid to a tool specific user that also has its own iptables uid based filter, then the real tool will run with least authority.&lt;br /&gt;
:::: A second alternative, if neither of the above prerequisites can be met, would be to run the modules as confined processes and have a non confined process run as proxy for the first.&lt;br /&gt;
:::: A third probably far fetched alternative would be to embed an object capability language in the tool and make the module interface thus that modules are to be written in this ocap language.&lt;br /&gt;
::::A 4th alternative might include minorfs or plash, but I haven't given those sufficient thinking hours yet.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Format syntax specification ==&lt;br /&gt;
* Carving data structures. For example, extract all TCP headers from image by defining TCP header structure and some fields (e.g. source port &amp;gt; 1024, dest port = 80). This will extract all data matching the pattern and write a file with other fields. Another example is carving INFO2 structures and URL activity records from index.dat [[User:.FUF|.FUF]] 20:51, 28 October 2008 (UTC)&lt;br /&gt;
** This has the opportunity to be extended to the concept of &amp;quot;point at blob FOO and interpret it as BAR&amp;quot;&lt;br /&gt;
.FUF added:&lt;br /&gt;
The main idea is to allow users to define structures, for example (in pascal-like form):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1: Byte = 123;&lt;br /&gt;
SomeTextLength: DWORD;&lt;br /&gt;
SomeText: string[SomeTextLength];&lt;br /&gt;
Field4: Char = 'r';&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will produce something like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1 = 123&lt;br /&gt;
SomeTextLength = 5&lt;br /&gt;
SomeText = 'abcd1'&lt;br /&gt;
Field4 = 'r'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(In text or raw forms.)&lt;br /&gt;
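A rough sketch of how a carver might evaluate the example definition above, written in Python for illustration only. The field widths (1-byte Byte and Char, 4-byte DWORD), the little-endian byte order, the ASCII decoding and the function names are all assumptions; none of them are fixed by the proposal.&lt;br /&gt;

```python
# Sketch only: field widths and byte order are assumptions, not part of the proposal.
FIELD1 = 123          # Field1: Byte = 123
FIELD4 = ord("r")     # Field4: Char = 'r'


def parse_record(buf, offset=0):
    """Return the decoded fields if the structure matches at offset, else None."""
    head = buf[offset:offset + 5]              # Field1 (1 byte) + SomeTextLength (4 bytes)
    if len(head) != 5 or head[0] != FIELD1:
        return None
    text_len = int.from_bytes(head[1:5], "little")
    start = offset + 5
    text = buf[start:start + text_len]         # SomeText: string[SomeTextLength]
    tail = buf[start + text_len:start + text_len + 1]
    if len(text) != text_len or tail != bytes([FIELD4]):
        return None
    return {
        "Field1": head[0],
        "SomeTextLength": text_len,
        "SomeText": text.decode("ascii", "replace"),
        "Field4": chr(tail[0]),
    }


def carve_records(buf):
    """Yield (offset, fields) for every byte offset where the structure matches."""
    for offset in range(len(buf)):
        fields = parse_record(buf, offset)
        if fields is not None:
            yield offset, fields
```

Trying every byte offset keeps the sketch short; a real carver would presumably search for the constant fields first and only then check the variable-length part.&lt;br /&gt;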
&lt;br /&gt;
Opinions?&lt;br /&gt;
&lt;br /&gt;
Opinion: Simple pattern identification like that may not suffice, I think Simson's original intent was not only to identify but to allow for validation routines (plugins, as the original wording was).  As such, the format syntax would need to implement a large chunk of some programming language in order to be sufficiently flexible. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
In my opinion your example is too limited. Making the revit configuration I learned you'll need a near programming language to specify some file formats.&lt;br /&gt;
A simple descriptive language is too limiting. I would also go for 2 bytes with endianness instead of using terminology like WORD and small integer; it's much clearer. The configuration also needs to deal with aspects like cardinality, required and optional structures.&lt;br /&gt;
:: This is simply data structures carving, see ideas above. Somebody (I cannot track so many changes per day) separated the original text. There is no need to count and join different structures. [[User:.FUF|.FUF]] 19:53, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This was probably me. Is the text back in its original form?&lt;br /&gt;
:::: I started it by moving your Revit07 comment to the validator/plugin section in [http://www.forensicswiki.org/index.php?title=Carver_2.0_Planning_Page&amp;amp;diff=prev&amp;amp;oldid=7583 this edit], since I was still at that point thinking operational configuration for that section, not parser configurations. [[User:RB|RB]]&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] I renamed the title to format syntax, clarity is important ;-)&lt;br /&gt;
&lt;br /&gt;
Please take a look at the revit07 format syntax specification (configuration). It's not there yet but goes a long way. Some things currently missing:&lt;br /&gt;
* bitwise alignment&lt;br /&gt;
* handling encapsulated streams (MPEG/capture files)&lt;br /&gt;
* handling content based formats (MBOX)&lt;br /&gt;
&lt;br /&gt;
=Carving algorithm =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* should we allow for multiple carving phases (runs/stages)?&lt;br /&gt;
:: I opt yes (separation of concern)&lt;br /&gt;
* should we allow for multiple carving algorithms?&lt;br /&gt;
:: I opt yes, this allows testing of different approaches&lt;br /&gt;
* Should the algorithm try to do as much in 1 run over the input data? To reduce IO?&lt;br /&gt;
:: I opt that the tool should allow for multiple and single run over the input data to minimize the IO or the CPU as bottleneck&lt;br /&gt;
* Interaction between algorithm and validators&lt;br /&gt;
** does the algorithm pass data blocks to the validators?&lt;br /&gt;
** does a validator need to maintain a state?&lt;br /&gt;
** does a validator need to revert a state?&lt;br /&gt;
** How do we deal with embedded files and content validation? Do the validators call another validator?&lt;br /&gt;
* do we use the assumption that a data block can be used by a single file (with the exception of embedded/encapsulated files)?&lt;br /&gt;
* Revit07 allows for multiple concurrent result file states to deal with fragmentation. One has the attribute of being active (the preferred) and the other passive. Do we want/need something similar? The algorithm adds blocks of input data (offsets) to these result file states.&lt;br /&gt;
** if so, what info would these result file states require (type, list of input data blocks)?&lt;br /&gt;
* how do we deal with file system remainders?&lt;br /&gt;
** Can we abstract them and compare them against available file system information?&lt;br /&gt;
* Do we carve file systems in files?&lt;br /&gt;
:: I opt that at least the validator uses this information&lt;br /&gt;
&lt;br /&gt;
==Carving scenarios ==&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* normal file (file structure, loose text based structure (more a content structure?))&lt;br /&gt;
* fragmented file (the file exists in its entirety)&lt;br /&gt;
* a file fragment (the file does not exist in its entirety)&lt;br /&gt;
* intertwined file&lt;br /&gt;
* encapsulated file (MPEG/network capture)&lt;br /&gt;
* embedded file (JPEG thumbnail)&lt;br /&gt;
* obfuscation ('encrypted' PFF) this also entails encryption and/or compression&lt;br /&gt;
* file system in file&lt;br /&gt;
&lt;br /&gt;
=File System Awareness =&lt;br /&gt;
==Background: Why be File System Aware?==&lt;br /&gt;
Advantages of being FS aware:&lt;br /&gt;
* You can pick up sector allocation sizes&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] do you mean file system block sizes?&lt;br /&gt;
* Some file systems may store things off sector boundaries. (ReiserFS with tail packing)&lt;br /&gt;
* Increasingly file systems have compression (NTFS compression)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Carving NTFS-compressed (lznt1) files (http://sourceforge.net/projects/revit/files/Documentation/Carving%20NTFS-compressed%20data/Carving%20for%20NTFS%20compressed%20files.pdf/download)&lt;br /&gt;
* Carve just the sectors that are not in allocated files.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] sparse (file system) blocks e.g. NTFS cluster blocks&lt;br /&gt;
&lt;br /&gt;
==Tasks that would be required==&lt;br /&gt;
&lt;br /&gt;
==Discussion==&lt;br /&gt;
:: As noted above, TSK should be utilized as much as possible, particularly the filesystem-aware portion.  If we want to identify filesystems outside of its supported set, it would be more worth our time to work on implementing them there than in the carver itself.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:::: I guess this tool operates like [[Selective file dumper]] and can recover files in both ways (or not?). Recovering files by using carving can recover files in situations where sleuthkit does nothing (e.g. file on NTFS was deleted using ntfs-3g, or filesystem was destroyed or just unknown). And we should build the list of filesystems supported by carver, not by TSK. [[User:.FUF|.FUF]] 07:08, 29 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:: This tool is still in the early planning stages (requirements discovery), hence few operational details (like precise modes of operation) have been fleshed out - those will and should come later.  The justification for strictly using TSK for the filesystem-sensitive approach is simple: TSK has good filesystem APIs, and it would be foolish to create yet another standalone, incompatible implementation of filesystem(foo) when time would be better spent improving those in TSK, aiding other methods of analysis as well.  This is the same reason individuals that have implemented several other carvers are participating: de-duplication of effort.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] A design problem might be that TSK currently is a single library operating on multiple layers (storage media IO, volume/partition analysis and file system analysis). I'm not aware how easily the parts can be used separately. But I estimate that for the carver we want to use these 3 layers differently than TSK currently does.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I would like to have the carver (recovery tool) also do recovery using file allocation data or remainders of file allocation data.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] &lt;br /&gt;
I would go as far as to ask you all to look beyond the carver as a tool and look from the perspective of the carver as part of the forensic investigation process. In my eyes certain information needed/acquired by the carver could also be very useful investigative information, e.g. what part of a hard disk contains empty sectors.&lt;br /&gt;
&lt;br /&gt;
=Supportive tooling=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* validator (definitions) tester (detest in revit07)&lt;br /&gt;
* tool to make configuration based definitions&lt;br /&gt;
* post carving validation&lt;br /&gt;
* the carver needs to provide support for fuse mount of carved files (carvfs)&lt;br /&gt;
&lt;br /&gt;
=Testing =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* automated testing&lt;br /&gt;
* test data&lt;br /&gt;
&lt;br /&gt;
=Validator Construction=&lt;br /&gt;
Options:&lt;br /&gt;
* Write validators in C/C++&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] you mean dedicated validators&lt;br /&gt;
* Have a scripting language for writing them (python? Perl?) our own?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] use easy to embed programming languages, e.g. Python or Lua&lt;br /&gt;
* Use existing programs (libjpeg?) as plug-in validators?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] define a file structure api for this&lt;br /&gt;
&lt;br /&gt;
=Existing Code that we have=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Please add any missing links&lt;br /&gt;
&lt;br /&gt;
Documentation/Articles&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* DFRWS2008 paper on carving&lt;br /&gt;
&lt;br /&gt;
Carvers&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* photorec (http://www.cgsecurity.org/wiki/PhotoRec)&lt;br /&gt;
* revit06 and revit07 (http://sourceforge.net/projects/revit/)&lt;br /&gt;
* s3/scarve&lt;br /&gt;
&lt;br /&gt;
Possible file structure validator libraries&lt;br /&gt;
* diverse existing file support libraries&lt;br /&gt;
* libole2 (in-house experimental code of OLE2 support)&lt;br /&gt;
* [[libnk2]]&lt;br /&gt;
* [[libpff]]&lt;br /&gt;
&lt;br /&gt;
Input support&lt;br /&gt;
* AFF (http://www.afflib.org/)&lt;br /&gt;
* [[libewf]]&lt;br /&gt;
* TSK device &amp;amp; raw &amp;amp; split raw (http://www.sleuthkit.org/)&lt;br /&gt;
&lt;br /&gt;
Volume/Partition support&lt;br /&gt;
* disktype (http://disktype.sourceforge.net/)&lt;br /&gt;
* testdisk (http://www.cgsecurity.org/wiki/TestDisk)&lt;br /&gt;
* TSK&lt;br /&gt;
&lt;br /&gt;
File system support&lt;br /&gt;
* TSK&lt;br /&gt;
* photorec FS code&lt;br /&gt;
* implementations of FS in Linux/BSD&lt;br /&gt;
* The tree-graph loadable module support, module loader and loadable modules of the Open Computer Forensics Architecture.&lt;br /&gt;
&lt;br /&gt;
Content support&lt;br /&gt;
&lt;br /&gt;
Zero storage support&lt;br /&gt;
* libcarvpath ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210704 )&lt;br /&gt;
* carvfs ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210954 )&lt;br /&gt;
* tsk-cp ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=267227 )&lt;br /&gt;
* carvfsmodewf (http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=268256 )&lt;br /&gt;
&lt;br /&gt;
POLA&lt;br /&gt;
* joe-e (java) ( http://code.google.com/p/joe-e/ )&lt;br /&gt;
* Emily (ocaml)  ( http://erights.org/download/emily/ )&lt;br /&gt;
* the E language ( http://www.erights.org/ )&lt;br /&gt;
* AppArmor&lt;br /&gt;
* iptables/ipfw&lt;br /&gt;
* minorfs ( http://polacanthus.net/minorfs.html )&lt;br /&gt;
* plash ( http://plash.beasts.org/wiki/ )&lt;br /&gt;
&lt;br /&gt;
=Implementation Timeline=&lt;br /&gt;
# gather the available resources/ideas/wishes/needs etc. (I guess we're in this phase)&lt;br /&gt;
# start discussing a high level design (in terms of algorithm, facilities, information needed)&lt;br /&gt;
## input formats facility&lt;br /&gt;
## partition/volume facility&lt;br /&gt;
## file system facility&lt;br /&gt;
## file format facility&lt;br /&gt;
## content facility&lt;br /&gt;
## how to deal with fragment detection (do the validators allow for fragment detection?)&lt;br /&gt;
## how to deal with recombination of fragments&lt;br /&gt;
## do we want multiple carving phases in light of speed/precision tradeoffs&lt;br /&gt;
# start detailing parts of the design&lt;br /&gt;
## Discuss options for a grammar driven validator?&lt;br /&gt;
## Hard-coded plug-ins?&lt;br /&gt;
## Which existing code can we use?&lt;br /&gt;
# start building/assembling parts of the tooling for a prototype&lt;br /&gt;
## Implement simple file carving with validation.&lt;br /&gt;
## Implement gap carving&lt;br /&gt;
# Initial Release&lt;br /&gt;
# Implement the ''threaded carving'' that [[User:.FUF|.FUF]] is describing above.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Shouldn't multi-threaded carving (MTC) be part of the 1st version?&lt;br /&gt;
The MT approach makes for different design decisions&lt;br /&gt;
: It is virtually impossible to turn a non-MT application into an MT application .[[User:Simsong|Simsong]] 06:37, 3 November 2008 (UTC)&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page</id>
		<title>Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page"/>
				<updated>2009-10-09T05:31:25Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* Ideas */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is for planning Carver 2.0.&lt;br /&gt;
&lt;br /&gt;
Please, do not delete text (ideas) here. Use something like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&lt;br /&gt;
= License =&lt;br /&gt;
&lt;br /&gt;
BSD-3.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] library based validators could require other licenses&lt;br /&gt;
::: Make the other libraries plug-able. If you have them, you use them. [[User:Simsong|Simsong]] 06:34, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= OS =&lt;br /&gt;
&lt;br /&gt;
Linux/FreeBSD/MacOS&lt;br /&gt;
: Shouldn't this just match what the underlying afflib &amp;amp; sleuthkit cover? [[User:RB|RB]]&lt;br /&gt;
:: Yes, but you need to test and validate on each. Question: Do we want to support windows? [[User:Simsong|Simsong]] 21:09, 30 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I think we would be wise to design with Windows support from the start; this will improve platform independence&lt;br /&gt;
:::: Agreed; I would even settle at first for being able to run against Cygwin.  Note that I don't even own or use a copy of Windows, but the vast majority of forensic investigators do. [[User:RB|RB]] 14:01, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Capibara|Rob J Meijer]] Leaning heavily on the autotools might be the way to go. I do however feel that support requirements for windows would not be essential. Being able to run from a virtual machine with the main storage mounted over cifs should however be tested and if possible tuned extensively.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] You'll need more than autotools to do native Windows support i.e. file access, UTF-16 support, wrap some basic system functions or have them available otherwise&lt;br /&gt;
::::::[[User:Capibara|Rob J Meijer]] That's exactly my point: Windows support, as in being able to build and run on Windows natively, is much more trouble than it's worth. Better to make a lean and mean autotools based build with few dependencies and little or no recursion, and better to spend effort on a lean POLA design on POSIX based systems than on supporting building and running on non POSIX systems.&lt;br /&gt;
&lt;br /&gt;
= Name tooling =&lt;br /&gt;
&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] A name for the tooling I propose coldcut&lt;br /&gt;
:: How about 'butcher'?  ;)  [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] cleaver ( scalpel on steroids ;-) )&lt;br /&gt;
* I would like to propose Gouge or Chisel :-) [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
&lt;br /&gt;
= Requirements =&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Could we do a MoSCoW evaluation of these.&lt;br /&gt;
* AFF and EWF file images supported from scratch. ([[User:Joachim Metz|Joachim]] I would like to have raw/split raw and device access as well)&lt;br /&gt;
:: If we base our image i/o on afflib, we get all three with one interface. [[User:RB|RB]] Instead of letting the tools use afflib, better to write an afflib module for carvfs, and update the libewf module. The tool could then be oblivious of the file format. [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
:::: [[User:Simsong|Simsong]] 06:29, 3 November 2008 (UTC) The problem with using carvfs is that this adds another dependency. Do you really want to require that people install carvfs in order to run the carver? What about having the thing ported to Windows?&lt;br /&gt;
:::::: [[User:Capibara|Rob J Meijer]] I would support adding one build dependency (libcarvpath) and removing two (libewf/libaff) by moving them to a layer more suited for them (carvfs) that would possibly allow some form of file handle (as cap) based POLA design. I am a proponent of making small things that do one thing and do it right, and of stacking those to do what you need. In my view that would lead ideally to the following (simplified) chain:&lt;br /&gt;
::::::* recursive [[computer forensics framework]] (ocfa/pyflag)&lt;br /&gt;
::::::** &amp;lt;b&amp;gt;The-(pola-based)-carving-tool&amp;lt;/b&amp;gt; &lt;br /&gt;
::::::*** &amp;lt;b&amp;gt;The-carving-lib&amp;lt;/b&amp;gt; working on open fd's.  &lt;br /&gt;
::::::**** libcarvpath&lt;br /&gt;
::::::***** carvfs (Over cifs/nfs-v4 on platforms that don't support Fuse).&lt;br /&gt;
::::::****** libewf&lt;br /&gt;
::::::****** libaff&lt;br /&gt;
::::::*** AppArmor (on supporting platforms)&lt;br /&gt;
::::::*** suid (on supporting platforms)&lt;br /&gt;
::::::*** iptables/ipfw (on supporting platforms)&lt;br /&gt;
:::::: As for Windows support, I would imagine making carvfs run over smb would go a long way, that is for as far as Windows support is all that relevant. &lt;br /&gt;
:::::: There are two advantages to using libcarvpath and carvfs instead of libaff/libewf at this layer:&lt;br /&gt;
::::::* storage requirements for doing carving. Beyond what sleuthkit or alternatives provide I have seen many situations where carving was not done due to storage limitations.&lt;br /&gt;
::::::* File handles are like object capabilities. You can often do pretty simple POLA based implementations using file handles and something like AppArmor. POLA could IMHO be a strong weapon against the more nasty forms of anti forensics.&lt;br /&gt;
::::::Next to this, I would consider making different tools for different stages instead of one semi recursive one, and looking at how to integrate these tools into existing frameworks (ocfa/pyflag). &lt;br /&gt;
::::::Keep things simple but rigid and try to easily integrate things into existing frameworks as effectively as possible I would suggest.&lt;br /&gt;
::::::Please note, I am not proposing the lib/tool should be useless without libcarvpath, only that usage without carvfs should limit the&lt;br /&gt;
::::::supported image formats to raw images, and that libewf/libaff should be abstracted at the Fuse level or below and not at the tool level.  &lt;br /&gt;
:::::::[[User:Joachim Metz|Joachim]] do you have an idea what the performance impact of this approach would be? It might be wise to do a proof of concept for this approach first.&lt;br /&gt;
::::::::[[User:Capibara|Rob J Meijer]] I think it would depend greatly on the behavior of the carving lib/tool. Small 512 byte reads are relatively very expensive, 128 KB reads have negligible impact. Here are some numbers from ntfs-3g:  http://article.gmane.org/gmane.comp.file-systems.fuse.devel/6397/match=ntfs+3g+performance+ext3 that might be relevant. More relevant than performance might be library footprint. For example, using OCFA, we often would want to keep a few hundred images that total to tens of TB of projected storage size fuse mounted. If libewf/libaff have a big combined memory footprint in such cases, this can be a major issue for this approach.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] this layer should support multi threaded decompression of compressed image types, this speeds up IO&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] volume/partition aware layer (what about carving unpartitioned space)&lt;br /&gt;
* File system aware layer. This could be or make use of tsk-cp.&lt;br /&gt;
** By default, files are not carved. (clarify: only identified? [[User:RB|RB]]; I guess that it operates like [[Selective file dumper]] [[User:.FUF|.FUF]] 07:00, 29 October 2008 (UTC)). Alternatively, the tool could use libcarvpath and output carvpaths or create a directory with symlinks to carvpaths that point into a carvfs mountpoint [[User:Capibara|Rob J Meijer]].&lt;br /&gt;
* Plug-in architecture for identification/validation.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] support for multiple types of validators&lt;br /&gt;
*** dedicated validator&lt;br /&gt;
*** validator based on file library (i.e. we could specify/implement a file structure API for these)&lt;br /&gt;
*** configuration based validator (can handle config files, like Revit07, to describe the different file formats used by the carver)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Moderator: Could we limit the requirements for prototype version 1 of the tool to get a working version up and running ASAP?&lt;br /&gt;
And keep discussing future options?&lt;br /&gt;
&lt;br /&gt;
I think the following set will be large enough to handle:&lt;br /&gt;
Input facilities&lt;br /&gt;
* IO support (AFF, device, EWF, RAW and split RAW)&lt;br /&gt;
:: Abstraction of input format and multi threaded decompression (spin-off code out of afflib?)&lt;br /&gt;
* Volume/Partitions support&lt;br /&gt;
:: at least for DOS based layout and GPT (spin-off code out of TSK/Photorec?)&lt;br /&gt;
* File system support&lt;br /&gt;
:: VFAT/NTFS (spin-off code out of TSK/Photorec?)&lt;br /&gt;
&lt;br /&gt;
Carving facilities&lt;br /&gt;
* File format support using plug-able validator model (use dedicated validators Photorec/Scarve and/or wrap revit07 file format as validator?)&lt;br /&gt;
* Content support using plug-able validator model (to handle text/mbox base64)&lt;br /&gt;
* File system carving support (to handle file system fragments, could be linked to file system support layer?)&lt;br /&gt;
* Basic fragment handling&lt;br /&gt;
&lt;br /&gt;
Output facilities&lt;br /&gt;
* audit/analysis/debug log&lt;br /&gt;
* extraction of result files&lt;br /&gt;
&lt;br /&gt;
==Supported File Formats==&lt;br /&gt;
Ship with validators for&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I think we should distinguish between file format validators and content validators&lt;br /&gt;
&lt;br /&gt;
* Graphical Images&lt;br /&gt;
** JPEG (the 3 different types with JFIF/EXIF support)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] How different is JPEG 2000?&lt;br /&gt;
** PNG&lt;br /&gt;
** GIF&lt;br /&gt;
** BMP&lt;br /&gt;
** TIFF&lt;br /&gt;
* Office documents&lt;br /&gt;
** Microsoft Office 97 - 2003 [[OLE Compound File]] format based with [[Word Document (DOC)]] and [[Excel Spreadsheet (XLS)]] file format support&lt;br /&gt;
** [[PDF]]&lt;br /&gt;
** Open Office and Microsoft Office 2007 [[ZIP archive]] based file formats&lt;br /&gt;
:: Extension validation? AFAIK, MS Office 2007 [[Word Document (DOCX)]] format uses plain ZIP (or not?), and carved files will (or not?) have .zip extension instead of DOCX. Is there any way to fix this (may be using the file list in zip)? [[User:.FUF|.FUF]] 20:25, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Addition: Office 2007 also has a binary file format which is also a ZIP-ed data [[Excel Spreadsheet (XLSB)]]&lt;br /&gt;
* Archive files&lt;br /&gt;
** [[ZIP archive]] file format&lt;br /&gt;
** 7z&lt;br /&gt;
** tar, gzip, bzip2&lt;br /&gt;
** RAR&lt;br /&gt;
* E-mail files&lt;br /&gt;
** [[Personal Folder File (PAB, PST, OST)]]&lt;br /&gt;
** MBOX (text based format, base64 content support)&lt;br /&gt;
* Audio/Video files&lt;br /&gt;
** MPEG&lt;br /&gt;
** MP2/MP3&lt;br /&gt;
** AVI&lt;br /&gt;
** ASF/WMV&lt;br /&gt;
** QuickTime&lt;br /&gt;
** MKV&lt;br /&gt;
* Printer spool files&lt;br /&gt;
** EMF (if I remember correctly)&lt;br /&gt;
* Internet history files&lt;br /&gt;
** index.dat&lt;br /&gt;
** Firefox (SQLite 3)&lt;br /&gt;
* Other files&lt;br /&gt;
** thumbs.db which also is an [[OLE Compound File]] based format&lt;br /&gt;
** pagefile?&lt;br /&gt;
&lt;br /&gt;
==Carving Strategies==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Note to moderator could this section be merged with the carving algorithm section?&lt;br /&gt;
&lt;br /&gt;
* Simple fragment recovery carving using gap carving.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have hook in for more advanced fragment recovery?&lt;br /&gt;
* Recovering of individual ZIP sections and JPEG icons that are not sector aligned.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] I would propose a generic fragment detection and recovery&lt;br /&gt;
* Autonomous operation (some mode of operation should be completely non-interactive, requiring no human intervention to complete [[User:RB|RB]])&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] as much as possible, but allow it to be overridden by the user&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] When the tool outputs files, the filenames should contain the offset in the input data (in hexadecimal?)&lt;br /&gt;
:: [[User:Mark Stam|Mark]] I really like the fact carved files are named after the physical or logical sector in which the file is found (photorec)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This naming scheme might cause duplicate-name problems when extracting embedded files and when extracting files from non-sector-aligned file systems.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow exporting embedded files?&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow exporting fragments separately?&lt;br /&gt;
* [[User:Mark Stam|Mark]] I personally use photorec often for carving files in the whole volume (not only unallocated clusters), so I can store information about all potentially interesting files in MySQL&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Interesting. Bas Kloet and I have been discussing using information about allocated files in the recovery process, i.e. recovered fragments could be part of allocated files. Do we want to be able to extract them? Or could we rebuild the file from the fragments and the allocated files?&lt;br /&gt;
* [[User:Mark Stam|Mark]] It would also be nice if the files can be hashed immediately (MD5) so looking for them in other tools (for example EnCase) is a snap&lt;br /&gt;
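&lt;br /&gt;
A minimal Python sketch of the naming and hashing suggestions above. The function names, the zero-padded hexadecimal layout and the 512-byte default sector size are assumptions made for illustration, not an agreed design. Naming by byte offset rather than by sector number alone keeps names unique for embedded files and for data that is not sector aligned:&lt;br /&gt;
&lt;br /&gt;
```python
import hashlib


def carved_name(offset, extension, sector_size=512):
    # Hypothetical naming scheme: zero-padded hexadecimal byte offset,
    # plus the sector number when the offset is sector aligned.
    sector, remainder = divmod(offset, sector_size)
    name = '0x{:012x}'.format(offset)
    if remainder == 0:
        name += '_s{}'.format(sector)
    return name + '.' + extension


def carved_md5(data):
    # MD5 of the carved bytes, so the file can be looked up in other tools.
    return hashlib.md5(data).hexdigest()
```
&lt;br /&gt;
With these assumptions, a file found at byte offset 4096 would be named 0x000000001000_s8.jpg, while one found at offset 4100 (not sector aligned) would be named 0x000000001004.jpg.&lt;br /&gt;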
&lt;br /&gt;
==Performance Requirements==&lt;br /&gt;
* Tested on 500GB-sized images. Should be able to carve a 500GB image in roughly 50% longer than it takes to read the image.&lt;br /&gt;
** Perhaps allocate a percentage budget per-validator (i.e. each validator adds N% to the carving time) [[User:RB|RB]]&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have multiple carving phases for precision/speed trade off?&lt;br /&gt;
* Parallelizable&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] tunable for different architectures&lt;br /&gt;
* Configuration:&lt;br /&gt;
** Capability to parse some existing carvers' configuration files, either on-the-fly or as a one-way converter.&lt;br /&gt;
** Disengage internal configuration structure from configuration files, create parsers that present the expected structure&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] The validator should deal with the file structure; the carving algorithm should not know anything about the file structure (as in the revit07 design)&lt;br /&gt;
**  Either extend Scalpel/Foremost syntaxes for extended features or use a tertiary syntax ([[User:Joachim Metz|Joachim]] I would prefer a derivative of the revit07 configuration syntax, which has already run into some of the problems of defining file structure in a configuration file)&lt;br /&gt;
&lt;br /&gt;
==Output==&lt;br /&gt;
* Can output audit.txt file.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output database with offset analysis values i.e. for visualization tooling&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output debug log for debugging the algorithm/validation&lt;br /&gt;
* Easy integration into ascription software.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not a native speaker; what do you mean by &amp;quot;ascription software&amp;quot;?&lt;br /&gt;
::: I think this was another non-native requesting easy scriptability. [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] that makes sense ;-)&lt;br /&gt;
::::: Incorrect. Ascription software is software that determines who the owner of a file is. [[User:Simsong|Simsong]] 06:36, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= Ideas =&lt;br /&gt;
* Use TSK as much as possible. Don't carry your own FS implementation the way photorec does.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Using TSK as much as possible would not allow adding your own file system support (e.g. mobile phones, memory structures, cap files). I would propose wrapping TSK and using it as much as possible, but allowing integration of our own FS implementations.&lt;br /&gt;
::::[[User:Capibara|Rob J Meijer]]  I'm currently working on wrapping the TSK filesystem code as several loadable modules for OCFA. In OCFA a loadable (tree-graph) module implements the OCFA tree-graph API, which basically states 'everything is a tree-graph node'. Possibly you could look at the OCFA treegraph API and module loading interface as an example, or we could work together on changing the API and module loading interface in such a way that it doesn't break OCFA and is useful for the standalone carver, while allowing both to use exactly the same tree-graph loadable modules. &lt;br /&gt;
* Extracting/carving data from [[Thumbs.db]]? I've used [[foremost]] for it with some success. [[Vinetto]] has some critical bugs :( [[User:.FUF|.FUF]] 19:18, 28 October 2008 (UTC)&lt;br /&gt;
==Recursive Carving==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Do we want to support (let's call it) 'recursive in-file carving' (for now)? This is different from embedded files, because there is a file system structure in the file and not just another file structure.&lt;br /&gt;
* Is it just me, or do a lot of the above (and below) ideas somewhat skirt around the fact that many of us want recursive carving?  Can we bend back to that instead of discussing object particulars?  I think this can be distilled down to three requirements:&lt;br /&gt;
** Simple recursion: once an object is identified, have the ability to re-carve it for internal structures&lt;br /&gt;
** Directed recursion: the carver should be able to be directed at arbitrary blobs and told to carve them as a specified type.  This allows programmatically simpler methods of dealing with unidentifiably compressed or encrypted data, or filesystem fragments.&lt;br /&gt;
** Export: the ability to export an object (recognized or not) for later or external &amp;quot;recursion&amp;quot;.  Should go without saying for a carver, but...&lt;br /&gt;
:--[[User:RB|RB]] 18:45, 2 November 2008 (UTC)&lt;br /&gt;
:: [[User:Simsong|Simsong]] 06:30, 3 November 2008 (UTC) pyflag already does recursive carving. Are we just going to reimplement pyflag as a single executable?&lt;br /&gt;
::: [[User:Joachim Metz|Joachim]] that could be useful ;-)&lt;br /&gt;
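&lt;br /&gt;
The 'simple recursion' requirement above could be sketched roughly as follows in Python. The header/footer detector, the signature table layout and all function names are hypothetical and only serve the illustration; a real carver would use validators instead. Each hit is reported with its absolute offset, its length and the nesting depth at which it was first found, and the payload of every hit is carved again:&lt;br /&gt;
&lt;br /&gt;
```python
def find_objects(blob, header, footer):
    # Toy header/footer detector (an assumption for this sketch):
    # yields (start, end) byte ranges, end exclusive.
    position = 0
    while True:
        start = blob.find(header, position)
        if start == -1:
            return
        stop = blob.find(footer, start + len(header))
        if stop == -1:
            return
        end = stop + len(footer)
        yield start, end
        position = end


def carve(blob, signatures, base=0, depth=0, max_depth=3, seen=None):
    # Returns (kind, absolute offset, length, depth) tuples.
    seen = set() if seen is None else seen
    found = []
    if depth == max_depth:
        return found
    for kind, (header, footer) in signatures.items():
        for start, end in find_objects(blob, header, footer):
            key = (kind, base + start, end - start)
            if key in seen:
                continue  # already reported from another nesting level
            seen.add(key)
            found.append(key + (depth,))
            # Simple recursion: re-carve the payload between header and footer.
            inner_start = start + len(header)
            inner = blob[inner_start:end - len(footer)]
            found.extend(carve(inner, signatures, base + inner_start,
                               depth + 1, max_depth, seen))
    return found
```
&lt;br /&gt;
Directed recursion would then amount to calling carve on an arbitrary blob with a signature table restricted to the requested type.&lt;br /&gt;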
&lt;br /&gt;
==Library Dependencies==&lt;br /&gt;
[[User:Capibara|Rob J Meijer]] :&lt;br /&gt;
* Use libcarvpath whenever possible and by default to avoid high storage requirements.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] For easy deployment I would not opt for making an integral part of the tool solely dependent on a single external library; otherwise the library must be integrated in the package&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] Integrating libraries (libtsk, libaff, libewf, libcarvpath etc.) is bad practice; autotools are your friend IMO.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not talking about integrating (shared) libraries. I'm saying that an integral part of a tool should be part of its package. Why can't the tool package contain shared or static libraries for local use? A far worse thing to do is to have a large set of dependencies, making the tool difficult to install for most users. The tool package should contain the most necessary code. afflib/libewf support could be detected by the autotools, giving a neat separation of functionality.&lt;br /&gt;
::: From a packager's standpoint, [[User:Joachim Metz|Joachim]]'s other libraries do a really good job of this, carrying around what they need but using a system-global version if available.  [[User:RB|RB]]&lt;br /&gt;
* libtsk&lt;br /&gt;
* libaff ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* libewf ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* posix ? : Can we depend especially on the availability of UNIX domain sockets and the possibility to use msg_accrights for passing open file handles as ocaps?&lt;br /&gt;
&lt;br /&gt;
==Filesystem Detection==&lt;br /&gt;
* Don't stop filesystem detection after the first match. Often, if a partition is reused with a new FS and is not all that full yet, much of the old FS can still be valid. I have seen this with ext2/fat. The fact that you have identified a valid FS on a partition doesn't mean there isn't an (almost) valid second FS that would yield additional files. Identifying doubly allocated space might in some cases also be relevant.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] What you're saying is that dealing with file system fragments should be part of the carving algorithm&lt;br /&gt;
* Allow use where filesystem based carving is done by another tool, and the tool is used as second stage on (sets of) unallocated block (pseudo) files and/or non FS partition (pseudo) files.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I would not opt for this. The tool would be dependent on other tools and their data format, which makes the tool difficult to maintain. I would opt to integrate the functionality of having multiple recovery phases (stages) and allow the tooling to run the phases one after another or separately.&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] More generically, I feel a way should exist to communicate the 'leftovers' that a previous (non-open, for example LE-only) tool left behind.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess if the tool is designed to handle multiple phases it should store its data somewhere. So it should be possible to convert results of such non open tooling to the format required. However I would opt to design the recovery functionality of these non-open tools into open tools. And not to limit ourselves making translators due to the design of these non-open tools.&lt;br /&gt;
* Ability to be used as a library instead of a tool. Ability to access metadata through the library, and thus the ability to set metadata from the carving modules. This would be extremely useful for integrating the project into a [[computer forensics framework]].&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess most of the code could be integrated into libraries, but I would not opt integrating tool functionality into a library&lt;br /&gt;
* [[User:Mark Stam|Mark]] I think it would be very handy to have a CSV, TSV, XML or other delimited output (log)file with information about carved files. This output file can then be stored in a database or Excel sheet (report function)&lt;br /&gt;
&lt;br /&gt;
==Anti forensics and system integrity concerns==&lt;br /&gt;
* It might be very interesting to look at the possibilities of using a multi-process style of module support and combining it with a least authority design. On platforms that support AppArmor (or similar) and uid-based firewall rules, this could make for the first true POLA (principle of least authority) based forensic tool ever. POLA-based forensics tools should make for a strong integrity guard against many anti-forensics techniques. Alternatively we could look at integrating a capability-secure language (E?) for implementation of at least the validation modules. I don't expect this idea to make it, but I hope that mentioning it might spark off less strong alternatives that at least partially address the integrity + anti-forensics problem. If we can in some way introduce POLA to a wider forensics public, other tools might also pick up on it, which would be great.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Could you give an example of how you see this in action?&lt;br /&gt;
::::[[User:Capibara|Rob J Meijer]] I see two layers where POLA could be applied. The best one would require one of the following as prerequisites:&lt;br /&gt;
::::* The libaff/libewf layer is moved to a fuse implementation (for example carvfs).&lt;br /&gt;
::::* Libewf/Libaff are updated to accept opened file handles instead of demanding to open their own files. &lt;br /&gt;
::::If one of these is fulfilled, then the tool running as some user can just have the simple task of opening the image files, starting up the 'real' tool and handing over the appropriate file handles. If the real tool runs with a restrictive AppArmor profile, and is started suid to a tool-specific user that also has its own iptables uid-based filter, then the real tool will run with least authority.&lt;br /&gt;
:::: A second alternative, if neither of the above prerequisites can be met, would be to run the modules as confined processes and have a non-confined process run as a proxy for them.&lt;br /&gt;
:::: A third, probably far-fetched alternative would be to embed an object capability language in the tool and design the module interface such that modules are to be written in this ocap language.&lt;br /&gt;
::::A 4th alternative might include minorfs or plash, but I haven't given those sufficient thought yet.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Format syntax specification ==&lt;br /&gt;
* Carving data structures. For example, extract all TCP headers from image by defining TCP header structure and some fields (e.g. source port &amp;gt; 1024, dest port = 80). This will extract all data matching the pattern and write a file with other fields. Another example is carving INFO2 structures and URL activity records from index.dat [[User:.FUF|.FUF]] 20:51, 28 October 2008 (UTC)&lt;br /&gt;
** This has the opportunity to be extended to the concept of &amp;quot;point at blob FOO and interpret it as BAR&amp;quot;&lt;br /&gt;
.FUF added:&lt;br /&gt;
The main idea is to allow users to define structures, for example (in pascal-like form):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1: Byte = 123;&lt;br /&gt;
SomeTextLength: DWORD;&lt;br /&gt;
SomeText: string[SomeTextLength];&lt;br /&gt;
Field4: Char = 'r';&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will produce something like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1 = 123&lt;br /&gt;
SomeTextLength = 5&lt;br /&gt;
SomeText = 'abcd1'&lt;br /&gt;
Field4 = 'r'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(In text or raw forms.)&lt;br /&gt;
&lt;br /&gt;
Opinions?&lt;br /&gt;
&lt;br /&gt;
Opinion: Simple pattern identification like that may not suffice; I think Simson's original intent was not only to identify but to allow for validation routines (plugins, as the original wording was).  As such, the format syntax would need to implement a large chunk of some programming language in order to be sufficiently flexible. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
In my opinion your example is too limited. While making the revit configuration I learned that you need something close to a programming language to specify some file formats.&lt;br /&gt;
A simple descriptive language is too limiting. I would also go for 2 bytes with endianness instead of using terminology like WORD and small integer; it's much clearer. The configuration also needs to deal with aspects like cardinality, and required and optional structures.&lt;br /&gt;
:: This is simply data structures carving, see ideas above. Somebody (I cannot track so many changes per day) separated the original text. There is no need to count and join different structures. [[User:.FUF|.FUF]] 19:53, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This was probably me. Is the text back in its original form?&lt;br /&gt;
:::: I started it by moving your Revit07 comment to the validator/plugin section in [http://www.forensicswiki.org/index.php?title=Carver_2.0_Planning_Page&amp;amp;diff=prev&amp;amp;oldid=7583 this edit], since I was still at that point thinking operational configuration for that section, not parser configurations. [[User:RB|RB]]&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] I renamed the title to format syntax, clarity is important ;-)&lt;br /&gt;
&lt;br /&gt;
Please take a look at the revit07 format syntax specification (configuration). It's not there yet but goes a long way. Some things currently missing:&lt;br /&gt;
* bitwise alignment&lt;br /&gt;
* handling encapsulated streams (MPEG/capture files)&lt;br /&gt;
* handling content based formats (MBOX)&lt;br /&gt;
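&lt;br /&gt;
As a rough illustration of what the pascal-like structure definition in this section asks of a carver, the following Python sketch hard-codes that one example. The function name is hypothetical, and treating DWORD as 4 bytes little-endian is an assumption; the points raised in the discussion (cardinality, optional structures, validation plugins) are not covered:&lt;br /&gt;
&lt;br /&gt;
```python
def parse_record(data, offset=0):
    # Returns the decoded fields, or None if the bytes do not match.
    header = data[offset:offset + 5]
    # Field1: Byte = 123 (a fixed value, so it acts as a signature)
    if len(header) != 5 or header[0] != 123:
        return None
    # SomeTextLength: DWORD, assumed here to be 4 bytes, little-endian
    text_length = int.from_bytes(header[1:5], 'little')
    text_start = offset + 5
    text_end = text_start + text_length
    text = data[text_start:text_end]
    tail = data[text_end:text_end + 1]
    # SomeText must be complete and Field4: Char = 'r' must follow it
    if len(text) != text_length or tail != b'r':
        return None
    return {
        'Field1': header[0],
        'SomeTextLength': text_length,
        'SomeText': text.decode('ascii', errors='replace'),
        'Field4': tail.decode('ascii'),
    }
```
&lt;br /&gt;
A carver could call such a function at every candidate offset and write out the returned fields in text form for each match.&lt;br /&gt;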
&lt;br /&gt;
=Carving algorithm =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* should we allow for multiple carving phases (runs/stages)?&lt;br /&gt;
:: I opt yes (separation of concern)&lt;br /&gt;
* should we allow for multiple carving algorithms?&lt;br /&gt;
:: I opt yes, this allows testing of different approaches&lt;br /&gt;
* Should the algorithm try to do as much in 1 run over the input data? To reduce IO?&lt;br /&gt;
:: I opt that the tool should allow both single and multiple runs over the input data, to minimize either the IO or the CPU as the bottleneck&lt;br /&gt;
* Interaction between algorithm and validators&lt;br /&gt;
** does the algorithm pass data blocks to the validators?&lt;br /&gt;
** does a validator need to maintain a state?&lt;br /&gt;
** does a validator need to revert a state?&lt;br /&gt;
** How do we deal with embedded files and content validation? Do the validators call another validator?&lt;br /&gt;
* do we use the assumption that a data block can be used by a single file (with the exception of embedded/encapsulated files)?&lt;br /&gt;
* Revit07 allows for multiple concurrent result file states to deal with fragmentation. One has the attribute of being active (the preferred one) and the other passive. Do we want/need something similar? The algorithm adds blocks of input data (offsets) to these result file states.&lt;br /&gt;
** if so, what info would these result file states require (type, list of input data blocks)?&lt;br /&gt;
* how do we deal with file system remainders?&lt;br /&gt;
** Can we abstract them and compare them against available file system information?&lt;br /&gt;
* Do we carve file systems in files?&lt;br /&gt;
:: I opt that at least the validator uses this information&lt;br /&gt;
&lt;br /&gt;
==Carving scenarios ==&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* normal file (file structure, loose text based structure (more a content structure?))&lt;br /&gt;
* fragmented file (the file exists in its entirety)&lt;br /&gt;
* a file fragment (the file does not exist in its entirety)&lt;br /&gt;
* intertwined file&lt;br /&gt;
* encapsulated file (MPEG/network capture)&lt;br /&gt;
* embedded file (JPEG thumbnail)&lt;br /&gt;
* obfuscation ('encrypted' PFF) this also entails encryption and/or compression&lt;br /&gt;
* file system in file&lt;br /&gt;
&lt;br /&gt;
=File System Awareness =&lt;br /&gt;
==Background: Why be File System Aware?==&lt;br /&gt;
Advantages of being FS aware:&lt;br /&gt;
* You can pick up sector allocation sizes&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] do you mean file system block sizes?&lt;br /&gt;
* Some file systems may store things off sector boundaries. (ReiserFS with tail packing)&lt;br /&gt;
* Increasingly file systems have compression (NTFS compression)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Carving NTFS-compressed (lznt1) files (http://sourceforge.net/projects/revit/files/Documentation/Carving%20NTFS-compressed%20data/Carving%20for%20NTFS%20compressed%20files.pdf/download)&lt;br /&gt;
* Carve just the sectors that are not in allocated files.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] sparse (file system) blocks e.g. NTFS cluster blocks&lt;br /&gt;
&lt;br /&gt;
==Tasks that would be required==&lt;br /&gt;
&lt;br /&gt;
==Discussion==&lt;br /&gt;
:: As noted above, TSK should be utilized as much as possible, particularly the filesystem-aware portion.  If we want to identify filesystems outside of its supported set, it would be more worth our time to work on implementing them there than in the carver itself.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:::: I guess this tool operates like [[Selective file dumper]] and can recover files in both ways (or not?). Recovering files by using carving can recover files in situations where sleuthkit does nothing (e.g. file on NTFS was deleted using ntfs-3g, or filesystem was destroyed or just unknown). And we should build the list of filesystems supported by carver, not by TSK. [[User:.FUF|.FUF]] 07:08, 29 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:: This tool is still in the early planning stages (requirements discovery), hence few operational details (like precise modes of operation) have been fleshed out - those will and should come later.  The justification for strictly using TSK for the filesystem-sensitive approach is simple: TSK has good filesystem APIs, and it would be foolish to create yet another standalone, incompatible implementation of filesystem(foo) when time would be better spent improving those in TSK, aiding other methods of analysis as well.  This is the same reason individuals that have implemented several other carvers are participating: de-duplication of effort.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] A design problem might be that TSK currently is a single library operating on multiple layers (storage media IO, volume/partition analysis and file system analysis). I'm not aware how easily the parts can be used separately. But I estimate that for the carver we want to use these 3 layers differently than TSK currently does.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I would like to have the carver (recovery tool) also do recovery using file allocation data or remainders of file allocation data.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] &lt;br /&gt;
I would go as far to ask you all to look beyond the carver as a tool and look from the perspective of the carver as part of the forensic investigation process. In my eyes certain information needed/acquired by the carver could be also very useful investigative information i.e. what part of a hard disk contains empty sectors.&lt;br /&gt;
&lt;br /&gt;
=Supportive tooling=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* validator (definitions) tester (detest in revit07)&lt;br /&gt;
* tool to make configuration based definitions&lt;br /&gt;
* post carving validation&lt;br /&gt;
* the carver needs to provide support for fuse mount of carved files (carvfs)&lt;br /&gt;
&lt;br /&gt;
=Testing =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* automated testing&lt;br /&gt;
* test data&lt;br /&gt;
&lt;br /&gt;
=Validator Construction=&lt;br /&gt;
Options:&lt;br /&gt;
* Write validators in C/C++&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] you mean dedicated validators&lt;br /&gt;
* Have a scripting language for writing them (python? Perl?) our own?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] use easy to embed programming languages e.g. Python or Lua&lt;br /&gt;
* Use existing programs (libjpeg?) as plug-in validators?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] define a file structure api for this&lt;br /&gt;
&lt;br /&gt;
=Existing Code that we have=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Please add any missing links&lt;br /&gt;
&lt;br /&gt;
Documentation/Articles&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* DFRWS2008 paper on carving&lt;br /&gt;
&lt;br /&gt;
Carvers&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* photorec (http://www.cgsecurity.org/wiki/PhotoRec)&lt;br /&gt;
* revit06 and revit07 (http://sourceforge.net/projects/revit/)&lt;br /&gt;
* s3/scarve&lt;br /&gt;
&lt;br /&gt;
Possible file structure validator libraries&lt;br /&gt;
* diverse existing file support libraries&lt;br /&gt;
* libole2 (inhouse experimental code of OLE2 support)&lt;br /&gt;
* [[libnk2]]&lt;br /&gt;
* [[libpff]]&lt;br /&gt;
&lt;br /&gt;
Input support&lt;br /&gt;
* AFF (http://www.afflib.org/)&lt;br /&gt;
* [[libewf]]&lt;br /&gt;
* TSK device &amp;amp; raw &amp;amp; split raw (http://www.sleuthkit.org/)&lt;br /&gt;
&lt;br /&gt;
Volume/Partition support&lt;br /&gt;
* disktype (http://disktype.sourceforge.net/)&lt;br /&gt;
* testdisk (http://www.cgsecurity.org/wiki/TestDisk)&lt;br /&gt;
* TSK&lt;br /&gt;
&lt;br /&gt;
File system support&lt;br /&gt;
* TSK&lt;br /&gt;
* photorec FS code&lt;br /&gt;
* implementations of FS in Linux/BSD&lt;br /&gt;
&lt;br /&gt;
Content support&lt;br /&gt;
&lt;br /&gt;
Zero storage support&lt;br /&gt;
* libcarvpath ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210704 )&lt;br /&gt;
* carvfs ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210954 )&lt;br /&gt;
* tsk-cp ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=267227 )&lt;br /&gt;
* carvfsmodewf (http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=268256 )&lt;br /&gt;
POLA&lt;br /&gt;
* joe-e (java) ( http://code.google.com/p/joe-e/ )&lt;br /&gt;
* Emily (ocaml)  ( http://erights.org/download/emily/ )&lt;br /&gt;
* the E language ( http://www.erights.org/ )&lt;br /&gt;
* AppArmor&lt;br /&gt;
* iptables/ipfw&lt;br /&gt;
* minorfs ( http://polacanthus.net/minorfs.html )&lt;br /&gt;
* plash ( http://plash.beasts.org/wiki/ )&lt;br /&gt;
&lt;br /&gt;
=Implementation Timeline=&lt;br /&gt;
# gather the available resources/ideas/wishes/needs etc. (I guess we're in this phase)&lt;br /&gt;
# start discussing a high level design (in terms of algorithm, facilities, information needed)&lt;br /&gt;
## input formats facility&lt;br /&gt;
## partition/volume facility&lt;br /&gt;
## file system facility&lt;br /&gt;
## file format facility&lt;br /&gt;
## content facility&lt;br /&gt;
## how to deal with fragment detection (do the validators allow for fragment detection?)&lt;br /&gt;
## how to deal with recombination of fragments&lt;br /&gt;
## do we want multiple carving phases in light of speed/precision tradeoffs&lt;br /&gt;
# start detailing parts of the design&lt;br /&gt;
## Discuss options for a grammar driven validator?&lt;br /&gt;
## Hard-coded plug-ins?&lt;br /&gt;
## Which existing code can we use?&lt;br /&gt;
# start building/assembling parts of the tooling for a prototype&lt;br /&gt;
## Implement simple file carving with validation.&lt;br /&gt;
## Implement gap carving&lt;br /&gt;
# Initial Release&lt;br /&gt;
# Implement the ''threaded carving'' that [[User:.FUF|.FUF]] is describing above.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Shouldn't multi-threaded carving (MTC) be part of the 1st version?&lt;br /&gt;
The MT approach makes for different design decisions&lt;br /&gt;
: It is virtually impossible to turn a non-MT application into an MT application .[[User:Simsong|Simsong]] 06:37, 3 November 2008 (UTC)&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/The_Sleuth_Kit</id>
		<title>The Sleuth Kit</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/The_Sleuth_Kit"/>
				<updated>2009-08-28T10:20:13Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* See Also */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox_Software |&lt;br /&gt;
  name = The Sleuth Kit |&lt;br /&gt;
  maintainer = [[Brian Carrier]] |&lt;br /&gt;
  os = {{Linux}}, {{FreeBSD}}, {{OpenBSD}}, {{Mac OS X}}, {{SunOS}} |&lt;br /&gt;
  genre = {{Analysis}} |&lt;br /&gt;
  license = {{IBM Open Source License}}, {{Common Public License}}, {{GPL}} |&lt;br /&gt;
  website = [http://www.sleuthkit.org/ sleuthkit.org] |&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
'''The Sleuth Kit''' ('''TSK''') is a collection of [[UNIX]]-based command line tools that allow you to investigate a computer. The current focus of the tools is the file and volume systems and TSK supports [[FAT]] (12/16/32), [[Ext2]]/[[Ext3|3]], [[NTFS]], [[Ufs|UFS]] (1 &amp;amp; 2), and ISO 9660 [[file system]]s.&lt;br /&gt;
&lt;br /&gt;
[[Autopsy]] is a frontend for TSK which allows browser-based access to the TSK tools.&lt;br /&gt;
 &lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
The Sleuth Kit is arranged in layers. There is a ''data layer'' which is concerned with how information is stored on a disk and a ''metadata layer'' which is concerned with information such as [[inode]]s and [[directory|directories]]. The commands that deal with the data layer are prefixed with the letter ''d'', while the commands that deal with the metadata layer are prefixed with the letter ''i''.&lt;br /&gt;
&lt;br /&gt;
Some of the commands in Sleuth Kit are:&lt;br /&gt;
&lt;br /&gt;
; dcat&lt;br /&gt;
: Views the contents of a [[block]].&lt;br /&gt;
&lt;br /&gt;
; dls&lt;br /&gt;
: Lists [[unallocated block]]s, which makes keyword searches more efficient.&lt;br /&gt;
&lt;br /&gt;
; dcalc&lt;br /&gt;
: Tells you where unallocated blocks are.&lt;br /&gt;
&lt;br /&gt;
; dstat&lt;br /&gt;
: Details about a given block.&lt;br /&gt;
&lt;br /&gt;
; icat&lt;br /&gt;
: Views the contents of a file given its inode value or [[cluster number]]. It does not list directories; it outputs file contents.&lt;br /&gt;
&lt;br /&gt;
; ils&lt;br /&gt;
: Lists [[inode]] (metadata structure) information for a file system.&lt;br /&gt;
&lt;br /&gt;
; istat&lt;br /&gt;
: Information about an inode number.&lt;br /&gt;
&lt;br /&gt;
==File Systems Understood==&lt;br /&gt;
&lt;br /&gt;
* [[NTFS]]&lt;br /&gt;
* [[FAT]]&lt;br /&gt;
* [[Ext2]], [[Ext3]]&lt;br /&gt;
* [[Ufs|UFS]] (1 &amp;amp; 2)&lt;br /&gt;
* ISO 9660&lt;br /&gt;
 &lt;br /&gt;
==File Search Facilities==&lt;br /&gt;
&lt;br /&gt;
* Lists allocated and unallocated files.&lt;br /&gt;
* Lists and sorts by file type.&lt;br /&gt;
* Shows a time of creation and change.&lt;br /&gt;
 &lt;br /&gt;
==Historical Reconstruction==&lt;br /&gt;
 &lt;br /&gt;
==Searching Abilities==&lt;br /&gt;
 &lt;br /&gt;
* Searches for keywords.&lt;br /&gt;
* Builds an index.&lt;br /&gt;
&lt;br /&gt;
==Hash Databases==&lt;br /&gt;
&lt;br /&gt;
* Uses [[MD5]] or [[SHA-1]].&lt;br /&gt;
* Interfaces with NIST [[NSRL]], [[Hashkeeper]] and custom databases.&lt;br /&gt;
 &lt;br /&gt;
==Evidence Collection Features==&lt;br /&gt;
 &lt;br /&gt;
* Tracks forensic activity.&lt;br /&gt;
&lt;br /&gt;
=History=&lt;br /&gt;
&lt;br /&gt;
==License Notes==&lt;br /&gt;
&lt;br /&gt;
&amp;quot;The file system tools (in the src/fstools directory) are released&lt;br /&gt;
under the IBM open source license and Common Public License, both&lt;br /&gt;
are located in the license directory.  The modifications to 'mactime'&lt;br /&gt;
from the original 'mactime' in TCT and 'mac-daddy' are released&lt;br /&gt;
under the Common Public License.  Other tools in the src directory&lt;br /&gt;
are either Common Public License or the GNU Public License.&amp;quot;&lt;br /&gt;
&lt;br /&gt;
= See Also =&lt;br /&gt;
* [[The Sleuth Kit How-To]]&lt;br /&gt;
* [[tsk-cp]]&lt;br /&gt;
* The mmls [[OCFA treegraph API]] example module.&lt;br /&gt;
&lt;br /&gt;
= External Links =&lt;br /&gt;
&lt;br /&gt;
* [http://www.sleuthkit.org/autopsy/desc.php Autopsy website]&lt;br /&gt;
 &lt;br /&gt;
==External Reviews==&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Zero_storage_carving</id>
		<title>Zero storage carving</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Zero_storage_carving"/>
				<updated>2009-08-28T10:17:55Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Zero storage carving is the concept of using techniques that enable [[carving]] of meaningful and processable chunks or files of uncompressed, unencoded data on disks, &lt;br /&gt;
disk-images or container files without the need for additional storage to be allocated for copies of the relevant data chunks or files.&lt;br /&gt;
Zero storage carving is sometimes also referred to as in-line carving.&lt;br /&gt;
&lt;br /&gt;
Tools with support or facilities for zero storage carving include:&lt;br /&gt;
&lt;br /&gt;
* [[tsk-cp]]&lt;br /&gt;
* [[scalpel]]&lt;br /&gt;
* [[PhotoRec]]&lt;br /&gt;
* [[CarvFs]]&lt;br /&gt;
* [[LibCarvPath]]&lt;br /&gt;
* [[OCFA treegraph API]]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Zero_storage_carving</id>
		<title>Zero storage carving</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Zero_storage_carving"/>
				<updated>2009-08-28T10:17:29Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Zero storage carving is the concept of using techniques that enable [[carving]] of meaningful and processable chunks or files of uncompressed, unencoded data on disks, &lt;br /&gt;
disk-images or container files without the need for additional storage to be allocated for copies of the relevant data chunks or files.&lt;br /&gt;
Zero storage carving is sometimes also referred to as in-line carving.&lt;br /&gt;
&lt;br /&gt;
Tools with support or facilities for zero storage carving include:&lt;br /&gt;
&lt;br /&gt;
* [[tsk-cp]]&lt;br /&gt;
* [[scalpel]]&lt;br /&gt;
* [[PhotoRec]]&lt;br /&gt;
* [[CarvFs]]&lt;br /&gt;
* [[LibCarvPath]]&lt;br /&gt;
* [[OCFA Treegraph API]]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Zero_storage_carving</id>
		<title>Zero storage carving</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Zero_storage_carving"/>
				<updated>2009-08-28T10:16:16Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Zero storage carving is the use of techniques that allow [[carving]] of meaningful, processable chunks or files of uncompressed, unencoded data on disks, &lt;br /&gt;
disk images or container files without allocating additional storage for copies of the relevant data chunks or files.&lt;br /&gt;
Zero storage carving is sometimes also referred to as in-line carving.&lt;br /&gt;
&lt;br /&gt;
Tools with support or facilities for zero storage carving include:&lt;br /&gt;
&lt;br /&gt;
* [[tsk-cp]]&lt;br /&gt;
* [[scalpel]]&lt;br /&gt;
* [[PhotoRec]]&lt;br /&gt;
* [[CarvFs]]&lt;br /&gt;
* [[LibCarvPath]]&lt;br /&gt;
* [[Ocfa Treegraph API]]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Zero_storage_carving</id>
		<title>Zero storage carving</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Zero_storage_carving"/>
				<updated>2009-08-28T10:15:42Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Zero storage carving is the use of techniques that allow [[carving]] of meaningful, processable chunks or files of uncompressed, unencoded data on disks, &lt;br /&gt;
disk images or container files without allocating additional storage for copies of the relevant data chunks or files.&lt;br /&gt;
Zero storage carving is sometimes also referred to as in-line carving.&lt;br /&gt;
&lt;br /&gt;
Tools with support or facilities for zero storage carving include:&lt;br /&gt;
&lt;br /&gt;
* [[tsk-cp]]&lt;br /&gt;
* [[scalpel]]&lt;br /&gt;
* [[Photorec]]&lt;br /&gt;
* [[CarvFs]]&lt;br /&gt;
* [[LibCarvPath]]&lt;br /&gt;
* [[Ocfa Treegraph API]]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Zero_storage_carving</id>
		<title>Zero storage carving</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Zero_storage_carving"/>
				<updated>2009-08-28T10:13:02Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: Created page with 'Zero storage carving is the concept of using techniques to enable doing carving of meaningfull and processable chunks or files of uncompressed unencoded data on disks,  disk-…'&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Zero storage carving is the use of techniques that allow [[carving]] of meaningful, processable chunks or files of uncompressed, unencoded data on disks, &lt;br /&gt;
disk images or container files without allocating additional storage for copies of the relevant data chunks or files.&lt;br /&gt;
Zero storage carving is sometimes also referred to as in-line carving.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Open_Computer_Forensics_Architecture</id>
		<title>Open Computer Forensics Architecture</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Open_Computer_Forensics_Architecture"/>
				<updated>2009-08-28T10:03:30Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The '''Open Computer Forensics Architecture''' ('''OCFA''') is a modular [[computer forensics framework]] built by the [[Dutch National Police Agency]]. The main goal is to automate the digital forensic process to speed up the investigation and give tactical [[investigator]]s direct access to the seized data through an easy to use search and browse interface.&lt;br /&gt;
&lt;br /&gt;
The architecture forms an environment where existing forensic [[tools]] and libraries can be easily plugged into the architecture and can thus be made part of the recursive extraction of data and [[metadata]] from digital evidence.&lt;br /&gt;
&lt;br /&gt;
The Open Computer Forensics Architecture aims to be highly modular, robust, fault tolerant, recursive and scalable in order to be usable in large investigations that span numerous terabytes of evidence data and cover hundreds of evidence items.&lt;br /&gt;
&lt;br /&gt;
For reasons of fault tolerance, modules in OCFA are processes. The basic [[OcfaLib API]] makes it possible and relatively easy to build an OCFA module out of any data processing library or tool. OCFA comes with numerous such modules that are mostly wrappers around libraries like [[libmagic]] or tools such as those found in the [[Sleuthkit]].&lt;br /&gt;
&lt;br /&gt;
The 2.2 version of OCFA (released April 2009) makes the previously internal [[OCFA treegraph API]] available for OCFA module development. The OCFA treegraph API allows more advanced dissectors that produce data and meta-data for a treegraph representation of an input file. The OCFA treegraph API also allows dissectors that are programmed to be [[CarvFs]] aware to use [[zero storage carving]]. &lt;br /&gt;
&lt;br /&gt;
Communication between modules within OCFA is governed by a two-layered communication infrastructure provided by OCFA. At the lowest layer is a messaging system with the OCFA Anycast Relay at its center. The Anycast Relay provides module crash resistance, distributed processing, load balancing and flow control.&lt;br /&gt;
At a higher level of communication, the OCFA XML Router provides for the routing of individual pieces of evidence through the most appropriate tool chain for its particular type of content. &lt;br /&gt;
&lt;br /&gt;
Although OCFA contains a rudimentary user interface, most of its power is in the backend architecture.&lt;br /&gt;
The final module in the tool chain of any evidence is the OCFA Data Store Module. This module&lt;br /&gt;
processes the evidence XML (which contains all of the evidence data's meta data) and stores relevant parts in a PostgreSQL database. Extending the Apache-based user interface with interfaces for your own case-bound queries&lt;br /&gt;
is something that should prove very useful in most investigations.&lt;br /&gt;
&lt;br /&gt;
For more information consult [http://sourceforge.net/projects/ocfa/ sourceforge.net/projects/ocfa/].&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/OCFA_treegraph_API</id>
		<title>OCFA treegraph API</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/OCFA_treegraph_API"/>
				<updated>2009-08-28T10:01:54Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The OCFA treegraph API is a more advanced API for the [[Open Computer Forensics Architecture]]. The basic [[OcfaLib API]] allows for the fast and simple creation of simple dissector and extractor modules for OCFA, but has some limitations.&lt;br /&gt;
To overcome these limitations, the 2.2 version of OCFA re-vectored an API that was previously used internally by the OCFA library and promoted it to an API available to module builders.&lt;br /&gt;
The OCFA treegraph API defines an interface that a loadable library must implement in order to be usable as an advanced dissector module by the Open Computer Forensics Architecture.&lt;br /&gt;
It defines an interface, 'TreeGraphNode', from which a treegraph module needs to derive one or more classes. A TreeGraphNode can contain data, meta-data and sub nodes that are also TreeGraphNode implementations.&lt;br /&gt;
The data interface of the TreeGraphNode also allows treegraph modules that are [[CarvFs]] aware to return a carvpath as a so-called 'soft linkable path'. Doing so allows OCFA to use substantially fewer storage resources.&lt;br /&gt;
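&lt;br /&gt;
The node structure described here can be pictured with the following Python sketch. It is purely illustrative: the real interface is part of the OCFA library, and the class layout, attribute names and example carvpath strings used below are hypothetical.&lt;br /&gt;

```python
class TreeGraphNode:
    """Hypothetical stand-in for a treegraph node; not the OCFA interface."""

    def __init__(self, name, metadata=None, soft_link_path=None):
        self.name = name
        # Meta-data is kept as simple key/value pairs in this sketch.
        self.metadata = dict(metadata or {})
        # A CarvFs aware module would return a carvpath here instead of
        # copying the data out (assumed attribute name).
        self.soft_link_path = soft_link_path
        # Sub nodes are themselves TreeGraphNode instances.
        self.children = []

    def add_child(self, node):
        self.children.append(node)
        return node


def walk(node, depth=0):
    """Yield (depth, node) pairs for the node and all of its sub nodes."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


disk = TreeGraphNode("disk image", {"kind": "partitioned image"})
disk.add_child(TreeGraphNode("partition 1", soft_link_path="4096+53248"))
disk.add_child(TreeGraphNode("partition 2", soft_link_path="258048+28672"))

for depth, node in walk(disk):
    print("  " * depth + node.name, node.soft_link_path)
```

Because the partition nodes carry only a carvpath in this sketch, no partition data is copied; that is the storage saving the soft linkable path is meant to provide.&lt;br /&gt;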
&lt;br /&gt;
An example of a treegraph module for OCFA is included in the 2.2 release of OCFA. This example is the OCFA mmls module.&lt;br /&gt;
The OCFA mmls module reproduces the functionality of the [[sleuthkit]] mmls tool. It does this using the OCFA treegraph library,&lt;br /&gt;
the [[LibCarvPath]] library, and the [[sleuthkit]] library.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/OCFA_treegraph_API</id>
		<title>OCFA treegraph API</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/OCFA_treegraph_API"/>
				<updated>2009-08-28T10:00:59Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: Created page with 'The OCFA treegraph API is a more advanced API for the Open Computer Forensics Architecture. The basic OCFA API allows for the fast and simple creation of simple dissector…'&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The OCFA treegraph API is a more advanced API for the [[Open Computer Forensics Architecture]]. The basic [[OCFA API]] allows for the fast and simple creation of simple dissector and extractor modules for OCFA, but has some limitations.&lt;br /&gt;
To overcome these limitations, the 2.2 version of OCFA re-vectored an API that was previously used internally by the OCFA library and promoted it to an API available to module builders.&lt;br /&gt;
The OCFA treegraph API defines an interface that a loadable library must implement in order to be usable as an advanced dissector module by the Open Computer Forensics Architecture.&lt;br /&gt;
It defines an interface, 'TreeGraphNode', from which a treegraph module needs to derive one or more classes. A TreeGraphNode can contain data, meta-data and sub nodes that are also TreeGraphNode implementations.&lt;br /&gt;
The data interface of the TreeGraphNode also allows treegraph modules that are [[CarvFs]] aware to return a carvpath as a so-called 'soft linkable path'. Doing so allows OCFA to use substantially fewer storage resources.&lt;br /&gt;
&lt;br /&gt;
An example of a treegraph module for OCFA is included in the 2.2 release of OCFA. This example is the OCFA mmls module.&lt;br /&gt;
The OCFA mmls module reproduces the functionality of the [[sleuthkit]] mmls tool. It does this using the OCFA treegraph library,&lt;br /&gt;
the [[LibCarvPath]] library, and the [[sleuthkit]] library.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/LibCarvPath</id>
		<title>LibCarvPath</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/LibCarvPath"/>
				<updated>2009-08-28T09:31:30Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;LibCarvPath is a library designed to be used by carving and file system analysis tools.&lt;br /&gt;
LibCarvPath allows fragments represented by offset and size to be combined in a CarvPath&lt;br /&gt;
annotation that takes the form of a file system path. LibCarvPath addresses the limits of&lt;br /&gt;
file system paths by mapping extremely fragmented files to a uniquely identifying key in&lt;br /&gt;
a long-path database.&lt;br /&gt;
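&lt;br /&gt;
As a rough illustration of such an annotation (a sketch only, not the LibCarvPath API), the Python snippet below joins fragments into the 'offset+size' form, separated by underscores, with 'S' plus a size for sparse data. The function name and the tuple layout are assumptions made for this example.&lt;br /&gt;

```python
def build_carvpath(fragments):
    """Join fragments into a CarvPath-style annotation string.

    Each fragment is a hypothetical (offset, size) tuple; an offset of
    None marks sparse data, which is written as 'S' plus the size.
    """
    parts = []
    for offset, size in fragments:
        if offset is None:
            parts.append("S{}".format(size))
        else:
            parts.append("{}+{}".format(offset, size))
    return "_".join(parts)


example = build_carvpath([(4096, 53248), (258048, 28672), (None, 1047552)])
print(example)
```

For a 53248 byte fragment at offset 4096, a 28672 byte fragment at offset 258048 and 1047552 bytes of sparse data, this prints 4096+53248_258048+28672_S1047552.&lt;br /&gt;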
&lt;br /&gt;
The following tools use LibCarvPath and/or CarvPath Annotations:&lt;br /&gt;
&lt;br /&gt;
* [[CarvFs]]&lt;br /&gt;
* [[tsk-cp]]&lt;br /&gt;
* [[scalpelcp]]&lt;br /&gt;
* Modules for the [[Open Computer Forensics Architecture]] that use the [[OCFA Treegraph API]].&lt;br /&gt;
&lt;br /&gt;
In addition, work has started on adding LibCarvPath support to [[Photorec]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://sourceforge.net/projects/ocfa/  http://sourceforge.net/projects/ocfa/]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Open_Computer_Forensics_Architecture</id>
		<title>Open Computer Forensics Architecture</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Open_Computer_Forensics_Architecture"/>
				<updated>2009-08-28T09:28:12Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The '''Open Computer Forensics Architecture''' ('''OCFA''') is a modular [[computer forensics framework]] built by the [[Dutch National Police Agency]]. The main goal is to automate the digital forensic process to speed up the investigation and give tactical [[investigator]]s direct access to the seized data through an easy to use search and browse interface.&lt;br /&gt;
&lt;br /&gt;
The architecture forms an environment where existing forensic [[tools]] and libraries can be easily plugged into the architecture and can thus be made part of the recursive extraction of data and [[metadata]] from digital evidence.&lt;br /&gt;
&lt;br /&gt;
The Open Computer Forensics Architecture aims to be highly modular, robust, fault tolerant, recursive and scalable in order to be usable in large investigations that span numerous terabytes of evidence data and cover hundreds of evidence items.&lt;br /&gt;
&lt;br /&gt;
For reasons of fault tolerance, modules in OCFA are processes. The basic [[OcfaLib API]] makes it possible and relatively easy to build an OCFA module out of any data processing library or tool. OCFA comes with numerous such modules that are mostly wrappers around libraries like [[libmagic]] or tools such as those found in the [[Sleuthkit]].&lt;br /&gt;
&lt;br /&gt;
The 2.2 version of OCFA (released April 2009) makes the previously internal [[OCFA treegraph API]] available for OCFA module development. The OCFA treegraph API allows more advanced dissectors that produce data and meta-data for a treegraph representation of an input file. The OCFA treegraph API also allows dissectors that are programmed to be [[CarvFs]] aware to use [[Zero Storage Carving]]. &lt;br /&gt;
&lt;br /&gt;
Communication between modules within OCFA is governed by a two-layered communication infrastructure provided by OCFA. At the lowest layer is a messaging system with the OCFA Anycast Relay at its center. The Anycast Relay provides module crash resistance, distributed processing, load balancing and flow control.&lt;br /&gt;
At a higher level of communication, the OCFA XML Router provides for the routing of individual pieces of evidence through the most appropriate tool chain for its particular type of content. &lt;br /&gt;
&lt;br /&gt;
Although OCFA contains a rudimentary user interface, most of its power is in the backend architecture.&lt;br /&gt;
The final module in the tool chain of any evidence is the OCFA Data Store Module. This module&lt;br /&gt;
processes the evidence XML (which contains all of the evidence data's meta data) and stores relevant parts in a PostgreSQL database. Extending the Apache-based user interface with interfaces for your own case-bound queries&lt;br /&gt;
is something that should prove very useful in most investigations.&lt;br /&gt;
&lt;br /&gt;
For more information consult [http://sourceforge.net/projects/ocfa/ sourceforge.net/projects/ocfa/].&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Computer_forensics_framework</id>
		<title>Computer forensics framework</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Computer_forensics_framework"/>
				<updated>2009-01-15T06:17:41Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The following more or less generic computer forensics frameworks exist: &lt;br /&gt;
&lt;br /&gt;
* [[Open Computer Forensics Architecture]]&lt;br /&gt;
* [[Pyflag]]&lt;br /&gt;
* [[DELV]]&lt;br /&gt;
* [[XIRAF]]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/The_Sleuth_Kit</id>
		<title>The Sleuth Kit</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/The_Sleuth_Kit"/>
				<updated>2008-12-13T10:47:06Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* See Also */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox_Software |&lt;br /&gt;
  name = The Sleuth Kit |&lt;br /&gt;
  maintainer = [[Brian Carrier]] |&lt;br /&gt;
  os = {{Linux}}, {{FreeBSD}}, {{OpenBSD}}, {{Mac OS X}}, {{SunOS}} |&lt;br /&gt;
  genre = {{Analysis}} |&lt;br /&gt;
  license = {{IBM Open Source License}}, {{Common Public License}}, {{GPL}} |&lt;br /&gt;
  website = [http://www.sleuthkit.org/ sleuthkit.org] |&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
'''The Sleuth Kit''' ('''TSK''') is a collection of [[UNIX]]-based command line tools that allow you to investigate a computer. The current focus of the tools is the file and volume systems and TSK supports [[FAT]] (12/16/32), [[Ext2]]/[[Ext3|3]], [[NTFS]], [[Ufs|UFS]] (1 &amp;amp; 2), and ISO 9660 [[file system]]s.&lt;br /&gt;
&lt;br /&gt;
[[Autopsy]] is a frontend for TSK which allows browser-based access to the TSK tools.&lt;br /&gt;
 &lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
The Sleuth Kit is arranged in layers. There is a ''data layer'' which is concerned with how information is stored on a disk and a ''metadata layer'' which is concerned with information such as [[inode]]s and [[directory|directories]]. The commands that deal with the data layer are prefixed with the letter ''d'', while the commands that deal with the metadata layer are prefixed with the letter ''i''.&lt;br /&gt;
&lt;br /&gt;
Some of the commands in Sleuth Kit are:&lt;br /&gt;
&lt;br /&gt;
; dcat&lt;br /&gt;
: Views the contents of a [[block]].&lt;br /&gt;
&lt;br /&gt;
; dls&lt;br /&gt;
: Lists [[unallocated block]]s, which makes keyword searches more efficient.&lt;br /&gt;
&lt;br /&gt;
; dcalc&lt;br /&gt;
: Tells you where the unallocated blocks are.&lt;br /&gt;
&lt;br /&gt;
; dstat&lt;br /&gt;
: Details about a given block.&lt;br /&gt;
&lt;br /&gt;
; icat&lt;br /&gt;
: Views the contents of a file given its inode value or [[cluster number]]. It does not list directories; it outputs the contents.&lt;br /&gt;
&lt;br /&gt;
; ils&lt;br /&gt;
: Lists the file extents on a disk.&lt;br /&gt;
&lt;br /&gt;
; istat&lt;br /&gt;
: Information about an inode number.&lt;br /&gt;
&lt;br /&gt;
==File Systems Understood==&lt;br /&gt;
&lt;br /&gt;
* [[NTFS]]&lt;br /&gt;
* [[FAT]]&lt;br /&gt;
* [[Ext2]], [[Ext3]]&lt;br /&gt;
* [[Ufs|UFS]] (1 &amp;amp; 2)&lt;br /&gt;
* ISO 9660&lt;br /&gt;
 &lt;br /&gt;
==File Search Facilities==&lt;br /&gt;
&lt;br /&gt;
* Lists allocated and unallocated files.&lt;br /&gt;
* Lists and sorts by file type.&lt;br /&gt;
* Shows a time of creation and change.&lt;br /&gt;
 &lt;br /&gt;
==Historical Reconstruction==&lt;br /&gt;
 &lt;br /&gt;
==Searching Abilities==&lt;br /&gt;
 &lt;br /&gt;
* Searches for keywords.&lt;br /&gt;
* Builds an index.&lt;br /&gt;
&lt;br /&gt;
==Hash Databases==&lt;br /&gt;
&lt;br /&gt;
* Uses [[MD5]] or [[SHA-1]].&lt;br /&gt;
* Interfaces with NIST [[NSRL]], [[Hashkeeper]] and custom databases.&lt;br /&gt;
 &lt;br /&gt;
==Evidence Collection Features==&lt;br /&gt;
 &lt;br /&gt;
* Tracks forensic activity.&lt;br /&gt;
&lt;br /&gt;
=History=&lt;br /&gt;
&lt;br /&gt;
==License Notes==&lt;br /&gt;
&lt;br /&gt;
&amp;quot;The file system tools (in the src/fstools directory) are released&lt;br /&gt;
under the IBM open source license and Common Public License, both&lt;br /&gt;
are located in the license directory.  The modifications to 'mactime'&lt;br /&gt;
from the original 'mactime' in TCT and 'mac-daddy' are released&lt;br /&gt;
under the Common Public License.  Other tools in the src directory&lt;br /&gt;
are either Common Public License or the GNU Public License.&amp;quot;&lt;br /&gt;
&lt;br /&gt;
= See Also =&lt;br /&gt;
* [[The Sleuth Kit How-To]]&lt;br /&gt;
* [[tsk-cp]]&lt;br /&gt;
&lt;br /&gt;
= External Links =&lt;br /&gt;
&lt;br /&gt;
* [http://www.sleuthkit.org/autopsy/desc.php Autopsy website]&lt;br /&gt;
 &lt;br /&gt;
==External Reviews==&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Tsk-cp</id>
		<title>Tsk-cp</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Tsk-cp"/>
				<updated>2008-12-12T19:01:21Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Tsk-cp is a set of [[LibCarvPath]] aware versions of [[Sleuthkit]] tools, that are for use together with the&lt;br /&gt;
normal versions of the other sleuthkit tools in the process of doing [[zero storage carving]].&lt;br /&gt;
&lt;br /&gt;
The tools are:&lt;br /&gt;
&lt;br /&gt;
* mmls-cp : A CarvPath based version of mmls for listing a partitioned carvpath disk image as a list of partition carvpaths.&lt;br /&gt;
* dls-cp : A CarvPath based version of dls for listing all contiguous unallocated fragments of a carvpath partition holding a filesystem as a list of unallocated block carvpaths.&lt;br /&gt;
* icat-cp : A CarvPath based version of icat that, instead of copying out the data of an inode within a carvpath partition holding a filesystem, returns the carvpath of the file and the carvpath of the [[file slack]].&lt;br /&gt;
&lt;br /&gt;
The carvpaths output by dls-cp can be used as the input of a CarvPath aware carving tool.&lt;br /&gt;
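&lt;br /&gt;
A CarvPath aware tool has to turn such an annotation back into fragments before it can read the data. The Python sketch below shows one way this might look, assuming the annotation is a plain string of 'offset+size' fragments separated by underscores, with 'S' plus a size for sparse data. The function names are hypothetical, and the actual dls-cp output may differ in detail.&lt;br /&gt;

```python
def parse_carvpath(annotation):
    """Split an annotation string into (offset, size) tuples.

    Sparse entries are returned with an offset of None (an assumed
    convention for this sketch).
    """
    fragments = []
    for token in annotation.split("_"):
        if token.startswith("S"):
            fragments.append((None, int(token[1:])))
        else:
            offset, size = token.split("+")
            fragments.append((int(offset), int(size)))
    return fragments


def total_size(fragments):
    """Return the number of bytes described by all fragments together."""
    return sum(size for _offset, size in fragments)


parsed = parse_carvpath("4096+53248_258048+28672_S1047552")
print(parsed)
print(total_size(parsed))
```

With the fragment list in hand, a carving tool can seek to each offset in the underlying image and read the data in place rather than from a copy.&lt;br /&gt;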
&lt;br /&gt;
[http://sourceforge.net/projects/ocfa/  http://sourceforge.net/projects/ocfa/]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/LibCarvPath</id>
		<title>LibCarvPath</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/LibCarvPath"/>
				<updated>2008-12-12T19:00:49Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;LibCarvPath is a library designed to be used by carving and file system analysis tools.&lt;br /&gt;
LibCarvPath allows fragments represented by offset and size to be combined in a CarvPath&lt;br /&gt;
annotation that takes the form of a file system path. LibCarvPath addresses the limits of&lt;br /&gt;
file system paths by mapping extremely fragmented files to a uniquely identifying key in&lt;br /&gt;
a long-path database.&lt;br /&gt;
&lt;br /&gt;
The following tools use LibCarvPath and/or CarvPath Annotations:&lt;br /&gt;
&lt;br /&gt;
* [[CarvFs]]&lt;br /&gt;
* [[tsk-cp]]&lt;br /&gt;
* [[scalpelcp]]&lt;br /&gt;
&lt;br /&gt;
In addition, work has started on adding LibCarvPath support to [[Photorec]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[http://sourceforge.net/projects/ocfa/  http://sourceforge.net/projects/ocfa/]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/CarvFs</id>
		<title>CarvFs</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/CarvFs"/>
				<updated>2008-12-12T18:59:01Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;CarvFs is a modular [[Fuse]]-based user space file system on top of [[LibCarvPath]]. &lt;br /&gt;
CarvFs makes CarvPath style annotations as used by LibCarvPath available as files.&lt;br /&gt;
Using CarvFs makes it possible to process carved entities as files without the need for copy-out.&lt;br /&gt;
&lt;br /&gt;
CarvFs is modular with respect to access to image files.&lt;br /&gt;
The CarvFs distribution comes with a default module for access to (split) raw files.&lt;br /&gt;
&lt;br /&gt;
A separate [[LibEwf]] module is available for access to ewf images.  &lt;br /&gt;
&lt;br /&gt;
[http://sourceforge.net/projects/ocfa/  http://sourceforge.net/projects/ocfa/]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Scalpelcp</id>
		<title>Scalpelcp</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Scalpelcp"/>
				<updated>2008-12-12T18:52:29Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: New page: ScalpelCp is a simple perl script that processes the logfile of scalpel run in preview mode on a CarvPath file and populates the output directory with symbolic links to CarvFs Carv...&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;ScalpelCp is a simple Perl script that processes the logfile of [[scalpel]] run in preview mode on a CarvPath file and populates&lt;br /&gt;
the output directory with symbolic links to [[CarvFs]] CarvPaths.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Tsk-cp</id>
		<title>Tsk-cp</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Tsk-cp"/>
				<updated>2008-12-12T18:49:23Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: New page: Tsk-cp is a set of LibCarvPath aware versions of Sleuthkit tools, that are for use together with the normal versions of the other sleuthkit tools in the process of doing [[zero sto...&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Tsk-cp is a set of [[LibCarvPath]] aware versions of [[Sleuthkit]] tools, that are for use together with the&lt;br /&gt;
normal versions of the other sleuthkit tools in the process of doing [[zero storage carving]].&lt;br /&gt;
&lt;br /&gt;
The tools are:&lt;br /&gt;
&lt;br /&gt;
* mmls-cp : A CarvPath based version of mmls for listing a partitioned carvpath disk image as a list of partition carvpaths.&lt;br /&gt;
* dls-cp : A CarvPath based version of dls for listing all contiguous unallocated fragments of a carvpath partition holding a filesystem as a list of unallocated block carvpaths.&lt;br /&gt;
* icat-cp : A CarvPath based version of icat that, instead of copying out the data of an inode within a carvpath partition holding a filesystem, returns the carvpath of the file and the carvpath of the [[file slack]].&lt;br /&gt;
&lt;br /&gt;
The carvpaths output by dls-cp can be used as the input of a CarvPath aware carving tool.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/LibCarvPath</id>
		<title>LibCarvPath</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/LibCarvPath"/>
				<updated>2008-12-12T18:37:01Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: New page: LibCarvPath is a library designed to be used by carving and file system analysis tools. LibCarvPath allows fragments represented by offset and size to be combined in a CarvPath annotation ...&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;LibCarvPath is a library designed to be used by carving and file system analysis tools.&lt;br /&gt;
LibCarvPath allows fragments represented by offset and size to be combined in a CarvPath&lt;br /&gt;
annotation that takes the form of a file system path. LibCarvPath addresses the limits of&lt;br /&gt;
file system paths by mapping extremely fragmented files to a uniquely identifying key in&lt;br /&gt;
a long-path database.&lt;br /&gt;
&lt;br /&gt;
The following tools use LibCarvPath and/or CarvPath Annotations:&lt;br /&gt;
&lt;br /&gt;
* [[CarvFs]]&lt;br /&gt;
* [[tsk-cp]]&lt;br /&gt;
* [[scalpelcp]]&lt;br /&gt;
&lt;br /&gt;
In addition, work has started on adding LibCarvPath support to [[Photorec]].&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/CarvFs</id>
		<title>CarvFs</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/CarvFs"/>
				<updated>2008-12-12T18:25:02Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: New page: CarvFs is a modular Fuse based user space file system on top op LibCarvPath.  CarvFS makes CarvPath style annotations as used by LibCarvPath available as files. Using CarvFs makes ...&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;CarvFs is a modular [[Fuse]]-based user space file system on top of [[LibCarvPath]]. &lt;br /&gt;
CarvFs makes CarvPath style annotations as used by LibCarvPath available as files.&lt;br /&gt;
Using CarvFs makes it possible to process carved entities as files without the need for copy-out.&lt;br /&gt;
&lt;br /&gt;
[http://sourceforge.net/projects/ocfa/  http://sourceforge.net/projects/ocfa/]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Tools:Data_Recovery</id>
		<title>Tools:Data Recovery</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Tools:Data_Recovery"/>
				<updated>2008-12-12T18:14:32Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* Carving */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Partition Recovery =&lt;br /&gt;
&lt;br /&gt;
*[http://www.ptdd.com/index.htm Partition Table Doctor]&lt;br /&gt;
: Recover deleted or lost partitions (FAT16/FAT32/NTFS/NTFS5/EXT2/EXT3/SWAP).&lt;br /&gt;
&lt;br /&gt;
*[http://www.diskinternals.com/ntfs-recovery/ NTFS Recovery]&lt;br /&gt;
: DiskInternals NTFS Recovery is a fully automatic utility that recovers data from damaged or formatted disks.&lt;br /&gt;
&lt;br /&gt;
*[http://www.stud.uni-hannover.de/user/76201/gpart/ gpart]&lt;br /&gt;
: Gpart is a tool which tries to guess the primary partition table of a PC-type hard disk in case the primary partition table in sector 0 is damaged, incorrect or deleted.&lt;br /&gt;
&lt;br /&gt;
*[http://www.cgsecurity.org/wiki/TestDisk Testdisk]&lt;br /&gt;
: TestDisk is open source software licensed under the GNU General Public License (GPL). &lt;br /&gt;
&lt;br /&gt;
== See Also ==&lt;br /&gt;
&lt;br /&gt;
* [http://support.microsoft.com/?kbid=166997 Using Norton Disk Edit to Backup Your Master Boot Record]&lt;br /&gt;
&lt;br /&gt;
== Notes ==&lt;br /&gt;
&lt;br /&gt;
* &amp;quot;fdisk /mbr&amp;quot; restores the boot code in the [[Master Boot Record]], but not the partition itself. On newer versions of Windows you should use fixmbr, bootrec, mbrfix, or mbrwizard. You can also extract a copy of the specific standard MBR code from tools like bootrec.exe and diskpart.exe in Windows (from various offsets) and copy it to disk with dd (Use bs=446 count=1). For Windows XP SP2 c:\%WINDIR%\System32\diskpart.exe the MBR code is found between offset 1b818h and 1ba17h.&lt;br /&gt;
&lt;br /&gt;
= Data Recovery =&lt;br /&gt;
The term &amp;quot;Data Recovery&amp;quot; is frequently used to mean forensic recovery, but the term really should be used for recovering data from damaged media. &lt;br /&gt;
&lt;br /&gt;
*[http://www.toolsthatwork.com/bringback.htm BringBack] &lt;br /&gt;
: BringBack offers easy to use, inexpensive, and highly successful data recovery for Windows and Linux (ext2) operating systems and digital images stored on memory cards, etc.&lt;br /&gt;
&lt;br /&gt;
*[http://www.runtime.org/raid.htm RAID Reconstructor]&lt;br /&gt;
: Runtime Software's RAID Reconstructor will reconstruct RAID Level 0 (Striping) and RAID Level 5 drives.&lt;br /&gt;
&lt;br /&gt;
*[http://www.salvationdata.com Salvation Data]&lt;br /&gt;
: Claims to have a program that can read the &amp;quot;bad blocks&amp;quot; of Maxtor drives with proprietary commands.&lt;br /&gt;
&lt;br /&gt;
* [http://www.e-rol.com/en/ e-ROL]&lt;br /&gt;
: e-ROL is a free online service for recovering files that were erased by mistake.&lt;br /&gt;
&lt;br /&gt;
* [http://www.recuva.com/ Recuva]&lt;br /&gt;
: Recuva is a freeware Windows tool that will recover accidentally deleted files.&lt;br /&gt;
&lt;br /&gt;
* [http://www.snapfiles.com/get/restoration.html Restoration]&lt;br /&gt;
: Restoration is a freeware Windows tool that lets you recover deleted files.&lt;br /&gt;
&lt;br /&gt;
* [http://www.undelete-plus.com/ Undelete Plus]&lt;br /&gt;
: Undelete Plus is a free deleted file recovery tool that works for all versions of Windows (95-Vista), FAT12/16/32, NTFS and NTFS5 filesystems and can perform recovery on various solid state devices.&lt;br /&gt;
&lt;br /&gt;
* [http://www.data-recovery-software.net/ R-Studio]&lt;br /&gt;
: R-Studio is a data recovery software suite that can recover files from FAT(12-32), NTFS, NTFS 5, HFS/HFS+, FFS, UFS/UFS2 (*BSD, Solaris), Ext2/Ext3 (Linux) and so on.&lt;br /&gt;
&lt;br /&gt;
See also [[Data Recovery Stories]]&lt;br /&gt;
&lt;br /&gt;
=Carving=&lt;br /&gt;
*[http://www.datalifter.com/products.htm DataLifter® - File Extractor Pro]&lt;br /&gt;
: Data carving runs on multiple threads to make use of modern processors &lt;br /&gt;
&lt;br /&gt;
*[http://foremost.sourceforge.net/ Foremost]&lt;br /&gt;
: Foremost is a console program to recover files based on their headers, footers, and internal data structures. &lt;br /&gt;
&lt;br /&gt;
*[http://www.digitalforensicssolutions.com/Scalpel/ Scalpel]&lt;br /&gt;
: Scalpel is a fast file carver that reads a database of header and footer definitions and extracts matching files from a set of image files or raw device files. Scalpel is filesystem-independent and will carve files from FATx, NTFS, ext2/3, or raw partitions.&lt;br /&gt;
&lt;br /&gt;
*[[EnCase]]&lt;br /&gt;
: EnCase comes with some eScripts that will do carving.&lt;br /&gt;
&lt;br /&gt;
*[[CarvFs]] &lt;br /&gt;
: A virtual file system (FUSE) implementation that lets carving tools do recursive, multi-tool, zero-storage carving (also called in-place carving). Patches and scripts for Scalpel and Foremost are provided. Works on raw and EnCase images. &lt;br /&gt;
&lt;br /&gt;
*[[LibCarvPath]]&lt;br /&gt;
: A shared library that allows carving tools to use zero-storage carving on carvfs virtual files.&lt;br /&gt;
&lt;br /&gt;
*[http://www.cgsecurity.org/wiki/PhotoRec PhotoRec]&lt;br /&gt;
: PhotoRec is file data recovery software designed to recover lost files, including video, documents and archives, from hard disks and CD-ROMs, and lost pictures (thus its 'Photo Recovery' name) from digital camera memory.&lt;br /&gt;
&lt;br /&gt;
*[http://www.datarescue.com/photorescue/ PhotoRescue]&lt;br /&gt;
: Datarescue PhotoRescue Advanced is a picture and photo data recovery solution made by the creators of IDA Pro. PhotoRescue will undelete, unerase and recover pictures and files lost on corrupted, erased or damaged compact flash (CF) cards, SD Cards, Memory Sticks, SmartMedia and XD cards.&lt;br /&gt;
&lt;br /&gt;
* [https://www.uitwisselplatform.nl/projects/revit RevIt]&lt;br /&gt;
: RevIt (Revive It) is an experimental carving tool, initially developed for the DFRWS 2006 carving challenge. It uses 'file structure based carving'. Note that RevIt currently is a work in progress.&lt;br /&gt;
&lt;br /&gt;
* [http://jbj.rapanden.dk/magicrescue/ Magic Rescue]&lt;br /&gt;
: Magic Rescue is a file carving tool that uses &amp;quot;magic bytes&amp;quot; in a file's contents to recover data.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page</id>
		<title>Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page"/>
				<updated>2008-12-03T22:01:29Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* Filesystem Detection */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is for planning Carver 2.0.&lt;br /&gt;
&lt;br /&gt;
Please, do not delete text (ideas) here. Use something like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&lt;br /&gt;
= License =&lt;br /&gt;
&lt;br /&gt;
BSD-3.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] library based validators could require other licenses&lt;br /&gt;
::: Make the other libraries plug-able. If you them, you use them. [[User:Simsong|Simsong]] 06:34, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= OS =&lt;br /&gt;
&lt;br /&gt;
Linux/FreeBSD/MacOS&lt;br /&gt;
: Shouldn't this just match what the underlying afflib &amp;amp; sleuthkit cover? [[User:RB|RB]]&lt;br /&gt;
:: Yes, but you need to test and validate on each. Question: Do we want to support windows? [[User:Simsong|Simsong]] 21:09, 30 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I think we would be wise to design with Windows support from the start; this will improve platform independence from the outset.&lt;br /&gt;
:::: Agreed; I would even settle at first for being able to run against Cygwin.  Note that I don't even own or use a copy of Windows, but the vast majority of forensic investigators do. [[User:RB|RB]] 14:01, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Capibara|Rob J Meijer]] Leaning heavily on the autotools might be the way to go. I do however feel that support requirements for windows would not be essential. Being able to run from a virtual machine with the main storage mounted over cifs should however be tested and if possible tuned extensively.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] You'll need more than autotools to do native Windows support i.e. file access, UTF-16 support, wrap some basic system functions or have them available otherwise&lt;br /&gt;
::::::[[User:Capibara|Rob J Meijer]] That's exactly my point: Windows support, as in being able to build and run on Windows natively, is much more trouble than it's worth. Better to aim for a lean and mean autotools-based build with few dependencies and little or no recursion, and better to spend effort on a lean POLA design on POSIX-based systems than on supporting building and running on non-POSIX systems.&lt;br /&gt;
&lt;br /&gt;
= Name tooling =&lt;br /&gt;
&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] A name for the tooling I propose coldcut&lt;br /&gt;
:: How about 'butcher'?  ;)  [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] cleaver ( scalpel on steroids ;-) )&lt;br /&gt;
* I would like to propose Gouge or Chisel :-) [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
&lt;br /&gt;
= Requirements =&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Could we do a MoSCoW evaluation of these.&lt;br /&gt;
* AFF and EWF file images supported from scratch. ([[User:Joachim Metz|Joachim]] I would like to have raw/split raw and device access as well)&lt;br /&gt;
:: If we base our image i/o on afflib, we get all three with one interface. [[User:RB|RB]] Instead of letting the tools use afflib, better to write an afflib module for carvfs, and update the libewf module. The tool could then be oblivious to the file format. [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
:::: [[User:Simsong|Simsong]] 06:29, 3 November 2008 (UTC) The problem with using carvfs is that this adds another dependency. Do you really want to require that people install carvfs in order to run the carver? What about having the thing ported to Windows?&lt;br /&gt;
:::::: [[User:Capibara|Rob J Meijer]] I would support adding one build dependency (libcarvpath) and removing two (libewf/libaff) by moving them to a layer more suited for them (carvfs) that would possibly allow some form of file handle (as cap) based POLA design. I am a proponent of making small things that do one thing and do that one thing right, and of stacking those to do what you need. In my view that would lead ideally to the following (simplified) chain:&lt;br /&gt;
::::::* recursive [[computer forensics framework]] (ocfa/pyflag)&lt;br /&gt;
::::::** &amp;lt;b&amp;gt;The-(pola-based)-carving-tool&amp;lt;/b&amp;gt; &lt;br /&gt;
::::::*** &amp;lt;b&amp;gt;The-carving-lib&amp;lt;/b&amp;gt; working on open fd's.  &lt;br /&gt;
::::::**** libcarvpath&lt;br /&gt;
::::::***** carvfs (Over cifs/nfs-v4 on platforms that don't support Fuse).&lt;br /&gt;
::::::****** libewf&lt;br /&gt;
::::::****** libaff&lt;br /&gt;
::::::*** AppArmor (on supporting platforms)&lt;br /&gt;
::::::*** suid (on supporting platforms)&lt;br /&gt;
::::::*** iptables/ipfw (on supporting platforms)&lt;br /&gt;
:::::: As for Windows support, I would imagine making carvfs run over SMB would go a long way, insofar as Windows support is all that relevant. &lt;br /&gt;
:::::: There are two advantages to using libcarvpath and carvfs instead of libaff/libewf at this layer:&lt;br /&gt;
::::::* storage requirements for doing carving. Beyond what sleuthkit or alternatives provide I have seen many situations where carving was not done due to storage limitations.&lt;br /&gt;
::::::* File handles are like object capabilities. You can often do pretty simple POLA based implementations using file handles and something like AppArmor. POLA could IMHO be a strong weapon against the more nasty forms of anti forensics.&lt;br /&gt;
::::::Next to this, I would consider making different tools for different stages instead of one semi recursive one, and looking at how to integrate these tools into existing frameworks (ocfa/pyflag). &lt;br /&gt;
::::::Keep things simple but rigid and try to easily integrate things into existing frameworks as effectively as possible I would suggest.&lt;br /&gt;
::::::Please note, I am not proposing the lib/tool should be useless without libcarvpath, only that usage without carvfs should limit the&lt;br /&gt;
::::::supported image formats to raw images, and that libewf/libaff should be abstracted at the Fuse level or below and not at the tool level.  &lt;br /&gt;
:::::::[[User:Joachim Metz|Joachim]] do you have an idea what the performance impact of this approach would be? It might be wise to do a proof of concept for this approach first.&lt;br /&gt;
::::::::[[User:Capibara|Rob J Meijer]] It would, I think, depend greatly on the behavior of the carving lib/tool. Small 512-byte reads are relatively very expensive, while 128 KB reads have negligible impact. Here are some numbers from ntfs-3g:  http://article.gmane.org/gmane.comp.file-systems.fuse.devel/6397/match=ntfs+3g+performance+ext3 that might be relevant. More relevant than performance might be library footprint. For example, using OCFA, we often would want to keep a few hundred images that total tens of TB of projected storage size FUSE-mounted. If libewf/libaff have a big combined memory footprint in such cases, this can be a major issue for this approach.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] this layer should support multi threaded decompression of compressed image types, this speeds up IO&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] volume/partition aware layer (what about carving unpartitioned space?)&lt;br /&gt;
* File system aware layer. This could be or make use of tsk-cp.&lt;br /&gt;
** By default, files are not carved. (clarify: only identified? [[User:RB|RB]]; I guess that it operates like [[Selective file dumper]] [[User:.FUF|.FUF]] 07:00, 29 October 2008 (UTC)). Alternatively, the tool could use libcarvpath and output carvpaths or create a directory with symlinks to carvpaths that point into a carvfs mountpoint [[User:Capibara|Rob J Meijer]].&lt;br /&gt;
* Plug-in architecture for identification/validation.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] support for multiple types of validators&lt;br /&gt;
*** dedicated validator&lt;br /&gt;
*** validator based on file library (i.e. we could specify/implement a file structure API for these)&lt;br /&gt;
*** configuration based validator (Can handle config files,like Revit07, to enter different file formats used by the carver.)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Moderator: Could we limit the requirements for prototype version 1 of the tool to get a working version up and running ASAP?&lt;br /&gt;
And keep discussing future options?&lt;br /&gt;
&lt;br /&gt;
I think the following set will be large enough to handle:&lt;br /&gt;
Input facilities&lt;br /&gt;
* IO support (AFF, device, EWF, RAW and split RAW)&lt;br /&gt;
:: Abstraction of input format and multi threaded decompression (spin-off code out of afflib?)&lt;br /&gt;
* Volume/Partitions support&lt;br /&gt;
:: at least for DOS based layout and GPT (spin-off code out of TSK/Photorec?)&lt;br /&gt;
* File system support&lt;br /&gt;
:: VFAT/NTFS (spin-off code out of TSK/Photorec?)&lt;br /&gt;
&lt;br /&gt;
Carving facilities&lt;br /&gt;
* File format support using plug-able validator model (use dedicated validators Photorec/Scarve and/or wrap revit07 file format as validator?)&lt;br /&gt;
* Content support using plug-able validator model (to handle text/mbox base64)&lt;br /&gt;
* File system carving support (to handle file system fragments, could be linked to file system support layer?)&lt;br /&gt;
* Basic fragment handling&lt;br /&gt;
&lt;br /&gt;
Output facilities&lt;br /&gt;
* audit/analysis/debug log&lt;br /&gt;
* extraction of result files&lt;br /&gt;
&lt;br /&gt;
==Supported File Formats==&lt;br /&gt;
* Ship with validators for:&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I think we should distinguish between file format validators and content validators&lt;br /&gt;
** JPEG&lt;br /&gt;
** PNG&lt;br /&gt;
** GIF&lt;br /&gt;
** MSOLE&lt;br /&gt;
** ZIP&lt;br /&gt;
** TAR (gz/bz2)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] For a production carver we need at least the following formats&lt;br /&gt;
** Graphical Images&lt;br /&gt;
*** JPEG (the 3 different types with JFIF/EXIF support)&lt;br /&gt;
*** PNG&lt;br /&gt;
*** GIF&lt;br /&gt;
*** BMP&lt;br /&gt;
*** TIFF&lt;br /&gt;
** Office documents&lt;br /&gt;
*** OLE2 (Word/Excel content support)&lt;br /&gt;
*** PDF&lt;br /&gt;
*** Open Office/Office 2007 (ZIP+XML)&lt;br /&gt;
:: Extension validation? AFAIK, MS Office 2007 [[DOCX]] format uses plain ZIP (or not?), and carved files will (or not?) have .zip extension instead of DOCX. Is there any way to fix this (may be using the file list in zip)? [[User:.FUF|.FUF]] 20:25, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Addition: Office 2007 also has a binary file format which is also ZIP-ed data&lt;br /&gt;
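&lt;br /&gt;
:: A minimal Python sketch of the 'use the file list in zip' idea raised above, not a finished validator. It assumes the carved blob is a complete, readable ZIP and relies on the member names that Office 2007 packages conventionally contain; the MARKERS table and the function names are illustrative only.&lt;br /&gt;

```python
import io
import zipfile

# Conventional main-part names of Office 2007 packages (illustrative mapping).
MARKERS = {
    'word/document.xml': '.docx',
    'xl/workbook.xml': '.xlsx',
    'ppt/presentation.xml': '.pptx',
}


def guess_extension(data):
    """Return a likely extension for a carved ZIP blob, based on its member names."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = set(archive.namelist())
    for member, extension in MARKERS.items():
        if member in names:
            return extension
    return '.zip'   # nothing recognised: keep the generic extension


def make_zip(names):
    """Build a small in-memory ZIP with empty members, for the demo only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as archive:
        for name in names:
            archive.writestr(name, '')
    return buffer.getvalue()


docx_like = make_zip(['[Content_Types].xml', 'word/document.xml'])
plain = make_zip(['notes.txt'])
```

:: Here guess_extension(docx_like) yields '.docx' and guess_extension(plain) falls back to '.zip'. A truncated or fragmented ZIP would raise zipfile.BadZipFile and would need separate handling.&lt;br /&gt;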
&lt;br /&gt;
&lt;br /&gt;
==Archive Files==&lt;br /&gt;
** Archive files&lt;br /&gt;
*** ZIP&lt;br /&gt;
*** 7z&lt;br /&gt;
*** gzip&lt;br /&gt;
*** bzip2&lt;br /&gt;
*** tar&lt;br /&gt;
*** RAR&lt;br /&gt;
** E-mail files&lt;br /&gt;
*** PFF (PST/OST)&lt;br /&gt;
*** MBOX (text based format, base64 content support)&lt;br /&gt;
** Audio/Video files&lt;br /&gt;
*** MPEG&lt;br /&gt;
*** MP2/MP3&lt;br /&gt;
*** AVI&lt;br /&gt;
*** ASF/WMV&lt;br /&gt;
*** QuickTime&lt;br /&gt;
*** MKV&lt;br /&gt;
** Printer spool files&lt;br /&gt;
*** EMF (if I remember correctly)&lt;br /&gt;
** Internet history files&lt;br /&gt;
*** index.dat&lt;br /&gt;
*** Firefox (SQLite 3)&lt;br /&gt;
** Other files&lt;br /&gt;
*** thumbs.db&lt;br /&gt;
*** pagefile?&lt;br /&gt;
&lt;br /&gt;
==Carving Strategies==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Note to moderator could this section be merged with the carving algorithm section?&lt;br /&gt;
&lt;br /&gt;
* Simple fragment recovery carving using gap carving.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have hook in for more advanced fragment recovery?&lt;br /&gt;
* Recovering of individual ZIP sections and JPEG icons that are not sector aligned.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] I would propose a generic fragment detection and recovery&lt;br /&gt;
* Autonomous operation (some mode of operation should be completely non-interactive, requiring no human intervention to complete [[User:RB|RB]])&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] as much as possible, but allow it to be overridden by the user&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] When the tool outputs files, the filenames should contain the offset in the input data (in hexadecimal?)&lt;br /&gt;
:: [[User:Mark Stam|Mark]] I really like the fact carved files are named after the physical or logical sector in which the file is found (photorec)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This naming schema might cause duplicate name problem for extracting embedded files and extracting files from non sector aligned file systems.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export embedded files?&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export fragments separately?&lt;br /&gt;
* [[User:Mark Stam|Mark]] I personally use photorec often for carving files in the whole volume (not only unallocated clusters), so I can store information about all potential interesting files in MySQL&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Interesting. Bas Kloet and I have been discussing using information about allocated files in the recovery process, i.e. recovered fragments could be part of allocated files. Do we want to be able to extract them? Or could we rebuild the file from the fragments and the allocated files?&lt;br /&gt;
* [[User:Mark Stam|Mark]] It would also be nice if the files can be hashed immediately (MD5) so looking for them in other tools (for example Encase) is a snap&lt;br /&gt;
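&lt;br /&gt;
: A small Python sketch of the two naming and hashing suggestions above: name each result after its offset in the input (in hexadecimal) and hash it with MD5 right away. The function names and the 16-digit zero padding are assumptions made for illustration. Using the byte offset rather than the sector number sidesteps the non-sector-aligned case, but embedded files starting at the same offset would still need an extra suffix.&lt;br /&gt;

```python
import hashlib


def carved_name(offset, extension):
    """Name a carved file after its byte offset in the input, as zero-padded hex."""
    return '{:016x}{}'.format(offset, extension)


def describe(offset, data, extension):
    """Collect the name, offset, size and MD5 of one carved result."""
    return {
        'name': carved_name(offset, extension),
        'offset': offset,
        'size': len(data),
        'md5': hashlib.md5(data).hexdigest(),
    }


record = describe(4096, b'hello', '.jpg')
```

: The resulting record can be looked up by hash in other tools or written to a log.&lt;br /&gt;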
&lt;br /&gt;
==Performance Requirements==&lt;br /&gt;
* Tested on 500GB-sized images. Should be able to carve a 500GB image in roughly 50% longer than it takes to read the image.&lt;br /&gt;
** Perhaps allocate a percentage budget per-validator (i.e. each validator adds N% to the carving time) [[User:RB|RB]]&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have multiple carving phases for precision/speed trade off?&lt;br /&gt;
* Parallelizable&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] tunable for different architectures&lt;br /&gt;
* Configuration:&lt;br /&gt;
** Capability to parse some existing carvers' configuration files, either on-the-fly or as a one-way converter.&lt;br /&gt;
** Disengage internal configuration structure from configuration files, create parsers that present the expected structure&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] The validator should deal with the file structure the carving algorithm should not know anything about the file structure (as in revit07 design)&lt;br /&gt;
**  Either extend Scalpel/Foremost syntaxes for extended features or use a tertiary syntax ([[User:Joachim Metz|Joachim]] I would prefer a derivative of the revit07 configuration syntax which already has encountered some problems of dealing with defining file structure in a configuration file)&lt;br /&gt;
&lt;br /&gt;
==Output==&lt;br /&gt;
* Can output audit.txt file.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output database with offset analysis values i.e. for visualization tooling&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output debug log for debugging the algorithm/validation&lt;br /&gt;
* Easy integration into ascription software.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm no native speaker what do you mean with &amp;quot;ascription software&amp;quot;?&lt;br /&gt;
::: I think this was another non-native requesting easy scriptability. [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] that makes sense ;-)&lt;br /&gt;
::::: Incorrect. Ascription software is software that determines who the owner of a file is. [[User:Simsong|Simsong]] 06:36, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= Ideas =&lt;br /&gt;
* Use as much TSK as possible. Don't carry your own FS implementation the way photorec does.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] using TSK as much as possible would not allow to add your own file system support (i.e. mobile phones, memory structures, cap files) I would propose wrapping TSK and using it as much as possible but allow to integrate own FS implementations. &lt;br /&gt;
* Extracting/carving data from [[Thumbs.db]]? I've used [[foremost]] for it with some success. [[Vinetto]] has some critical bugs :( [[User:.FUF|.FUF]] 19:18, 28 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
==Recursive Carving==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] do we want to support (let's call it) 'recursive in file carving' (for now) this is different from embedded files because there is a file system structure in the file and not just another file structure&lt;br /&gt;
* Is it just me, or do a lot of the above (and below) ideas somewhat skirt around the fact that many of us want recursive carving?  Can we bend back to that instead of discussing object particulars?  I think this can be distilled down to three requirements:&lt;br /&gt;
** Simple recursion: once an object is identified, have the ability to re-carve it for internal structures&lt;br /&gt;
** Directed recursion: the carver should be able to be directed at arbitrary blobs and told to carve it as a specified type.  This allows programmatically more simple methods of dealing with unidentifiably compressed or encrypted data.  Or filesystem fragments.&lt;br /&gt;
** Export: the ability to export an object (recognized or not) for later or external &amp;quot;recursion&amp;quot;.  Should go without saying for a carver, but...&lt;br /&gt;
:--[[User:RB|RB]] 18:45, 2 November 2008 (UTC)&lt;br /&gt;
:: [[User:Simsong|Simsong]] 06:30, 3 November 2008 (UTC) pyflag already does recursive carving. Are we just going to reimplement pyflag as a single executable?&lt;br /&gt;
&lt;br /&gt;
==Library Dependencies==&lt;br /&gt;
[[User:Capibara|Rob J Meijer]] :&lt;br /&gt;
* Use libcarvpath whenever possible and by default to avoid high storage requirements.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] For easy deployment I would not opt for making an integral part of the tool solely dependent on a single external library or the library must be integrated in the package&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] Integrating libraries (libtsk,libaff.libewf,libcarvpath etc) is bad practice, autotools are your friend IMO.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not talking about integrating (shared) libraries. I'm saying that an integral part of a tool should be part of its package. Why can't the tool package contain shared or static libraries for local use? A far worse thing to do is to have a large set of dependencies and make the tool difficult to install for most users. The tool package should contain the most necessary code. afflib/libewf support could be detected by the autotools, a neat separation of functionality.&lt;br /&gt;
::: From a packager's standpoint, [[User:Joachim Metz|Joachim]]'s other libraries do a really good job of this, carrying around what they need but using a system-global version if available.  [[User:RB|RB]]&lt;br /&gt;
* libtsk&lt;br /&gt;
* libaff ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* libewf ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* posix ? : Can we depend especially on the availability of UNIX domain sockets and the possibility to use msg_accrights for passing open file handles as ocaps?&lt;br /&gt;
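&lt;br /&gt;
: A minimal Python sketch of passing an open file handle over a UNIX domain socket, as asked about in the posix item above. It uses socket.send_fds and socket.recv_fds (Python 3.9 or later, Unix only), which carry the descriptor as SCM_RIGHTS ancillary data rather than through the older msg_accrights field. The helper names and the temporary file standing in for an image are illustrative only.&lt;br /&gt;

```python
import os
import socket
import tempfile


def send_handle(sock, fd):
    """Hand an already opened file descriptor to the peer process."""
    socket.send_fds(sock, [b'image'], [fd])


def receive_handle(sock):
    """Receive one file descriptor; the receiver never sees a path name."""
    data, fds, flags, address = socket.recv_fds(sock, 1024, 1)
    return fds[0]


# Demo inside one process: a socket pair stands in for launcher and confined worker.
with tempfile.TemporaryFile() as image:
    image.write(b'raw image bytes')
    image.flush()
    launcher, worker = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    send_handle(launcher, image.fileno())
    received = receive_handle(worker)
    first_bytes = os.pread(received, 3, 0)   # read through the received handle
    os.close(received)
    launcher.close()
    worker.close()
```

: In a real least-authority setup the two socket ends would live in different processes, with the receiving one confined so that the handles it is given are the only files it can reach.&lt;br /&gt;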
&lt;br /&gt;
==Filesystem Detection==&lt;br /&gt;
* Don't stop with filesystem detection after the first match. Often if a partition is reused with a new FS and is not all that full yet, much of the old FS can still be valid. I have seen this with ext2/fat. The fact that you have identified a valid FS on a partition doesn't mean there isn't an (almost) valid second FS that would yield additional files. Identifying doubly allocated space might in some cases also be relevant.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] What you're saying is that dealing with file system fragments should be part of the carving algorithm&lt;br /&gt;
* Allow use where filesystem based carving is done by other tool, and the tool is used as second stage on (sets of) unallocated block (pseudo) files and/or non FS partition (pseudo) files.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I would not opt for this. The tool would be dependent on other tools and their data format, which makes the tool difficult to maintain. I would opt to integrate the functionality of having multiple recovery phases (stages) and allow the tooling to run the phases one after another or separately.&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] More generically, I feel a way should exist to communicate the 'left overs' a previous (non open, for example LE-only) tool left.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess if the tool is designed to handle multiple phases it should store its data somewhere. So it should be possible to convert results of such non open tooling to the format required. However I would opt to design the recovery functionality of these non-open tools into open tools. And not to limit ourselves making translators due to the design of these non-open tools.&lt;br /&gt;
* Ability to be used as a library instead of a tool. Ability to access metadata through the library, and thus the ability to set metadata from the carving modules. This would be extremely useful for integrating the project into a [[computer forensics framework]].&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess most of the code could be integrated into libraries, but I would not opt integrating tool functionality into a library&lt;br /&gt;
* [[User:Mark Stam|Mark]] I think it would be very handy to have a CSV, TSV, XML or other delimited output (log)file with information about carved files. This output file can then be stored in a database or Excel sheet (report function)&lt;br /&gt;
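&lt;br /&gt;
: A short Python sketch of the delimited log file suggested above, using CSV. The column set (name, offset, size, md5) is an assumption made for illustration, not an agreed format.&lt;br /&gt;

```python
import csv
import io

FIELDS = ['name', 'offset', 'size', 'md5']   # hypothetical column set


def write_log(records, stream):
    """Write one header row plus one row per carved file."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


out = io.StringIO()
write_log([{'name': '0000000000001000.jpg', 'offset': 4096, 'size': 5, 'md5': 'n/a'}], out)
log_text = out.getvalue()
```

: Passing delimiter='\t' to csv.DictWriter would give TSV instead; either form can be imported into a database or a spreadsheet.&lt;br /&gt;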
&lt;br /&gt;
==Anti forensics and system integrity concerns==&lt;br /&gt;
* It might be very interesting to look at the possibilities of using a multi process style of module support and combine it with a least authority design. On platforms that support AppArmor (or similar) and uid based firewall rules, this could make for the first true POLA (principle of least authority) based forensic tool ever. POLA based forensics tools should make for a strong integrity guard against many anti forensics. Alternatively we could look at integrating a capability secure language (E?) for implementation of at least validation modules. I don't expect this idea to make it, but I hope mentioning it might spark off less strong alternatives that at least partially address the integrity + anti-forensics problem. If we can in some way introduce POLA to a wider forensics public, other tools might also pick up on it, which would be great.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Could you give an example of how you see this in action?&lt;br /&gt;
::::[[User:Capibara|Rob J Meijer]] I see two layers where POLA could be applied. The best one would require one of the following as prerequisites:&lt;br /&gt;
::::* The libaff/libewf layer is moved to a fuse implementation (for example carvfs).&lt;br /&gt;
::::* Libewf/Libaff are updated to accept opened file handles instead of demanding to open their own files. &lt;br /&gt;
::::If one of these is fulfilled, then the tool running as some user can just have the simple task of opening the image files, starting up the 'real' tool and handing over the appropriate file handles. If the real tool runs with a restrictive AppArmor profile, and is started suid to a tool specific user that also has its own iptables uid based filter, then the real tool will run with least authority.&lt;br /&gt;
:::: A second alternative, if neither of the prerequisites above can be met, would be to run the modules as confined processes and have a non confined process run as proxy for the first.&lt;br /&gt;
:::: A third probably far fetched alternative would be to embed an object capability language in the tool and make the module interface thus that modules are to be written in this ocap language.&lt;br /&gt;
::::A 4th alternative might include minorfs or plash, but I haven't given those sufficient thinking hours yet.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Format syntax specification ==&lt;br /&gt;
* Carving data structures. For example, extract all TCP headers from image by defining TCP header structure and some fields (e.g. source port &amp;gt; 1024, dest port = 80). This will extract all data matching the pattern and write a file with other fields. Another example is carving INFO2 structures and URL activity records from index.dat [[User:.FUF|.FUF]] 20:51, 28 October 2008 (UTC)&lt;br /&gt;
** This has the opportunity to be extended to the concept of &amp;quot;point at blob FOO and interpret it as BAR&amp;quot;&lt;br /&gt;
.FUF added:&lt;br /&gt;
The main idea is to allow users to define structures, for example (in pascal-like form):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1: Byte = 123;&lt;br /&gt;
SomeTextLength: DWORD;&lt;br /&gt;
SomeText: string[SomeTextLength];&lt;br /&gt;
Field4: Char = 'r';&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will produce something like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1 = 123&lt;br /&gt;
SomeTextLength = 5&lt;br /&gt;
SomeText = 'abcd1'&lt;br /&gt;
Field4 = 'r'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(In text or raw forms.)&lt;br /&gt;
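&lt;br /&gt;
A minimal Python sketch of how the example structure above could be matched against raw data. It assumes a little-endian DWORD and single-byte characters, neither of which the pascal-like form specifies, and the function names are illustrative only.&lt;br /&gt;

```python
def match_record(data, offset=0):
    """Return the decoded fields if the example structure matches at offset, else None."""
    fixed = data[offset:offset + 5]
    if len(fixed) != 5 or fixed[0] != 123:                 # Field1: Byte = 123
        return None
    text_length = int.from_bytes(fixed[1:5], 'little')     # SomeTextLength: DWORD (little-endian assumed)
    text_start = offset + 5
    tail = data[text_start:text_start + text_length + 1]   # SomeText followed by Field4
    if len(tail) != text_length + 1 or tail[-1:] != b'r':  # Field4: Char = 'r'
        return None
    return {
        'Field1': 123,
        'SomeTextLength': text_length,
        'SomeText': tail[:-1].decode('latin-1'),
        'Field4': 'r',
    }


def carve_records(data):
    """Try every byte offset and collect all matches with their offsets."""
    hits = []
    for offset in range(len(data)):
        record = match_record(data, offset)
        if record is not None:
            hits.append((offset, record))
    return hits


sample = b'\x00\x00' + bytes([123]) + (5).to_bytes(4, 'little') + b'abcd1' + b'r' + b'\xff'
hits = carve_records(sample)
```

On the sample, the only match is at offset 2 with SomeText = 'abcd1'. The two fixed-value fields act as the pattern, and the length field decides where the trailing fixed field must be found.&lt;br /&gt;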
&lt;br /&gt;
Opinions?&lt;br /&gt;
&lt;br /&gt;
Opinion: Simple pattern identification like that may not suffice, I think Simson's original intent was not only to identify but to allow for validation routines (plugins, as the original wording was).  As such, the format syntax would need to implement a large chunk of some programming language in order to be sufficiently flexible. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
In my opinion your example is too limited. Making the revit configuration I learned you'll need a near programming language to specify some file formats.&lt;br /&gt;
A simple descriptive language is too limiting. I would also go for 2 bytes with endianness instead of using terminology like WORD and small integer; it's much clearer. The configuration also needs to deal with aspects like cardinality, required and optional structures.&lt;br /&gt;
:: This is simply data structures carving, see ideas above. Somebody (I cannot track so many changes per day) separated the original text. There is no need to count and join different structures. [[User:.FUF|.FUF]] 19:53, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This was probably me. Is the text back in its original form?&lt;br /&gt;
:::: I started it by moving your Revit07 comment to the validator/plugin section in [http://www.forensicswiki.org/index.php?title=Carver_2.0_Planning_Page&amp;amp;diff=prev&amp;amp;oldid=7583 this edit], since I was still at that point thinking operational configuration for that section, not parser configurations. [[User:RB|RB]]&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] I renamed the title to format syntax, clarity is important ;-)&lt;br /&gt;
&lt;br /&gt;
Please take a look at the revit07 format syntax specification (configuration). It's not there yet but goes a long way. Some things currently missing:&lt;br /&gt;
* bitwise alignment&lt;br /&gt;
* handling encapsulated streams (MPEG/capture files)&lt;br /&gt;
* handling content based formats (MBOX)&lt;br /&gt;
&lt;br /&gt;
=Carving algorithm =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* should we allow for multiple carving phases (runs/stages)?&lt;br /&gt;
:: I opt yes (separation of concern)&lt;br /&gt;
* should we allow for multiple carving algorithms?&lt;br /&gt;
:: I opt yes, this allows testing of different approaches&lt;br /&gt;
* Should the algorithm try to do as much as possible in one run over the input data, to reduce IO?&lt;br /&gt;
:: I opt that the tool should allow both single and multiple runs over the input data, so that either the IO or the CPU bottleneck can be minimized&lt;br /&gt;
* Interaction between algorithm and validators&lt;br /&gt;
** does the algorithm pass data blocks to the validators?&lt;br /&gt;
** does a validator need to maintain a state?&lt;br /&gt;
** does a validator need to revert a state?&lt;br /&gt;
** How do we deal with embedded files and content validation? Do the validators call another validator?&lt;br /&gt;
* do we use the assumption that a data block can be used by a single file (with the exception of embedded/encapsulated files)?&lt;br /&gt;
* Revit07 allows for multiple concurrent result file states to deal with fragmentation. One has the attribute of being active (the preferred one) and the others are passive. Do we want/need something similar? The algorithm adds blocks of input data (offsets) to these result file states.&lt;br /&gt;
** if so, what info would these result file states require (type, list of input data blocks)?&lt;br /&gt;
* how do we deal with file system remainders?&lt;br /&gt;
** Can we abstract them and compare them against available file system information?&lt;br /&gt;
* Do we carve file systems in files?&lt;br /&gt;
:: I opt that at least the validator uses this information&lt;br /&gt;
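&lt;br /&gt;
To make the questions about validator state more concrete, here is a minimal Python sketch of a block-wise validator whose state can be saved and reverted by the carving algorithm. The class name, the method names and the state layout are assumptions made for illustration; they do not describe revit07 or any existing carver.&lt;br /&gt;

```python
import copy


class SignatureValidator:
    """Hypothetical block-wise validator that keeps state between blocks."""

    def __init__(self, header, footer):
        self.header = header
        self.footer = footer
        self.state = {"seen_header": False, "blocks": [], "complete": False}

    def snapshot(self):
        # Deep copy so that later feed() calls cannot alter the saved state.
        return copy.deepcopy(self.state)

    def revert(self, saved):
        self.state = copy.deepcopy(saved)

    def feed(self, offset, block):
        """Accept one data block; return True if it was added to the file."""
        if self.state["complete"]:
            return False
        if not self.state["seen_header"]:
            if not block.startswith(self.header):
                return False
            self.state["seen_header"] = True
        self.state["blocks"].append(offset)
        if self.footer in block:
            self.state["complete"] = True
        return True


validator = SignatureValidator(b"\xff\xd8", b"\xff\xd9")
validator.feed(0, b"\xff\xd8 first block")
saved = validator.snapshot()
validator.feed(512, b"candidate block that the algorithm later rejects")
validator.revert(saved)
validator.feed(1024, b"last block \xff\xd9")
print(validator.state["blocks"], validator.state["complete"])
```

In this sketch the algorithm, not the validator, decides when to revert; the list of block offsets kept in the state plays the role of a single result file state.&lt;br /&gt;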
&lt;br /&gt;
==Carving scenarios ==&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* normal file (file structure, loose text based structure (more a content structure?))&lt;br /&gt;
* fragmented file (the file exists entirely)&lt;br /&gt;
* a file fragment (the file does not exist entirely)&lt;br /&gt;
* intertwined file&lt;br /&gt;
* encapsulated file (MPEG/network capture)&lt;br /&gt;
* embedded file (JPEG thumbnail)&lt;br /&gt;
* obfuscation ('encrypted' PFF) this also entails encryption and/or compression&lt;br /&gt;
* file system in file&lt;br /&gt;
&lt;br /&gt;
=File System Awareness =&lt;br /&gt;
==Background: Why be File System Aware?==&lt;br /&gt;
Advantages of being FS aware:&lt;br /&gt;
* You can pick up sector allocation sizes&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] do you mean file system block sizes?&lt;br /&gt;
* Some file systems may store things off sector boundaries. (ReiserFS with tail packing)&lt;br /&gt;
* Increasingly file systems have compression (NTFS compression)&lt;br /&gt;
* Carve just the sectors that are not in allocated files.&lt;br /&gt;
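&lt;br /&gt;
The last advantage can be sketched as follows, assuming the file system layer can report allocated space as (start sector, sector count) runs. The function name and the sample allocation map are hypothetical; this is not a TSK API.&lt;br /&gt;

```python
def unallocated_runs(total_sectors, allocated):
    """Return (start, count) runs not covered by the allocated (start, count) runs."""
    gaps = []
    cursor = 0
    for start, count in sorted(allocated):
        gap = start - cursor
        if gap > 0:
            gaps.append((cursor, gap))
        # Overlapping or adjacent allocated runs must not move the cursor back.
        cursor = max(cursor, start + count)
    tail = total_sectors - cursor
    if tail > 0:
        gaps.append((cursor, tail))
    return gaps


# Hypothetical allocation map of a 100-sector volume.
print(unallocated_runs(100, [(10, 20), (50, 5), (25, 10)]))
```

The carver would then read only the returned runs instead of the whole volume.&lt;br /&gt;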
&lt;br /&gt;
==Tasks that would be required==&lt;br /&gt;
&lt;br /&gt;
==Discussion==&lt;br /&gt;
:: As noted above, TSK should be utilized as much as possible, particularly the filesystem-aware portion.  If we want to identify filesystems outside of its supported set, it would be more worth our time to work on implementing them there than in the carver itself.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:::: I guess this tool operates like [[Selective file dumper]] and can recover files in both ways (or not?). Recovering files by using carving can recover files in situations where sleuthkit does nothing (e.g. file on NTFS was deleted using ntfs-3g, or filesystem was destroyed or just unknown). And we should build the list of filesystems supported by carver, not by TSK. [[User:.FUF|.FUF]] 07:08, 29 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:: This tool is still in the early planning stages (requirements discovery), hence few operational details (like precise modes of operation) have been fleshed out - those will and should come later.  The justification for strictly using TSK for the filesystem-sensitive approach is simple: TSK has good filesystem APIs, and it would be foolish to create yet another standalone, incompatible implementation of filesystem(foo) when time would be better spent improving those in TSK, aiding other methods of analysis as well.  This is the same reason individuals that have implemented several other carvers are participating: de-duplication of effort.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] A design problem might be that TSK currently is a single library operating on multiple layers (storage media IO, volume/partition analysis and file system analysis). I'm not aware how easily the parts can be used separately. But I estimate that for the carver we want to use these 3 layers differently than TSK currently does.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I would like to have the carver (recovery tool) also do recovery using file allocation data or remainders of file allocation data.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] &lt;br /&gt;
I would go as far as to ask you all to look beyond the carver as a tool and look from the perspective of the carver as part of the forensic investigation process. In my eyes certain information needed/acquired by the carver could also be very useful investigative information, e.g. what part of a hard disk contains empty sectors.&lt;br /&gt;
&lt;br /&gt;
=Supportive tooling=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* validator (definitions) tester (detest in revit07)&lt;br /&gt;
* tool to make configuration based definitions&lt;br /&gt;
* post carving validation&lt;br /&gt;
* the carver needs to provide support for fuse mount of carved files (carvfs)&lt;br /&gt;
&lt;br /&gt;
=Testing =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* automated testing&lt;br /&gt;
* test data&lt;br /&gt;
&lt;br /&gt;
=Validator Construction=&lt;br /&gt;
Options:&lt;br /&gt;
* Write validators in C/C++&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] you mean dedicated validators&lt;br /&gt;
* Have a scripting language for writing them (python? Perl?) our own?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] use easy to embed programming languages, e.g. Python or Lua&lt;br /&gt;
* Use existing programs (libjpeg?) as plug-in validators?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] define a file structure api for this&lt;br /&gt;
&lt;br /&gt;
=Existing Code that we have=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Please add any missing links&lt;br /&gt;
&lt;br /&gt;
Documentation/Articles&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* DFRWS2008 paper on carving&lt;br /&gt;
&lt;br /&gt;
Carvers&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* photorec (http://www.cgsecurity.org/wiki/PhotoRec)&lt;br /&gt;
* revit06 and revit07 (http://sourceforge.net/projects/revit/)&lt;br /&gt;
* s3/scarve&lt;br /&gt;
&lt;br /&gt;
Possible file structure validator libraries&lt;br /&gt;
* diverse existing file support libraries&lt;br /&gt;
* libole2 (inhouse experimental code of OLE2 support)&lt;br /&gt;
* libpff (alpha release for PFF (PST/OST) file support) (http://sourceforge.net/projects/libpff/)&lt;br /&gt;
&lt;br /&gt;
Input support&lt;br /&gt;
* AFF (http://www.afflib.org/)&lt;br /&gt;
* EWF (http://sourceforge.net/projects/libewf/)&lt;br /&gt;
* TSK device &amp;amp; raw &amp;amp; split raw (http://www.sleuthkit.org/)&lt;br /&gt;
&lt;br /&gt;
Volume/Partition support&lt;br /&gt;
* disktype (http://disktype.sourceforge.net/)&lt;br /&gt;
* testdisk (http://www.cgsecurity.org/wiki/TestDisk)&lt;br /&gt;
* TSK&lt;br /&gt;
&lt;br /&gt;
File system support&lt;br /&gt;
* TSK&lt;br /&gt;
* photorec FS code&lt;br /&gt;
* implementations of FS in Linux/BSD&lt;br /&gt;
&lt;br /&gt;
Content support&lt;br /&gt;
&lt;br /&gt;
Zero storage support&lt;br /&gt;
* libcarvpath ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210704 )&lt;br /&gt;
* carvfs ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210954 )&lt;br /&gt;
* tsk-cp ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=267227 )&lt;br /&gt;
* carvfsmodewf (http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=268256 )&lt;br /&gt;
POLA&lt;br /&gt;
* joe-e (java) ( http://code.google.com/p/joe-e/ )&lt;br /&gt;
* Emily (ocaml)  ( http://erights.org/download/emily/ )&lt;br /&gt;
* the E language ( http://www.erights.org/ )&lt;br /&gt;
* AppArmor&lt;br /&gt;
* iptables/ipfw&lt;br /&gt;
* minorfs ( http://polacanthus.net/minorfs.html )&lt;br /&gt;
* plash ( http://plash.beasts.org/wiki/ )&lt;br /&gt;
&lt;br /&gt;
=Implementation Timeline=&lt;br /&gt;
# gather the available resources/ideas/wishes/needs etc. (I guess we're in this phase)&lt;br /&gt;
# start discussing a high level design (in terms of algorithm, facilities, information needed)&lt;br /&gt;
## input formats facility&lt;br /&gt;
## partition/volume facility&lt;br /&gt;
## file system facility&lt;br /&gt;
## file format facility&lt;br /&gt;
## content facility&lt;br /&gt;
## how to deal with fragment detection (do the validators allow for fragment detection?)&lt;br /&gt;
## how to deal with recombination of fragments&lt;br /&gt;
## do we want multiple carving phases in light of speed/precision tradeoffs&lt;br /&gt;
# start detailing parts of the design&lt;br /&gt;
## Discuss options for a grammar driven validator?&lt;br /&gt;
## Hard-coded plug-ins?&lt;br /&gt;
## Which existing code can we use?&lt;br /&gt;
# start building/assembling parts of the tooling for a prototype&lt;br /&gt;
## Implement simple file carving with validation.&lt;br /&gt;
## Implement gap carving&lt;br /&gt;
# Initial Release&lt;br /&gt;
# Implement the ''threaded carving'' that [[User:.FUF|.FUF]] is describing above.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Shouldn't multi-threaded carving (MTC) be part of the 1st version?&lt;br /&gt;
The MT approach makes for different design decisions&lt;br /&gt;
: It is virtually impossible to turn a non-MT application into an MT application. [[User:Simsong|Simsong]] 06:37, 3 November 2008 (UTC)&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page</id>
		<title>Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page"/>
				<updated>2008-12-03T22:00:36Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* Filesystem Detection */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is for planning Carver 2.0.&lt;br /&gt;
&lt;br /&gt;
Please, do not delete text (ideas) here. Use something like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&lt;br /&gt;
= License =&lt;br /&gt;
&lt;br /&gt;
BSD-3.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] library based validators could require other licenses&lt;br /&gt;
::: Make the other libraries plug-able. If you have them, you use them. [[User:Simsong|Simsong]] 06:34, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= OS =&lt;br /&gt;
&lt;br /&gt;
Linux/FreeBSD/MacOS&lt;br /&gt;
: Shouldn't this just match what the underlying afflib &amp;amp; sleuthkit cover? [[User:RB|RB]]&lt;br /&gt;
:: Yes, but you need to test and validate on each. Question: Do we want to support windows? [[User:Simsong|Simsong]] 21:09, 30 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I think we would be wise to design with Windows support from the start; this will improve platform independence from the outset&lt;br /&gt;
:::: Agreed; I would even settle at first for being able to run against Cygwin.  Note that I don't even own or use a copy of Windows, but the vast majority of forensic investigators do. [[User:RB|RB]] 14:01, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Capibara|Rob J Meijer]] Leaning heavily on the autotools might be the way to go. I do however feel that support requirements for windows would not be essential. Being able to run from a virtual machine with the main storage mounted over cifs should however be tested and if possible tuned extensively.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] You'll need more than autotools to do native Windows support i.e. file access, UTF-16 support, wrap some basic system functions or have them available otherwise&lt;br /&gt;
::::::[[User:Capibara|Rob J Meijer]] That's exactly my point: Windows support, as in being able to build and run on Windows natively, is much more trouble than it's worth. Better to make a lean and mean autotools based build with few dependencies and little or no recursion, and to spend effort on a lean POLA design on POSIX based systems rather than on supporting building and running on non-POSIX systems.&lt;br /&gt;
&lt;br /&gt;
= Name tooling =&lt;br /&gt;
&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] A name for the tooling I propose coldcut&lt;br /&gt;
:: How about 'butcher'?  ;)  [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] cleaver ( scalpel on steroids ;-) )&lt;br /&gt;
* I would like to propose Gouge or Chisel :-) [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
&lt;br /&gt;
= Requirements =&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Could we do a MoSCoW evaluation of these.&lt;br /&gt;
* AFF and EWF file images supported from scratch. ([[User:Joachim Metz|Joachim]] I would like to have raw/split raw and device access as well)&lt;br /&gt;
:: If we base our image i/o on afflib, we get all three with one interface. [[User:RB|RB]] Instead of letting the tools use afflib, better to write an afflib module for carvfs, and update the libewf module. The tool could then be oblivious of the file format. [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
:::: [[User:Simsong|Simsong]] 06:29, 3 November 2008 (UTC) The problem with using carvfs is that this adds another dependency. Do you really want to require that people install carvfs in order to run the carver? What about having the thing ported to Windows?&lt;br /&gt;
:::::: [[User:Capibara|Rob J Meijer]] I would support adding one build dependency (libcarvpath) and removing two (libewf/libaff) by moving them to a layer more suited for them (carvfs) that would possibly allow some form of file handle (as cap) based POLA design. I am a proponent of making small things that do one thing and do that one thing right, and of stacking those to do what you need. In my view that would ideally lead to the following (simplified) chain:&lt;br /&gt;
::::::* recursive [[computer forensics framework]] (ocfa/pyflag)&lt;br /&gt;
::::::** &amp;lt;b&amp;gt;The-(pola-based)-carving-tool&amp;lt;/b&amp;gt; &lt;br /&gt;
::::::*** &amp;lt;b&amp;gt;The-carving-lib&amp;lt;/b&amp;gt; working on open fd's.  &lt;br /&gt;
::::::**** libcarvpath&lt;br /&gt;
::::::***** carvfs (Over cifs/nfs-v4 on platforms that don't support Fuse).&lt;br /&gt;
::::::****** libewf&lt;br /&gt;
::::::****** libaff&lt;br /&gt;
::::::*** AppArmor (on supporting platforms)&lt;br /&gt;
::::::*** suid (on supporting platforms)&lt;br /&gt;
::::::*** iptables/ipfw (on supporting platforms)&lt;br /&gt;
:::::: As for windows support, I would imagine making carvfs run over smb would go a long way, in so far as windows support is all that relevant. &lt;br /&gt;
:::::: There are two advantages to using libcarvpath and carvfs instead of libaff/libewf at this layer:&lt;br /&gt;
::::::* storage requirements for doing carving. Beyond what sleuthkit or alternatives provide I have seen many situations where carving was not done due to storage limitations.&lt;br /&gt;
::::::* File handles are like object capabilities. You can often do pretty simple POLA based implementations using file handles and something like AppArmor. POLA could IMHO be a strong weapon against the more nasty forms of anti forensics.&lt;br /&gt;
::::::Next to this, I would consider making different tools for different stages instead of one semi recursive one, and looking at how to integrate these tools into existing frameworks (ocfa/pyflag). &lt;br /&gt;
::::::Keep things simple but rigid and try to easily integrate things into existing frameworks as effectively as possible I would suggest.&lt;br /&gt;
::::::Please note, I am not proposing the lib/tool should be useless without libcarvpath, only that usage without carvfs should limit the&lt;br /&gt;
::::::supported image formats to raw images, and that libewf/libaff should be abstracted at the Fuse level or below and not at the tool level.  &lt;br /&gt;
:::::::[[User:Joachim Metz|Joachim]] do you have an idea what the performance impact of this approach would be? It might be wise to do a proof of concept for this approach first.&lt;br /&gt;
::::::::[[User:Capibara|Rob J Meijer]] It would I think depend greatly on the behavior of the carving lib/tool. Small 512 byte reads are relatively very expensive, 128kb reads have negligible impact. Here are some numbers from ntfs-3g:  http://article.gmane.org/gmane.comp.file-systems.fuse.devel/6397/match=ntfs+3g+performance+ext3 that might be relevant. More relevant than performance might be library footprint. For example, using OCFA, we often would want to keep a few hundred images that total to tens of TB of projected storage size fuse mounted. If libewf/libaff have a big combined memory footprint in such cases, this can be a major issue for this approach.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] this layer should support multi threaded decompression of compressed image types, this speeds up IO&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] volume/partition aware layer (what about carving unpartitioned space)&lt;br /&gt;
* File system aware layer. This could be or make use of tsk-cp.&lt;br /&gt;
** By default, files are not carved. (clarify: only identified? [[User:RB|RB]]; I guess that it operates like [[Selective file dumper]] [[User:.FUF|.FUF]] 07:00, 29 October 2008 (UTC)). Alternatively, the tool could use libcarvpath and output carvpaths or create a directory with symlinks to carvpaths that point into a carvfs mountpoint [[User:Capibara|Rob J Meijer]].&lt;br /&gt;
* Plug-in architecture for identification/validation.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] support for multiple types of validators&lt;br /&gt;
*** dedicated validator&lt;br /&gt;
*** validator based on file library (i.e. we could specify/implement a file structure API for these)&lt;br /&gt;
*** configuration based validator (Can handle config files,like Revit07, to enter different file formats used by the carver.)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Moderator: Could we limit the requirements for prototype version 1 of the tool to get a working version up and running ASAP?&lt;br /&gt;
And keep discussing future options?&lt;br /&gt;
&lt;br /&gt;
I think the following set will be large enough to handle:&lt;br /&gt;
Input facilities&lt;br /&gt;
* IO support (AFF, device, EWF, RAW and split RAW)&lt;br /&gt;
:: Abstraction of input format and multi threaded decompression (spin-off code out of afflib?)&lt;br /&gt;
* Volume/Partitions support&lt;br /&gt;
:: at least for DOS based layout and GPT (spin-off code out of TSK/Photorec?)&lt;br /&gt;
* File system support&lt;br /&gt;
:: VFAT/NTFS (spin-off code out of TSK/Photorec?)&lt;br /&gt;
&lt;br /&gt;
Carving facilities&lt;br /&gt;
* File format support using plug-able validator model (use dedicated validators Photorec/Scarve and/or wrap revit07 file format as validator?)&lt;br /&gt;
* Content support using plug-able validator model (to handle text/mbox base64)&lt;br /&gt;
* File system carving support (to handle file system fragments, could be linked to file system support layer?)&lt;br /&gt;
* Basic fragment handling&lt;br /&gt;
&lt;br /&gt;
Output facilities&lt;br /&gt;
* audit/analysis/debug log&lt;br /&gt;
* extraction of result files&lt;br /&gt;
&lt;br /&gt;
==Supported File Formats==&lt;br /&gt;
* Ship with validators for:&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I think we should distinguish between file format validators and content validators&lt;br /&gt;
** JPEG&lt;br /&gt;
** PNG&lt;br /&gt;
** GIF&lt;br /&gt;
** MSOLE&lt;br /&gt;
** ZIP&lt;br /&gt;
** TAR (gz/bz2)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] For a production carver we need at least the following formats&lt;br /&gt;
** Graphical Images&lt;br /&gt;
*** JPEG (the 3 different types with JFIF/EXIF support)&lt;br /&gt;
*** PNG&lt;br /&gt;
*** GIF&lt;br /&gt;
*** BMP&lt;br /&gt;
*** TIFF&lt;br /&gt;
** Office documents&lt;br /&gt;
*** OLE2 (Word/Excel content support)&lt;br /&gt;
*** PDF&lt;br /&gt;
*** Open Office/Office 2007 (ZIP+XML)&lt;br /&gt;
:: Extension validation? AFAIK, MS Office 2007 [[DOCX]] format uses plain ZIP (or not?), and carved files will (or not?) have .zip extension instead of DOCX. Is there any way to fix this (may be using the file list in zip)? [[User:.FUF|.FUF]] 20:25, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Addition: Office 2007 also has a binary file format which is also ZIP-ed data&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Archive Files==&lt;br /&gt;
** Archive files&lt;br /&gt;
*** ZIP&lt;br /&gt;
*** 7z&lt;br /&gt;
*** gzip&lt;br /&gt;
*** bzip2&lt;br /&gt;
*** tar&lt;br /&gt;
*** RAR&lt;br /&gt;
** E-mail files&lt;br /&gt;
*** PFF (PST/OST)&lt;br /&gt;
*** MBOX (text based format, base64 content support)&lt;br /&gt;
** Audio/Video files&lt;br /&gt;
*** MPEG&lt;br /&gt;
*** MP2/MP3&lt;br /&gt;
*** AVI&lt;br /&gt;
*** ASF/WMV&lt;br /&gt;
*** QuickTime&lt;br /&gt;
*** MKV&lt;br /&gt;
** Printer spool files&lt;br /&gt;
*** EMF (if I remember correctly)&lt;br /&gt;
** Internet history files&lt;br /&gt;
*** index.dat&lt;br /&gt;
*** firefox (SQLite 3)&lt;br /&gt;
** Other files&lt;br /&gt;
*** thumbs.db&lt;br /&gt;
*** pagefile?&lt;br /&gt;
&lt;br /&gt;
==Carving Strategies==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Note to moderator could this section be merged with the carving algorithm section?&lt;br /&gt;
&lt;br /&gt;
* Simple fragment recovery carving using gap carving.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have hook in for more advanced fragment recovery?&lt;br /&gt;
* Recovering of individual ZIP sections and JPEG icons that are not sector aligned.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] I would propose a generic fragment detection and recovery&lt;br /&gt;
* Autonomous operation (some mode of operation should be completely non-interactive, requiring no human intervention to complete [[User:RB|RB]])&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] as much as possible, but allow this to be overridden by the user&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] When the tool outputs files, the filenames should contain the offset in the input data (in hexadecimal?)&lt;br /&gt;
:: [[User:Mark Stam|Mark]] I really like the fact carved files are named after the physical or logical sector in which the file is found (photorec)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This naming schema might cause duplicate name problem for extracting embedded files and extracting files from non sector aligned file systems.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export embedded files?&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export fragments separately?&lt;br /&gt;
* [[User:Mark Stam|Mark]] I personally use photorec often for carving files in the whole volume (not only unallocated clusters), so I can store information about all potential interesting files in MySQL&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] interesting, Bas Kloet and I have been discussing using information about allocated files in the recovery process, i.e. recovered fragments could be part of allocated files. Do we want to be able to extract them? Or could we rebuild the file from the fragments and the allocated files?&lt;br /&gt;
* [[User:Mark Stam|Mark]] It would also be nice if the files can be hashed immediately (MD5) so looking for them in other tools (for example Encase) is a snap&lt;br /&gt;
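&lt;br /&gt;
The naming and hashing suggestions in this list could be combined roughly as in the Python sketch below. The name pattern and the extra index suffix (a possible answer to the duplicate-name concern for embedded files) are assumptions for illustration, not an agreed convention.&lt;br /&gt;

```python
import hashlib


def carved_name(offset, extension, index=0):
    """Build a file name from the byte offset in the input data (hexadecimal)."""
    # The index suffix is a hypothetical way to keep names unique when
    # several results (e.g. embedded files) are reported for one offset.
    return "f{:012x}_{}.{}".format(offset, index, extension)


def describe(offset, extension, data):
    """Return the output name and the MD5 digest of the carved data."""
    return carved_name(offset, extension), hashlib.md5(data).hexdigest()


print(describe(0x1F400, "jpg", b"hello"))
```

Hashing while the data is still in memory avoids a second read of every carved file.&lt;br /&gt;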
&lt;br /&gt;
==Performance Requirements==&lt;br /&gt;
* Tested on 500GB-sized images. Should be able to carve a 500GB image in roughly 50% longer than it takes to read the image.&lt;br /&gt;
** Perhaps allocate a percentage budget per-validator (i.e. each validator adds N% to the carving time) [[User:RB|RB]]&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have multiple carving phases for precision/speed trade off?&lt;br /&gt;
* Parallelizable&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] tunable for different architectures&lt;br /&gt;
* Configuration:&lt;br /&gt;
** Capability to parse some existing carvers' configuration files, either on-the-fly or as a one-way converter.&lt;br /&gt;
** Disengage internal configuration structure from configuration files, create parsers that present the expected structure&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] The validator should deal with the file structure; the carving algorithm should not know anything about the file structure (as in the revit07 design)&lt;br /&gt;
**  Either extend Scalpel/Foremost syntaxes for extended features or use a tertiary syntax ([[User:Joachim Metz|Joachim]] I would prefer a derivative of the revit07 configuration syntax which already has encountered some problems of dealing with defining file structure in a configuration file)&lt;br /&gt;
&lt;br /&gt;
==Output==&lt;br /&gt;
* Can output audit.txt file.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output database with offset analysis values i.e. for visualization tooling&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output debug log for debugging the algorithm/validation&lt;br /&gt;
* Easy integration into ascription software.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not a native speaker; what do you mean by &amp;quot;ascription software&amp;quot;?&lt;br /&gt;
::: I think this was another non-native requesting easy scriptability. [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] that makes sense ;-)&lt;br /&gt;
::::: Incorrect. Ascription software is software that determines who the owner of a file is. [[User:Simsong|Simsong]] 06:36, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= Ideas =&lt;br /&gt;
* Use TSK as much as possible. Don't carry your own FS implementation the way photorec does.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] using TSK as much as possible would not allow to add your own file system support (i.e. mobile phones, memory structures, cap files) I would propose wrapping TSK and using it as much as possible but allow to integrate own FS implementations. &lt;br /&gt;
* Extracting/carving data from [[Thumbs.db]]? I've used [[foremost]] for it with some success. [[Vinetto]] has some critical bugs :( [[User:.FUF|.FUF]] 19:18, 28 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
==Recursive Carving==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] do we want to support (let's call it) 'recursive in file carving' (for now) this is different from embedded files because there is a file system structure in the file and not just another file structure&lt;br /&gt;
* Is it just me, or do a lot of the above (and below) ideas somewhat skirt around the fact that many of us want recursive carving?  Can we bend back to that instead of discussing object particulars?  I think this can be distilled down to three requirements:&lt;br /&gt;
** Simple recursion: once an object is identified, have the ability to re-carve it for internal structures&lt;br /&gt;
** Directed recursion: the carver should be able to be directed at arbitrary blobs and told to carve it as a specified type.  This allows programmatically more simple methods of dealing with unidentifiably compressed or encrypted data.  Or filesystem fragments.&lt;br /&gt;
** Export: the ability to export an object (recognized or not) for later or external &amp;quot;recursion&amp;quot;.  Should go without saying for a carver, but...&lt;br /&gt;
:--[[User:RB|RB]] 18:45, 2 November 2008 (UTC)&lt;br /&gt;
:: [[User:Simsong|Simsong]] 06:30, 3 November 2008 (UTC) pyflag already does recursive carving. Are we just going to reimplement pyflag as a single executable?&lt;br /&gt;
&lt;br /&gt;
==Library Dependencies==&lt;br /&gt;
[[User:Capibara|Rob J Meijer]] :&lt;br /&gt;
* Use libcarvpath whenever possible and by default to avoid high storage requirements.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] For easy deployment I would not opt for making an integral part of the tool solely dependent on a single external library; otherwise the library must be integrated in the package&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] Integrating libraries (libtsk,libaff.libewf,libcarvpath etc) is bad practice, autotools are your friend IMO.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not talking about integrating (shared) libraries. I'm saying that an integral part of a tool should be part of its package. Why can't the tool package contain shared or static libraries for local use? A far worse thing to do is to have a large set of dependencies, making the tool difficult to install for most users. The tool package should contain the most necessary code. afflib/libewf support could be detected by the autotools, a neat separation of functionality.&lt;br /&gt;
::: From a packager's standpoint, [[User:Joachim Metz|Joachim]]'s other libraries do a really good job of this, carrying around what they need but using a system-global version if available.  [[User:RB|RB]]&lt;br /&gt;
* libtsk&lt;br /&gt;
* libaff ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* libewf ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* posix ? : Can we depend especially on the availability of UNIX domain sockets and the possibility to use msg_accrights for passing open file handles as ocaps?&lt;br /&gt;
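&lt;br /&gt;
The file handle passing asked about in the last item can be sketched in Python as follows. This assumes Python 3.9 or later on a POSIX system, where socket.send_fds and socket.recv_fds wrap the SCM_RIGHTS ancillary data mechanism (the modern form of msg_accrights). For brevity both socket ends live in one process and a temporary file stands in for the image; in a real design the receiving end would be a separate, confined process.&lt;br /&gt;

```python
import os
import socket
import tempfile

# Requires Python 3.9+ on a POSIX system.
parent_end, child_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# The unconfined side opens the (stand-in) image read-only ...
image = tempfile.NamedTemporaryFile(delete=False)
image.write(b"raw image bytes")
image.close()
image_fd = os.open(image.name, os.O_RDONLY)

# ... and hands over only the open descriptor, not the path.
socket.send_fds(parent_end, [b"image"], [image_fd])
message, received_fds, _flags, _address = socket.recv_fds(child_end, 1024, 1)

# The receiver reads through the descriptor it was given.
received = os.read(received_fds[0], 1024)
print(message, received)

for fd in (image_fd, received_fds[0]):
    os.close(fd)
parent_end.close()
child_end.close()
os.unlink(image.name)
```

The receiving side never needs permission to open files itself, which is what makes the descriptor usable as a capability.&lt;br /&gt;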
&lt;br /&gt;
==Filesystem Detection==&lt;br /&gt;
* Don't stop filesystem detection after the first match. Often, if a partition is reused with a new FS and is not all that full yet, much of the old FS can still be valid. I have seen this with ext2/fat. The fact that you have identified a valid FS on a partition doesn't mean there isn't an (almost) valid second FS that would yield additional files. Identifying doubly allocated space might in some cases also be relevant.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] What you're saying is that dealing with file system fragments should be part of the carving algorithm&lt;br /&gt;
* Allow use where filesystem based carving is done by another tool, and the tool is used as second stage on (sets of) unallocated block (pseudo) files and/or non FS partition (pseudo) files.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I would not opt for this. The tool would be dependent on other tools and their data format, which makes the tool difficult to maintain. I would opt to integrate the functionality of having multiple recovery phases (stages) and allow the tooling to run the phases one after another or separately.&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] More generically, I feel a way should exist to communicate the 'left overs' a previous (non open, for example LE-only) tool left.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess if the tool is designed to handle multiple phases it should store its data somewhere. So it should be possible to convert results of such non open tooling to the format required. However I would opt to design the recovery functionality of these non-open tools into open tools. And not to limit ourselves making translators due to the design of these non-open tools.&lt;br /&gt;
* Ability to be used as a library instead of a tool. Ability to access metadata through the library, and thus the ability to set metadata from the carving modules. This would be extremely useful for integrating the project into a [[computer forensics framework]].&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess most of the code could be integrated into libraries, but I would not opt integrating tool functionality into a library&lt;br /&gt;
* [[User:Mark Stam|Mark]] I think it would be very handy to have a CSV, TSV, XML or other delimited output (log)file with information about carved files. This output file can then be stored in a database or Excel sheet (report function)&lt;br /&gt;
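&lt;br /&gt;
A delimited log as suggested in the last item could look like the following Python sketch. The column names and the sample row are hypothetical placeholders; no existing carver's log format is implied.&lt;br /&gt;

```python
import csv
import io

# Hypothetical column set for the carving log.
COLUMNS = ["offset", "size", "type", "md5"]


def write_log(stream, results):
    """Write one header line and one line per carved file."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in results:
        writer.writerow(row)


# Placeholder values only; not real carving results.
results = [
    {"offset": 4096, "size": 53248, "type": "jpeg",
     "md5": "d41d8cd98f00b204e9800998ecf8427e"},
]
buffer = io.StringIO()
write_log(buffer, results)
print(buffer.getvalue())
```

Passing a different delimiter to the writer would give TSV output from the same rows.&lt;br /&gt;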
&lt;br /&gt;
==Anti forensics and system integrity concerns==&lt;br /&gt;
* It might be very interesting to look at the possibilities of using a multi process style of module support and combining it with a least authority design. On platforms that support AppArmor (or similar) and uid based firewall rules, this could make for the first true POLA (principle of least authority) based forensic tool ever. POLA based forensics tools should make for a strong integrity guard against many anti forensics techniques. Alternatively we could look at integrating a capability secure language (E?) for implementation of at least validation modules. I don't expect this idea to make it, but mentioning it I hope might spark off less strong alternatives that at least partially address the integrity + anti-forensics problem. If we can in some way introduce POLA to a wider forensics public, other tools might also pick up on it, which would be great.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Could you give an example of how you see this in action?&lt;br /&gt;
::::[[User:Capibara|Rob J Meijer]] I see two layers where POLA could be applied. The best one would require one of the following as prerequisites:&lt;br /&gt;
::::* The libaff/libewf layer is moved to a fuse implementation (for example carvfs).&lt;br /&gt;
::::* Libewf/Libaff are updated to accept opened file handles instead of demanding to open their own files.&lt;br /&gt;
::::If one of these is fulfilled, then the tool running as some user can just have the simple task of opening the image files, starting up the 'real' tool and handing over the appropriate file handles. If the real tool runs with a restrictive AppArmor profile, and is started suid to a tool specific user that also has its own iptables uid based filter, then the real tool will run with least authority.&lt;br /&gt;
:::: A second alternative, if neither of the prerequisites above can be met, would be to run the modules as confined processes and have a non confined process run as a proxy for them.&lt;br /&gt;
:::: A third, probably far fetched, alternative would be to embed an object capability language in the tool and design the module interface such that modules must be written in this ocap language.&lt;br /&gt;
::::A 4th alternative might include minorfs or plash, but I haven't given those sufficient thinking hours yet.&lt;br /&gt;
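The file handle hand-over described above could look roughly like the sketch below. It is an illustration only: it assumes a POSIX system, the worker is a stand-in one-liner rather than a real carving tool, and the AppArmor profile, suid user and iptables rules are not shown.&lt;br /&gt;

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for the 'real' tool: it never opens a file itself, it only reads
# from the descriptor number it is given on the command line.
WORKER = "import os, sys; sys.stdout.write(os.pread(int(sys.argv[1]), 4, 0).hex())"


def run_worker_on(path):
    """Open `path` read-only and hand only the descriptor to a child process."""
    fd = os.open(path, os.O_RDONLY)
    try:
        done = subprocess.run([sys.executable, "-c", WORKER, str(fd)],
                              pass_fds=[fd], capture_output=True, text=True,
                              check=True)
    finally:
        os.close(fd)
    return done.stdout


# Example: a temporary file plays the role of the image.
with tempfile.NamedTemporaryFile(delete=False) as image:
    image.write(b"IMG0 rest of a stand-in image")
magic_hex = run_worker_on(image.name)
os.unlink(image.name)
```

The point of the design is that the confined process needs no permission to open anything by name; the only authority it holds is the read-only descriptor it was handed.&lt;br /&gt;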
&lt;br /&gt;
&lt;br /&gt;
== Format syntax specification ==&lt;br /&gt;
* Carving data structures. For example, extract all TCP headers from image by defining TCP header structure and some fields (e.g. source port &amp;gt; 1024, dest port = 80). This will extract all data matching the pattern and write a file with other fields. Another example is carving INFO2 structures and URL activity records from index.dat [[User:.FUF|.FUF]] 20:51, 28 October 2008 (UTC)&lt;br /&gt;
** This has the opportunity to be extended to the concept of &amp;quot;point at blob FOO and interpret it as BAR&amp;quot;&lt;br /&gt;
.FUF added:&lt;br /&gt;
The main idea is to allow users to define structures, for example (in pascal-like form):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1: Byte = 123;&lt;br /&gt;
SomeTextLength: DWORD;&lt;br /&gt;
SomeText: string[SomeTextLength];&lt;br /&gt;
Field4: Char = 'r';&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will produce something like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1 = 123&lt;br /&gt;
SomeTextLength = 5&lt;br /&gt;
SomeText = 'abcd1'&lt;br /&gt;
Field4 = 'r'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(In text or raw forms.)&lt;br /&gt;
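A rough sketch of how the example structure above could be matched against raw data. The field widths and byte order are assumptions (Byte as 1 byte, DWORD as 4 bytes little-endian, string as raw bytes); the function names are hypothetical and this is not the revit07 syntax.&lt;br /&gt;

```python
def parse_record(data, offset=0):
    """Parse the example structure at `offset`; return a dict, or None on mismatch.

    Assumed layout: Field1 (1 byte, must be 123), SomeTextLength (4 bytes,
    little-endian), SomeText (that many bytes), Field4 (1 byte, must be 'r').
    """
    header = data[offset:offset + 5]
    if len(header) != 5 or header[0] != 123:
        return None
    text_len = int.from_bytes(header[1:5], "little")
    text = data[offset + 5:offset + 5 + text_len]
    tail = data[offset + 5 + text_len:offset + 6 + text_len]
    if len(text) != text_len or tail != b"r":
        return None
    return {"Field1": 123, "SomeTextLength": text_len,
            "SomeText": text.decode("ascii", "replace"), "Field4": "r"}


def carve_records(data):
    """Scan every byte offset and collect (offset, record) pairs that match."""
    hits = []
    for offset in range(len(data)):
        record = parse_record(data, offset)
        if record is not None:
            hits.append((offset, record))
    return hits


# Example: the record from the text, surrounded by unrelated bytes.
sample = (b"\x00\x00" + bytes([123]) + (5).to_bytes(4, "little")
          + b"abcd1" + b"r" + b"\xff")
hits = carve_records(sample)
```

Even this small case needs a length dependency between two fields, which illustrates the concern raised below that a purely descriptive syntax may fall short for real file formats.&lt;br /&gt;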
&lt;br /&gt;
Opinions?&lt;br /&gt;
&lt;br /&gt;
Opinion: Simple pattern identification like that may not suffice, I think Simson's original intent was not only to identify but to allow for validation routines (plugins, as the original wording was).  As such, the format syntax would need to implement a large chunk of some programming language in order to be sufficiently flexible. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
In my opinion your example is too limited. Making the revit configuration I learned you'll need a near programming language to specify some file formats.&lt;br /&gt;
A simple descriptive language is too limiting. I would also go for 2 bytes with endianness instead of using terminology like WORD and small integer; it's much clearer. The configuration also needs to deal with aspects like cardinality, required and optional structures.&lt;br /&gt;
:: This is simply data structures carving, see ideas above. Somebody (I cannot track so many changes per day) separated the original text. There is no need to count and join different structures. [[User:.FUF|.FUF]] 19:53, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This was probably me. Is the text back in its original form?&lt;br /&gt;
:::: I started it by moving your Revit07 comment to the validator/plugin section in [http://www.forensicswiki.org/index.php?title=Carver_2.0_Planning_Page&amp;amp;diff=prev&amp;amp;oldid=7583 this edit], since I was still at that point thinking operational configuration for that section, not parser configurations. [[User:RB|RB]]&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] I renamed the title to format syntax, clarity is important ;-)&lt;br /&gt;
&lt;br /&gt;
Please take a look at the revit07 format syntax specification (configuration). It's not there yet but goes a long way. Some things currently missing:&lt;br /&gt;
* bitwise alignment&lt;br /&gt;
* handling encapsulated streams (MPEG/capture files)&lt;br /&gt;
* handling content based formats (MBOX)&lt;br /&gt;
&lt;br /&gt;
=Carving algorithm =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* should we allow for multiple carving phases (runs/stages)?&lt;br /&gt;
:: I opt yes (separation of concern)&lt;br /&gt;
* should we allow for multiple carving algorithms?&lt;br /&gt;
:: I opt yes, this allows testing of different approaches&lt;br /&gt;
* Should the algorithm try to do as much in 1 run over the input data? To reduce IO?&lt;br /&gt;
:: I opt that the tool should allow for multiple and single run over the input data to minimize the IO or the CPU as bottleneck&lt;br /&gt;
* Interaction between algorithm and validators&lt;br /&gt;
** does the algorithm pass data blocks to the validators?&lt;br /&gt;
** does a validator need to maintain a state?&lt;br /&gt;
** does a validator need to revert a state?&lt;br /&gt;
** How do we deal with embedded files and content validation? Do the validators call another validator?&lt;br /&gt;
* do we use the assumption that a data block can be used by a single file (with the exception of embedded/encapsulated files)?&lt;br /&gt;
* Revit07 allows for multiple concurrent result file states to deal with fragmentation. One has the attribute of being active (the preferred) and the other passive. Do we want/need something similar? The algorithm adds blocks of input data (offsets) to these result file states.&lt;br /&gt;
** if so what info would these result file states require (type, list of input data blocks)?&lt;br /&gt;
* how do we deal with file system remainders?&lt;br /&gt;
** Can we abstract them and compare them against available file system information?&lt;br /&gt;
* Do we carve file systems in files?&lt;br /&gt;
:: I opt that at least the validator uses this information&lt;br /&gt;
&lt;br /&gt;
==Carving scenarios ==&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* normal file (file structure, loose text based structure (more a content structure?))&lt;br /&gt;
* fragmented file (the file entirely exists)&lt;br /&gt;
* a file fragment (the file does not entirely exist)&lt;br /&gt;
* intertwined file&lt;br /&gt;
* encapsulated file (MPEG/network capture)&lt;br /&gt;
* embedded file (JPEG thumbnail)&lt;br /&gt;
* obfuscation ('encrypted' PFF) this also entails encryption and/or compression&lt;br /&gt;
* file system in file&lt;br /&gt;
&lt;br /&gt;
=File System Awareness =&lt;br /&gt;
==Background: Why be File System Aware?==&lt;br /&gt;
Advantages of being FS aware:&lt;br /&gt;
* You can pick up sector allocation sizes&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] do you mean file system block sizes?&lt;br /&gt;
* Some file systems may store things off sector boundaries. (ReiserFS with tail packing)&lt;br /&gt;
* Increasingly file systems have compression (NTFS compression)&lt;br /&gt;
* Carve just the sectors that are not in allocated files.&lt;br /&gt;
&lt;br /&gt;
==Tasks that would be required==&lt;br /&gt;
&lt;br /&gt;
==Discussion==&lt;br /&gt;
:: As noted above, TSK should be utilized as much as possible, particularly the filesystem-aware portion.  If we want to identify filesystems outside of its supported set, it would be more worth our time to work on implementing them there than in the carver itself.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:::: I guess this tool operates like [[Selective file dumper]] and can recover files in both ways (or not?). Recovering files by using carving can recover files in situations where sleuthkit does nothing (e.g. file on NTFS was deleted using ntfs-3g, or filesystem was destroyed or just unknown). And we should build the list of filesystems supported by carver, not by TSK. [[User:.FUF|.FUF]] 07:08, 29 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:: This tool is still in the early planning stages (requirements discovery), hence few operational details (like precise modes of operation) have been fleshed out - those will and should come later.  The justification for strictly using TSK for the filesystem-sensitive approach is simple: TSK has good filesystem APIs, and it would be foolish to create yet another standalone, incompatible implementation of filesystem(foo) when time would be better spent improving those in TSK, aiding other methods of analysis as well.  This is the same reason individuals that have implemented several other carvers are participating: de-duplication of effort.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] A design problem might be that TSK currently is a single library operating on multiple layers (storage media IO, volume/partition analysis and file system analysis). I'm not aware how easily the parts can be used separately. But I estimate that for the carver we want to use these 3 layers differently than TSK currently does.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I would like to have the carver (recovery tool) also do recovery using file allocation data or remainders of file allocation data.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] &lt;br /&gt;
I would go as far as to ask you all to look beyond the carver as a tool and look from the perspective of the carver as part of the forensic investigation process. In my eyes certain information needed/acquired by the carver could also be very useful investigative information, e.g. what part of a hard disk contains empty sectors.&lt;br /&gt;
&lt;br /&gt;
=Supportive tooling=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* validator (definitions) tester (detest in revit07)&lt;br /&gt;
* tool to make configuration based definitions&lt;br /&gt;
* post carving validation&lt;br /&gt;
* the carver needs to provide support for fuse mount of carved files (carvfs)&lt;br /&gt;
&lt;br /&gt;
=Testing =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* automated testing&lt;br /&gt;
* test data&lt;br /&gt;
&lt;br /&gt;
=Validator Construction=&lt;br /&gt;
Options:&lt;br /&gt;
* Write validators in C/C++&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] you mean dedicated validators&lt;br /&gt;
* Have a scripting language for writing them (python? Perl?) our own?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] use easy to embed programming languages i.e. Python or Lua&lt;br /&gt;
* Use existing programs (libjpeg?) as plug-in validators?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] define a file structure api for this&lt;br /&gt;
&lt;br /&gt;
=Existing Code that we have=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Please add any missing links&lt;br /&gt;
&lt;br /&gt;
Documentation/Articles&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* DFRWS2008 paper on carving&lt;br /&gt;
&lt;br /&gt;
Carvers&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* photorec (http://www.cgsecurity.org/wiki/PhotoRec)&lt;br /&gt;
* revit06 and revit07 (http://sourceforge.net/projects/revit/)&lt;br /&gt;
* s3/scarve&lt;br /&gt;
&lt;br /&gt;
Possible file structure validator libraries&lt;br /&gt;
* various existing file support libraries&lt;br /&gt;
* libole2 (inhouse experimental code of OLE2 support)&lt;br /&gt;
* libpff (alpha release for PFF (PST/OST) file support) (http://sourceforge.net/projects/libpff/)&lt;br /&gt;
&lt;br /&gt;
Input support&lt;br /&gt;
* AFF (http://www.afflib.org/)&lt;br /&gt;
* EWF (http://sourceforge.net/projects/libewf/)&lt;br /&gt;
* TSK device &amp;amp; raw &amp;amp; split raw (http://www.sleuthkit.org/)&lt;br /&gt;
&lt;br /&gt;
Volume/Partition support&lt;br /&gt;
* disktype (http://disktype.sourceforge.net/)&lt;br /&gt;
* testdisk (http://www.cgsecurity.org/wiki/TestDisk)&lt;br /&gt;
* TSK&lt;br /&gt;
&lt;br /&gt;
File system support&lt;br /&gt;
* TSK&lt;br /&gt;
* photorec FS code&lt;br /&gt;
* implementations of FS in Linux/BSD&lt;br /&gt;
&lt;br /&gt;
Content support&lt;br /&gt;
&lt;br /&gt;
Zero storage support&lt;br /&gt;
* libcarvpath ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210704 )&lt;br /&gt;
* carvfs ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210954 )&lt;br /&gt;
* tsk-cp ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=267227 )&lt;br /&gt;
* carvfsmodewf (http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=268256 )&lt;br /&gt;
&lt;br /&gt;
POLA&lt;br /&gt;
* joe-e (java) ( http://code.google.com/p/joe-e/ )&lt;br /&gt;
* Emily (ocaml)  ( http://erights.org/download/emily/ )&lt;br /&gt;
* the E language ( http://www.erights.org/ )&lt;br /&gt;
* AppArmor&lt;br /&gt;
* iptables/ipfw&lt;br /&gt;
* minorfs ( http://polacanthus.net/minorfs.html )&lt;br /&gt;
* plash ( http://plash.beasts.org/wiki/ )&lt;br /&gt;
&lt;br /&gt;
=Implementation Timeline=&lt;br /&gt;
# gather the available resources/ideas/wishes/needs etc. (I guess we're in this phase)&lt;br /&gt;
# start discussing a high level design (in terms of algorithm, facilities, information needed)&lt;br /&gt;
## input formats facility&lt;br /&gt;
## partition/volume facility&lt;br /&gt;
## file system facility&lt;br /&gt;
## file format facility&lt;br /&gt;
## content facility&lt;br /&gt;
## how to deal with fragment detection (do the validators allow for fragment detection?)&lt;br /&gt;
## how to deal with recombination of fragments&lt;br /&gt;
## do we want multiple carving phases in light of speed/precision tradeoffs&lt;br /&gt;
# start detailing parts of the design&lt;br /&gt;
## Discuss options for a grammar driven validator?&lt;br /&gt;
## Hard-coded plug-ins?&lt;br /&gt;
## Which existing code can we use?&lt;br /&gt;
# start building/assembling parts of the tooling for a prototype&lt;br /&gt;
## Implement simple file carving with validation.&lt;br /&gt;
## Implement gap carving&lt;br /&gt;
# Initial Release&lt;br /&gt;
# Implement the ''threaded carving'' that [[User:.FUF|.FUF]] is describing above.&lt;br /&gt;
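For the gap carving step in the list above, a minimal sketch of the two-fragment case could look as follows. It assumes the first and last blocks of the file are already known and that a contiguous read has already failed validation; the CRC32 trailer validator is a toy stand-in for a real format validator, and all names are hypothetical.&lt;br /&gt;

```python
import zlib


def crc_ok(data):
    """Toy validator: the last 4 bytes must be the CRC32 of everything before."""
    return data[-4:] == zlib.crc32(data[:-4]).to_bytes(4, "big")


def gap_carve(blocks, first, last, validate):
    """Try to rebuild a two-fragment file lying between blocks `first` and `last`.

    Every contiguous run of blocks strictly between the two is tried as the gap
    to leave out. The first candidate accepted by `validate` is returned as
    (gap_start, gap_end, data) with gap_end exclusive, or None if none validates.
    """
    for gap_start in range(first + 1, last):
        for gap_end in range(gap_start + 1, last + 1):
            kept = blocks[first:gap_start] + blocks[gap_end:last + 1]
            candidate = b"".join(kept)
            if validate(candidate):
                return gap_start, gap_end, candidate
    return None


# Example: a 16 byte file in 4 byte blocks with two foreign blocks in the middle.
payload = b"AAAABBBBCCCC"
whole = payload + zlib.crc32(payload).to_bytes(4, "big")
blocks = [whole[0:4], whole[4:8], b"XXXX", b"YYYY", whole[8:12], whole[12:16]]
result = gap_carve(blocks, 0, 5, crc_ok)
```

The number of candidates grows quadratically with the distance between the first and last block, which is one reason the speed/precision tradeoff above matters.&lt;br /&gt;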
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Shouldn't multi threaded carving (MTC) be part of the 1st version?&lt;br /&gt;
The MT approach makes for different design decisions.&lt;br /&gt;
: It is virtually impossible to turn a non-MT application into an MT application. [[User:Simsong|Simsong]] 06:37, 3 November 2008 (UTC)&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page</id>
		<title>Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page"/>
				<updated>2008-12-03T21:58:52Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* Requirements */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is for planning Carver 2.0.&lt;br /&gt;
&lt;br /&gt;
Please, do not delete text (ideas) here. Use something like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&lt;br /&gt;
= License =&lt;br /&gt;
&lt;br /&gt;
BSD-3.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] library based validators could require other licenses&lt;br /&gt;
::: Make the other libraries plug-able. If you have them, you use them. [[User:Simsong|Simsong]] 06:34, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= OS =&lt;br /&gt;
&lt;br /&gt;
Linux/FreeBSD/MacOS&lt;br /&gt;
: Shouldn't this just match what the underlying afflib &amp;amp; sleuthkit cover? [[User:RB|RB]]&lt;br /&gt;
:: Yes, but you need to test and validate on each. Question: Do we want to support windows? [[User:Simsong|Simsong]] 21:09, 30 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I think we would be wise to design with windows support from the start; this will improve platform independence from the outset&lt;br /&gt;
:::: Agreed; I would even settle at first for being able to run against Cygwin.  Note that I don't even own or use a copy of Windows, but the vast majority of forensic investigators do. [[User:RB|RB]] 14:01, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Capibara|Rob J Meijer]] Leaning heavily on the autotools might be the way to go. I do however feel that support requirements for windows would not be essential. Being able to run from a virtual machine with the main storage mounted over cifs should however be tested and if possible tuned extensively.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] You'll need more than autotools to do native Windows support i.e. file access, UTF-16 support, wrap some basic system functions or have them available otherwise&lt;br /&gt;
::::::[[User:Capibara|Rob J Meijer]] That's exactly my point: windows support as in being able to build and run on windows natively is much more trouble than it's worth. Better to make a lean and mean autotools based build with few dependencies and little or no recursion, and to spend effort on a lean POLA design on POSIX based systems rather than on supporting building and running on non POSIX systems.&lt;br /&gt;
&lt;br /&gt;
= Name tooling =&lt;br /&gt;
&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] A name for the tooling I propose coldcut&lt;br /&gt;
:: How about 'butcher'?  ;)  [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] cleaver ( scalpel on steroids ;-) )&lt;br /&gt;
* I would like to propose Gouge or Chisel :-) [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
&lt;br /&gt;
= Requirements =&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Could we do a MoSCoW evaluation of these.&lt;br /&gt;
* AFF and EWF file images supported from scratch. ([[User:Joachim Metz|Joachim]] I would like to have raw/split raw and device access as well)&lt;br /&gt;
:: If we base our image i/o on afflib, we get all three with one interface. [[User:RB|RB]] Instead of letting the tools use afflib, better to write an afflib module for carvfs, and update the libewf module. The tool could than be oblivious of the file format. [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
:::: [[User:Simsong|Simsong]] 06:29, 3 November 2008 (UTC) The problem with using carvfs is that this adds another dependency. Do you really want to require that people install carvfs in order to run the carver? What about having the thing ported to Windows?&lt;br /&gt;
:::::: [[User:Capibara|Rob J Meijer]] I would support adding one build dependency (libcarvpath) and removing two (libewf/libaff) by moving them to a layer more suited for them (carvfs) that would possibly allow some form of file handle (as cap) based POLA design. I am a proponent of making small things that do one thing and do that one thing right, and of stacking those to do what you need. In my view that would lead ideally to the following (simplified) chain:&lt;br /&gt;
::::::* recursive [[computer forensics framework]] (ocfa/pyflag)&lt;br /&gt;
::::::** &amp;lt;b&amp;gt;The-(pola-based)-carving-tool&amp;lt;/b&amp;gt; &lt;br /&gt;
::::::*** &amp;lt;b&amp;gt;The-carving-lib&amp;lt;/b&amp;gt; working on open fd's.  &lt;br /&gt;
::::::**** libcarvpath&lt;br /&gt;
::::::***** carvfs (Over cifs/nfs-v4 on platforms that don't support Fuse).&lt;br /&gt;
::::::****** libewf&lt;br /&gt;
::::::****** libaff&lt;br /&gt;
::::::*** AppArmor (on supporting platforms)&lt;br /&gt;
::::::*** suid (on supporting platforms)&lt;br /&gt;
::::::*** iptables/ipfw (on supporting platforms)&lt;br /&gt;
:::::: As for windows support, I would imagine making carvfs run over smb would go a long way, that is, insofar as windows support is all that relevant.&lt;br /&gt;
:::::: There are two advantages to using libcarvpath and carvfs instead of libaff/libewf at this layer:&lt;br /&gt;
::::::* storage requirements for doing carving. Beyond what sleuthkit or alternatives provide I have seen many situations where carving was not done due to storage limitations.&lt;br /&gt;
::::::* File handles are like object capabilities. You can often do pretty simple POLA based implementations using file handles and something like AppArmor. POLA could IMHO be a strong weapon against the more nasty forms of anti forensics.&lt;br /&gt;
::::::Next to this, I would consider making different tools for different stages instead of one semi recursive one, and looking at how to integrate these tools into existing frameworks (ocfa/pyflag). &lt;br /&gt;
::::::Keep things simple but rigid and try to easily integrate things into existing frameworks as effectively as possible I would suggest.&lt;br /&gt;
::::::Please note, I am not proposing the lib/tool should be useless without libcarvpath, only that usage without carvfs should limit the&lt;br /&gt;
::::::supported image formats to raw images, and that libewf/libaff should be abstracted at the Fuse level or below and not at the tool level.  &lt;br /&gt;
:::::::[[User:Joachim Metz|Joachim]] do you have an idea what the performance impact of this approach would be? It might be wise to do a proof of concept for this approach first.&lt;br /&gt;
::::::::[[User:Capibara|Rob J Meijer]] It would I think depend greatly on the behavior of the carving lib/tool. Small 512 byte reads are relatively expensive, while 128 KB reads have negligible impact. Here are some numbers from ntfs-3g:  http://article.gmane.org/gmane.comp.file-systems.fuse.devel/6397/match=ntfs+3g+performance+ext3 that might be relevant. More relevant than performance might be library footprint. For example, using OCFA, we often would want to keep a few hundred images that total to tens of TB of projected storage size fuse mounted. If libewf/libaff have a big combined memory footprint in such cases, this can be a major issue for this approach.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] this layer should support multi threaded decompression of compressed image types, this speeds up IO&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] volume/partition aware layer (what about carving unpartitioned space)&lt;br /&gt;
* File system aware layer. This could be or make use of tsk-cp.&lt;br /&gt;
** By default, files are not carved. (clarify: only identified? [[User:RB|RB]]; I guess that it operates like [[Selective file dumper]] [[User:.FUF|.FUF]] 07:00, 29 October 2008 (UTC)). Alternatively, the tool could use libcarvpath and output carvpaths or create a directory with symlinks to carvpaths that point into a carvfs mountpoint [[User:Capibara|Rob J Meijer]].&lt;br /&gt;
* Plug-in architecture for identification/validation.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] support for multiple types of validators&lt;br /&gt;
*** dedicated validator&lt;br /&gt;
*** validator based on file library (i.e. we could specify/implement a file structure API for these)&lt;br /&gt;
*** configuration based validator (Can handle config files,like Revit07, to enter different file formats used by the carver.)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Moderator: Could we limit the requirements for prototype version 1 of the tool to get a working version up and running ASAP?&lt;br /&gt;
And keep discussing future options?&lt;br /&gt;
&lt;br /&gt;
I think the following set will be large enough to handle:&lt;br /&gt;
Input facilities&lt;br /&gt;
* IO support (AFF, device, EWF, RAW and split RAW)&lt;br /&gt;
:: Abstraction of input format and multi threaded decompression (spin-off code out of afflib?)&lt;br /&gt;
* Volume/Partitions support&lt;br /&gt;
:: at least for DOS based layout and GPT (spin-off code out of TSK/Photorec?)&lt;br /&gt;
* File system support&lt;br /&gt;
:: VFAT/NTFS (spin-off code out of TSK/Photorec?)&lt;br /&gt;
&lt;br /&gt;
Carving facilities&lt;br /&gt;
* File format support using plug-able validator model (use dedicated validators Photorec/Scarve and/or wrap revit07 file format as validator?)&lt;br /&gt;
* Content support using plug-able validator model (to handle text/mbox base64)&lt;br /&gt;
* File system carving support (to handle file system fragments, could be linked to file system support layer?)&lt;br /&gt;
* Basic fragment handling&lt;br /&gt;
&lt;br /&gt;
Output facilities&lt;br /&gt;
* audit/analysis/debug log&lt;br /&gt;
* extraction of result files&lt;br /&gt;
&lt;br /&gt;
==Supported File Formats==&lt;br /&gt;
* Ship with validators for:&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I think we should distinguish between file format validators and content validators&lt;br /&gt;
** JPEG&lt;br /&gt;
** PNG&lt;br /&gt;
** GIF&lt;br /&gt;
** MSOLE&lt;br /&gt;
** ZIP&lt;br /&gt;
** TAR (gz/bz2)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] For a production carver we need at least the following formats&lt;br /&gt;
** Graphical Images&lt;br /&gt;
*** JPEG (the 3 different types with JFIF/EXIF support)&lt;br /&gt;
*** PNG&lt;br /&gt;
*** GIF&lt;br /&gt;
*** BMP&lt;br /&gt;
*** TIFF&lt;br /&gt;
** Office documents&lt;br /&gt;
*** OLE2 (Word/Excel content support)&lt;br /&gt;
*** PDF&lt;br /&gt;
*** Open Office/Office 2007 (ZIP+XML)&lt;br /&gt;
:: Extension validation? AFAIK, MS Office 2007 [[DOCX]] format uses plain ZIP (or not?), and carved files will (or not?) have .zip extension instead of DOCX. Is there any way to fix this (may be using the file list in zip)? [[User:.FUF|.FUF]] 20:25, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Addition: Office 2007 also has a binary file format which is also ZIP-ed data&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Archive Files==&lt;br /&gt;
** Archive files&lt;br /&gt;
*** ZIP&lt;br /&gt;
*** 7z&lt;br /&gt;
*** gzip&lt;br /&gt;
*** bzip2&lt;br /&gt;
*** tar&lt;br /&gt;
*** RAR&lt;br /&gt;
** E-mail files&lt;br /&gt;
*** PFF (PST/OST)&lt;br /&gt;
*** MBOX (text based format, base64 content support)&lt;br /&gt;
** Audio/Video files&lt;br /&gt;
*** MPEG&lt;br /&gt;
*** MP2/MP3&lt;br /&gt;
*** AVI&lt;br /&gt;
*** ASF/WMV&lt;br /&gt;
*** QuickTime&lt;br /&gt;
*** MKV&lt;br /&gt;
** Printer spool files&lt;br /&gt;
*** EMF (if I remember correctly)&lt;br /&gt;
** Internet history files&lt;br /&gt;
*** index.dat&lt;br /&gt;
*** firefox (SQLite 3)&lt;br /&gt;
** Other files&lt;br /&gt;
*** thumbs.db&lt;br /&gt;
*** pagefile?&lt;br /&gt;
&lt;br /&gt;
==Carving Strategies==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Note to moderator could this section be merged with the carving algorithm section?&lt;br /&gt;
&lt;br /&gt;
* Simple fragment recovery carving using gap carving.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have hook in for more advanced fragment recovery?&lt;br /&gt;
* Recovering of individual ZIP sections and JPEG icons that are not sector aligned.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] I would propose a generic fragment detection and recovery&lt;br /&gt;
* Autonomous operation (some mode of operation should be completely non-interactive, requiring no human intervention to complete [[User:RB|RB]])&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] as much as possible, but allow to be overridden by the user&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] When the tool outputs files, the filenames should contain the offset in the input data (in hexadecimal?)&lt;br /&gt;
:: [[User:Mark Stam|Mark]] I really like the fact carved files are named after the physical or logical sector in which the file is found (photorec)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This naming scheme might cause duplicate name problems when extracting embedded files and extracting files from non sector aligned file systems.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export embedded files?&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export fragments separately?&lt;br /&gt;
* [[User:Mark Stam|Mark]] I personally use photorec often for carving files in the whole volume (not only unallocated clusters), so I can store information about all potential interesting files in MySQL&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] interesting, Bas Kloet and I have been discussing using information about allocated files in the recovery process, i.e. recovered fragments could be part of allocated files. Do we want to be able to extract them? Or could we rebuild the file from the fragments and the allocated files?&lt;br /&gt;
* [[User:Mark Stam|Mark]] It would also be nice if the files can be hashed immediately (MD5) so looking for them in other tools (for example Encase) is a snap&lt;br /&gt;
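The offset based naming and immediate hashing discussed above could be combined as in this small sketch. The name layout (an 'f' prefix plus a zero padded hexadecimal byte offset) is an assumption for illustration, not the photorec scheme.&lt;br /&gt;

```python
import hashlib


def carved_name(offset, extension, width=12):
    """Build a filename from the byte offset in the input data, in hexadecimal."""
    return f"f{offset:0{width}x}.{extension}"


def describe(offset, data, extension):
    """Return (filename, MD5 hex digest) for one carved file."""
    return carved_name(offset, extension), hashlib.md5(data).hexdigest()


# Example: an empty stand-in file found at byte offset 258048.
name, digest = describe(258048, b"", "jpg")
```

Using the byte offset rather than the sector number avoids collisions on non sector aligned data, but an embedded file that starts at the same offset as its container would still need an extra distinguishing suffix.&lt;br /&gt;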
&lt;br /&gt;
==Performance Requirements==&lt;br /&gt;
* Tested on 500GB-sized images. Should be able to carve a 500GB image in roughly 50% longer than it takes to read the image.&lt;br /&gt;
** Perhaps allocate a percentage budget per-validator (i.e. each validator adds N% to the carving time) [[User:RB|RB]]&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have multiple carving phases for precision/speed trade off?&lt;br /&gt;
* Parallelizable&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] tunable for different architectures&lt;br /&gt;
* Configuration:&lt;br /&gt;
** Capability to parse some existing carvers' configuration files, either on-the-fly or as a one-way converter.&lt;br /&gt;
** Disengage internal configuration structure from configuration files, create parsers that present the expected structure&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] The validator should deal with the file structure; the carving algorithm should not know anything about the file structure (as in the revit07 design)&lt;br /&gt;
**  Either extend Scalpel/Foremost syntaxes for extended features or use a tertiary syntax ([[User:Joachim Metz|Joachim]] I would prefer a derivative of the revit07 configuration syntax which already has encountered some problems of dealing with defining file structure in a configuration file)&lt;br /&gt;
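To make the 50% target and the per-validator budget above concrete, here is a small worked calculation. The read rate of 100 MB/s and the count of five validators are assumed figures chosen only for the example.&lt;br /&gt;

```python
def carving_budget(image_bytes, read_rate, overhead=0.5, validators=1):
    """Return (read seconds, total allowed seconds, seconds per validator).

    `overhead` is the allowed extra time as a fraction of the pure read time,
    split evenly over the validators.
    """
    read_seconds = image_bytes / read_rate
    total_seconds = read_seconds * (1 + overhead)
    per_validator = (total_seconds - read_seconds) / validators
    return read_seconds, total_seconds, per_validator


# Example: 500 GB image, assumed 100 MB/s sustained read rate, 5 validators.
read_s, total_s, each_s = carving_budget(500 * 10**9, 100 * 10**6, validators=5)
```

Under these assumptions the read alone takes 5000 seconds, the whole run may take 7500 seconds, and each validator may add 500 seconds, i.e. 10% of the read time.&lt;br /&gt;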
&lt;br /&gt;
==Output==&lt;br /&gt;
* Can output audit.txt file.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output database with offset analysis values i.e. for visualization tooling&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output debug log for debugging the algorithm/validation&lt;br /&gt;
* Easy integration into ascription software.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not a native speaker; what do you mean by &amp;quot;ascription software&amp;quot;?&lt;br /&gt;
::: I think this was another non-native requesting easy scriptability. [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] that makes sense ;-)&lt;br /&gt;
::::: Incorrect. Ascription software is software that determines who the owner of a file is. [[User:Simsong|Simsong]] 06:36, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= Ideas =&lt;br /&gt;
* Use TSK as much as possible. Don't carry your own FS implementation the way photorec does.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] using TSK as much as possible would not allow adding your own file system support (i.e. mobile phones, memory structures, cap files). I would propose wrapping TSK and using it as much as possible, but allowing our own FS implementations to be integrated.&lt;br /&gt;
* Extracting/carving data from [[Thumbs.db]]? I've used [[foremost]] for it with some success. [[Vinetto]] has some critical bugs :( [[User:.FUF|.FUF]] 19:18, 28 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
==Recursive Carving==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] do we want to support (let's call it) 'recursive in file carving' (for now) this is different from embedded files because there is a file system structure in the file and not just another file structure&lt;br /&gt;
* Is it just me, or do a lot of the above (and below) ideas somewhat skirt around the fact that many of us want recursive carving?  Can we bend back to that instead of discussing object particulars?  I think this can be distilled down to three requirements:&lt;br /&gt;
** Simple recursion: once an object is identified, have the ability to re-carve it for internal structures&lt;br /&gt;
** Directed recursion: the carver should be able to be directed at arbitrary blobs and told to carve it as a specified type.  This allows programmatically more simple methods of dealing with unidentifiably compressed or encrypted data.  Or filesystem fragments.&lt;br /&gt;
** Export: the ability to export an object (recognized or not) for later or external &amp;quot;recursion&amp;quot;.  Should go without saying for a carver, but...&lt;br /&gt;
:--[[User:RB|RB]] 18:45, 2 November 2008 (UTC)&lt;br /&gt;
:: [[User:Simsong|Simsong]] 06:30, 3 November 2008 (UTC) pyflag already does recursive carving. Are we just going to reimplement pyflag as a single executable?&lt;br /&gt;
&lt;br /&gt;
==Library Dependencies==&lt;br /&gt;
[[User:Capibara|Rob J Meijer]] :&lt;br /&gt;
* Use libcarvpath whenever possible and by default to avoid high storage requirements.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] For easy deployment I would not opt for making an integral part of the tool solely dependent on a single external library, or the library must be integrated in the package.&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] Integrating libraries (libtsk, libaff, libewf, libcarvpath etc.) is bad practice; autotools are your friend IMO.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not talking about integrating (shared) libraries. I'm saying that an integral part of a tool should be part of its package. Why can't the tool package contain shared or static libraries for local use? A far worse thing to do is to have a large set of dependencies, making the tool difficult to install for most users. The tool package should contain the most necessary code. afflib/libewf support could be detected by the autotools: a neat separation of functionality.&lt;br /&gt;
::: From a packager's standpoint, [[User:Joachim Metz|Joachim]]'s other libraries do a really good job of this, carrying around what they need but using a system-global version if available.  [[User:RB|RB]]&lt;br /&gt;
* libtsk&lt;br /&gt;
* libaff ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* libewf ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* posix ? : Can we depend especially on the availability of UNIX domain sockets and the possibility to use msg_accrights for passing open file handles as ocaps?&lt;br /&gt;
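The handle passing asked about in the posix item can be sketched as follows. This is a minimal illustration, not a proposed design: it assumes Python 3.9 or later on a POSIX system, where &quot;socket.send_fds&quot; and &quot;socket.recv_fds&quot; wrap the SCM_RIGHTS ancillary message (the modern successor of the msg_accrights field). The helper names &quot;send_handle&quot; and &quot;receive_handle&quot; are hypothetical, and both socket ends live in one process purely for demonstration.&lt;br /&gt;

```python
import os
import socket
import tempfile


def send_handle(sock, fd):
    """Send one open file descriptor over a UNIX domain socket."""
    # A one-byte payload is sent because ancillary data needs a message to ride on.
    socket.send_fds(sock, [b"F"], [fd])


def receive_handle(sock):
    """Receive one file descriptor; the receiver never sees a path."""
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    return fds[0]


# Demonstration inside a single process. In the design under discussion the
# two ends would belong to an unconfined opener and a confined module.
opener_end, module_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
with tempfile.TemporaryFile() as image:
    image.write(b"raw image bytes")
    image.flush()
    send_handle(opener_end, image.fileno())
    received_fd = receive_handle(module_end)
    # pread reads at an absolute offset, independent of the file position.
    first_bytes = os.pread(received_fd, 9, 0)
    os.close(received_fd)
opener_end.close()
module_end.close()
print(first_bytes)
```

The receiving side holds only the descriptor it was given, which is what makes the handle usable as a capability.&lt;br /&gt;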
&lt;br /&gt;
==Filesystem Detection==&lt;br /&gt;
* Don't stop with filesystem detection after the first match. Often if a partition is reused with a new FS and is not all that full yet, much of the old FS can still be valid. I have seen this with ext2/fat. The fact that you have identified a valid FS on a partition doesn't mean there isn't an (almost) valid second FS that would yield additional files. Identifying doubly allocated space might in some cases also be relevant.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] What you're saying is that dealing with file system fragments should be part of the carving algorithm&lt;br /&gt;
* Allow use where filesystem based carving is done by other tool, and the tool is used as second stage on (sets of) unallocated block (pseudo) files and/or non FS partition (pseudo) files.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I would not opt for this. The tool would be dependent on other tools and their data format, which makes the tool difficult to maintain. I would opt to integrate the functionality of having multiple recovery phases (stages) and allow the tooling to run the phases one after another or separately.&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] More generically, I feel a way should exist to communicate the 'left overs' a previous (non open, for example LE-only) tool left.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess if the tool is designed to handle multiple phases it should store its data somewhere. So it should be possible to convert results of such non open tooling to the format required. However I would opt to design the recovery functionality of these non-open tools into open tools. And not to limit ourselves making translators due to the design of these non-open tools.&lt;br /&gt;
* Ability to be used as a library instead of a tool. Ability to access metadata through the library, and thus the ability to set metadata from the carving modules. This would be extremely useful for integrating the project into a framework like ocfa.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess most of the code could be integrated into libraries, but I would not opt integrating tool functionality into a library&lt;br /&gt;
* [[User:Mark Stam|Mark]] I think it would be very handy to have a CSV, TSV, XML or other delimited output (log)file with information about carved files. This output file can then be stored in a database or Excel sheet (report function)&lt;br /&gt;
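A delimited log of carved files, as suggested in the last item, could look roughly like the sketch below. The column names (&quot;offset&quot;, &quot;length&quot;, &quot;type&quot;, &quot;validated&quot;, &quot;output_name&quot;) and the sample rows are assumptions made for illustration; nothing on this page fixes the actual fields.&lt;br /&gt;

```python
import csv
import io

# Hypothetical column set; the real report fields are still undecided.
FIELDS = ["offset", "length", "type", "validated", "output_name"]


def write_carve_log(records, stream):
    """Write one CSV row per carved file to an open text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Invented sample records, only to show the shape of the output.
example_records = [
    {"offset": 4096, "length": 53248, "type": "jpeg",
     "validated": True, "output_name": "00000001.jpg"},
    {"offset": 258048, "length": 28672, "type": "pdf",
     "validated": False, "output_name": "00000002.pdf"},
]

buffer = io.StringIO()
write_carve_log(example_records, buffer)
print(buffer.getvalue())
```

Passing &quot;delimiter&quot; set to a tab character to &quot;csv.DictWriter&quot; would give the TSV variant; such a file can be loaded into a database or a spreadsheet without further conversion.&lt;br /&gt;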
&lt;br /&gt;
==Anti forensics and system integrity concerns==&lt;br /&gt;
* It might be very interesting to look at the possibilities of using a multi-process style of module support and combine it with a least authority design. On platforms that support AppArmor (or similar) and uid-based firewall rules, this could make for the first true POLA (principle of least authority) based forensic tool ever. POLA-based forensics tools should make for a strong integrity guard against many anti-forensics techniques. Alternatively we could look at integrating a capability-secure language (E?) for implementation of at least the validation modules. I don't expect this idea to make it, but I hope that mentioning it might spark off less strong alternatives that at least partially address the integrity + anti-forensics problem. If we can in some way introduce POLA to a wider forensics public, other tools might also pick up on it, which would be great.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Could you give an example of how you see this in action?&lt;br /&gt;
::::[[User:Capibara|Rob J Meijer]] I see two layers where POLA could be applied. The best one would require one of the following as prerequisites:&lt;br /&gt;
::::* The libaff/libewf layer is moved to a fuse implementation (for example carvfs).&lt;br /&gt;
::::* Libewf/Libaff are updated to accept opened file handles instead of demanding to open their own files.&lt;br /&gt;
::::If one of these is fulfilled, then the tool running as some user can just have the simple task of opening the image files, starting up the 'real' tool and handing over the appropriate file handles. If the real tool runs with a restrictive AppArmor profile, and is started suid to a tool-specific user that also has its own iptables uid-based filter, then the real tool will run with least authority.&lt;br /&gt;
:::: A second alternative, if neither of the prerequisites above can be met, would be to run the modules as confined processes and have a non-confined process run as proxy for them.&lt;br /&gt;
:::: A third, probably far-fetched alternative would be to embed an object capability language in the tool and design the module interface such that modules are to be written in this ocap language.&lt;br /&gt;
::::A 4th alternative might include minorfs or plash, but I haven't given those sufficient thinking hours yet.&lt;br /&gt;
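The first layer described above (an opener that opens the image and hands only the file handle to the 'real' tool) can be illustrated with the following sketch. It assumes a POSIX system and uses a throwaway Python child process as a stand-in for the real tool; the AppArmor profile, the tool-specific user and the iptables filter are not shown. The names &quot;CHILD_CODE&quot; and &quot;run_tool_with_handle&quot; are hypothetical.&lt;br /&gt;

```python
import os
import subprocess
import sys

# Stand-in for the confined 'real' tool. It receives a descriptor number,
# never a path, and prints the first 16 bytes of the image as hex.
CHILD_CODE = (
    "import os, sys\n"
    "fd = int(sys.argv[1])\n"
    "data = os.pread(fd, 16, 0)\n"
    "sys.stdout.write(data.hex())\n"
)


def run_tool_with_handle(image_path):
    """Open the image read-only and pass only the handle to the child."""
    fd = os.open(image_path, os.O_RDONLY)
    try:
        result = subprocess.run(
            [sys.executable, "-c", CHILD_CODE, str(fd)],
            pass_fds=(fd,),       # the child inherits this descriptor only
            capture_output=True,
            text=True,
            check=True,
        )
    finally:
        os.close(fd)
    return result.stdout
```

Because the descriptor was opened read-only by the opener, the child cannot write to the image even if it is compromised, which is the integrity property the proposal is after.&lt;br /&gt;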
&lt;br /&gt;
&lt;br /&gt;
== Format syntax specification ==&lt;br /&gt;
* Carving data structures. For example, extract all TCP headers from image by defining TCP header structure and some fields (e.g. source port &amp;gt; 1024, dest port = 80). This will extract all data matching the pattern and write a file with other fields. Another example is carving INFO2 structures and URL activity records from index.dat [[User:.FUF|.FUF]] 20:51, 28 October 2008 (UTC)&lt;br /&gt;
** This has the opportunity to be extended to the concept of &amp;quot;point at blob FOO and interpret it as BAR&amp;quot;&lt;br /&gt;
.FUF added:&lt;br /&gt;
The main idea is to allow users to define structures, for example (in pascal-like form):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1: Byte = 123;&lt;br /&gt;
SomeTextLength: DWORD;&lt;br /&gt;
SomeText: string[SomeTextLength];&lt;br /&gt;
Field4: Char = 'r';&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will produce something like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1 = 123&lt;br /&gt;
SomeTextLength = 5&lt;br /&gt;
SomeText = 'abcd1'&lt;br /&gt;
Field4 = 'r'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(In text or raw forms.)&lt;br /&gt;
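&lt;br /&gt;
As a rough illustration of the idea, the structure above could be matched against raw data as in the sketch below. The field sizes are assumptions (Byte and Char as 1 byte, DWORD as a 4-byte little-endian unsigned integer), since the proposal does not state them, and the function names &quot;parse_record&quot; and &quot;carve_records&quot; are hypothetical.&lt;br /&gt;

```python
def parse_record(buf, offset=0):
    """Try to parse one record at the given offset; return None on mismatch."""
    header = buf[offset:offset + 5]
    # Field1: Byte = 123 (fixed-value constraint)
    if len(header) != 5 or header[0] != 123:
        return None
    # SomeTextLength: DWORD, assumed little-endian
    text_length = int.from_bytes(header[1:5], "little")
    text_start = offset + 5
    # SomeText: string[SomeTextLength]
    text = buf[text_start:text_start + text_length]
    # Field4: Char = 'r' (fixed-value constraint)
    field4 = buf[text_start + text_length:text_start + text_length + 1]
    if len(text) != text_length or field4 != b"r":
        return None
    return {
        "Field1": header[0],
        "SomeTextLength": text_length,
        "SomeText": text.decode("latin-1"),
        "Field4": field4.decode("latin-1"),
    }


def carve_records(buf):
    """Yield (offset, record) for every offset where the structure matches."""
    for offset in range(len(buf)):
        record = parse_record(buf, offset)
        if record is not None:
            yield offset, record


# Two junk bytes, then one record holding the text 'abcd1', then one junk byte.
sample = b"\x00\x00" + bytes([123]) + (5).to_bytes(4, "little") + b"abcd1r\xff"
for offset, record in carve_records(sample):
    print(offset, record)
```

Trying every byte offset is the simplest possible search and is only meant to show the matching step; it says nothing about validation beyond the two fixed-value fields.&lt;br /&gt;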
&lt;br /&gt;
Opinions?&lt;br /&gt;
&lt;br /&gt;
Opinion: Simple pattern identification like that may not suffice, I think Simson's original intent was not only to identify but to allow for validation routines (plugins, as the original wording was).  As such, the format syntax would need to implement a large chunk of some programming language in order to be sufficiently flexible. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
In my opinion your example is too limited. Making the revit configuration I learned you'll need a near programming language to specify some file formats.&lt;br /&gt;
A simple descriptive language is too limiting. I would also go for 2 bytes with endianness instead of using terminology like WORD and small integer; it's much clearer. The configuration also needs to deal with aspects like cardinality, required and optional structures.&lt;br /&gt;
:: This is simply data structures carving, see ideas above. Somebody (I cannot track so many changes per day) separated the original text. There is no need to count and join different structures. [[User:.FUF|.FUF]] 19:53, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This was probably me. Is the text back in its original form?&lt;br /&gt;
:::: I started it by moving your Revit07 comment to the validator/plugin section in [http://www.forensicswiki.org/index.php?title=Carver_2.0_Planning_Page&amp;amp;diff=prev&amp;amp;oldid=7583 this edit], since I was still at that point thinking operational configuration for that section, not parser configurations. [[User:RB|RB]]&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] I renamed the title to format syntax, clarity is important ;-)&lt;br /&gt;
&lt;br /&gt;
Please take a look at the revit07 format syntax specification (configuration). It's not there yet but goes a long way. Some things currently missing:&lt;br /&gt;
* bitwise alignment&lt;br /&gt;
* handling encapsulated streams (MPEG/capture files)&lt;br /&gt;
* handling content based formats (MBOX)&lt;br /&gt;
&lt;br /&gt;
=Carving algorithm =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* should we allow for multiple carving phases (runs/stages)?&lt;br /&gt;
:: I opt yes (separation of concern)&lt;br /&gt;
* should we allow for multiple carving algorithms?&lt;br /&gt;
:: I opt yes, this allows testing of different approaches&lt;br /&gt;
* Should the algorithm try to do as much in 1 run over the input data? To reduce IO?&lt;br /&gt;
:: I opt that the tool should allow for multiple and single run over the input data to minimize the IO or the CPU as bottleneck&lt;br /&gt;
* Interaction between algorithm and validators&lt;br /&gt;
** does the algorithm pass data blocks to the validators?&lt;br /&gt;
** does a validator need to maintain a state?&lt;br /&gt;
** does a validator need to revert a state?&lt;br /&gt;
** How do we deal with embedded files and content validation? Do the validators call another validator?&lt;br /&gt;
* do we use the assumption that a data block can be used by a single file (with the exception of embedded/encapsulated files)?&lt;br /&gt;
* Revit07 allows for multiple concurrent result file states to deal with fragmentation. One has the attribute of being active (the preferred) and the other passive. Do we want/need something similar? The algorithm adds blocks of input data (offsets) to these result file states.&lt;br /&gt;
** if so what info would these result file states require (type, list of input data blocks)&lt;br /&gt;
* how do we deal with file system remainders?&lt;br /&gt;
** Can we abstract them and compare them against available file system information?&lt;br /&gt;
* Do we carve file systems in files?&lt;br /&gt;
:: I opt that at least the validator uses this information&lt;br /&gt;
&lt;br /&gt;
==Carving scenarios ==&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* normal file (file structure, loose text based structure (more a content structure?))&lt;br /&gt;
* fragmented file (the file exists entirely)&lt;br /&gt;
* a file fragment (the file does not entirely exist)&lt;br /&gt;
* intertwined file&lt;br /&gt;
* encapsulated file (MPEG/network capture)&lt;br /&gt;
* embedded file (JPEG thumbnail)&lt;br /&gt;
* obfuscation ('encrypted' PFF) this also entails encryption and/or compression&lt;br /&gt;
* file system in file&lt;br /&gt;
&lt;br /&gt;
=File System Awareness =&lt;br /&gt;
==Background: Why be File System Aware?==&lt;br /&gt;
Advantages of being FS aware:&lt;br /&gt;
* You can pick up sector allocation sizes&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] do you mean file system block sizes?&lt;br /&gt;
* Some file systems may store things off sector boundaries. (ReiserFS with tail packing)&lt;br /&gt;
* Increasingly file systems have compression (NTFS compression)&lt;br /&gt;
* Carve just the sectors that are not in allocated files.&lt;br /&gt;
&lt;br /&gt;
==Tasks that would be required==&lt;br /&gt;
&lt;br /&gt;
==Discussion==&lt;br /&gt;
:: As noted above, TSK should be utilized as much as possible, particularly the filesystem-aware portion.  If we want to identify filesystems outside of its supported set, it would be more worth our time to work on implementing them there than in the carver itself.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:::: I guess this tool operates like [[Selective file dumper]] and can recover files in both ways (or not?). Recovering files by using carving can recover files in situations where sleuthkit does nothing (e.g. file on NTFS was deleted using ntfs-3g, or filesystem was destroyed or just unknown). And we should build the list of filesystems supported by carver, not by TSK. [[User:.FUF|.FUF]] 07:08, 29 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:: This tool is still in the early planning stages (requirements discovery), hence few operational details (like precise modes of operation) have been fleshed out - those will and should come later.  The justification for strictly using TSK for the filesystem-sensitive approach is simple: TSK has good filesystem APIs, and it would be foolish to create yet another standalone, incompatible implementation of filesystem(foo) when time would be better spent improving those in TSK, aiding other methods of analysis as well.  This is the same reason individuals that have implemented several other carvers are participating: de-duplication of effort.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] A design problem might be that TSK currently is a single library operating on multiple layers (storage media IO, volume/partition analysis and file system analysis). I'm not aware how easily the parts can be used separately. But I estimate that for the carver we want to use these 3 layers differently than TSK currently does.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I would like to have the carver (recovery tool) also do recovery using file allocation data or remainders of file allocation data.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] &lt;br /&gt;
I would go as far to ask you all to look beyond the carver as a tool and look from the perspective of the carver as part of the forensic investigation process. In my eyes certain information needed/acquired by the carver could be also very useful investigative information i.e. what part of a hard disk contains empty sectors.&lt;br /&gt;
&lt;br /&gt;
=Supportive tooling=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* validator (definitions) tester (detest in revit07)&lt;br /&gt;
* tool to make configuration based definitions&lt;br /&gt;
* post carving validation&lt;br /&gt;
* the carver needs to provide support for fuse mount of carved files (carvfs)&lt;br /&gt;
&lt;br /&gt;
=Testing =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* automated testing&lt;br /&gt;
* test data&lt;br /&gt;
&lt;br /&gt;
=Validator Construction=&lt;br /&gt;
Options:&lt;br /&gt;
* Write validators in C/C++&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] you mean dedicated validators&lt;br /&gt;
* Have a scripting language for writing them (python? Perl?) our own?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] use easy to embed programming languages i.e. Python or Lua&lt;br /&gt;
* Use existing programs (libjpeg?) as plug-in validators?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] define a file structure api for this&lt;br /&gt;
&lt;br /&gt;
=Existing Code that we have=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Please add any missing links&lt;br /&gt;
&lt;br /&gt;
Documentation/Articles&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* DFRWS2008 paper on carving&lt;br /&gt;
&lt;br /&gt;
Carvers&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* photorec (http://www.cgsecurity.org/wiki/PhotoRec)&lt;br /&gt;
* revit06 and revit07 (http://sourceforge.net/projects/revit/)&lt;br /&gt;
* s3/scarve&lt;br /&gt;
&lt;br /&gt;
Possible file structure validator libraries&lt;br /&gt;
* diverse existing file support libraries&lt;br /&gt;
* libole2 (inhouse experimental code of OLE2 support)&lt;br /&gt;
* libpff (alpha release for PFF (PST/OST) file support) (http://sourceforge.net/projects/libpff/)&lt;br /&gt;
&lt;br /&gt;
Input support&lt;br /&gt;
* AFF (http://www.afflib.org/)&lt;br /&gt;
* EWF (http://sourceforge.net/projects/libewf/)&lt;br /&gt;
* TSK device &amp;amp; raw &amp;amp; split raw (http://www.sleuthkit.org/)&lt;br /&gt;
&lt;br /&gt;
Volume/Partition support&lt;br /&gt;
* disktype (http://disktype.sourceforge.net/)&lt;br /&gt;
* testdisk (http://www.cgsecurity.org/wiki/TestDisk)&lt;br /&gt;
* TSK&lt;br /&gt;
&lt;br /&gt;
File system support&lt;br /&gt;
* TSK&lt;br /&gt;
* photorec FS code&lt;br /&gt;
* implementations of FS in Linux/BSD&lt;br /&gt;
&lt;br /&gt;
Content support&lt;br /&gt;
&lt;br /&gt;
Zero storage support&lt;br /&gt;
* libcarvpath ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210704 )&lt;br /&gt;
* carvfs ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210954 )&lt;br /&gt;
* tsk-cp ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=267227 )&lt;br /&gt;
* carvfsmodewf (http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=268256 )&lt;br /&gt;
POLA&lt;br /&gt;
* joe-e (java) ( http://code.google.com/p/joe-e/ )&lt;br /&gt;
* Emily (ocaml)  ( http://erights.org/download/emily/ )&lt;br /&gt;
* the E language ( http://www.erights.org/ )&lt;br /&gt;
* AppArmor&lt;br /&gt;
* iptables/ipfw&lt;br /&gt;
* minorfs ( http://polacanthus.net/minorfs.html )&lt;br /&gt;
* plash ( http://plash.beasts.org/wiki/ )&lt;br /&gt;
&lt;br /&gt;
=Implementation Timeline=&lt;br /&gt;
# gather the available resources/ideas/wishes/needs etc. (I guess we're in this phase)&lt;br /&gt;
# start discussing a high level design (in terms of algorithm, facilities, information needed)&lt;br /&gt;
## input formats facility&lt;br /&gt;
## partition/volume facility&lt;br /&gt;
## file system facility&lt;br /&gt;
## file format facility&lt;br /&gt;
## content facility&lt;br /&gt;
## how to deal with fragment detection (do the validators allow for fragment detection?)&lt;br /&gt;
## how to deal with recombination of fragments&lt;br /&gt;
## do we want multiple carving phases in light of speed/precision tradeoffs&lt;br /&gt;
# start detailing parts of the design&lt;br /&gt;
## Discuss options for a grammar driven validator?&lt;br /&gt;
## Hard-coded plug-ins?&lt;br /&gt;
## Which existing code can we use?&lt;br /&gt;
# start building/assembling parts of the tooling for a prototype&lt;br /&gt;
## Implement simple file carving with validation.&lt;br /&gt;
## Implement gap carving&lt;br /&gt;
# Initial Release&lt;br /&gt;
# Implement the ''threaded carving'' that [[User:.FUF|.FUF]] is describing above.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Shouldn't multi-threaded carving (MTC) be part of the 1st version?&lt;br /&gt;
The MT approach makes for different design decisions&lt;br /&gt;
: It is virtually impossible to turn a non-MT application into an MT application. [[User:Simsong|Simsong]] 06:37, 3 November 2008 (UTC)&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Computer_forensics_framework</id>
		<title>Computer forensics framework</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Computer_forensics_framework"/>
				<updated>2008-12-03T21:56:39Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The following more or less generic computer forensics frameworks exist: &lt;br /&gt;
&lt;br /&gt;
* [[Open Computer Forensics Architecture]]&lt;br /&gt;
* [[Pyflag]]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/PyFlag</id>
		<title>PyFlag</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/PyFlag"/>
				<updated>2008-12-03T21:49:40Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox_Software |&lt;br /&gt;
  name = PyFlag |&lt;br /&gt;
  maintainer = [[Michael Cohen]], [[David Collett]] |&lt;br /&gt;
  os = {{Linux}}, {{Web-based}} |&lt;br /&gt;
  genre = {{Analysis}} |&lt;br /&gt;
  license = {{GPL}} |&lt;br /&gt;
  website = [http://www.pyflag.net/ pyflag.net] |&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''PyFlag''' is a web-based, database-backed ''forensic and log analysis GUI'' and [[Computer forensics framework]] written in [[Python]].  PyFlag stores disk images in the [[sgzip]] format.&lt;br /&gt;
&lt;br /&gt;
=Features=&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==File Systems Understood==&lt;br /&gt;
&lt;br /&gt;
==File Search Facilities==&lt;br /&gt;
&lt;br /&gt;
* Lists allocated and unallocated files.&lt;br /&gt;
* Sorts files by type.&lt;br /&gt;
* Searches for keywords.&lt;br /&gt;
* Works with compressed zip files.&lt;br /&gt;
&lt;br /&gt;
==Historical Reconstruction==&lt;br /&gt;
&lt;br /&gt;
Can it build timelines and search by creation date?&lt;br /&gt;
* Creates a &amp;quot;case file&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
==Searching Abilities==&lt;br /&gt;
 &lt;br /&gt;
* Searches for keywords.&lt;br /&gt;
* Builds an index.&lt;br /&gt;
&lt;br /&gt;
==Hash Databases==&lt;br /&gt;
 &lt;br /&gt;
* Hashes and compares with [[Hashkeeper]] using [[MD5]].&lt;br /&gt;
&lt;br /&gt;
==Evidence Collection Features==&lt;br /&gt;
&lt;br /&gt;
=History=&lt;br /&gt;
&lt;br /&gt;
* Originally started by the [[Australian Department of Defence]], PyFlag is now hosted on [[SourceForge]].&lt;br /&gt;
&lt;br /&gt;
==License Notes==&lt;br /&gt;
&lt;br /&gt;
= External Links =&lt;br /&gt;
http://sourceforge.net/projects/pyflag/&lt;br /&gt;
&lt;br /&gt;
==External Reviews==&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Computer_forensics_framework</id>
		<title>Computer forensics framework</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Computer_forensics_framework"/>
				<updated>2008-12-03T21:47:12Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: New page: * Open Computer Forensics Architecture * pyflag&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;* [[Open Computer Forensics Architecture]]&lt;br /&gt;
* [[pyflag]]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Open_Computer_Forensics_Architecture</id>
		<title>Open Computer Forensics Architecture</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Open_Computer_Forensics_Architecture"/>
				<updated>2008-12-03T21:45:28Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The '''Open Computer Forensics Architecture''' ('''OCFA''') is a modular [[computer forensics framework]] built by the [[Dutch National Police Agency]]. The main goal is to automate the digital forensic process to speed up the investigation and give tactical [[investigator]]s direct access to the seized data through an easy to use search and browse interface.&lt;br /&gt;
&lt;br /&gt;
The architecture forms an environment where existing forensic [[tools]] and libraries can be easily plugged into the architecture and can thus be made part of the recursive extraction of data and [[metadata]] from digital evidence.&lt;br /&gt;
&lt;br /&gt;
The Open Computer Forensics Architecture aims to be highly modular, robust, fault tolerant, recursive and scalable in order to be usable in large investigations that span numerous terabytes of evidence data and cover hundreds of evidence items.&lt;br /&gt;
&lt;br /&gt;
For reasons of fault tolerance, modules in OCFA are processes. The [[OcfaLib API]] makes it possible and relatively easy to build an OCFA module out of any data processing library or tool. OCFA comes with numerous such modules that are mostly wrappers around libraries like [[libmagic]] or tools such as those found in the [[Sleuthkit]].&lt;br /&gt;
&lt;br /&gt;
Communication between modules within OCFA is governed by a two-layered communication infrastructure provided by OCFA. At the lowest layer is a messaging system with at its center the OCFA Anycast Relay. The Anycast Relay provides the facilities of module crash resistance, distributed processing load balancing and flow control.&lt;br /&gt;
At a higher level of communication, the OCFA XML Router provides for the routing of individual pieces of evidence through the most appropriate tool chain for its particular type of content. &lt;br /&gt;
&lt;br /&gt;
Although OCFA contains a rudimentary user interface, most of its power is in the backend architecture.&lt;br /&gt;
The final module in the tool chain of any evidence will be the OCFA Data Store Module. This module&lt;br /&gt;
processes the evidence XML (which contains all of the metadata for the evidence data) and stores relevant parts into a PostgreSQL database. Extending the Apache-based user interface with interfaces for your own case-bound queries&lt;br /&gt;
is something that should prove very useful in most investigations.&lt;br /&gt;
&lt;br /&gt;
For more information consult [http://sourceforge.net/projects/ocfa/  sourceforge.net/projects/ocfa/ ] .&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/OcfaLib_API</id>
		<title>OcfaLib API</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/OcfaLib_API"/>
				<updated>2008-12-03T20:34:27Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The OcfaLib API is a C++ API that is meant for usage by modules in the [[Open Computer Forensics Architecture]].&lt;br /&gt;
The C++ class ocfa::module::OcfaModule defines the base class from which all OCFA modules are derived. Depending on the type of module, the ocfa::facade namespace provides different convenient subclasses.&lt;br /&gt;
The OcfaModule class defines a virtual function named ProcessEvidence that the final module class must implement.&lt;br /&gt;
A typical module implementation implements a subclass by implementing a constructor and a ProcessEvidence method.&lt;br /&gt;
The main function of the module program will create an instance of the module class and call the run method on the object. The module will connect and register itself to the OCFA Anycast Relay and thus be connected into the Open Computer Forensics Architecture. For each piece of evidence data the module receives, the ProcessEvidence method will get invoked. Depending on the type of facade used as base class, the implementation of the ProcessEvidence method can:&lt;br /&gt;
&lt;br /&gt;
* Gain read access to the input evidence data.&lt;br /&gt;
* Use its own private workdir for derived and temporary data &lt;br /&gt;
* Derive evidence from the input evidence.&lt;br /&gt;
* Access meta data created by other modules.   &lt;br /&gt;
* Add additional metadata to the input evidence or child evidences.&lt;br /&gt;
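&lt;br /&gt;
The module life cycle described above can be pictured with the following sketch. It is a Python analogy only and not the OcfaLib API, which is C++: the classes &quot;Evidence&quot;, &quot;Module&quot; and &quot;LineSplitter&quot; and their members are hypothetical, and a plain list stands in for the messages that would arrive through the OCFA Anycast Relay.&lt;br /&gt;

```python
class Evidence:
    """Hypothetical stand-in for one piece of evidence and its metadata."""

    def __init__(self, data):
        self.data = data
        self.metadata = {}
        self.children = []


class Module:
    """Hypothetical analog of a module base class (not the real OcfaModule)."""

    def __init__(self, name):
        self.name = name

    def process_evidence(self, evidence):
        # The final module class must implement this method.
        raise NotImplementedError

    def run(self, incoming):
        # A real module would register with the relay and wait for messages;
        # here any iterable of evidence items plays that role.
        for evidence in incoming:
            self.process_evidence(evidence)


class LineSplitter(Module):
    """Example module: a constructor plus a process_evidence method."""

    def __init__(self):
        super().__init__("linesplitter")

    def process_evidence(self, evidence):
        lines = evidence.data.splitlines()
        # Add metadata to the input evidence.
        evidence.metadata["line_count"] = len(lines)
        # Derive child evidence from the input evidence.
        for line in lines:
            evidence.children.append(Evidence(line))


# The 'main function': create the module instance and call run on it.
item = Evidence(b"first\nsecond\n")
LineSplitter().run([item])
print(item.metadata, [child.data for child in item.children])
```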
&lt;br /&gt;
For a simple example of a module that derives data from data look at the tar module in OcfaModules/dissector/tar.&lt;br /&gt;
For a simple example of a module that extracts metadata from data look at the pgp module in OcfaModule/extractor/pgp.&lt;br /&gt;
&lt;br /&gt;
Next to simple OCFA modules, the ocfa::treegraph namespace provides an additional sub-API for building treegraph loadable modules, which can be used to create modules that map evidence data to a tree graph of data and metadata.&lt;br /&gt;
&lt;br /&gt;
The only API available currently is the C++ API.&lt;br /&gt;
Work is currently being done to also create a Java API and plans exist to build a Perl API to OCFA in the future.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/OcfaLib_API</id>
		<title>OcfaLib API</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/OcfaLib_API"/>
				<updated>2008-12-03T20:22:44Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: New page: The OcfaLib API is a C++ API that is meant for usage by modules in the Open Computer Forensics Architecture. The C++ class ocfa::module::OcfaModule defines the base class of what all O...&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The OcfaLib API is a C++ API that is meant for usage by modules in the [[Open Computer Forensics Architecture]].&lt;br /&gt;
The C++ class ocfa::module::OcfaModule defines the base class from which all OCFA modules are derived. Depending on the type of module, the ocfa::facade namespace provides different convenient subclasses.&lt;br /&gt;
The OcfaModule class defines a virtual function named ProcessEvidence that the final module class must implement.&lt;br /&gt;
A typical module implementation implements a subclass by implementing a constructor and a ProcessEvidence method.&lt;br /&gt;
The main function of the module program will create an instance of the module class and call the run method on the object. The module will connect and register itself to the OCFA Anycast Relay and thus be connected into the Open Computer Forensics Architecture. For each piece of evidence data the module receives, the ProcessEvidence method will get invoked. The implementation of the ProcessEvidence method can depending on the type of facade used as baseclass can:&lt;br /&gt;
&lt;br /&gt;
* Gain read access to the input evidence data.&lt;br /&gt;
* Use its own private workdir for derived and temporary data.&lt;br /&gt;
* Derive evidence from the input evidence.&lt;br /&gt;
* Access metadata created by other modules.&lt;br /&gt;
* Add additional metadata to the input evidence or child evidences.&lt;br /&gt;
&lt;br /&gt;
The only API currently available is the C++ API.&lt;br /&gt;
Work is under way on a Java API, and there are plans to build a Perl API for OCFA in the future.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Open_Computer_Forensics_Architecture</id>
		<title>Open Computer Forensics Architecture</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Open_Computer_Forensics_Architecture"/>
				<updated>2008-12-03T19:03:10Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The '''Open Computer Forensics Architecture''' ('''OCFA''') is a modular computer forensics framework built by the [[Dutch National Police Agency]]. The main goal is to automate the digital forensic process to speed up the investigation and give tactical [[investigator]]s direct access to the seized data through an easy to use search and browse interface.&lt;br /&gt;
&lt;br /&gt;
The architecture forms an environment where existing forensic [[tools]] and libraries can be easily plugged into the architecture and can thus be made part of the recursive extraction of data and [[metadata]] from digital evidence.&lt;br /&gt;
&lt;br /&gt;
The Open Computer Forensics Architecture aims to be highly modular, robust, fault tolerant, recursive and scalable in order to be usable in large investigations that span numerous terabytes of evidence data and cover hundreds of evidence items.&lt;br /&gt;
&lt;br /&gt;
For reasons of fault tolerance, modules in OCFA run as separate processes. The [[OcfaLib API]] makes it possible and relatively easy to build an OCFA module out of any data processing library or tool. OCFA comes with numerous such modules, most of which are wrappers around libraries like [[libmagic]] or tools such as those found in the [[Sleuthkit]].&lt;br /&gt;
&lt;br /&gt;
Communication between modules within OCFA is governed by a two-layered communication infrastructure provided by OCFA. At the lowest layer is a messaging system with the OCFA Anycast Relay at its center. The Anycast Relay provides the facilities of module crash resistance, distributed processing load balancing and flow control.&lt;br /&gt;
At a higher level of communication, the OCFA XML Router provides for the routing of individual pieces of evidence through the most appropriate tool chain for its particular type of content. &lt;br /&gt;
&lt;br /&gt;
Although OCFA contains a rudimentary user interface, most of its power is in the backend architecture.&lt;br /&gt;
The final module in the tool chain of any evidence will be the OCFA Data Store Module. This module&lt;br /&gt;
processes the evidence XML (which contains all of the metadata for the evidence data) and stores relevant parts in a PostgreSQL database. Extending the Apache-based user interface with interfaces for your own case-bound queries&lt;br /&gt;
is something that should prove very useful in most investigations.&lt;br /&gt;
&lt;br /&gt;
For more information consult [http://sourceforge.net/projects/ocfa/  sourceforge.net/projects/ocfa/ ] .&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Talk:Carver_2.0_Planning_Page</id>
		<title>Talk:Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Talk:Carver_2.0_Planning_Page"/>
				<updated>2008-11-24T21:40:01Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* POLA */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;License: have we even discussed a license yet?  Who chose it?  I'm not terribly opposed to a 3-clause BSD, but...? - [[User:RB|RB]] 00:39, 30 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I prefer the LPGL; it restricts the usage of the code somewhat more. When it's integrated in other (closed source) tooling which is published, they must publish that the tool uses this code.&lt;br /&gt;
&lt;br /&gt;
:: LGPL?&lt;br /&gt;
:: [[User:.FUF|.FUF]] 19:40, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] GNU Library or &amp;quot;Lesser&amp;quot; General Public License (LGPL) (http://www.opensource.org/licenses/alphabetical)&lt;br /&gt;
:::: ''Joachim I prefer the LPGL'' :) [[User:.FUF|.FUF]] 19:51, 31 October 2008 (UTC)&lt;br /&gt;
:::::: [[User:Joachim Metz|Joachim]] To quote Homer Simpson &amp;quot;Doh!&amp;quot;&lt;br /&gt;
:: Agreed.  I sit on the fence between BSD and GPL: the business half of me agrees that open licensing should place as few restrictions or qualifications as possible, whereas the idealist/OSS side wants to ensure the project's freedom.  The LGPL is a more reasonable balance, encouraging widespread use but ensuring modifications' freedom. [[User:RB|RB]] 16:59, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
== Consolidation ==&lt;br /&gt;
&lt;br /&gt;
We've got a '''lot''' of good ideas here, but in the interest of not stepping on anyone's toes, it's getting rather disjointed and hard to read.  Is anyone willing to (or allow me to) try to consolidate them into some sort of coherency?  I'd like at least one of the admins ([[User:.FUF|.FUF]] or [[User:Simsong|Simsong]]) to concur before anyone moves forward.  I know the wiki way is to just let it grow, but even watching each addition I'm starting to have trouble visualizing where we are. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: That's a good idea, but it is important to consolidate without losing the ideas and opinions. I think it's better for this page to enter some &amp;quot;stable&amp;quot; branch. And then we'll move to the next phase. What do you think, [[User:Simsong|Simsong]]? [[User:.FUF|.FUF]] 18:12, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
::: One option could be to break each major section into its own page so it can be properly discussed without clutter, then transclude each to this page.  A dedicated namespace would probably be overkill, but since we're throwing out ideas it should at least be mentioned.  --[[User:RB|RB]] 19:03, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] Separate the parts into topics. Have a discussion and an informational part per topic.&lt;br /&gt;
== POLA ==&lt;br /&gt;
&lt;br /&gt;
[[User:Capibara|Capibara]] :Just noticed a presentation by Simson when looking for 'usable security' on Google that included notes on designation and access (authority). From this I would gather that Simson has something of a background in [http://en.wikipedia.org/wiki/Capability-based_security capability security theory]. This makes me feel we might have sufficient ground to discuss 'open file handles as capabilities' at the library level. [[AppArmor]] allows instances of executables to be confined to least privilege. Unix domain sockets allow open file handles to be passed around as capabilities. The combination of these two makes it possible to use [http://en.wikipedia.org/wiki/Object_capability object capability]-based least authority at the process level of granularity. For this to work with libraries such as libaff, libewf and libtsk, these libraries must not insist on opening their own files. At the moment this rules out using libaff or libewf directly. [http://ocfa.sourceforge.net/libcarvpath/ CarvFs] can provide a way around this, but it may be desirable to not have to use CarvFs in some cases. Given that the authors of both libaff and libewf are quite active here, it may be worthwhile to discuss the possibility of extending the API of these libs so that they don't demand to open their own files, but could also be used by a confined carving tool that would get open file handles passed to it over a Unix domain socket.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Talk:Carver_2.0_Planning_Page</id>
		<title>Talk:Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Talk:Carver_2.0_Planning_Page"/>
				<updated>2008-11-24T21:36:28Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* POLA */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;License: have we even discussed a license yet?  Who chose it?  I'm not terribly opposed to a 3-clause BSD, but...? - [[User:RB|RB]] 00:39, 30 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I prefer the LPGL; it restricts the usage of the code somewhat more. When it's integrated in other (closed source) tooling which is published, they must publish that the tool uses this code.&lt;br /&gt;
&lt;br /&gt;
:: LGPL?&lt;br /&gt;
:: [[User:.FUF|.FUF]] 19:40, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] GNU Library or &amp;quot;Lesser&amp;quot; General Public License (LGPL) (http://www.opensource.org/licenses/alphabetical)&lt;br /&gt;
:::: ''Joachim I prefer the LPGL'' :) [[User:.FUF|.FUF]] 19:51, 31 October 2008 (UTC)&lt;br /&gt;
:::::: [[User:Joachim Metz|Joachim]] To quote Homer Simpson &amp;quot;Doh!&amp;quot;&lt;br /&gt;
:: Agreed.  I sit on the fence between BSD and GPL: the business half of me agrees that open licensing should place as few restrictions or qualifications as possible, whereas the idealist/OSS side wants to ensure the project's freedom.  The LGPL is a more reasonable balance, encouraging widespread use but ensuring modifications' freedom. [[User:RB|RB]] 16:59, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
== Consolidation ==&lt;br /&gt;
&lt;br /&gt;
We've got a '''lot''' of good ideas here, but in the interest of not stepping on anyone's toes, it's getting rather disjointed and hard to read.  Is anyone willing to (or allow me to) try to consolidate them into some sort of coherency?  I'd like at least one of the admins ([[User:.FUF|.FUF]] or [[User:Simsong|Simsong]]) to concur before anyone moves forward.  I know the wiki way is to just let it grow, but even watching each addition I'm starting to have trouble visualizing where we are. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: That's a good idea, but it is important to consolidate without losing the ideas and opinions. I think it's better for this page to enter some &amp;quot;stable&amp;quot; branch. And then we'll move to the next phase. What do you think, [[User:Simsong|Simsong]]? [[User:.FUF|.FUF]] 18:12, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
::: One option could be to break each major section into its own page so it can be properly discussed without clutter, then transclude each to this page.  A dedicated namespace would probably be overkill, but since we're throwing out ideas it should at least be mentioned.  --[[User:RB|RB]] 19:03, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] Separate the parts into topics. Have a discussion and an informational part per topic.&lt;br /&gt;
== POLA ==&lt;br /&gt;
&lt;br /&gt;
[[User:Capibara|Capibara]] :Just noticed a presentation by Simson when looking for 'usable security' on Google that included notes on designation and access (authority). From this I would gather that Simson has something of a background in [[capability security theory]]. This makes me feel we might have sufficient ground to discuss 'open file handles as capabilities' at the library level. [[AppArmor]] allows instances of executables to be confined to least privilege. Unix domain sockets allow open file handles to be passed around as capabilities. The combination of these two makes it possible to use [[object capability]]-based least authority at the process level of granularity. For this to work with libraries such as libaff, libewf and libtsk, these libraries must not insist on opening their own files. At the moment this rules out using libaff or libewf directly. [http://ocfa.sourceforge.net/libcarvpath/ CarvFs] can provide a way around this, but it may be desirable to not have to use CarvFs in some cases. Given that the authors of both libaff and libewf are quite active here, it may be worthwhile to discuss the possibility of extending the API of these libs so that they don't demand to open their own files, but could also be used by a confined carving tool that would get open file handles passed to it over a Unix domain socket.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/AppArmor</id>
		<title>AppArmor</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/AppArmor"/>
				<updated>2008-11-24T21:34:07Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: New page: AppArmor, found in Suse and Ubuntu Linux, is a system for letting applications run according to the principle of least privilege. This can be a useful for running forensic or other...&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;AppArmor, found in [[Suse]] and [[Ubuntu]] Linux, is a system for letting applications run according to the principle of least privilege.&lt;br /&gt;
This can be useful for running forensic or other tools of questionable stability (which according to many means almost all tools) on data&lt;br /&gt;
that may have been constructed with malicious intent against the specific tool, mitigating the potential damage to the forensic process.&lt;br /&gt;
Although AppArmor works with static profiles, it does not intervene in the passing of open files over Unix domain sockets, which can thus be used&lt;br /&gt;
to dynamically communicate authority to an (untrusted) AppArmor-confined tool.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Talk:Carver_2.0_Planning_Page</id>
		<title>Talk:Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Talk:Carver_2.0_Planning_Page"/>
				<updated>2008-11-24T21:21:40Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* POLA */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;License: have we even discussed a license yet?  Who chose it?  I'm not terribly opposed to a 3-clause BSD, but...? - [[User:RB|RB]] 00:39, 30 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I prefer the LPGL; it restricts the usage of the code somewhat more. When it's integrated in other (closed source) tooling which is published, they must publish that the tool uses this code.&lt;br /&gt;
&lt;br /&gt;
:: LGPL?&lt;br /&gt;
:: [[User:.FUF|.FUF]] 19:40, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] GNU Library or &amp;quot;Lesser&amp;quot; General Public License (LGPL) (http://www.opensource.org/licenses/alphabetical)&lt;br /&gt;
:::: ''Joachim I prefer the LPGL'' :) [[User:.FUF|.FUF]] 19:51, 31 October 2008 (UTC)&lt;br /&gt;
:::::: [[User:Joachim Metz|Joachim]] To quote Homer Simpson &amp;quot;Doh!&amp;quot;&lt;br /&gt;
:: Agreed.  I sit on the fence between BSD and GPL: the business half of me agrees that open licensing should place as few restrictions or qualifications as possible, whereas the idealist/OSS side wants to ensure the project's freedom.  The LGPL is a more reasonable balance, encouraging widespread use but ensuring modifications' freedom. [[User:RB|RB]] 16:59, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
== Consolidation ==&lt;br /&gt;
&lt;br /&gt;
We've got a '''lot''' of good ideas here, but in the interest of not stepping on anyone's toes, it's getting rather disjointed and hard to read.  Is anyone willing to (or allow me to) try to consolidate them into some sort of coherency?  I'd like at least one of the admins ([[User:.FUF|.FUF]] or [[User:Simsong|Simsong]]) to concur before anyone moves forward.  I know the wiki way is to just let it grow, but even watching each addition I'm starting to have trouble visualizing where we are. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: That's a good idea, but it is important to consolidate without losing the ideas and opinions. I think it's better for this page to enter some &amp;quot;stable&amp;quot; branch. And then we'll move to the next phase. What do you think, [[User:Simsong|Simsong]]? [[User:.FUF|.FUF]] 18:12, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
::: One option could be to break each major section into its own page so it can be properly discussed without clutter, then transclude each to this page.  A dedicated namespace would probably be overkill, but since we're throwing out ideas it should at least be mentioned.  --[[User:RB|RB]] 19:03, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] Separate the parts into topics. Have a discussion and an informational part per topic.&lt;br /&gt;
== POLA ==&lt;br /&gt;
&lt;br /&gt;
[[User:Capibara|Capibara]] :Just noticed a presentation by Simson when looking for 'usable security' on Google that included notes on designation and access (authority). From this I would gather that Simson has something of a background in [[capability security theory]]. This makes me feel we might have sufficient ground to discuss 'open file handles as capabilities' at the library level. [[AppArmor]] allows instances of executables to be confined to least privilege. Unix domain sockets allow open file handles to be passed around as capabilities. The combination of these two makes it possible to use [[object capability]]-based least authority at the process level of granularity. For this to work with libraries such as libaff, libewf and libtsk, these libraries must not insist on opening their own files. At the moment this rules out using libaff or libewf directly. [[CarvFs]] can provide a way around this, but it may be desirable to not have to use CarvFs in some cases. Given that the authors of both libaff and libewf are quite active here, it may be worthwhile to discuss the possibility of extending the API of these libs so that they don't demand to open their own files, but could also be used by a confined carving tool that would get open file handles passed to it over a Unix domain socket.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Talk:Carver_2.0_Planning_Page</id>
		<title>Talk:Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Talk:Carver_2.0_Planning_Page"/>
				<updated>2008-11-24T08:48:40Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;License: have we even discussed a license yet?  Who chose it?  I'm not terribly opposed to a 3-clause BSD, but...? - [[User:RB|RB]] 00:39, 30 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I prefer the LPGL; it restricts the usage of the code somewhat more. When it's integrated in other (closed source) tooling which is published, they must publish that the tool uses this code.&lt;br /&gt;
&lt;br /&gt;
:: LGPL?&lt;br /&gt;
:: [[User:.FUF|.FUF]] 19:40, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] GNU Library or &amp;quot;Lesser&amp;quot; General Public License (LGPL) (http://www.opensource.org/licenses/alphabetical)&lt;br /&gt;
:::: ''Joachim I prefer the LPGL'' :) [[User:.FUF|.FUF]] 19:51, 31 October 2008 (UTC)&lt;br /&gt;
:::::: [[User:Joachim Metz|Joachim]] To quote Homer Simpson &amp;quot;Doh!&amp;quot;&lt;br /&gt;
:: Agreed.  I sit on the fence between BSD and GPL: the business half of me agrees that open licensing should place as few restrictions or qualifications as possible, whereas the idealist/OSS side wants to ensure the project's freedom.  The LGPL is a more reasonable balance, encouraging widespread use but ensuring modifications' freedom. [[User:RB|RB]] 16:59, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
== Consolidation ==&lt;br /&gt;
&lt;br /&gt;
We've got a '''lot''' of good ideas here, but in the interest of not stepping on anyone's toes, it's getting rather disjointed and hard to read.  Is anyone willing to (or allow me to) try to consolidate them into some sort of coherency?  I'd like at least one of the admins ([[User:.FUF|.FUF]] or [[User:Simsong|Simsong]]) to concur before anyone moves forward.  I know the wiki way is to just let it grow, but even watching each addition I'm starting to have trouble visualizing where we are. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: That's a good idea, but it is important to consolidate without losing the ideas and opinions. I think it's better for this page to enter some &amp;quot;stable&amp;quot; branch. And then we'll move to the next phase. What do you think, [[User:Simsong|Simsong]]? [[User:.FUF|.FUF]] 18:12, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
::: One option could be to break each major section into its own page so it can be properly discussed without clutter, then transclude each to this page.  A dedicated namespace would probably be overkill, but since we're throwing out ideas it should at least be mentioned.  --[[User:RB|RB]] 19:03, 1 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] Separate the parts into topics. Have a discussion and an informational part per topic.&lt;br /&gt;
== POLA ==&lt;br /&gt;
&lt;br /&gt;
[[User:Capibara|Capibara]] :Just noticed a presentation by Simson when looking for 'usable security' on Google that included notes on designation and authorization. From this I would gather that Simson has something of a background in [[capability security theory]]. This makes me feel we might have sufficient ground to discuss 'open file handles as capabilities' at the library level. [[AppArmor]] allows instances of executables to be confined to least privilege. Unix domain sockets allow open file handles to be passed around as capabilities. The combination of these two makes it possible to use [[object capability]]-based least authority at the process level of granularity. For this to work with libraries such as libaff, libewf and libtsk, these libraries must not insist on opening their own files. At the moment this rules out using libaff or libewf directly. [[CarvFs]] can provide a way around this, but it may be desirable to not have to use CarvFs in some cases. Given that the authors of both libaff and libewf are quite active here, it may be worthwhile to discuss the possibility of extending the API of these libs so that they don't demand to open their own files, but could also be used by a confined carving tool that would get open file handles passed to it over a Unix domain socket.&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Anti-forensic_techniques</id>
		<title>Anti-forensic techniques</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Anti-forensic_techniques"/>
				<updated>2008-11-06T05:12:51Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;'''Anti-forensic techniques''' try to frustrate [[forensic investigator]]s and their [[techniques]].&lt;br /&gt;
&lt;br /&gt;
This can include refusing to run when [[debugging]] mode is enabled, refusing to run when running inside of a [[virtual machine]], or deliberately overwriting data. Although some anti-forensic tools have legitimate purposes, such as overwriting sensitive data that shouldn't fall into the wrong hands, like any [[Tools|tool]] they can be abused.&lt;br /&gt;
&lt;br /&gt;
=Traditional anti-forensics=&lt;br /&gt;
==Overwriting Data and Metadata==&lt;br /&gt;
=== Secure Data Deletion ===&lt;br /&gt;
&lt;br /&gt;
Data can be [[Secure data deletion|securely deleted]] so that it cannot be restored with forensic methods.&lt;br /&gt;
&lt;br /&gt;
Overwriting programs typically operate in one of three modes:&lt;br /&gt;
# The program can overwrite the entire media.&lt;br /&gt;
# The program can attempt to overwrite individual files. This task is complicated by journaling file systems: the file itself may be overwritten, but portions may be left in the journal.&lt;br /&gt;
# The program can attempt to overwrite files that were previously “deleted” but left on the drive. Programs typically do this by creating one or more files on the media and then writing to these files until no free space remains, taking special measures to erase small files — for example, files that exist entirely within the Windows Master File Table of an NTFS partition (Garfinkel and Malan, 2005).&lt;br /&gt;
&lt;br /&gt;
Programs employ a variety of techniques to overwrite data. Apple’s Disk Utility allows data to be overwritten with a single pass of NULL bytes, with 7 passes of random data, or with 35 passes of data. Microsoft’s cipher.exe writes a pass of zeros, a pass of FFs, and a pass of random data, in compliance with DoD standard 5220.22-M (US DoD, 1995). In 1996 Gutmann asserted that it might be possible to recover overwritten data and proposed a 35-pass approach for assured sanitization (Gutmann 1996). However, a single overwriting pass is now viewed as sufficient for [[Sanitizing Tools|sanitizing]] data from ATA drives with capacities over 15 GB that were manufactured after 2001 (NIST 2006).&lt;br /&gt;
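&lt;br /&gt;
As a rough illustration of the pass patterns described above, the following Python sketch overwrites an existing file in place with a pass of zeros, a pass of 0xFF bytes and a pass of random data. The function name overwrite_passes and the chunk size are assumptions made for this example; it is not the implementation of any tool named here. On journaling file systems or flash media, overwriting through the file system gives no guarantee that the old blocks are actually replaced.&lt;br /&gt;

```python
import os


def overwrite_passes(path, chunk_size=65536):
    """Overwrite a file in place: zeros, then 0xFF bytes, then random data.

    Hypothetical helper for illustration only; returns the file size in bytes.
    """
    size = os.path.getsize(path)
    # Each pass is a function mapping a byte count to the data for that chunk.
    passes = [
        lambda n: b"\x00" * n,
        lambda n: b"\xff" * n,
        os.urandom,
    ]
    with open(path, "r+b") as handle:
        for make_chunk in passes:
            handle.seek(0)
            remaining = size
            while remaining != 0:
                n = min(chunk_size, remaining)
                handle.write(make_chunk(n))
                remaining -= n
            # Push this pass to the device before starting the next one.
            handle.flush()
            os.fsync(handle.fileno())
    return size
```

The file keeps its original length, so only the content is replaced; the file name, timestamps and any slack space are left untouched by this sketch.&lt;br /&gt;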
&lt;br /&gt;
Be aware that software 'data destroyers' may not necessarily do what their promotional blurb states. A common mistake is overlooking how the underlying file system actually stores files. For instance, a 'wipe drive' application that writes a series of random values across unallocated space on the hard disk may not take into account the slack space at the end of allocated data blocks, leaving a large portion of old data recoverable. This is very handy for a forensic analyst, but not so handy for IT managers.&lt;br /&gt;
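&lt;br /&gt;
The slack space mentioned above can be quantified with simple arithmetic: a file occupies whole allocation blocks, so the unused tail of its last block may still hold older data. The sketch below assumes a hypothetical block size of 4096 bytes; real file systems may use other sizes and may store very small files elsewhere.&lt;br /&gt;

```python
def slack_bytes(file_size, block_size=4096):
    """Return the number of unused bytes in the last block allocated to a file."""
    used_in_last_block = file_size % block_size
    if used_in_last_block == 0:
        return 0  # empty file, or the file ends exactly on a block boundary
    return block_size - used_in_last_block


# A 10000 byte file on 4096 byte blocks occupies 3 blocks (12288 bytes),
# leaving 2288 bytes of slack that a wiper of unallocated space never touches.
print(slack_bytes(10000))
```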
&lt;br /&gt;
===Overwriting Metadata===&lt;br /&gt;
If the examiner knows when an attacker had access to a Windows, Mac or Unix system, it is frequently possible to determine which files the attacker accessed, by examining file “access” times for every file on the system. Some computer forensic tools (CFTs) can prepare a “timeline” of the attacker’s actions by sorting all of the computer’s timestamps in chronological order. Although an attacker could wipe the contents of the media, this action itself might attract attention. Instead, the attacker might hide her tracks by overwriting the access times themselves so that the timeline could not be reliably constructed.&lt;br /&gt;
&lt;br /&gt;
For example, [[Timestomp]] will overwrite [[NTFS]] “create,” “modify,” “access,” and “change” timestamps ([[Metasploit]] 2006). [[The Defiler’s Toolkit]] can overwrite inode timestamps and deleted directory entries on many Unix systems; timestamps on allocated files can also be modified using the Unix touch command ([[The Grugq]] 2003).&lt;br /&gt;
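&lt;br /&gt;
For the Unix touch case, the effect can be sketched in Python with os.utime, which sets a file's access and modification times to arbitrary values much as touch does. The helper name set_times is an assumption for this example. On Unix systems the inode change time (ctime) is updated by the call itself and cannot be set this way, which is one reason such edits can still leave traces.&lt;br /&gt;

```python
import os


def set_times(path, atime, mtime):
    """Set access and modification times (seconds since the epoch) on a file.

    Hypothetical helper; returns the resulting (st_atime, st_mtime) pair.
    """
    os.utime(path, (atime, mtime))
    info = os.stat(path)
    return info.st_atime, info.st_mtime
```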
&lt;br /&gt;
=== Preventing Data Creation ===&lt;br /&gt;
One approach is to prevent the creation of certain data in the first place. Data that was never there obviously cannot be restored with forensic methods.&lt;br /&gt;
&lt;br /&gt;
For example, a partition can be mounted read-only or accessed through the raw device to prevent the file access times from being updated. The Windows registry key HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\NtfsDisableLastAccessUpdate can be set to “1” to disable updating of the last-accessed timestamp; this setting is the default under Windows Vista (Microsoft 2006).&lt;br /&gt;
&lt;br /&gt;
==Cryptography, Steganography, and other Data Hiding Approaches==&lt;br /&gt;
=== Encrypted Data ===&lt;br /&gt;
Cryptographic file systems transparently encrypt data when it is written to the disk and decrypt data when it is read back, making the data opaque to any attacker (or CFT) that does not have the key. These file systems are now readily available for Windows, Mac OS, and Linux. The key can be protected with a passphrase or stored on an auxiliary device such as a USB token. If there is no copy of the key, intentionally destroying the key makes the data stored on the media inaccessible (Boneh and Lipton, 1996). Even if the cryptographic system lacks an intentional sanitization command or “self-destruct,” cryptography can still be a potent barrier to forensic analysis if the cryptographic key is unknown to the examiner. &lt;br /&gt;
&lt;br /&gt;
Cryptography can also be used at the application level. For example, Microsoft Word can be configured to encrypt the contents of a document by specifying that the document has a “password to open.” Although older versions of Microsoft Word encrypted documents with a 40-bit key that can be cracked with commercial tools, modern versions can optionally use 128-bit encryption, which is impractical to crack if a secure passphrase is used.&lt;br /&gt;
=== Encrypted Network Protocols ===&lt;br /&gt;
Network traffic can likewise be encrypted to protect its content from forensic analysis. Cryptographic encapsulation protocols such as [[SSL forensics|SSL]] and SSH only protect the content of the traffic. Protecting against traffic analysis requires the use of intermediaries. Onion Routing (Goldschlag, Reed and Syverson, 1999) combines both approaches with multiple layers of encryption, so that no intermediary knows both ends of the communication and the plaintext content.&lt;br /&gt;
&lt;br /&gt;
''More information: [[Tor]] and [[VPN]].''&lt;br /&gt;
&lt;br /&gt;
===Program Packers===&lt;br /&gt;
Packers are commonly used by attackers so that attack tools will not be subject to reverse engineering or detection by scanning. Packers such as PECompact (Bitsum 2006) and Burneye (Vrba 2004) will take a second program, compress and/or encrypt it, and wrap it with a suitable extractor. Packers can also incorporate active protection against debugging or reverse engineering techniques. For example, Shiva will exit if its process is being traced; if the process is not being traced, it will create a second process, and the two processes will then trace each other, since each process on a Unix system may only be traced by one other process (Mehta and Clowes, 2003).&lt;br /&gt;
&lt;br /&gt;
Packed programs that require a password in order to be run can be as strong as their encryption and password. However, the programs are vulnerable at runtime. Burndump is a loadable kernel module (LKM) that automatically detects when a Burneye-protected file is run, waits for the program to be decrypted, and then writes the raw, unprotected binary to another location (ByteRage 2002). Packed programs are also vulnerable to static analysis if no password is required (Eagle 2003).&lt;br /&gt;
=== Steganography ===&lt;br /&gt;
Steganography can be used to embed encrypted data in a cover text to avoid detection. Steghide embeds text in JPEG, BMP, MP3, WAV and AU files (Hetzl 2002). Hydan exploits redundancy in the x86 instruction set; it can encode roughly 1 byte per 110 bytes of program code (El-Khalil 2004). Stegdetect (Provos 2004) can detect some forms of steganography.&lt;br /&gt;
&lt;br /&gt;
StegFS hides encrypted data in the unused blocks of a Linux ext2 file system, making the data “look like a partition in which unused blocks have recently been overwritten with random bytes using some disk wiping tool” (McDonald and Kuhn, 2003).&lt;br /&gt;
&lt;br /&gt;
TrueCrypt allows a second encrypted file system to be hidden within the first (TrueCrypt 2006). The goal of this filesystem-within-a-filesystem is to allow the TrueCrypt users to have a “decoy” file system with data that is interesting but not overtly sensitive. A person who is arrested or captured with a TrueCrypt-protected laptop could then give up the first file system’s password, with the hope that the decoy would be sufficient to satisfy the person’s interrogators. &lt;br /&gt;
&lt;br /&gt;
=== Generic Data Hiding ===&lt;br /&gt;
Data can also be hidden in unallocated or otherwise unreachable locations that are ignored by the current generation of forensic tools. &lt;br /&gt;
&lt;br /&gt;
Metasploit’s Slacker will hide data within the slack space of a FAT or NTFS file system. FragFS hides data within the NTFS Master File Table (Thompson and Monroe, 2006). RuneFS (Grugq 2003) stores data in bad blocks. Waffen FS stores data in the ext3 journal file (Eckstein and Jahnke 2005). KY FS stores data in directories (Grugq 2003). Data Mule FS stores data in inode reserved space (Grugq 2003). It is also possible to store information in the unallocated pages of Microsoft Office files.&lt;br /&gt;
&lt;br /&gt;
Information can be stored in the Host Protected Area (HPA) and the Device Configuration Overlay (DCO) areas of modern ATA hard drives. Data in the HPA and DCO is not visible to the BIOS or operating system, although it can be extracted with special tools.&lt;br /&gt;
&lt;br /&gt;
== Detecting Forensic Analysis ==&lt;br /&gt;
&lt;br /&gt;
There are methods to detect whether an [[investigator]] tries to perform a (live) forensic analysis on the system. A malicious user or program could react to that by destroying evidence, for example.&lt;br /&gt;
&lt;br /&gt;
=Other Anti Forensics=&lt;br /&gt;
==Targeting forensic tool blind spots==&lt;br /&gt;
==Targeting forensic tool vulnerabilities==&lt;br /&gt;
==Targeting generic tool/lib vulnerabilities==&lt;br /&gt;
=References=&lt;br /&gt;
Garfinkel, S., Anti-Forensics: Techniques, Detection and Countermeasures, The 2nd International Conference on i-Warfare and Security (ICIW), Naval Postgraduate School, Monterey, CA, March 8-9, 2007. [http://www.simson.net/clips/academic/2007.ICIW.AntiForensics.pdf]&lt;br /&gt;
&lt;br /&gt;
Henrique, G. Wendel, Anti Forensics: Making computer forensics hard, Code Breakers III, São Paulo, Brazil, September 2006.&lt;br /&gt;
[http://ws.hackaholic.org/slides/AntiForensics-CodeBreakers2006-Translation-To-English.pdf]&lt;br /&gt;
== See also ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:Anti-forensics tools|Anti-forensics tools category]]&lt;br /&gt;
&lt;br /&gt;
== External Links ==&lt;br /&gt;
&lt;br /&gt;
* [http://www.safehack.com/Textware/forensic/Anti_Forensic_Break_Encase.pdf Breaking Encase with FILE0 and Winhex]&lt;br /&gt;
&lt;br /&gt;
* [http://ws.hackaholic.org/slides/AntiForensics-CodeBreakers2006-Translation-To-English.pdf Anti Forensics: making computer forensics hard]&lt;br /&gt;
&lt;br /&gt;
* [http://seclists.org/bugtraq/2008/Nov/0038.html PTK Forensic Local Command Execution Vulnerability]&lt;br /&gt;
&lt;br /&gt;
[[Category:Anti-Forensics]]&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page</id>
		<title>Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page"/>
				<updated>2008-11-03T19:26:01Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* Ideas */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is for planning Carver 2.0.&lt;br /&gt;
&lt;br /&gt;
Please, do not delete text (ideas) here. Use something like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&lt;br /&gt;
= License =&lt;br /&gt;
&lt;br /&gt;
BSD-3.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] library based validators could require other licenses&lt;br /&gt;
::: Make the other libraries pluggable. If you have them, you use them. [[User:Simsong|Simsong]] 06:34, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= OS =&lt;br /&gt;
&lt;br /&gt;
Linux/FreeBSD/MacOS&lt;br /&gt;
: Shouldn't this just match what the underlying afflib &amp;amp; sleuthkit cover? [[User:RB|RB]]&lt;br /&gt;
:: Yes, but you need to test and validate on each. Question: Do we want to support windows? [[User:Simsong|Simsong]] 21:09, 30 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I think we would be wise to design with Windows support from the start; this will improve platform independence.&lt;br /&gt;
:::: Agreed; I would even settle at first for being able to run against Cygwin.  Note that I don't even own or use a copy of Windows, but the vast majority of forensic investigators do. [[User:RB|RB]] 14:01, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Capibara|Rob J Meijer]] Leaning heavily on the autotools might be the way to go. I do however feel that support requirements for windows would not be essential. Being able to run from a virtual machine with the main storage mounted over cifs should however be tested and if possible tuned extensively.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] You'll need more than autotools to do native Windows support i.e. file access, UTF-16 support, wrap some basic system functions or have them available otherwise&lt;br /&gt;
::::::[[User:Capibara|Rob J Meijer]] That's exactly my point: Windows support, as in being able to build and run on Windows natively, is much more trouble than it's worth. Better to make a lean and mean autotools-based build with few dependencies and little or no recursion, and to spend effort on a lean POLA design on POSIX-based systems rather than on supporting building and running on non-POSIX systems.&lt;br /&gt;
&lt;br /&gt;
= Name tooling =&lt;br /&gt;
&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] As a name for the tooling I propose coldcut&lt;br /&gt;
:: How about 'butcher'?  ;)  [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] cleaver ( scalpel on steroids ;-) )&lt;br /&gt;
* I would like to propose Gouge or Chisel :-) [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
&lt;br /&gt;
= Requirements =&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Could we do a MoSCoW evaluation of these.&lt;br /&gt;
* AFF and EWF file images supported from scratch. ([[User:Joachim Metz|Joachim]] I would like to have raw/split raw and device access as well)&lt;br /&gt;
:: If we base our image i/o on afflib, we get all three with one interface. [[User:RB|RB]] Instead of letting the tools use afflib, better to write an afflib module for carvfs, and update the libewf module. The tool could then be oblivious of the file format. [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
:::: [[User:Simsong|Simsong]] 06:29, 3 November 2008 (UTC) The problem with using carvfs is that this adds another dependency. Do you really want to require that people install carvfs in order to run the carver? What about having the thing ported to Windows?&lt;br /&gt;
:::::: [[User:Capibara|Rob J Meijer]] I would support adding one build dependency (libcarvpath) and removing two (libewf/libaff) by moving them to a layer more suited for them (carvfs) that would possibly allow some form of file handle (as cap) based POLA design. I am a proponent of making small things that do one thing and do it right, and of stacking those to do what you need. In my view that would lead ideally to the following (simplified) chain:&lt;br /&gt;
::::::* recursive-forensic-framework (ocfa/pyflag)&lt;br /&gt;
::::::** &amp;lt;b&amp;gt;The-(pola-based)-carving-tool&amp;lt;/b&amp;gt; &lt;br /&gt;
::::::*** &amp;lt;b&amp;gt;The-carving-lib&amp;lt;/b&amp;gt; working on open fd's.  &lt;br /&gt;
::::::**** libcarvpath&lt;br /&gt;
::::::***** carvfs (Over cifs/nfs-v4 on platforms that don't support Fuse).&lt;br /&gt;
::::::****** libewf&lt;br /&gt;
::::::****** libaff&lt;br /&gt;
::::::*** AppArmor (on supporting platforms)&lt;br /&gt;
::::::*** suid (on supporting platforms)&lt;br /&gt;
::::::*** iptables/ipfw (on supporting platforms)&lt;br /&gt;
:::::: As for Windows support, I would imagine making carvfs run over smb would go a long way, that is for as far as Windows support is all that relevant.&lt;br /&gt;
:::::: There are two advantages to using libcarvpath and carvfs instead of libaff/libewf at this layer:&lt;br /&gt;
::::::* storage requirements for doing carving. Beyond what sleuthkit or alternatives provide I have seen many situations where carving was not done due to storage limitations.&lt;br /&gt;
::::::* File handles are like object capabilities. You can often do pretty simple POLA based implementations using file handles and something like AppArmor. POLA could IMHO be a strong weapon against the more nasty forms of anti forensics.&lt;br /&gt;
::::::Next to this, I would consider making different tools for different stages instead of one semi recursive one, and looking at how to integrate these tools into existing frameworks (ocfa/pyflag). &lt;br /&gt;
::::::Keep things simple but rigid and try to easily integrate things into existing frameworks as effectively as possible I would suggest.&lt;br /&gt;
::::::Please note, I am not proposing that the lib/tool should be useless without libcarvpath, only that usage without carvfs should limit the&lt;br /&gt;
::::::supported image formats to raw images, and that libewf/libaff should be abstracted at the Fuse level or below and not at the tool level.&lt;br /&gt;
:::::::[[User:Joachim Metz|Joachim]] do you have an idea what the performance impact of this approach would be? It might be wise to do a proof of concept for this approach first.&lt;br /&gt;
::::::::[[User:Capibara|Rob J Meijer]] It would I think depend greatly on the behavior of the carving lib/tool. Small 512-byte reads are relatively very expensive; 128 KB reads have negligible impact. Here are some numbers from ntfs-3g:  http://article.gmane.org/gmane.comp.file-systems.fuse.devel/6397/match=ntfs+3g+performance+ext3 that might be relevant. More relevant than performance might be library footprint. For example, using OCFA, we often would want to keep a few hundred images that total to tens of TB of projected storage size fuse mounted. If libewf/libaff have a big combined memory footprint in such cases, this can be a major issue for this approach.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] this layer should support multi threaded decompression of compressed image types, this speeds up IO&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] volume/partition aware layer (what about carving unpartitioned space)&lt;br /&gt;
* File system aware layer. This could be or make use of tsk-cp.&lt;br /&gt;
** By default, files are not carved. (clarify: only identified? [[User:RB|RB]]; I guess that it operates like [[Selective file dumper]] [[User:.FUF|.FUF]] 07:00, 29 October 2008 (UTC)). Alternatively, the tool could use libcarvpath and output carvpaths or create a directory with symlinks to carvpaths that point into a carvfs mountpoint [[User:Capibara|Rob J Meijer]].&lt;br /&gt;
* Plug-in architecture for identification/validation.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] support for multiple types of validators&lt;br /&gt;
*** dedicated validator&lt;br /&gt;
*** validator based on file library (i.e. we could specify/implement a file structure API for these)&lt;br /&gt;
*** configuration based validator (can handle config files, like Revit07, to enter different file formats used by the carver)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Moderator: Could we limit the requirements for prototype version 1 of the tool to get a working version up and running ASAP?&lt;br /&gt;
And keep discussing future options?&lt;br /&gt;
&lt;br /&gt;
I think the following set will be large enough to handle:&lt;br /&gt;
Input facilities&lt;br /&gt;
* IO support (AFF, device, EWF, RAW and split RAW)&lt;br /&gt;
:: Abstraction of input format and multi threaded decompression (spin-off code out of afflib?)&lt;br /&gt;
* Volume/Partitions support&lt;br /&gt;
:: at least for DOS based layout and GPT (spin-off code out of TSK/Photorec?)&lt;br /&gt;
* File system support&lt;br /&gt;
:: VFAT/NTFS (spin-off code out of TSK/Photorec?)&lt;br /&gt;
&lt;br /&gt;
Carving facilities&lt;br /&gt;
* File format support using plug-able validator model (use dedicated validators Photorec/Scarve and/or wrap revit07 file format as validator?)&lt;br /&gt;
* Content support using plug-able validator model (to handle text/mbox base64)&lt;br /&gt;
* File system carving support (to handle file system fragments, could be linked to file system support layer?)&lt;br /&gt;
* Basic fragment handling&lt;br /&gt;
&lt;br /&gt;
Output facilities&lt;br /&gt;
* audit/analysis/debug log&lt;br /&gt;
* extraction of result files&lt;br /&gt;
&lt;br /&gt;
==Supported File Formats==&lt;br /&gt;
* Ship with validators for:&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I think we should distinguish between file format validators and content validators&lt;br /&gt;
** JPEG&lt;br /&gt;
** PNG&lt;br /&gt;
** GIF&lt;br /&gt;
** MSOLE&lt;br /&gt;
** ZIP&lt;br /&gt;
** TAR (gz/bz2)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] For a production carver we need at least the following formats&lt;br /&gt;
** Graphical images&lt;br /&gt;
*** JPEG (the 3 different types with JFIF/EXIF support)&lt;br /&gt;
*** PNG&lt;br /&gt;
*** GIF&lt;br /&gt;
*** BMP&lt;br /&gt;
*** TIFF&lt;br /&gt;
** Office documents&lt;br /&gt;
*** OLE2 (Word/Excel content support)&lt;br /&gt;
*** PDF&lt;br /&gt;
*** Open Office/Office 2007 (ZIP+XML)&lt;br /&gt;
:: Extension validation? AFAIK, MS Office 2007 [[DOCX]] format uses plain ZIP (or not?), and carved files will (or not?) have .zip extension instead of DOCX. Is there any way to fix this (may be using the file list in zip)? [[User:.FUF|.FUF]] 20:25, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Addition: Office 2007 also has a binary file format, which is also ZIP-ed data&lt;br /&gt;
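&lt;br /&gt;
A minimal Python sketch of the file-list idea raised above: inspect the member names of a carved ZIP to pick a more specific extension. It assumes the carved blob is complete enough for its central directory to be readable. The function name guess_zip_subtype and both lookup tables are hypothetical and far from complete; they are not part of any existing carver.&lt;br /&gt;

```python
import io
import zipfile

# Member names that commonly identify Office 2007 (OOXML) packages.
# Illustrative table only; real packages may use other part names.
OOXML_MARKERS = [
    ("word/document.xml", ".docx"),
    ("xl/workbook.xml", ".xlsx"),
    ("ppt/presentation.xml", ".pptx"),
]

# OpenDocument packages declare their type in a member named "mimetype".
ODF_TYPES = {
    "application/vnd.oasis.opendocument.text": ".odt",
    "application/vnd.oasis.opendocument.spreadsheet": ".ods",
}


def guess_zip_subtype(data: bytes) -> str:
    """Return a likely extension for a carved ZIP blob based on its file list."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = set(archive.namelist())
            declared = ""
            if "mimetype" in names:
                raw = archive.read("mimetype")
                declared = raw.decode("ascii", "replace").strip()
    except zipfile.BadZipFile:
        # Not parseable as a ZIP at all (truncated or misidentified).
        return ".bin"
    if declared in ODF_TYPES:
        return ODF_TYPES[declared]
    for member, extension in OOXML_MARKERS:
        if member in names:
            return extension
    return ".zip"
```

A validator built this way only renames what the ZIP validator already accepted, so it could run as a separate content-validation step rather than inside the carving algorithm.&lt;br /&gt;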
&lt;br /&gt;
&lt;br /&gt;
==Archive Files==&lt;br /&gt;
** Archive files&lt;br /&gt;
*** ZIP&lt;br /&gt;
*** 7z&lt;br /&gt;
*** gzip&lt;br /&gt;
*** bzip2&lt;br /&gt;
*** tar&lt;br /&gt;
*** RAR&lt;br /&gt;
** E-mail files&lt;br /&gt;
*** PFF (PST/OST)&lt;br /&gt;
*** MBOX (text based format, base64 content support)&lt;br /&gt;
** Audio/Video files&lt;br /&gt;
*** MPEG&lt;br /&gt;
*** MP2/MP3&lt;br /&gt;
*** AVI&lt;br /&gt;
*** ASF/WMV&lt;br /&gt;
*** QuickTime&lt;br /&gt;
*** MKV&lt;br /&gt;
** Printer spool files&lt;br /&gt;
*** EMF (if I remember correctly)&lt;br /&gt;
** Internet history files&lt;br /&gt;
*** index.dat&lt;br /&gt;
*** Firefox (SQLite 3)&lt;br /&gt;
** Other files&lt;br /&gt;
*** thumbs.db&lt;br /&gt;
*** pagefile?&lt;br /&gt;
&lt;br /&gt;
==Carving Strategies==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Note to moderator could this section be merged with the carving algorithm section?&lt;br /&gt;
&lt;br /&gt;
* Simple fragment recovery carving using gap carving.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have hook in for more advanced fragment recovery?&lt;br /&gt;
* Recovering of individual ZIP sections and JPEG icons that are not sector aligned.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] I would propose a generic fragment detection and recovery&lt;br /&gt;
* Autonomous operation (some mode of operation should be completely non-interactive, requiring no human intervention to complete [[User:RB|RB]])&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] as much as possible, but allow it to be overridden by the user&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] When the tool outputs files, the filenames should contain the offset in the input data (in hexadecimal?)&lt;br /&gt;
:: [[User:Mark Stam|Mark]] I really like the fact carved files are named after the physical or logical sector in which the file is found (photorec)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This naming schema might cause duplicate name problem for extracting embedded files and extracting files from non sector aligned file systems.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export embedded files?&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export fragments separately?&lt;br /&gt;
* [[User:Mark Stam|Mark]] I personally use photorec often for carving files in the whole volume (not only unallocated clusters), so I can store information about all potential interesting files in MySQL&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] interesting, Bas Kloet and I have been discussing using information about allocated files in the recovery process, i.e. recovered fragments could be part of allocated files. Do we want to be able to extract them? Or could we rebuild the file from the fragments and the allocated files?&lt;br /&gt;
* [[User:Mark Stam|Mark]] It would also be nice if the files can be hashed immediately (MD5) so looking for them in other tools (for example Encase) is a snap&lt;br /&gt;
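&lt;br /&gt;
The naming and hashing suggestions above could be combined roughly as follows. This is only a sketch in Python: the names carved_name and audit_record are hypothetical, the input is held in memory for brevity, and the byte offset (rather than a sector number) is used in the name so that results which are not sector aligned still get distinct names.&lt;br /&gt;

```python
import hashlib


def carved_name(offset: int, extension: str) -> str:
    """Name a carved file after its byte offset in the input, in hexadecimal."""
    return f"{offset:016x}{extension}"


def audit_record(image: bytes, offset: int, length: int, extension: str) -> dict:
    """Describe one carved byte range: name, offset, length and MD5 digest."""
    payload = image[offset:offset + length]
    return {
        "name": carved_name(offset, extension),
        "offset": offset,
        "length": len(payload),
        "md5": hashlib.md5(payload).hexdigest(),
    }
```

Two embedded files that start at the same offset would still collide under this scheme, so a real tool would need an extra discriminator such as the detected type or a sequence number.&lt;br /&gt;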
&lt;br /&gt;
==Performance Requirements==&lt;br /&gt;
* Tested on 500GB-sized images. Should be able to carve a 500GB image in roughly 50% longer than it takes to read the image.&lt;br /&gt;
** Perhaps allocate a percentage budget per-validator (i.e. each validator adds N% to the carving time) [[User:RB|RB]]&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have multiple carving phases for precision/speed trade off?&lt;br /&gt;
* Parallelizable&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] tunable for different architectures&lt;br /&gt;
* Configuration:&lt;br /&gt;
** Capability to parse some existing carvers' configuration files, either on-the-fly or as a one-way converter.&lt;br /&gt;
** Disengage internal configuration structure from configuration files, create parsers that present the expected structure&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] The validator should deal with the file structure; the carving algorithm should not know anything about the file structure (as in the revit07 design)&lt;br /&gt;
**  Either extend Scalpel/Foremost syntaxes for extended features or use a tertiary syntax ([[User:Joachim Metz|Joachim]] I would prefer a derivative of the revit07 configuration syntax which already has encountered some problems of dealing with defining file structure in a configuration file)&lt;br /&gt;
&lt;br /&gt;
==Output==&lt;br /&gt;
* Can output audit.txt file.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output database with offset analysis values i.e. for visualization tooling&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output debug log for debugging the algorithm/validation&lt;br /&gt;
* Easy integration into ascription software.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm no native speaker what do you mean with &amp;quot;ascription software&amp;quot;?&lt;br /&gt;
::: I think this was another non-native requesting easy scriptability. [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] that makes sense ;-)&lt;br /&gt;
::::: Incorrect. Ascription software is software that determines who the owner of a file is. [[User:Simsong|Simsong]] 06:36, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= Ideas =&lt;br /&gt;
* Use as much of TSK as possible. Don't carry your own FS implementation the way photorec does.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] using TSK as much as possible would not allow to add your own file system support (i.e. mobile phones, memory structures, cap files) I would propose wrapping TSK and using it as much as possible but allow to integrate own FS implementations. &lt;br /&gt;
* Extracting/carving data from [[Thumbs.db]]? I've used [[foremost]] for it with some success. [[Vinetto]] has some critical bugs :( [[User:.FUF|.FUF]] 19:18, 28 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
==Recursive Carving==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] do we want to support (let's call it) 'recursive in file carving' (for now) this is different from embedded files because there is a file system structure in the file and not just another file structure&lt;br /&gt;
* Is it just me, or do a lot of the above (and below) ideas somewhat skirt around the fact that many of us want recursive carving?  Can we bend back to that instead of discussing object particulars?  I think this can be distilled down to three requirements:&lt;br /&gt;
** Simple recursion: once an object is identified, have the ability to re-carve it for internal structures&lt;br /&gt;
** Directed recursion: the carver should be able to be directed at arbitrary blobs and told to carve it as a specified type.  This allows programmatically more simple methods of dealing with unidentifiably compressed or encrypted data.  Or filesystem fragments.&lt;br /&gt;
** Export: the ability to export an object (recognized or not) for later or external &amp;quot;recursion&amp;quot;.  Should go without saying for a carver, but...&lt;br /&gt;
:--[[User:RB|RB]] 18:45, 2 November 2008 (UTC)&lt;br /&gt;
:: [[User:Simsong|Simsong]] 06:30, 3 November 2008 (UTC) pyflag already does recursive carving. Are we just going to reimplement pyflag as a single executable?&lt;br /&gt;
&lt;br /&gt;
==Library Dependencies==&lt;br /&gt;
[[User:Capibara|Rob J Meijer]] :&lt;br /&gt;
* Use libcarvpath whenever possible and by default to avoid high storage requirements.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] For easy deployment I would not opt for making an integral part of the tool solely dependent on a single external library, or else the library must be integrated in the package&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] Integrating libraries (libtsk, libaff, libewf, libcarvpath etc.) is bad practice; autotools are your friend IMO.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not talking about integrating (shared) libraries. I'm saying that an integral part of a tool should be part of its package. Why can't the tool package contain shared or static libraries for local use? A far worse thing to do is to have a large set of dependencies, making the tool difficult to install for most users. The tool package should contain the most necessary code. afflib/libewf support could be detected by the autotools, a neat separation of functionality.&lt;br /&gt;
::: From a packager's standpoint, [[User:Joachim Metz|Joachim]]'s other libraries do a really good job of this, carrying around what they need but using a system-global version if available.  [[User:RB|RB]]&lt;br /&gt;
* libtsk&lt;br /&gt;
* libaff ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* libewf ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* posix ? : Can we depend especially on the availability of UNIX domain sockets and the possibility to use msg_accrights for passing open file handles as ocaps?&lt;br /&gt;
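&lt;br /&gt;
To illustrate the last point, here is a small Python sketch of handing an already-open file descriptor across a UNIX domain socket. It assumes Python 3.9 or later on a POSIX system, where socket.send_fds and socket.recv_fds wrap the SCM_RIGHTS ancillary-data mechanism (the modern counterpart of msg_accrights). The helper names and the single-process demo are illustrative only; in a POLA design the two ends would be separate processes, with the receiver confined.&lt;br /&gt;

```python
import os
import socket
import tempfile


def send_open_file(channel: socket.socket, fd: int) -> None:
    """Pass an already-open descriptor to the peer as ancillary data."""
    socket.send_fds(channel, [b"image"], [fd])


def receive_open_file(channel: socket.socket) -> int:
    """Receive one descriptor; the receiver needs no path and no open() rights."""
    _msg, fds, _flags, _addr = socket.recv_fds(channel, 16, 1)
    return fds[0]


def demo() -> bytes:
    """Round-trip a read-only descriptor over a socketpair in one process."""
    opener, worker = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.NamedTemporaryFile() as image:
        image.write(b"raw image bytes")
        image.flush()
        fd = os.open(image.name, os.O_RDONLY)
        try:
            send_open_file(opener, fd)
            received = receive_open_file(worker)
        finally:
            os.close(fd)
        try:
            # Read through the received descriptor only.
            return os.pread(received, 15, 0)
        finally:
            os.close(received)
            opener.close()
            worker.close()
```

The receiving side can read the image only through the descriptor it was given, which is what makes a file handle behave like an object capability.&lt;br /&gt;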
&lt;br /&gt;
==Filesystem Detection==&lt;br /&gt;
* Don't stop filesystem detection after the first match. Often if a partition is reused with a new FS and is not all that full yet, much of the old FS can still be valid. I have seen this with ext2/fat. The fact that you have identified a valid FS on a partition doesn't mean there isn't an (almost) valid second FS that would yield additional files. Identifying doubly allocated space might in some cases also be relevant.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] What you're saying is that dealing with file system fragments should be part of the carving algorithm&lt;br /&gt;
* Allow use where filesystem based carving is done by other tool, and the tool is used as second stage on (sets of) unallocated block (pseudo) files and/or non FS partition (pseudo) files.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I would not opt for this. The tool would be dependent on other tools and their data format, which makes the tool difficult to maintain. I would opt to integrate the functionality of having multiple recovery phases (stages) and allow the tooling to run the phases one after another or separately.&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] More generically, I feel a way should exist to communicate the 'left overs' a previous (non open, for example LE-only) tool left.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess if the tool is designed to handle multiple phases it should store its data somewhere. So it should be possible to convert results of such non open tooling to the format required. However I would opt to design the recovery functionality of these non-open tools into open tools. And not to limit ourselves making translators due to the design of these non-open tools.&lt;br /&gt;
* Ability to be used as a library instead of a tool. Ability to access metadata through the library, and thus the ability to set metadata from the carving modules. This would be extremely useful for integrating the project into a framework like ocfa.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess most of the code could be integrated into libraries, but I would not opt integrating tool functionality into a library&lt;br /&gt;
* [[User:Mark Stam|Mark]] I think it would be very handy to have a CSV, TSV, XML or other delimited output (log)file with information about carved files. This output file can then be stored in a database or Excel sheet (report function)&lt;br /&gt;
&lt;br /&gt;
==Anti forensics and system integrity concerns==&lt;br /&gt;
* It might be very interesting to look at the possibilities of using a multi process style of module support and combine it with a least authority design. On platforms that support AppArmor (or similar) and uid based firewall rules, this could make for the first true POLA (principle of least authority) based forensic tool ever. POLA based forensics tools should make for a strong integrity guard against many anti forensics. Alternatively we could look at integrating a capability secure language (E?) for implementation of at least validation modules. I don't expect this idea to make it, but I hope mentioning it might spark off less strong alternatives that at least partially address the integrity + anti-forensics problem. If we can in some way introduce POLA to a wider forensics public, other tools might also pick up on it, which would be great.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Could you give an example of how you see this in action?&lt;br /&gt;
::::[[User:Capibara|Rob J Meijer]] I see two layers where POLA could be applied. The best one would require one of the following as prerequisites:&lt;br /&gt;
::::* The libaff/libewf layer is moved to a fuse implementation (for example carvfs).&lt;br /&gt;
::::* Libewf/Libaff are updated to accept opened file handles instead of demanding to open their own files.&lt;br /&gt;
::::If one of these is fulfilled, then the tool running as some user can just have the simple task of opening the image files, starting up the 'real' tool and handing over the appropriate file handles. If the real tool runs with a restrictive AppArmor profile, and is started suid to a tool specific user that also has its own iptables uid based filter, then the real tool will run with least authority.&lt;br /&gt;
:::: A second alternative, if neither of the prerequisites could be met, would be to run the modules as confined processes and have a non confined process run as proxy for the first.&lt;br /&gt;
:::: A third, probably far-fetched alternative would be to embed an object capability language in the tool and make the module interface such that modules are to be written in this ocap language.&lt;br /&gt;
::::A 4th alternative might include minorfs or plash, but I haven't given those sufficient thinking hours yet.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Format syntax specification ==&lt;br /&gt;
* Carving data structures. For example, extract all TCP headers from image by defining TCP header structure and some fields (e.g. source port &amp;gt; 1024, dest port = 80). This will extract all data matching the pattern and write a file with other fields. Another example is carving INFO2 structures and URL activity records from index.dat [[User:.FUF|.FUF]] 20:51, 28 October 2008 (UTC)&lt;br /&gt;
** This has the opportunity to be extended to the concept of &amp;quot;point at blob FOO and interpret it as BAR&amp;quot;&lt;br /&gt;
.FUF added:&lt;br /&gt;
The main idea is to allow users to define structures, for example (in pascal-like form):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1: Byte = 123;&lt;br /&gt;
SomeTextLength: DWORD;&lt;br /&gt;
SomeText: string[SomeTextLength];&lt;br /&gt;
Field4: Char = 'r';&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will produce something like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1 = 123&lt;br /&gt;
SomeTextLength = 5&lt;br /&gt;
SomeText = 'abcd1'&lt;br /&gt;
Field4 = 'r'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(In text or raw forms.)&lt;br /&gt;
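&lt;br /&gt;
As a rough illustration of what matching such a definition could involve, the Python sketch below hand-codes the example structure. It assumes Byte and Char are 1 byte, DWORD is 4 bytes little-endian, and the string is not terminated; none of these are stated above. The names match_record and carve_records are hypothetical, and a real tool would generate this logic from the definition instead of hard-coding it.&lt;br /&gt;

```python
def match_record(buffer: bytes, start: int = 0):
    """Parse the example structure at `start`; return its fields or None."""
    if start + 5 > len(buffer):
        return None
    field1 = buffer[start]
    if field1 != 123:  # Field1: Byte = 123
        return None
    # SomeTextLength: DWORD, assumed here to be 4 bytes, little-endian.
    text_length = int.from_bytes(buffer[start + 1:start + 5], "little")
    text_end = start + 5 + text_length
    if text_end + 1 > len(buffer):
        return None
    # SomeText: string[SomeTextLength]
    some_text = buffer[start + 5:text_end].decode("latin-1")
    field4 = chr(buffer[text_end])
    if field4 != "r":  # Field4: Char = 'r'
        return None
    return {
        "Field1": field1,
        "SomeTextLength": text_length,
        "SomeText": some_text,
        "Field4": field4,
    }


def carve_records(buffer: bytes):
    """Yield (offset, fields) for every position where the structure matches."""
    for offset in range(len(buffer)):
        fields = match_record(buffer, offset)
        if fields is not None:
            yield offset, fields
```

Trying every byte offset is the simplest possible search; it ignores alignment and makes no attempt at validation beyond the two constant fields.&lt;br /&gt;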
&lt;br /&gt;
Opinions?&lt;br /&gt;
&lt;br /&gt;
Opinion: Simple pattern identification like that may not suffice, I think Simson's original intent was not only to identify but to allow for validation routines (plugins, as the original wording was).  As such, the format syntax would need to implement a large chunk of some programming language in order to be sufficiently flexible. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
In my opinion your example is too limited. Making the revit configuration I learned you'll need a near programming language to specify some file formats.&lt;br /&gt;
A simple descriptive language is too limiting. I would also go for 2 bytes with endianness instead of using terminology like WORD and small integer; it's much clearer. The configuration also needs to deal with aspects like cardinality, required and optional structures.&lt;br /&gt;
:: This is simply data structures carving, see ideas above. Somebody (I cannot track so many changes per day) separated the original text. There is no need to count and join different structures. [[User:.FUF|.FUF]] 19:53, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This was probably me. Is the text back in its original form?&lt;br /&gt;
:::: I started it by moving your Revit07 comment to the validator/plugin section in [http://www.forensicswiki.org/index.php?title=Carver_2.0_Planning_Page&amp;amp;diff=prev&amp;amp;oldid=7583 this edit], since I was still at that point thinking operational configuration for that section, not parser configurations. [[User:RB|RB]]&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] I renamed the title to format syntax, clarity is important ;-)&lt;br /&gt;
&lt;br /&gt;
Please take a look at the revit07 format syntax specification (configuration). It's not there yet but goes a far way. Some things currently missing:&lt;br /&gt;
* bitwise alignment&lt;br /&gt;
* handling encapsulated streams (MPEG/capture files)&lt;br /&gt;
* handling content based formats (MBOX)&lt;br /&gt;
&lt;br /&gt;
=Carving algorithm =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* should we allow for multiple carving phases (runs/stages)?&lt;br /&gt;
:: I opt yes (separation of concern)&lt;br /&gt;
* should we allow for multiple carving algorithms?&lt;br /&gt;
:: I opt yes, this allows testing of different approaches&lt;br /&gt;
* Should the algorithm try to do as much in 1 run over the input data? To reduce IO?&lt;br /&gt;
:: I opt that the tool should allow for multiple and single run over the input data to minimize the IO or the CPU as bottleneck&lt;br /&gt;
* Interaction between algorithm and validators&lt;br /&gt;
** does the algorithm pass data blocks to the validators?&lt;br /&gt;
** does a validator need to maintain a state?&lt;br /&gt;
** does a validator need to revert a state?&lt;br /&gt;
** How do we deal with embedded files and content validation? Do the validators call another validator?&lt;br /&gt;
* do we use the assumption that a data block can be used by a single file (with the exception of embedded/encapsulated files)?&lt;br /&gt;
* Revit07 allows for multiple concurrent result file states to deal with fragmentation. One has the attribute of being active (the preferred) and the other passive. Do we want/need something similar? The algorithm adds blocks of input data (offsets) to these result file states.&lt;br /&gt;
** if so, what info would these result file states require (type, list of input data blocks)?&lt;br /&gt;
* how do we deal with file system remainders?&lt;br /&gt;
** Can we abstract them and compare them against available file system information?&lt;br /&gt;
* Do we carve file systems in files?&lt;br /&gt;
:: I opt that at least the validator uses this information&lt;br /&gt;
&lt;br /&gt;
==Carving scenarios ==&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* normal file (file structure, loose text based structure (more a content structure?))&lt;br /&gt;
* fragmented file (the file exists in its entirety)&lt;br /&gt;
* a file fragment (the file does not exist in its entirety)&lt;br /&gt;
* intertwined file&lt;br /&gt;
* encapsulated file (MPEG/network capture)&lt;br /&gt;
* embedded file (JPEG thumbnail)&lt;br /&gt;
* obfuscation ('encrypted' PFF) this also entails encryption and/or compression&lt;br /&gt;
* file system in file&lt;br /&gt;
&lt;br /&gt;
=File System Awareness =&lt;br /&gt;
==Background: Why be File System Aware?==&lt;br /&gt;
Advantages of being FS aware:&lt;br /&gt;
* You can pick up sector allocation sizes&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] do you mean file system block sizes?&lt;br /&gt;
* Some file systems may store things off sector boundaries. (ReiserFS with tail packing)&lt;br /&gt;
* Increasingly file systems have compression (NTFS compression)&lt;br /&gt;
* Carve just the sectors that are not in allocated files.&lt;br /&gt;
&lt;br /&gt;
==Tasks that would be required==&lt;br /&gt;
&lt;br /&gt;
==Discussion==&lt;br /&gt;
:: As noted above, TSK should be utilized as much as possible, particularly the filesystem-aware portion.  If we want to identify filesystems outside of its supported set, it would be more worth our time to work on implementing them there than in the carver itself.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:::: I guess this tool operates like [[Selective file dumper]] and can recover files in both ways (or not?). Recovering files by using carving can recover files in situations where sleuthkit does nothing (e.g. file on NTFS was deleted using ntfs-3g, or filesystem was destroyed or just unknown). And we should build the list of filesystems supported by carver, not by TSK. [[User:.FUF|.FUF]] 07:08, 29 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:: This tool is still in the early planning stages (requirements discovery), hence few operational details (like precise modes of operation) have been fleshed out - those will and should come later.  The justification for strictly using TSK for the filesystem-sensitive approach is simple: TSK has good filesystem APIs, and it would be foolish to create yet another standalone, incompatible implementation of filesystem(foo) when time would be better spent improving those in TSK, aiding other methods of analysis as well.  This is the same reason individuals that have implemented several other carvers are participating: de-duplication of effort.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] A design problem might be that TSK currently is a single library operating on multiple layers (storage media IO, volume/partition analysis and file system analysis). I'm not aware how easily the parts can be used separately. But I estimate that for the carver we want to use these 3 layers differently than TSK currently does.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I would like to have the carver (recovery tool) also do recovery using file allocation data or remainders of file allocation data.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] &lt;br /&gt;
I would go as far as to ask you all to look beyond the carver as a tool and look from the perspective of the carver as part of the forensic investigation process. In my eyes certain information needed/acquired by the carver could also be very useful investigative information, e.g. what part of a hard disk contains empty sectors.&lt;br /&gt;
&lt;br /&gt;
=Supportive tooling=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* validator (definitions) tester (detest in revit07)&lt;br /&gt;
* tool to make configuration based definitions&lt;br /&gt;
* post carving validation&lt;br /&gt;
* the carver needs to provide support for fuse mount of carved files (carvfs)&lt;br /&gt;
&lt;br /&gt;
=Testing =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* automated testing&lt;br /&gt;
* test data&lt;br /&gt;
&lt;br /&gt;
=Validator Construction=&lt;br /&gt;
Options:&lt;br /&gt;
* Write validators in C/C++&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] you mean dedicated validators&lt;br /&gt;
* Have a scripting language for writing them (Python? Perl? our own?)&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] use easy to embed programming languages, e.g. Python or Lua&lt;br /&gt;
* Use existing programs (libjpeg?) as plug-in validators?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] define a file structure api for this&lt;br /&gt;
&lt;br /&gt;
=Existing Code that we have=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Please add any missing links&lt;br /&gt;
&lt;br /&gt;
Documentation/Articles&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* DFRWS2008 paper on carving&lt;br /&gt;
&lt;br /&gt;
Carvers&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* photorec (http://www.cgsecurity.org/wiki/PhotoRec)&lt;br /&gt;
* revit06 and revit07 (http://sourceforge.net/projects/revit/)&lt;br /&gt;
* s3/scarve&lt;br /&gt;
&lt;br /&gt;
Possible file structure validator libraries&lt;br /&gt;
* diverse existing file support libraries&lt;br /&gt;
* libole2 (inhouse experimental code of OLE2 support)&lt;br /&gt;
* libpff (alpha release for PFF (PST/OST) file support) (http://sourceforge.net/projects/libpff/)&lt;br /&gt;
&lt;br /&gt;
Input support&lt;br /&gt;
* AFF (http://www.afflib.org/)&lt;br /&gt;
* EWF (http://sourceforge.net/projects/libewf/)&lt;br /&gt;
* TSK device &amp;amp; raw &amp;amp; split raw (http://www.sleuthkit.org/)&lt;br /&gt;
&lt;br /&gt;
Volume/Partition support&lt;br /&gt;
* disktype (http://disktype.sourceforge.net/)&lt;br /&gt;
* testdisk (http://www.cgsecurity.org/wiki/TestDisk)&lt;br /&gt;
* TSK&lt;br /&gt;
&lt;br /&gt;
File system support&lt;br /&gt;
* TSK&lt;br /&gt;
* photorec FS code&lt;br /&gt;
* implementations of FS in Linux/BSD&lt;br /&gt;
&lt;br /&gt;
Content support&lt;br /&gt;
&lt;br /&gt;
Zero storage support&lt;br /&gt;
* libcarvpath ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210704 )&lt;br /&gt;
* carvfs ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210954 )&lt;br /&gt;
* tsk-cp ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=267227 )&lt;br /&gt;
* carvfsmodewf (http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=268256 )&lt;br /&gt;
POLA&lt;br /&gt;
* joe-e (java) ( http://code.google.com/p/joe-e/ )&lt;br /&gt;
* Emily (ocaml)  ( http://erights.org/download/emily/ )&lt;br /&gt;
* the E language ( http://www.erights.org/ )&lt;br /&gt;
* AppArmor&lt;br /&gt;
* iptables/ipfw&lt;br /&gt;
* minorfs ( http://polacanthus.net/minorfs.html )&lt;br /&gt;
* plash ( http://plash.beasts.org/wiki/ )&lt;br /&gt;
&lt;br /&gt;
=Implementation Timeline=&lt;br /&gt;
# gather the available resources/ideas/wishes/needs etc. (I guess we're in this phase)&lt;br /&gt;
# start discussing a high level design (in terms of algorithm, facilities, information needed)&lt;br /&gt;
## input formats facility&lt;br /&gt;
## partition/volume facility&lt;br /&gt;
## file system facility&lt;br /&gt;
## file format facility&lt;br /&gt;
## content facility&lt;br /&gt;
## how to deal with fragment detection (do the validators allow for fragment detection?)&lt;br /&gt;
## how to deal with recombination of fragments&lt;br /&gt;
## do we want multiple carving phases in light of speed/precision tradeoffs&lt;br /&gt;
# start detailing parts of the design&lt;br /&gt;
## Discuss options for a grammar driven validator?&lt;br /&gt;
## Hard-coded plug-ins?&lt;br /&gt;
## Which existing code can we use?&lt;br /&gt;
# start building/assembling parts of the tooling for a prototype&lt;br /&gt;
## Implement simple file carving with validation.&lt;br /&gt;
## Implement gap carving&lt;br /&gt;
# Initial Release&lt;br /&gt;
# Implement the ''threaded carving'' that [[User:.FUF|.FUF]] is describing above.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Shouldn't multi-threaded carving (MTC) be part of the 1st version?&lt;br /&gt;
The MT approach makes for different design decisions&lt;br /&gt;
: It is virtually impossible to turn a non-MT application into an MT application .[[User:Simsong|Simsong]] 06:37, 3 November 2008 (UTC)&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page</id>
		<title>Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page"/>
				<updated>2008-11-03T19:23:46Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* Library Dependencies */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is for planning Carver 2.0.&lt;br /&gt;
&lt;br /&gt;
Please, do not delete text (ideas) here. Use something like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&lt;br /&gt;
= License =&lt;br /&gt;
&lt;br /&gt;
BSD-3.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] library based validators could require other licenses&lt;br /&gt;
::: Make the other libraries plug-able. If you have them, you use them. [[User:Simsong|Simsong]] 06:34, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= OS =&lt;br /&gt;
&lt;br /&gt;
Linux/FreeBSD/MacOS&lt;br /&gt;
: Shouldn't this just match what the underlying afflib &amp;amp; sleuthkit cover? [[User:RB|RB]]&lt;br /&gt;
:: Yes, but you need to test and validate on each. Question: Do we want to support windows? [[User:Simsong|Simsong]] 21:09, 30 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I think we would be wise to design with Windows support from the start; this will improve platform independence.&lt;br /&gt;
:::: Agreed; I would even settle at first for being able to run against Cygwin.  Note that I don't even own or use a copy of Windows, but the vast majority of forensic investigators do. [[User:RB|RB]] 14:01, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Capibara|Rob J Meijer]] Leaning heavily on the autotools might be the way to go. I do however feel that support requirements for windows would not be essential. Being able to run from a virtual machine with the main storage mounted over cifs should however be tested and if possible tuned extensively.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] You'll need more than autotools to do native Windows support i.e. file access, UTF-16 support, wrap some basic system functions or have them available otherwise&lt;br /&gt;
::::::[[User:Capibara|Rob J Meijer]] That's exactly my point: Windows support, as in being able to build and run on Windows natively, is much more trouble than it's worth. Better to make a lean and mean autotools-based build with few dependencies and little or no recursion, and better to spend effort on a lean POLA design on POSIX-based systems than on supporting building and running on non-POSIX systems.&lt;br /&gt;
&lt;br /&gt;
= Name tooling =&lt;br /&gt;
&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] As a name for the tooling I propose coldcut&lt;br /&gt;
:: How about 'butcher'?  ;)  [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] cleaver ( scalpel on steroids ;-) )&lt;br /&gt;
* I would like to propose Gouge or Chisel :-) [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
&lt;br /&gt;
= Requirements =&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Could we do a MoSCoW evaluation of these?&lt;br /&gt;
* AFF and EWF file images supported from scratch. ([[User:Joachim Metz|Joachim]] I would like to have raw/split raw and device access as well)&lt;br /&gt;
:: If we base our image i/o on afflib, we get all three with one interface. [[User:RB|RB]] Instead of letting the tools use afflib, better to write an afflib module for carvfs, and update the libewf module. The tool could then be oblivious of the file format. [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
:::: [[User:Simsong|Simsong]] 06:29, 3 November 2008 (UTC) The problem with using carvfs is that this adds another dependency. Do you really want to require that people install carvfs in order to run the carver? What about having the thing ported to Windows?&lt;br /&gt;
:::::: [[User:Capibara|Rob J Meijer]] I would support adding one build dependency (libcarvpath) and removing two (libewf/libaff) by moving them to a layer more suited for them (carvfs) that would possibly allow some form of file handle (as cap) based POLA design. I am a proponent of making small things that do one thing and do that one thing right, and of stacking those to do what you need. In my view that would ideally lead to the following (simplified) chain:&lt;br /&gt;
::::::* recursive-forensic-framework (ocfa/pyflag)&lt;br /&gt;
::::::** &amp;lt;b&amp;gt;The-(pola-based)-carving-tool&amp;lt;/b&amp;gt; &lt;br /&gt;
::::::*** &amp;lt;b&amp;gt;The-carving-lib&amp;lt;/b&amp;gt; working on open fd's.  &lt;br /&gt;
::::::**** libcarvpath&lt;br /&gt;
::::::***** carvfs (Over cifs/nfs-v4 on platforms that don't support Fuse).&lt;br /&gt;
::::::****** libewf&lt;br /&gt;
::::::****** libaff&lt;br /&gt;
::::::*** AppArmor (on supporting platforms)&lt;br /&gt;
::::::*** suid (on supporting platforms)&lt;br /&gt;
::::::*** iptables/ipfw (on supporting platforms)&lt;br /&gt;
:::::: As for Windows support, I would imagine making carvfs run over SMB would go a long way, that is, insofar as Windows support is all that relevant. &lt;br /&gt;
:::::: There are two advantages to using libcarvpath and carvfs instead of libaff/libewf at this layer:&lt;br /&gt;
::::::* storage requirements for doing carving. Beyond what sleuthkit or alternatives provide I have seen many situations where carving was not done due to storage limitations.&lt;br /&gt;
::::::* File handles are like object capabilities. You can often do pretty simple POLA based implementations using file handles and something like AppArmor. POLA could IMHO be a strong weapon against the more nasty forms of anti forensics.&lt;br /&gt;
::::::Next to this, I would consider making different tools for different stages instead of one semi recursive one, and looking at how to integrate these tools into existing frameworks (ocfa/pyflag). &lt;br /&gt;
::::::Keep things simple but rigid and try to easily integrate things into existing frameworks as effectively as possible I would suggest.&lt;br /&gt;
::::::Please note, I am not proposing the lib/tool should be useless without libcarvpath, only that usage without carvfs should limit the&lt;br /&gt;
::::::supported image formats to raw images, and that libewf/libaff should be abstracted at the Fuse level or below and not at the tool level.  &lt;br /&gt;
:::::::[[User:Joachim Metz|Joachim]] do you have an idea what the performance impact of this approach would be? It might be wise to do a proof of concept for this approach first.&lt;br /&gt;
::::::::[[User:Capibara|Rob J Meijer]] It would, I think, depend greatly on the behavior of the carving lib/tool. Small 512 byte reads are relatively very expensive, 128kb reads have negligible impact. Here are some numbers from ntfs-3g:  http://article.gmane.org/gmane.comp.file-systems.fuse.devel/6397/match=ntfs+3g+performance+ext3 that might be relevant. More relevant than performance might be library footprint. For example, using OCFA, we often would want to keep a few hundred images that total to tens of TB of projected storage size fuse mounted. If libewf/libaff have a big combined memory footprint in such cases, this can be a major issue for this approach.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] this layer should support multi threaded decompression of compressed image types, this speeds up IO&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] volume/partition aware layer (what about carving unpartitioned space)&lt;br /&gt;
* File system aware layer. This could be or make use of tsk-cp.&lt;br /&gt;
** By default, files are not carved. (clarify: only identified? [[User:RB|RB]]; I guess that it operates like [[Selective file dumper]] [[User:.FUF|.FUF]] 07:00, 29 October 2008 (UTC)). Alternatively, the tool could use libcarvpath and output carvpaths or create a directory with symlinks to carvpaths that point into a carvfs mountpoint [[User:Capibara|Rob J Meijer]].&lt;br /&gt;
* Plug-in architecture for identification/validation.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] support for multiple types of validators&lt;br /&gt;
*** dedicated validator&lt;br /&gt;
*** validator based on file library (i.e. we could specify/implement a file structure API for these)&lt;br /&gt;
*** configuration based validator (can handle config files, like Revit07, to enter different file formats used by the carver)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Moderator: Could we limit the requirements for prototype version 1 of the tool to get a working version up and running ASAP?&lt;br /&gt;
And keep discussing future options?&lt;br /&gt;
&lt;br /&gt;
I think the following set will be large enough to handle:&lt;br /&gt;
Input facilities&lt;br /&gt;
* IO support (AFF, device, EWF, RAW and split RAW)&lt;br /&gt;
:: Abstraction of input format and multi threaded decompression (spin-off code out of afflib?)&lt;br /&gt;
* Volume/Partitions support&lt;br /&gt;
:: at least for DOS based layout and GPT (spin-off code out of TSK/Photorec?)&lt;br /&gt;
* File system support&lt;br /&gt;
:: VFAT/NTFS (spin-off code out of TSK/Photorec?)&lt;br /&gt;
&lt;br /&gt;
Carving facilities&lt;br /&gt;
* File format support using plug-able validator model (use dedicated validators Photorec/Scarve and/or wrap revit07 file format as validator?)&lt;br /&gt;
* Content support using plug-able validator model (to handle text/mbox base64)&lt;br /&gt;
* File system carving support (to handle file system fragments, could be linked to file system support layer?)&lt;br /&gt;
* Basic fragment handling&lt;br /&gt;
&lt;br /&gt;
Output facilities&lt;br /&gt;
* audit/analysis/debug log&lt;br /&gt;
* extraction of result files&lt;br /&gt;
&lt;br /&gt;
==Supported File Formats==&lt;br /&gt;
* Ship with validators for:&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I think we should distinguish between file format validators and content validators&lt;br /&gt;
** JPEG&lt;br /&gt;
** PNG&lt;br /&gt;
** GIF&lt;br /&gt;
** MSOLE&lt;br /&gt;
** ZIP&lt;br /&gt;
** TAR (gz/bz2)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] For a production carver we need at least the following formats&lt;br /&gt;
** Graphical Images&lt;br /&gt;
*** JPEG (the 3 different types with JFIF/EXIF support)&lt;br /&gt;
*** PNG&lt;br /&gt;
*** GIF&lt;br /&gt;
*** BMP&lt;br /&gt;
*** TIFF&lt;br /&gt;
** Office documents&lt;br /&gt;
*** OLE2 (Word/Excel content support)&lt;br /&gt;
*** PDF&lt;br /&gt;
*** Open Office/Office 2007 (ZIP+XML)&lt;br /&gt;
:: Extension validation? AFAIK, MS Office 2007 [[DOCX]] format uses plain ZIP (or not?), and carved files will (or not?) have .zip extension instead of DOCX. Is there any way to fix this (may be using the file list in zip)? [[User:.FUF|.FUF]] 20:25, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Addition: Office 2007 also has a binary file format which is also ZIP-ed data&lt;br /&gt;
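:: A hedged sketch of the "file list in zip" idea: once a ZIP container has been carved and validated, its member names can suggest a more specific extension. The function name guess_extension and the marker table are illustrative assumptions (the table is not exhaustive); only Python's standard zipfile module is used.&lt;br /&gt;
```python
import io
import zipfile

# Member names that identify the container type (assumed, not exhaustive).
MARKERS = {
    "word/document.xml": ".docx",
    "xl/workbook.xml": ".xlsx",
    "ppt/presentation.xml": ".pptx",
}


def guess_extension(data):
    """Suggest an extension for a carved ZIP held in memory as bytes."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        # Not a readable ZIP (e.g. truncated): leave the decision to the caller.
        return None
    for member, extension in MARKERS.items():
        if member in names:
            return extension
    return ".zip"
```
:: This only works when the central directory at the end of the ZIP was recovered, so a fragmented or truncated result would still end up as plain .zip or unidentified.&lt;br /&gt;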
&lt;br /&gt;
&lt;br /&gt;
==Archive Files==&lt;br /&gt;
** Archive files&lt;br /&gt;
*** ZIP&lt;br /&gt;
*** 7z&lt;br /&gt;
*** gzip&lt;br /&gt;
*** bzip2&lt;br /&gt;
*** tar&lt;br /&gt;
*** RAR&lt;br /&gt;
** E-mail files&lt;br /&gt;
*** PFF (PST/OST)&lt;br /&gt;
*** MBOX (text based format, base64 content support)&lt;br /&gt;
** Audio/Video files&lt;br /&gt;
*** MPEG&lt;br /&gt;
*** MP2/MP3&lt;br /&gt;
*** AVI&lt;br /&gt;
*** ASF/WMV&lt;br /&gt;
*** QuickTime&lt;br /&gt;
*** MKV&lt;br /&gt;
** Printer spool files&lt;br /&gt;
*** EMF (if I remember correctly)&lt;br /&gt;
** Internet history files&lt;br /&gt;
*** index.dat&lt;br /&gt;
*** Firefox (SQLite 3)&lt;br /&gt;
** Other files&lt;br /&gt;
*** thumbs.db&lt;br /&gt;
*** pagefile?&lt;br /&gt;
&lt;br /&gt;
==Carving Strategies==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Note to moderator could this section be merged with the carving algorithm section?&lt;br /&gt;
&lt;br /&gt;
* Simple fragment recovery carving using gap carving.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have hook in for more advanced fragment recovery?&lt;br /&gt;
* Recovering of individual ZIP sections and JPEG icons that are not sector aligned.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] I would propose a generic fragment detection and recovery&lt;br /&gt;
* Autonomous operation (some mode of operation should be completely non-interactive, requiring no human intervention to complete [[User:RB|RB]])&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] as much as possible, but allow it to be overridden by the user&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] When the tool outputs files, the filenames should contain the offset in the input data (in hexadecimal?)&lt;br /&gt;
:: [[User:Mark Stam|Mark]] I really like the fact that carved files are named after the physical or logical sector in which the file is found (photorec)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This naming schema might cause duplicate name problems when extracting embedded files and when extracting files from non sector aligned file systems.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export embedded files?&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export fragments separately?&lt;br /&gt;
* [[User:Mark Stam|Mark]] I personally use photorec often for carving files in the whole volume (not only unallocated clusters), so I can store information about all potential interesting files in MySQL&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] interesting, Bas Kloet and I have been discussing using information about allocated files in the recovery process, i.e. recovered fragments could be part of allocated files. Do we want to be able to extract them? Or could we rebuild the file from the fragments and the allocated files?&lt;br /&gt;
* [[User:Mark Stam|Mark]] It would also be nice if the files can be hashed immediately (MD5) so looking for them in other tools (for example Encase) is a snap&lt;br /&gt;
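:: A small sketch combining two of the suggestions above: naming each result after its byte offset in the input (in hexadecimal) and hashing it immediately with MD5. The name layout, the index suffix and both function names are assumptions made for illustration, not an agreed output format.&lt;br /&gt;
```python
import hashlib


def carved_name(offset, extension, index=0):
    """Name a carved file after its byte offset in the input, in hex.

    The index suffix is an assumption to keep names unique when several
    results (e.g. embedded files) start at the same offset.
    """
    return "f{:016x}_{:d}{}".format(offset, index, extension)


def describe(data, offset, extension, index=0):
    """Return (file name, MD5 hex digest) for one carved result."""
    return carved_name(offset, extension, index), hashlib.md5(data).hexdigest()
```
:: Using a byte offset rather than a sector number, plus an index, is one possible answer to the duplicate-name concern for embedded files and non sector aligned data.&lt;br /&gt;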
&lt;br /&gt;
==Performance Requirements==&lt;br /&gt;
* Tested on 500GB-sized images. Should be able to carve a 500GB image in roughly 50% longer than it takes to read the image.&lt;br /&gt;
** Perhaps allocate a percentage budget per-validator (i.e. each validator adds N% to the carving time) [[User:RB|RB]]&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have multiple carving phases for precision/speed trade off?&lt;br /&gt;
* Parallelizable&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] tunable for different architectures&lt;br /&gt;
* Configuration:&lt;br /&gt;
** Capability to parse some existing carvers' configuration files, either on-the-fly or as a one-way converter.&lt;br /&gt;
** Disengage internal configuration structure from configuration files, create parsers that present the expected structure&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] The validator should deal with the file structure; the carving algorithm should not know anything about the file structure (as in the revit07 design)&lt;br /&gt;
**  Either extend Scalpel/Foremost syntaxes for extended features or use a tertiary syntax ([[User:Joachim Metz|Joachim]] I would prefer a derivative of the revit07 configuration syntax which already has encountered some problems of dealing with defining file structure in a configuration file)&lt;br /&gt;
&lt;br /&gt;
==Output==&lt;br /&gt;
* Can output audit.txt file.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output database with offset analysis values i.e. for visualization tooling&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output debug log for debugging the algorithm/validation&lt;br /&gt;
* Easy integration into ascription software.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm no native speaker what do you mean with &amp;quot;ascription software&amp;quot;?&lt;br /&gt;
::: I think this was another non-native requesting easy scriptability. [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] that makes sense ;-)&lt;br /&gt;
::::: Incorrect. Ascription software is software that determines who the owner of a file is. [[User:Simsong|Simsong]] 06:36, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= Ideas =&lt;br /&gt;
* Use as much TSK as possible. Don't carry your own FS implementation the way photorec does.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] using TSK as much as possible would not allow adding your own file system support (e.g. mobile phones, memory structures, cap files). I would propose wrapping TSK and using it as much as possible, but allowing integration of our own FS implementations. &lt;br /&gt;
* Extracting/carving data from [[Thumbs.db]]? I've used [[foremost]] for it with some success. [[Vinetto]] has some critical bugs :( [[User:.FUF|.FUF]] 19:18, 28 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
==Recursive Carving==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] do we want to support (let's call it, for now) 'recursive in-file carving'? This is different from embedded files because there is a file system structure in the file and not just another file structure.&lt;br /&gt;
* Is it just me, or do a lot of the above (and below) ideas somewhat skirt around the fact that many of us want recursive carving?  Can we bend back to that instead of discussing object particulars?  I think this can be distilled down to three requirements:&lt;br /&gt;
** Simple recursion: once an object is identified, have the ability to re-carve it for internal structures&lt;br /&gt;
** Directed recursion: the carver should be able to be directed at arbitrary blobs and told to carve it as a specified type.  This allows programmatically more simple methods of dealing with unidentifiably compressed or encrypted data.  Or filesystem fragments.&lt;br /&gt;
** Export: the ability to export an object (recognized or not) for later or external &amp;quot;recursion&amp;quot;.  Should go without saying for a carver, but...&lt;br /&gt;
:--[[User:RB|RB]] 18:45, 2 November 2008 (UTC)&lt;br /&gt;
:: [[User:Simsong|Simsong]] 06:30, 3 November 2008 (UTC) pyflag already does recursive carving. Are we just going to reimplement pyflag as a single executable?&lt;br /&gt;
&lt;br /&gt;
==Library Dependencies==&lt;br /&gt;
[[User:Capibara|Rob J Meijer]] :&lt;br /&gt;
* Use libcarvpath whenever possible and by default to avoid high storage requirements.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] For easy deployment I would not opt for making an integral part of the tool solely dependent on a single external library, or the library must be integrated in the package&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] Integrating libraries (libtsk, libaff, libewf, libcarvpath etc.) is bad practice, autotools are your friend IMO.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not talking about integrating (shared) libraries. My point is that an integral part of a tool should be part of its package. Why can't the tool package contain shared or static libraries for local use? A far worse thing to do is to have a large set of dependencies, making the tool difficult to install for most users. The tool package should contain the most necessary code. afflib/libewf support could be detected by the autotools: a neat separation of functionality.&lt;br /&gt;
::: From a packager's standpoint, [[User:Joachim Metz|Joachim]]'s other libraries do a really good job of this, carrying around what they need but using a system-global version if available.  [[User:RB|RB]]&lt;br /&gt;
* libtsk&lt;br /&gt;
* libaff ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* libewf ? : possibly the discussion in the requirements section should move to this section.&lt;br /&gt;
* posix ? : Can we depend especially on the availability of UNIX domain sockets and the possibility to use msg_accrights for passing open file handles as ocaps?&lt;br /&gt;
&lt;br /&gt;
==Filesystem Detection==&lt;br /&gt;
* Don't stop with filesystem detection after the first match. Often if a partition is reused with a new FS and is not all that full yet, much of the old FS can still be valid. I have seen this with ext2/fat. The fact that you have identified a valid FS on a partition doesn't mean there isn't an (almost) valid second FS that would yield additional files. Identifying doubly allocated space might in some cases also be relevant.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] What you're saying is that dealing with file system fragments should be part of the carving algorithm&lt;br /&gt;
* Allow use where filesystem based carving is done by other tool, and the tool is used as second stage on (sets of) unallocated block (pseudo) files and/or non FS partition (pseudo) files.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I would not opt for this. The tool would be dependent on other tools and their data format, which makes the tool difficult to maintain. I would opt to integrate the functionality of having multiple recovery phases (stages) and allow the tooling to run the phases one after another or separately.&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] More generically, I feel a way should exist to communicate the 'left overs' a previous (non open, for example LE-only) tool left.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess if the tool is designed to handle multiple phases it should store its data somewhere. So it should be possible to convert results of such non open tooling to the format required. However I would opt to design the recovery functionality of these non-open tools into open tools. And not to limit ourselves making translators due to the design of these non-open tools.&lt;br /&gt;
* Ability to be used as a library instead of a tool. Ability to access metadata through the library, and thus the ability to set metadata from the carving modules. This would be extremely useful for integrating the project into a framework like ocfa.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess most of the code could be integrated into libraries, but I would not opt integrating tool functionality into a library&lt;br /&gt;
* A wild idea that I hope at least one person will have a liking for: It might be very interesting to look at the possibilities of using a multi process style of module support and combine it with a least authority design. On platforms that support AppArmor (or similar) and uid based firewall rules, this could make for the first true POLA (principle of least authority) based forensic tool ever. POLA based forensics tools should make for a strong integrity guard against many anti forensics. Alternatively we could look at integrating a capability secure language (E?) for implementation of at least validation modules. I don't expect this idea to make it, but I hope that mentioning it might spark off less strong alternatives that at least partially address the integrity + anti-forensics problem. If we can in some way introduce POLA to a wider forensics public, other tools might also pick up on it, which would be great.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Could you give an example of how you see this in action?&lt;br /&gt;
::::[[User:Capibara|Rob J Meijer]] I see two layers where using POLA could be applied. The best one would require one of the following as prerequisites:&lt;br /&gt;
::::* The libaff/libewf layer is moved to a fuse implementation (for example carvfs).&lt;br /&gt;
::::* Libewf/Libaff are updated to accept opened file handles instead of demanding to open their own files. &lt;br /&gt;
::::If one of these is fulfilled, then the tool running as some user can just have the simple task of opening the image files, starting up the 'real' tool and handing over the appropriate file handles. If the real tool runs with a restrictive AppArmor profile, and is started suid to a tool specific user that also has its own iptables uid based filter, then the real tool will run with least authority.&lt;br /&gt;
:::: A second alternative, if neither of the first prerequisites could be met, would be to run the modules as confined processes and have a non confined process run as proxy for the first.&lt;br /&gt;
:::: A third probably far fetched alternative would be to embed an object capability language in the tool and make the module interface thus that modules are to be written in this ocap language.&lt;br /&gt;
::::A 4th alternative might include minorfs or plash, but I haven't given those sufficient thinking hours yet.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [[User:Mark Stam|Mark]] I think it would be very handy to have a CSV, TSV, XML or other delimited output (log)file with information about carved files. This output file can then be stored in a database or Excel sheet (report function)&lt;br /&gt;
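:: A minimal sketch of such a delimited log, assuming Python. The column names (offset, length, type, md5) and the helper write_carve_log are hypothetical and only illustrate the idea; they are not part of any existing carver.&lt;br /&gt;

```python
import csv
import io

# Hypothetical column layout for the carve log; adjust to whatever the tool records.
FIELDS = ["offset", "length", "type", "md5"]


def write_carve_log(records, stream, delimiter=","):
    """Write one row per carved file to an open text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, delimiter=delimiter)
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Example: two made-up records written as CSV, then as TSV.
records = [
    {"offset": 4096, "length": 53248, "type": "jpeg", "md5": "n/a"},
    {"offset": 258048, "length": 28672, "type": "png", "md5": "n/a"},
]
csv_out = io.StringIO()
write_carve_log(records, csv_out)
tsv_out = io.StringIO()
write_carve_log(records, tsv_out, delimiter="\t")
```

:: Either output can be loaded into a spreadsheet or a database table without further parsing.&lt;br /&gt;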
&lt;br /&gt;
== Format syntax specification ==&lt;br /&gt;
* Carving data structures. For example, extract all TCP headers from image by defining TCP header structure and some fields (e.g. source port &amp;gt; 1024, dest port = 80). This will extract all data matching the pattern and write a file with other fields. Another example is carving INFO2 structures and URL activity records from index.dat [[User:.FUF|.FUF]] 20:51, 28 October 2008 (UTC)&lt;br /&gt;
** This has the opportunity to be extended to the concept of &amp;quot;point at blob FOO and interpret it as BAR&amp;quot;&lt;br /&gt;
.FUF added:&lt;br /&gt;
The main idea is to allow users to define structures, for example (in pascal-like form):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1: Byte = 123;&lt;br /&gt;
SomeTextLength: DWORD;&lt;br /&gt;
SomeText: string[SomeTextLength];&lt;br /&gt;
Field4: Char = 'r';&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will produce something like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1 = 123&lt;br /&gt;
SomeTextLength = 5&lt;br /&gt;
SomeText = 'abcd1'&lt;br /&gt;
Field4 = 'r'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(In text or raw forms.)&lt;br /&gt;
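A minimal sketch of how such a structure definition could be matched against raw data, assuming Python. It hard-codes the four example fields rather than reading a definition file, and it assumes a DWORD is 4 bytes, little-endian; parse_record and scan are hypothetical names used only for illustration.&lt;br /&gt;

```python
def parse_record(buf, pos=0):
    """Return the decoded fields if the example structure starts at pos, else None."""
    head = buf[pos:pos + 5]
    # Field1: Byte = 123 (a fixed value that acts as the signature)
    if len(head) != 5 or head[0] != 123:
        return None
    # SomeTextLength: DWORD, assumed here to be 4 bytes, little-endian
    text_len = int.from_bytes(head[1:5], "little")
    text = buf[pos + 5:pos + 5 + text_len]
    # Field4: Char = 'r' must directly follow the variable-length text
    tail = buf[pos + 5 + text_len:pos + 6 + text_len]
    if len(text) != text_len or tail != b"r":
        return None
    return {
        "Field1": 123,
        "SomeTextLength": text_len,
        "SomeText": text.decode("latin-1"),
        "Field4": "r",
    }


def scan(buf):
    """Yield (offset, fields) for every position where the structure matches."""
    for pos in range(len(buf)):
        fields = parse_record(buf, pos)
        if fields is not None:
            yield pos, fields


# Two padding bytes followed by one record carrying the text 'abcd1'.
sample = b"\x00\x00" + bytes([123]) + (5).to_bytes(4, "little") + b"abcd1r"
hits = list(scan(sample))
```

Scanning byte by byte is the simplest approach and is only meant to show the matching logic; a real carver would need something faster.&lt;br /&gt;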
&lt;br /&gt;
Opinions?&lt;br /&gt;
&lt;br /&gt;
Opinion: Simple pattern identification like that may not suffice, I think Simson's original intent was not only to identify but to allow for validation routines (plugins, as the original wording was).  As such, the format syntax would need to implement a large chunk of some programming language in order to be sufficiently flexible. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
In my opinion your example is too limited. Making the revit configuration I learned you'll need a near programming language to specify some file formats.&lt;br /&gt;
A simple descriptive language is too limiting. I would also go for 2 bytes with endianness instead of using terminology like WORD and small integer, it's much more clear. The configuration also needs to deal with aspects like cardinality, required and optional structures.&lt;br /&gt;
:: This is simply data structures carving, see ideas above. Somebody (I cannot track so many changes per day) separated the original text. There is no need to count and join different structures. [[User:.FUF|.FUF]] 19:53, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This was probably me; is the text back in its original form?&lt;br /&gt;
:::: I started it by moving your Revit07 comment to the validator/plugin section in [http://www.forensicswiki.org/index.php?title=Carver_2.0_Planning_Page&amp;amp;diff=prev&amp;amp;oldid=7583 this edit], since I was still at that point thinking operational configuration for that section, not parser configurations. [[User:RB|RB]]&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] I renamed the title to format syntax, clarity is important ;-)&lt;br /&gt;
&lt;br /&gt;
Please take a look at the revit07 format syntax specification (configuration). It's not there yet but goes a long way. Some things currently missing:&lt;br /&gt;
* bitwise alignment&lt;br /&gt;
* handling encapsulated streams (MPEG/capture files)&lt;br /&gt;
* handling content based formats (MBOX)&lt;br /&gt;
&lt;br /&gt;
=Carving algorithm =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* should we allow for multiple carving phases (runs/stages)?&lt;br /&gt;
:: I opt yes (separation of concern)&lt;br /&gt;
* should we allow for multiple carving algorithms?&lt;br /&gt;
:: I opt yes, this allows testing of different approaches&lt;br /&gt;
* Should the algorithm try to do as much in 1 run over the input data? To reduce IO?&lt;br /&gt;
:: I opt that the tool should allow for multiple and single run over the input data to minimize the IO or the CPU as bottleneck&lt;br /&gt;
* Interaction between algorithm and validators&lt;br /&gt;
** does the algorithm pass data blocks to the validators?&lt;br /&gt;
** does a validator need to maintain a state?&lt;br /&gt;
** does a validator need to revert a state?&lt;br /&gt;
** How do we deal with embedded files and content validation? Do the validators call another validator?&lt;br /&gt;
* do we use the assumption that a data block can be used by a single file (with the exception of embedded/encapsulated files)?&lt;br /&gt;
* Revit07 allows for multiple concurrent result files states to deal with fragmentation. One has the attribute of being active (the preferred) and the other passive. Do we want/need something similar? The algorithm adds block of input data (offsets) to these result files states.&lt;br /&gt;
** if so what info would these result files states require (type, list of input data blocks)&lt;br /&gt;
* how do we deal with file system remainders?&lt;br /&gt;
** Can we abstract them and compare them against available file system information?&lt;br /&gt;
* Do we carve file systems in files?&lt;br /&gt;
:: I opt that at least the validator uses this information&lt;br /&gt;
&lt;br /&gt;
==Carving scenarios ==&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* normal file (file structure, loose text based structure (more a content structure?))&lt;br /&gt;
* fragmented file (the file entirely exists)&lt;br /&gt;
* a file fragment (the file does not entirely exist)&lt;br /&gt;
* intertwined file&lt;br /&gt;
* encapsulated file (MPEG/network capture)&lt;br /&gt;
* embedded file (JPEG thumbnail)&lt;br /&gt;
* obfuscation ('encrypted' PFF) this also entails encryption and/or compression&lt;br /&gt;
* file system in file&lt;br /&gt;
&lt;br /&gt;
=File System Awareness =&lt;br /&gt;
==Background: Why be File System Aware?==&lt;br /&gt;
Advantages of being FS aware:&lt;br /&gt;
* You can pick up sector allocation sizes&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] do you mean file system block sizes?&lt;br /&gt;
* Some file systems may store things off sector boundaries. (ReiserFS with tail packing)&lt;br /&gt;
* Increasingly file systems have compression (NTFS compression)&lt;br /&gt;
* Carve just the sectors that are not in allocated files.&lt;br /&gt;
&lt;br /&gt;
==Tasks that would be required==&lt;br /&gt;
&lt;br /&gt;
==Discussion==&lt;br /&gt;
:: As noted above, TSK should be utilized as much as possible, particularly the filesystem-aware portion.  If we want to identify filesystems outside of its supported set, it would be more worth our time to work on implementing them there than in the carver itself.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:::: I guess this tool operates like [[Selective file dumper]] and can recover files in both ways (or not?). Recovering files by using carving can recover files in situations where sleuthkit does nothing (e.g. file on NTFS was deleted using ntfs-3g, or filesystem was destroyed or just unknown). And we should build the list of filesystems supported by carver, not by TSK. [[User:.FUF|.FUF]] 07:08, 29 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:: This tool is still in the early planning stages (requirements discovery), hence few operational details (like precise modes of operation) have been fleshed out - those will and should come later.  The justification for strictly using TSK for the filesystem-sensitive approach is simple: TSK has good filesystem APIs, and it would be foolish to create yet another standalone, incompatible implementation of filesystem(foo) when time would be better spent improving those in TSK, aiding other methods of analysis as well.  This is the same reason individuals that have implemented several other carvers are participating: de-duplication of effort.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] A design problem might be that TSK currently is a single library operating on multiple layers (storage media IO, volume/partition analysis and file system analysis). I'm not aware how easily the parts can be used separately. But I estimate that for the carver we want to use these 3 layers differently than TSK currently does.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I would like to have the carver (recovery tool) also do recovery using file allocation data or remainders of file allocation data.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] &lt;br /&gt;
I would go as far as to ask you all to look beyond the carver as a tool and look from the perspective of the carver as part of the forensic investigation process. In my eyes certain information needed/acquired by the carver could also be very useful investigative information, i.e. what part of a hard disk contains empty sectors.&lt;br /&gt;
&lt;br /&gt;
=Supportive tooling=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* validator (definitions) tester (detest in revit07)&lt;br /&gt;
* tool to make configuration based definitions&lt;br /&gt;
* post carving validation&lt;br /&gt;
* the carver needs to provide support for fuse mount of carved files (carvfs)&lt;br /&gt;
&lt;br /&gt;
=Testing =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* automated testing&lt;br /&gt;
* test data&lt;br /&gt;
&lt;br /&gt;
=Validator Construction=&lt;br /&gt;
Options:&lt;br /&gt;
* Write validators in C/C++&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] you mean dedicated validators&lt;br /&gt;
* Have a scripting language for writing them (python? Perl?) our own?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] use easy to embed programming languages i.e. Python or Lua&lt;br /&gt;
* Use existing programs (libjpeg?) as plug-in validators?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] define a file structure api for this&lt;br /&gt;
&lt;br /&gt;
=Existing Code that we have=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Please add any missing links&lt;br /&gt;
&lt;br /&gt;
Documentation/Articles&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* DFRWS2008 paper on carving&lt;br /&gt;
&lt;br /&gt;
Carvers&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* photorec (http://www.cgsecurity.org/wiki/PhotoRec)&lt;br /&gt;
* revit06 and revit07 (http://sourceforge.net/projects/revit/)&lt;br /&gt;
* s3/scarve&lt;br /&gt;
&lt;br /&gt;
Possible file structure validator libraries&lt;br /&gt;
* various existing file support libraries&lt;br /&gt;
* libole2 (inhouse experimental code of OLE2 support)&lt;br /&gt;
* libpff (alpha release for PFF (PST/OST) file support) (http://sourceforge.net/projects/libpff/)&lt;br /&gt;
&lt;br /&gt;
Input support&lt;br /&gt;
* AFF (http://www.afflib.org/)&lt;br /&gt;
* EWF (http://sourceforge.net/projects/libewf/)&lt;br /&gt;
* TSK device &amp;amp; raw &amp;amp; split raw (http://www.sleuthkit.org/)&lt;br /&gt;
&lt;br /&gt;
Volume/Partition support&lt;br /&gt;
* disktype (http://disktype.sourceforge.net/)&lt;br /&gt;
* testdisk (http://www.cgsecurity.org/wiki/TestDisk)&lt;br /&gt;
* TSK&lt;br /&gt;
&lt;br /&gt;
File system support&lt;br /&gt;
* TSK&lt;br /&gt;
* photorec FS code&lt;br /&gt;
* implementations of FS in Linux/BSD&lt;br /&gt;
&lt;br /&gt;
Content support&lt;br /&gt;
&lt;br /&gt;
Zero storage support&lt;br /&gt;
* libcarvpath ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210704 )&lt;br /&gt;
* carvfs ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210954 )&lt;br /&gt;
* tsk-cp ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=267227 )&lt;br /&gt;
* carvfsmodewf (http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=268256 )&lt;br /&gt;
POLA&lt;br /&gt;
* joe-e (java) ( http://code.google.com/p/joe-e/ )&lt;br /&gt;
* Emily (ocaml)  ( http://erights.org/download/emily/ )&lt;br /&gt;
* the E language ( http://www.erights.org/ )&lt;br /&gt;
* AppArmor&lt;br /&gt;
* iptables/ipfw&lt;br /&gt;
* minorfs ( http://polacanthus.net/minorfs.html )&lt;br /&gt;
* plash ( http://plash.beasts.org/wiki/ )&lt;br /&gt;
&lt;br /&gt;
=Implementation Timeline=&lt;br /&gt;
# gather the available resources/ideas/wishes/needs etc. (I guess we're in this phase)&lt;br /&gt;
# start discussing a high level design (in terms of algorithm, facilities, information needed)&lt;br /&gt;
## input formats facility&lt;br /&gt;
## partition/volume facility&lt;br /&gt;
## file system facility&lt;br /&gt;
## file format facility&lt;br /&gt;
## content facility&lt;br /&gt;
## how to deal with fragment detection (do the validators allow for fragment detection?)&lt;br /&gt;
## how to deal with recombination of fragments&lt;br /&gt;
## do we want multiple carving phases in light of speed/precision tradeoffs&lt;br /&gt;
# start detailing parts of the design&lt;br /&gt;
## Discuss options for a grammar driven validator?&lt;br /&gt;
## Hard-coded plug-ins?&lt;br /&gt;
## Which existing code can we use?&lt;br /&gt;
# start building/assembling parts of the tooling for a prototype&lt;br /&gt;
## Implement simple file carving with validation.&lt;br /&gt;
## Implement gap carving&lt;br /&gt;
# Initial Release&lt;br /&gt;
# Implement the ''threaded carving'' that [[User:.FUF|.FUF]] is describing above.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Shouldn't multi threaded carving (MTC) be part of the 1st version?&lt;br /&gt;
The MT approach makes for different design decisions&lt;br /&gt;
: It is virtually impossible to turn a non-MT application into an MT application .[[User:Simsong|Simsong]] 06:37, 3 November 2008 (UTC)&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	<entry>
		<id>http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page</id>
		<title>Carver 2.0 Planning Page</title>
		<link rel="alternate" type="text/html" href="http://www.forensicswiki.org/wiki/Carver_2.0_Planning_Page"/>
				<updated>2008-11-03T15:40:12Z</updated>
		
		<summary type="html">&lt;p&gt;Capibara: /* Requirements */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is for planning Carver 2.0.&lt;br /&gt;
&lt;br /&gt;
Please, do not delete text (ideas) here. Use something like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;s&amp;gt;bad idea&amp;lt;/s&amp;gt;&lt;br /&gt;
:: good idea&lt;br /&gt;
&lt;br /&gt;
= License =&lt;br /&gt;
&lt;br /&gt;
BSD-3.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] library based validators could require other licenses&lt;br /&gt;
::: Make the other libraries plug-able. If you have them, you use them. [[User:Simsong|Simsong]] 06:34, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= OS =&lt;br /&gt;
&lt;br /&gt;
Linux/FreeBSD/MacOS&lt;br /&gt;
: Shouldn't this just match what the underlying afflib &amp;amp; sleuthkit cover? [[User:RB|RB]]&lt;br /&gt;
:: Yes, but you need to test and validate on each. Question: Do we want to support windows? [[User:Simsong|Simsong]] 21:09, 30 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I think we would be wise to design with Windows support from the start; this will improve platform independence.&lt;br /&gt;
:::: Agreed; I would even settle at first for being able to run against Cygwin.  Note that I don't even own or use a copy of Windows, but the vast majority of forensic investigators do. [[User:RB|RB]] 14:01, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Capibara|Rob J Meijer]] Leaning heavily on the autotools might be the way to go. I do however feel that support requirements for windows would not be essential. Being able to run from a virtual machine with the main storage mounted over cifs should however be tested and if possible tuned extensively.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] You'll need more than autotools to do native Windows support i.e. file access, UTF-16 support, wrap some basic system functions or have them available otherwise&lt;br /&gt;
::::::[[User:Capibara|Rob J Meijer]] That's exactly my point: Windows support, as in being able to build and run on Windows natively, is much more trouble than it's worth. Better to make a lean and mean autotools based build with few dependencies and little or no recursion, and better to spend effort on a lean POLA design on POSIX based systems than on supporting building and running on non POSIX systems.&lt;br /&gt;
&lt;br /&gt;
= Name tooling =&lt;br /&gt;
&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] A name for the tooling I propose coldcut&lt;br /&gt;
:: How about 'butcher'?  ;)  [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] cleaver ( scalpel on steroids ;-) )&lt;br /&gt;
* I would like to propose Gouge or Chisel :-) [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
&lt;br /&gt;
= Requirements =&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Could we do a MoSCoW evaluation of these.&lt;br /&gt;
* AFF and EWF file images supported from scratch. ([[User:Joachim Metz|Joachim]] I would like to have raw/split raw and device access as well)&lt;br /&gt;
:: If we base our image i/o on afflib, we get all three with one interface. [[User:RB|RB]] Instead of letting the tools use afflib, better to write an afflib module for carvfs, and update the libewf module. The tool could then be oblivious of the file format. [[User:Capibara|Rob J Meijer]]&lt;br /&gt;
:::: [[User:Simsong|Simsong]] 06:29, 3 November 2008 (UTC) The problem with using carvfs is that this adds another dependency. Do you really want to require that people install carvfs in order to run the carver? What about having the thing ported to Windows?&lt;br /&gt;
:::::: [[User:Capibara|Rob J Meijer]] I would support adding one build dependency (libcarvpath) and removing two (libewf/libaff) by moving them to a layer more suited for them (carvfs) that would possibly allow some form of file handle (as cap) based POLA design. I am a proponent of making small things that do one thing and do it right, and stacking those to do what you need. In my view that would lead ideally to the following (simplified) chain:&lt;br /&gt;
::::::* recursive-forensic-framework (ocfa/pyflag)&lt;br /&gt;
::::::** &amp;lt;b&amp;gt;The-(pola-based)-carving-tool&amp;lt;/b&amp;gt; &lt;br /&gt;
::::::*** &amp;lt;b&amp;gt;The-carving-lib&amp;lt;/b&amp;gt; working on open fd's.  &lt;br /&gt;
::::::**** libcarvpath&lt;br /&gt;
::::::***** carvfs (Over cifs/nfs-v4 on platforms that don't support Fuse).&lt;br /&gt;
::::::****** libewf&lt;br /&gt;
::::::****** libaff&lt;br /&gt;
::::::*** AppArmor (on supporting platforms)&lt;br /&gt;
::::::*** suid (on supporting platforms)&lt;br /&gt;
::::::*** iptables/ipfw (on supporting platforms)&lt;br /&gt;
:::::: As for Windows support, I would imagine making carvfs run over SMB would go a long way, that is, as far as Windows support is all that relevant. &lt;br /&gt;
:::::: There are two advantages to using libcarvpath and carvfs instead of libaff/libewf at this layer:&lt;br /&gt;
::::::* storage requirements for doing carving. Beyond what sleuthkit or alternatives provide I have seen many situations where carving was not done due to storage limitations.&lt;br /&gt;
::::::* File handles are like object capabilities. You can often do pretty simple POLA based implementations using file handles and something like AppArmor. POLA could IMHO be a strong weapon against the more nasty forms of anti forensics.&lt;br /&gt;
::::::Next to this, I would consider making different tools for different stages instead of one semi recursive one, and looking at how to integrate these tools into existing frameworks (ocfa/pyflag). &lt;br /&gt;
::::::Keep things simple but rigid and try to easily integrate things into existing frameworks as effectively as possible I would suggest.&lt;br /&gt;
::::::Please note, I am not proposing the lib/tool should be useless without libcarvpath, only that usage without carvfs should limit the&lt;br /&gt;
::::::supported image formats to raw images, and that libewf/libaff should be abstracted at the Fuse level or below and not at the tool level.  &lt;br /&gt;
:::::::[[User:Joachim Metz|Joachim]] do you have an idea what the performance impact of this approach would be? It might be wise to do a proof of concept for this approach first.&lt;br /&gt;
::::::::[[User:Capibara|Rob J Meijer]] It would I think depend greatly on behavior of the carving lib/tool. Small 512 byte reads are relatively very expensive, 128kb reads have negligible impact. Here are some numbers from ntfs-3g:  http://article.gmane.org/gmane.comp.file-systems.fuse.devel/6397/match=ntfs+3g+performance+ext3 that might be relevant. More relevant than performance might be library footprint. For example, using OCFA, we often would want to keep a few hundred images that total to tens of TB of projected storage size fuse mounted. If libewf/libaff have a big combined memory footprint in such cases, this can be a major issue for this approach.&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] this layer should support multi threaded decompression of compressed image types, this speeds up IO&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] volume/partition aware layer (what about carving unpartitioned space)&lt;br /&gt;
* File system aware layer. This could be or make use of tsk-cp.&lt;br /&gt;
** By default, files are not carved. (clarify: only identified? [[User:RB|RB]]; I guess that it operates like [[Selective file dumper]] [[User:.FUF|.FUF]] 07:00, 29 October 2008 (UTC)). Alternatively, the tool could use libcarvpath and output carvpaths or create a directory with symlinks to carvpaths that point into a carvfs mountpoint [[User:Capibara|Rob J Meijer]].&lt;br /&gt;
* Plug-in architecture for identification/validation.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] support for multiple types of validators&lt;br /&gt;
*** dedicated validator&lt;br /&gt;
*** validator based on file library (i.e. we could specify/implement a file structure API for these)&lt;br /&gt;
*** configuration based validator (Can handle config files,like Revit07, to enter different file formats used by the carver.)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Moderator: Could we limit the requirements for prototype version 1 of the tool to get a working version up and running ASAP?&lt;br /&gt;
And keep discussing future options?&lt;br /&gt;
&lt;br /&gt;
I think the following set will be large enough to handle:&lt;br /&gt;
Input facilities&lt;br /&gt;
* IO support (AFF, device, EWF, RAW and split RAW)&lt;br /&gt;
:: Abstraction of input format and multi threaded decompression (spin-off code out of afflib?)&lt;br /&gt;
* Volume/Partitions support&lt;br /&gt;
:: at least for DOS based layout and GPT (spin-off code out of TSK/Photorec?)&lt;br /&gt;
* File system support&lt;br /&gt;
:: VFAT/NTFS (spin-off code out of TSK/Photorec?)&lt;br /&gt;
&lt;br /&gt;
Carving facilities&lt;br /&gt;
* File format support using plug-able validator model (use dedicated validators Photorec/Scarve and/or wrap revit07 file format as validator?)&lt;br /&gt;
* Content support using plug-able validator model (to handle text/mbox base64)&lt;br /&gt;
* File system carving support (to handle file system fragments, could be linked to file system support layer?)&lt;br /&gt;
* Basic fragment handling&lt;br /&gt;
&lt;br /&gt;
Output facilities&lt;br /&gt;
* audit/analysis/debug log&lt;br /&gt;
* extraction of result files&lt;br /&gt;
&lt;br /&gt;
==Supported File Formats==&lt;br /&gt;
* Ship with validators for:&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I think we should distinguish between file format validators and content validators&lt;br /&gt;
** JPEG&lt;br /&gt;
** PNG&lt;br /&gt;
** GIF&lt;br /&gt;
** MSOLE&lt;br /&gt;
** ZIP&lt;br /&gt;
** TAR (gz/bz2)&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] For a production carver we need at least the following formats&lt;br /&gt;
** Graphical Images&lt;br /&gt;
*** JPEG (the 3 different types with JFIF/EXIF support)&lt;br /&gt;
*** PNG&lt;br /&gt;
*** GIF&lt;br /&gt;
*** BMP&lt;br /&gt;
*** TIFF&lt;br /&gt;
** Office documents&lt;br /&gt;
*** OLE2 (Word/Excel content support)&lt;br /&gt;
*** PDF&lt;br /&gt;
*** Open Office/Office 2007 (ZIP+XML)&lt;br /&gt;
:: Extension validation? AFAIK, MS Office 2007 [[DOCX]] format uses plain ZIP (or not?), and carved files will (or not?) have .zip extension instead of DOCX. Is there any way to fix this (may be using the file list in zip)? [[User:.FUF|.FUF]] 20:25, 31 October 2008 (UTC)&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Addition: Office 2007 also has a binary file format which is also ZIP-ed data&lt;br /&gt;
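:: A minimal sketch of the 'use the file list in the zip' idea, assuming Python and its standard zipfile module. The helper name guess_zip_extension is hypothetical. It relies on Office 2007 packages carrying a [Content_Types].xml member plus a word/, xl/ or ppt/ directory, and it only works when the carved ZIP still has a readable central directory.&lt;br /&gt;

```python
import io
import zipfile


def guess_zip_extension(data):
    """Return a likely extension for ZIP data, based on its member names."""
    try:
        names = zipfile.ZipFile(io.BytesIO(data)).namelist()
    except zipfile.BadZipFile:
        # Truncated or damaged archive: the member list cannot be read.
        return None
    if "[Content_Types].xml" in names:
        # Office 2007 packages keep their main part under a type-specific directory.
        for prefix, ext in (("word/", "docx"), ("xl/", "xlsx"), ("ppt/", "pptx")):
            if any(name.startswith(prefix) for name in names):
                return ext
    return "zip"


# Build a tiny in-memory archive that merely looks like a DOCX package.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("[Content_Types].xml", "")
    zf.writestr("word/document.xml", "")
guessed = guess_zip_extension(buf.getvalue())
```

:: A validator could run a check like this after ZIP validation succeeds and rename the result accordingly; fragments without a central directory would keep the plain zip extension or be left unclassified.&lt;br /&gt;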
&lt;br /&gt;
&lt;br /&gt;
==Archive Files==&lt;br /&gt;
** Archive files&lt;br /&gt;
*** ZIP&lt;br /&gt;
*** 7z&lt;br /&gt;
*** gzip&lt;br /&gt;
*** bzip2&lt;br /&gt;
*** tar&lt;br /&gt;
*** RAR&lt;br /&gt;
** E-mail files&lt;br /&gt;
*** PFF (PST/OST)&lt;br /&gt;
*** MBOX (text based format, base64 content support)&lt;br /&gt;
** Audio/Video files&lt;br /&gt;
*** MPEG&lt;br /&gt;
*** MP2/MP3&lt;br /&gt;
*** AVI&lt;br /&gt;
*** ASF/WMV&lt;br /&gt;
*** QuickTime&lt;br /&gt;
*** MKV&lt;br /&gt;
** Printer spool files&lt;br /&gt;
*** EMF (if I remember correctly)&lt;br /&gt;
** Internet history files&lt;br /&gt;
*** index.dat&lt;br /&gt;
*** firefox (SQLite 3)&lt;br /&gt;
** Other files&lt;br /&gt;
*** thumbs.db&lt;br /&gt;
*** pagefile?&lt;br /&gt;
&lt;br /&gt;
==Carving Strategies==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Note to moderator could this section be merged with the carving algorithm section?&lt;br /&gt;
&lt;br /&gt;
* Simple fragment recovery carving using gap carving.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have hook in for more advanced fragment recovery?&lt;br /&gt;
* Recovering of individual ZIP sections and JPEG icons that are not sector aligned.&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] I would propose a generic fragment detection and recovery&lt;br /&gt;
* Autonomous operation (some mode of operation should be completely non-interactive, requiring no human intervention to complete [[User:RB|RB]])&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] as much as possible, but allow it to be overridden by the user&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] When the tool outputs files, the filenames should contain the offset in the input data (in hexadecimal?)&lt;br /&gt;
:: [[User:Mark Stam|Mark]] I really like the fact carved files are named after the physical or logical sector in which the file is found (photorec)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This naming schema might cause duplicate name problem for extracting embedded files and extracting files from non sector aligned file systems.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export embedded files?&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Should the tool allow to export fragments separately?&lt;br /&gt;
* [[User:Mark Stam|Mark]] I personally use photorec often for carving files in the whole volume (not only unallocated clusters), so I can store information about all potential interesting files in MySQL&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] interesting, Bas Kloet and I have been discussing using information about allocated files in the recovery process, i.e. recovered fragments could be part of allocated files. Do we want to be able to extract them? Or could we rebuild the file from the fragments and the allocated files?&lt;br /&gt;
* [[User:Mark Stam|Mark]] It would also be nice if the files can be hashed immediately (MD5) so looking for them in other tools (for example Encase) is a snap&lt;br /&gt;
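:: A minimal sketch combining the two suggestions above (offset-based file names and immediate MD5 hashing), assuming Python. The helper describe_carved and the 16-digit zero-padded hexadecimal naming scheme are hypothetical choices for illustration.&lt;br /&gt;

```python
import hashlib


def describe_carved(data, offset, ext="bin"):
    """Return (filename, md5 hex digest) for a carved blob found at a byte offset."""
    # The input offset, zero-padded in hexadecimal, becomes the file name.
    name = f"{offset:016x}.{ext}"
    digest = hashlib.md5(data).hexdigest()
    return name, digest


# Example: an empty blob reported at byte offset 4096.
name, digest = describe_carved(b"", 4096, "jpg")
```

:: As noted above, offsets alone can collide for embedded files, so a real scheme would probably need an extra component in the name.&lt;br /&gt;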
&lt;br /&gt;
==Performance Requirements==&lt;br /&gt;
* Tested on 500GB-sized images. Should be able to carve a 500GB image in roughly 50% longer than it takes to read the image.&lt;br /&gt;
** Perhaps allocate a percentage budget per-validator (i.e. each validator adds N% to the carving time) [[User:RB|RB]]&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] have multiple carving phases for precision/speed trade off?&lt;br /&gt;
* Parallelizable&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] tunable for different architectures&lt;br /&gt;
* Configuration:&lt;br /&gt;
** Capability to parse some existing carvers' configuration files, either on-the-fly or as a one-way converter.&lt;br /&gt;
** Disengage internal configuration structure from configuration files, create parsers that present the expected structure&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] The validator should deal with the file structure the carving algorithm should not know anything about the file structure (as in revit07 design)&lt;br /&gt;
**  Either extend Scalpel/Foremost syntaxes for extended features or use a tertiary syntax ([[User:Joachim Metz|Joachim]] I would prefer a derivative of the revit07 configuration syntax which already has encountered some problems of dealing with defining file structure in a configuration file)&lt;br /&gt;
&lt;br /&gt;
==Output==&lt;br /&gt;
* Can output audit.txt file.&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output database with offset analysis values i.e. for visualization tooling&lt;br /&gt;
* [[User:Joachim Metz|Joachim]] Can output debug log for debugging the algorithm/validation&lt;br /&gt;
* Easy integration into ascription software.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm no native speaker what do you mean with &amp;quot;ascription software&amp;quot;?&lt;br /&gt;
::: I think this was another non-native requesting easy scriptability. [[User:RB|RB]] 14:20, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] that makes sense ;-)&lt;br /&gt;
::::: Incorrect. Ascription software is software that determines who the owner of a file is. [[User:Simsong|Simsong]] 06:36, 3 November 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
= Ideas =&lt;br /&gt;
* Use TSK as much as possible. Don't carry your own FS implementation the way photorec does.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] using TSK as much as possible would not allow to add your own file system support (i.e. mobile phones, memory structures, cap files) I would propose wrapping TSK and using it as much as possible but allow to integrate own FS implementations. &lt;br /&gt;
* Extracting/carving data from [[Thumbs.db]]? I've used [[foremost]] for it with some success. [[Vinetto]] has some critical bugs :( [[User:.FUF|.FUF]] 19:18, 28 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
==Recursive Carving==&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Do we want to support (let's call it) 'recursive in-file carving' (for now)? This is different from embedded files because there is a file system structure in the file and not just another file structure.&lt;br /&gt;
* Is it just me, or do a lot of the above (and below) ideas somewhat skirt around the fact that many of us want recursive carving?  Can we bend back to that instead of discussing object particulars?  I think this can be distilled down to three requirements:&lt;br /&gt;
** Simple recursion: once an object is identified, have the ability to re-carve it for internal structures&lt;br /&gt;
** Directed recursion: the carver should be able to be directed at arbitrary blobs and told to carve them as a specified type.  This allows programmatically simpler methods of dealing with unidentifiably compressed or encrypted data.  Or filesystem fragments.&lt;br /&gt;
** Export: the ability to export an object (recognized or not) for later or external &amp;quot;recursion&amp;quot;.  Should go without saying for a carver, but...&lt;br /&gt;
:--[[User:RB|RB]] 18:45, 2 November 2008 (UTC)&lt;br /&gt;
:: [[User:Simsong|Simsong]] 06:30, 3 November 2008 (UTC) pyflag already does recursive carving. Are we just going to reimplement pyflag as a single executable?&lt;br /&gt;
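&lt;br /&gt;
The three requirements above (simple recursion, directed recursion, export) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names SIGNATURES, carve and export_object are hypothetical, the "box" container format is made up, and header/footer matching stands in for real validators.&lt;br /&gt;

```python
# Hypothetical sketch of simple recursion, directed recursion and export.
# SIGNATURES and the "box" container format are made up for illustration.

SIGNATURES = {
    "jpeg": (b"\xff\xd8\xff", b"\xff\xd9"),  # (header, footer)
    "box": (b"BOX[", b"]BOX"),               # invented container format
}


def carve(data, base=0, only_type=None, depth=0, max_depth=4):
    """Return a list of (type, absolute_offset, length, depth) tuples.

    only_type gives directed recursion: carve the blob as that type only.
    """
    found = []
    pos = 0
    while pos != len(data):
        step = 1
        for name, (header, footer) in SIGNATURES.items():
            if only_type not in (None, name):
                continue
            if not data.startswith(header, pos):
                continue
            end = data.find(footer, pos + len(header))
            if end == -1:
                continue  # no footer: not a complete object here
            stop = end + len(footer)
            found.append((name, base + pos, stop - pos, depth))
            if depth + 1 in range(max_depth):
                # Simple recursion: re-carve the interior of the object.
                inner = pos + len(header)
                found.extend(carve(data[inner:end], base + inner,
                                   None, depth + 1, max_depth))
            step = stop - pos  # continue scanning after this object
            break
        pos += step
    return found


def export_object(data, offset, length):
    """Export: return the raw bytes of an object for later or external use."""
    return data[offset:offset + length]


sample = (b"xx" + b"BOX[" + b"aa" + b"\xff\xd8\xff" + b"JJ"
          + b"\xff\xd9" + b"]BOX" + b"yy")
```

Calling carve(sample) reports the container and the JPEG nested inside it, while carve(sample, only_type="jpeg") treats the whole blob as JPEG data only. The first-footer search is deliberately naive and would mis-handle nested objects of the same type.&lt;br /&gt;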
&lt;br /&gt;
==Library Dependencies==&lt;br /&gt;
[[User:Capibara|Rob J Meijer]] :&lt;br /&gt;
* Use libcarvpath whenever possible and by default to avoid high storage requirements.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] For easy deployment I would not opt for making an integral part of the tool solely dependent on a single external library; otherwise the library must be integrated in the package&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] Integrating libraries (libtsk, libaff, libewf, libcarvpath etc.) is bad practice; autotools are your friend IMO.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I'm not talking about integrating (shared) libraries. I'm saying that an integral part of a tool should be part of its package. Why can't the tool package contain shared or static libraries for local use? A far worse thing to do is to have a large set of dependencies, making the tool difficult to install for most users. The tool package should contain the most necessary code. afflib/libewf support could be detected by the autotools, which gives a neat separation of functionality.&lt;br /&gt;
::: From a packager's standpoint, [[User:Joachim Metz|Joachim]]'s other libraries do a really good job of this, carrying around what they need but using a system-global version if available.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
==Filesystem Detection==&lt;br /&gt;
* Don't stop with filesystem detection after the first match. Often if a partition is reused with a new FS and is not all that full yet, much of the old FS can still be valid. I have seen this with ext2/fat. The fact that you have identified a valid FS on a partition doesn't mean there isn't an (almost) valid second FS that would yield additional files. Identifying doubly allocated space might in some cases also be relevant.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] What you're saying is that dealing with file system fragments should be part of the carving algorithm&lt;br /&gt;
* Allow use where filesystem based carving is done by other tool, and the tool is used as second stage on (sets of) unallocated block (pseudo) files and/or non FS partition (pseudo) files.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I would not opt for this. The tool would be dependent on other tools and their data format, which makes the tool difficult to maintain. I would opt to integrate the functionality of having multiple recovery phases (stages) and allow the tooling to run the phases one after another or separately.&lt;br /&gt;
::[[User:Capibara|Rob J Meijer]] More generically, I feel a way should exist to communicate the 'left overs' a previous (non open, for example LE-only) tool left.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess if the tool is designed to handle multiple phases it should store its data somewhere. So it should be possible to convert results of such non open tooling to the format required. However I would opt to design the recovery functionality of these non-open tools into open tools. And not to limit ourselves making translators due to the design of these non-open tools.&lt;br /&gt;
* Ability to be used as a library instead of a tool. Ability to access metadata through the library, and thus the ability to set metadata from the carving modules. This would be extremely useful for integrating the project into a framework like ocfa.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] I guess most of the code could be integrated into libraries, but I would not opt integrating tool functionality into a library&lt;br /&gt;
* A wild idea that I hope at least one person will have a liking for: It might be very interesting to look at the possibilities of using a multi-process style of module support and combine it with a least authority design. On platforms that support AppArmor (or similar) and uid-based firewall rules, this could make for the first true POLA (principle of least authority) based forensic tool ever. POLA-based forensics tools should make for a strong integrity guard against many anti-forensics techniques. Alternatively we could look at integrating a capability-secure language (E?) for implementation of at least validation modules. I don't expect this idea to make it, but I hope mentioning it might spark off less strong alternatives that at least partially address the integrity + anti-forensics problem. If we can in some way introduce POLA to a wider forensics public, other tools might also pick up on it, which would be great.&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] Could you give an example of how you see this in action?&lt;br /&gt;
::::[[User:Capibara|Rob J Meijer]] I see two layers where POLA could be applied. The best one would require one of the following as prerequisites:&lt;br /&gt;
::::* The libaff/libewf layer is moved to a fuse implementation (for example carvfs).&lt;br /&gt;
::::* Libewf/Libaff are updated to accept opened file handles instead of demanding to open their own files.&lt;br /&gt;
::::If one of these is fulfilled, then the tool running as some user can just have the simple task of opening the image files, starting up the 'real' tool and handing over the appropriate file handles. If the real tool runs with a restrictive AppArmor profile, and is started suid to a tool-specific user that also has its own iptables uid-based filter, then the real tool will run with least authority.&lt;br /&gt;
:::: A second alternative, if neither of the prerequisites can be met, would be to run the modules as confined processes and have a non-confined process act as a proxy for them.&lt;br /&gt;
:::: A third probably far fetched alternative would be to embed an object capability language in the tool and make the module interface thus that modules are to be written in this ocap language.&lt;br /&gt;
::::A 4th alternative might include minorfs or plash, but I haven't given those sufficient thinking hours yet.&lt;br /&gt;
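&lt;br /&gt;
The 'open the image, then hand over the file handle' step of the first alternative can be sketched in Python as below. This is a sketch under assumptions, not a working least-authority setup: launch_with_handle, demo and CHILD_CODE are hypothetical names, the child is a stand-in for the 'real' tool, and the AppArmor profile, suid switch and iptables filter are not shown. It relies on POSIX descriptor inheritance.&lt;br /&gt;

```python
import os
import subprocess
import sys

# Stand-in for the confined 'real' tool (hypothetical): it never opens a
# path itself, it only reads from the descriptor number it is given.
CHILD_CODE = ("import os, sys; "
              "sys.stdout.write(os.read(int(sys.argv[1]), 16).hex())")


def launch_with_handle(image_path, child_argv):
    """Open the image in the launcher and pass only the descriptor on."""
    fd = os.open(image_path, os.O_RDONLY)
    try:
        # pass_fds keeps this one descriptor open in the child process;
        # the descriptor number is appended as the last argument.
        return subprocess.run(list(child_argv) + [str(fd)], pass_fds=(fd,),
                              capture_output=True, check=True)
    finally:
        os.close(fd)


def demo(image_path):
    """Run the stand-in child and return the bytes it read via the handle."""
    result = launch_with_handle(image_path,
                                [sys.executable, "-c", CHILD_CODE])
    return bytes.fromhex(result.stdout.decode())
```

In a real deployment the child would be the confined carver binary, and the confinement itself would come from the platform (profile, user, firewall rules) rather than from this launcher.&lt;br /&gt;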
&lt;br /&gt;
&lt;br /&gt;
* [[User:Mark Stam|Mark]] I think it would be very handy to have a CSV, TSV, XML or other delimited output (log)file with information about carved files. This output file can then be stored in a database or Excel sheet (report function)&lt;br /&gt;
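&lt;br /&gt;
A minimal sketch of such a delimited log, using Python's standard csv module. The column names in FIELDS, the function name write_carve_log and the sample records are all assumptions made up for illustration; no agreed log format exists yet.&lt;br /&gt;

```python
import csv
import io

# Hypothetical column set for a carve log; not an agreed format.
FIELDS = ["offset", "length", "type", "validated"]


def write_carve_log(records, stream):
    """Write one CSV row per carved file to a text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Made-up sample records, for illustration only.
records = [
    {"offset": 4096, "length": 53248, "type": "jpeg", "validated": True},
    {"offset": 258048, "length": 28672, "type": "pdf", "validated": False},
]

buffer = io.StringIO()
write_carve_log(records, buffer)
log_text = buffer.getvalue()
```

Switching to TSV would only need delimiter="\t" passed to csv.DictWriter; the resulting file can be loaded into a database or spreadsheet as suggested above.&lt;br /&gt;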
&lt;br /&gt;
== Format syntax specification ==&lt;br /&gt;
* Carving data structures. For example, extract all TCP headers from image by defining TCP header structure and some fields (e.g. source port &amp;gt; 1024, dest port = 80). This will extract all data matching the pattern and write a file with other fields. Another example is carving INFO2 structures and URL activity records from index.dat [[User:.FUF|.FUF]] 20:51, 28 October 2008 (UTC)&lt;br /&gt;
** This has the opportunity to be extended to the concept of &amp;quot;point at blob FOO and interpret it as BAR&amp;quot;&lt;br /&gt;
.FUF added:&lt;br /&gt;
The main idea is to allow users to define structures, for example (in pascal-like form):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1: Byte = 123;&lt;br /&gt;
SomeTextLength: DWORD;&lt;br /&gt;
SomeText: string[SomeTextLength];&lt;br /&gt;
Field4: Char = 'r';&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will produce something like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Field1 = 123&lt;br /&gt;
SomeTextLength = 5&lt;br /&gt;
SomeText = 'abcd1'&lt;br /&gt;
Field4 = 'r'&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(In text or raw forms.)&lt;br /&gt;
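&lt;br /&gt;
As a minimal sketch, the example structure above could be matched against raw bytes as follows. The assumptions are mine: DWORD is taken as 4 bytes little-endian, the string as ASCII, and parse_record and carve_records are hypothetical names. A real implementation would generate this logic from the definition rather than hard-code it.&lt;br /&gt;

```python
# Hypothetical sketch: match the example structure against raw bytes.
# Assumptions: DWORD is 4 bytes little-endian, SomeText is ASCII.

def parse_record(data, offset=0):
    """Return a dict of fields if the structure matches at offset, else None."""
    header = data[offset:offset + 5]
    if len(header) != 5 or header[0] != 123:       # Field1: Byte = 123
        return None
    text_length = int.from_bytes(header[1:5], "little")  # SomeTextLength
    text_start = offset + 5
    text = data[text_start:text_start + text_length]      # SomeText
    tail_start = text_start + text_length
    tail = data[tail_start:tail_start + 1]
    if len(text) != text_length or tail != b"r":    # Field4: Char = 'r'
        return None
    return {
        "Field1": header[0],
        "SomeTextLength": text_length,
        "SomeText": text.decode("ascii", errors="replace"),
        "Field4": tail.decode("ascii"),
    }


def carve_records(data):
    """Try every offset and collect (offset, fields) for each match."""
    matches = []
    for offset in range(len(data)):
        fields = parse_record(data, offset)
        if fields is not None:
            matches.append((offset, fields))
    return matches


# Two junk bytes, one record matching the example output, one junk byte.
sample = (b"\x00\x00" + bytes([123]) + (5).to_bytes(4, "little")
          + b"abcd1" + b"r" + b"\xff")
```

carve_records(sample) yields one match at offset 2 with the same field values as the example output above. Constant fields (Field1, Field4) act as the pattern, while variable fields are simply extracted.&lt;br /&gt;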
&lt;br /&gt;
Opinions?&lt;br /&gt;
&lt;br /&gt;
Opinion: Simple pattern identification like that may not suffice, I think Simson's original intent was not only to identify but to allow for validation routines (plugins, as the original wording was).  As such, the format syntax would need to implement a large chunk of some programming language in order to be sufficiently flexible. [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
In my opinion your example is too limited. Making the revit configuration I learned you'll need a near programming language to specify some file formats.&lt;br /&gt;
A simple descriptive language is too limiting. I would also go for 2 bytes with endianness instead of using terminology like WORD and small integer; it's much clearer. The configuration also needs to deal with aspects like cardinality, required and optional structures.&lt;br /&gt;
:: This is simply data structures carving, see ideas above. Somebody (I cannot track so many changes per day) separated the original text. There is no need to count and join different structures. [[User:.FUF|.FUF]] 19:53, 31 October 2008 (UTC)&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] This was probably me. Is the text back in its original form?&lt;br /&gt;
:::: I started it by moving your Revit07 comment to the validator/plugin section in [http://www.forensicswiki.org/index.php?title=Carver_2.0_Planning_Page&amp;amp;diff=prev&amp;amp;oldid=7583 this edit], since I was still at that point thinking operational configuration for that section, not parser configurations. [[User:RB|RB]]&lt;br /&gt;
:::: [[User:Joachim Metz|Joachim]] I renamed the title to format syntax, clarity is important ;-)&lt;br /&gt;
&lt;br /&gt;
Please take a look at the revit07 format syntax specification (configuration). It's not there yet but goes a long way. Some things currently missing:&lt;br /&gt;
* bitwise alignment&lt;br /&gt;
* handling encapsulated streams (MPEG/capture files)&lt;br /&gt;
* handling content based formats (MBOX)&lt;br /&gt;
&lt;br /&gt;
=Carving algorithm =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* should we allow for multiple carving phases (runs/stages)?&lt;br /&gt;
:: I opt yes (separation of concern)&lt;br /&gt;
* should we allow for multiple carving algorithms?&lt;br /&gt;
:: I opt yes, this allows testing of different approaches&lt;br /&gt;
* Should the algorithm try to do as much in 1 run over the input data? To reduce IO?&lt;br /&gt;
:: I opt that the tool should allow for both single and multiple runs over the input data, to minimize either the IO or the CPU as bottleneck&lt;br /&gt;
* Interaction between algorithm and validators&lt;br /&gt;
** does the algorithm pass data blocks to the validators?&lt;br /&gt;
** does a validator need to maintain a state?&lt;br /&gt;
** does a validator need to revert a state?&lt;br /&gt;
** How do we deal with embedded files and content validation? Do the validators call another validator?&lt;br /&gt;
* do we use the assumption that a data block can be used by a single file (with the exception of embedded/encapsulated files)?&lt;br /&gt;
* Revit07 allows for multiple concurrent result file states to deal with fragmentation. One has the attribute of being active (the preferred) and the other passive. Do we want/need something similar? The algorithm adds blocks of input data (offsets) to these result file states.&lt;br /&gt;
** if so, what info would these result file states require (type, list of input data blocks)?&lt;br /&gt;
* how do we deal with file system remainders?&lt;br /&gt;
** Can we abstract them and compare them against available file system information?&lt;br /&gt;
* Do we carve file systems in files?&lt;br /&gt;
:: I opt that at least the validator uses this information&lt;br /&gt;
&lt;br /&gt;
==Carving scenarios ==&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* normal file (file structure, loose text based structure (more a content structure?))&lt;br /&gt;
* fragmented file (the file exists entirely)&lt;br /&gt;
* a file fragment (the file does not entirely exist)&lt;br /&gt;
* intertwined file&lt;br /&gt;
* encapsulated file (MPEG/network capture)&lt;br /&gt;
* embedded file (JPEG thumbnail)&lt;br /&gt;
* obfuscation ('encrypted' PFF) this also entails encryption and/or compression&lt;br /&gt;
* file system in file&lt;br /&gt;
&lt;br /&gt;
=File System Awareness =&lt;br /&gt;
==Background: Why be File System Aware?==&lt;br /&gt;
Advantages of being FS aware:&lt;br /&gt;
* You can pick up sector allocation sizes&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] do you mean file system block sizes?&lt;br /&gt;
* Some file systems may store things off sector boundaries. (ReiserFS with tail packing)&lt;br /&gt;
* Increasingly file systems have compression (NTFS compression)&lt;br /&gt;
* Carve just the sectors that are not in allocated files.&lt;br /&gt;
&lt;br /&gt;
==Tasks that would be required==&lt;br /&gt;
&lt;br /&gt;
==Discussion==&lt;br /&gt;
:: As noted above, TSK should be utilized as much as possible, particularly the filesystem-aware portion.  If we want to identify filesystems outside of its supported set, it would be more worth our time to work on implementing them there than in the carver itself.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:::: I guess this tool operates like [[Selective file dumper]] and can recover files in both ways (or not?). Recovering files by using carving can recover files in situations where sleuthkit does nothing (e.g. file on NTFS was deleted using ntfs-3g, or filesystem was destroyed or just unknown). And we should build the list of filesystems supported by carver, not by TSK. [[User:.FUF|.FUF]] 07:08, 29 October 2008 (UTC)&lt;br /&gt;
&lt;br /&gt;
:: This tool is still in the early planning stages (requirements discovery), hence few operational details (like precise modes of operation) have been fleshed out - those will and should come later.  The justification for strictly using TSK for the filesystem-sensitive approach is simple: TSK has good filesystem APIs, and it would be foolish to create yet another standalone, incompatible implementation of filesystem(foo) when time would be better spent improving those in TSK, aiding other methods of analysis as well.  This is the same reason individuals that have implemented several other carvers are participating: de-duplication of effort.  [[User:RB|RB]]&lt;br /&gt;
&lt;br /&gt;
:: [[User:Joachim Metz|Joachim]] A design problem might be that TSK currently is a single library operating on multiple layers (storage media IO, volume/partition analysis and file system analysis). I'm not aware how easily the parts can be used separately. But I estimate that for the carver we want to use these 3 layers differently than TSK currently does.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] I would like to have the carver (recovery tool) also do recovery using file allocation data or remainders of file allocation data.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] &lt;br /&gt;
I would go as far as to ask you all to look beyond the carver as a tool and look from the perspective of the carver as part of the forensic investigation process. In my eyes certain information needed/acquired by the carver could also be very useful investigative information, i.e. what part of a hard disk contains empty sectors.&lt;br /&gt;
&lt;br /&gt;
=Supportive tooling=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* validator (definitions) tester (detest in revit07)&lt;br /&gt;
* tool to make configuration based definitions&lt;br /&gt;
* post carving validation&lt;br /&gt;
* the carver needs to provide support for fuse mount of carved files (carvfs)&lt;br /&gt;
&lt;br /&gt;
=Testing =&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
* automated testing&lt;br /&gt;
* test data&lt;br /&gt;
&lt;br /&gt;
=Validator Construction=&lt;br /&gt;
Options:&lt;br /&gt;
* Write validators in C/C++&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] you mean dedicated validators&lt;br /&gt;
* Have a scripting language for writing them (Python? Perl? our own?)&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] use easy to embed programming languages i.e. Python or Lua&lt;br /&gt;
* Use existing programs (libjpeg?) as plug-in validators?&lt;br /&gt;
** [[User:Joachim Metz|Joachim]] define a file structure api for this&lt;br /&gt;
&lt;br /&gt;
=Existing Code that we have=&lt;br /&gt;
[[User:Joachim Metz|Joachim]]&lt;br /&gt;
Please add any missing links&lt;br /&gt;
&lt;br /&gt;
Documentation/Articles&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* DFRWS2008 paper on carving&lt;br /&gt;
&lt;br /&gt;
Carvers&lt;br /&gt;
* DFRWS2006/2007 carving challenge results&lt;br /&gt;
* photorec (http://www.cgsecurity.org/wiki/PhotoRec)&lt;br /&gt;
* revit06 and revit07 (http://sourceforge.net/projects/revit/)&lt;br /&gt;
* s3/scarve&lt;br /&gt;
&lt;br /&gt;
Possible file structure validator libraries&lt;br /&gt;
* diverse existing file support libraries&lt;br /&gt;
* libole2 (inhouse experimental code of OLE2 support)&lt;br /&gt;
* libpff (alpha release for PFF (PST/OST) file support) (http://sourceforge.net/projects/libpff/)&lt;br /&gt;
&lt;br /&gt;
Input support&lt;br /&gt;
* AFF (http://www.afflib.org/)&lt;br /&gt;
* EWF (http://sourceforge.net/projects/libewf/)&lt;br /&gt;
* TSK device &amp;amp; raw &amp;amp; split raw (http://www.sleuthkit.org/)&lt;br /&gt;
&lt;br /&gt;
Volume/Partition support&lt;br /&gt;
* disktype (http://disktype.sourceforge.net/)&lt;br /&gt;
* testdisk (http://www.cgsecurity.org/wiki/TestDisk)&lt;br /&gt;
* TSK&lt;br /&gt;
&lt;br /&gt;
File system support&lt;br /&gt;
* TSK&lt;br /&gt;
* photorec FS code&lt;br /&gt;
* implementations of FS in Linux/BSD&lt;br /&gt;
&lt;br /&gt;
Content support&lt;br /&gt;
&lt;br /&gt;
Zero storage support&lt;br /&gt;
* libcarvpath ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210704 )&lt;br /&gt;
* carvfs ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=210954 )&lt;br /&gt;
* tsk-cp ( http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=267227 )&lt;br /&gt;
* carvfsmodewf (http://sourceforge.net/project/showfiles.php?group_id=170249&amp;amp;package_id=268256 )&lt;br /&gt;
&lt;br /&gt;
POLA&lt;br /&gt;
* joe-e (java) ( http://code.google.com/p/joe-e/ )&lt;br /&gt;
* Emily (ocaml)  ( http://erights.org/download/emily/ )&lt;br /&gt;
* the E language ( http://www.erights.org/ )&lt;br /&gt;
* AppArmor&lt;br /&gt;
* iptables/ipfw&lt;br /&gt;
* minorfs ( http://polacanthus.net/minorfs.html )&lt;br /&gt;
* plash ( http://plash.beasts.org/wiki/ )&lt;br /&gt;
&lt;br /&gt;
=Implementation Timeline=&lt;br /&gt;
# gather the available resources/ideas/wishes/needs etc. (I guess we're in this phase)&lt;br /&gt;
# start discussing a high level design (in terms of algorithm, facilities, information needed)&lt;br /&gt;
## input formats facility&lt;br /&gt;
## partition/volume facility&lt;br /&gt;
## file system facility&lt;br /&gt;
## file format facility&lt;br /&gt;
## content facility&lt;br /&gt;
## how to deal with fragment detection (do the validators allow for fragment detection?)&lt;br /&gt;
## how to deal with recombination of fragments&lt;br /&gt;
## do we want multiple carving phases in light of speed/precision tradeoffs&lt;br /&gt;
# start detailing parts of the design&lt;br /&gt;
## Discuss options for a grammar driven validator?&lt;br /&gt;
## Hard-coded plug-ins?&lt;br /&gt;
## Which existing code can we use?&lt;br /&gt;
# start building/assembling parts of the tooling for a prototype&lt;br /&gt;
## Implement simple file carving with validation.&lt;br /&gt;
## Implement gap carving&lt;br /&gt;
# Initial Release&lt;br /&gt;
# Implement the ''threaded carving'' that [[User:.FUF|.FUF]] is describing above.&lt;br /&gt;
&lt;br /&gt;
[[User:Joachim Metz|Joachim]] Shouldn't multi-threaded carving (MTC) be part of the 1st version?&lt;br /&gt;
The MT approach makes for different design decisions&lt;br /&gt;
: It is virtually impossible to turn a non-MT application into an MT application. [[User:Simsong|Simsong]] 06:37, 3 November 2008 (UTC)&lt;/div&gt;</summary>
		<author><name>Capibara</name></author>	</entry>

	</feed>