Difference between pages "Encase image file format" and "Research Topics"

From ForensicsWiki
 
 
Line 1: Line 1:
The EnCase image file format is used by [[EnCase]] to store various types of digital evidence, e.g.:
+
Interested in doing research in computer forensics? Looking for a master's topic, or just some ideas for a research paper? Here is our list. Please feel free to add your own ideas.
* disk image (physical bitstream of an acquired disk)
+
* volume image
+
* memory
+
* logical files
+
  
 +
Many of these would make a nice master's project.
  
The format is (reportedly) based on ASR Data's Expert Witness Compression Format [http://www.asrdata.com/SMART/whitepaper.html].
+
=Programming Projects=
Currently there are 2 versions of the format. Version 1 is a closed format; it was succeeded in EnCase 7 by version 2, for which a format specification is available but requires registration.
+
  
 +
==Small-Sized Programming Projects==
 +
* Modify [[bulk_extractor]] so that it can directly acquire a raw device under Windows. This requires replacing the current ''open'' function call with a ''CreateFile'' function call and using Windows file handles.
 +
* Rewrite SleuthKit '''sorter''' in C++ to make it faster and more flexible.
  
The media data can be stored in multiple evidence files, which are called segment files.
+
==Medium-Sized Programming Projects==
Each segment file consists of multiple sections, each of which has a distinct section start definition containing a section type.
+
* Create a program that visualizes the contents of a file, sort of like hexedit, but with other features:
Up to EnCase 5, segment files were limited to 2 GiB because file offsets were represented internally as 31-bit values. EnCase 6 lifted this limitation by adding a base offset value.
+
** Automatically pull out the strings
 +
** Show histogram
 +
** Detect crypto and/or steganography.
 +
* Extend [[fiwalk]] to report NTFS alternate data streams.
 +
* Create a method to detect NTFS-compressed cluster blocks on a disk (RAW data stream). A method could be to write a generic signature to detect the beginning of NTFS-compressed file segments on a disk. This method is useful in carving and scanning for textual strings.
 +
* Write a FUSE-based mounter for SleuthKit, so that disk images can be forensically mounted using TSK.
 +
* Modify SleuthKit's API so that the physical location on disk of compressed files can be learned.
  
  
EnCase allows the data to be stored compressed, using either the fast or the best level of the deflate compression method.
+
==Big Programming Projects==
EnCase 7 no longer distinguishes between fast and best compression; it offers only uncompressed or compressed storage.
+
* Develop a new carver with a plug-in architecture and support for fragment reassembly carving (see [[Carver 2.0 Planning Page]]).
 +
* Write a new timeline viewer that supports Logfile fusion (with offsets) and provides the ability to view the logfile in the frequency domain.
  
 +
* Correlation Engine:
 +
** Logfile correlation
 +
** Document identity identification
 +
** Correlation between stored data and intercept data
 +
** Online Social Network Analysis
  
Besides digital evidence, the evidence files (segment files) contain a header with case information.
+
* Find and download in a forensically secure manner all of the information in a social network (e.g. Facebook, LinkedIn, etc.) associated with a targeted individual.
The case information includes the date and time of acquisition, the examiner's name, notes on the acquisition, and an optional password.
+
** Determine who is searching for a targeted individual. This might be done with a honeypot, or documents with a tracking device in them, or some kind of covert Facebook App.
* In EnCase 3 the case information header is stored in the "header" section, which is stored twice within the file; both copies contain the same information.
+
** Automated grouping/annotation of low-level events, e.g. access-time, log-file entry, to higher-level events, e.g. program start, login
* As of EnCase 4 an additional "header2" section was added. The "header" section now appears only once, but the new "header2" section appears twice.
+
  
 +
=Reverse-Engineering Projects=
 +
==Reverse-Engineering Projects==
 +
=== Application analysis ===
 +
* Reverse the on-disk structure of the [[Extensible Storage Engine (ESE) Database File (EDB) format]] to learn:
 +
** Fill in the missing information about older ESE databases
 +
** Exchange EDB (MAPI database), STM
 +
** Active Directory (Active Directory working document available on request)
 +
* Reverse the on-disk structure of the Lotus [[Notes Storage Facility (NSF)]]
 +
* Reverse the on-disk structure of Microsoft SQL Server databases
  
The format adds error detection by storing the data with checksums (Adler32), for both the metadata and the data blocks, which are by default 64 x 512 byte sectors (32 KiB).
+
=== Volume/File System analysis ===
As of EnCase 5 the number of sectors per block (chunk) can vary.
+
* Analysis of inter snapshot changes in [[Windows Shadow Volumes]]
EnCase 3F introduced an "error2" section that records the location and number of bad sector chunks. Areas that could not be read are filled with zeros.
+
* Add support to SleuthKit for [[FAT|exFAT]], Microsoft's new FAT file system.
EnCase then shows the user which areas could not be read when the image was acquired. The granularity of unreadable chunks appears to be 32 KiB.
+
* Add support to SleuthKit for [[Resilient File System (ReFS)|ReFS]].
As of EnCase 5 the granularity of unreadable chunks can vary.
+
* Modify SleuthKit's NTFS implementation to support NTFS encrypted files (EFS)
 +
* Extend SleuthKit's implementation of NTFS to cover Transaction NTFS (TxF) (see [[NTFS]])
 +
* Physical layer access to flash storage (requires reverse-engineering proprietary APIs for flash USB and SSD storage.)
  
 +
==EnCase Enhancement==
 +
* Develop an EnScript that allows you to script EnCase from Python. (You can do this because EnScripts can run arbitrary DLLs. The EnScript calls the DLL. Each "return" from the DLL is a specific EnCase command to execute. The EnScript then re-enters the DLL.)
  
EnCase 3 can store a one-way hash of the data. For a bitstream image it calculates, e.g., an MD5 hash of the original media data and adds a hash section to the last segment file.
+
= Timeline analysis =
As of EnCase 6 the option to store a SHA1 hash was added.
+
* Mapping differences and similarities in multiple versions of a system, e.g. (but not limited to) those created by [[Windows Shadow Volumes]]
  
 +
=Research Areas=
 +
These are research areas that could easily grow into a PhD thesis.
 +
* General-purpose detection of:
 +
** Steganography
 +
** Sanitization attempts
 +
** Evidence Falsification (perhaps through inconsistency in file system allocations, application data allocation, and log file analysis).
 +
* Visualization of data/information in digital forensic context
 +
* SWOT of current visualization techniques in forensic tools; improvements; feasibility of 3D representation;
  
EnCase 5 and later have the option to store '''single files''' into the EnCase Logical Evidence File (LEF) or EWF-L01.
+
__NOTOC__
This format changed slightly in EnCase 6 and 7.
+
  
 
+
[[Category:Research]]
In EnCase 7 the EWF format was succeeded by the EnCase Evidence File Format Version 2 (EWF2-EX01 and EWF2-LX01).
+
EWF2-EX01 is, at its lower levels, a different format than EWF-E01 and provides support for:
+
* bzip2 compression
+
* direct encryption (AES-256) of the section data
+
 
+
The same features are added to the new logical evidence file format (EWF2-LX01) with the exception of encryption.
+
EWF2-EX01 and EWF2-LX01 are not backwards compatible with previous EnCase products.
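
The chunking, checksum, compression, and hashing ideas described on this page can be illustrated with a minimal Python sketch. It splits media data into 64 x 512 byte chunks, keeps an Adler-32 checksum per chunk, optionally compresses each chunk with deflate or bzip2, and computes MD5 and SHA1 hashes over the original data. The function names (pack_chunks, unpack_chunk, verify_chunk), the record layout, and the choice to checksum the uncompressed chunk are assumptions made for illustration; this is not the actual EWF or EWF2 on-disk layout.

```python
import bz2
import hashlib
import zlib

SECTOR_SIZE = 512
SECTORS_PER_CHUNK = 64                       # default block size noted on this page
CHUNK_SIZE = SECTOR_SIZE * SECTORS_PER_CHUNK  # 32 KiB


def _compress(chunk, method, level):
    """Compress one chunk; 'level' only applies to deflate (1 = fast, 9 = best)."""
    if method == "deflate":
        return zlib.compress(chunk, level)
    if method == "bzip2":
        return bz2.compress(chunk)
    if method == "none":
        return chunk
    raise ValueError("unknown method: %r" % method)


def unpack_chunk(payload, method):
    """Reverse of _compress."""
    if method == "deflate":
        return zlib.decompress(payload)
    if method == "bzip2":
        return bz2.decompress(payload)
    if method == "none":
        return payload
    raise ValueError("unknown method: %r" % method)


def pack_chunks(media, method="deflate", level=1):
    """Return (records, md5_hex, sha1_hex) for the given media bytes.

    Each record is (payload, adler32 of the uncompressed chunk).
    Hypothetical layout for illustration only.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    records = []
    for start in range(0, len(media), CHUNK_SIZE):
        chunk = media[start:start + CHUNK_SIZE]
        md5.update(chunk)    # hashes cover the original media data
        sha1.update(chunk)
        records.append((_compress(chunk, method, level), zlib.adler32(chunk)))
    return records, md5.hexdigest(), sha1.hexdigest()


def verify_chunk(payload, checksum, method="deflate"):
    """Recompute the Adler-32 of the restored chunk and compare it."""
    return zlib.adler32(unpack_chunk(payload, method)) == checksum
```

A reader would call verify_chunk on every record and compare the recomputed MD5 or SHA1 with the stored value to detect corruption of either a single chunk or the image as a whole.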
+
 
+
== See Also ==
+
 
+
[[EnCase]]
+
 
+
== External Links ==
+
 
+
* [http://web.archive.org/web/20100905235827/http://www.asrdata.com/SMART/whitepaper.html Internet archive copy of: ASR Data's Expert Witness Compression Format]
+
* [http://www.guidancesoftware.com/DocumentRegistration.aspx?did=1000018246 EnCase Evidence File Format Version 2], requires registration
+
* [http://code.google.com/p/libewf/downloads/detail?name=Expert%20Witness%20Compression%20Format%20%28EWF%29.pdf Expert Witness Compression Format (EWF)].
+
* [http://code.google.com/p/libewf/downloads/detail?name=Expert%20Witness%20Compression%20Format%202%20%28EWF2%29.pdf Expert Witness Compression Format (EWF) version 2].
+
* [http://www.cfreds.nist.gov/v2/Basic_Mac_Image.html Sample image in EnCase, iLook, and dd format] - From the [[Computer Forensic Reference Data Sets]] Project
+
 
+
[[Category:Forensics File Formats]]
+

Revision as of 00:41, 6 September 2012

Interested in doing research in computer forensics? Looking for a master's topic, or just some ideas for a research paper? Here is our list. Please feel free to add your own ideas.

Many of these would make a nice master's project.

Programming Projects

Small-Sized Programming Projects

  • Modify bulk_extractor so that it can directly acquire a raw device under Windows. This requires replacing the current open function call with a CreateFile function call and using Windows file handles.
  • Rewrite SleuthKit sorter in C++ to make it faster and more flexible.

Medium-Sized Programming Projects

  • Create a program that visualizes the contents of a file, sort of like hexedit, but with other features:
    • Automatically pull out the strings
    • Show histogram
    • Detect crypto and/or steganography.
  • Extend fiwalk to report NTFS alternate data streams.
  • Create a method to detect NTFS-compressed cluster blocks on a disk (RAW data stream). A method could be to write a generic signature to detect the beginning of NTFS-compressed file segments on a disk. This method is useful in carving and scanning for textual strings.
  • Write a FUSE-based mounter for SleuthKit, so that disk images can be forensically mounted using TSK.
  • Modify SleuthKit's API so that the physical location on disk of compressed files can be learned.
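
The file-visualization project above asks for string extraction, a histogram, and detection of encrypted content. A minimal Python sketch of the underlying calculations follows. The function names and the idea of using Shannon entropy as a rough indicator of encrypted or compressed data are assumptions for illustration; high entropy alone cannot distinguish encryption from compression, and this sketch does not attempt steganography detection.

```python
import math
import re
from collections import Counter


def byte_histogram(data):
    """Return a list of 256 counts, one per possible byte value."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


def shannon_entropy(data):
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())


def extract_strings(data, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]
```

A viewer could compute these values per block (for example per 4 KiB window) and color the hex display by entropy, so that high-entropy regions stand out from text and structured data.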


Big Programming Projects

  • Develop a new carver with a plug-in architecture and support for fragment reassembly carving (see Carver 2.0 Planning Page).
  • Write a new timeline viewer that supports Logfile fusion (with offsets) and provides the ability to view the logfile in the frequency domain.
  • Correlation Engine:
    • Logfile correlation
    • Document identity identification
    • Correlation between stored data and intercept data
    • Online Social Network Analysis
  • Find and download in a forensically secure manner all of the information in a social network (e.g. Facebook, LinkedIn, etc.) associated with a targeted individual.
    • Determine who is searching for a targeted individual. This might be done with a honeypot, or documents with a tracking device in them, or some kind of covert Facebook App.
    • Automated grouping/annotation of low-level events, e.g. access-time, log-file entry, to higher-level events, e.g. program start, login
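
As one possible reading of "Logfile fusion (with offsets)" in the timeline viewer project above, the following Python sketch merges several already-sorted logs into one timeline after applying a per-source clock correction. Interpreting "offsets" as clock skew between sources is an assumption, and fuse_logs and its input layout are hypothetical names chosen for illustration.

```python
import heapq
from datetime import datetime, timedelta


def fuse_logs(sources):
    """Merge logs into one time-ordered list of (timestamp, source, message).

    sources maps a source name to (offset, entries), where offset is a
    timedelta added to every timestamp of that source and entries is a
    list of (datetime, message) tuples already sorted by time.
    """
    def corrected(name, offset, entries):
        for timestamp, message in entries:
            yield (timestamp + offset, name, message)

    streams = [corrected(name, offset, entries)
               for name, (offset, entries) in sources.items()]
    # heapq.merge performs a lazy k-way merge of sorted inputs.
    return list(heapq.merge(*streams))
```

Because each source is only shifted by a constant, its entries stay sorted, so a k-way merge is sufficient and no global re-sort is needed.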

Reverse-Engineering Projects

Reverse-Engineering Projects

Application analysis

  • Reverse the on-disk structure of the Extensible Storage Engine (ESE) Database File (EDB) format to learn:
    • Fill in the missing information about older ESE databases
    • Exchange EDB (MAPI database), STM
    • Active Directory (Active Directory working document available on request)
  • Reverse the on-disk structure of the Lotus Notes Storage Facility (NSF)
  • Reverse the on-disk structure of Microsoft SQL Server databases

Volume/File System analysis

  • Analysis of inter snapshot changes in Windows Shadow Volumes
  • Add support to SleuthKit for exFAT, Microsoft's new FAT file system.
  • Add support to SleuthKit for ReFS.
  • Modify SleuthKit's NTFS implementation to support NTFS encrypted files (EFS)
  • Extend SleuthKit's implementation of NTFS to cover Transaction NTFS (TxF) (see NTFS)
  • Physical layer access to flash storage (requires reverse-engineering proprietary APIs for flash USB and SSD storage.)

EnCase Enhancement

  • Develop an EnScript that allows you to script EnCase from Python. (You can do this because EnScripts can run arbitrary DLLs. The EnScript calls the DLL. Each "return" from the DLL is a specific EnCase command to execute. The EnScript then re-enters the DLL.)

Timeline analysis

  • Mapping differences and similarities in multiple versions of a system, e.g. (but not limited to) those created by Windows Shadow Volumes
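
A very small Python sketch of mapping differences between two versions of a system is shown below. It assumes each version has already been reduced to a mapping from file path to content; how those mappings are obtained (for example from shadow volume snapshots) is outside the sketch, and snapshot_digests and diff_snapshots are hypothetical names.

```python
import hashlib


def snapshot_digests(files):
    """Map each path to the SHA-1 hex digest of its content (bytes)."""
    return {path: hashlib.sha1(content).hexdigest()
            for path, content in files.items()}


def diff_snapshots(old, new):
    """Compare two path -> digest mappings and classify every path."""
    old_paths, new_paths = set(old), set(new)
    common = old_paths & new_paths
    return {
        "added": sorted(new_paths - old_paths),
        "removed": sorted(old_paths - new_paths),
        "changed": sorted(p for p in common if old[p] != new[p]),
        "unchanged": sorted(p for p in common if old[p] == new[p]),
    }
```

Applying diff_snapshots to each consecutive pair of versions yields a per-file change history that could feed a timeline.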

Research Areas

These are research areas that could easily grow into a PhD thesis.

  • General-purpose detection of:
    • Steganography
    • Sanitization attempts
    • Evidence Falsification (perhaps through inconsistency in file system allocations, application data allocation, and log file analysis).
  • Visualization of data/information in digital forensic context
  • SWOT of current visualization techniques in forensic tools; improvements; feasibility of 3D representation;