Difference between revisions of "Research Topics"

(SleuthKit Enhancements)
(significant revisions)
Line 3: Line 3:
 
Many of these would make a nice master's project.
  
==Small Programming Projects==
+
=Programming Projects=
 +
 
 +
==Small-Sized Programming Projects==
 
* Modify [[bulk_extractor]] so that it can directly acquire a raw device under Windows. This requires replacing the current ''open'' function call with a ''CreateFile'' function call and using windows file handles.
 +
* Rewrite SleuthKit '''sorter''' in C++ to make it faster and more flexible.
 +
 +
==Medium-Sized Programming Projects==
 
* Create a program that visualizes the contents of a file, sort of like hexedit, but with other features:
 
** Automatically pull out the strings
 
** Show histogram
 
** Detect crypto and/or stenography.
** (I would write the program in java with a plug-in architecture)
+
* Extend [[fiwalk]] to report the NTFS alternative data streams.
* Extend [[fiwalk]] to report the NTFS "inodes."
+
==Big Programming Projects==
+
* Write [[Carver 2.0 Planning Page | Carver 2.0]]
+
 
* Create a method to detect NTFS-compressed cluster blocks on a disk (RAW data stream). A method could be to write a generic signature to detect the beginning of NTFS-compressed file segments on a disk. This method is useful in carving and scanning for textual strings.
 +
* Write a FUSE-based mounter for SleuthKit, so that disk images can be forensically mounted using TSK.
 +
* Modify SleuthKit's API so that the physical location on disk of compressed files can be learned.
 +
 +
 +
==Big Programming Projects==
 +
* Develop a new carver with a plug-in architecture and support for fragment reassembly carving (see [[Carver 2.0 Planning Page]]).
 +
* Write a new timeline viewer that supports Logfile fusion (with offsets) and provides the ability to view the logfile in the frequency domain.
 +
 +
* Correlation Engine:
 +
** Logfile correlation
 +
** Document identity identification
 +
** Correlation between stored data and intercept data
 +
** Online Social Network Analysis
 +
 +
* Find and download in a forensically secure manner all of the information in a social network (e.g. Facebook, LinkedIn, etc.) associated with a targeted individual.
 +
** Determine who is searching for a targeted individual. This might be done with a honeypot, or documents with a tracking device in them, or some kind of covert Facebook App.
 +
** Automated grouping/annotation of low-level events, e.g. access-time, log-file entry, to higher-level events, e.g. program start, login
  
 +
=Reverse-Engineering Projects=
 
==Reverse-Engineering Projects==
* Continue work on the [[Extensible Storage Engine (ESE) Database File (EDB) format]] in regard to
+
* Reverse the on-disk structure of the [[Extensible Storage Engine (ESE) Database File (EDB) format]] to learn:
 
** Fill in the missing information about older ESE databases
 
** Exchange EDB (MAPI database), STM
 
** Active Directory (Active Directory working document available on request)
* Continue work on the [[Notes Storage Facility (NSF)]]
+
* Reverse the on-disk structure of the Lotus [[Notes Storage Facility (NSF)]]
* Microsoft SQL Server databases
+
* Reverse the on-disk structure of Microsoft SQL Server databases
 
+
* Add support to SleuthKit for [[XFAT]], Microsoft's new FAT file system.
* Physical layer access to flash storage.
+
* Add support to SleuthKit for [[Resilient File System (ReFS)|ReFS]].
** Gain access to the physical layer of SD or USB flash storage device. This will require reverse-engineering the proprietary APIs or gaining access to proprietary information from the manufacturers. Use these APIs to demonstrate the feasibility of recovering residual data that has been overwritten at the logical layer but which is still present at the physical layer.
+
* Physical layer access to flash storage (requires reverse-engineering proprietary APIs for flash USB and SSD storage.)
 
+
* Modify SleuthKit's NTFS implementation to support NTFS encrypted files (EFS)
==SleuthKit Enhancements==
+
* Extend SleuthKit's implementation of NTFS to cover Transaction NTFS (TxF) (see [[NTFS]])
[[SleuthKit]] is the popular open-source system for forensics and data recovery.
+
* Add support for a new file system:
+
** The [[YAFFS]] [[flash file system]]. (YAFFS2 is currently used on the Google G1 phone.) (ViaForensics is currently working on this)
+
** The [[JFFS2]] [[flash file system]]. (JFFS2 is currently used on the One Laptop Per Child laptop.)
+
** [[XFAT]], Microsoft's new FAT file system.
+
** [[EXT4]] (JHUAPL is currently working on this); Also see: http://www.williballenthin.com/ext4/
+
** [[Resilient File System (ReFS)|ReFS]]
+
* Enhance support for an existing file system:
+
** Report the physical location on disk of compressed files.
+
** Add support for NTFS encrypted files (EFS)
+
** Extend SleuthKit's implementation of NTFS to cover Transaction NTFS (TxF) (see [[NTFS]])
+
* Write a FUSE-based mounter for SleuthKit, so that disk images can be forensically mounted using TSK.
+
* Rewrite '''sorter''' in C++ to make it faster and more flexible.
+
  
 
==EnCase Enhancement==
 
* Develop an EnScript that allows you to script EnCase from Python. (You can do this because EnScripts can run arbitrary DLLs. The EnScript calls the DLL. Each "return" from the DLL is a specific EnCase command to execute. The EnScript then re-enters the DLL.)
  
==Timeline Analysis==
 
; Timeline Visualization and Analysis
 
: Write a new timeline viewer that supports Logfile fusion (with offsets) and provides the ability to view the logfile in the frequency domain.
 
  
==Research Areas==
+
=Research Areas=
 
These are research areas that could easily grow into a PhD thesis.
; Stream-based Forensics
+
* General-purpose detection of:
: Process the entire disk with one pass to minimize seek time.  (You may find it necessary to do a quick metadata scan first.)
+
** Stegnography
; Stegnography Detection (general purpose)
+
** Sanitization attempts
: Detect the use of stegnography by through the analysis of file examplars and specifications.
+
** Evidence Falsification (perhaps through inconsistency in file system allocations, application data allocation, and log file analysis.
; Sanitization Detection
+
* Visualization of data/information in digital forensic context
: Detect and diagnose sanitization attempts.
+
* SWOT of current visualization techniques in forensic tools; improvements; feasibility of 3D representation;
; Compressed Data Reconstruction
+
: Reconstruct decompressed data from a GZIP file after the first 1K has been removed.
+
;Evidence Falsification Detection
+
: Automatically detect falsified digital evidence through the use of inconsistency in file system allocations, application data allocation, and log file analysis.
+
; Visualization of data/information in digital forensic context
+
: SWOT of current visualization techniques in forensic tools; improvements; feasibility of 3D representation;
+
  
==Correlation==
 
* Logfile correlation
 
* Document identity identification
 
* Correlation between stored data and intercept data
 
* Online Social Network Analysis
 
** Find and download in a forensically secure manner all of the information in a social network (e.g. Facebook, LinkedIn, etc.) associated with a targeted individual.
 
** Determine who is searching for a targeted individual. This might be done with a honeypot, or documents with a tracking device in them, or some kind of covert Facebook App.
 
* Automated grouping/annotation of low-level events, e.g. access-time, log-file entry, to higher-level events, e.g. program start, login
 
  
  
 
__NOTOC__

Revision as of 04:26, 27 May 2012

Interested in doing research in computer forensics? Looking for a master's topic, or just some ideas for a research paper? Here is our list. Please feel free to add your own ideas.

Many of these would make a nice master's project.

Programming Projects

Small-Sized Programming Projects

  • Modify bulk_extractor so that it can directly acquire a raw device under Windows. This requires replacing the current open function call with a CreateFile function call and using Windows file handles.
  • Rewrite SleuthKit sorter in C++ to make it faster and more flexible.

Medium-Sized Programming Projects

  • Create a program that visualizes the contents of a file, sort of like hexedit, but with other features:
    • Automatically pull out the strings
    • Show histogram
    • Detect crypto and/or steganography.
  • Extend fiwalk to report NTFS alternate data streams.
  • Create a method to detect NTFS-compressed cluster blocks on a disk (RAW data stream). A method could be to write a generic signature to detect the beginning of NTFS-compressed file segments on a disk. This method is useful in carving and scanning for textual strings.
  • Write a FUSE-based mounter for SleuthKit, so that disk images can be forensically mounted using TSK.
  • Modify SleuthKit's API so that the physical location on disk of compressed files can be learned.
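
The file-visualizer item above lists three features that are simple to prototype. The following is only a minimal sketch, not a proposed design: the function names are made up for illustration, "strings" is assumed to mean runs of printable ASCII, and Shannon entropy is used as a rough stand-in for crypto detection. High entropy is also typical of compressed data, so it is a hint rather than proof, and steganography detection is not attempted here.

```python
import math
import re
from collections import Counter


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def byte_histogram(data: bytes) -> list[int]:
    """Return a 256-entry list: how often each byte value occurs."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


def shannon_entropy(data: bytes) -> float:
    """Return the entropy in bits per byte (0.0 for empty input, at most 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


if __name__ == "__main__":
    sample = b"\x00\x01hello world\x00ab\x00test"
    print(extract_strings(sample))
    print(byte_histogram(sample)[0])
    print(round(shannon_entropy(sample), 3))
```

A real tool would compute the entropy over a sliding window, so that an encrypted or compressed region inside an otherwise ordinary file stands out in the display.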


Big Programming Projects

  • Develop a new carver with a plug-in architecture and support for fragment reassembly carving (see Carver 2.0 Planning Page).
  • Write a new timeline viewer that supports Logfile fusion (with offsets) and provides the ability to view the logfile in the frequency domain.
  • Correlation Engine:
    • Logfile correlation
    • Document identity identification
    • Correlation between stored data and intercept data
    • Online Social Network Analysis
      • Find and download in a forensically secure manner all of the information in a social network (e.g. Facebook, LinkedIn) associated with a targeted individual.
      • Determine who is searching for a targeted individual. This might be done with a honeypot, documents with a tracking device in them, or some kind of covert Facebook App.
    • Automated grouping/annotation of low-level events (e.g. access time, log-file entry) into higher-level events (e.g. program start, login)
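
For the timeline viewer above, "logfile fusion (with offsets)" can be read as merging several logs into one timeline after correcting each source's clock. The sketch below works under that reading. The `Event` type, the `fuse` function and the sample sources are hypothetical, and each log is assumed to be already parsed into time-ordered `(timestamp, message)` pairs.

```python
import heapq
from datetime import datetime, timedelta
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    when: datetime   # timestamp after the clock offset has been applied
    source: str      # which log the entry came from
    message: str


def shifted(source: str, entries: Iterable, offset: timedelta) -> Iterator[Event]:
    """Apply one source's clock offset to each of its (timestamp, message) pairs."""
    for when, message in entries:
        yield Event(when + offset, source, message)


def fuse(logs: dict) -> Iterator[Event]:
    """Merge logs given as {source: (entries, offset)} into one ordered timeline.

    heapq.merge needs every input stream to be sorted already, which holds
    as long as each log is in chronological order.
    """
    streams = [
        shifted(name, entries, offset)
        for name, (entries, offset) in logs.items()
    ]
    return heapq.merge(*streams)


if __name__ == "__main__":
    t0 = datetime(2012, 5, 27, 4, 0, 0)
    logs = {
        "web": ([(t0, "GET /"), (t0 + timedelta(minutes=10), "POST /login")],
                timedelta(0)),
        # Assumed for the example: this clock runs five minutes fast.
        "firewall": ([(t0 + timedelta(minutes=7), "ALLOW 10.0.0.5")],
                     timedelta(minutes=-5)),
    }
    for event in fuse(logs):
        print(event.when.isoformat(), event.source, event.message)
```

Keeping the source name on every event lets a viewer color or filter entries by origin. A frequency-domain view could then start from per-interval event counts taken from the fused timeline.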

Reverse-Engineering Projects

  • Reverse the on-disk structure of the Extensible Storage Engine (ESE) Database File (EDB) format, in particular:
    • The missing information about older ESE databases
    • Exchange EDB (MAPI database) and STM files
    • Active Directory (Active Directory working document available on request)
  • Reverse the on-disk structure of the Lotus Notes Storage Facility (NSF)
  • Reverse the on-disk structure of Microsoft SQL Server databases
  • Add support to SleuthKit for XFAT, Microsoft's new FAT file system.
  • Add support to SleuthKit for ReFS.
  • Physical layer access to flash storage (requires reverse-engineering proprietary APIs for USB flash and SSD storage).
  • Modify SleuthKit's NTFS implementation to support NTFS encrypted files (EFS)
  • Extend SleuthKit's implementation of NTFS to cover Transaction NTFS (TxF) (see NTFS)

EnCase Enhancement

  • Develop an EnScript that allows you to script EnCase from Python. (You can do this because EnScripts can run arbitrary DLLs. The EnScript calls the DLL. Each "return" from the DLL is a specific EnCase command to execute. The EnScript then re-enters the DLL.)
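
The call-and-re-enter protocol described above behaves much like a coroutine: the Python side hands back one command, waits for its result, then continues. The sketch below models only that control flow with a plain generator. Nothing in it is an EnCase or EnScript API; the command names, `examiner_script` and `run_host` are invented for illustration, and `run_host` stands in for the EnScript loop that would call the DLL.

```python
from typing import Callable, Generator

Command = tuple[str, str]  # (hypothetical command name, argument)


def examiner_script() -> Generator[Command, str, None]:
    """Python-side logic: each yield is one command for the host to execute."""
    listing = yield ("list_files", "C:/Users")
    for name in listing.split(","):
        if name.endswith(".pst"):
            yield ("bookmark", name)


def run_host(script: Generator[Command, str, None],
             execute: Callable[[Command], str]) -> list[Command]:
    """Stand-in for the EnScript loop: run a command, then re-enter the script."""
    issued = []
    try:
        command = next(script)              # first entry into the script
        while True:
            issued.append(command)
            result = execute(command)       # the host performs the command
            command = script.send(result)   # re-enter with the result
    except StopIteration:
        pass                                # the script has no more commands
    return issued


if __name__ == "__main__":
    def fake_execute(command: Command) -> str:
        # Canned answers in place of a real forensic tool.
        return "notes.txt,mail.pst" if command[0] == "list_files" else "ok"

    print(run_host(examiner_script(), fake_execute))
```

In a real bridge the DLL would have to keep the suspended Python state alive between calls, which is the part this sketch leaves out.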


Research Areas

These are research areas that could easily grow into a PhD thesis.

  • General-purpose detection of:
    • Steganography
    • Sanitization attempts
    • Evidence falsification (perhaps through inconsistencies in file system allocations, application data allocation, and log file analysis)
  • Visualization of data/information in digital forensic context
  • SWOT (strengths, weaknesses, opportunities, threats) analysis of current visualization techniques in forensic tools; possible improvements; feasibility of 3D representation
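
As one possible starting point for the sanitization-detection topic above, a first-pass heuristic is to look for long runs of sectors that each contain a single repeated byte value. This is only a sketch under stated assumptions: a 512-byte sector size, made-up function names, and an arbitrary run-length threshold. Space that was never written on a disk also reads as zeros, so a hit is a lead to investigate rather than evidence of wiping, and random-data overwrites are not covered.

```python
SECTOR = 512  # assumed sector size in bytes


def classify_sector(sector: bytes) -> str:
    """Label a sector as zero-fill, pattern-fill (one repeated byte) or data."""
    if len(set(sector)) == 1:
        return "zero-fill" if sector[0] == 0 else "pattern-fill"
    return "data"


def wipe_runs(image: bytes, min_sectors: int = 8) -> list[tuple[int, int, str]]:
    """Return (start_sector, length, kind) for runs of wiped-looking sectors.

    Only whole sectors are examined; a trailing partial sector is ignored.
    """
    runs = []
    start, kind = 0, None
    total = len(image) // SECTOR
    for i in range(total + 1):
        # The extra iteration (i == total) closes a run that reaches the end.
        current = (
            classify_sector(image[i * SECTOR:(i + 1) * SECTOR])
            if i < total else None
        )
        if current != kind:
            if kind in ("zero-fill", "pattern-fill") and i - start >= min_sectors:
                runs.append((start, i - start, kind))
            start, kind = i, current
    return runs


if __name__ == "__main__":
    data = b"\x01\x02" * (SECTOR // 2)
    image = data * 2 + bytes(SECTOR * 10) + b"\xff" * (SECTOR * 3) + data
    print(wipe_runs(image, min_sectors=3))
```

Comparing such runs against the file system's allocation map (wiped-looking sectors inside allocated files, or sharply bounded regions in unallocated space) is where the diagnosis part of the research question would begin.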