Difference between revisions of "Research Topics"

From Forensics Wiki
Jump to: navigation, search
m (Flash Memory)
(Reverse-Engineering Projects)
(64 intermediate revisions by 4 users not shown)
Line 1: Line 1:
Interested in doing research in computer forensics? Looking for a master's topic, or just some ideas for a research paper? Here is our list. Please feel free to add your own ideas. ''Potential Sponsor,'' when present, indicates the name of a researcher who would be interested in lending support in the form of supervision or other resources to a project.
+
Interested in doing research in computer forensics? Looking for a master's topic, or just some ideas for a research paper? Here is our list. Please feel free to add your own ideas.
  
=Tool Development=
+
Many of these would make a nice master's project.
==AFF Enhancement==
+
[[AFF]] is the Advanced Forensics Format, developed by Simson Garfinkel and Basis Technology. The following enhancements would be very useful to the format:
+
* Signing with X.509 or GPG keys data segments and metadata.
+
* Encryption of data segments with an AES-256 key specified by a password
+
* Encryption of the AES-256 key with a public key (and decryption with a corresponding private key)
+
* Evaluation of the AFF data page size. What is the optimal page size for compressed forensic work?
+
* Replacement of the AFF "BADFLAG" approach for indicating bad data with a bitmap.
+
  
''Sponsor for these projects: [[User:Simsong|Simson Garfinkel]]''
+
=Programming Projects=
  
==Flash Memory==
+
==Small-Sized Programming Projects==
Flash memory devices such as USB keys implement a [http://www.st.com/stonline/products/literature/an/10122.htm wear leveling algorithm] in hardware so that frequently rewritten blocks are actually written to many different physical blocks. Are there any devices that let you access the raw flash cells underneath the wear leveling chip? Can you get statistics out of the device? Can you access pages that have been mapped out (and still have valid data) but haven't been mapped back yet? Can you use this as a technique for accessing deleted information?
+
* Modify [[bulk_extractor]] so that it can directly acquire a raw device under Windows. This requires replacing the current ''open'' function call with a ''CreateFile'' function call and using windows file handles.
 +
* Rewrite SleuthKit '''sorter''' in C++ to make it faster and more flexible.
  
''Sponsor: [[User:Simsong|Simson Garfinkel]]''
+
==Medium-Sized Programming Projects==
 +
* Create a program that visualizes the contents of a file, sort of like hexedit, but with other features:
 +
** Automatically pull out the strings
 +
** Show histogram
 +
** Detect crypto and/or stenography.
 +
* Extend [[fiwalk]] to report the NTFS alternative data streams.
 +
* Create a method to detect NTFS-compressed cluster blocks on a disk (RAW data stream). A method could be to write a generic signature to detect the beginning of NTFS-compressed file segments on a disk. This method is useful in carving and scanning for textual strings.
 +
* Write a FUSE-based mounter for SleuthKit, so that disk images can be forensically mounted using TSK.
 +
* Modify SleuthKit's API so that the physical location on disk of compressed files can be learned.
 +
 
 +
 
 +
==Big Programming Projects==
 +
* Develop a new carver with a plug-in architecture and support for fragment reassembly carving (see [[Carver 2.0 Planning Page]]).
 +
* Write a new timeline viewer that supports Logfile fusion (with offsets) and provides the ability to view the logfile in the frequency domain.
 +
 
 +
* Correlation Engine:
 +
** Logfile correlation
 +
** Document identity identification
 +
** Correlation between stored data and intercept data
 +
** Online Social Network Analysis
 +
 
 +
* Find and download in a forensically secure manner all of the information in a social network (e.g. Facebook, LinkedIn, etc.) associated with a targeted individual.
 +
** Determine who is searching for a targeted individual. This might be done with a honeypot, or documents with a tracking device in them, or some kind of covert Facebook App.
 +
** Automated grouping/annotation of low-level events, e.g. access-time, log-file entry, to higher-level events, e.g. program start, login
 +
 
 +
=Reverse-Engineering Projects=
 +
==Reverse-Engineering Projects==
 +
* Reverse the on-disk structure of the [[Extensible Storage Engine (ESE) Database File (EDB) format]] to learn:
 +
** Fill in the missing information about older ESE databases
 +
** Exchange EDB (MAPI database), STM
 +
** Active Directory (Active Directory working document available on request)
 +
* Reverse the on-disk structure of the Lotus [[Notes Storage Facility (NSF)]]
 +
* Reverse the on-disk structure of Microsoft SQL Server databases
 +
* Add support to SleuthKit for [[FAT|eXFAT]], Microsoft's new FAT file system.
 +
* Add support to SleuthKit for [[Resilient File System (ReFS)|ReFS]].
 +
* Physical layer access to flash storage (requires reverse-engineering proprietary APIs for flash USB and SSD storage.)
 +
* Modify SleuthKit's NTFS implementation to support NTFS encrypted files (EFS)
 +
* Extend SleuthKit's implementation of NTFS to cover Transaction NTFS (TxF) (see [[NTFS]])
 +
 
 +
==EnCase Enhancement==
 +
* Develop an EnScript that allows you to script EnCase from Python. (You can do this because EnScripts can run arbitrary DLLs. The EnScript calls the DLL. Each "return" from the DLL is a specific EnCase command to execute. The EnScript then re-enters the DLL.)
 +
 
 +
= Timeline analysis =
 +
* Mapping differences and similarities in multiple versions of a system, e.g. those created by [[Windows Shadow Volumes]]
 +
 
 +
=Research Areas=
 +
These are research areas that could easily grow into a PhD thesis.
 +
* General-purpose detection of:
 +
** Stegnography
 +
** Sanitization attempts
 +
** Evidence Falsification (perhaps through inconsistency in file system allocations, application data allocation, and log file analysis.
 +
* Visualization of data/information in digital forensic context
 +
* SWOT of current visualization techniques in forensic tools; improvements; feasibility of 3D representation;
 +
 
 +
 
 +
 
 +
__NOTOC__

Revision as of 09:27, 23 August 2012

Interested in doing research in computer forensics? Looking for a master's topic, or just some ideas for a research paper? Here is our list. Please feel free to add your own ideas.

Many of these would make a nice master's project.

Programming Projects

Small-Sized Programming Projects

  • Modify bulk_extractor so that it can directly acquire a raw device under Windows. This requires replacing the current open function call with a CreateFile function call and using windows file handles.
  • Rewrite SleuthKit sorter in C++ to make it faster and more flexible.

Medium-Sized Programming Projects

  • Create a program that visualizes the contents of a file, sort of like hexedit, but with other features:
    • Automatically pull out the strings
    • Show histogram
    • Detect crypto and/or stenography.
  • Extend fiwalk to report the NTFS alternative data streams.
  • Create a method to detect NTFS-compressed cluster blocks on a disk (RAW data stream). A method could be to write a generic signature to detect the beginning of NTFS-compressed file segments on a disk. This method is useful in carving and scanning for textual strings.
  • Write a FUSE-based mounter for SleuthKit, so that disk images can be forensically mounted using TSK.
  • Modify SleuthKit's API so that the physical location on disk of compressed files can be learned.


Big Programming Projects

  • Develop a new carver with a plug-in architecture and support for fragment reassembly carving (see Carver 2.0 Planning Page).
  • Write a new timeline viewer that supports Logfile fusion (with offsets) and provides the ability to view the logfile in the frequency domain.
  • Correlation Engine:
    • Logfile correlation
    • Document identity identification
    • Correlation between stored data and intercept data
    • Online Social Network Analysis
  • Find and download in a forensically secure manner all of the information in a social network (e.g. Facebook, LinkedIn, etc.) associated with a targeted individual.
    • Determine who is searching for a targeted individual. This might be done with a honeypot, or documents with a tracking device in them, or some kind of covert Facebook App.
    • Automated grouping/annotation of low-level events, e.g. access-time, log-file entry, to higher-level events, e.g. program start, login

Reverse-Engineering Projects

Reverse-Engineering Projects

  • Reverse the on-disk structure of the Extensible Storage Engine (ESE) Database File (EDB) format to learn:
    • Fill in the missing information about older ESE databases
    • Exchange EDB (MAPI database), STM
    • Active Directory (Active Directory working document available on request)
  • Reverse the on-disk structure of the Lotus Notes Storage Facility (NSF)
  • Reverse the on-disk structure of Microsoft SQL Server databases
  • Add support to SleuthKit for eXFAT, Microsoft's new FAT file system.
  • Add support to SleuthKit for ReFS.
  • Physical layer access to flash storage (requires reverse-engineering proprietary APIs for flash USB and SSD storage.)
  • Modify SleuthKit's NTFS implementation to support NTFS encrypted files (EFS)
  • Extend SleuthKit's implementation of NTFS to cover Transaction NTFS (TxF) (see NTFS)

EnCase Enhancement

  • Develop an EnScript that allows you to script EnCase from Python. (You can do this because EnScripts can run arbitrary DLLs. The EnScript calls the DLL. Each "return" from the DLL is a specific EnCase command to execute. The EnScript then re-enters the DLL.)

Timeline analysis

  • Mapping differences and similarities in multiple versions of a system, e.g. those created by Windows Shadow Volumes

Research Areas

These are research areas that could easily grow into a PhD thesis.

  • General-purpose detection of:
    • Stegnography
    • Sanitization attempts
    • Evidence Falsification (perhaps through inconsistency in file system allocations, application data allocation, and log file analysis.
  • Visualization of data/information in digital forensic context
  • SWOT of current visualization techniques in forensic tools; improvements; feasibility of 3D representation;