Difference between revisions of "Research Topics"

From Forensics Wiki
Line 5: Line 5:
 
=Programming/Engineering Projects=
 
=Programming/Engineering Projects=
  
==Small-Sized Programming Projects==
+
==Small-Sized Projects==
* Modify [[bulk_extractor]] so that it can directly acquire a raw device under Windows. This requires replacing the current ''open'' function call with a ''CreateFile'' function call and using windows file handles.
+
; Sleuthkit:
 
* Rewrite SleuthKit '''sorter''' in C++ to make it faster and more flexible.
 
* Rewrite SleuthKit '''sorter''' in C++ to make it faster and more flexible.
 +
; tcpflow:
 +
* Modify [[tcpflow]]'s iptree.h implementation so that it only stores discriminating bit prefixes in the tree, similar to D. J. Bernstein's [http://cr.yp.to/critbit.html Crit-bit] trees.
 +
* Determine why [[tcpflow]]'s iptree.h implementation's ''prune'' works differently when caching is enabled then when it is disabled
  
==Medium-Sized Programming Projects==
+
==Medium-Sized Projects==
 +
===Forensic File Viewer ===
 
* Create a program that visualizes the contents of a file, sort of like hexedit, but with other features:
 
* Create a program that visualizes the contents of a file, sort of like hexedit, but with other features:
 
** Automatically pull out the strings
 
** Automatically pull out the strings
 
** Show histogram
 
** Show histogram
 
** Detect crypto and/or stenography.
 
** Detect crypto and/or stenography.
* Extend [[fiwalk]] to report the NTFS alternative data streams.
+
* Extend SleuthKit's [[fiwalk]] to report the NTFS alternative data streams.
 +
 
 +
===Data Sniffing===
 
* Create a method to detect NTFS-compressed cluster blocks on a disk (RAW data stream). A method could be to write a generic signature to detect the beginning of NTFS-compressed file segments on a disk. This method is useful in carving and scanning for textual strings.
 
* Create a method to detect NTFS-compressed cluster blocks on a disk (RAW data stream). A method could be to write a generic signature to detect the beginning of NTFS-compressed file segments on a disk. This method is useful in carving and scanning for textual strings.
 +
 +
 +
===SleuthKit Modifications===
 
* Write a FUSE-based mounter for SleuthKit, so that disk images can be forensically mounted using TSK.
 
* Write a FUSE-based mounter for SleuthKit, so that disk images can be forensically mounted using TSK.
 
* Modify SleuthKit's API so that the physical location on disk of compressed files can be learned.
 
* Modify SleuthKit's API so that the physical location on disk of compressed files can be learned.
  
==Big Programming/System Projects==
+
===Anti-Frensics Detection===
Most of these are large systems that could be split up into several small projects.
+
* A pluggable rule-based system that can detect the residual data or other remnants of running a variety of anti-forensics software
 +
 
 
===Carvers===
 
===Carvers===
 
Develop a new carver with a plug-in architecture and support for fragment reassembly carving. Take a look at:
 
Develop a new carver with a plug-in architecture and support for fragment reassembly carving. Take a look at:
Line 32: Line 42:
 
* Online Social Network Analysis
 
* Online Social Network Analysis
  
===Data Snarfing===
+
===Data Snarfing/Web Scraping===
 
* Find and download in a forensically secure manner all of the information in a social network (e.g. Facebook, LinkedIn, etc.) associated with a targeted individual.
 
* Find and download in a forensically secure manner all of the information in a social network (e.g. Facebook, LinkedIn, etc.) associated with a targeted individual.
 
* Determine who is searching for a targeted individual. This might be done with a honeypot, or documents with a tracking device in them, or some kind of covert Facebook App.
 
* Determine who is searching for a targeted individual. This might be done with a honeypot, or documents with a tracking device in them, or some kind of covert Facebook App.
 
* Automated grouping/annotation of low-level events, e.g. access-time, log-file entry, to higher-level events, e.g. program start, login
 
* Automated grouping/annotation of low-level events, e.g. access-time, log-file entry, to higher-level events, e.g. program start, login
  
===Anti-Frensics Detection===
 
A pluggable rule-based system that can detect the residual data or other remnants of running a variety of anti-forensics software
 
 
=== Timeline analysis ===
 
=== Timeline analysis ===
 
* Mapping differences and similarities in multiple versions of a system, e.g. those created by [[Windows Shadow Volumes]] but not limited to
 
* Mapping differences and similarities in multiple versions of a system, e.g. those created by [[Windows Shadow Volumes]] but not limited to
 
* Write a new timeline viewer that supports Logfile fusion (with offsets) and provides the ability to view the logfile in the frequency domain.
 
* Write a new timeline viewer that supports Logfile fusion (with offsets) and provides the ability to view the logfile in the frequency domain.
===Imaging Disk Farms===
 
How do you image an active file system?
 
===Audit===
 
How do we improve Audit capabilities?
 
  
=Reverse-Engineering Projects=
+
===EnCase Enhancement===
== Application analysis ==
+
* Develop an EnScript that allows you to script EnCase from Python. (You can do this because EnScripts can run arbitrary DLLs. The EnScript calls the DLL. Each "return" from the DLL is a specific EnCase command to execute. The EnScript then re-enters the DLL.)
 +
 
 +
==Reverse-Engineering Projects==
 +
=== Application analysis ===
 
* Reverse the on-disk structure of the [[Extensible Storage Engine (ESE) Database File (EDB) format]] to learn:
 
* Reverse the on-disk structure of the [[Extensible Storage Engine (ESE) Database File (EDB) format]] to learn:
 
** Fill in the missing information about older ESE databases
 
** Fill in the missing information about older ESE databases
Line 56: Line 63:
 
* Reverse the on-disk structure of Microsoft SQL Server databases
 
* Reverse the on-disk structure of Microsoft SQL Server databases
  
== Volume/File System analysis ==
+
=== Volume/File System analysis ===
 
* Analysis of inter snapshot changes in [[Windows Shadow Volumes]]
 
* Analysis of inter snapshot changes in [[Windows Shadow Volumes]]
* Add support to SleuthKit for [[FAT|eXFAT]], Microsoft's new FAT file system.
 
* Add support to SleuthKit for [[Resilient File System (ReFS)|ReFS]].
 
 
* Modify SleuthKit's NTFS implementation to support NTFS encrypted files (EFS)
 
* Modify SleuthKit's NTFS implementation to support NTFS encrypted files (EFS)
 
* Extend SleuthKit's implementation of NTFS to cover Transaction NTFS (TxF) (see [[NTFS]])
 
* Extend SleuthKit's implementation of NTFS to cover Transaction NTFS (TxF) (see [[NTFS]])
 
* Physical layer access to flash storage (requires reverse-engineering proprietary APIs for flash USB and SSD storage.)
 
* Physical layer access to flash storage (requires reverse-engineering proprietary APIs for flash USB and SSD storage.)
 
+
* Add support to SleuthKit for [[Resilient File System (ReFS)|ReFS]].
==EnCase Enhancement==
+
* Develop an EnScript that allows you to script EnCase from Python. (You can do this because EnScripts can run arbitrary DLLs. The EnScript calls the DLL. Each "return" from the DLL is a specific EnCase command to execute. The EnScript then re-enters the DLL.)
+
  
  
  
=Research Projects=
+
==Error Rates==
==Medium-Sized Research Projects==
+
* Develop an image processing program that can reliably detect screen shots. (Screen shots are useful to find on a hard drive because they can imply the presence of a remote control or surveillance program.)
+
 
* Develop improved techniques for identifying encrypted data. (It's especially important to distinguish encrypted data from compressed data).
 
* Develop improved techniques for identifying encrypted data. (It's especially important to distinguish encrypted data from compressed data).
 
* Quantify the error rate of different forensic tools and processes. Are these rates theoretical or implementation dependent? What is the interaction of the error rates and the [[Daubert]] standard?
 
* Quantify the error rate of different forensic tools and processes. Are these rates theoretical or implementation dependent? What is the interaction of the error rates and the [[Daubert]] standard?
Line 84: Line 85:
 
* SWOT of current visualization techniques in forensic tools; improvements; feasibility of 3D representation;
 
* SWOT of current visualization techniques in forensic tools; improvements; feasibility of 3D representation;
  
=See Also=
+
==See Also==
 
* [http://itsecurity.uiowa.edu/securityday/documents/guan.pdf Digital Forensics: Research Challenges and Open Problems, Dr. Yong Guan, Iowa State University, Dec. 4, 2007]
 
* [http://itsecurity.uiowa.edu/securityday/documents/guan.pdf Digital Forensics: Research Challenges and Open Problems, Dr. Yong Guan, Iowa State University, Dec. 4, 2007]
  

Revision as of 09:17, 29 April 2013

Interested in doing research in computer forensics? Looking for a master's topic, or just some ideas for a research paper? Here is our list. Please feel free to add your own ideas.

Many of these would make a nice master's project.

Programming/Engineering Projects

Small-Sized Projects

Sleuthkit
  • Rewrite SleuthKit sorter in C++ to make it faster and more flexible.
tcpflow
  • Modify tcpflow's iptree.h implementation so that it only stores discriminating bit prefixes in the tree, similar to D. J. Bernstein's Crit-bit trees.
  • Determine why the prune function in tcpflow's iptree.h implementation behaves differently when caching is enabled than when it is disabled.
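
The crit-bit idea mentioned above rests on one primitive: finding the first bit at which two keys differ, so that only that discriminating position needs to be stored in the tree. The following is a minimal Python sketch of that primitive for IPv4 addresses, not tcpflow's actual C++ code; the function name `critical_bit` is a hypothetical name chosen for illustration.

```python
import ipaddress
from typing import Optional


def critical_bit(a: str, b: str) -> Optional[int]:
    """Return the index (0 = most significant bit) of the first bit at which
    two IPv4 addresses differ, or None if the addresses are identical."""
    # XOR leaves a 1 in every position where the two addresses disagree.
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    if diff == 0:
        return None
    # bit_length() locates the highest set bit; convert to an MSB-first index.
    return 32 - diff.bit_length()


if __name__ == "__main__":
    print(critical_bit("192.168.0.0", "192.168.1.0"))
```

A crit-bit tree stores only this index at each internal node, which is why it can be more compact than a tree holding one node per prefix bit.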

Medium-Sized Projects

Forensic File Viewer

  • Create a program that visualizes the contents of a file, sort of like hexedit, but with other features:
    • Automatically pull out the strings
    • Show histogram
    • Detect crypto and/or steganography.
  • Extend SleuthKit's fiwalk to report NTFS alternate data streams.
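
Two of the viewer features listed above, string extraction and the byte histogram, can be sketched in a few lines. This is only an illustrative starting point under simple assumptions: strings are runs of printable ASCII of a minimum length, and the names `extract_strings` and `byte_histogram` are hypothetical.

```python
import re
from collections import Counter
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[bytes]:
    """Return runs of printable ASCII bytes that are at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return pattern.findall(data)


def byte_histogram(data: bytes) -> List[int]:
    """Return a 256-entry list with the number of occurrences of each byte value."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


if __name__ == "__main__":
    sample = b"\x00\x01hello\x00ab\x00world!!\xff"
    print(extract_strings(sample))
    print(byte_histogram(sample)[0])
```

A real viewer would also need to handle UTF-16 strings and files too large to hold in memory; both are left out of this sketch.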

Data Sniffing

  • Create a method to detect NTFS-compressed cluster blocks on a disk (RAW data stream). A method could be to write a generic signature to detect the beginning of NTFS-compressed file segments on a disk. This method is useful in carving and scanning for textual strings.
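
One possible form for such a signature is sketched below. It assumes the LZNT1 chunk layout, in which each compressed chunk starts with a 16-bit little-endian header whose top bit flags compression and whose next three bits hold the signature value 3. It also assumes 4096-byte clusters and that compressed runs begin on cluster boundaries. The names `CLUSTER_SIZE`, `looks_like_lznt1_chunk` and `candidate_offsets` are hypothetical.

```python
import struct
from typing import List

CLUSTER_SIZE = 4096  # assumption: 4 KiB clusters; adjust for the volume examined


def looks_like_lznt1_chunk(header: int) -> bool:
    """True if a 16-bit header has the compressed flag (bit 15) set and the
    signature bits (12-14) equal to 3, i.e. a top nibble of 0xB."""
    return (header & 0xF000) == 0xB000


def candidate_offsets(data: bytes, cluster_size: int = CLUSTER_SIZE) -> List[int]:
    """Return cluster-aligned offsets whose first two bytes resemble an
    LZNT1 compressed-chunk header."""
    hits = []
    for offset in range(0, len(data) - 1, cluster_size):
        (header,) = struct.unpack_from("<H", data, offset)
        if looks_like_lznt1_chunk(header):
            hits.append(offset)
    return hits


if __name__ == "__main__":
    image = bytearray(3 * CLUSTER_SIZE)
    image[CLUSTER_SIZE:CLUSTER_SIZE + 2] = struct.pack("<H", 0xB123)
    print(candidate_offsets(bytes(image)))
```

A four-bit check matches roughly one in sixteen random headers, so a practical detector would need further validation, for example following the chunk sizes to the next header or attempting a trial decompression.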


SleuthKit Modifications

  • Write a FUSE-based mounter for SleuthKit, so that disk images can be forensically mounted using TSK.
  • Modify SleuthKit's API so that the physical location on disk of compressed files can be learned.

Anti-Forensics Detection

  • Build a pluggable, rule-based system that detects the residual data or other remnants left by running a variety of anti-forensics software.

Carvers

Develop a new carver with a plug-in architecture and support for fragment reassembly carving. Take a look at:

Correlation Engine

  • Logfile correlation
  • Document identity identification
  • Correlation between stored data and intercept data
  • Online Social Network Analysis

Data Snarfing/Web Scraping

  • Find and download in a forensically secure manner all of the information in a social network (e.g. Facebook, LinkedIn, etc.) associated with a targeted individual.
  • Determine who is searching for a targeted individual. This might be done with a honeypot, or documents with a tracking device in them, or some kind of covert Facebook App.
  • Automatically group and annotate low-level events (e.g. access-time updates, log-file entries) into higher-level events (e.g. program start, login).
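
As a rough illustration of the grouping item above, the simplest baseline clusters low-level events that occur close together in time. The sketch below assumes events are `(timestamp, description)` pairs and uses a fixed gap threshold; the function name `group_events` and the 2-second default are hypothetical choices, and real annotation would need rules that map each cluster to a named higher-level event.

```python
from typing import List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, description)


def group_events(events: List[Event], max_gap: float = 2.0) -> List[List[Event]]:
    """Cluster events so that consecutive events in a group are at most
    max_gap seconds apart."""
    groups: List[List[Event]] = []
    for event in sorted(events):
        if groups and event[0] - groups[-1][-1][0] <= max_gap:
            groups[-1].append(event)  # close enough to the previous event
        else:
            groups.append([event])  # start a new candidate higher-level event
    return groups


if __name__ == "__main__":
    low_level = [(20.0, "atime: c.dll"), (10.0, "atime: a.exe"), (10.5, "log: b")]
    for group in group_events(low_level):
        print(group)
```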

Timeline analysis

  • Map the differences and similarities between multiple versions of a system, for example (but not limited to) those created by Windows Shadow Volumes.
  • Write a new timeline viewer that supports Logfile fusion (with offsets) and provides the ability to view the logfile in the frequency domain.
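
Logfile fusion with offsets can be pictured as a merge of several time-sorted logs after each log's timestamps are corrected by a per-source clock offset. The sketch below assumes each log is already sorted and that the offsets are known; the function name `fuse_logs` and the input layout are hypothetical, and the frequency-domain view is not covered here.

```python
import heapq
from typing import Dict, Iterator, List, Tuple

Entry = Tuple[float, str]  # (timestamp in seconds, message)


def fuse_logs(logs: Dict[str, Tuple[float, List[Entry]]]) -> List[Tuple[float, str, str]]:
    """Merge several time-sorted logs into one timeline.

    logs maps a source name to (clock_offset_seconds, entries). The offset is
    added to every timestamp of that source before merging.
    """
    def corrected(name: str, offset: float, entries: List[Entry]) -> Iterator[Tuple[float, str, str]]:
        for timestamp, message in entries:
            yield (timestamp + offset, name, message)

    streams = [corrected(name, offset, entries) for name, (offset, entries) in logs.items()]
    # heapq.merge lazily combines sorted streams without re-sorting everything.
    return list(heapq.merge(*streams))


if __name__ == "__main__":
    timeline = fuse_logs({
        "firewall": (0.0, [(1.0, "accept"), (5.0, "drop")]),
        "webserver": (-2.0, [(4.0, "GET /"), (9.0, "POST /login")]),
    })
    for row in timeline:
        print(row)
```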

EnCase Enhancement

  • Develop an EnScript that allows you to script EnCase from Python. (You can do this because EnScripts can run arbitrary DLLs. The EnScript calls the DLL. Each "return" from the DLL is a specific EnCase command to execute. The EnScript then re-enters the DLL.)

Reverse-Engineering Projects

Application analysis

Volume/File System analysis

  • Analysis of inter snapshot changes in Windows Shadow Volumes
  • Modify SleuthKit's NTFS implementation to support NTFS encrypted files (EFS)
  • Extend SleuthKit's implementation of NTFS to cover Transaction NTFS (TxF) (see NTFS)
  • Physical layer access to flash storage (requires reverse-engineering proprietary APIs for flash USB and SSD storage.)
  • Add support to SleuthKit for ReFS.


Error Rates

  • Develop improved techniques for identifying encrypted data. (It's especially important to distinguish encrypted data from compressed data).
  • Quantify the error rate of different forensic tools and processes. Are these rates theoretical or implementation dependent? What is the interaction of the error rates and the Daubert standard?
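
A common starting point for the encrypted-data item above is to compute byte-level statistics over a block. The sketch below computes Shannon entropy and a chi-square statistic against a uniform byte distribution. It is a baseline rather than a solution: both compressed and encrypted data score close to 8 bits per byte, which is why better techniques are needed. The function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def chi_square_uniform(data: bytes) -> float:
    """Return the chi-square statistic of the byte counts against a uniform
    distribution over the 256 byte values (lower means closer to uniform)."""
    if not data:
        raise ValueError("data must not be empty")
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(v, 0) - expected) ** 2 / expected for v in range(256))


if __name__ == "__main__":
    block = bytes(range(256)) * 16
    print(shannon_entropy(block), chi_square_uniform(block))
```

Any threshold on these values is an assumption that would have to be measured on labelled data, which ties this item to the error-rate question in the same list.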

Research Areas

These are research areas that could easily grow into a PhD thesis.

  • General-purpose detection of:
    • Steganography
    • Sanitization attempts
    • Evidence falsification (perhaps through inconsistencies in file system allocations, application data allocation, and log file analysis)
  • Visualization of data/information in digital forensic context
  • SWOT analysis of current visualization techniques in forensic tools, possible improvements, and the feasibility of 3D representation

See Also

  • Digital Forensics: Research Challenges and Open Problems, Dr. Yong Guan, Iowa State University, Dec. 4, 2007 (http://itsecurity.uiowa.edu/securityday/documents/guan.pdf)