Chrome Disk Cache Format
{{expand}}
== Cache files ==
The cache is stored in multiple files:

{| class="wikitable"
|-
! Filename
! Description
|-
| index
| The index file
|-
| data_#
| Data block files
|-
| f_######
| (Separate) data stream file
|}
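As a quick orientation step, a cache directory can be inventoried against these naming patterns. Here is a minimal sketch in Python; the cache path is an assumption and varies by platform and profile:

<pre>
import os
import re

# Path is an assumption; point it at the cache directory under examination.
CACHE_DIR = os.path.expanduser("~/.cache/google-chrome/Default/Cache")

# Naming patterns from the table above: index, data_#, f_######
PATTERNS = [
    (re.compile(r"^index$"), "the index file"),
    (re.compile(r"^data_[0-9]+$"), "data block file"),
    (re.compile(r"^f_[0-9a-f]{6}$"), "(separate) data stream file"),
]

for name in sorted(os.listdir(CACHE_DIR)):
    for pattern, description in PATTERNS:
        if pattern.match(name):
            print("{}: {}".format(name, description))
            break
</pre>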
  
== Cache address ==
The cache address is 4 bytes in size and consists of the following fields (offsets below are in byte.bit notation):

{| class="wikitable"
|-
! Offset
! Size
! Value
! Description
|-
| colspan="4" | <i>If file type is 0 (separate data stream file)</i>
|-
| 0.0
| 28 bits
|
| File number <br> The value represents the value of # in f_######
|-
| colspan="4" | <i>Else</i>
|-
| 0.0
| 16 bits
|
| Block number
|-
| 2.0
| 8 bits
|
| File number (or file selector) <br> The value represents the value of # in data_#
|-
| 3.0
| 2 bits
|
| Block size <br> The number of contiguous blocks, where 0 represents 1 block and 3 represents 4 blocks
|-
| 3.2
| 2 bits
|
| Reserved
|-
| colspan="4" | <i>Common</i>
|-
| 3.4
| 3 bits
|
| File type
|-
| 3.7
| 1 bit
|
| Initialized flag
|}
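To make the bit layout concrete, the following is a minimal decoder sketch in Python. It assumes the cache address has been read as an unsigned 32-bit little-endian integer; the function and field names are illustrative and not part of any Chrome API. The file type values are listed in the next section.

<pre>
def decode_cache_addr(addr: int) -> dict:
    """Decode a 4-byte cache address per the bit layout above."""
    initialized = bool(addr >> 31)        # bit 3.7: initialized flag
    file_type = (addr >> 28) & 0x07       # bits 3.4-3.6: file type
    if file_type == 0:
        # Separate data stream file: the low 28 bits are the # in f_######
        return {
            "initialized": initialized,
            "file_type": file_type,
            "file_name": "f_{:06x}".format(addr & 0x0fffffff),
        }
    # Block data file: block number, file selector, contiguous block count
    return {
        "initialized": initialized,
        "file_type": file_type,
        "file_name": "data_{:d}".format((addr >> 16) & 0xff),
        "block_number": addr & 0xffff,
        "block_count": ((addr >> 24) & 0x03) + 1,  # stored value 0 means 1 block
    }
</pre>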
  
=== File types ===
{| class="wikitable"
|-
! Value
! Description
|-
| 0
| (Separate) data stream file
|-
| 1
| (Rankings) block data file (36 byte block data file)
|-
| 2
| 256 byte block data file
|-
| 3
| 1024 byte block data file
|-
| 4
| 4096 byte block data file
|-
| 5
|
|-
| 6
| Unknown; seen on Mac OS X (0x6f430074)
|}
  
==== Examples ====
{| class="wikitable"
|-
! Value
! Description
|-
| 0x00000000
| Not initialized
|-
| 0x8000002a
| Data stream file: f_00002a
|-
| 0xa0010003
| Block data file: data_1, block number 3, 1 block in size
|}
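As a cross-check, feeding these example values to the decoder sketched above reproduces the table:

<pre>
for addr in (0x00000000, 0x8000002a, 0xa0010003):
    decoded = decode_cache_addr(addr)
    if not decoded["initialized"]:
        print("0x{:08x}: not initialized".format(addr))
    else:
        print("0x{:08x}: {}".format(addr, decoded))

# 0x00000000: not initialized
# 0x8000002a: {'initialized': True, 'file_type': 0, 'file_name': 'f_00002a'}
# 0xa0010003: {'initialized': True, 'file_type': 2, 'file_name': 'data_1',
#              'block_number': 3, 'block_count': 1}
</pre>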
  
== Index file format (index) ==
Overview:
* File header
* Least recently used (LRU) data (or eviction control data)
* Index table
  
=== File header ===
''TODO''
  
== Data block file format (data_#) ==
Overview:
* File header
* Array of blocks
  
=== File header ===
''TODO''
  
== Data stream ==
See: [[gzip]]
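The [[gzip]] reference suggests that cached data streams may be stored gzip-compressed, as delivered by the server. Here is a minimal inflation sketch in Python; the file name is hypothetical, and streams without the gzip magic bytes are left untouched:

<pre>
import gzip

# File name is hypothetical; whether a given stream is gzip-compressed
# depends on the Content-Encoding of the original HTTP response.
with open("f_00002a", "rb") as stream:
    data = stream.read()

if data[:2] == b"\x1f\x8b":  # gzip magic bytes
    data = gzip.decompress(data)

print(data[:256])
</pre>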
  
== See Also ==
* [[Google Chrome]]
* [[gzip]]
  
== External Links ==
[[Category:File Formats]]