Difference between pages "Upcoming events" and "Bulk extractor"

From ForensicsWiki
(Difference between pages)
(Conferences)
 
m (current version is 1.4.4)
 
Line 1: Line 1:
<b>PLEASE READ BEFORE YOU EDIT THE LISTS BELOW</b><br>
+
== Overview ==
When events begin the same day, events of a longer length should be listed first. New postings of events with the same date(s) as other events should be added after events already in the list. Please use three-letter month abbreviations (e.g. Sep, NOT Sept. or September), use two-digit dates (e.g. Jan 01, NOT Jan 1), and use date ranges rather than listing every date during an event (e.g. Jan 02-05, NOT Jan 02, 03, 04, 05).<br>
+
'''bulk_extractor''' is a computer forensics tool that scans a disk image, a file, or a directory of files and extracts useful information without parsing the file system or file system structures. The results can be easily inspected, parsed, or processed with automated tools. '''bulk_extractor''' also creates histograms of the features that it finds, as features that are more common tend to be more important. The program can be used for law enforcement, defense, intelligence, and cyber-investigation applications.
<i>Some events may be <u>limited</u> to <b>Law Enforcement Only</b> or to a specific audience.  Such restrictions should be noted when known.</i>
+
  
This is a BY DATE listing of upcoming events relevant to [[digital forensics]]. It is not an all inclusive list, but includes most well-known activities. Some events may duplicate events on the generic [[conferences]] page, but entries in this list have specific dates and locations for the upcoming event.
+
bulk_extractor is distinguished from other forensic tools by its speed and thoroughness. Because it ignores file system structure, bulk_extractor can process different parts of the disk in parallel. In practice, the program splits the disk into 16 MiB pages and processes one page on each available core. This means that a 24-core machine processes a disk roughly 24 times faster than a 1-core machine. bulk_extractor is also thorough because it automatically detects, decompresses, and recursively re-processes data compressed with a variety of algorithms. Our testing has shown that the unallocated regions of file systems contain a significant amount of compressed data that is missed by most forensic tools in common use today.
  
This listing is divided into three sections (described as follows):<br>
+
Another advantage of ignoring file systems is that bulk_extractor can be used to process any digital media. We have used the program to process hard drives, SSDs, optical media, camera cards, cell phones, network packet dumps, and other kinds of digital information.
<ol><li><b><u>[[Upcoming_events#Calls_For_Papers|Calls For Papers]]</u></b> - Calls for papers for either Journals or for Conferences, relevant to Digital Forensics (Name, Closing Date, URL)</li><br>
+
<li><b><u>[[Upcoming_events#Conferences|Conferences]]</u></b> - Conferences relevant for Digital Forensics (Name, Date, Location, URL)</li><br>
+
<li><b><u>[[Training Courses and Providers]]</u></b> - Training </li><br></ol>
+
  
== Calls For Papers ==
+
==Output Feature Files==
Please help us keep this up-to-date with deadlines for upcoming conferences that would be appropriate for forensic research.
+
  
{| border="0" cellpadding="2" cellspacing="2" align="top"
+
bulk_extractor now creates an output directory that includes:
|- style="background:#bfbfbf; font-weight: bold"
+
* '''ccn.txt''' -- Credit card numbers
! width="30%"|Title
+
* '''ccn_track2.txt''' -- Credit card “track 2” information
! width="15%"|Due Date
+
* '''domain.txt''' -- Internet domains found on the drive, including dotted-quad addresses found in text.
! width="15%"|Notification Date
+
* '''email.txt''' -- Email addresses
! width="40%"|Website
+
* '''ether.txt''' -- Ethernet MAC addresses found through IP packet carving of swap files and compressed system hibernation files and file fragments.
|-
+
* '''exif.txt''' -- EXIFs from JPEGs and video segments. This feature file contains all of the EXIF fields, expanded as XML records.
|IEEE Symposium on Security and Privacy
+
* '''find.txt''' -- The results of specific regular expression search requests.
|Nov 13, 2013
+
* '''ip.txt''' -- IP addresses found through IP packet carving.
|
+
* '''telephone.txt''' -- US and international telephone numbers.
|http://www.ieee-security.org/TC/SP2014/cfp.html
+
* '''url.txt''' -- URLs, typically found in browser caches, email messages, and pre-compiled into executables.
|-
+
* '''url_searches.txt''' -- A histogram of terms used in Internet searches from services such as Google, Bing, Yahoo, and others.
|DFRWS-Europe 2014
+
* '''wordlist.txt''' -- A list of all “words” extracted from the disk, useful for password cracking.
|Dec 01, 2013
+
* '''wordlist_*.txt''' -- The wordlist with duplicates removed, formatted so that it can be easily imported into a popular password-cracking program.
|Mar 01, 2014
+
* '''zip.txt''' -- A file containing information regarding every ZIP file component found on the media. This is exceptionally useful because ZIP files contain internal structure and ZIP is increasingly the compound file format of choice for a variety of products, such as Microsoft Office.
|http://www.dfrws.org/2014-europe/index.shtml
+
|-
+
|44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
+
|Dec 01, 2013
+
|Feb 25, 2014
+
|http://www.dsn.org/
+
|-
+
|12th International Conference on Applied Cryptography and Network Security
+
|Jan 10, 2014
+
|Mar 14, 2014
+
|http://acns2014.epfl.ch/callpapers.php
+
|-
+
|USENIX Annual Technical Conference
+
|Jan 28, 2014
+
|Apr 07, 2014
+
|https://www.usenix.org/conference/atc14/call-for-papers
+
|-
+
|Audio Engineering Society (AES) Conference on Audio Forensics
+
|Jan 31, 2014
+
|Mar 15, 2014
+
|http://www.aes.org/conferences/54/downloads/54thCallForContributions.pdf
+
|-
+
|}
+
  
See also [http://www.wikicfp.com/cfp/servlet/tool.search?q=forensics WikiCFP 'Forensics']
+
For each of the above, two additional files may be created:
 +
* '''*_stopped.txt''' -- bulk_extractor supports a stop list, or a list of items that do not need to be brought to the user’s attention. However, rather than simply suppressing this information, which might cause something critical to be hidden, stopped entries are stored in the stopped files.
 +
* '''*_histogram.txt''' -- bulk_extractor can also create histograms of features. This is important, as experience has shown that email addresses, domain names, URLs, and other information that appear more frequently on a hard drive or in a cell phone’s memory can be used to rapidly create a pattern-of-life report.
  
== Conferences ==
+
Bulk extractor also creates a file that captures the provenance of the run:
{| border="0" cellpadding="2" cellspacing="2" align="top"
+
;report.xml
|- style="background:#bfbfbf; font-weight: bold"
+
:A Digital Forensics XML report that includes information about the source media, how the bulk_extractor program was compiled and run, the time to process the digital evidence, and a meta report of the information that was found.
! width="40%"|Title
+
! width="20%"|Date/Location
+
! width="40%"|Website
+
|-
+
|VB2013 - the 23rd Virus Bulletin International Conference
+
|Oct 02-04<br>Berlin, Germany
+
|http://www.virusbtn.com/conference/vb2013/index
+
|-
+
|8th International Conference on Malicious and Unwanted Software
+
|Oct 22-24<br>Fajardo, Puerto Rico, USA
+
|http://www.malwareconference.org/index.php?option=com_frontpage&Itemid=1
+
|-
+
|16th International Symposium on Research in Attacks, Intrusions and Defenses (RAID)
+
|Oct 23-25<br>St. Lucia
+
|http://www.raid2013.org/
+
|-
+
|5th International Workshop on Managing Insider Security Threats
+
|Oct 24-25<br>Busan, South Korea
+
|http://isyou.info/conf/mist13/index.htm
+
|-
+
|20th ACM Conference on Computer and Communications Security
+
|Nov 04-08<br>Berlin, Germany
+
|http://www.sigsac.org/ccs/CCS2013/
+
|-
+
|4th Annual Open Source Digital Forensics Conference (OSDF)
+
|Nov 04-05<br>Chantilly, VA
+
|http://www.basistech.com/about-us/events/open-source-forensics-conference/
+
|-
+
|Paraben Forensic Innovations Conference
+
|Nov 13-15<br>Salt Lake City, UT
+
|http://www.pfic-conference.com/
+
|-
+
|2013 International Conference on Information and Communications Security
+
|Nov 20-22<br>Beijing, China
+
|http://icsd.i2r.a-star.edu.sg/icics2013/index.php
+
|-
+
|8th International Workshop on Systematic Approaches to Digital Forensic Engineering (SADFE)
+
|Nov 21-22<br>Hong Kong, China
+
|http://conf.ncku.edu.tw/sadfe/sadfe13/
+
|-
+
|Black Hat-Regional Summit
+
|Nov 26-27<br>Sao Paulo, Brazil
+
|https://www.blackhat.com/sp-13
+
|-
+
|29th Annual Computer Security Applications Conference (ACSAC)
+
|Dec 09-13<br>New Orleans, LA
+
|http://www.acsac.org
+
|-
+
|IFIP WG 11.9 International Conference on Digital Forensics
+
|Jan 08-10<br>Vienna, Austria
+
|http://www.ifip119.org/Conferences/
+
|-
+
|AAFS 66th Annual Scientific Meeting
+
|Feb 17-22<br>Seattle, WA
+
|http://www.aafs.org/aafs-66th-annual-scientific-meeting
+
|-
+
|21st Network & Distributed System Security Symposium
+
|Feb 23-26<br>San Diego, CA
+
|http://www.internetsociety.org/events/ndss-symposium
+
|-
+
|Fourth ACM Conference on Data and Application Security and Privacy 2014
+
|Mar 03-05<br>San Antonio, TX
+
|http://www1.it.utsa.edu/codaspy/
+
|-
+
|9th International Conference on Cyber Warfare and Security (ICCWS-2014)
+
|Mar 24-25<br>West Lafayette, IN
+
|http://academic-conferences.org/iciw/iciw2014/iciw14-home.htm
+
|-
+
|DFRWS-Europe 2014
+
|May 07-09<br>Amsterdam, Netherlands
+
|http://dfrws.org/2014eu/index.shtml
+
|-
+
|2014 IEEE Symposium on Security and Privacy
+
|May 16-23<br>Berkeley, CA
+
|http://www.ieee.org/conferences_events/conferences/conferencedetails/index.html?Conf_ID=16517
+
|-
+
|Techno-Security and Forensics Conference
+
|Jun 01-04<br>Myrtle Beach, SC
+
|http://www.techsec.com/html/Security%20Conference%202014.html
+
|-
+
|Mobile Forensics World
+
|Jun 01-04<br>Myrtle Beach, SC
+
|http://www.techsec.com/html/MFC-2014-Spring.html
+
|-
+
|12th International Conference on Applied Cryptography and Network Security
+
|Jun 10-13<br>Lausanne, Switzerland
+
|http://acns2014.epfl.ch/
+
|-
+
|54th Conference on Audio Forensics
+
|Jun 12-14<br>London, England
+
|http://www.aes.org/conferences/54/
+
|-
+
|2014 USENIX Annual Technical Conference
+
|Jun 19-20<br>Philadelphia, PA
+
|https://www.usenix.org/conference/atc14
+
|-
+
|44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
+
|Jun 23-26<br>Atlanta, GA
+
|http://www.dsn.org/
+
|-
+
|Symposium On Usable Privacy and Security (SOUPS) 2014
+
|Jul 09-11<br>Menlo Park, CA
+
|http://cups.cs.cmu.edu/soups/2013/
+
|-
+
|DFRWS 2014
+
|Aug 03-06<br>Denver, CO
+
|http://dfrws.org/2014/index.shtml
+
|-
+
|23rd USENIX Security Symposium
+
|Aug 20-22<br>San Diego, CA
+
|https://www.usenix.org/conferences
+
|-
+
|25th Annual Conference & Digital Multimedia Evidence Training Symposium
+
|Oct 06-10<br>Coeur d’Alene, ID
+
|http://www.leva.org/annual-training-conference/
+
|-
+
|}
+
  
==See Also==
+
==Post-Processing==
* [[Training Courses and Providers]]
+
 
==References==
+
We have developed four programs for post-processing the bulk_extractor output:
* [http://faculty.cs.tamu.edu/guofei/sec_conf_stat.htm Computer Security Conference Ranking and Statistic]
+
;bulk_diff.py
* [http://www.kdnuggets.com/meetings/ Meetings and Conferences in Data Mining and Discovery]
+
:This program reports the differences between two bulk_extractor runs. The intent is to image a computer, run bulk_extractor on a disk image, let the computer run for a period of time, re-image the computer, run bulk_extractor on the second image, and then report the differences. This can be used to infer the user’s activities within a time period.
* [http://www.conferencealerts.com/data.htm Data Mining Conferences World-Wide]
+
;cda_tool.py
 +
:This tool, currently under development, reads multiple bulk_extractor reports from multiple runs against multiple drives and performs a multi-drive correlation using Garfinkel’s Cross Drive Analysis technique. This can be used to automatically identify new social networks or to identify new members of existing networks.
 +
;identify_filenames.py
 +
:In the bulk_extractor feature file, each feature is annotated with the byte offset from the beginning of the image in which it was found. The program takes as input a bulk_extractor feature file and a DFXML file containing the locations of each file on the drive (produced with Garfinkel’s fiwalk program) and produces an annotated feature file that contains the offset, feature, and the file in which the feature was found.
 +
;make_context_stop_list.py
 +
:Although forensic analysts frequently make “stop lists” (for example, a list of email addresses that appear in the operating system and should therefore be ignored), such lists have a significant problem. Because it is relatively easy to get an email address into the binary of an open source application, ignoring all of these email addresses may make it possible to cloak email addresses from forensic analysis. Our solution is to create context-sensitive stop lists, in which the feature to be stopped is presented with the context in which it occurs. The make_context_stop_list.py program takes the results of multiple bulk_extractor runs and creates a single context-sensitive stop list that can then be used to suppress features when found in a specific context. One such stop list constructed from Windows and Linux operating systems is available on the bulk_extractor website.
 +
 
 +
== Download ==
 +
The current version of '''bulk_extractor''' is 1.4.4.
 +
 
 +
* Downloads are available at: http://digitalcorpora.org/downloads/bulk_extractor/
 +
* A Windows installer with the GUI can be downloaded from: http://www.digitalcorpora.org/downloads/bulk_extractor/bulk_extractor-1.4.1-windowsinstaller.exe
 +
 
 +
== Bibliography ==
 +
=== Academic Publications ===
 +
# Garfinkel, Simson, [http://simson.net/clips/academic/2013.COSE.bulk_extractor.pdf Digital media triage with bulk data analysis and bulk_extractor]. Computers and Security 32: 56-72 (2013)
 +
# Beverly, Robert, Simson Garfinkel and Greg Cardwell, [http://simson.net/clips/academic/2011.DFRWS.ipcarving.pdf "Forensic Carving of Network Packets and Associated Data Structures"], DFRWS 2011, Aug. 1-3, 2011, New Orleans, LA. BEST PAPER AWARD (Acceptance rate: 23%, 14/62)
 +
# Garfinkel, S., [http://simson.net/clips/academic/2006.DFRWS.pdf Forensic Feature Extraction and Cross-Drive Analysis], The 6th Annual Digital Forensic Research Workshop, Lafayette, Indiana, August 14-16, 2006. (Acceptance rate: 43%, 16/37)
 +
 
 +
===YouTube===
 +
'''[http://www.youtube.com/results?search_query=bulk_extractor search YouTube] for bulk_extractor videos'''
 +
* [http://www.youtube.com/watch?v=odvDTGA7rYI Simson Garfinkel speaking at CERIAS about bulk_extractor]
 +
* [http://www.youtube.com/watch?v=wTBHM9DeLq4 BackTrack 5 with bulk_extractor]
 +
* [http://www.youtube.com/watch?v=QVfYOvhrugg Ubuntu 12.04 forensics with bulk_extractor]
 +
* [http://www.youtube.com/watch?v=57RWdYhNvq8 Social Network forensics with bulk_extractor]
 +
 
 +
===Tutorials===
 +
# [http://simson.net/ref/2012/2012-08-08%20bulk_extractor%20Tutorial.pdf Using bulk_extractor for digital forensics triage and cross-drive analysis], DFRWS 2012

Latest revision as of 15:24, 19 May 2014

Overview

bulk_extractor is a computer forensics tool that scans a disk image, a file, or a directory of files and extracts useful information without parsing the file system or file system structures. The results can be easily inspected, parsed, or processed with automated tools. bulk_extractor also creates histograms of the features that it finds, as features that are more common tend to be more important. The program can be used for law enforcement, defense, intelligence, and cyber-investigation applications.

bulk_extractor is distinguished from other forensic tools by its speed and thoroughness. Because it ignores file system structure, bulk_extractor can process different parts of the disk in parallel. In practice, the program splits the disk into 16 MiB pages and processes one page on each available core. This means that a 24-core machine processes a disk roughly 24 times faster than a 1-core machine. bulk_extractor is also thorough because it automatically detects, decompresses, and recursively re-processes data compressed with a variety of algorithms. Our testing has shown that the unallocated regions of file systems contain a significant amount of compressed data that is missed by most forensic tools in common use today.
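The page-splitting idea can be illustrated with a short Python sketch. This is not bulk_extractor's implementation (the tool itself is a compiled program); the names `scan_page`, `scan_image`, `MARGIN`, and the simple email regular expression are assumptions made for illustration. The overlap margin around each page is also an assumption, added here so that a feature straddling a page boundary is still seen in full.

```python
import re
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 16 * 1024 * 1024  # 16 MiB pages, as described in the text
MARGIN = 1024                 # hypothetical overlap around each page
# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def scan_page(data, offset, page_size=PAGE_SIZE, margin=MARGIN):
    """Scan one page plus its margins; return (absolute_offset, feature) pairs."""
    start = max(0, offset - margin)
    window = data[start:offset + page_size + margin]
    hits = []
    for m in EMAIL_RE.finditer(window):
        pos = start + m.start()
        # Report only features that begin inside this page, so a feature
        # visible in a neighbouring page's margin is not reported twice.
        if offset <= pos < offset + page_size:
            hits.append((pos, m.group().decode("ascii")))
    return hits

def scan_image(data, page_size=PAGE_SIZE, margin=MARGIN, workers=4):
    """Scan every page independently and merge the results by offset."""
    offsets = range(0, len(data), page_size)
    # Threads keep the sketch short; pages are independent, so any
    # worker pool (processes, native threads) could be substituted.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda off: scan_page(data, off, page_size, margin), offsets)
    return sorted(hit for page_hits in results for hit in page_hits)
```

Because no page depends on the result of another, the work can be handed to as many workers as there are cores, which is the property the paragraph above relies on.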

Another advantage of ignoring file systems is that bulk_extractor can be used to process any digital media. We have used the program to process hard drives, SSDs, optical media, camera cards, cell phones, network packet dumps, and other kinds of digital information.

Output Feature Files

bulk_extractor now creates an output directory that includes:

  • ccn.txt -- Credit card numbers
  • ccn_track2.txt -- Credit card “track 2” information
  • domain.txt -- Internet domains found on the drive, including dotted-quad addresses found in text.
  • email.txt -- Email addresses
  • ether.txt -- Ethernet MAC addresses found through IP packet carving of swap files and compressed system hibernation files and file fragments.
  • exif.txt -- EXIFs from JPEGs and video segments. This feature file contains all of the EXIF fields, expanded as XML records.
  • find.txt -- The results of specific regular expression search requests.
  • ip.txt -- IP addresses found through IP packet carving.
  • telephone.txt -- US and international telephone numbers.
  • url.txt -- URLs, typically found in browser caches, email messages, and pre-compiled into executables.
  • url_searches.txt -- A histogram of terms used in Internet searches from services such as Google, Bing, Yahoo, and others.
  • wordlist.txt -- A list of all “words” extracted from the disk, useful for password cracking.
  • wordlist_*.txt -- The wordlist with duplicates removed, formatted so that it can be easily imported into a popular password-cracking program.
  • zip.txt -- A file containing information regarding every ZIP file component found on the media. This is exceptionally useful because ZIP files contain internal structure and ZIP is increasingly the compound file format of choice for a variety of products, such as Microsoft Office.

For each of the above, two additional files may be created:

  • *_stopped.txt -- bulk_extractor supports a stop list, or a list of items that do not need to be brought to the user’s attention. However, rather than simply suppressing this information, which might cause something critical to be hidden, stopped entries are stored in the stopped files.
  • *_histogram.txt -- bulk_extractor can also create histograms of features. This is important, as experience has shown that email addresses, domain names, URLs, and other information that appear more frequently on a hard drive or in a cell phone’s memory can be used to rapidly create a pattern-of-life report.
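As a rough illustration of how a feature file can be turned into a histogram, the sketch below counts how often each feature occurs. It assumes that each feature-file line holds tab-separated offset, feature, and context fields and that lines beginning with `#` are comments; the function names and the choice to fold features to lower case are assumptions for this example, not a description of how bulk_extractor builds its own histogram files.

```python
from collections import Counter

def parse_feature_lines(lines):
    """Yield (offset, feature, context) from feature-file lines.

    Assumed layout: offset<TAB>feature<TAB>context, with '#' comment lines.
    Offsets are kept as strings in case they are not plain integers.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip malformed lines rather than guess
        context = parts[2] if len(parts) > 2 else ""
        yield parts[0], parts[1], context

def histogram(lines):
    """Return (feature, count) pairs, most frequent first."""
    # Lower-casing is an assumption so that Alice@ and alice@ count together.
    counts = Counter(feature.lower() for _, feature, _ in parse_feature_lines(lines))
    return counts.most_common()
```

With a real run, `histogram(open("email.txt", encoding="utf-8", errors="replace"))` would list the most frequently seen addresses first, which is the ordering the pattern-of-life argument depends on.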

Bulk extractor also creates a file that captures the provenance of the run:

report.xml
A Digital Forensics XML report that includes information about the source media, how the bulk_extractor program was compiled and run, the time to process the digital evidence, and a meta report of the information that was found.

Post-Processing

We have developed four programs for post-processing the bulk_extractor output:

bulk_diff.py
This program reports the differences between two bulk_extractor runs. The intent is to image a computer, run bulk_extractor on a disk image, let the computer run for a period of time, re-image the computer, run bulk_extractor on the second image, and then report the differences. This can be used to infer the user’s activities within a time period.
cda_tool.py
This tool, currently under development, reads multiple bulk_extractor reports from multiple runs against multiple drives and performs a multi-drive correlation using Garfinkel’s Cross Drive Analysis technique. This can be used to automatically identify new social networks or to identify new members of existing networks.
identify_filenames.py
In the bulk_extractor feature file, each feature is annotated with the byte offset from the beginning of the image in which it was found. The program takes as input a bulk_extractor feature file and a DFXML file containing the locations of each file on the drive (produced with Garfinkel’s fiwalk program) and produces an annotated feature file that contains the offset, feature, and the file in which the feature was found.
make_context_stop_list.py
Although forensic analysts frequently make “stop lists” (for example, a list of email addresses that appear in the operating system and should therefore be ignored), such lists have a significant problem. Because it is relatively easy to get an email address into the binary of an open source application, ignoring all of these email addresses may make it possible to cloak email addresses from forensic analysis. Our solution is to create context-sensitive stop lists, in which the feature to be stopped is presented with the context in which it occurs. The make_context_stop_list.py program takes the results of multiple bulk_extractor runs and creates a single context-sensitive stop list that can then be used to suppress features when found in a specific context. One such stop list constructed from Windows and Linux operating systems is available on the bulk_extractor website.
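The difference between a plain stop list and a context-sensitive one can be sketched in a few lines of Python. This is an illustration of the idea only, not the code of make_context_stop_list.py; the function names and the in-memory record layout (offset, feature, context) are assumptions.

```python
def build_context_stop_list(runs):
    """Merge several runs into one set of (feature, context) pairs.

    `runs` is an iterable of runs; each run is an iterable of
    (offset, feature, context) records, e.g. from clean OS installs.
    """
    stop = set()
    for run in runs:
        for _, feature, context in run:
            stop.add((feature, context))
    return stop

def apply_stop_list(records, stop):
    """Split records into (kept, stopped).

    A record is stopped only when BOTH its feature and its context match
    an entry, so the same feature in a new context is still reported.
    Stopped records are returned rather than discarded.
    """
    kept, stopped = [], []
    for record in records:
        _, feature, context = record
        if (feature, context) in stop:
            stopped.append(record)
        else:
            kept.append(record)
    return kept, stopped
```

Matching on the pair rather than on the feature alone is what prevents an address that happens to ship inside an operating system binary from being hidden when it also turns up in, say, a user's mail.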

Download

The current version of bulk_extractor is 1.4.4.

Bibliography

Academic Publications

  1. Garfinkel, Simson, Digital media triage with bulk data analysis and bulk_extractor. Computers and Security 32: 56-72 (2013)
  2. Beverly, Robert, Simson Garfinkel and Greg Cardwell, "Forensic Carving of Network Packets and Associated Data Structures", DFRWS 2011, Aug. 1-3, 2011, New Orleans, LA. BEST PAPER AWARD (Acceptance rate: 23%, 14/62)
  3. Garfinkel, S., Forensic Feature Extraction and Cross-Drive Analysis, The 6th Annual Digital Forensic Research Workshop, Lafayette, Indiana, August 14-16, 2006. (Acceptance rate: 43%, 16/37)

YouTube

search YouTube for bulk_extractor videos

Tutorials

  1. Using bulk_extractor for digital forensics triage and cross-drive analysis, DFRWS 2012