NSF DUE-0919593 was a project funded by the National Science Foundation to Create Realistic Forensic Corpora for Undergraduate Education and Research. Key performers were Simson Garfinkel at the Naval Postgraduate School, Dave Dittrich at the University of Washington, and Cal Lee at the University of North Carolina. The project's lasting contribution to computer forensics is the set of realistic data sets listed below.
EDUCATIONAL DATA SETS
2009-M57 "Patents" scenario
This scenario involves a small company called M57 which was engaged in prior art searches for patents. The fictional company is contacted by the local police in November 2009 after a person purchases a computer from Craigslist and discovers "kitty porn" on the computer. The police trace the computer back to the M57 company.
The scenario actually involves three separate criminal activities:
- Exfiltration of proprietary information by an M57 employee.
- Theft of M57's property and its sale on Craigslist.
- The possession of "kitty porn" photos by an M57 employee.
This is an involved scenario. The following materials are available to students trying to "solve" the case:
- Disk image of the computer that was sold on Craigslist.
- Disk images of the firm's five computers as they were when the police arrived.
- Disk images of the four USB drives that were found on-site belonging to M57 employees
- The RAM image of each computer just before the disk was imaged.
There are approximately 2-4 weeks of use on each computer.
Nitroba University Harassment Scenario
This scenario involves a harassment case at the fictional Nitroba University.
Nitroba's IT department has received an email from Lily Tuckrige, a teacher in the Chemistry Department. Tuckrige has been receiving harassing emails and she suspects that they are being sent by a student in her class, Chemistry 109, which she is teaching this summer. The email was received at Tuckrige's personal email account, email@example.com. She took a screenshot of the web browser and sent it in.
The system administrator who received the complaint wrote back to Tuckrige that Nitroba needed the full headers of the email message. Tuckrige responded by clicking the "Full message headers" button in Yahoo Mail and sent in another screenshot, this one with the mail headers.
The mail headers show that the message originated from the IP address 22.214.171.124, which is assigned to a Nitroba student dorm room. Three women share the dorm room. Nitroba provides an Ethernet connection in every dorm room but not Wi-Fi access, so one of the women's friends installed a Wi-Fi router in the room. There is no password on the Wi-Fi.
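As a rough illustration of how an originating address can be read out of full message headers, the sketch below collects the bracketed IPv4 addresses from each Received line. It is not part of the scenario materials: the function name `received_ips` is hypothetical, it assumes the headers have been transcribed to text, and it assumes the relays write addresses in square brackets, which is common but not universal. Received lines are prepended by each relay, so the last entry is usually the hop closest to the sender. Lines below the first trusted relay can be forged, so the result is a lead rather than proof.

```python
import re
from email import message_from_string
from typing import List

# Matches an IPv4 address written in square brackets, e.g. "[192.0.2.44]".
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_ips(raw_headers: str) -> List[str]:
    """Return bracketed IPv4 addresses from Received headers, top to bottom.

    The top Received header is the most recent hop, so the last element of
    the returned list is the candidate originating address.
    """
    msg = message_from_string(raw_headers)
    ips: List[str] = []
    for hop in msg.get_all("Received", []):
        ips.extend(IPV4_IN_BRACKETS.findall(hop))
    return ips
```

Calling `received_ips(text)[-1]` on a transcribed header block gives the address to compare against the network's address assignments, provided the list is not empty.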
Because several email messages appear to come from the IP address, Nitroba decides to place a network sniffer on the Ethernet port. All of the packets are logged. On Monday 7/21 Tuckrige received another harassing email. This time, instead of sending it directly, the perpetrator sent it through a web-based service called "willselfdestruct.com." The website briefly shows the message to Tuckrige and then reports that the "Message Has Been Destroyed."
Students are provided with the screenshots, the packets that were collected from the Ethernet tap, and the Chem 109 roster. Their job is to determine whether one of the students in the class was responsible for the harassing email and to provide clear, conclusive evidence to support their conclusion.
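For students who want to look at the logged packets without a dedicated tool, the following is a minimal sketch of walking a capture file record by record. It assumes the capture is in the classic libpcap format with microsecond timestamps, which this page does not specify. The function name `iter_pcap_records` is hypothetical, and the newer pcapng format is not handled.

```python
import struct
from typing import Iterator, Tuple

GLOBAL_HEADER_LEN = 24   # magic, version, timezone, sigfigs, snaplen, linktype
RECORD_HEADER_LEN = 16   # ts_sec, ts_usec, incl_len, orig_len


def iter_pcap_records(data: bytes) -> Iterator[Tuple[float, bytes]]:
    """Yield (timestamp_seconds, raw_frame) pairs from classic pcap bytes."""
    if len(data) < GLOBAL_HEADER_LEN:
        raise ValueError("too short to contain a pcap global header")
    # The magic number reveals the byte order used by the writer.
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"
    elif magic == 0xD4C3B2A1:
        endian = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")

    offset = GLOBAL_HEADER_LEN
    while offset + RECORD_HEADER_LEN <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + RECORD_HEADER_LEN]
        )
        offset += RECORD_HEADER_LEN
        frame = data[offset:offset + incl_len]
        if len(frame) < incl_len:
            break  # final record is truncated; stop cleanly
        offset += incl_len
        yield ts_sec + ts_usec / 1_000_000, frame
```

Each yielded frame is the raw link-layer data, so decoding Ethernet, IP, and TCP headers is a further step that an established packet analysis tool would normally handle.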
M57-Jean Scenario
The M57-Jean scenario is a single-disk-image scenario involving the exfiltration of corporate documents from the laptop of a senior executive. The scenario involves a small start-up company, M57.Biz. A few weeks after the company's inception, a confidential spreadsheet containing the names and salaries of the company's key employees was found posted to the "comments" section of a competitor's website. The spreadsheet existed only on the laptop of one of M57's officers, Jean. Jean says that she has no idea how the data left her laptop and that she must have been hacked.
Students are given a disk image of Jean's laptop. Their job is to figure out how the data was stolen, or whether Jean isn't as innocent as she claims.
RESEARCH DATA SETS
We are also making available a larger "research data set" containing a wealth of information for students interested in RAM, network, or disk forensics.
The research data set was created at the same time as the 2009-M57 Patents dataset but contains substantially more information:
- All of the IP packets in and out of the M57 test network.
- Daily disk images and RAM captures of each computer on the network.
This data is not needed to "solve" the scenario, but it might be interesting for students who:
- Are interested in learning about RAM analysis and need a source of RAM images.
- Are interested in network forensics and want packets.
- Are interested in writing software that does "disk differencing" or can detect the installation of malware.
- Want examples of how a Windows registry is modified over time with use.
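To make "disk differencing" concrete, here is a minimal sketch that compares two per-day file manifests, each a mapping from file path to content hash. It assumes the manifests have already been extracted from two daily disk images by some other tool. The names `diff_manifests`, `ManifestDiff`, and `sha256_hex` are hypothetical and do not come from the corpus or its tooling.

```python
import hashlib
from typing import Dict, List, NamedTuple


class ManifestDiff(NamedTuple):
    added: List[str]     # paths present only in the later image
    removed: List[str]   # paths present only in the earlier image
    changed: List[str]   # paths present in both, with different hashes


def sha256_hex(data: bytes) -> str:
    """Hash file contents; used when building a manifest."""
    return hashlib.sha256(data).hexdigest()


def diff_manifests(before: Dict[str, str], after: Dict[str, str]) -> ManifestDiff:
    """Compare two {path: hash} manifests taken from consecutive images."""
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    changed = sorted(
        path for path in before.keys() & after.keys()
        if before[path] != after[path]
    )
    return ManifestDiff(added, removed, changed)
```

Run over consecutive days, the `added` list is a natural starting point for spotting newly installed software, although a hash comparison alone says nothing about whether a new file is malicious.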
OBTAINING THE DATA
You can obtain our data at the following addresses:
The M57 Corpus:
- http://torrent.ibiblio.org/doc/187/torrents (BitTorrent form)
- http://domex.nps.edu/corp/scenarios/2009-m57/ (individual files)
Please download from the iBiblio BitTorrent server if possible. There are a number of torrents available for your convenience. If you examine the manifests, you will notice that the files overlap (some disk images appear in more than one torrent).
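Because the torrents overlap, it can be worth checking which files you would download twice before fetching several of them. The sketch below assumes each manifest has already been read into a list of file names. The function name `files_in_multiple_torrents` and the manifest layout are hypothetical, since the manifest format is not described here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def files_in_multiple_torrents(
    manifests: Dict[str, Iterable[str]]
) -> Dict[str, List[str]]:
    """Map each file name found in more than one torrent to those torrents."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for torrent_name, file_names in manifests.items():
        for file_name in file_names:
            seen[file_name].add(torrent_name)
    return {
        file_name: sorted(torrents)
        for file_name, torrents in sorted(seen.items())
        if len(torrents) > 1
    }
```

An empty result means the selected torrents share no files, so nothing would be downloaded twice.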
Each torrent will place files into:
Please seed if possible! The "police materials" torrent references only those materials that would be captured in a raid (i.e., the final day of the scenario).
The 2008-Nitroba corpus:
The M57 Jean corpus: