Lab 1.3: Metadata Treasure Hunt
Objectives
- To use ExifTool to analyze .xls, .doc, and .pdf files for information that will be useful in a penetration test
- To gather recon information about usernames, email addresses, file system paths, and other sensitive data associated with a target organization
Lab Setup
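The files you will examine in this lab are located in /home/sec560/coursefiles/metadata/. The files are WidgetStatisticalWhitepaper.doc, WidgetStatisticalAnalysis.xls, and WidgetStatisticalWhitepaper.pdf.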
Please use the copy in your Linux VM. The links are provided here so they can be accessed in Windows.
The goal of this lab is to run exiftool and strings on each of these files, trying to answer the specific questions posed below.
A copy of each of these files is also included on the course USB drive in the coursefiles\metadata directory. You can open them in Windows and look at them if you’d like, but the lab should be performed in Linux, which has exiftool and strings installed.
ExifTool can be invoked on the VMware Linux image to analyze a file by running:
$ exiftool filename
To run strings against a file, you could simply use:
$ strings filename
Try this for each of the files, and enter the data you discover that answers the questions on the next page.
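If you would rather not retype both commands for every file, a small loop can capture the output once so it can be searched repeatedly. The following is a minimal sketch and not part of the original lab; the function name survey_files and the output directory are made up for illustration.

```shell
# survey_files: hypothetical helper. The first argument is an output
# directory; the remaining arguments are the files to examine. For each file
# it saves the exiftool report and the strings output as text files.
survey_files() {
  out_dir=$1
  shift
  mkdir -p "$out_dir"
  for f in "$@"; do
    name=$(basename "$f")
    exiftool "$f" > "$out_dir/$name.exif.txt"
    strings "$f" > "$out_dir/$name.strings.txt"
  done
}

# Usage in the lab VM (the output directory name is an assumption):
#   cd /home/sec560/coursefiles/metadata
#   survey_files ~/metadata-out *.doc *.xls *.pdf
```

The saved text files can then be searched with grep as often as needed without re-running either tool.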
Also, remember that you can peek ahead at the answers and the approach used to determine them.
Questions:
What is the full name of user Bob? What is Bob’s nickname?
What is Bob’s email address?
What Personally Identifiable Information is located in the spreadsheet (.xls) file?
What information is associated with the organization’s firewall ruleset? Hint: The command below shows lines of output with the word "firewall" in a case-insensitive fashion.
$ strings filename | grep -i firewall
- If you have some extra time, also look through the files to find all file system paths and URLs.
Hint 1: You should consider looking for forward slashes by piping your output through grep to search for a / character using the command below.
$ strings filename | grep /
Hint 2: To find lines with a backslash in them, you could pipe your data through grep '\\'. Inside single quotes, the shell passes both backslashes to grep unchanged, and grep’s regular-expression syntax treats the escaped pair as a single literal \.
$ strings filename | grep '\\'
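The backslash quoting can be checked without touching the lab files. The sketch below feeds two made-up sample lines to grep: in the first form grep receives two backslashes and reads them as one literal backslash, and in the second form the -F option switches off regular-expression handling so a single backslash is enough. The function name sample_lines and the sample path are illustrative only.

```shell
# Two made-up sample lines: one with a Windows-style path, one without.
sample_lines() {
  printf '%s\n' 'C:\Users\example\notes.txt' 'no backslash on this line'
}

# Regex form: grep receives \\ and matches one literal backslash.
sample_lines | grep '\\'

# Fixed-string form: -F disables regex handling, so one backslash suffices.
sample_lines | grep -F '\'
```

Both commands should print only the line containing the path.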
Also, remember that you can peek ahead to the answers.
Walkthrough - Step-by-Step Instructions and Answers
Bob’s Full Name, Nickname, and Email Address
Bob created the .doc and .xls files in Microsoft Word and Microsoft Excel, respectively, so we can analyze the metadata of either file to determine Bob’s full name and nickname. Microsoft Office inserts usernames and author information in specific fields of the files it generates, so we can look for this structured metadata with ExifTool. You can run ExifTool against either the .doc or the .xls file.
sec560@slingshot:~/coursefiles/metadata$ exiftool WidgetStatisticalWhitepaper.doc
ExifTool Version Number : 10.10
File Name : WidgetStatisticalWhitepaper.doc
Directory : .
File Size : 35 kB
File Modification Date/Time : 2018:08:29 18:06:06+00:00
File Access Date/Time : 2019:06:29 17:34:33+00:00
File Inode Change Date/Time : 2019:06:29 17:28:35+00:00
File Permissions : rwxr-xr-x
File Type : DOC
File Type Extension : doc
MIME Type : application/msword
Title : Statistical Analysis Whitepaper
Subject :
Author : Bob the Awesome
Keywords :
Template : Normal
Last Modified By : Bob Boberson
Revision Number : 23
Software : Microsoft Word 9.0
Total Edit Time : 22.0 minutes
Last Printed : 2009:12:30 16:22:00
Create Date : 2009:12:30 15:30:00
Modify Date : 2009:12:30 16:23:00
Pages : 1
Words : 219
Characters : 1253
Security : None
Company : 560 Global Conglomerate
Lines : 10
Paragraphs : 2
Char Count With Spaces : 1538
App Version : 9.8968
Scale Crop : No
Links Up To Date : No
Shared Doc : No
Hyperlinks Changed : No
Title Of Parts : Statistical Analysis Whitepaper
Heading Pairs : Title, 1
Code Page : Windows Latin 1 (Western European)
Hyperlinks : \\webserver\wwwroot\images\560gc_logo.jpg, ..\My Pictures\chart.png
E-Mail : bob.boberson@560gc.tgt
Comp Obj User Type Len : 24
Comp Obj User Type : Microsoft Word Document
The interesting data from the above command is:
Author : Bob the Awesome
Last Modified By : Bob Boberson
Hyperlinks : \\webserver\wwwroot\images\560gc_logo.jpg, ..\My Pictures\chart.png
The Author and Last Modified By fields give two versions of the user’s name, and the Hyperlinks field exposes file system paths.
Next, examine WidgetStatisticalAnalysis.xls using ExifTool.
sec560@slingshot:~/coursefiles/metadata$ exiftool WidgetStatisticalAnalysis.xls
ExifTool Version Number : 10.10
File Name : WidgetStatisticalAnalysis.xls
Directory : .
File Size : 32 kB
File Modification Date/Time : 2018:08:29 18:06:06+00:00
File Access Date/Time : 2019:06:29 17:34:33+00:00
File Inode Change Date/Time : 2019:06:29 17:28:35+00:00
File Permissions : rwxr-xr-x
File Type : XLS
File Type Extension : xls
MIME Type : application/vnd.ms-excel
Title : Intense Statistical Analysis of Color Preferences in 560 Global Conglomerate Customers
Author : Bob the Awesome
Last Modified By : Bob Boberson
Software : Microsoft Excel
Create Date : 2009:12:30 14:37:51
Modify Date : 2009:12:30 15:55:14
Security : None
Company : 560 Global Conglomerate
App Version : 9.8968
Scale Crop : No
Links Up To Date : No
Shared Doc : No
Hyperlinks Changed : No
Title Of Parts : Trends
Heading Pairs : Worksheets, 1
Code Page : Windows Latin 1 (Western European)
E-Mail : bob.boberson@560gc.tgt
Comp Obj User Type Len : 26
Comp Obj User Type : Microsoft Excel Worksheet
The interesting data from the above command is:
Author : Bob the Awesome
Last Modified By : Bob Boberson
E-Mail : bob.boberson@560gc.tgt
Bob’s full name is Bob Boberson (from the Last Modified By field).
Bob’s nickname appears to be Bob the Awesome as indicated in the Author field.
Bob’s email address appears to be bob.boberson@560gc.tgt as indicated in the E-Mail field.
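Because ExifTool prints one "Label : Value" pair per line, the identity fields can be pulled out of the full report with an ordinary filter. The following is a minimal sketch rather than part of the original lab; the function name identity_fields is made up, and the field labels are the ones shown in the output above.

```shell
# identity_fields: hypothetical helper that keeps only the name and email
# lines from ExifTool's default "Label : Value" report read on stdin.
# [[:space:]]* allows for the padding ExifTool puts before the colon.
identity_fields() {
  grep -E '^(Author|Last Modified By|E-Mail)[[:space:]]*:'
}

# Usage in the lab VM (requires exiftool):
#   exiftool WidgetStatisticalAnalysis.xls | identity_fields
```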
Personally Identifiable Information (PII)
To find PII in the .xls file, we can look for strings of consecutive characters. However, many files are littered with meaningless small strings, so we’ll focus our search on longer strings, such as eight characters or more in length. When we do this using the strings command with the -n 8 option, we find some interesting strings in the .xls file, as shown below (output truncated for brevity).
sec560@slingshot:~/coursefiles/metadata$ strings -n 8 WidgetStatisticalAnalysis.xls
Daniel Pendelino
ThisWorkbook
"$"#,##0_);\("$"#,##0\)
"$"#,##0_);[Red]\("$"#,##0\)
"$"#,##0.00_);\("$"#,##0.00\)
"$"#,##0.00_);[Red]\("$"#,##0.00\)
_("$"* #,##0_);_("$"* \(#,##0\);_("$"* "-"_);_(@_)
_(* #,##0_);_(* \(#,##0\);_(* "-"_);_(@_)
_("$"* #,##0.00_);_("$"* \(#,##0.00\);_("$"* "-"??_);_(@_)
_(* #,##0.00_);_(* \(#,##0.00\);_(* "-"??_);_(@_)
$.' ",#
(7),01444
'9=82<.342
!22222222222222222222222222222222222222222222222222
%&'()*456789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz
&'()*56789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz
.xV{y!rv
Customer Color Preferences
Number of Customers
Customer #
Color Preference
Mrs. Boberson
111-11-1111
Sally Southers
222-22-2222
In the output, you’ll see strings with the full names of various people (Mrs. Boberson, Sally Southers, and more) along with data that appears to be Social Security numbers or some government-related identification numbers. This is likely PII that has leaked out of the target organization.
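Rather than reading the whole listing by eye, the identification numbers can be isolated with a pattern match. This is a sketch under the assumption that each number sits on its own line in the ddd-dd-dddd shape seen above; the function name ssn_like is hypothetical, and a match only indicates the format, not that the number is genuine.

```shell
# ssn_like: hypothetical filter that keeps lines consisting only of a
# ddd-dd-dddd pattern, the shape of a US Social Security number.
ssn_like() {
  grep -E '^[0-9]{3}-[0-9]{2}-[0-9]{4}$'
}

# Usage in the lab VM:
#   strings -n 8 WidgetStatisticalAnalysis.xls | ssn_like
```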
Firewall Information
Next, we’ll look for information about the firewall of the target organization by running strings and grepping its output for the string "firewall" in a case-insensitive fashion.
sec560@slingshot:~/coursefiles/metadata$ strings WidgetStatisticalAnalysis.xls | grep -i firewall
There’s no output, which implies that there are no such ASCII strings in this document.
sec560@slingshot:~/coursefiles/metadata$ strings WidgetStatisticalWhitepaper.pdf | grep -i firewall
Again, we see no output.
sec560@slingshot:~/coursefiles/metadata$ strings WidgetStatisticalWhitepaper.doc | grep -i firewall
Note to self. Sandra asked to open port 8000 on the Windows Web server Firewall for something called IceCast. Do this before lunch. Widget Color Analysis White Pbelow
Here we see output that mentions opening up port 8000 on the Windows Web Server Firewall for Icecast, which is a streaming audio service. Bob apparently made this comment to remind himself to take this action before lunch.
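The three separate searches above can also be collapsed into one command. The sketch below assumes GNU strings, whose -f option prefixes each string with the name of the file it came from; the helper name keyword_hits is hypothetical.

```shell
# keyword_hits: hypothetical helper. The first argument is the keyword and the
# rest are files. strings -f prefixes every string with "filename: ", so each
# match shows which document it came from.
keyword_hits() {
  keyword=$1
  shift
  strings -f "$@" | grep -i -- "$keyword"
}

# Usage in the lab VM:
#   keyword_hits firewall WidgetStatisticalWhitepaper.doc \
#       WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf
```

One caveat of this approach: because the file name is part of each line, a file whose name contains the keyword would match on every line.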
Path and URL Information
If you have extra time, you can look for additional information—specifically, URLs and file system paths—in the files. These might be useful to a penetration tester who is looking to target specific valuable information assets in a target organization.
File system paths may be structured or unstructured metadata, so we’ll look for them using both ExifTool and strings.
We’ll start with ExifTool. To make our analysis more efficient, we’ll rely on a feature of ExifTool that lets us specify multiple files on the command line, one after another, and the tool will retrieve metadata from all files we specify.
First, let’s run ExifTool to look through each of our three files, grepping our output to find slashes (/):
sec560@slingshot:~/coursefiles/metadata$ exiftool WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep /
Next, let’s look for backslashes. (The pattern '\\' reaches grep as two backslashes, which grep’s regular-expression syntax reads as a single literal backslash.)
sec560@slingshot:~/coursefiles/metadata$ exiftool WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep '\\'
Here we see file system paths of \\webserver\wwwroot\images\560gc_logo.jpg and ..\My Pictures\chart.png.
sec560@slingshot:~/coursefiles/metadata$ exiftool WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep /
File Modification Date/Time : 2018:08:29 18:06:06+00:00
File Access Date/Time : 2019:06:29 18:25:30+00:00
File Inode Change Date/Time : 2019:06:29 18:17:04+00:00
MIME Type : application/msword
File Modification Date/Time : 2019:06:29 18:03:35+00:00
File Access Date/Time : 2019:06:29 18:17:10+00:00
File Inode Change Date/Time : 2019:06:29 18:17:04+00:00
MIME Type : application/vnd.ms-excel
File Modification Date/Time : 2018:08:29 18:06:06+00:00
File Access Date/Time : 2019:06:29 18:27:29+00:00
File Inode Change Date/Time : 2019:06:29 18:17:04+00:00
MIME Type : application/pdf
Producer : \376\377\000B\000u\000l\000l\000z\000i\000p\000 \000P\000D\000F\000 \000P\000r\000i\000n\000t\000e\000r\000 \000/\000 \000w\000w\000w\000.\000b\000u\000l\000l\000z\000i\000p\000.\000c\000o\000m\000 \000/\000 \000F\000r\000e\000e\000w\000a\000r\000e\000 \000E\000d\000i\000t\000i\000o\000n
Format : application/pdf
sec560@slingshot:~/coursefiles/metadata$ exiftool WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep '\\'
Hyperlinks : \\webserver\wwwroot\images\560gc_logo.jpg, ..\My Pictures\chart.png
Producer : \376\377\000B\000u\000l\000l\000z\000i\000p\000 \000P\000D\000F\000 \000P\000r\000i\000n\000t\000e\000r\000 \000/\000 \000w\000w\000w\000.\000b\000u\000l\000l\000z\000i\000p\000.\000c\000o\000m\000 \000/\000 \000F\000r\000e\000e\000w\000a\000r\000e\000 \000E\000d\000i\000t\000i\000o\000n
Title : \376\377\000W\000i\000d\000g\000e\000t\000 \000S\000t\000a\000t\000i\000s\000t\000i\000c\000a\000l\000 \000W\000h\000i\000t\000e\000p\000a\000p\000e\000r
Creator : \376\377\000B\000o\000b\000 \000B\000o\000b\000e\000r\000s\000o\000n
The grep / search revealed nothing of interest; only timestamp, MIME type, and PDF producer lines happen to contain a slash. The grep '\\' search, however, did reveal file paths in the Hyperlinks field:
Hyperlinks : \\webserver\wwwroot\images\560gc_logo.jpg, ..\My Pictures\chart.png
Next, we’ll look for ASCII strings in our files using the strings command, also taking advantage of the fact that strings supports multiple files on the command line. We’ll start by searching for strings of eight characters or more (-n 8), looking through the output for the / character:
sec560@slingshot:~/coursefiles/metadata$ strings -n 8 WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep /
Here, we see a lot of strings in the output, which includes several URLs: http://www.w3.org/1999/02/22-rdf-syntax-ns#, http://purl.org/dc/elements/1.1/, and http://ns.adobe.com/xap/1.0/mm/. These URLs are likely just part of the PDF file and point to items outside of our target scope.
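When the slash search returns many lines, it can help to reduce them to a short list of distinct URLs. The following sketch assumes GNU grep (for the -o option) and uses a deliberately simple pattern that stops at whitespace, quotes, or angle brackets; the function name extract_urls is hypothetical.

```shell
# extract_urls: hypothetical helper that prints each distinct http(s) URL
# found on stdin, stopping at whitespace, quotes, or angle brackets.
extract_urls() {
  grep -oE "https?://[^[:space:]'\"<>]+" | sort -u
}

# Usage in the lab VM:
#   strings -n 8 WidgetStatisticalWhitepaper.pdf | extract_urls
```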
sec560@slingshot:~/coursefiles/metadata$ strings -n 8 WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep '\\'
"$"#,##0_);\("$"#,##0\)
"$"#,##0_);[Red]\("$"#,##0\)
"$"#,##0.00_);\("$"#,##0.00\)
"$"#,##0.00_);[Red]\("$"#,##0.00\)
_("$"* #,##0_);_("$"* \(#,##0\);_("$"* "-"_);_(@_)
_(* #,##0_);_(* \(#,##0\);_(* "-"_);_(@_)
_("$"* #,##0.00_);_("$"* \(#,##0.00\);_("$"* "-"??_);_(@_)
_(* #,##0.00_);_(* \(#,##0.00\);_(* "-"??_);_(@_)
#C:\WIND
OWS\syst
em32\STD
Files\Mi
.G8Z!sU=/\
<rdf:Description rdf:about='27fc05ce-f7bb-11de-0000-4235e672b786' xmlns:pdf='htt
p://ns.adobe.com/pdf/1.3/' pdf:Producer='\376\377\000B\000u\000l\000l\000z\000i\
000p\000 \000P\000D\000F\000 \000P\000r\000i\000n\000t\000e\000r\000 \000/\000 \
000w\000w\000w\000.\000b\000u\000l\000l\000z\000i\000p\000.\000c\000o\000m\000 \
000/\000 \000F\000r\000e\000e\000w\000a\000r\000e\000 \000E\000d\000i\000t\000i\
000o\000n' />
<rdf:Description rdf:about='27fc05cd-f7bb-11de-0000-4235e672b786' xmlns:dc='http
://purl.org/dc/elements/1.1/' dc:format='application/pdf'><dc:title><rdf:Alt><rd
f:li xml:lang='x-default'>\376\377\000W\000i\000d\000g\000e\000t\000 \000S\000t\
000a\000t\000i\000s\000t\000i\000c\000a\000l\000 \000W\000h\000i\000t\000e\000p\
000a\000p\000e\000r</rdf:li></rdf:Alt></dc:title><dc:creator><rdf:Seq><rdf:li>\3
76\377\000B\000o\000b\000 \000B\000o\000b\000e\000r\000s\000o\000n</rdf:li></rdf
:Seq></dc:creator></rdf:Description>
#
So our analysis looking for standard ASCII strings didn’t prove too useful. Let’s look for big-endian and little-endian Unicode strings to see if we get any more useful information that way.
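To see why the -e option matters, it helps to build a small test file whose text is stored as 16-bit little-endian characters, which is how many Windows applications store text. The sketch below is an illustration rather than part of the lab: the sample path and the temporary file are made up, and it assumes iconv and GNU strings are available. In the default scan every character is followed by a zero byte, so strings never sees a run of eight printable bytes; with -e l it decodes the 16-bit characters and the path appears.

```shell
# Build a throwaway file containing one made-up path encoded as UTF-16LE.
demo_file=$(mktemp)
printf '%s\n' 'C:\Users\example\My Documents\report.doc' |
  iconv -f UTF-8 -t UTF-16LE > "$demo_file"

# Default (single-byte) scan: count lines containing a backslash.
# "|| true" keeps going when grep finds nothing and exits non-zero.
ascii_hits=$(strings -n 8 "$demo_file" | grep -c '\\' || true)

# 16-bit little-endian scan of the same file.
utf16_hits=$(strings -n 8 -e l "$demo_file" | grep -c '\\' || true)

echo "default scan: $ascii_hits line(s), -e l scan: $utf16_hits line(s)"
rm -f "$demo_file"
```

On a system with GNU strings, this should report no hits for the default scan and one hit for the -e l scan.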
We’ll start by looking for big-endian strings eight characters or more in length that include a slash (/):
sec560@slingshot:~/coursefiles/metadata$ strings -n 8 -e b WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep /
Our output is empty. Let’s look for little-endian Unicode strings with forward slashes:
sec560@slingshot:~/coursefiles/metadata$ strings -n 8 -e l WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep /
Note: The character after -e is a lowercase L, not a one.
Again, nothing. Let’s look for big-endian Unicode strings with backslashes:
sec560@slingshot:~/coursefiles/metadata$ strings -n 8 -e b WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep '\\'
This gives us some useful information. Here, we found a potentially interesting piece of information—a file system path to the original file on Bob’s machine:
C:\Users\Bob Boberson\My Documents\WidgetStatisticalWhitepaper.doc.
Finally, let’s look for strings with little-endian Unicode backslashes:
sec560@slingshot:~/coursefiles/metadata$ strings -n 8 -e l WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep '\\'
Note: Again, the above command uses a lowercase L, not a number one.
With this one, we’ve found numerous file system paths, including paths to a file on a web server, a Visual Basic for Applications DLL, the file system path to Office on the machine, and much more.
sec560@slingshot:~/coursefiles/metadata$ strings -n 8 -e b WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep /
sec560@slingshot:~/coursefiles/metadata$ strings -n 8 -e l WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep /
sec560@slingshot:~/coursefiles/metadata$ strings -n 8 -e b WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep '\\'
..\My Pictures\Chart.png
\\webserver\wwwroot\images\560gc_logo.jpg
..\My Pictures\chart.png
Bob BobersonBC:\Users\Bob Boberson\My Documents\WidgetStatisticalWhitepaper.doc
Bob BobersonBC:\Users\Bob Boberson\My Documents\WidgetStatisticalWhitepaper.doc
Bob BobersonBC:\Users\Bob Boberson\My Documents\WidgetStatisticalWhitepaper.doc
Bob BobersonBC:\Users\Bob Boberson\My Documents\WidgetStatisticalWhitepaper.doc
Bob BobersonBC:\Users\Bob Boberson\My Documents\WidgetStatisticalWhitepaper.doc
Bob BobersonBC:\Users\Bob Boberson\My Documents\WidgetStatisticalWhitepaper.doc
Bob BobersonBC:\Users\Bob Boberson\My Documents\WidgetStatisticalWhitepaper.doc
Bob BobersonBC:\Users\Bob Boberson\My Documents\WidgetStatisticalWhitepaper.doc
Bob BobersonBC:\Users\Bob Boberson\My Documents\WidgetStatisticalWhitepaper.doc
Bob BobersonBC:\Users\Bob Boberson\My Documents\WidgetStatisticalWhitepaper.doc
C:\Users\Bob Boberson\My Pictures\560gc_logo.jpg
sec560@slingshot:~/coursefiles/metadata$ strings -n 8 -e l WidgetStatisticalWhitepaper.doc WidgetStatisticalAnalysis.xls WidgetStatisticalWhitepaper.pdf | grep '\\'
\\webserver\wwwroot\images\560gc_logo.jpg
*\G{000204EF-0000-0000-C000-000000000046}#4.0#9#C:\PROGRA~1\COMMON~1\MICROS~1\VB
A\VBA6\VBE6.DLL#Visual Basic For Applications
*\G{00020813-0000-0000-C000-000000000046}#1.3#0#C:\Program Files\Microsoft Offic
e\Office\EXCEL9.OLB#Microsoft Excel 9.0 Object Library
*\G{00020430-0000-0000-C000-000000000046}#2.0#0#C:\WINDOWS\system32\STDOLE2.TLB#
OLE Automation
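Since the same path can show up many times in the output, as it does in the big-endian results above, it may be easier to read the hits after collapsing duplicates. The following is a small sketch using standard sort and uniq; the function name collapse_hits is hypothetical.

```shell
# collapse_hits: hypothetical helper that counts identical lines on stdin and
# lists the most frequent first, so a repeated path appears once with a tally.
collapse_hits() {
  sort | uniq -c | sort -rn
}

# Usage in the lab VM:
#   strings -n 8 -e b WidgetStatisticalWhitepaper.doc | grep '\\' | collapse_hits
```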
Conclusion
In this lab, we’ve seen how we can use ExifTool and the strings command to pull data from files that may be useful to us in our penetration test. We’ve seen the advantages of structured data and ExifTool in pinpointing useful information.
We’ve also seen the advantages of looking for unstructured data with the strings command to find something that ExifTool isn’t designed to show: obscured fields and comments.
We’ve also seen how to transcend the default limitation of ASCII strings on Linux with the -e option to look for Unicode strings, both big endian and little endian.