It is sometimes useful to look for malware samples containing a specific string. For example, you might look for samples sharing similar code to analyze a malware campaign with different targets. Another use case is discovering the original version of a modified file, as described in my article "Unmasking Malfunctioning Malicious Documents".
Strings of printable characters are extracted from each sample by malware analysis websites such as malwr.com and hybrid-analysis.com:
Malwr.com can search strings within samples using the “string:...” syntax on its search page.
For example, we can use that feature to find all MS Office documents containing VBA macros, because they all contain the string "VB_Nam". This is because the VBA language requires every module to start with a line Attribute VB_Name = "name" (see MS-VBAL 4.2). Since MS Office usually inserts it automatically as the very first line of code, the compression algorithm has nothing earlier in the stream to refer back to, so the string "VB_Nam" is stored uncompressed and appears in clear text.
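To make the idea concrete, here is a minimal sketch of how printable strings can be pulled out of a file and checked for the "VB_Nam" marker. It is only an approximation of what analysis websites do, not their actual implementation: the minimum length of 4 characters and the function names extract_strings and looks_like_vba are assumptions chosen for illustration.

```python
import re
from typing import List

# Runs of 4 or more printable ASCII bytes (the threshold is an assumption,
# similar to the default of the Unix "strings" tool).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_strings(data: bytes) -> List[str]:
    """Return the printable ASCII strings found in raw bytes, in file order."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


def looks_like_vba(data: bytes) -> bool:
    """True if any extracted string contains the marker "VB_Nam"."""
    return any("VB_Nam" in s for s in extract_strings(data))


if __name__ == "__main__":
    # Hypothetical bytes imitating the start of a compressed VBA module:
    # the literal text is interrupted by non-printable control bytes.
    sample = b'\x01\x00Attribut\x00e VB_Nam\x00e = "Module1"\r\n'
    print(extract_strings(sample))
    print(looks_like_vba(sample))
```

The same check can be run on a local file by passing the result of open(path, "rb").read() to looks_like_vba.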
Let's look for VBA macro malware using the search page on malwr.com:
Because VBA macros have been very popular since 2014, we get lots of malware samples:
However, for now that search feature fails with a server error in many cases. Furthermore, it seems to be limited to a single string.
For example, in my article "Unmasking Malfunctioning Malicious Documents", when I was searching for files containing the string “DownloadDB403”, malwr.com always failed with a server error.
The current version of hybrid-analysis.com does not provide any string search feature.
@PayloadSecurity gave me another tip, which proves to be very useful: simply use well-known search engines such as Google, by limiting the search to the hybrid-analysis.com and/or malwr.com websites.
All malware analysis reports are already indexed by search engines, including the list of strings extracted from the analyzed files.
So let's search for our string “DownloadDB403” on Google, using this syntax:
Important: to get all the relevant results, it is necessary to click on “repeat the search with the omitted results included”.
It is even possible to search both malwr.com and hybrid-analysis.com at the same time, using this syntax:
This method is therefore quite handy when looking for malware samples containing a specific string.
Of course, all the features of the search engines can be used. The most important one is the ability to search for several strings in the same file.
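Such multi-string queries are easy to generate with a few lines of code. The sketch below follows the pattern of the RTF query shown later in this article (search terms followed by site: filters joined with OR). The helper names build_query and build_search_url are hypothetical, and the Google URL format is assumed to be the usual /search?q=... form.

```python
from urllib.parse import urlencode

# The two websites discussed in this article.
DEFAULT_SITES = ("hybrid-analysis.com", "malwr.com")


def build_query(strings, sites=DEFAULT_SITES):
    """Combine search strings with site: filters into one query."""
    # Quote any term containing spaces so it is searched as a phrase.
    terms = " ".join(f'"{s}"' if " " in s else s for s in strings)
    site_filter = " OR ".join(f"site:{site}" for site in sites)
    if len(sites) > 1:
        site_filter = f"({site_filter})"
    return f"{terms} {site_filter}".strip()


def build_search_url(strings, sites=DEFAULT_SITES):
    """Return a Google search URL for the query (assumed URL format)."""
    return "https://www.google.com/search?" + urlencode(
        {"q": build_query(strings, sites)}
    )


if __name__ == "__main__":
    print(build_query(["VB_Nam", "VBOX"]))
    print(build_search_url(["VB_Nam", "VBOX"]))
```

Passing a single site, for example sites=["malwr.com"], restricts the search to that website only.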
For example, we may look for malicious macros using the string "VB_Nam", and more specifically the ones which use anti-analysis tricks such as VirtualBox detection. Some of the macros used in a campaign in March 2015 contained the keyword "VBOX":
The results show that all malware samples using the same trick were uploaded to malwr.com in March 2015. It is therefore likely that this specific trick was not reused afterwards, or that malware writers took care to obfuscate the strings.
Another example: let's look for RTF documents containing other files such as executable files. This technique is sometimes used by malware writers to deliver a malicious payload without being too easily detected.
In that case, embedded files are usually stored in OLE objects inside RTF files, and more specifically OLE objects of type "Package".
Looking at strings extracted from existing samples, we can always find the string "Package" followed by a null character, encoded in hexadecimal, i.e. "5061636b61676500". This appears in the OLE object header:
To be more specific, we will also look for the string "rtf1", which must be present in every RTF file, and the keyword "objdata", indicating an OLE object embedded into RTF.
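The hexadecimal marker can be derived rather than typed by hand, and the same three indicators can be checked on a local file. The following is a rough triage sketch, not an RTF parser: the names hex_marker and has_embedded_package are hypothetical, and the check assumes the hex data is not obfuscated beyond letter case and whitespace.

```python
def hex_marker(name: str) -> str:
    """Hex-encode an ASCII name followed by a null terminator."""
    return (name.encode("ascii") + b"\x00").hex()


# "Package" plus a null byte, as it appears in the hex-encoded OLE header.
PACKAGE_MARKER = hex_marker("Package")


def has_embedded_package(rtf_text: str) -> bool:
    """Heuristic: True if the text contains all three indicators."""
    # Hex data in RTF may be upper case or broken up by whitespace,
    # so remove whitespace and lower-case everything before searching.
    compact = "".join(rtf_text.split()).lower()
    return all(k in compact for k in ("rtf1", "objdata", PACKAGE_MARKER))


if __name__ == "__main__":
    print(PACKAGE_MARKER)
    # Hypothetical, heavily shortened RTF fragment for illustration only.
    fragment = r"{\rtf1{\object\objemb{\*\objdata 0105 5061636B 61676500}}}"
    print(has_embedded_package(fragment))
```

A real sample may hide these keywords with RTF control-word tricks, so a negative result from this sketch does not prove that no object is embedded.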
The search syntax is then: rtf1 objdata 5061636b61676500 (site:hybrid-analysis.com OR site:malwr.com)
As expected, the search returns several dozen malware samples using this technique:
Thanks to websites such as malwr.com and hybrid-analysis.com, which give access to many recent malware samples including all the strings extracted from them, it is possible to leverage search engines to look for interesting strings.
In this article I presented a few examples, but there are surely many other use cases for this technique. If you find other useful ideas for malware analysis, please contact me or publish them on Twitter.
It is even possible to create a Google custom search engine to make it easier. You just need to type the strings, and it will search malwr.com and hybrid-analysis.com directly.