Last Updated: 2022-07-21 19:09:29 UTC
by Didier Stevens (Version: 1)
I found a malicious Office document with VBA code where most of the identifiers (variables, function names, ...) consist solely of characters that are not ASCII (i.e., these characters have values between 128 and 255).
When you take a look at the VBA code with oledump.py, you get this:
This is not invalid VBA code: it is code where many of the variable names and function names are made up solely of ANSI characters with the high bit set, i.e., non-ASCII characters.
It's hard to read, but my oledump plugin plugin_vba_dco can help here. It's a plugin that scans VBA source code for statements like declare, createobject, getobject, ..., and greps for lines of code with the identifiers of these statements.
It's still hard to read, but one can see the CreateObject statements now, and understand that this is a downloader (microsoft.xmlhttp, ...).
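The selection idea can be approximated with a few lines of Python. This is only a rough sketch and not the plugin's actual code: the function name select_lines, the keyword list and the regular expressions are my own assumptions, and it only handles simple assignments like "Set x = CreateObject(...)".

```python
import re

# Assumed keyword list, taken from the statements named in the text.
KEYWORDS = ("declare", "createobject", "getobject")

# Characters allowed in an identifier for this sketch: ASCII word
# characters plus the characters 128-255.
IDENT_CHARS = r"A-Za-z0-9_\x80-\xff"
ASSIGNMENT = re.compile(r"\s*(?:set\s+)?([" + IDENT_CHARS + r"]+)\s*=", re.IGNORECASE)


def select_lines(vba_source):
    """Return keyword lines plus the lines that use the identifiers they assign."""
    lines = vba_source.splitlines()

    # Pass 1: collect identifiers assigned on lines containing a keyword.
    identifiers = set()
    for line in lines:
        if any(keyword in line.lower() for keyword in KEYWORDS):
            match = ASSIGNMENT.match(line)
            if match:
                identifiers.add(match.group(1))

    # Pass 2: keep keyword lines and lines that mention a collected identifier
    # as a whole word.
    patterns = [
        re.compile(r"(?<![" + IDENT_CHARS + r"])" + re.escape(name) + r"(?![" + IDENT_CHARS + r"])")
        for name in identifiers
    ]
    selected = []
    for line in lines:
        has_keyword = any(keyword in line.lower() for keyword in KEYWORDS)
        if has_keyword or any(pattern.search(line) for pattern in patterns):
            selected.append(line)
    return selected


# Made-up sample, not taken from the malicious document.
sample = (
    "Sub Main()\n"
    "Set \xe9\xe8 = CreateObject(\"microsoft.xmlhttp\")\n"
    "Dim counter\n"
    "\xe9\xe8.Open \"GET\", url, False\n"
    "counter = 1\n"
    "End Sub"
)
selected_lines = select_lines(sample)
```

For the made-up sample, only the CreateObject line and the line calling .Open on the created object are kept.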
To make this more readable, I made an update to plugin_vba_dco. When you use the new plugin option "-g", all identifiers will be "generalized": they will be replaced by the strings Identifier0001, Identifier0002, Identifier0003, ...
Which makes a significant difference when trying to read this VBA code:
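The idea of generalizing identifiers can be illustrated with a short Python sketch. It is not the plugin's implementation: the function name generalize is hypothetical, and as a simplifying assumption it only renames runs of characters with values 128 to 255, without checking whether they appear inside a string literal or comment.

```python
import re

# A run of characters 128-255 that is not glued to an ASCII word character.
HIGH_BIT_NAME = re.compile(r"(?<![A-Za-z0-9_])[\x80-\xff]+(?![A-Za-z0-9_])")


def generalize(vba_source):
    """Replace each distinct high-bit identifier with IdentifierNNNN."""
    mapping = {}

    def replace(match):
        name = match.group(0)
        if name not in mapping:
            # Number identifiers in order of first appearance.
            mapping[name] = "Identifier%04d" % (len(mapping) + 1)
        return mapping[name]

    return HIGH_BIT_NAME.sub(replace, vba_source), mapping


# Made-up sample, not taken from the malicious document.
sample = "Set \xe9\xe8 = CreateObject(\"x\")\n\xe9\xe8.Open \xf1\xf2"
generalized, mapping = generalize(sample)
print(generalized)
```

Keeping the mapping around makes it possible to go back from a generalized name to the original identifier when needed.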
And now an obfuscated URL stands out. Identifier0015 is the string deobfuscation routine. It does not appear in this overview, but one can request all the generalized code (and not only statements associated with Declare, CreateObject, ...) by using plugin option -a:
Identifier0017 and Identifier0018 stand out: these look like a character translation table.
I copied them to a small, ad-hoc Python decoding script, together with the encoded URL, and used these strings to translate the characters of the URL. This is the script and the translation result:
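For readers who want to try this kind of decoding themselves, a character translation can be sketched as below. The two tables and the encoded string are made-up placeholders, not the values from this sample, and the direction of the translation (from the first table to the second) is an assumption that has to be checked against the deobfuscation routine.

```python
def translate(encoded, table_from, table_to):
    """Map each character found in table_from to the character at the
    same position in table_to; other characters are left unchanged."""
    if len(table_from) != len(table_to):
        raise ValueError("translation tables must have the same length")
    return encoded.translate(str.maketrans(table_from, table_to))


# Placeholder tables and encoded string (hypothetical, for illustration only).
table_from = "zyxwvutsrqponmlkjihgfedcba"
table_to = "abcdefghijklmnopqrstuvwxyz"
encoded_url = "sggk://vcznkov.rmezorw/"

decoded_url = translate(encoded_url, table_from, table_to)
print(decoded_url)  # http://example.invalid/
```

Characters that do not appear in the first table, like ":" and "/" here, pass through untouched.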