Last Updated: 2023-10-22 07:57:47 UTC
by Didier Stevens (Version: 1)
My tool base64dump.py takes any input and searches for encoded data. By default, it searches for base64 encoding, but I implemented several other encodings (like various hexadecimal formats).
For example, to search for classic hexadecimal, use option "-e hex" (--encoding).
If you don't know what the encoding is (or how it is defined in base64dump), you can just use "-e all" to try out all encodings. The output will be sorted by length of decoded data, so that the longest data appears at the end.
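To make the idea behind "-e all" concrete, here is a minimal Python sketch. It is not base64dump's actual code: the two encoding definitions, their names (b64, hex), the regular expressions and the find_all function are all simplified assumptions for illustration. It tries every definition on the input and sorts the hits by decoded length, so the longest one comes last.

```python
import base64
import binascii
import re

# Hypothetical, simplified encoding definitions (assumptions for this sketch,
# not the definitions used by base64dump).
ENCODINGS = {
    "b64": (re.compile(r"[A-Za-z0-9+/]+={0,2}"),
            lambda s: base64.b64decode(s, validate=True)),
    "hex": (re.compile(r"(?:[0-9A-Fa-f]{2})+"),
            binascii.unhexlify),
}

def find_all(data: str):
    """Try every encoding and return hits sorted by decoded length (longest last)."""
    results = []
    for name, (pattern, decoder) in ENCODINGS.items():
        for match in pattern.finditer(data):
            try:
                decoded = decoder(match.group(0))
            except (binascii.Error, ValueError):
                continue  # candidate did not decode cleanly, skip it
            results.append((len(decoded), name, match.group(0), decoded))
    results.sort(key=lambda item: item[0])
    return results

if __name__ == "__main__":
    for size, name, encoded, decoded in find_all("4142 SGVsbG8gV29ybGQh"):
        print(size, name, encoded, decoded)
```

Note that the same characters can match more than one definition (here "4142" is both valid hexadecimal and valid base64), which is why trying all encodings produces more hits than a single one.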
Here is an example with Jesse's sample from his diary entry "Hiding in Hex".
It produces a lot of output:
That's because very short sequences of letters and digits are identified as base64 or base85 encoding (but the longest decoded data, the ELF file, still appears at the end of the list).
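Why short sequences produce false positives can be shown with a few lines of Python. This is only a sketch using the standard base64 module; the helper name decodes_as_base64 and the sample sentence are made up for illustration. Any run of four letters or digits is syntactically valid base64 and decodes to three bytes.

```python
import base64
import binascii

def decodes_as_base64(token: str) -> bool:
    """Return True when token is syntactically valid base64."""
    try:
        base64.b64decode(token, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False

if __name__ == "__main__":
    # Made-up sentence: ordinary four-letter words, no encoded data intended.
    words = "This file will also load some text from disk".split()
    for word in words:
        if decodes_as_base64(word):
            print(word, "->", base64.b64decode(word, validate=True))
```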
If these false positives bother you, specify a minimum length (option -n) for the decoded data:
This output tells us:
- that a long (5008 characters) hexadecimal string (bx: \x..) was found
- that it is probably an ELF file (notice the start of the decoded data)
- that the MD5 is 42df9ae083003540e5a8d698c0e4329e
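The three observations above can be reproduced by hand. The following Python sketch is an illustration and not base64dump's implementation: the function names are hypothetical and the sample string is made up (just the ELF magic plus two filler bytes). It decodes a string of \x.. escapes, checks for the ELF magic bytes and computes the MD5 of the decoded data. Since each \x.. escape is 4 characters for 1 byte, a 5008-character string would decode to 1252 bytes, assuming the count covers only the escapes.

```python
import hashlib
import re

def decode_backslash_hex(text: str) -> bytes:
    """Decode a run of \\xNN escapes into raw bytes."""
    pairs = re.findall(r"\\x([0-9A-Fa-f]{2})", text)
    return bytes(int(pair, 16) for pair in pairs)

def summarize(text: str) -> dict:
    """Report lengths, an ELF magic check and the MD5 of the decoded data."""
    decoded = decode_backslash_hex(text)
    return {
        "encoded_length": len(text),
        "decoded_length": len(decoded),
        "looks_like_elf": decoded.startswith(b"\x7fELF"),
        "md5": hashlib.md5(decoded).hexdigest(),
    }

if __name__ == "__main__":
    # Made-up sample: the 4-byte ELF magic followed by two filler bytes.
    sample = r"\x7f\x45\x4c\x46\x02\x01"
    print(summarize(sample))
```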
With the hash, you can look up this file in various repositories, like VirusTotal.
If you don't like MD5, you can change the hash algorithm with the environment variable DSS_DEFAULT_HASH_ALGORITHMS (several of my tools look for this environment variable, not only base64dump).
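How a script can honor such a variable is sketched below in Python. This is an assumption about the general pattern, not the code of base64dump: the function name hash_with_default is hypothetical, and the sketch assumes the variable holds a single algorithm name that hashlib understands, with md5 as the fallback when it is not set.

```python
import hashlib
import os

def hash_with_default(data: bytes,
                      env_name: str = "DSS_DEFAULT_HASH_ALGORITHMS") -> tuple:
    """Hash data with the algorithm named in the environment, else md5."""
    # Assumption: the variable contains exactly one hashlib algorithm name.
    algorithm = os.environ.get(env_name, "md5").strip().lower()
    return algorithm, hashlib.new(algorithm, data).hexdigest()

if __name__ == "__main__":
    name, digest = hash_with_default(b"example data")
    print(name, digest)
```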
Here is the Windows cmd command for sha256:
If VirusTotal doesn't have the sample (or you need to do more analysis), you can extract the ELF file with option -s. This requires you to specify the encoding (-e bx), because you cannot select decoded data when using "-e all".
Here I use file-magic to identify the file type:
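The principle of identifying a file by its first bytes can be sketched in Python. This is not how file-magic works internally: the signature table is a tiny hand-picked assumption and the function names are hypothetical. For ELF data, the byte at offset 4 gives the class (1 for 32-bit, 2 for 64-bit) and the byte at offset 5 the byte order (1 for little-endian, 2 for big-endian).

```python
# Tiny, hand-picked signature table (an assumption for this sketch).
SIGNATURES = [
    (b"\x7fELF", "ELF"),
    (b"MZ", "DOS/Windows executable"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"%PDF-", "PDF document"),
]

def describe_elf(data: bytes) -> str:
    """Add class and byte order taken from the ELF identification bytes."""
    if len(data) < 6:
        return "ELF (truncated header)"
    bits = {1: "32-bit", 2: "64-bit"}.get(data[4], "unknown class")
    order = {1: "little-endian", 2: "big-endian"}.get(data[5], "unknown byte order")
    return f"ELF {bits} {order}"

def identify(data: bytes) -> str:
    """Return a short description based on the leading magic bytes."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return describe_elf(data) if description == "ELF" else description
    return "unknown"

if __name__ == "__main__":
    # Made-up header bytes, not the sample discussed in this diary entry.
    print(identify(b"\x7fELF\x02\x01\x01\x00"))
```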