LLMs have become very popular recently. I've been running them on my home PC for the past few months to help with basic tasks. I like the idea of using them for forensics and incident response, but I also want to avoid sending data to public LLMs, so running them locally or in a private cloud is a good option.
I use a 3080 GPU with 10GB of VRAM, which seems best suited to running 13-billion-parameter models (1). The three models I'm using for this test are Llama-2-13B-chat-GPTQ, vicuna-13b-v1.3.0-GPTQ, and Starcoderplus-Guanaco-GPT4-15B-V1.0-GPTQ. I downloaded these models from huggingface.co/ if you want to play along at home.
Llama2 is Facebook's latest general-purpose model. Vicuna is a fine-tuned Llama 1 model that is supposed to be more efficient and use less RAM. StarCoder is trained on 80+ programming languages and might do better on more technical explanations.
There are plenty of tutorials for getting these up and running, but I'm using oobabooga_windows to set all of this up quickly. If you are going to experiment with many of these, the best solution is running Docker with Nvidia pass-through support.
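A note on scripting: when launched with the --api flag, the text-generation-webui behind oobabooga exposes a simple HTTP endpoint, so test prompts like the ones below can be automated instead of typed into the UI. The URL, port, and payload fields in this sketch follow the 2023-era /api/v1/generate interface and are assumptions to verify against your installed version:

```python
import json
import urllib.request

# Default API endpoint for text-generation-webui started with --api
# (an assumption -- the port and path vary by version and config).
API_URL = "http://127.0.0.1:5000/api/v1/generate"

def build_request(prompt, max_new_tokens=300, temperature=0.7):
    """Build the JSON payload for a generation request."""
    return {
        "prompt": prompt,
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
    }

def ask_model(prompt):
    """POST a prompt to the local model and return the generated text."""
    payload = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        API_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    # The 2023-era API returned {"results": [{"text": "..."}]}.
    return result["results"][0]["text"]

# Usage (with the server running):
#   answer = ask_model("Is this log line a type of attack? ...")
```

With that in place, each test prompt becomes one `ask_model()` call per model, which makes side-by-side grading much faster.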
When thinking about how to use this, the first use that comes to mind is supplementing responders' knowledge. The second is speeding up technical tasks, and the third is speeding up report writing. These are the three use cases we are going to test.
The most significant limitation is the age of the data these models are trained on. Most training sets are around a year old, so the latest attacks will not be directly searchable. LLMs can often be outright wrong, too, and the smaller the model, the more facts it gets wrong.
1. Is this a type of attack?
126.96.36.199- - [14/Apr/2016:08:22:13 0100] "GET /wordpress/wp-content/plugins/custom_plugin/check_user.php?userid=1 AND (SELECT 6810 FROM(SELECT COUNT(*),CONCAT(0x7171787671,(SELECT (ELT(6810=6810,1))),0x71707a7871,FLOOR(RAND(0)*2))x FROM INFORMATION_SCHEMA.CHARACTER_SETS GROUP BY x)a) HTTP/1.1" 200 166 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.1; ru; rv:188.8.131.52) Gecko/20100401 Firefox/4.0 (.NET CLR 3.5.30729)"
2. All the Apache logs from this blog (2)
Is there anything unusual in this Apache log for a WordPress server?
3. Windows malware scenario
A. If doing computer forensics, where would you look for artifacts for a malicious web browser extension on a Windows system?
B. What is CVE-2021-44228? How do you detect attacks? How do you defend against it?
C. What should be the parent process of firefox.exe on Windows?
4. Write a report.
Write an incident response report about a computer infected with clipper malware. Include a summary of the malware's capabilities and the data included below.
Malware hash: dabc19aba47fb36756dde3263a69f730c01c2cd3ac149649ae0440d48d7ee4cf
Timeline of events:
2023-07-02 22:23 - PC initial infection
2023-07-02 22:45 - PC downloaded backdoor from ohno.zip
2023-07-03 04:02 - PC started scanning for AD users
2023-07-03 06:02 - PC started brute-forcing accounts
2023-07-04 09:00 - PC was isolated from the network
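As a baseline for test 1: even a crude signature check flags the payload in that query string before any LLM gets involved. A minimal sketch (the patterns are illustrative only, not a production detector; real rule sets such as the ModSecurity CRS are far larger):

```python
import re

# A few common error-based SQL injection markers (illustrative only).
SQLI_PATTERNS = [
    r"(?i)\bUNION\s+SELECT\b",
    r"(?i)\bSELECT\s+COUNT\(\*\)",
    r"(?i)INFORMATION_SCHEMA\.",
    r"(?i)FLOOR\(RAND\(0\)\*2\)",
]

def looks_like_sqli(request_line: str) -> bool:
    """Return True if the request line matches any injection marker."""
    return any(re.search(p, request_line) for p in SQLI_PATTERNS)

# The request from test 1, minus the surrounding log fields.
log_request = (
    "GET /wordpress/wp-content/plugins/custom_plugin/check_user.php"
    "?userid=1 AND (SELECT 6810 FROM(SELECT COUNT(*),CONCAT(0x7171787671,"
    "(SELECT (ELT(6810=6810,1))),0x71707a7871,FLOOR(RAND(0)*2))x "
    "FROM INFORMATION_SCHEMA.CHARACTER_SETS GROUP BY x)a) HTTP/1.1"
)
print(looks_like_sqli(log_request))  # True
```

The interesting question for the LLMs is not just flagging the line but explaining the technique, which simple pattern matching cannot do.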
1. SQL Injection:
LLAMA2- Said SQL injection and gave reasons. (B)
Vicuna- SQL injection and gave reasons (B)
Star- SQL injection and explanation (B)
LLAMA 2 Answer.
2. Apache logs:
LLAMA2 - Was wrong, but broke down the logs (C-)
Vicuna - Broke down the logs, but wrong (C-)
Star - Correctly identified what was accessed (B-)
3A. Malicious browser extension:
LLama2 - Gave registry keys, but not all were correct or useful (C)
Vicuna - Very bad (F)
3B. CVE-2021-44228:
LLAMA2 - Completely wrong on all levels; said it was an SSL vulnerability (F)
Vicuna - Completely wrong (F)
Star - Correct except detection (B)
3C. Parent process of firefox.exe:
LLama2 - Incorrect; said csrss.exe (F)
Vicuna - Completely wrong (F)
Star - Very wrong (F)
4. Report:
LLAMA2 - The report was OK, but it would need a lot of changes (C-)
Vicuna - The report was OK, but it would need a lot of changes (C-)
Star - The report made up many facts (D-)
LLama 2 response.
Overall, these small models did poorly on this test. They do a good job on everyday language tasks, like summarizing text from an article or helping with proofreading. A specific version of StarCoder is trained just for Python, and it also works well. As expected for small models, the more specifically they are trained, the better the results. I want to work on training one for incident response and forensics in the coming months.
Anyone else doing testing with local or private LLMs? Leave a comment.
ZIP archives store compressed files together with their metadata (file size, date/time, ...). When a contained file is password-protected, the compressed data is encrypted, but the metadata is not.
As an example, take this ZIP file that I created. It contains a single file (mimikatz.exe), and that file is protected with a password (infected):
Although the file is password-protected, it's the compressed file content that is encrypted (see screenshot: Encrypted +); the filename, the file size, the file date, ... all that metadata is not encrypted. It can be read without knowing the password.
I was involved in a forum discussion where the OP shared a password-protected ZIP archive of a file that the OP considered suspicious. For whatever reason, the OP wanted us to express our opinion about the file without being able to take a look at it (the OP would share the password with us later). I could make an educated guess about the file content using the crc32 checksum.
Let me explain.
My tool zipdump.py can be used to analyze ZIP files using Python modules zipfile and pyzipper. But it can also parse the binary structure of a ZIP file, and extract all the relevant metadata in its raw form. I do this with option -f l (find list):
First we see a PKZIP file record (named PK0304 by zipdump), then a PKZIP directory entry record (PK0102) and finally, a PKZIP end-of-directory record (PK0506).
All the metadata is in cleartext.
With the filename and the CRC32 checksum, I can make an educated guess about the file content. I download mimikatz.exe from github, and I calculate its crc32 checksum with hash.py:
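If you don't have hash.py at hand, Python's zlib computes the same crc32 value (the stand-in bytes below replace reading the downloaded file):

```python
import zlib

# Stand-in bytes; in practice: data = open("mimikatz.exe", "rb").read()
data = b"example file content"
checksum = zlib.crc32(data) & 0xFFFFFFFF  # mask keeps it unsigned on old Pythons
print(f"{checksum:08x}")  # hex string to compare with the ZIP metadata
```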
The crc32 checksum of the file inside the archive and that of the file I downloaded are the same. This is a weak indication that the files are identical.
crc32 is an error-detection checksum, not a cryptographic hash. It's only 32 bits long, and it is easy to craft a file that produces a desired crc32 checksum. It is certainly not strong evidence.
The OP was surprised that metadata was not encrypted, so I was pretty sure that the crc32 had not been tampered with.
My trick worked because I had a good idea of what file was inside the archive. Without that information, it would have been impossible, because there are countless files with that crc32 checksum.
I think this crc32 check is also what Gmail uses to detect malicious files inside password-protected ZIP files.
If you need to create archive files where metadata is also encrypted, you need to use other formats, like 7zip for example. Or double-ZIP your files.
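The double-ZIP trick works because the inner archive's central directory (with all the member metadata) becomes ordinary file content of the outer archive; if the outer archive is password-protected, that metadata is encrypted along with it. A sketch of the structure using stdlib zipfile (which cannot itself write encrypted archives, so the outer ZIP here is unencrypted; with 7-Zip or pyzipper you would add the password at that step):

```python
import io
import zipfile

# Inner archive: contains the sensitive filename and crc32 metadata.
inner = io.BytesIO()
with zipfile.ZipFile(inner, "w") as zf:
    zf.writestr("mimikatz.exe", b"not the real thing")

# Outer archive: the inner ZIP is just an opaque member. Only
# "inner.zip" and *its* crc32 appear in the outer metadata.
outer = io.BytesIO()
with zipfile.ZipFile(outer, "w") as zf:
    zf.writestr("inner.zip", inner.getvalue())

with zipfile.ZipFile(io.BytesIO(outer.getvalue())) as zf:
    print(zf.namelist())  # ['inner.zip'] -- the inner filename is hidden
```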
In my blog post "Quickpost: Analysis of PDF/ActiveMime Polyglot Maldocs" I explain how to search through MIME files with my tool emldump.py to find suspicious/malicious content:
I have now released a new version of emldump.py, that can output the content of all parts in JSON format.
This is done with option --jsonoutput:
This JSON output can then be consumed by different tools I develop. One of them is file-magic.py, a tool to identify files using the libmagic library.
Here file-magic.py identifies all parts of the MIME file:
And it becomes clear that the JPEG part is not actually an image, but an MSO/ActiveMime file that can contain VBA code.
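For this specific case, the same identification can be done without libmagic: base64-decode the MIME part and check the leading bytes. JPEG data starts with FF D8 FF, while ActiveMime/MSO data starts with the literal string "ActiveMime". A minimal sketch (the sample part below is fabricated for illustration):

```python
import base64

def identify_part(b64_text: str) -> str:
    """Identify a base64-encoded MIME part by its magic bytes."""
    data = base64.b64decode(b64_text)
    if data.startswith(b"\xff\xd8\xff"):
        return "JPEG image"
    if data.startswith(b"ActiveMime"):
        return "ActiveMime/MSO (may contain VBA)"
    return "unknown"

# Fabricated sample: an "image" part that is really ActiveMime data.
fake_part = base64.b64encode(b"ActiveMime" + b"\x00" * 16).decode()
print(identify_part(fake_part))  # ActiveMime/MSO (may contain VBA)
```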