Monitoring VMWare logs
Last Updated: 2012-05-02 08:54:06 UTC
by Bojan Zdrnja (Version: 1)
Virtualization is so popular today that there is almost no company that does not use a virtualization platform. VMWare is definitely the most popular one (at least the most popular one I seem to be running into).
It is also not uncommon to see VMWare farms growing exponentially as people tend to throw more hardware and just create new VMs. In such cases, controlling what your administrators do is a must yet I also see that organizations auditing their VMWare farms (and especially administrator’s activities) are pretty rare.
One of the problems is that reviewing VMWare logs can be complex so it is not easy to setup the whole log collection and analysis system correctly; this is something a lot of SIEM’s and similar log collection and analysis tools fail at. So let’s see what we have to work with here and how we can improve things.
For the sake of this diary, I’ll write mainly about the “typical” setups today that consist of ESXi (or ESX, for older setups) host servers and one or more vCenter management servers.
ESXi is VMWare’s host operating system that actually runs the virtual machines. It is highly optimized and has a footprint of only 150 MB. This is what is usually installed on those big servers that today run 20+ virtual machines.
Of course, when you have more than one ESXi server, you want to manage it centrally, not only to make management easier but to also allow some more sophisticated processes such as vMotion and similar. This management is done through a vCenter server.
vCenter basically just runs on a normal Windows operating system machine that itself can be a VM as well. Administrators normally use the VMWare vSphere client application to connect to vCenter and to manage virtual machines (of course, depending on their role and permission).
The same client (vSphere client) can be used by an administrator to connect directly to an ESXi server and to manage VMs that are hosted on that server. As you can probably guess, this creates problems for activity auditing since, in this case, any changes are performed directly on the ESXi host server so vCenter will not see those activities directly.
Finally, if you are trying to troubleshoot some problems, you can allow SSH access directly to the ESXi hosts – this access is disabled by default, but I found it quite often that organizations enable it and leave it enabled.
We can see that there are multiple system components that generate logs that we should be collecting. While vCenter keeps its own logs and allows reviews from the console, ESXi hosts will also independently keep their logs that should be audited. Actually, when an administrator modifies something in vCenter, a task will be created that will cause vCenter to connect to the target ESXi host and issue the change.
At the moment I’m usually recommending clients to collect logs from the following components:
- Since vCenter is the “brain” we should collect all logs we can from the server it is running on. VMWare creates text log files (see http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=1021806 for information about VMWare logs) that are, unfortunately, not easy to read and often lack information.
What I’ve found is that the VMWare SDK API allows much easier retrieval of logs that will be nice and structured but, if your SIEM does not support it directly, you will have to code a script to retrieve such logs yourself.
Of course, do not forget about the OS logs as well as the database logs – since this server is the most important one, make sure that you’ve protected it accordingly and that you collect all other log files that might be important.
- ESXi host logs are also very important since an administrator can connect directly to them (unless this has been prevented). With ESXi there aren’t many options and probably the best one is to configure a local Syslog to send logs to the central Syslog server, as shown in the picture below.
Keep in mind, though, that VMWare creates many multi-line logs which will eventually be broken due to size limits of Syslog so correlating them on the server side might be quite a bit difficult, if not impossible.
By using Syslog we will also take care of SSH logins, since these will be logged by the console and sent through Syslog to the central server.
Now that we have all the logs at one place, we can correlate them and setup alerts on suspicious activities.
Regular log reviews are very important. One of the things you should particularly take a look at is console access. For example, if the administrator that accessed a server’s console through vCenter forgot to logout, any other vCenter administrator can access that server’s console (if he has vCenter permissions to access it, of course).
Good log collection and correlation (remember to collect both vCenter logs as well as logs from all your guest servers) can tell you which server’s consoles were accessed as well as if the administrator had to log in or not.
So check your VMWare environments today and see if you can answer these questions: who, from where and when logged in to my vCenter console, which VMs were migrated and which consoles have been accessed by which administrator in last 30 days?
Let us know what your experiences with collecting and analyzing VMWare logs are and if you did something you’d like to share with our readers so everyone can benefit from your work.
May 3rd 2012
1 decade ago