SAP HANA VM occasionally enters a hung state

A few days ago, a client reported that a VM on one of their hosts repeatedly entered a hung state, with no error messages in the SUSE guest OS. The VM resides on a standalone host and is the only VM running on it.

Interestingly, the virtual machine logs showed both soft CPU resets and hard CPU resets initiated by the user. The two entries mean quite different things.

A Guest OS initiated reset generates this event in the virtual machine logs:
vcpu-0| CPU reset: soft

A user or API initiated request to reset a virtual machine generates this event in the virtual machine log files:
vcpu-0| CPU reset: hard
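To tell the two reset types apart across a large log, a small script can help. The following is only a minimal sketch: it assumes the log lines contain the `CPU reset: soft` / `CPU reset: hard` text exactly as quoted above, and the function name `count_resets` is made up for illustration.

```python
import re
from collections import Counter

# Matches the reset lines quoted above, e.g. "vcpu-0| CPU reset: soft".
# The ".*" tolerates extra fields (such as a log level) before the message.
RESET_RE = re.compile(r"vcpu-\d+\|.*CPU reset: (soft|hard)")

def count_resets(lines):
    """Count guest-initiated (soft) and user/API-initiated (hard) resets."""
    counts = Counter({"soft": 0, "hard": 0})
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)

# Sample lines taken from the two messages above, plus one unrelated line.
sample = [
    "vcpu-0| CPU reset: soft",
    "vcpu-0| CPU reset: hard",
    "vcpu-0| some other message",
]
print(count_resets(sample))
```

To run it against a real log, pass an open file object instead of the sample list; the location of the VM log file depends on your datastore layout.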

When I went through the logs a few hours ago, I found this error message:

“YYYY-MM-DDT<Time>Z| vcpu-1| I120: VERIFY bora/lib/misc/strutil.c:1079”
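If you want to check other VM logs for the same signature, a rough sketch like the one below can pull out the vCPU, source file and line number from such VERIFY messages. It assumes the line layout shown above (timestamp, vCPU, then the message); the name `find_verify_failures` is hypothetical.

```python
import re

# Captures the vCPU name, the source file and the line number from a
# message shaped like: "<timestamp>| vcpu-1| I120: VERIFY path/file.c:1079"
VERIFY_RE = re.compile(r"\|\s*(vcpu-\d+)\|.*VERIFY\s+(\S+):(\d+)")

def find_verify_failures(lines):
    """Return (vcpu, source_file, line_number) for every VERIFY message."""
    results = []
    for line in lines:
        match = VERIFY_RE.search(line)
        if match:
            results.append((match.group(1), match.group(2), int(match.group(3))))
    return results

# The timestamp is left as the same placeholder used in the post.
sample = [
    "YYYY-MM-DDT<Time>Z| vcpu-1| I120: VERIFY bora/lib/misc/strutil.c:1079",
    "YYYY-MM-DDT<Time>Z| vcpu-0| I120: some unrelated message",
]
for vcpu, source_file, line_number in find_verify_failures(sample):
    print(vcpu, source_file, line_number)
```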

An additional log message says that you need to collect a vm-support bundle and file it with support. I went through the KB and found that with SAP Host Agent 7.21 PL5, which makes use of the latest ESXi Extended Guest Statistics within a virtual machine on an ESXi 5.5 Update 3 or ESXi 6.0 host, a virtual machine may hang at irregular intervals.

I filed a support case with VMware, and they confirmed the cause by analyzing the zdump file.

VMware states:

The VM was rebooted after it entered the hung state because vmon is enabled, and the zdump was created for the vmx crash.

When we decrypted the zdump file, we found that the issue is a memory leak with the cause described below. You already mentioned on the call that SAP Host Agent version 7.21 is running on this guest VM.

Recent enhancements in the Extended Guest Statistics collection allow vendors to fetch metrics from within the guest operating system of a virtual machine. However, an issue has been found in the statistics collection in ESXi that leads to a memory leak and, as a consequence, the virtual machine fails due to memory exhaustion.

These collected statistics are, for example, heavily used in the context of SAP’s Host Agent (SAP Host Agent 7.21 PL5 and higher) and VMware vSphere (ESXi 5.5 Update 3 or ESXi 6.0).

Upgrade the ESXi host to a build that includes the fix to resolve the issue.

Reference KB:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2137310
