Monday, June 12, 2017

That damn event ID 129

I was getting this event ID 129 on several Windows Server 2012 R2 systems:

Log Name:      System
Source:        LSI_SAS
Date:          
Event ID:      129
Task Category: None
Level:         Warning
Keywords:      Classic
User:          N/A
Computer:      
Description:
Reset to device, \Device\RaidPort0, was issued.

and I found this

Changed LSI_SAS to PVSCSI and got:

Log Name:      System
Source:        pvscsi
Date:          
Event ID:      129
Task Category: None
Level:         Warning
Keywords:      Classic
User:          N/A
Computer:      
Description:
Reset to device, \Device\RaidPort0, was issued.

Contacted VMware; they blamed M$ and told me to upgrade the SAS driver.
Went to the broadcom.com site to look for a newer driver for the "LSI SAS 3801E", which is the real name of the "LSI Adapter, SAS 3000 series, 8-port with 1068", and failed to find any newer driver for Windows 2012 R2. Ended up downloading the driver for 2008 R2 and replacing the existing one, but still got the event ID 129.

Called M$ and they blamed VMware. Ended up on a 3-way call with VMware and M$; it was fun to hear them blaming each other, and finally they agreed(!) that it was the disk system's problem.

Called the SAN manufacturer and they said there was no error recorded.

Finally I found that event ID 129 was recorded only when vSphere Replication was active.
Called VMware, who were still blaming the SAN system, and two days later I got this reply:

"I want you to try one more troubleshooting step, which helps to identify and regulate if vSphere Replication is replicating huge data though there are no changes happening on the VM.

This issue is could also be caused by a GuestOS sent unmap command. 
To disable Unmap in the Guest OS:

Using a Windows CMD window on the Host, run the command:

fsutil behavior set DisableDeleteNotify 1

To re-enable the feature, use the following command:

fsutil behavior set DisableDeleteNotify 0

To verify the current setting, use the following command:

fsutil behavior query DisableDeleteNotify

DisableDeleteNotify=0 - indicates the 'Trim and Unmap' feature is on (enabled)
DisableDeleteNotify=1 - indicates the 'Trim and Unmap' feature is off (disabled)

Kindly try the above steps and update the status to us, Awaiting your response."
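
If you have several servers to check, the query command from the reply can be wrapped in a small script. This is only a rough sketch in Python, not something VMware or M$ provided. The function names are mine, and it assumes the output contains lines of the form "DisableDeleteNotify = 0", optionally prefixed with a file system name such as NTFS or ReFS (the exact format differs between Windows versions, so adjust the pattern if yours looks different).

```python
import re
import subprocess

# Assumed line format: "[<label> ]DisableDeleteNotify = <0|1>", optionally
# followed by extra text. The label (e.g. NTFS, ReFS) is optional.
_PATTERN = re.compile(r"^\s*(?:(\w+)\s+)?DisableDeleteNotify\s*=\s*(\d+)")


def parse_delete_notify(output):
    """Map each file system label to True if unmap/TRIM is enabled.

    DisableDeleteNotify=0 means the feature is on, 1 means it is off.
    Lines without a label are stored under the key "default".
    """
    settings = {}
    for line in output.splitlines():
        match = _PATTERN.match(line)
        if match:
            label = match.group(1) or "default"
            settings[label] = match.group(2) == "0"
    return settings


def query_delete_notify():
    """Run the fsutil query on the local Windows machine and parse it."""
    result = subprocess.run(
        ["fsutil", "behavior", "query", "DisableDeleteNotify"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_delete_notify(result.stdout)
```

Calling query_delete_notify() from an elevated prompt inside the guest returns a dictionary; a value of False means unmap is disabled, which is the state that made the event ID 129 stop for me.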

Tried the solution and the event ID 129 is gone. However, if you disable unmap, you are effectively disabling the reclaim function from vSphere; read more about this from: 


I asked if VMware could make a change so that the replication process ignores the unmap command while replication is running. Their answer was: "We will surely consider the inputs suggested by your side which helps us to enhance the product"

Hope this helps someone and saves them some time.