Cluster service won't start with rgmanager [fixed]

The cluster service won't start because it's in a failed state, caused by the service attempting to start but failing too many times in too short a period (based on the retry and recovery policy in your cluster.conf) or, more likely, because it's failing to shut down cleanly.
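
For reference, those retry and recovery knobs live on the <service> element in cluster.conf. Here is a minimal sketch, with illustrative names and values rather than anything from a real config (attribute support varies a bit by rgmanager version):

<rm>
  <resources>
    <!-- init script resource referenced by the service below -->
    <script name="ClusterService" file="/etc/init.d/ClusterService"/>
  </resources>
  <!-- restart up to 3 times within 300 seconds, then give up and mark failed -->
  <service name="host1service" autostart="1" recovery="restart"
           max_restarts="3" restart_expire_time="300">
    <script ref="ClusterService"/>
  </service>
</rm>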

Manually enabling the service fails, right?

# clusvcadm -e host1service

Checking syslog, you will see the service refusing to start and the reason why. In this case it's complaining that it failed to stop cleanly, but the real roadblock is that it's in a failed state.

# tail /var/log/messages
rgmanager: Service:host1service has failed; can not start.
rgmanager: Service:host1service failed to stop cleanly

Assuming cluster.conf is valid because you ran ccs_config_validate and it told you so, the problem is probably pretty simple. If a service is in a failed state, you can’t enable it unless you disable it first!
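
If you haven't run it yet, validation only takes a second; on success it should print something like this:

# ccs_config_validate
Configuration validates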

Perhaps this behavior is meant to stop you from starting a failed service because it has already tried to start a number of times and couldn't, so what's the point of another try? In my case, what's the harm? It has already failed again and again. Why not just retry when I step in and manually ask for the service to start?!

Oh well, it’s just one extra step.

Disable the service, then enable it. If you really have fixed the problem that caused the failure in the first place, the service should start right up.

# clusvcadm -d host1service
# clusvcadm -e host1service

Checking syslog once again, you should see the service disable, re-enable, and fire off the startup script.

Stopping service service:host1service
rgmanager: [script] Executing /etc/init.d/ClusterService stop
rgmanager: [script] Executing /etc/init.d/ClusterService start
rgmanager: Service service:host1service started

Check the cluster status to make sure it started.

# clustat
Cluster Status for hacluster @ Fri Nov 9 02:15:00 2012
Member Status: Quorate
 
Member Name                       ID   Status
------ ----                       ---- ------
host1                                1 Online, Local, rgmanager
host2                                2 Online, rgmanager
host3                                3 Online, rgmanager

Service Name                      Owner                    State
------- ----                      -----                    -----
service:host1service              host1                    started

If the service is relocatable, now is a good time to test relocation to make sure it works properly and doesn’t fail on the other available nodes.

# clusvcadm -r dnsmasq -m host2
Trying to relocate service:dnsmasq to host2...Success
service:dnsmasq is now running on host2

If relocation fails on any particular node, fix the problem or take that node out of the failoverdomain in cluster.conf.
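
For reference, a restricted failover domain sketch using the node names from the clustat output above; the domain name and priorities are made up:

<rm>
  <failoverdomains>
    <!-- restricted="1" means the service may only run on the nodes listed here -->
    <failoverdomain name="host1domain" ordered="1" restricted="1">
      <failoverdomainnode name="host1" priority="1"/>
      <failoverdomainnode name="host2" priority="2"/>
      <!-- host3 left out because relocation to it keeps failing -->
    </failoverdomain>
  </failoverdomains>
</rm>

The service then references it with domain="host1domain" on its <service> element.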

KVM live migration with sanlock using virsh

Sanlock helps you avoid screwing things up by starting the same virtual machine on multiple hosts, which will quickly mangle the guest file systems. But you may soon find that KVM live migration no longer works from the virt-manager GUI.
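
For context, qemu locking with sanlock is enabled through libvirt's lock manager plugin. The setup looks roughly like this; the lease directory and host_id are examples rather than anything canonical, and host_id must be unique per host with disk_lease_dir on storage shared by all hosts:

# /etc/libvirt/qemu.conf
lock_manager = "sanlock"

# /etc/libvirt/qemu-sanlock.conf
auto_disk_leases = 1
disk_lease_dir = "/var/lib/libvirt/sanlock"
host_id = 1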

I was in shock when I first saw this migration error from virt-manager!

[virt-manager error dialog]

"Internal error cannot use migrate v2 protocol with lock manager sanlock"? But alas, all is not lost.

Migration from the virsh command shell works just fine for me. And it's quicker and more efficient than doing it pointy-clicky style from a GUI anyway. If I'm migrating a guest from one host to another, it's usually because I need to shut down the host hardware, so I'm not migrating just one guest; I'm migrating them all, one at a time.

With the shell method, I repeat the migration command in a loop and move them all, one after another, automatically. Monday migrate, Tuesday migrate, everybody. It'll get in your bones!

root@host1~# for x in 1 2 3 4 5 6 7 8; do
virsh migrate --live guest$x qemu+ssh://host2/system
done
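
If your guests aren't conveniently named guest1 through guest8, the same loop can ask libvirt for the list of running domains instead. This assumes a virsh new enough to support list --name and key-based ssh to the destination host:

root@host1~# for guest in $(virsh list --name); do
virsh migrate --live $guest qemu+ssh://host2/system
done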

When I first set up sanlock for qemu locking, I saw errors during migration of several guests that were already up and running with local locking before sanlock locking was implemented.

error: Failed to inquire lock: No such process

There is no impact to the guest or to users connected to the guest. The end result is that the migration doesn't complete and the guest is left running on the original host. Other guests migrated without error.

A scheduled reboot of the affected guests allowed migration to proceed. No reboot or disruption to the host itself was required at all.
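
If you're curious which guests actually hold sanlock leases before scheduling those reboots, the sanlock daemon on the host can list them. Guests started before the lock manager was enabled won't be registered, which is most likely why the inquire step fails for them until they restart. The exact output format varies by sanlock version:

root@host1~# sanlock client status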

Mount SMB on Windows 7 Home Premium

Most guides on mounting SMB/CIFS/Samba shares served from Linux onto Windows 7 will point you to settings under Administrative Tools -> Local Security Policy.

Windows 7 Home Premium does not include the Local Security Policy MMC snap-in, so you cannot use that tool to change the NTLMv2 security settings.

Instead of messing with a snap-in at all, just open the registry editor and set the LmCompatibilityLevel explicitly.

It’s described here, buried in the Lsa control set on technet.microsoft.com.

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa

Add a new DWORD (32-bit) Value, “LmCompatibilityLevel”

Set the value to 1 for the widest compatibility, or a higher value if you want to restrict authentication to NTLMv2 and refuse the older LM and NTLM responses.
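
If you'd rather script it than click through regedit, the same DWORD can be written from an elevated command prompt; this assumes the Control\Lsa path above, and a log off or reboot afterwards doesn't hurt:

reg add "HKLM\SYSTEM\CurrentControlSet\Control\Lsa" /v LmCompatibilityLevel /t REG_DWORD /d 1 /f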

See the technet page for the chart with descriptions of all the levels:

http://technet.microsoft.com/en-us/library/cc960646.aspx
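
Once the value is in place, mapping the Samba share from a command prompt should just work; the server, share, and user names here are placeholders:

net use Z: \\fileserver\share /user:fileserver\myuser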