The Fujitsu-Lotus Sametime Unified Telephony PrimeCluster software allows the Lotus Sametime Unified Telephony application software on each node to know the state of the other node. This information is important because, in a cluster, each resource is usually controlled by one of the cluster nodes. The other nodes keep a backup of the resources so that they can take over control if the controlling node fails. One task of the cluster software is to prevent the so-called “split brain” situation, in which two nodes of a cluster each believe that they control the same resource.
A new Stand Alone Service (SAS) feature is offered in the legacy system software release. With SAS, neither node is shut down: the node that takes over goes into a "Stand Alone Primary" state, while the node that would otherwise shut down and reboot goes into a "Stand Alone Secondary" state. This allows phones local to each node to continue making calls to each other, or to the PSTN through the local gateway available on that network.
To keep track of the state of the other node, the PrimeCluster software maintains a heartbeat between the two nodes through the cluster interconnect (two interfaces used for redundancy and load balancing of the information that is shared between the two nodes of the cluster). A heartbeat is sent once every 200 milliseconds. When 50 consecutive heartbeats are missed, that is, after about 10 seconds without contact, the PrimeCluster software activates its Split Brain defense mechanism. In this situation, the PrimeCluster software must ensure that only one of the nodes controls the common resources. It does so by stopping one of the nodes unconditionally. In PrimeCluster terminology, the Split Brain defense mechanism is called the Shutdown Facility (SF), and it starts any number of Shutdown Agents consecutively until one of the nodes is shut down. In Lotus Sametime Unified Telephony.0, PrimeCluster provides two shutdown agents in the following order:
After the shutdown agents have run, the Lotus Sametime Unified Telephony cluster should have one node shut down and one node active.
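The detection and defense sequence described above can be sketched as follows. This is a minimal illustration, not the PrimeCluster implementation: the 200 ms interval and the limit of 50 missed heartbeats come from the text, while the names `HeartbeatMonitor`, `detection_time_ms`, and `run_shutdown_facility`, and the idea of modeling each shutdown agent as a function that returns `True` on success, are assumptions made for this example.

```python
from typing import Callable, Iterable

HEARTBEAT_INTERVAL_MS = 200  # from the text: one heartbeat every 200 ms
MISSED_LIMIT = 50            # from the text: 50 consecutive misses trigger the defense


def detection_time_ms(interval_ms: int = HEARTBEAT_INTERVAL_MS,
                      missed_limit: int = MISSED_LIMIT) -> int:
    """Time without heartbeats before the split-brain defense starts."""
    return interval_ms * missed_limit


class HeartbeatMonitor:
    """Counts consecutive missed heartbeats (hypothetical helper)."""

    def __init__(self, missed_limit: int = MISSED_LIMIT) -> None:
        self.missed_limit = missed_limit
        self.missed = 0

    def tick(self, heartbeat_received: bool) -> bool:
        """Process one heartbeat interval.

        Returns True when the limit of consecutive misses is reached.
        A received heartbeat resets the counter.
        """
        if heartbeat_received:
            self.missed = 0
        else:
            self.missed += 1
        return self.missed >= self.missed_limit


def run_shutdown_facility(agents: Iterable[Callable[[str], bool]],
                          target_node: str) -> bool:
    """Run shutdown agents in order until one reports success.

    Each agent is a hypothetical callable that tries to stop
    target_node and returns True if the node is now shut down.
    """
    for agent in agents:
        if agent(target_node):
            return True
    return False


if __name__ == "__main__":
    print(detection_time_ms(), "ms without heartbeats before the defense starts")
```

Because a single received heartbeat resets the counter, only an uninterrupted run of misses triggers the defense, and the agents are tried strictly in their configured order, so a later agent runs only if every earlier one failed.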
In server configurations where the nodes share the same subnet (for the Management, Signaling and Billing interfaces), the active node activates all virtual IP addresses that were running on the node that was shut down. When a virtual IP address is activated on the Lotus Sametime Unified Telephony node that takes over, the node sends out a so-called gratuitous Address Resolution Protocol (ARP) message to inform the LAN switches and routers of the network about the new MAC address for the virtual IP address. The routers and the LAN switches then reconfigure to adapt to the new situation. A network scheme in which server nodes share the same subnet for the Management, Signaling and Billing interfaces is common for co-located voice server clusters. For server configurations with network separation (each node has different subnets for the Management, Signaling and Billing interfaces), endpoints have to switch over to the IP address of the partner node (the active node). This type of networking scheme is common for geo-separated clusters.
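To make the gratuitous ARP step more concrete, the sketch below builds the bytes of such a frame for an Ethernet/IPv4 network. It illustrates the general frame layout only and is not taken from the product: the function name `build_gratuitous_arp` and the sample MAC and IP addresses are assumptions, and the frame is built as an ARP request (gratuitous ARP can also be sent as a reply). The defining property is that the sender and target protocol addresses both carry the virtual IP address.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, virtual_ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request.

    mac        -- MAC address of the interface that now owns the virtual IP
    virtual_ip -- the virtual IPv4 address being announced
    """
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(virtual_ip)

    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP)
    eth_header = b"\xff" * 6 + mac_bytes + struct.pack("!H", 0x0806)

    # ARP fixed part: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # hardware length 6, protocol length 4, operation 1 (request)
    arp_fixed = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)

    # Sender MAC/IP, then target MAC (zeroed) and target IP.
    # Sender IP == target IP is what makes the ARP "gratuitous".
    arp_addresses = mac_bytes + ip_bytes + b"\x00" * 6 + ip_bytes

    return eth_header + arp_fixed + arp_addresses


if __name__ == "__main__":
    # Sample values only; both addresses are hypothetical.
    frame = build_gratuitous_arp("02:00:00:aa:bb:cc", "192.0.2.10")
    print(len(frame), "bytes:", frame.hex())
```

Actually transmitting the frame requires a raw link-layer socket and elevated privileges, which is platform specific and is left out of this sketch.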
DOWN Shutdown Agent