Clustering the Resource Reservation database in IBM Lotus Domino
Susan Bulloch
IBM Software Group
SWAT Engineer, Lotus Brand
Charlotte, NC USA
August 2009
Summary: IBM® Lotus® Domino® releases 7 and 8 support the clustering of the Resource Reservation database (RRDB). This article explains the clustering process, the operation of the RRDB in a cluster, and the most current best practices for clustering your RRDB.
1 Requirements for clustering
The minimum requirements for clustering the RRDB system are:
· For a fully functioning clustered RRDB, the servers must be running Lotus Domino 7 or later.
· The design of all clustered copies of the database must be Domino 7 or later.
· The Rooms and Resource Manager task (RnRMgr) must be running on the servers containing the databases.
· One server must be identified as the primary (Administration) server in the Access Control List (ACL).
NOTE: Only two servers are supported for failover processing. There can be other cluster replicas, but they will not process requests and will only serve as backup copies.
On clustered servers, the system databases typically have Server Connection documents that replicate them on a schedule throughout the day as a backup to cluster replication. Normally Names.nsf, Admin4.nsf, and other system databases are on this schedule. Add your RRDB to this existing Server Connection document.
2 How failover works in the RRDB system
A short time after the primary server stops processing reservation requests, the clustermate realizes that the primary server is down, and the failover server then starts processing the requests.
The clustermate will continue to process all requests until it goes down itself, and then failover occurs again, with processing failing back to the primary server.
The length of time between when the primary server goes down and the secondary server begins processing can be up to 30 minutes. This is not a configurable parameter.
Note that the RRDB on the secondary server does not need to fail back to the primary server in order for the system to work correctly.
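The handoff described above can be sketched as a small state function. This is only an illustrative model, not Domino code: the function name next_owner and the server names Srv1 and Srv2 are hypothetical, and the detection delay of up to 30 minutes is not modelled.

```python
def next_owner(current_owner, primary, failover, down):
    """Return the server that processes requests, or None if neither can.

    current_owner: server processing requests right now
    primary, failover: the two servers that take part in failover
    down: set of server names that are currently down
    """
    # Processing stays where it is while the current owner is up;
    # there is no automatic fail back to the primary server.
    if current_owner not in down:
        return current_owner
    # The current owner is down, so the other participant takes over.
    other = failover if current_owner == primary else primary
    return None if other in down else other


# Hypothetical server names, for illustration only.
owner = next_owner("Srv1", "Srv1", "Srv2", down={"Srv1"})  # Srv1 goes down
owner = next_owner(owner, "Srv1", "Srv2", down=set())      # Srv1 comes back
print(owner)
```

In this sketch the second call still returns Srv2: once the clustermate has taken over, it keeps processing until it goes down itself.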
This failover behavior differs from failover with other Lotus Domino tasks. The RnRMgr task does not care which clustermate is doing the work, and it is designed to work from either database. An Administration Server must be defined in the ACL of the RRDB; if it is not, failover and fail back may not work properly.
The Administration Server should also be the server listed in the Resource Mail-In database document in the Domino Directory. If the Administration Server for the RRDB is changed, these documents should be updated as well.
The failover server will be the alphabetically first server in the cluster, unless the alphabetically first server is the primary (home) server for the RRDB. If the home server is the first server alphabetically, then the second server will be the failover server. This also is not configurable.
Example scenarios
Suppose your servers NotesA, NotesB, and NotesC are in a cluster. NotesB is the home server for the RRDB. If NotesB goes down, then server NotesA will assume the processing for RnRMgr until it goes down, at which time NotesB will begin the processing duties. NotesC will never process RRDB requests, even if both NotesA and NotesB are down.
In the same scenario as above, if NotesA is the home server for the RRDB, then NotesB will be the backup server for processing. Again, NotesC will never process requests.
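The selection rule behind both scenarios can be written as a short function. This is a minimal sketch for reasoning about a cluster, not something Domino exposes: the name pick_failover_server is hypothetical, and the case-insensitive comparison of server names is an assumption, since the text says only "alphabetically".

```python
def pick_failover_server(cluster_servers, home_server):
    """Return the server expected to act as failover for the RRDB.

    Rule from the text: the alphabetically first server in the cluster,
    unless that server is the home server, in which case the second one.
    Assumption: names are compared case-insensitively.
    """
    names = sorted(cluster_servers, key=str.casefold)
    if len(names) < 2:
        raise ValueError("failover needs at least two servers in the cluster")
    if home_server.casefold() not in {n.casefold() for n in names}:
        raise ValueError("home server is not a member of the cluster")
    first, second = names[0], names[1]
    # Skip the home server if it happens to sort first.
    return second if first.casefold() == home_server.casefold() else first


cluster = ["NotesA", "NotesB", "NotesC"]
print(pick_failover_server(cluster, "NotesB"))  # first scenario
print(pick_failover_server(cluster, "NotesA"))  # second scenario
```

With the cluster from the scenarios, the two calls print NotesA and NotesB respectively; NotesC is never returned unless another server sorts after it.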
You can determine which server is currently processing requests by issuing the server command
tell RnRMgr whoowns xxxxx.nsf
where xxxxx.nsf is the name of your Resource Reservation database.
Duplicate reminder notices
In its normal operation process, RnRMgr scans for any “missed” documents at 2:00 AM, picking up and processing any documents that may have been missed during the day.
If there are any unprocessed requests for room approvals in the system at this time, the room owners will receive a duplicate reminder notice. This also occurs whenever RnRMgr is restarted and whenever cluster failover occurs.
Note that an enhancement request has been submitted to Lotus Quality Engineering to better control the behavior of these duplicate reminder notices.
3 Troubleshooting clustered RRDBs
When the primary server for the RRDB goes down, there will be a period during which no reservation requests are processed. This is the expected operation: the RnRMgr task on the clustermate polls to confirm that the primary server is truly down before it takes over.
During this time, the RnRMgr task will queue the requests to make sure they are processed in the proper order and that none get “lost”, in case the primary server comes back up quickly, or if the RnRMgr task has simply been stopped.
If the failover server does not start processing the requests after 30 minutes, perform the following steps:
1. Check the Cluster Directory to make sure the RRDB has not been prohibited from cluster replication.
2. Issue the Server Console command
tell RnRMgr whoowns xxxxx.nsf
substituting the pathname and name of your RRDB for xxxxx. This returns the name of the server that is currently responsible for processing the requests; check the status of the RnRMgr task on that server.
3. If necessary, restart the RnRMgr task on the failover server, using the command
restart task RnRMgr
Resources
· Participate in the Notes and Domino 8.5 discussion forum.
· Contribute to the IBM Lotus Domino wiki.
· Refer to the developerWorks IBM Lotus Notes and Domino product page.