This article describes how to troubleshoot some common problems encountered during IBM® Lotus® Quickr® 8.5 for WebSphere® Portal clustering, offering solutions and guiding you through the general process of diagnosing problems.
First, it's important to review the product documentation topic, “Installing ND on the deployment manager machine on Windows: qp85,” to understand the Lotus Quickr cluster installation process, especially the steps to prepare the Deployment Manager (DM).
To locate the problem, you first need to understand what is done during the clustering process. Basically the following three ConfigEngine targets are used to perform the clustering:
- cluster-node-config-pre-federation. Uses the addNode command to add the node into the cell of the DM.
- cluster-node-config-post-federation. Revises the security settings, since the node has taken the security settings from the DM, and recycles the DM.
- cluster-node-config-cluster-setup. On the primary node, this step creates a cluster based on the server of the primary node. On the secondary node, it adds the server to the existing cluster, after which you need to do the necessary configurations.
If you are using the installer to deploy clustering, and it fails during the clustering step (see figure 1), you can search for the above three targets in the wpinstalllog.txt file (under /PortalServer/logs) and in ConfigTrace.log (under /ConfigEngine/log). These are the main logs we examine in this article.
Note that most failures occur in the “cluster-node-config-pre-federation” target.
Figure 1. Cluster config error
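Searching the logs for the three targets can be scripted. The following is a minimal Python sketch, not part of the product; the function names are hypothetical, and the matching is case-insensitive on the assumption that the target names appear verbatim in the logs.

```python
# Hypothetical helper: report where each clustering target is mentioned in a log.
CLUSTER_TARGETS = (
    "cluster-node-config-pre-federation",
    "cluster-node-config-post-federation",
    "cluster-node-config-cluster-setup",
)


def find_target_lines(lines, targets=CLUSTER_TARGETS):
    """Return {target: [1-based line numbers]} for lines that mention each target."""
    hits = {target: [] for target in targets}
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        for target in targets:
            if target in lowered:
                hits[target].append(number)
    return hits


def scan_log(path):
    """Scan a log file such as wpinstalllog.txt or ConfigTrace.log."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_target_lines(handle)
```

Calling scan_log on your copy of wpinstalllog.txt gives the line numbers to jump to; the errors discussed below typically appear shortly after the first mention of the failing target.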
Common issues and their resolution
Cannot execute clustering scripts
Search for “cluster-node-config-pre-federation” in wpinstalllog.txt. During the execution of this target, you will see the error, “The configuration data type CellCompRegistryCollection is not valid” (see figure 2).
Figure 2. Error under “cluster-node-config-pre-federation” in wpinstalllog.txt
The error is caused by an incorrect setting in the DM. As stated above, before installing a cluster you need to prepare the DM, paying special attention to copying the files under the filesForDmgr directory on the Setup CD into the corresponding directories on the DM (see figure 3). (Note that there is one hidden directory in filesForDmgr.) Failure to copy these files will cause the above error.
Figure 3. Hidden file in filesForDmgr
NOTE: Merely copying the files is not enough; you must also restart the DM server, a step that is commonly overlooked. On Linux machines, you must also make sure that the plug-in file has the correct execute permission.
Cannot connect to DM
Search for “cluster-node-config-pre-federation” in wpinstalllog.txt; if you cannot see any errors after it, then turn to ConfigTrace.log. Search for “action-cluster-node-federation”, and you'll see the error, “The system cannot create a SOAP connector to host” (see figure 4).
Figure 4. Cannot connect to DM
The error could be due to the following reasons:
- Wrong host name for the DM
- Wrong Simple Object Access Protocol (SOAP) port number for the DM
- The DM and node cannot see each other
On a secondary node, you also must make sure DM, primary node, and secondary node can see each other. Otherwise, it will lead to a Null Pointer Exception during node synching.
Perform these steps to resolve the issue:
- Open the wkplc.properties file under /ConfigEngine/properties directory.
- Verify that the property WasRemoteHostName is correct; it should be the hostname of the DM.
- Verify that the property WasSoapPort is correct. The default port number is 8879. If your DM profile is not the first one created on the server, check the serverIndex.xml file under /config/cells//nodes// and check the value of SOAP_CONNECTOR_ADDRESS.
- To make sure the DM and the node can “see” each other, put their IP addresses into each other's hosts file.
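The host name and port checks above can be partly automated from the node. This is a rough Python sketch under stated assumptions: wkplc.properties is treated as simple key=value lines, the helper names are hypothetical, and a successful TCP connection only shows that the port is reachable, not that the SOAP connector itself is healthy.

```python
import socket


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def dm_endpoint(props, default_port=8879):
    """Return (host, port) of the DM as configured in wkplc.properties."""
    host = props.get("WasRemoteHostName", "")
    # Fall back to the default SOAP port when WasSoapPort is missing or empty.
    port = int(props.get("WasSoapPort") or default_port)
    return host, port


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Read wkplc.properties into a string, pass it to parse_properties, and feed the result of dm_endpoint to can_connect. A False result points to one of the three causes listed above: wrong host name, wrong port, or no network path between the machines.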
Cannot federate the node
Search “action-cluster-node-federation” in ConfigTrace.log, and you should see “An error occurred during federation; rolling back to original configuration...” (see figure 5).
Figure 5. Cannot federate the node
The cause of this error may be hard to recognize. First, open addnode.log under /logs/, search for “exception”, and then look at the last “Caused by”.
You may find that the failure occurred because the JVM memory of the DM server is not large enough; that is, you see “OutOfMemoryError” (see figure 6).
Figure 6. OutOfMemoryError
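Finding the last “Caused by” entry in addnode.log can also be scripted. The sketch below is only an illustration with a hypothetical function name; it assumes the log contains ordinary Java stack traces in which nested causes are introduced by the text “Caused by”.

```python
def last_caused_by(lines):
    """Return the last line containing 'Caused by' (stripped), or None if absent."""
    last = None
    for line in lines:
        if "Caused by" in line:
            last = line.strip()
    return last


def root_cause_from_file(path):
    """Apply last_caused_by to a log file such as addnode.log."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return last_caused_by(handle)
```

If the returned line mentions OutOfMemoryError, the heap size of the DM is the likely culprit.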
To increase the JVM memory, do the following:
- Open the Admin Console of the DM.
- Navigate to System administration > Deployment manager > Java and Process Management > Process Definition > Java Virtual Machine.
- Increase the value in Maximum Heap Size. Usually we choose 1408M, but this value will depend on your environment.
Sometimes, to find the root cause of the federation failure, you may even need to look into SystemOut.log under /logs/. Use the time at which the failure occurred as a clue to locate the exception logged at that time in SystemOut.log.
After fixing the above problems, you can try to cluster the node by using the manual steps. But before doing that, you need to change the value of the RegistrySynchronized property in the wkplc.properties file to “true”; otherwise, ConfigEngine scripts cannot be executed.
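Editing the property by hand is straightforward, but if you repeat the procedure often, a small script helps. The following Python sketch uses a hypothetical set_property helper and assumes the property sits on its own uncommented key=value line; back up wkplc.properties before rewriting it.

```python
import re


def set_property(text, key, value):
    """Return text with key set to value, replacing the first match or appending."""
    # Match an uncommented "key = ..." line; keep the original "key =" prefix.
    pattern = re.compile(r"^([ \t]*" + re.escape(key) + r"[ \t]*=).*$", re.MULTILINE)
    if pattern.search(text):
        return pattern.sub(lambda m: m.group(1) + value, text, count=1)
    # Property not present: append it, adding a newline first if needed.
    separator = "" if text.endswith("\n") or not text else "\n"
    return text + separator + f"{key}={value}\n"


def update_file(path, key="RegistrySynchronized", value="true"):
    """Rewrite the properties file at path with the given property value."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(set_property(text, key, value))
```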
It is always a good idea to execute the three clustering targets manually, to test whether you have fixed the problem. Use these three commands, in this order (be sure not to wait for more than 3 or 4 hours to verify it):
- ConfigEngine.bat cluster-node-config-pre-federation -DWasPassword=
- ConfigEngine.bat cluster-node-config-post-federation -DWasPassword=
- ConfigEngine.bat cluster-node-config-cluster-setup -DWasPassword=
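If you prefer to run the three targets unattended, a wrapper along these lines stops at the first failure. It is a sketch under assumptions: the function names are hypothetical, script should be the full path to ConfigEngine.bat on your system, and a nonzero exit code is assumed to signal a failed target.

```python
import subprocess

CLUSTER_TARGETS = [
    "cluster-node-config-pre-federation",
    "cluster-node-config-post-federation",
    "cluster-node-config-cluster-setup",
]


def build_commands(was_password, script="ConfigEngine.bat"):
    """Build one argument list per clustering target, in execution order."""
    return [
        [script, target, f"-DWasPassword={was_password}"]
        for target in CLUSTER_TARGETS
    ]


def run_targets(was_password, script="ConfigEngine.bat", cwd=None):
    """Run the targets in order; return the name of the first failing target or None."""
    for command in build_commands(was_password, script):
        result = subprocess.run(command, cwd=cwd)
        if result.returncode != 0:
            # Stop here so later targets do not run against a half-configured node.
            return command[1]
    return None
```

When run_targets returns a target name, search for that name in wpinstalllog.txt and ConfigTrace.log as described earlier.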
You should now have a good idea of how to locate, diagnose, and resolve some of the common problems that can occur during the clustering of Lotus Quickr 8.5 for WebSphere Portal, and of the different types of errors you are likely to see.
About the author
Yao Jing is a Software Engineer working with the Lotus Quickr Install team at IBM's Shanghai, China facility. She has two years of experience in install technology and extensive knowledge of WebSphere Application Server deployment and configuration. You can reach Jing at email@example.com.