Abstract: This article will help you troubleshoot "server not responding" error messages and help you gather the right information to send in to IBM Technical Support if this does not solve your issue.
There may be times when you encounter messages in your log or on your Server Console that say “Server not responding”, “Remote server not responding”, or “Remote server no longer responding” when one server in your domain is trying to communicate with another, for example while replicating or routing mail. Users may also see these errors in their Notes client when trying to access a database on your server. The purpose of this document is to provide all of the information about these errors in one place. It also describes the data that IBM Technical Support will need from you in order to troubleshoot this error message.
"Server not responding" troubleshooting tools
When you receive this error message, first check some of the basic Domino areas that could be problematic. The Server Document is the best place to start. In the Server Document, navigate to the Ports... tab -> Notes Network Ports. Ensure that the Net Address field for the source or destination server contains an IP address or Fully Qualified Host Name (FQHN) that matches the Fully Qualified Internet Host Name field on the Basics tab.
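The comparison between the two Server Document fields can be sketched in Python. This is only an illustration and not a Domino API: the function names `resolve_ipv4` and `net_address_matches` are hypothetical, and the two field values are assumed to have been copied by hand from the Server Document.

```python
import socket
from typing import Callable, Set


def resolve_ipv4(host: str) -> Set[str]:
    """Return the IPv4 addresses the local resolver reports for host."""
    try:
        infos = socket.getaddrinfo(host, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return set()
    return {info[4][0] for info in infos}


def net_address_matches(net_address: str, fqhn: str,
                        resolve: Callable[[str], Set[str]] = resolve_ipv4) -> bool:
    """Check that the Net Address value agrees with the host name value."""
    net_address = net_address.strip().rstrip(".").lower()
    fqhn = fqhn.strip().rstrip(".").lower()
    if net_address == fqhn:
        return True
    # Otherwise the values agree only if they resolve to a common address
    # (an IP address literal resolves to itself).
    return bool(resolve(net_address) & resolve(fqhn))
```

Calling `net_address_matches("192.0.2.24", "mail.example.com")` on the affected machine would show whether that machine's resolver maps the host name to the address entered in the port settings; both values here are placeholders.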
Next, there are several command prompt utilities that can be useful in determining where the problem is. The commands can be issued from machines inside or outside of the firewall:
|Provides a list of NICs, IPaddresses, DNS used, etc.|
Note: If using Windows Server 2000, the command to enter is:
|Provides the local address, foreign address and associated task|
|Resolves an IP address to a Host Name|
|Ping -f -l 1470 IPAddress||More information available in How to ping by packet size to establish MTU (Technote #1086718)|
|NSLookup set type=||The type of record you are looking up. MX is the most common of records. If one does not exist, there must be an A record available. PTR is a pointer record and is only used for reverse DNS lookups.|
|Traceroute from DOS|
In addition to the command prompt tools, there is also a GUI-based tool called NotesConnect (NPing). NPing is available through NotesConnect utility for use when troubleshooting IP connectivity issues (Technote #4004434) and needs to be placed in the Domino program directory.
A final tool available directly from the Domino console is the simple Trace command:
This is an example of a successful trace route to a Domino Server:
02/09/2010 11:10:46 AM Network: Determining path to server VEC
02/09/2010 11:10:46 AM Network: Available Ports: TCPIP
02/09/2010 11:10:46 AM Network: Checking normal priority connection documents only...
02/09/2010 11:10:46 AM Network: Allowing wild card connection documents...
02/09/2010 11:10:46 AM Network: Enabling name service requests and probes...
02/09/2010 11:10:46 AM Network: Address found in local TCPIP names table for VEC
02/09/2010 11:10:46 AM Network: Connecting to VEC over TCPIP
02/09/2010 11:10:46 AM Network: Using address 'vec' for VEC on TCPIP
02/09/2010 11:10:46 AM Network: Requesting IP Address for vec from DNS
02/09/2010 11:10:46 AM Network: DNS returned address 192.0.2.24 for vec
02/09/2010 11:10:46 AM Network: DNS returned address 192.0.2.12 for vec
02/09/2010 11:10:46 AM Network: DNS returned address 192.0.2.02 for vec
02/09/2010 11:10:46 AM Network: Connected to server VEC
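When several traces have to be compared, the lines of interest can be pulled out with a short script. The sketch below assumes the trace output has the same wording as the example above ("DNS returned address ... for ..." and "Connected to server ..."); the function name `summarize_trace` is hypothetical.

```python
import re
from typing import Dict, List

DNS_LINE = re.compile(r"DNS returned address (\S+) for (\S+)")
CONNECTED_LINE = re.compile(r"Connected to server (\S+)")


def summarize_trace(lines: List[str]) -> Dict[str, object]:
    """Collect the DNS answers and the final connection result from trace output."""
    addresses: List[str] = []
    connected_to = None
    for line in lines:
        dns = DNS_LINE.search(line)
        if dns:
            addresses.append(dns.group(1))
        done = CONNECTED_LINE.search(line)
        if done:
            connected_to = done.group(1)
    return {"addresses": addresses, "connected_to": connected_to}


sample = [
    "02/09/2010 11:10:46 AM Network: Requesting IP Address for vec from DNS",
    "02/09/2010 11:10:46 AM Network: DNS returned address 192.0.2.24 for vec",
    "02/09/2010 11:10:46 AM Network: DNS returned address 192.0.2.12 for vec",
    "02/09/2010 11:10:46 AM Network: Connected to server VEC",
]
print(summarize_trace(sample))
```

A trace whose summary has `connected_to` set to `None` never reached the "Connected to server" line, and the list of addresses shows which DNS answers were returned before the attempt stopped.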
When using these troubleshooting tools keep in mind the OSI Layers:
- Application – Where Domino resides
- Network – Network Interface Cards, Firewalls, etc.
- Data Link
The above tools work at the Network Layer. If one of them fails, the problem most likely lies in the network, and Domino can almost be ruled out at this point.
Other areas that can be checked:
- If drives are on a SAN, you will want to review your Application and System Event Viewer logs and the SAN log. These files may contain errors that have been missed in the Domino Console.log file that show a break in connectivity between Domino and its data that is stored in the SAN.
- A free network sniffer called Wireshark (formerly Ethereal) is available at http://www.wireshark.org/. This utility can be placed on the user workstation or the server experiencing the connectivity issues. It is best to place this on an Aggregating (Full-duplex) Tap. Once a Trace file is captured, it can be reviewed to see if and where the conversation was broken or interrupted. All of this information, as well as a helpful Tutorial, is available on Wireshark's website.
Steps to gather data for initial troubleshooting
If your users are receiving the error “Server not responding”, you may find that the information outlined above has not resolved your issue and that you need to engage IBM Technical Support for troubleshooting assistance. Prior to opening a Problem Management Record (PMR), there are several things that can be done so that once the PMR is opened, troubleshooting can begin immediately. Following these steps will result in a faster resolution, because you will not have to open a PMR and then wait for a response with instructions on gathering data; you will already have much of the needed initial data for the Software Engineer who will be assisting you.
Notes.ini debug parameters
If you have worked with server-side IBM Technical Support before, you may have found that it is often recommended that you add a few notes.ini parameters to your Domino Server. The parameters to enable on your server for initial debugging are outlined below:
|Console_log_enabled=1||Enables console logging and creates the console.log file, which is located in the Domino\Data\IBM_TECHNICAL_SUPPORT folder. This file will log all the information shown in the console to an organized text file that IBM Technical Support can review.|
|Console_log_max_kbytes=102400||Used to set the maximum size of the console.log file. This value can be set to any value. |
Note: The value of the parameter is set in kilobytes and is commonly set between 50 and 100 MB. Here, it is set to a maximum size of 100 MB (102400 KB). Once the maximum size limit is reached, the console log simply wraps the log information within the same file; it does not create a new console log file until the server is restarted.
|Debug_threadid=1||Outputs ThreadID information to the console.log file and often allows you to correlate information between the NSD, console log, and semdebug files. Once enabled, you will see a hexadecimal value placed before each line on the Console. Example output:|
[0F30:0002-13B8] 01/27/2010 09:31:40 AM Database Replicator started
[1E84:0002-0D18] 01/27/2010 09:31:42 AM Index update process started
[1A78:0002-1F14] 01/27/2010 09:31:44 AM Agent Manager started
|Debug_capture_timeout=1||Debug_capture_timeout=1 and Debug_show_timeout=1 are two parameters that are set to enable the capturing of semaphore information. The output is written to the semdebug.txt file, which is also located in the Domino\Data\IBM_TECHNICAL_SUPPORT folder. Output in a semdebug file can be annotated by IBM Technical Support and will look similar to the following:|
ti="006B22ED-8625768C" sq="0008BDAB" THREAD [1434:028E-15C0] WAITING FOR WRITE LOCK ON RWSEM 0x0924 DbServerQueue semaphore (@012830FE) (R=0,W=1,WRITER=1434:13B0,OWNER=1434:13B0) FOR 30000 ms
ti="006B22F2-8625768C" sq="0008BDAC" THREAD [1434:0360-0324] WAITING FOR WRITE LOCK ON RWSEM 0x0924 DbServerQueue semaphore (@012830FE) (R=0,W=1,WRITER=1434:13B0,OWNER=1434:13B0) FOR 30000 ms
Notice the Thread IDs, which can often be correlated with the console log and NSD files. For more information, see Semaphores and Semaphore Timeouts (Technote #1094630).
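As an illustration of that correlation, the following Python sketch groups waiting threads by the thread that owns the semaphore and then picks out console lines from the same process. It assumes the line layouts match the examples above and that the value before the colon in each bracketed ID identifies the process; that reading of the fields is an assumption, and the function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches lines such as:
# THREAD [1434:028E-15C0] WAITING FOR WRITE LOCK ON ... OWNER=1434:13B0) FOR 30000 ms
WAIT_LINE = re.compile(
    r"THREAD \[(?P<waiter>[0-9A-Fa-f]+:[0-9A-Fa-f]+-[0-9A-Fa-f]+)\] WAITING FOR "
    r"(?P<lock>\w+) LOCK ON .*?OWNER=(?P<owner>[0-9A-Fa-f]+:[0-9A-Fa-f]+)\) "
    r"FOR (?P<ms>\d+) ms"
)


def waiters_by_owner(semdebug_lines: List[str]) -> Dict[str, List[str]]:
    """Map each semaphore owner ID to the thread IDs waiting on it."""
    owners: Dict[str, List[str]] = defaultdict(list)
    for line in semdebug_lines:
        match = WAIT_LINE.search(line)
        if match:
            owners[match.group("owner")].append(match.group("waiter"))
    return dict(owners)


def console_lines_for_process(console_lines: List[str], process_id: str) -> List[str]:
    """Return console lines whose thread ID prefix starts with process_id."""
    prefix = "[" + process_id.upper() + ":"
    return [line for line in console_lines if line.upper().startswith(prefix)]
```

For the two semdebug lines shown earlier, `waiters_by_owner` would report both waiting threads under the owner `1434:13B0`, and `console_lines_for_process(console_lines, "1434")` would then list what that process wrote to the console log.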
All of these parameters can be enabled on the Server Console while the server is running by using the "set config" command (for example, set config debug_threadid=1). The console log and Thread ID parameters are dynamic and do not require a Domino restart. The two timeout parameters for semaphore debugging, however, will not begin logging until the Domino Server is restarted.
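The console commands for the parameters described above can be generated rather than typed by hand. The sketch below is a convenience only; the table of parameters and the restart flags restate what this section says, treating Console_log_max_kbytes as one of the dynamic console log parameters (an assumption), and the function names are hypothetical.

```python
from typing import List, Tuple

# (parameter, value, needs_restart) as described in this section.
DEBUG_PARAMETERS: List[Tuple[str, int, bool]] = [
    ("Console_log_enabled", 1, False),
    ("Console_log_max_kbytes", 100 * 1024, False),  # 100 MB expressed in KB
    ("Debug_threadid", 1, False),
    ("Debug_capture_timeout", 1, True),
    ("Debug_show_timeout", 1, True),
]


def set_config_commands(parameters=DEBUG_PARAMETERS) -> List[str]:
    """Build one 'set config' console command per parameter."""
    return [f"set config {name}={value}" for name, value, _ in parameters]


def restart_required(parameters=DEBUG_PARAMETERS) -> List[str]:
    """List the parameters that only take effect after a Domino restart."""
    return [name for name, _, needs_restart in parameters if needs_restart]


for command in set_config_commands():
    print(command)
```

Each printed line can be pasted into the Server Console, and `restart_required()` serves as a reminder of which settings will stay inactive until the next restart.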
When you shut down Domino to make sure the timeout parameters begin logging, it is also recommended that you update the version of NSD running on the server. The download link to the latest version of NSD is found in Updated NSD for Domino releases (Technote #4013182). Once the latest NSD version is downloaded, extract the executable from the zip file. Shut down your Domino Server, double-click the executable, and follow the prompts to direct it to the correct Domino install path. After the NSD update completes, Domino may be started again; an OS restart is not required.
Note: If you cannot find your release of Domino in the list within the technote, then your NSD does not need to be updated. Additionally, updating your NSD version will not affect your server version or hot fixes installed, as the NSD update only affects the NSD files, NSD.exe and NSD.sym.
Information to gather for Technical Support
Now that the server has been configured to gather some of the important debug information, data can be gathered so that IBM Technical Support may begin troubleshooting this issue. If your users begin encountering the “Server not responding” error, the following tests should be completed prior to bringing the server down.
*IMPORTANT NOTE* Output from your results and responses to the tests and questions below should be captured in a Text Document (Symphony Word Document, Microsoft Word, etc.) so that it can be uploaded to Technical Support for review. Do NOT send this information in e-mail. If this information is not uploaded in advance of opening a PMR, you will be directed here to perform these steps.
1) Database open to server IP Address
Perform these steps on a Notes Client that is receiving the "Server not responding" error message when attempting to connect to a server:
a) Go to File -> Database -> Open and type in the IP Address of the "problem" Domino Server. Can you connect using the IP Address?
b) Go to File -> Database -> Open and type in the Fully Qualified Host Name (FQHN) of the "problem" Domino Server, for example, Boston/Mail. Can you connect using the FQHN?
2) Ping the server with 100 packets from a client PC
Open a command prompt on a problem user's workstation and type in the following command:
ping -n 100 -l 8192 ServerNameorIP <-- For example, ping -n 100 -l 8192 8.67.53.09
This sends 100 packets of 8192 bytes each to the destination server and reports how many packets were received and lost. Record the number of packets sent, received, and lost, and the resulting loss percentage.
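To record the result consistently, the summary line of the ping output can be parsed with a few lines of Python. This sketch assumes the English-language Windows ping summary format ("Packets: Sent = ..., Received = ..., Lost = ... (...% loss)"); other locales or operating systems print different text, and the function name `parse_ping_summary` is hypothetical.

```python
import re
from typing import Dict, Optional

SUMMARY = re.compile(r"Sent = (\d+), Received = (\d+), Lost = (\d+)")


def parse_ping_summary(output: str) -> Optional[Dict[str, float]]:
    """Extract packet counts from ping output and compute the loss percentage."""
    match = SUMMARY.search(output)
    if match is None:
        return None
    sent, received, lost = (int(value) for value in match.groups())
    loss_percent = (lost / sent * 100.0) if sent else 0.0
    return {"sent": sent, "received": received, "lost": lost,
            "loss_percent": loss_percent}


sample_output = """
Ping statistics for 192.0.2.24:
    Packets: Sent = 100, Received = 98, Lost = 2 (2% loss),
"""
print(parse_ping_summary(sample_output))
```

The text passed to the function would be whatever the ping command above printed, either pasted in or read from a file the output was redirected to.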
3) Telnet to the server over various ports
When the Domino Server exhibits this issue, try to connect to that server via telnet following the steps outlined below. For each step, note if the telnet connected successfully or not.
a) Try to connect to the server via telnet over Port 25 (SMTP) using the IP Address, for example, telnet 8.67.53.09 25
- If that succeeds, try to connect to the server via telnet over Port 25 (SMTP) using the Internet Address, for example, telnet MailServer.example.com 25
b) Try to connect to the server via telnet over Port 1352 (Replication) using the IP Address, for example, telnet 8.67.53.09 1352
- If that succeeds, try to connect to the server via telnet over Port 1352 (Replication) using the Internet Address, for example, telnet MailServer.example.com 1352
c) Try to connect to the server via telnet over Port 80 (HTTP) using the IP Address, for example, telnet 8.67.53.09 80
- If that succeeds, try to connect to the server via telnet over Port 80 (HTTP) using the Internet Address, for example, telnet MailServer.example.com 80
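Where telnet is not installed on the workstation, the same reachability test can be approximated with a plain TCP connection attempt. The following Python sketch is not a replacement for the telnet results requested above; it only reports whether each port accepts a connection. The function names are hypothetical, and the port labels are the ones used in this document.

```python
import socket
from typing import Dict

# Ports and labels as listed in the steps above.
PORTS: Dict[int, str] = {25: "SMTP", 1352: "Replication", 80: "HTTP"}


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port is accepted within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host: str, ports: Dict[int, str] = PORTS,
                timeout: float = 5.0) -> Dict[int, bool]:
    """Try every listed port on host and record whether it accepted a connection."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

Running `check_ports` once with the server's IP address and once with its host name mirrors the two-stage test in each step: if the IP address connects but the host name does not, name resolution is the more likely culprit.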
4) Run a Trace from a Client PC
a) Using a Notes Client receiving the error message, go to File -> Preferences -> User Preferences -> Ports -> Trace
b) In the Trace Connections screen, run three separate traces: type the server's IP Address, then its hierarchical name, and then its Fully Qualified Host Name into the "Destination" field, selecting Trace each time. Provide all 3 traces for review. If you cannot connect, select "Copy" and paste the data into a text file. Save this file so that you can send this data to IBM Technical Support for review.
5) Use the NPing/NotesConnect Utility
Follow the steps outlined in NotesConnect utility for use when troubleshooting IP connectivity issues (Technote #4004434). As the technote states:
"NotesCONNECT (NPing) is a TCP/IP diagnostics tool designed to verify that a service on a given machine is available. This is accomplished establishing an end-to-end TCP/IP connection with the target host without using the Notes address book or address resolution logic. This tool requires that Notes be installed on the local machine.
This tool is ideal for determining if an TCP/IP connectivity problem is Notes related or an IP infrastructure problem. This connection that is established is a TCP/IP connection, as opposed to ping which uses an IP connection over ICMP. No application or service specific protocol/data is exchanged during the "ping" connection."
6) General Questions
Capture the answers to the following questions:
a) If Domino Web Access is in use, can users open their mail files over the web?
b) Does the server seem to be running slowly? Or is it running fine but users just cannot connect?
c) If you remote into the server at the time of the issue, can you see the Server Console moving, or is it hung?
d) Issue a "show server" command on the server. Did it respond with output on the console?
e) Are you able to enter commands on the Domino console at all?
f) Do you see any error message(s) displayed?
g) Are you experiencing any network issues at the time of the problem?
h) Are any agent, backup, or maintenance utilities running at the time of the issue?
i) Are there any additional details regarding this issue that you feel are relevant?
7) Issue the following console commands
*VERY IMPORTANT PIECES OF DATA*
At the time of the issue, if the Server Console is responsive, IBM Technical Support will need you to run the following commands on the Server Console (this will provide some additional information, especially if the NSD does not complete successfully):
show tasks and/or show tasks time
8) Run a manual NSD with -detach switch
Multiple manual NSDs using the -detach switch will need to be run while the issue is occurring. Follow the procedure outlined in How to run a manual NSD for Notes/Domino on Windows (Technote #1204263) to do this. To summarize the technote:
From a command prompt on the server, change to the Domino\Data directory. Then type the path back to the Domino program directory followed by nsd -detach (Domino\nsd -detach). Allow it to complete, and gather the data.
Do not issue an NSD -kill unless absolutely necessary to bring down the server. Always be sure an NSD -detach has completed running first.
9) Files to collect
After following the tests outlined above and running nsd -detach, browse to the IBM_TECHNICAL_SUPPORT directory inside the Domino Data directory and collect the following three files: the NSD log, console.log, and semdebug.txt.
Note: If you restart Domino before obtaining these files, the console.log and Semdebug.txt will have been renamed with the same naming format as the nsd, except the date and time will be the time the server was last brought up prior to the problem. For example, if you started your server on 1/1/2010 at 12:00:30 PM and the server crashes on 1/15/2010 at 10:15:45 AM, the files will be named:
- console_servername_2010_01_01@12_00_30.log
- semdebug_servername_2010_01_01@12_00_30.txt
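To work out which renamed files belong to a given server start, the names can be derived from the start time. This Python sketch assumes the pattern in the example above, with no space between the server name and the date and with the time written in 24-hour form; both points are assumptions, and the function name `renamed_log_names` is hypothetical.

```python
from datetime import datetime
from typing import Tuple


def renamed_log_names(server_name: str, started: datetime) -> Tuple[str, str]:
    """Build the expected console and semdebug file names for a server start time."""
    stamp = started.strftime("%Y_%m_%d@%H_%M_%S")
    return (
        f"console_{server_name}_{stamp}.log",
        f"semdebug_{server_name}_{stamp}.txt",
    )


# Server started on 1/1/2010 at 12:00:30 PM, as in the example above.
print(renamed_log_names("servername", datetime(2010, 1, 1, 12, 0, 30)))
```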
In addition to the Domino side files that are collected above, a few OS level files are also required.
- Capture a WINMSD Report and save it as a System Information file, .nfo. To do this, go to Start -> Run and type in winmsd. Once this opens, go to File -> Save and save the file as WinMSD.NFO (not .txt).
- Collect your Application and System Event Viewer logs and save them as .evt files. To do this, go to Start -> Control Panel -> Administrative Tools -> Event Viewer. Once there, click Application, then Action -> Save Log File As... and save it as App.evt. Once you have saved the Application log, click System, then Action -> Save Log File As... and save it as Sys.evt.
Sending your data to Technical Support
To recap, the files that will be initially required by Technical Support are:
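1) Text document with your results and responses to the above tests and questions
2) NSD log
3) console.log
4) semdebug.txt
5) WINMSD Report
6) Application Event Viewer log (App.evt)
7) System Event Viewer log (Sys.evt)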
Place these seven files into a zip file and open a PMR with IBM Technical Support. If using the ESR Tool, then attach the zip file to the PMR upon opening it. If you open the PMR by calling 1-800-IBM-SERV, then take note of the Exact PMR Number provided. Once you have your PMR number, go to the following website http://www.ecurep.ibm.com/app/upload and fill in the fields as seen in the screen shot below with your information and click "Continue". On the next screen, browse to the zip file containing the seven files that you just created and submit it. The Software Engineer that will be troubleshooting your issue will be notified of the files uploaded to your PMR and will begin reviewing this information.
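Creating the zip file can also be scripted. The sketch below uses Python's standard zipfile module and refuses to build the archive if any expected file is missing, so an incomplete upload is caught early. The function name `bundle_for_support` and any file names passed to it are hypothetical; use the actual names of the files you collected.

```python
import zipfile
from pathlib import Path
from typing import Iterable, List


def bundle_for_support(files: Iterable[Path], archive: Path) -> List[str]:
    """Zip the given files into archive and return the names stored in it."""
    paths = [Path(p) for p in files]
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing files: " + ", ".join(missing))
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        for path in paths:
            # Store each file under its base name, without directory components.
            bundle.write(path, arcname=path.name)
        return bundle.namelist()
```

Passing the paths of the collected files together with a target such as `Path("pmr_data.zip")` (a placeholder name) produces a single archive ready to attach or upload.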
Notes/Domino 8.x Information Center
"Server not responding"
"Server not responding" connecting to partitioned servers
Error: 'Server not responding' opening a Notes mail file (Technote #1377647)
Failover attempts fail with 'Server not responding' even though servers are available (Technote #1089786)
Error: '...The server is not responding...' appears on console every minute (Technote #1315508)
"Remote System No Longer Responding" Dialog Box when Sending a Mail Message (Technote #1198517)