3.1 Monitoring 

Monitoring a Domino environment means the repeated, systematic collection and supervision of data about the environment, its processes, and the individual tasks within it. This article defines a monitoring strategy that covers the most common components of a Domino environment. Treat this strategy as a baseline that requires further customization to fit your specific Domino environment.
  • 1 Monitoring Options
  • 2 What should be monitored?
  • 3 Monitoring Profiles for Domino
    • 3.1 Action
    • 3.2 Hint: Cell Phone Alert
    • 3.3 Mapping Response Level to Severity Level
    • 3.4 Profile: Generic
    • 3.5 Profile: Mail Server
    • 3.6 Profile: Web Server
    • 3.7 Profile: Domino Cluster
    • 3.8 Example of documenting Monitoring Profiles
  • 4 Domino Event Monitoring
  • 5 Further Reading

Monitoring a Domino environment means the repeated, systematic collection and supervision of data about the environment, its processes, and the individual tasks within it. The main purpose of monitoring is to detect when certain parameters of a system or environment exceed their defined boundaries and to react in a defined way, for example, by alerting.

Because Domino is highly configurable and can perform a variety of tasks, the aim of this article is to define a monitoring strategy that covers the most common components of a Domino environment. Treat this strategy as a baseline that requires further customization to fit your specific Domino environment.


Monitoring Options

We do not recommend a single solution or a single tool for all customers. Certainly large implementations of Domino or those with high usage have different needs than smaller Domino environments.

IBM offers the following monitoring options:

  • Server Monitor
    This is a basic monitoring option built into the Lotus Domino Administrator client. It is well suited to small environments, or as a supplement to one of the following solutions.
  • Domino Domain Monitoring (DDM)
    This is a server feature built into Lotus Domino and enabled by default. DDM is great for detecting, understanding and acting on run time issues.
    DDM probes log events. The administrators need to check the events that have been logged. Event generators and event handlers together with statistics collection can be used to monitor a Domino environment. For information on how they are handled, see:
    http://www.ibm.com/developerworks/lotus/tutorials/lsdom6stats/index.html
  • IBM Tivoli ITCAM for Applications
    This is part of an enterprise-class monitoring solution that is extremely scalable and capable of monitoring much more than Lotus Domino alone.
    Its functionality can be deployed agent-based or agentless. It leverages best-practice models that focus on performance monitoring of key Lotus Domino components including servers, mail routing, replication, calendar, database, and clusters. Deeper Domino administrative capabilities are available with IntelliWatch. For more information, see:
    http://www-01.ibm.com/software/tivoli/products/composite-application-mgr-applications/index.html
  • IBM Tivoli Intelliwatch Pinnacle for Distributed Systems
    This product provides automated problem detection and correction, system-wide product configuration options, custom reporting capabilities, fault recovery, and more. For more information, see:
    http://www-01.ibm.com/software/tivoli/products/intelliwatch/

Note that third-party solutions, which are not listed here, are also available on the market.

What should be monitored?

A common mistake is to limit monitoring to the Lotus Domino application itself. A number of other components should also be monitored to ensure that they work and to avoid spending time analyzing issues that originate in a completely different area.

This list provides a brief overview of which elements should be monitored:

  • Network (LAN / WAN)
  • Platform (Hardware, Operating system)
  • Storage & Backup environment
  • Application with its components

Within this article, we focus on the last part “Application” which in our case is Lotus Domino.

Monitoring Profiles for Domino

For ease of reference, different monitoring profiles should be defined within an environment. By grouping monitors in this way, it is possible to create profiles which are applicable to specific server configurations or functions in the Domino environment.

The following server roles serve as an example:

  • Generic Domino Servers – Applied to all Domino servers.
  • Mail Servers – Servers hosting end user mailboxes.
  • Web Servers – Servers hosting Web sites (HTTP services).
  • Cluster Servers – Servers providing cluster service.
  • Special Application Servers - Servers providing additional services.

Additional profiles can be defined based on the needs of your environment. Make sure to document additional server profiles, and include a definition of when to use each monitoring profile.
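
The grouping of monitors into profiles can be sketched as a small data model. The following Python fragment is only an illustration, not a Domino API: the names PROFILES, SERVER_ROLES, and monitors_for, the server names, and the abbreviated monitor lists are all hypothetical.

```python
# Hypothetical sketch: monitoring profiles as named sets of monitors.
# The monitor lists are abbreviated examples, not complete profiles.
PROFILES = {
    "generic": {"Task adminp", "Task router", "Statistic Replica.Failed"},
    "mail": {"Task calconn", "Task sched", "Statistic Mail.Dead"},
    "web": {"Task http", "Statistic HTTP.PeakConnections"},
    "cluster": {"Task clrepl", "Task cldbdir"},
}

# Hypothetical assignment of role profiles to servers.
SERVER_ROLES = {
    "ServerA/ITSO": ["mail", "cluster"],
    "ServerB/ITSO": ["web"],
}


def monitors_for(server: str) -> set[str]:
    """Return the generic profile plus every role profile of the server."""
    monitors = set(PROFILES["generic"])  # the generic profile applies to all servers
    for role in SERVER_ROLES.get(server, []):
        monitors |= PROFILES[role]
    return monitors
```

Keeping the profile definitions in one place like this also serves as the documentation of which profile applies to which server.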

Action

Monitoring by itself is useless unless you take actions in case of an event or problem. These actions can be defined for each response level and also for each event in detail. Which action is the most important or convenient depends on your corporate environment.

In small implementations of Lotus Domino, it might be enough to send mail to the administrator, who takes action some time later. In large environments, a solution that supports 24x7 monitoring and alerting might be needed. In this scenario, it is often required to integrate Lotus Domino monitoring results into an enterprise-wide monitoring system or help desk system.

Actions depend on factors such as the size of the environment and the availability of systems for alerting or ticket management.

Lotus Domino supports a number of notification actions which can be used further on to build custom integrations to 3rd party systems, for example, to automatically open a help desk ticket in your custom help desk application.

Figure 1 shows event handler methods.

If a Tivoli Enterprise Console is already available, then forwarding events to this console is recommended. This is most likely the case for medium and large Lotus Domino installations.


Hint: Cell Phone Alert

In order to receive cell phone alerts, there is a special cell phone configuration which some providers support. Providers can be requested to forward email messages to a phone as text messages (SMS). This allows notifying administrators via SMS when a critical event occurs. If properly configured, a cell phone can receive email messages in SMS form. The email must be addressed to a specific email address and domain name which is defined by your provider.

In most cases, the address is your cell phone number followed by the provider’s gateway domain name (for example, 0123456789@..). Consult your cell phone carrier for details about how to enable this configuration. Be aware that this may add extra cost to your phone bill.

To notify multiple people about the same event, create a group in your Domino directory (e.g. “AdminAlert-HTTPServers”) which contains a list of these special email addresses.
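
The addressing scheme described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of Domino: the gateway domain sms.example.com, the phone numbers, the sender address, and the function names are hypothetical, and the 160-character limit is an assumed single-SMS length that your provider may handle differently.

```python
from email.message import EmailMessage

# Hypothetical values: the real gateway domain is defined by your provider.
GATEWAY_DOMAIN = "sms.example.com"
ADMIN_PHONES = ["0123456789", "0123456790"]


def sms_addresses(phones, gateway_domain):
    """Build one email-to-SMS address per phone number."""
    return [f"{number}@{gateway_domain}" for number in phones]


def build_alert(sender, recipients, server, event_text):
    """Compose a short plain-text alert message for the SMS gateway."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"ALERT {server}"
    msg.set_content(event_text[:160])  # assumed limit: keep the body short
    return msg


alert = build_alert(
    "domino-alerts@example.com",
    sms_addresses(ADMIN_PHONES, GATEWAY_DOMAIN),
    "ServerA/ITSO",
    "Task http became unavailable",
)
```

In a Domino environment, the list produced by sms_addresses corresponds to the members of the alert group in the Domino directory.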


Mapping Response Level to Severity Level

To make the configuration details later in this article easier to understand, we map the response levels to the severity levels that are widely used in help desk systems.

| Response Level | Severity Level | Description |
|---|---|---|
| N/A | Sev 1 | Highest level of attention required, serious impact to business expected. |
| Fatal | Sev 2 | High attention required, system is functioning but may lead to service disruption if no action is taken. |
| Critical | Sev 3 | Requires attention of a Domino administrator; if not handled in a timely manner, this may lead to further problems. |
| Warning | Sev 4 | Should be brought to the administrator's attention, but doesn’t require immediate attention. |
| Reset | N/A | Previous severity now stabilized. |
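
When Domino events are forwarded to a help desk system, this mapping is typically needed as a lookup. The following Python sketch encodes the mapping from the table above; the names RESPONSE_TO_SEVERITY and ticket_severity are hypothetical and do not belong to any Domino or help desk API.

```python
# Mapping taken from the table above; None stands for "N/A".
RESPONSE_TO_SEVERITY = {
    "Fatal": "Sev 2",
    "Critical": "Sev 3",
    "Warning": "Sev 4",
    "Reset": None,
}


def ticket_severity(response_level: str):
    """Translate a Domino response level into a help desk severity level."""
    if response_level not in RESPONSE_TO_SEVERITY:
        raise ValueError(f"unknown response level: {response_level}")
    return RESPONSE_TO_SEVERITY[response_level]
```

Raising an error for unknown levels, rather than silently defaulting, is a design choice of this sketch so that unmapped levels are noticed early.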

Profile: Generic

A default monitoring profile should be applied to every Domino server, regardless of its designated role.

In general, where a monitor is considered important and critical enough that it will impact server function, the monitor interval can be set to 5 or 10 minutes. Otherwise, an hourly interval is typical.

The Generic Domino Server Profile should include the following monitors:

| Monitor Name | Response Level | Trigger | Details and Interval |
|---|---|---|---|
| Mail Probe | Warning (high) | On time out | Mail Delivery Monitoring probe. Send interval: 10 minutes. Time out threshold: 10 minutes. |
| Server availability | Fatal (non-clustered servers) | Is unavailable | TCP Event Monitor. Every 5 minutes. |
| | Critical (clustered servers) | Is unavailable | |
| | Reset | Is available | |
| Task ‘adminp’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Hourly. |
| | Reset | Becomes Available | |
| Task ‘event’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Task ‘amgr’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Task ‘stats’ | Failure | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Task ‘update’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Task ‘router’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Task ‘replica’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Task ‘DAOSMgr’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| Task ‘MTC’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Domino Statistic ‘Replica.Failed’ | Warning | Increase of 10 | Statistic Event Generator. Alternative: Hourly. |
| | Failure | Increase of 10 | |
| Domino Statistic ‘Server.Sessions.Dropped’ | Warning | Increase of 50 | Statistic Event Generator. Alternative: Hourly. |
| | Failure | Increase of 100 | |
| Domino Statistic ‘Server.Users’ | Warning | Increases above X | Statistic Event Generator (X and Y depend on size of server). Alternative: Hourly. |
| | Failure | Increases above Y | |
| Domino Statistic ‘Agent.Hourly.UnsuccessfulRuns’ | Warning | Increase above 0 | Statistic Event Generator. Alternative: Hourly. For details, see IBM Technote 1232603. |
| Domino Statistic ‘Agent.Daily.UnsuccessfulRuns’ | Warning | Increase above 0 | Statistic Event Generator. Alternative: Daily. For details, see IBM Technote 1232603. |
| ACL Change (names.nsf) | Warning (high) | On ACL change | Database Event Generator, Monitor ACL Change. File name: names.nsf. Servers: all Domino servers in scope. |
| ACL Change (admin4.nsf) | Warning (high) | On ACL change | Database Event Generator, Monitor ACL Change. File name: admin4.nsf. Servers: all Domino servers in scope. |
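
The "Increase of N" triggers in this profile compare two consecutive samples of a statistic. The following Python sketch shows one way to evaluate such a trigger outside of Domino, for example in an external monitoring script. The function name classify_increase is hypothetical, and treating the thresholds as inclusive (>=) is an assumption; the Statistic Event Generator itself is configured in events4.nsf, not in code.

```python
def classify_increase(previous: int, current: int,
                      warning_delta: int, failure_delta: int):
    """Return "Failure", "Warning", or None for the increase between two polls.

    Assumption: thresholds are inclusive, and the higher level wins.
    """
    delta = current - previous
    if delta >= failure_delta:
        return "Failure"
    if delta >= warning_delta:
        return "Warning"
    return None


# Server.Sessions.Dropped from the table: Warning at +50, Failure at +100.
level = classify_increase(previous=120, current=180,
                          warning_delta=50, failure_delta=100)
```

In this example the statistic rose by 60 between polls, so level is "Warning".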


Profile: Mail Server

The following monitors will be applied to all Domino servers designated as mail servers:

  • Generic Monitoring Profile.

In addition, the following monitors are recommended:

| Monitor Name | Response Level | Trigger | Details & Interval |
|---|---|---|---|
| Task ‘calconn’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Task ‘sched’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Domino Statistic ‘Mail.Dead’ | Warning | Increases above X | Statistic Event Generator. Alternative: Hourly. X and Y depend on size of environment. |
| | Failure | Increases above Y | |
| | Reset | Decreases below X | |
| Domino Statistic ‘Mail.Waiting’ | Warning | Increases above X | Statistic Event Generator. Alternative: Every 10 minutes. X and Y depend on size of environment. |
| | Fatal | Increases above Y | |
| Domino Statistic ‘Mail.Trans..Failures’ | Warning | Increases above 100 | Statistic Event Generator. Alternative: Hourly. X depends on size of environment. |
| | Fatal | Increases above 500 | |
| | Reset | Decreases below X | |
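
Monitors such as ‘Mail.Dead’ combine two rising thresholds with a reset threshold, which means the monitor has to remember the level it last raised. The following Python sketch illustrates that behavior; it is not how Domino implements it. The class name ThresholdMonitor and the sample values for X and Y are hypothetical, and the handling of a value exactly equal to X (no state change) is an assumption.

```python
class ThresholdMonitor:
    """Track one statistic with warning (X), failure (Y), and reset thresholds."""

    def __init__(self, warning_above: int, failure_above: int):
        self.warning_above = warning_above  # X in the table
        self.failure_above = failure_above  # Y in the table
        self.level = None                   # level currently raised, if any

    def update(self, value: int):
        """Return a newly raised level, "Reset", or None if nothing changed."""
        if value > self.failure_above:
            target = "Failure"
        elif value > self.warning_above:
            target = "Warning"
        elif value < self.warning_above:
            target = None
        else:
            target = self.level  # exactly X: keep the current state
        if target == self.level:
            return None
        self.level = target
        return target if target is not None else "Reset"


# Hypothetical thresholds: X = 5 dead mails, Y = 20 dead mails.
mail_dead = ThresholdMonitor(warning_above=5, failure_above=20)
events = [mail_dead.update(v) for v in (3, 7, 25, 2, 2)]
```

For the sample readings, the monitor raises Warning at 7, Failure at 25, and Reset once the value drops to 2; repeated readings at the same level raise nothing.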


Profile: Web Server

The following monitors will be applied to all Domino servers designated as Web servers:

  • Generic Monitoring Profile.
  • Domino Mail Server Monitors Profile (if needed).

In addition, the following monitor profile should be added:

| Monitor Name | Response Level | Trigger | Details & Interval |
|---|---|---|---|
| Task ‘http’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Domino Statistic ‘HTTP.PeakConnections’ | Warning | Increases above X | Statistic Event Generator (X and Y depend on size of server). Alternative: Every 60 minutes. For details, see IBM Technote 1232603. |
| | Failure | Increases above Y | |
| Domino Statistic ‘Domino.Threads.Active.Peak’ | Warning | Increases above X | Statistic Event Generator (X and Y depend on size of server). Alternative: Every 60 minutes. For details, see IBM Technote 1232603. |
| | Failure | Increases above Y | |


Profile: Domino Cluster

Any Domino Servers configured as a Domino cluster should have the following Domino Cluster Server monitoring profile applied in addition to the basic profiles:

  • Generic Monitoring Profile.
  • Domino Mail Server Monitors Profile (if needed)

In addition, the following monitor profile should be added:

| Monitor Name | Response Level | Trigger | Details & Interval |
|---|---|---|---|
| Task ‘clrepl’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Task ‘cldbdir’ | Fatal | Becomes Unavailable | Task Status Monitor. Alternative: Every 10 minutes. |
| | Reset | Becomes Available | |
| Domino Statistic ‘Replica.Cluster.Failed’ | Critical | 1 | Statistic Event Generator. Alternative: Every 60 minutes. For details, see IBM Technote 1232603. |
| | Reset | <1 | |
| Domino Statistic ‘Server.Cluster.OpenRedirects.LoadBalanceByPath.Unsuccessful’ | Warning | 1 | Statistic Event Generator. Alternative: Every 60 minutes. For details, see IBM Technote 1232603. |
| | Critical | 5 | |
| | Fatal | 10 | |
| | Reset | <1 | |
| Domino Statistic ‘Server.Cluster.OpenRedirects.LoadBalance.Unsuccessful’ | Warning | 1 | Statistic Event Generator (X and Y depend on size of server). Alternative: Every 60 minutes. For details, see IBM Technote 1232603. |
| | Critical | 5 | |
| | Fatal | 10 | |
| | Reset | <1 | |
| Domino Statistic ‘Server.Cluster.OpenRedirects.FailoverByPath.Unsuccessful’ | Warning | 1 | Statistic Event Generator. Alternative: Every 60 minutes. For details, see IBM Technote 1232603. |
| | Critical | 5 | |
| | Fatal | 10 | |
| | Reset | <1 | |
| Domino Statistic ‘Server.Cluster.OpenRedirects.Failover.Unsuccessful’ | Warning | 1 | Statistic Event Generator. Alternative: Every 60 minutes. For details, see IBM Technote 1232603. |
| | Critical | 5 | |
| | Fatal | 10 | |
| | Reset | <1 | |
| Domino Statistic ‘Replica.Cluster.WorkQueueDepth.Avg’ | Warning | 500 | Statistic Event Generator. Alternative: Every 60 minutes. For details, see IBM Technote 1232603. |

Example of documenting Monitoring Profiles

Server NameGenericMailHubWebClusterCustom App.Add-onMail Delivery
ServerA/ITSOYesYes

Yes

Yes
ServerB/ITSOYes


Yes
AntivirusYes
SametimeA/ITSOYes






Domino Event Monitoring


Although the profiles above can be implemented in different monitoring systems, it is also possible to monitor Lotus Domino events from the Domino Monitoring Configuration (events4.nsf) database.

To avoid being flooded with information, administrators should monitor all Domino events with a severity of Fatal, Failure, or Warning (high), as listed in the table below. Each event type classifies its messages by severity level. The Lotus Domino server defines these severity levels as follows:

| Severity Level | Response Level | Meaning |
|---|---|---|
| Fatal | Fatal | Imminent system crash. |
| Failure | Critical | Severe failure that does not cause a system crash. |
| Warning (high) | Warning | Loss of function requiring intervention. |
| Warning (low) | N/A | Performance degradation. |
| Normal | N/A | Status messages. |
| All Severities | N/A | All of the above messages. |
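
The filtering rule above (monitor only Fatal, Failure, and Warning (high)) can be sketched as follows. This Python fragment is an illustration only: representing events as dictionaries with a "severity" key is a hypothetical format, and the names SEVERITY_TO_RESPONSE and events_to_monitor are not part of Domino.

```python
# Domino severity level -> response level, as listed in the table above.
# Severities mapped to N/A in the table are intentionally left out.
SEVERITY_TO_RESPONSE = {
    "Fatal": "Fatal",
    "Failure": "Critical",
    "Warning (high)": "Warning",
}


def events_to_monitor(events):
    """Keep only events whose severity maps to a response level."""
    return [
        {**event, "response": SEVERITY_TO_RESPONSE[event["severity"]]}
        for event in events
        if event["severity"] in SEVERITY_TO_RESPONSE
    ]


# Hypothetical sample events.
sample = [
    {"text": "Task http became unavailable", "severity": "Failure"},
    {"text": "Database is being indexed", "severity": "Normal"},
    {"text": "Slow response", "severity": "Warning (low)"},
]
monitored = events_to_monitor(sample)
```

Only the first sample event passes the filter, and it is annotated with the response level Critical.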

For best results, you may wish to change the following default settings. Remember to document the changed defaults so that you can reapply them after upgrading Lotus Domino to a later version.

| Value | Text | Old event severity | New event severity | Reason |
|---|---|---|---|---|
| 0x02CC | Database is being Compacted; Compact must finish before use. | Warning (Low) | Normal | Occurs when the Compact task runs against a database that is in use, for example a system database. |
| 0x0EA2 | Recovery Manager: Assigning new DBIID for (need new backup for media recovery). | Warning (Low) | Normal | Backup software is requested to take a new full backup of this application. |
| 0x0EA8 | Recovery Manager: Restart Recovery complete. (/ databases needed full/partial recovery) | Warning (Low) | Normal | This only indicates that the server has been restarted completely. |
| 0x0F13 | Database is currently being indexed by another process | Warning (Low) | Normal | This is only informational. |
| 0x0F3B | Full Text Error (FTG): Exceeded max configured index size while indexing document NT in database index | Warning (High) | Normal | Large attachments are intentionally not full-text indexed, so this error is expected. |
| 0x1104 | Recipient user name not unique. Several matches found in Domino Directory. | Failure | Normal | Nothing can be done about this: the recipient is chosen by the sender, and the name is not validated by the client when mail is sent offline or to an email address. |
| 0x1105 | User not listed in Domino Directory | Warning (Low) | Normal | Occurs every time a user enters a wrong name in the SendTo field. |
| 0x1149 | Error registering mail rule for database | Warning (High) | Normal | Rules are controlled by users. This cannot be fixed every time, and it has no consequence for the server. |
| 0x131B | Database created by | Warning (Low) | Normal | This is only informational. |
| 0x131C | Database deleted by | Warning (Low) | Normal | This is only informational. |
| 0x1321 | ATTEMPT TO ACCESS SERVER by was denied | Warning (Low) | Normal | Many users may try to access the Admin server or servers with limited access, for example, because they had access before. |
| 0x1323 | ATTEMPT TO ACCESS DATABASE by was denied | Warning (High) | Normal | Normal (for example, users try to see calendar details without having public access or higher). |
| 0x1357 | Failing over from for replica id , directing open to | Warning (Low) | Normal | Informational: a user has been redirected to a cluster server. |
| 0x135C | Failing over from , directing open to | Warning (Low) | Normal | Informational: a user has been redirected to a cluster server. |
| 0x135E | Unable to redirect failover from | Warning (Low) | Normal | Informational: a database could not fail over to a cluster server. |
| 0x138C | Operation cannot be performed at the current time - database compaction in progress. | Warning (Low) | Normal | Normal during compaction. |
| 0x1519 | A DDM report document (NoteID 0x) could not be opened. | Warning (High) | Normal | Occurs when a DDM report has been manually deleted and another instance of the error is then logged. |
| 0x1614 | Replicator was unable to initialize (from ): | Failure | Normal | Occurs every time a replica stub is created. |
| 0x19FC | Your account is locked out; see your system administrator to reset it | Warning (Low) | Normal | Many users forget to change their password in time; we consider this something users resolve themselves. |
| 0x330A | documents ( bytes) indexed in | Warning (Low) | Normal | Indexing is normal. |
| 0x9AC0 | LDAP Server: Warning: Invalid credentials specified on Bind request, DN is | Warning (Low) | Normal | Normal behavior, see IBM Technote 1219847. |
| 0x331A | Database was marked for delete and has been deleted | Warning (Low) | Normal | This is only informational. |
| 0x3320 | Admin Process: does not appear in the ACLs of any databases designating as their Administration Server | Warning (Low) | Normal | AdminP process is normal. |
| 0x3327 | does not appear in the Readers or Authors fields of any databases designating as their Administration Server | Warning (Low) | Normal | AdminP process is normal. |
| 0x3346 | The database is transactionally logged. A full backup of it needs to be performed on for media recovery. | Failure | Normal | Backup software is requested to take a new full backup of this application. |
| 0x336D | Router: Message contains no recipients | Warning (High) | Normal | Information on missing recipients in a message. |
| 0x33C4 | does not appear in the unread lists of the databases on . | Warning (Low) | Normal | AdminP process is normal. |
| 0x33E3 | Admin Process: does not appear in design elements of any databases designating as their Administration Server | Warning (High) | Normal | AdminP process is normal. |
| 0x3032 | Not all specified languages were found in design template | Normal | Warning (High) | This error has to be handled; otherwise, refreshing the design of the database fails. |
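
One way to document the changed defaults so that they can be reapplied after an upgrade is to keep them in a machine-readable list. The following Python sketch shows the idea with three entries from the table above. The names SEVERITY_OVERRIDES, effective_severity, and changes_to_reapply are hypothetical, and reading the current severities out of events4.nsf is not shown here.

```python
# A few of the changes from the table above: event code -> (old, new) severity.
SEVERITY_OVERRIDES = {
    0x02CC: ("Warning (Low)", "Normal"),
    0x1104: ("Failure", "Normal"),
    0x3032: ("Normal", "Warning (High)"),
}


def effective_severity(code: int, default_severity: str) -> str:
    """Return the documented override for an event code, else the default."""
    override = SEVERITY_OVERRIDES.get(code)
    return override[1] if override else default_severity


def changes_to_reapply(current: dict) -> list:
    """List event codes whose current severity differs from the documented
    override, for example after an upgrade restored the defaults."""
    return [
        code
        for code, (_, new) in sorted(SEVERITY_OVERRIDES.items())
        if current.get(code) != new
    ]


# Hypothetical state after an upgrade: 0x1104 is back at its default.
after_upgrade = {0x02CC: "Normal", 0x1104: "Failure", 0x3032: "Warning (High)"}
pending = changes_to_reapply(after_upgrade)
```

Here pending contains only 0x1104, the one event whose severity must be changed again.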

Further Reading

For more information about how to use and configure DDM, refer to the following IBM Technote and the IBM Redpaper:

  • http://www-01.ibm.com/support/docview.wss?rs=463&uid=swg27009312
  • http://www.redbooks.ibm.com/abstracts/redp4089.html

Category:
IBM Redbooks: Optimizing Lotus Domino Administration
This Version: Version 2 January 31, 2011 4:19:41 PM by Amanda J Bauman  IBMer