Abstract: This article explains hard disk fragmentation and how to troubleshoot resultant IBM® Lotus® Domino® server performance issues. We focus on Domino versions 6.5.x, 7.x, and 8.x running on the Microsoft® Windows® 2000 and 2003 platforms.
In general, the administrators who take care of Lotus Domino are not responsible for maintaining the operating system, and they often only discover the OS needs maintenance when advised by someone external to the company.
Also, administrators may observe only the total percentage of the hard disk fragmentation; if the percentage is low (say, 1-10%), administrators don’t think the OS needs maintenance. In fact, however, just one fragmented file can cause Domino server performance issues.
The hard disk is one of the slowest components of a computer system, so it is often the cause of performance issues. An optimally operating disk sub-system is key to performance.
NOTE: Hard disk fragmentation also affects Lotus Notes® clients, so you should verify the fragmentation in your workstations as well.
What is fragmentation?
Fragmentation occurs when the OS cannot or will not allocate enough contiguous space to store a complete file as a unit, and instead puts parts of it in gaps between other files. (Usually those gaps exist because they formerly held a file that the OS has subsequently deleted, or because the OS allocated excess space for the file in the first place.)
Larger files and greater numbers of files also contribute to fragmentation and consequent performance loss. Likewise, modifications and writing to files/databases will also cause fragmentation. Lotus Domino is doing this all the time; as new mail arrives and mail is sent, the databases are in a constant state of flux.
Common problems caused by file fragmentation
The most common problems resulting from file fragmentation are as follows:
Crashes and system hangs/freezes
Slow boot up and computers that won’t boot
Slow backup times and aborted backups
File corruption and data loss
Errors in programs
RAM use and cache problems
Hard drive failures
Overall sluggish Domino server performance
Long-held locks in the console log and semaphore timeouts
You should check the fragmentation of every partition on the hard disk, even a partition that does not hold any Domino files. Look for files with more than 600 fragments; if you find any, you should monitor the hard disk performance.
If you find any file with around 2,000 fragments, you should run maintenance. For example, if the antivirus log has more than 2,000 fragments, this can be enough to cause Domino performance issues, and it will be worse if Domino’s files are fragmented.
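The two thresholds above can be applied mechanically once you have per-file fragment counts from an analysis report. The following is a minimal sketch, not part of any IBM or Microsoft tool: the function name, the sample paths, and the fragment counts are all hypothetical, and treating exactly 2,000 fragments as the maintenance cutoff is an assumption, since the guideline is only approximate.

```python
from typing import Dict

MONITOR_THRESHOLD = 600   # article guideline: more than 600 fragments -> monitor the disk
DEFRAG_THRESHOLD = 2000   # article guideline: around 2,000 fragments -> run maintenance


def classify_fragmentation(fragments_per_file: Dict[str, int]) -> Dict[str, str]:
    """Map each file to 'ok', 'monitor', or 'defragment' based on its fragment count."""
    result = {}
    for path, fragments in fragments_per_file.items():
        if fragments >= DEFRAG_THRESHOLD:
            result[path] = "defragment"
        elif fragments > MONITOR_THRESHOLD:
            result[path] = "monitor"
        else:
            result[path] = "ok"
    return result


# Hypothetical fragment counts, as they might be copied by hand from an analysis report.
report = {
    r"d:\logs\antivirus.log": 2150,
    r"f:\lotus\domino\data\mail\user1.nsf": 740,
    r"f:\lotus\domino\data\names.nsf": 12,
}

for path, action in classify_fragmentation(report).items():
    print(f"{action:10} {path}")
```

Note that the check covers every file in the report, not only Domino files, because a single heavily fragmented file on any partition can be enough to affect the server.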
Example of Microsoft Windows Defragmenter report
To get the fragmentation report in Windows 2003:
1. Go to All Programs > Accessories > System Tools > Disk Defragmenter.
2. Select the partition for which you want the report.
3. Click the Analyze button; the Analysis Report displays (see figure 1). The data highlighted in red is the crucial information.
Figure 1. Analysis Report
Is there any debug to detect disk fragmentation?
No, not directly in Lotus Domino, but you can look for symptoms that indicate disk fragmentation. Specifically, you can enable semaphore debugging, to look for slow processing and databases being locked for extended periods of time.
To do this, add the parameters below in the Domino server’s Notes.ini file:
Also, when a performance issue occurs, you can manually run the NSD twice; if you see semaphore 0x0244 in the semdebug.txt file, it could indicate a fragmentation issue.
If you have any questions, you can contact IBM Support for further investigation (be sure to provide the two NSDs, console.log, and semdebug.txt). Also, look for long-held locks in the console log. When transaction logging is enabled, long-held locks replace the 0x0244 database semaphore timeout.
Examples of semdebug:
06/02/2008 03:07:19 AM ZW3 sq="0000590B" THREAD [0FF4:0002-10D4] WAITING FOR READ LOCK ON FRWSEM 0x0244 database semaphore (@104982E3) (f:\lotus\domino\data\ade\2006\xxx6.nsf) (R=0,W=2,WRITER=104C:106C,1STREADER=0000:0000) FOR 30000 ms
06/02/2008 03:07:42 AM ZW3 sq="00005938" THREAD [0C08:0002-0C04] WAITING FOR READ LOCK ON FRWSEM 0x0244 database semaphore (@120982E3) (f:\lotus\domino\data\ade\2006\xxx6.nsf.nsf) (R=0,W=2,WRITER=104C:106C,1STREADER=0000:0000) FOR 30000 ms
06/02/2008 03:07:49 AM ZW3 sq="00005939" THREAD [0FF4:0002-10D4] WAITING FOR READ LOCK ON FRWSEM 0x0244 database semaphore (@104982E3) (f:\lotus\domino\data\ade\2006\xxx6.nsf.nsf) (R=0,W=2,WRITER=104C:106C,1STREADER=0000:0000) FOR 30000 ms
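When semdebug.txt is large, it can help to summarize which writer thread the waiting threads are blocked on. The following is a rough sketch that assumes every timeout line has the same layout as the examples above; the regular expression and the function name are illustrative only and are not part of Domino or NSD, and other semaphore types or Domino releases may format lines differently.

```python
import re
from collections import Counter

# Pattern inferred only from the example lines above (an assumption, not a documented format).
LINE_RE = re.compile(
    r"THREAD \[(?P<thread>[^\]]+)\] WAITING FOR (?P<mode>\w+) LOCK ON FRWSEM "
    r"(?P<sem>0x[0-9A-Fa-f]+) .*?\(@[0-9A-Fa-f]+\) \((?P<db>[^)]*)\) "
    r"\(.*?WRITER=(?P<writer>[0-9A-Fa-f:]+).*?\) FOR (?P<wait_ms>\d+) ms"
)


def parse_semdebug(lines):
    """Yield one dict per line that matches the timeout layout; skip everything else."""
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            record = match.groupdict()
            record["wait_ms"] = int(record["wait_ms"])
            yield record


SAMPLE = [
    r'06/02/2008 03:07:19 AM ZW3 sq="0000590B" THREAD [0FF4:0002-10D4] WAITING FOR READ LOCK ON FRWSEM 0x0244 database semaphore (@104982E3) (f:\lotus\domino\data\ade\2006\xxx6.nsf) (R=0,W=2,WRITER=104C:106C,1STREADER=0000:0000) FOR 30000 ms',
    r'06/02/2008 03:07:42 AM ZW3 sq="00005938" THREAD [0C08:0002-0C04] WAITING FOR READ LOCK ON FRWSEM 0x0244 database semaphore (@120982E3) (f:\lotus\domino\data\ade\2006\xxx6.nsf.nsf) (R=0,W=2,WRITER=104C:106C,1STREADER=0000:0000) FOR 30000 ms',
    r'06/02/2008 03:07:49 AM ZW3 sq="00005939" THREAD [0FF4:0002-10D4] WAITING FOR READ LOCK ON FRWSEM 0x0244 database semaphore (@104982E3) (f:\lotus\domino\data\ade\2006\xxx6.nsf.nsf) (R=0,W=2,WRITER=104C:106C,1STREADER=0000:0000) FOR 30000 ms',
]

records = list(parse_semdebug(SAMPLE))
waits_by_writer = Counter(record["writer"] for record in records)
for writer, count in waits_by_writer.most_common():
    print(f"writer {writer} was holding the lock in {count} timeout line(s)")
```

In the sample, all three waits point at the same writer (104C:106C), which is the kind of pattern worth including when you send the files to IBM Support.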
Disk fragmentation is caused by the OS; it is not a Lotus Domino/Notes issue. Data is written to the hard disk as files and, depending on the available free disk space, a file may be split up and stored in pieces in different areas of the disk. Reading that data then takes longer, because the disk heads must jump around to retrieve it.
On the Windows OS level, you can troubleshoot for fragmentation using the Windows Performance Monitoring Tool (PerfMon), which lets you set up data collection for a certain period, and then go back and analyze it. PerfMon can be used to collect information and statistics about many aspects of the running server, including CPU use, process level, thread level, disk, memory, and network.
Of course, you would want to look at disk data to identify fragmentation and its symptoms. The main statistic for disk fragmentation is Split IO/Sec, which reports the rate at which I/Os to the disk are split into more than one I/O, or fragmented. If split I/Os are greater than 10% of the total I/Os, there's probably a performance hit.
The Disk Transfers/sec is the rate of the read and write operations on the disk, so you can check the ratio of the Split IO/Sec to the Disk Transfers/sec.
Another easy way to determine whether your disk I/O is slow is to look at the disk statistics in PerfMon that show the times to read, write, and transfer. We recommend a read time of less than 15 msec; anything greater indicates poor disk performance. In the example shown in figure 2, the disk read time is averaging 76 msec, which is not good.
Figure 2. Disk read times
Figure 3 shows an example of disk fragmentation statistics, in which the Disk Transfers/sec is averaging 59, and the Split IO/Sec is averaging 12. Thus the ratio showing fragmentation is about 20%, meaning that one out of every five reads to disk must do extra work to retrieve the data.
Figure 3. Disk fragmentation stats
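The arithmetic behind these two checks is simple enough to script against averages read from PerfMon. The sketch below uses the figures quoted above (12 split I/Os per second, 59 transfers per second, 76 msec average read time); the function names are hypothetical, and the 10% and 15 msec limits are the guideline values from this article rather than hard limits.

```python
from typing import List

SPLIT_IO_LIMIT = 0.10    # article guideline: split I/Os above 10% of total I/Os
READ_TIME_LIMIT_MS = 15  # article guideline: read time should be below 15 msec


def split_io_ratio(split_io_per_sec: float, disk_transfers_per_sec: float) -> float:
    """Return Split IO/Sec as a fraction of Disk Transfers/sec (0.0 if the disk is idle)."""
    if disk_transfers_per_sec <= 0:
        return 0.0
    return split_io_per_sec / disk_transfers_per_sec


def assess_disk(split_io_per_sec: float, disk_transfers_per_sec: float,
                avg_read_ms: float) -> List[str]:
    """Return a list of findings; an empty list means both guidelines are met."""
    findings = []
    ratio = split_io_ratio(split_io_per_sec, disk_transfers_per_sec)
    if ratio > SPLIT_IO_LIMIT:
        findings.append(f"split I/O ratio is {ratio:.0%}, above the 10% guideline")
    if avg_read_ms >= READ_TIME_LIMIT_MS:
        findings.append(f"average read time is {avg_read_ms:g} msec, not below 15 msec")
    return findings


# Averages taken from the examples in figures 2 and 3.
for finding in assess_disk(split_io_per_sec=12, disk_transfers_per_sec=59, avg_read_ms=76):
    print(finding)
```

With the example values, 12 / 59 is roughly 0.20, so both guidelines are exceeded.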
Here are some other tips to help avoid file fragmentation:
· You should have a minimum of 20–30% free space on the hard disk:
Normal Domino maintenance (the compact tool) attempts to reorganize data within databases for improved access. When you do a copy compact, the database is copied into a new file. If there is contiguous free space available, the newly compacted database is written out contiguously with little or no fragmentation.
· The virtual memory should be at a fixed value (see figure 4).
Figure 4. Fixed Memory example
· Administrators should check the hard disk regularly (at least once a month).
· Run the Microsoft Disk Defragmenter tool on a regular basis:
We suggest running the tool once a week, but you should monitor the files on the server; if any file reaches about 2,000 fragments, we suggest running the Microsoft Disk Defragmenter tool as soon as possible.
· If you cannot run the Microsoft Disk Defragmenter tool, move the files to another partition and then move them back to the initial partition. This procedure will reduce the file’s fragmentation.
· There are several third-party tools available to defragment disks, even while the application is running:
Note, however, that IBM does not support nor recommend any specific third-party software; the links above are just examples of disk defragmenter software.
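The free-space guideline in the tips above is easy to check from a script. The following is a minimal sketch using Python's standard shutil.disk_usage call; the function names are hypothetical, and using 20% (the lower end of the 20–30% range) as the default cutoff is an assumption.

```python
import shutil

MIN_FREE_FRACTION = 0.20  # lower end of the article's 20-30% free-space guideline


def free_fraction(path: str) -> float:
    """Return the fraction of the volume containing `path` that is currently free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def has_enough_free_space(path: str, minimum: float = MIN_FREE_FRACTION) -> bool:
    """True if the volume containing `path` meets the free-space guideline."""
    return free_fraction(path) >= minimum


# "." is used so the sketch runs anywhere; on a Domino server you would pass each
# partition in turn, for example "f:\\" (a hypothetical data drive).
fraction = free_fraction(".")
status = "OK" if has_enough_free_space(".") else "below the guideline"
print(f"{fraction:.0%} free: {status}")
```

Checking every partition matters here for the same reason as with fragment counts: a copy compact can only write the new database contiguously if the target volume has enough contiguous free space.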
How a disk becomes fragmented
Consider a scenario, excerpted from Wikipedia, in which an otherwise blank disk has five files, A, B, C, D and E, each using 10 blocks of space (here, a block is an allocation unit of that system; it could be 1K, 100K, or 1 MB and is not any specific size). On a blank disk, all these files will be allocated one after the other (see example 1 in figure 5).
Figure 5. Example fragmentation scenario
If file B is deleted, there are two options: Leave the space for B empty and use it again later, or compress all the files after B so that the empty space follows them. However, this could be time consuming if there are hundreds or thousands of files that must be moved, so in general the empty space is simply left there, marked in a table as available for later use, and then used again as needed (example 2 in figure 5).
Now, if a new file, F, is allocated seven blocks of space, it can be placed into the first seven blocks of the space formerly holding the file B, and the three blocks following it will remain available (example 3 in the figure).
If another new file, G, is added and needs only three blocks, it could then occupy the space after F and before C (example 4 in the figure). Now, if F subsequently must be expanded, since the space immediately following it is no longer available, there are two options:
Add a new block somewhere else and indicate that F has a second extent, or
Move the file F to someplace else where it can be created as one contiguous file of the new, larger size.
The latter option may not be possible, however, as the file may be larger than any one contiguous space available, or the file could be so large the operation would take an undesirably long time. Thus the usual practice is simply to create an extent somewhere else and chain the new extent onto the old one (example 5 in the figure).
Repeat this practice hundreds or thousands of times, and eventually the file system has many free segments in many places, and many files may be spread over many extents. If, as a result of free space fragmentation, a newly created file (or a file that has been extended) must be placed in a large number of extents, access time for that file (or for all files) may become excessively long.
The process of creating new files, and of deleting and expanding existing files, may sometimes be colloquially referred to as “churn,” and can occur at both the level of the general root file system and in subdirectories. Fragmentation not only occurs at the level of individual files, but also when different files in a directory (and maybe its subdirectories) that are often read in a sequence start to "drift apart" as a result of churn.
A defragmentation program must move files around within the free space available, to undo fragmentation. This is a memory-intensive operation and cannot be performed on a file system with no free space. The reorganization involved in defragmentation does not change the logical location of the files (defined as their location within the directory structure).
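The scenario above can be replayed with a toy allocator. This is a deliberately simplified sketch, not a model of how NTFS or any real file system allocates space: the Disk class is hypothetical, it always fills the lowest-numbered free blocks, and the 60-block disk size and the five-block growth of file F are assumptions, since the scenario does not give those numbers.

```python
from typing import List, Optional, Tuple


class Disk:
    """Toy disk: each block holds the name of the file that owns it, or None if free."""

    def __init__(self, size: int) -> None:
        self.blocks: List[Optional[str]] = [None] * size

    def allocate(self, name: str, count: int) -> None:
        """Give `count` blocks to `name`, using the lowest-numbered free blocks first."""
        free = [i for i, owner in enumerate(self.blocks) if owner is None]
        if len(free) < count:
            raise ValueError("not enough free space")
        for i in free[:count]:
            self.blocks[i] = name

    def delete(self, name: str) -> None:
        """Mark every block owned by `name` as free; nothing is moved."""
        self.blocks = [None if owner == name else owner for owner in self.blocks]

    def extents(self, name: str) -> List[Tuple[int, int]]:
        """Return the (start, length) runs of contiguous blocks owned by `name`."""
        runs = []
        start = None
        for i, owner in enumerate(self.blocks + [None]):  # trailing None closes the last run
            if owner == name and start is None:
                start = i
            elif owner != name and start is not None:
                runs.append((start, i - start))
                start = None
        return runs

    def layout(self) -> str:
        """One character per block; '.' marks a free block."""
        return "".join(owner or "." for owner in self.blocks)


disk = Disk(60)                      # assumed disk size
for name in "ABCDE":
    disk.allocate(name, 10)          # example 1: five files, 10 blocks each
disk.delete("B")                     # example 2: B's space is left empty
disk.allocate("F", 7)                # example 3: F reuses the first 7 blocks of the gap
disk.allocate("G", 3)                # example 4: G takes the remaining 3 blocks
disk.allocate("F", 5)                # example 5: F grows (5 blocks assumed) into a second extent

print(disk.layout())
print("F extents:", disk.extents("F"))
```

After the last step, F occupies two extents, one in B's old gap and one after E, which is exactly the chaining described in the scenario.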
Tips for Lotus Notes client workstations
To determine whether the Microsoft Windows Vista OS has fragmentation issues, follow these steps:
1. Select Start > All Programs > Accessories.
2. Right-click Command Prompt and choose “Run as administrator”. You need administrator privileges to do this.
3. Type the following to see how much your hard drive is fragmented (in this example, your C drive): defrag c: -a -v
4. Vista will display a “Percent file fragmentation” statistic and, at the bottom, a message indicating whether you need to defragment the drive.
5. To fully defragment your C drive, type the following: defrag c: -w
The options used in these commands are as follows:
· -a performs fragmentation analysis only
· -v specifies verbose mode, in which the defragmentation and analysis output is more detailed
· -w performs full defragmentation, attempting to consolidate all file fragments, regardless of their size
An example is shown in figure 6 below.
Figure 6. Example hard drive defrag
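If you need to run the same analysis on many workstations, the commands from the steps above can be wrapped in a small script. This is only a sketch: the function names are hypothetical, it uses just the options shown in this article (-a, -v, and -w on Windows Vista), and run_defrag must be called from an elevated session on Windows, so it is defined here but not invoked.

```python
import subprocess
from typing import List


def build_defrag_command(drive: str, analyze_only: bool = True) -> List[str]:
    """Build the defrag command line from the steps above for the given drive."""
    if analyze_only:
        return ["defrag", drive, "-a", "-v"]   # verbose analysis, no changes to the disk
    return ["defrag", drive, "-w"]             # full defragmentation


def run_defrag(drive: str, analyze_only: bool = True) -> str:
    """Run the command and return its text output (Windows only, needs admin rights)."""
    completed = subprocess.run(
        build_defrag_command(drive, analyze_only),
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.stdout


# Show the command that would be run for the C drive analysis.
print(" ".join(build_defrag_command("c:")))
```

Keeping the command construction separate from the execution makes it possible to review or log what will run before touching the disk.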
Before you start the troubleshooting process in Lotus Domino, you should determine whether the operating system needs any maintenance, because you cannot have good Domino server performance if the OS is not regularly maintained. In general, you should run the maintenance if you find any file with more than about 2,000 fragments on any partition.
About the author
Leonardo Caldas has been a Level 2 Support Engineer with IBM Lotus Support since November 2005. He works on the LATAM team in North America and has the following Domino/Notes 8 certifications: IBM Certified Advanced System Administrator - Lotus Domino 6.5.x/7.x/8.x and IBM Certified Application Developer - Lotus Domino 8.x.
Before beginning his career at IBM, he worked for an IBM partner in Brazil from 1997 to 2005, starting in Domino/Notes Support L1 and L2. In 2000 he became a manager for the Support team, rising to a technical director in 2004.