Best Practice Makes Perfect

A collaboration with Domino developers about how to do it and how to get it right in Domino

Getting back to the subject of performance, let's consider again an agent to process lots of documents in LotusScript (or Java).
The key to getting a well-performing script is to access as few documents as possible -- for read or write -- and if you have to access them, at least do so in the fast C (and C++) code of the Notes client or Domino server, rather than in your script.
There are three reasons to push document access into the C layer:

  • Because it's compiled into machine code, C code runs faster than LotusScript.
  • The view indexer makes use of a "summary" version of the document -- one where any large fields, such as rich text, are not loaded into memory. This causes less disk access and uses less memory.
  • Taking advantage of compilations of document information (view indices and the full-text index) lets you locate relevant documents without having to touch non-matching documents at all.
Just as you have the most slack if you take advantage of work previously done by yourself and others, so too will your script have a more relaxing run if you use as much pre-calculated information as you can. Once you create a NotesDocument object, you lose, because it pulls all the document information into memory. The rule of thumb is, your script will run fastest the fewer NotesDocument objects you create.
Let's start out, then, with an example of the worst possible way to do it:
     
' don't do this! This is how not to do it!
 1.  Dim session As New NotesSession
 2.  Dim db As NotesDatabase
 3.  Dim coll As NotesDocumentCollection
 4.  Dim lngPos As Long
 5.  Dim doc As NotesDocument
 6.  Set db = session.CurrentDatabase
 7.  Set coll = db.AllDocuments
 8.  For lngPos = 1 To coll.Count
 9.      Set doc = coll.GetNthDocument(lngPos)
10.      doc.SomeField = "Nevermind"
11.      Call doc.Save(True, False)
12.  Next

Before getting into the details of how, I like to think about whether the task is really worth doing. You can save much work by just not doing it. But this is just an example, so let's ignore for a moment that this script is a waste of time, and figure out how to waste time most efficiently.
The first problem is that this script touches every document in the database. On line 7, it uses AllDocuments, then scans that collection. So it loads into memory all the data in the whole database, when really you maybe just needed to update three documents (the other 200,000 already had the correct field value). Even worse, the script saves every document in the database, even those that didn't need changing. This not only makes the agent slower, but also makes everything else about the application slower. Every document you change causes extra work of view indexing, replication, and full-text indexing, as well as increasing the chance of save conflicts with users who were working on the documents at the time.




(Extra credit for you if you spotted the use of GetNthDocument, which is an easy way to get really poor performance. Use GetFirstDocument and GetNextDocument to iterate through a collection.)
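
As a minimal sketch, here is the same loop rewritten to walk the collection with GetFirstDocument and GetNextDocument, and to save only the documents whose value actually differs. It assumes SomeField is a text field. It still visits every document in the database, so it addresses only the iteration and the unnecessary saves, not the search itself.

```lotusscript
Sub Initialize
    Dim session As New NotesSession
    Dim db As NotesDatabase
    Dim coll As NotesDocumentCollection
    Dim doc As NotesDocument

    Set db = session.CurrentDatabase
    Set coll = db.AllDocuments

    ' Walk the collection sequentially instead of asking for the Nth document.
    Set doc = coll.GetFirstDocument()
    Do Until doc Is Nothing
        ' Assumption: SomeField holds text. Skip documents that are already correct,
        ' so they are not re-indexed or replicated needlessly.
        If doc.GetItemValue("SomeField")(0) <> "Nevermind" Then
            Call doc.ReplaceItemValue("SomeField", "Nevermind")
            Call doc.Save(True, False)
        End If
        Set doc = coll.GetNextDocument(doc)
    Loop
End Sub
```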

There are a few different ways to speed up the search, and target just the documents you need to change. I won't describe them all in detail now, but I plan to cover them in future entries.
  • Use the timestamp of the document. You can think of what's happening like this: the agent has an "unread list". NotesDatabase.UnprocessedDocuments is a collection of only the agent's unread documents. NotesSession.UpdateProcessedDoc marks a document as read. And of course, if it's edited by someone else, it becomes unread again. This is a very quick way to find documents of interest, provided you're only interested in modified ones.
  • Use a full-text index. If the database is full-text indexed, you can limit the selection of documents by a full-text search; for instance, in this example the search would be not ([Somefield] = "Nevermind"). This is done either in the Document Selection section of an agent, or using the FTSearch method. The drawbacks here are that (1) you don't always have a full-text index, and (2) not every search is possible; for instance, there is no test for exactly equal, so this example search would not return documents where Somefield contains "Nevermind" but also other information, e.g. "Nevermind, Sam".
  • Use a macro search. NotesDatabase.Search in LotusScript, or a SELECT formula in a macro agent, lets you identify documents of interest very precisely, albeit not very efficiently, since the entire document has to be loaded into memory to evaluate the selection formula on it. About the best that can be said for it is that because the iteration through the documents happens in C code rather than a LotusScript loop, it's not as slow as it might be.
  • Use a view. You find or create a view that contains just the documents you want (or that contains a column that's sorted so you can locate the documents by searching for a key). The view selection formula is also macro code, of course, but views have two big advantages over a macro search: first, the view index remembers which documents are in there from before, so it only has to consider documents modified since the view was last used. Second, the view indexer only uses the summary fields of the document, so the rich text of the document doesn't have to be loaded into memory.
  • Combined approaches. Often you can use a quick method (such as timestamp) to limit the number of documents you must consider, then another method (full-text) to narrow it down further, then finally iterate through the remaining documents to identify the ones of interest.
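
The approaches above can be sketched roughly as follows. This is an illustration under assumptions, not a recommended implementation: the field name and value come from the example, the view name "(NeedsNevermind)" is hypothetical (imagine its selection formula is SELECT SomeField != "Nevermind"), and the first Sub is only meaningful when run as an agent. Where possible, the sketch uses StampAll so that the update happens in the back end without the script creating NotesDocument objects.

```lotusscript
' 1. Timestamp: look only at documents this agent has not yet processed.
Sub ProcessUnread
    Dim session As New NotesSession
    Dim db As NotesDatabase
    Dim coll As NotesDocumentCollection
    Dim doc As NotesDocument

    Set db = session.CurrentDatabase
    Set coll = db.UnprocessedDocuments
    Set doc = coll.GetFirstDocument()
    Do Until doc Is Nothing
        ' Assumption: SomeField holds text.
        If doc.GetItemValue("SomeField")(0) <> "Nevermind" Then
            Call doc.ReplaceItemValue("SomeField", "Nevermind")
            Call doc.Save(True, False)
        End If
        Call session.UpdateProcessedDoc(doc)   ' mark as "read" for this agent
        Set doc = coll.GetNextDocument(doc)
    Loop
End Sub

' 2. Full-text index: let the index choose the candidates.
'    As noted above, this misses values such as "Nevermind, Sam".
Sub ProcessByFullText
    Dim session As New NotesSession
    Dim db As NotesDatabase
    Dim coll As NotesDocumentCollection

    Set db = session.CurrentDatabase
    If Not db.IsFTIndexed Then Exit Sub
    Set coll = db.FTSearch("not ([Somefield] = ""Nevermind"")", 0)
    If coll.Count > 0 Then Call coll.StampAll("SomeField", "Nevermind")
End Sub

' 3. Macro search: precise, but every document is evaluated (in C code).
Sub ProcessBySearch
    Dim session As New NotesSession
    Dim db As NotesDatabase
    Dim coll As NotesDocumentCollection

    Set db = session.CurrentDatabase
    Set coll = db.Search("SomeField != ""Nevermind""", Nothing, 0)
    If coll.Count > 0 Then Call coll.StampAll("SomeField", "Nevermind")
End Sub

' 4. View: reuse an index that already knows which documents qualify.
Sub ProcessByView
    Dim session As New NotesSession
    Dim db As NotesDatabase
    Dim view As NotesView
    Dim entries As NotesViewEntryCollection

    Set db = session.CurrentDatabase
    Set view = db.GetView("(NeedsNevermind)")   ' hypothetical view name
    If view Is Nothing Then Exit Sub
    Set entries = view.AllEntries
    If entries.Count > 0 Then Call entries.StampAll("SomeField", "Nevermind")
End Sub
```

Which of these is quickest depends on the data: the view and timestamp versions do the least work when only a few documents need changing, while the macro search evaluates every document each time it runs.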
Choosing the best-performing approach generally involves considering your data and thinking about how much work must be done, total.

Andre Guirard | 14 March 2007 10:00:06 AM ET | Caribou Coffee, Plymouth, MN, USA | Comments (4)


 Comments

1) Finding the Documents You Need
Chris Blatnick | 3/17/2007 3:22:45 PM

Andre...it's really great to see you blogging! I've learned many invaluable tips from your postings over the years and I expect that I will learn many more from this site. Thanks for everything!

Cheers,

Chris


2) Finding the Documents You Need
Charles Robinson | 3/17/2007 3:29:23 PM

Andre, I seem to recall that the UnprocessedDocuments collection gets reset every time you save an agent. Is that the way it really works, or did I mess something else up? :)

Since you're talking about performance, I wanted to ask about your use of coll.Count on Line 9. In some languages this would cause coll.Count to be looked up on every iteration. Does LotusScript do this also, or is it cached?

3) re: Finding the Documents You Need
Andre Guirard | 3/18/2007 9:03:22 AM

The limit of the For loop is calculated only once at the beginning, as you can see for yourself by running the following code:

Dim ind%, limit%
limit = 6
For ind = 1 To limit Step 2
    limit = limit + 1
Next
Print ind

The output is 7; if the limit expression were calculated once at the end of the loop, it would be 9; if it were recalculated every time through, it would be 13.

4) Finding the Documents You Need
Johannes Madsen | 4/9/2007 2:36:01 AM

The problem this agent is trying to solve is not unusual (although a bit simplified). The agent could e.g. be needed in a database which is already in operation and where a change of design has been made.

I have written hundreds of such agents, but not in the way you propose. In a case like this, I would always use a formula agent and specify that it should run on all documents in the database.

The formula would look like this:

SELECT SomeField != "Nevermind";
FIELD SomeField := "Nevermind";

Or better:

SELECT Form = "SomeForm" & SomeField != "Nevermind";
FIELD SomeField := "Nevermind";

This is a much simpler solution to your problem and is also much more efficient than anything you can code using LotusScript. It doesn't require full-text indexing or views.

I will always solve problems using formula, and only use LotusScript if the problem is very complicated.
