Best Practice Makes Perfect

A collaboration with Domino developers about how to do it and how to get it right in Domino

In response to one of my earlier entries, Johannes Madsen comments (edited for length):

The problem this agent is trying to solve is not unusual (although a bit simplified). The agent could e.g. be needed in a database which is already in operation and where a change of design has been made.
.... In a case like this, I would always use a formula agent and specify that it should run on all documents in the database.
The formula:
SELECT Form = "SomeForm" & SomeField != "Nevermind";
FIELD SomeField := "Nevermind";

This is a much simpler solution to your problem and is also much more efficient than anything you can code using LotusScript. It doesn't require fulltext indexing or views. I will always solve problems using formula, and only use LotusScript if the problem is very complicated.

For a one-time change of design, where you just need a throwaway agent to run once, I agree that simpler is better. Computer time is far less expensive than developer time, and for such an agent, generally performance is not a real concern. And as I've said, I'm a big fan of macro language; it lets you do a lot with a few lines.

However, while the above agent is more efficient in terms of time to develop, I can't agree that it's the fastest to execute. For an agent that needs to run regularly, and if performance is important, it's better to use a full-text index or a view to find your documents.

CPU cycles are cheap, so performance becomes a problem only when there's a user twiddling her thumbs until the agent finishes, or when there's a limited time during which the task must be completed (especially if it's late at night and there are many agents running then -- you don't want to hog the processor). Let's assume we have such a situation.

The agent selection SELECT Form = "SomeForm" & SomeField != "Nevermind" iterates through every document in the database, loads the entire document into memory, and evaluates the expression to decide whether to continue executing statements of the macro for that document. This work must be done while the agent is running, and once the agent finishes, the information about which documents met the selection criteria is discarded -- next time the agent runs, it all has to be done over.

[Image: anthropomorphic animal cartoon]

Let's consider the full-text index. It's simple in concept and the benefits are obvious. End users find it convenient to quickly locate the documents they want, so you generally want to have one anyway for their sake. If it exists, it's an extremely efficient way to locate a set of documents that contain specific values in specific fields. The work of arranging all this information has already been done. Why, why, why would you not take advantage of it? Make the same work (of indexing the documents) serve multiple purposes. This is very much in keeping with the philosophy of slack. And it's really easy to do. Even for Johannes' example, you greatly enhance performance by taking out the SELECT statement and putting the following into the Document Selection pane:

([Form] = "SomeForm") and not ([SomeField] = "Nevermind")

That was not so hard! It's still just two lines of code, and rather than cracking open every note in the database to look inside, you only have to open the notes that match your selection criteria.
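
If you'd rather do the same thing from LotusScript, here is a minimal sketch, assuming the database already has a full-text index and reusing the placeholder form and field names from Johannes' example. It asks the index for the documents whose SomeField is not yet "Nevermind" (the same documents the original SELECT statement picks) and then stamps the new value on all of them in one call. The Print messages are only there for illustration.

```lotusscript
Sub Initialize
    Dim session As New NotesSession
    Dim db As NotesDatabase
    Dim coll As NotesDocumentCollection
    Dim query As String

    Set db = session.CurrentDatabase

    ' The whole point is to reuse an existing index, so stop if there is none.
    If Not db.IsFTIndexed Then
        Print "Database is not full-text indexed; nothing done."
        Exit Sub
    End If

    ' Braces delimit the string so the double quotes need no escaping.
    ' "SomeForm", "SomeField" and "Nevermind" are the example's placeholder names.
    query = {([Form] = "SomeForm") AND NOT ([SomeField] = "Nevermind")}

    ' Second argument 0 = do not ask for a smaller cap on the number of results.
    Set coll = db.FTSearch(query, 0)

    If coll.Count > 0 Then
        ' Write the new value to every document found, without opening each one in code.
        Call coll.StampAll("SomeField", "Nevermind")
    End If

    Print "Documents updated: " & CStr(coll.Count)
End Sub
```

Keep in mind that a search like this can only find what the index knows about, so documents changed since the last index update may be missed until the index catches up.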

Of course, full text indexing has its limitations. The size of the result set is limited, and the query language is less precise (can't search by time of day, no distinction between "field contains value" and "field exactly equals value"). How can a view help?

Ideally, you would use a view that either has the identical selection formula, or uses part of the selection formula and is sorted by the other fields of interest (for instance, the view selection formula is SELECT Form = "SomeForm" and the first sorted column is SomeField). There are many reasons a view is nice for finding documents for your agent to process:

  • The index is maintained on disk, so only those documents that were created or modified since the last time the view was used must be examined at the time the agent runs. Rather than discarding all the work you did to figure out what documents are in the view and in what order, results are stored and can be used by multiple executions (or multiple agents). How important this factor is depends on how often documents are modified -- but as I have written before, if you modify documents too often, that's going to cause all sorts of performance problems and you should think about whether all those modifications are necessary.
  • Like the duck in the cartoon, view indexing isn't interested in the entire content of the document. It uses just the summary fields (in most cases that means everything except rich text). In applications that use rich text at all, the rich text tends to be more than half of the document, so you save a considerable amount of time by not having to read it in. So, depending on how much rich text is in the documents and how many columns are in the view, the view selection formula may be faster than the same SELECT statement used in an agent, even if it is creating the view index from scratch (which is unusual).
  • Suppose the view selection formula is not exactly the same as the agent selection criteria. This might happen if the view is not dedicated to one agent, or when the selection criteria are not the same for every execution (e.g. if you let the user select the value of SomeField that they wanted to look for). You can still narrow down the selection to the exact documents you need by searching sorted columns within the view, which is very fast. Again, you can find the documents you need without having to open any documents that you don't need.
  • Unless the view is dedicated to one agent, it's probably used for other purposes, so chances are the number of documents that need to be reindexed is less than the number of documents modified since the last run of the agent.
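
To make the sorted-column case concrete, here is a rough LotusScript sketch, not a definitive implementation. It assumes a hypothetical hidden view named "(SomeFormBySomeField)" whose selection formula is SELECT Form = "SomeForm" and whose first column is sorted on SomeField. The key value "Pending" is likewise made up, standing in for whatever value of SomeField the user asked for.

```lotusscript
Sub Initialize
    Dim session As New NotesSession
    Dim db As NotesDatabase
    Dim view As NotesView
    Dim coll As NotesDocumentCollection

    Set db = session.CurrentDatabase

    ' Hypothetical view: selects Form = "SomeForm", first column sorted on SomeField.
    Set view = db.GetView("(SomeFormBySomeField)")
    If view Is Nothing Then
        Print "Lookup view not found; nothing done."
        Exit Sub
    End If

    ' Find only the documents whose first sorted column exactly matches the key.
    ' "Pending" is an illustrative value, not something from the original example.
    Set coll = view.GetAllDocumentsByKey("Pending", True)

    If coll.Count > 0 Then
        ' Update every matching document in one call.
        Call coll.StampAll("SomeField", "Nevermind")
    End If

    Print "Documents updated: " & CStr(coll.Count)
End Sub
```

A key lookup finds documents that equal a value, so for a "not equal" test like the one in Johannes' example, the simpler route is probably a view whose selection formula matches the agent's criteria exactly, and then process everything in the view.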

Andre Guirard | 9 April 2007 06:05:00 PM ET | Plymouth, MN, USA | Comments (4)
