
Best Practice Makes Perfect

A collaboration with Domino developers about how to do it and how to get it right in Domino

I got a question in email recently that I thought should've been addressed in my performance whitepaper -- but it wasn't, so I figured it would be worth mentioning here.

The question: why does NotesView.FTSearch take so much longer than NotesDatabase.FTSearch to find the same set of documents?

What's going on: This excellent article describes the performance characteristics of different search techniques based on empirical testing.

The full-text index is based on documents and their contents only; it contains no information about views. A full-text search within a view is therefore a two-step process. NotesView.FTSearch must first run the search, which returns all the matching documents in the database. Then it has to check each matching document to see whether that document occurs in the view. The benchmark article shows that the time this takes depends on how far down in the view the document appears, so it's apparently just iterating through the view entries looking for one that matches the note ID of each document in the result set.
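
To make that position dependence concrete, here is a small Python model of the front-to-back scan described above. It is only a sketch of the behavior the benchmark suggests, not the actual Notes implementation; `scan_cost` and the note IDs are made up for illustration.

```python
def scan_cost(view_note_ids, note_id):
    """Model a front-to-back scan of a view for one search hit.

    Returns (found, comparisons). This is an assumed, simple-minded
    algorithm for illustration, not the real Notes code.
    """
    for comparisons, candidate in enumerate(view_note_ids, start=1):
        if candidate == note_id:
            return True, comparisons
    # Not in the view: every entry had to be checked.
    return False, len(view_note_ids)


# Ten hypothetical note IDs, listed in view order.
view = list(range(1000, 1010))

print(scan_cost(view, 1000))  # hit at the top of the view: (True, 1)
print(scan_cost(view, 1009))  # hit at the bottom of the view: (True, 10)
print(scan_cost(view, 4242))  # hit that is not in the view: (False, 10)
```

In this model, a hit near the top of the view is cheap, a hit near the bottom costs a full pass, and a hit that is not in the view at all always costs a full pass.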

Obviously, NotesDatabase.FTSearch doesn't do the second step of testing whether the documents appear in a particular view. So, you get your results faster, even if they might contain some documents you don't want.

If you decide to search the database instead of the view, you might need a more complicated FTSearch expression to limit the results to those you would have gotten from searching the view, and this might take longer (or might not even be possible -- the search expression can't do everything you can do with a formula). So performance-wise, you might still be better off searching the view, but searching the database is worth considering.

If the view you would've liked to search contains nearly all the documents in the database, and if you were going to iterate through the results anyway, you might search the database with your original search expression, and just add a test in your code to skip over any documents that you would've wanted excluded. For example: If doc.Form(0) = "FormIWanted" Then (process the document) End If

A fourth alternative may be useful in some situations: because we have set operations on document collections, if the view you wanted to search contains all the documents except those in some other view (vwKeywords, say), you could first search the database, then get the collection of documents from vwKeywords, and use the Subtract method to remove them from the search results. The nice thing about this is that the iteration takes place in the fast C code, and you don't have to "crack open" a note to find out that you didn't want it. It seems to me this might be faster than a view.FTSearch in cases where the result set and the set of documents to be excluded are both much smaller than the set of documents in the view you wanted to search. Let's do the math for a sample case to see how this might work:

  • The view you wanted to search contains a million documents.
  • On average, you think 300 documents in the database might match your search.
  • 295 of those are likely to actually be in the view.
  • Your "inverse" view, containing all the documents that are not in the view you wanted to search, contains 4000 documents.
  • NotesView.FTSearch must compare 295 results against an average of 500,000 view entries each to determine that they are in the view, and 5 results against 1,000,000 view entries each to determine that they are not in the view, a total of (295*500000)+(5*1000000) = 152,500,000 comparisons.
  • If you use NotesView.AllEntries to get the collection of 4000 entries in the inverse view and Subtract those from your NotesDatabase.FTSearch results, Subtract must compare 295 results against 4000 view entries each to determine they don't need to be removed, and 5 results against an average of 2000 entries each to determine that they do need to be removed (this is assuming a very simple-minded algorithm, which is usually the safest assumption :-) ). That's (295*4000)+(5*2000) = 1,190,000 comparisons, a much more civilized number.
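
If you want to plug in your own numbers, the arithmetic above can be wrapped in a couple of small Python functions. This is only a sketch of the same back-of-the-envelope model (a linear scan that finds a match after half the entries on average and scans everything for a miss); the function names are invented here, and the real Notes internals may well behave differently.

```python
def view_ftsearch_comparisons(hits_in_view, hits_not_in_view, view_size):
    """Estimated comparisons for NotesView.FTSearch under the simple model.

    A hit that is in the view is found after scanning half the view on
    average; a hit that is not in the view forces a scan of the whole view.
    """
    return hits_in_view * (view_size // 2) + hits_not_in_view * view_size


def subtract_comparisons(hits_kept, hits_removed, inverse_view_size):
    """Estimated comparisons for database search plus Subtract.

    A hit that is kept is checked against every inverse-view entry; a hit
    that is removed is found after half the inverse view on average.
    """
    return hits_kept * inverse_view_size + hits_removed * (inverse_view_size // 2)


view_cost = view_ftsearch_comparisons(295, 5, 1_000_000)
subtract_cost = subtract_comparisons(295, 5, 4_000)

print(f"{view_cost:,}")                  # 152,500,000
print(f"{subtract_cost:,}")              # 1,190,000
print(round(view_cost / subtract_cost))  # 128
```

Under these assumptions the Subtract approach does roughly 128 times fewer comparisons in the sample case. The ratio shrinks quickly as the inverse view grows, so it is worth rerunning with figures from your own database.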
So to summarize, there are four alternatives, each of which may be the best performer in different situations.
  • Use NotesView.FTSearch("your search expression")
  • Use NotesDatabase.FTSearch("(your search expression) and (additional expression that duplicates view selection formula)")
  • Use NotesDatabase.FTSearch("your search expression") and then use a test in your code to skip over unwanted results.
  • Use NotesDatabase.FTSearch("your search expression") and then if there's a view that contains all the documents you don't want in your results, get all the entries in that view and subtract them from your collection.
If the performance of your search is an issue, you might try each of these to see which gives the best results. Also bear in mind that the size of the FTSearch result set is limited (generally to 5000 documents), which might rule out some of these techniques in your case.

And of course, don't forget to consider alternatives to FTSearch. If a view search (NotesView.GetAllDocumentsByKey) can get you the documents you want, that's generally the fastest way.

Andre Guirard | 8 December 2009 10:45:14 AM ET | Caribou Coffee, Minnetonka, MN, US | Comments (5)


 Comments

1) Thank you - Andre Guirard
Palmi | 12/8/2009 1:13:16 PM

Very well explained - thanks

2) Thanks!
Stein | 12/9/2009 6:28:14 AM

Thank you!

3) *Sigh*...
Erik Brooks | 3/2/2010 12:09:15 AM

Reading this makes me cry and long for a ?SearchViewEntries command. Again. :-(

4) FYI: Readers fields
Sjef Bosman | 6/21/2011 10:07:42 AM

If you have documents with Readers fields, NotesDatabase.FTSearch doesn't respect the rights and will find all documents, whereas NotesView.FTSearch correctly filters out unreadable documents.
