This article explains how various searches work in Profiles.
Related information, and the topics cited in this article, can be found in the Connections 4 Product Documentation section of this wiki.
I. Searchable content
All Profiles content is searchable, regardless of whether the data content is entered by the end user or obtained from external data sources such as LDAP. This includes the following:
1). All data for user profiles, such as name, phone numbers, title, departments, city, and so on. Refer to "Profiles attributes" help for a full list of available attributes.
2). All extension attributes, including simple, rich text and XML extension attribute types
For XML attributes, the searchable content can be defined and configured in the profiles-config.xml file. Refer to "Configure XML attributes" help for more details.
The names of the links added in the Profiles Linkroll widget are searchable by default, but the URLs of the links are not searchable.
3). Tags for users
All profile search queries are case-insensitive: user queries are converted to lower case before they are run against the Profiles content.
A search containing only wildcard characters is not supported. Entering just the '*' or '%' wildcard characters returns empty search results.
See the help topics "Customizing Profiles search" and "Managing the Profiles search operation" to learn which Profiles fields can be searched.
II. Two different search implementations internally
1. Database search:
The most common search in Profiles is a search for users by name. To keep name searches fast and dynamic, we use a special database schema and internal logic optimized for names. We refer to this type of search as database search.
The following searches from the UI use the database search:
a). Search by Name from the search drop-down menu
b). Name type-ahead search from the Directory page without the Full search options
c). Clicking on a tag from a user's Profile page
d). Tag type-ahead, such as the tag hints shown when a user is creating a new tag
Some special notes for database searches:
a). Because the database search is performed on the Profiles database directly, there is no delay for any new/updated/deleted data. For example, if a user's name is updated by the TDI scripts, then searching against the new name should find the user immediately. In the meantime, searching against the old name would not find the user.
b). It is important to note that for database searches, all user queries are appended with a wild card at the end of each word. So if a user types in a search term like: "Amy Jones", we would be searching for names like: "amy% jones%".
c). Inactive users are not included in database search results. To find inactive users, one has to use the index searches and specifically select to include inactive users. See the following "Index search" information for details.
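The lower-casing and trailing wild card described in note b) can be sketched as follows. This is a minimal illustration in Python, not the product's code: the function name to_db_pattern is hypothetical, and '%' is assumed to be the SQL LIKE wild card, as in the "amy% jones%" example above.

```python
def to_db_pattern(user_query: str) -> str:
    """Lower-case the query and append a trailing '%' to each word.

    Illustrative only; the real Profiles implementation is not shown in this article.
    """
    words = user_query.lower().split()
    return " ".join(word + "%" for word in words)


print(to_db_pattern("Amy Jones"))  # amy% jones%
```

Because every word becomes a prefix pattern, a query such as "Am Jon" would also match "Amy Jones" in a database search.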
2. Index search:
We build indexes for all profile content to support advanced searches. The Profiles indexes are also used for social data analytics. We refer to searches that use the indexes as index search.
The following searches from the UI use the index search:
a). Search by Keyword from the search drop-down menu
b). Directory search with full options (Advanced search)
c). Global/Common search for Profiles when selecting All Connections search drop-down menu
d). Organization tag cloud on the Profiles Directory page
e). Clicking a tag from the Organization tag cloud
Special notes on index searches:
a). The indexes are rebuilt on a schedule (every 15 minutes by default), so expect a delay before new or changed content can be found using index search.
b). It is important to note that, other than the special name-related fields described in "Logic behind index searches" below, all user queries are expected to be an exact match. That is, we do not automatically append wild cards to the search queries as is done in the database searches.
c). Inactive users are not included in the search results for index searches. From the UI with Full search options, one has to check the Include inactive users checkbox to include inactive users in the search results. There is no support to search only inactive users.
3. Search APIs:
All profile contents can be searched by profile search APIs. Refer to the Profiles Search APIs section for more details.
When the name parameter is used in the search APIs, the database search is used; otherwise, the index search is used for the search APIs.
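The routing rule above can be expressed as a small sketch. The function name and the dictionary representation of the request parameters are assumptions made for illustration; only the name parameter comes from this article, and the city key in the usage line is a placeholder.

```python
def choose_search_type(params: dict) -> str:
    """Return which internal search a search API request would use.

    Rule from the article: if the 'name' parameter is present, the database
    search is used; otherwise the index search is used.
    """
    return "database" if "name" in params else "index"


print(choose_search_type({"name": "amy jones"}))  # database
print(choose_search_type({"city": "boston"}))     # index ('city' is a placeholder key)
```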
4. Organization tag cloud
The organization tag cloud on the Profiles Directory page consists of the 50 most popular tags in the entire organization, based on the frequencies of tags. The number of tags in the organization tag cloud is not configurable. The organization tag cloud is built from the indexes, so there will be delays in displaying new popular tags. The organization tag cloud data is cached internally in the application. A delay of up to 30 minutes is not unusual for updating the display of new popular tags in the organization tag cloud.
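As a rough illustration of "the 50 most popular tags, based on the frequencies of tags", the following Python sketch counts tag occurrences and keeps the top 50. It is not the product's implementation: the input shape (one list of tags per user), the function name, and the tie-breaking order are all assumptions.

```python
from collections import Counter

TAG_CLOUD_SIZE = 50  # per the article, this number is not configurable


def organization_tag_cloud(tags_per_user):
    """Return up to 50 (tag, count) pairs, most frequent first."""
    counts = Counter()
    for tags in tags_per_user:
        counts.update(tags)  # count every occurrence of every tag
    return counts.most_common(TAG_CLOUD_SIZE)


sample = [["java", "lotus"], ["java"], ["sales", "java"]]
print(organization_tag_cloud(sample)[0])  # ('java', 3)
```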
III. Logic behind name search
Profiles name data holds only users' last names and first names. There are no specific data fields for middle names; therefore, there is no way to search for users by middle name. Because parsing the first name and last name out of user input is complicated, especially for international names, we parse user queries in various forms, as follows:
1. Search query without a comma
The search queries are tokenized into words using a space. Depending on the words, we try to account for all variations of first and last names to find the matches. Some examples are as follows:
a). For user input of a single word, such as 'david', we would search for users whose:
i). first name starts with david or
ii). last name is david
b). For user input with two words, such as 'david jones', we would search for users whose:
i). first name is: david and last name is: jones or
ii). first name is: david jones or
iii). last name is: david jones
c). For user input with three words, such as 'david alex jones', we would search for users whose:
i). first name is: david and last name is alex jones or
ii). first name is: david alex and last name is jones or
iii). first name is: david alex jones or
iv). last name is: david alex jones
Search queries with more words follow similar logic, going through the variations of first names and last names.
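The variations listed in the examples above can be generated mechanically. The following Python sketch reproduces those lists for queries without a comma; it is an illustration of the described logic only, with a hypothetical function name, and it does not model the trailing wild card or name aliases.

```python
from typing import List, Optional, Tuple

# (first name, last name); None means that part is not constrained
NameVariation = Tuple[Optional[str], Optional[str]]


def name_variations(user_query: str) -> List[NameVariation]:
    """List the first/last name combinations tried for a comma-free query."""
    words = user_query.lower().split()
    if not words:
        return []
    variations: List[NameVariation] = []
    # Split the words into a first-name part and a last-name part at every position.
    for i in range(1, len(words)):
        variations.append((" ".join(words[:i]), " ".join(words[i:])))
    whole = " ".join(words)
    variations.append((whole, None))  # the whole query as a first name
    variations.append((None, whole))  # the whole query as a last name
    return variations


for first, last in name_variations("david alex jones"):
    print("first name:", first, "| last name:", last)
```

For 'david alex jones' this yields the four combinations shown in example c), in the same order.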
2. Search query with a comma
The query before the comma is treated as a user's last name. The query after the comma is treated as the user's first name. There is no further parsing for additional variations of first and last names.
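A minimal sketch of the comma rule, assuming the query is split at the first comma and surrounding spaces are ignored (neither detail is stated in this article). The function name is hypothetical.

```python
from typing import Tuple


def parse_comma_query(user_query: str) -> Tuple[str, str]:
    """Return (first name, last name) for a 'last, first' style query."""
    # Text before the comma is the last name; text after it is the first name.
    last, _, first = user_query.lower().partition(",")
    return first.strip(), last.strip()


print(parse_comma_query("Jones, Amy"))  # ('amy', 'jones')
```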
3. Data on the names table
Internally, database searches for names are performed against two tables: GIVEN_NAME and SURNAME. These tables hold names and their aliases. The data in these tables is used for search purposes only; it is not exposed in any part of the UI. So it is important to note that database searches will not find matches that reside only in the EMPLOYEE table. For example, the Display Name from the EMPLOYEE table is not searched during a database search, even though Display Name is what you see in the UI for users.
Data in these two name tables should be expressed in lower case. During data population using TDI or Admin APIs, the values are converted to lower case for these tables. It is not recommended to directly insert or update content in these tables. If using the tables for testing purposes, make sure that the content entered is in lower case. User queries are converted to lower case when they are matched against the values in the name tables.
Name aliases are created through an internal program for most of the common English first names. These name aliases are provided in the product in a binary file; there is no support for viewing or editing that file.
IV. Logic behind index searches
1. Special logic is associated with some special fields in the search form with Full options.
a). Display name field
Using the same logic as in the database search for names, the user input in the Display name field is broken down into all variations of first and last names. The big difference in the index search is that matches are also found in the following additional data fields on the EMPLOYEE table: displayName, surname, preferredFirstName, preferredLastName, alternativeLastName, nativeFirstName and nativeLastName.
Thus, using the same user input in the Display Name field, the index search returns more results than the database search.
A sample query string constructed for the index search with the user input as Amy Jones is as follows:
((((FIELD_PREFERRED_FIRST_NAME:"amy jones*" OR FIELD_NATIVE_FIRST_NAME:"amy jones*" OR FIELD_GIVEN_NAME:"amy jones*") OR (FIELD_PREFERRED_FIRST_NAME:amy* OR FIELD_NATIVE_FIRST_NAME:amy* OR FIELD_GIVEN_NAME:amy*)) AND ((FIELD_PREFERRED_LAST_NAME:"amy jones*" OR FIELD_ALTERNATE_LAST_NAME:"amy jones*" OR FIELD_NATIVE_LAST_NAME:"amy jones*" OR FIELD_SURNAME:"amy jones*") OR (FIELD_PREFERRED_LAST_NAME:jones* OR FIELD_ALTERNATE_LAST_NAME:jones* OR FIELD_NATIVE_LAST_NAME:jones* OR FIELD_SURNAME:jones*))) OR FIELD_DISPLAY_NAME:"amy jones*")
b). First name field
The query string is used to match the preferredFirstName, nativeFirstName and givenName data fields.
A sample query string constructed for the index search with the user input as Amy is as follows:
(FIELD_PREFERRED_FIRST_NAME:amy OR FIELD_NATIVE_FIRST_NAME:amy OR FIELD_GIVEN_NAME:amy)
c). Last name field
The query string is used to match the preferredLastName, alternativeLastName, nativeLastName and surname data fields.
A sample query string constructed for the index search with the user input as Jones is as follows:
(FIELD_PREFERRED_LAST_NAME:Jones OR FIELD_ALTERNATE_LAST_NAME:Jones OR FIELD_NATIVE_LAST_NAME:Jones OR FIELD_SURNAME:Jones)
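The sample query strings shown above for the Display name, First name and Last name fields follow a regular pattern. The Python sketch below rebuilds them from the index field names that appear in those samples. It is an illustration only: the function names are hypothetical, the Display name builder handles exactly two words (the only case sampled in this article), and case handling for the First name and Last name fields is left out.

```python
FIRST_FIELDS = ["FIELD_PREFERRED_FIRST_NAME", "FIELD_NATIVE_FIRST_NAME", "FIELD_GIVEN_NAME"]
LAST_FIELDS = ["FIELD_PREFERRED_LAST_NAME", "FIELD_ALTERNATE_LAST_NAME",
               "FIELD_NATIVE_LAST_NAME", "FIELD_SURNAME"]


def any_of(fields, term):
    """OR the same term across several index fields."""
    return "(" + " OR ".join(f"{field}:{term}" for field in fields) + ")"


def first_name_query(term):
    return any_of(FIRST_FIELDS, term)


def last_name_query(term):
    return any_of(LAST_FIELDS, term)


def display_name_query(user_input):
    """Rebuild the two-word Display name sample, e.g. for 'Amy Jones'."""
    words = user_input.lower().split()
    if len(words) != 2:
        raise ValueError("this sketch only covers two-word input")
    first, last = words
    phrase = f'"{first} {last}*"'
    first_part = "(" + any_of(FIRST_FIELDS, phrase) + " OR " + any_of(FIRST_FIELDS, first + "*") + ")"
    last_part = "(" + any_of(LAST_FIELDS, phrase) + " OR " + any_of(LAST_FIELDS, last + "*") + ")"
    return "((" + first_part + " AND " + last_part + ") OR FIELD_DISPLAY_NAME:" + phrase + ")"


print(first_name_query("amy"))
print(display_name_query("Amy Jones"))
```

Note that only the Display name query carries the trailing '*'; the First name and Last name samples are exact matches.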
d). Phone number field
Several format variations are available for what users can enter for phone number searching:
i). A phone number query can be entered with or without '-' or '.'. For example, user input such as 1234567890 or 123.456.7890 finds users with the phone number record 123-456-7890.
ii). A phone number query can be entered with a mixture of letters. For example, user input such as 1-800-IBM-HELP finds users with the phone number record 1-800-426-4357.
iii). A phone number query can omit the leading 1. For example, user input such as 123-456-7890 finds users with the phone number record 1-123-456-7890. Note that the leading international phone prefix 011 must be specified to match users with phone number records that contain the leading 011 characters.
iv). A phone number entered in the search UI form with the Full search options would search all available phone number fields, such as mobileNumber, ipTelephoneNumber, and faxNumber.
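One way to picture rules i) to iii) is as a normalization step followed by a comparison. The sketch below is an assumption about how such matching could work, not the product's algorithm: it maps letters to digits using the standard telephone keypad, drops every other non-digit character (the article only mentions '-' and '.'), and treats a record as a match if it equals the query or the query with a leading 1. The function names are hypothetical.

```python
KEYPAD = {
    "ABC": "2", "DEF": "3", "GHI": "4", "JKL": "5",
    "MNO": "6", "PQRS": "7", "TUV": "8", "WXYZ": "9",
}
LETTER_TO_DIGIT = {letter: digit for letters, digit in KEYPAD.items() for letter in letters}


def normalize_phone(value: str) -> str:
    """Keep digits, map keypad letters to digits, drop everything else."""
    digits = []
    for ch in value.upper():
        if ch in "0123456789":
            digits.append(ch)
        elif ch in LETTER_TO_DIGIT:
            digits.append(LETTER_TO_DIGIT[ch])
        # separators such as '-' and '.' fall through and are dropped
    return "".join(digits)


def phone_matches(query: str, record: str) -> bool:
    """True if the record equals the query, with or without a leading 1."""
    q, r = normalize_phone(query), normalize_phone(record)
    return bool(q) and (q == r or "1" + q == r)


print(normalize_phone("1-800-IBM-HELP"))                  # 18004264357
print(phone_matches("123-456-7890", "1-123-456-7890"))    # True
print(phone_matches("123-456-7890", "011-123-456-7890"))  # False: 011 must be typed
```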
2. Search by keywords
Users can enter keywords using "Search by keyword" from the search drop-down menu, or in the "Keyword" field on the Directory search page with the Full search options. The keyword search field is the only field in which search operators can be used to compose complex search queries. That means that special terms and characters are reserved for search operators. For example, terms like AND, OR, NOT, and characters like +, -, ~, :, etc. are reserved. To search for those special terms and characters literally, rather than as search operators, enclose them in double quotes. Refer to external Lucene syntax sites and documentation for details about using search operators.
Note that the "Search by keyword" function is performed against all indexed contents.
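A client that passes user text to the keyword field may want to quote terms that would otherwise be read as operators. The following Python sketch shows the idea using only the reserved terms and characters named above; the list in this article ends with "etc.", so it is not complete, and the function name is hypothetical. The sketch also assumes the term contains no double quotes or spaces.

```python
RESERVED_WORDS = {"AND", "OR", "NOT"}  # upper-case operator terms named in the article
RESERVED_CHARS = set("+-~:")           # partial list; see the Lucene syntax documentation


def quote_literal(term: str) -> str:
    """Wrap a term in double quotes if it would be read as a search operator."""
    needs_quotes = term in RESERVED_WORDS or any(ch in RESERVED_CHARS for ch in term)
    return f'"{term}"' if needs_quotes else term


print(quote_literal("sales"))   # sales
print(quote_literal("e-mail"))  # "e-mail"
print(quote_literal("OR"))      # "OR"
```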
3. The logical operation among the fields entered in the Directory search with full options form is AND.
V. Searchable fields from the search UI and API
The following fields are indexed and made searchable using either the UI or API. The values on the right are the actual field values in the indexes.
For extension attributes, use the expression defined in the profiles-config.xml file. For example, if you defined some extension attributes in profiles-config.xml as shown in the following sample:
Then you can search using the field names as defined in the profiles-config.xml and using the following format:
VI. Consideration for Accent Characters
Names in various international locales may contain accented characters. It can be desirable to let users search for such names without having to type the accents. A common approach is to support both searches with the exact accents and searches completely without accents. To support such searches, a version of the names without the accented characters must be available in the name tables. Special TDI scripts can be used to strip the accents from the names during initial population or subsequent updates of the names from the data sources.
There is no solution that supports input with a mixture of accented and unaccented characters. Users must therefore either enter a search query exactly as the names are expressed (with the correct accent characters) or enter a search query completely without accents.
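The accent-stripping step can be illustrated in a few lines. This is a Python sketch of the transformation only, not a TDI script: it uses Unicode decomposition to drop combining marks, and the function names are hypothetical. Letters that have no decomposed form (for example 'ø') are left unchanged by this approach.

```python
import unicodedata


def strip_accents(name: str) -> str:
    """Remove combining accent marks, e.g. 'José' becomes 'Jose'."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def searchable_variants(name: str) -> set:
    """The two lower-case forms that would need to exist in the name tables."""
    lowered = name.lower()
    return {lowered, strip_accents(lowered)}


variants = searchable_variants("José Müller")
print(sorted(variants))
print("jose müller" in variants)  # False: a mixed query matches neither stored form
```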
VII. Available search configurations
1. Search results configuration
Search results are displayed using a template. See information about configuring search results in the Managing the Profiles search operation documentation topic.
2. Search UI form configuration
The form for Directory search with full options is configurable. See information about configuring search in the UI in the "Customizing Profiles search" documentation topic.
3. General search configuration
i). Search on first name
ii). Default for search results display order
VIII. Troubleshooting
1. Typical questions and problems. What to check
a). Why is search returning users whose names don't seem to match the name search query?
When performing searches by name, we are searching against the GIVEN_NAME and SURNAME tables, as described above.
You can either run some simple SQL to check whether the searched names are in these tables or you can run the following elaborate SQL to see whether there are any hits when searching a name such as Amy Jones:
with key_list as (
  (
    (select PROF_KEY from
       (SELECT PROF_KEY FROM EMPINST.SURNAME
        WHERE (EMPINST.SURNAME.PROF_SURNAME LIKE 'jones%' escape '*') AND PROF_USRSTATE = 0) SN
     where PROF_KEY in
       (SELECT PROF_KEY FROM EMPINST.GIVEN_NAME
        WHERE (EMPINST.GIVEN_NAME.PROF_GIVENNAME LIKE 'amy%' escape '*') AND PROF_USRSTATE = 0))
    UNION ALL
    (select PROF_KEY from
       (SELECT PROF_KEY FROM EMPINST.SURNAME
        WHERE (EMPINST.SURNAME.PROF_SURNAME LIKE 'amy%' escape '*') AND PROF_USRSTATE = 0) SN
     where PROF_KEY in
       (SELECT PROF_KEY FROM EMPINST.GIVEN_NAME
        WHERE (EMPINST.GIVEN_NAME.PROF_GIVENNAME LIKE 'jones%' escape '*') AND PROF_USRSTATE = 0))
    UNION ALL
    (SELECT PROF_KEY FROM EMPINST.SURNAME
     WHERE EMPINST.SURNAME.PROF_SURNAME LIKE 'amy jones%' escape '*' AND PROF_USRSTATE = 0)
    UNION ALL
    (SELECT PROF_KEY FROM EMPINST.GIVEN_NAME
     WHERE EMPINST.GIVEN_NAME.PROF_GIVENNAME LIKE 'amy jones%' escape '*' AND PROF_USRSTATE = 0)
  )
)
select distinct key_list.PROF_KEY from key_list
fetch first 250 rows only optimize for 250 rows;
b). Why does the page UI indicate that there are more results than are displayed?
This could happen for both database search and index search.
i). In the case of database search, this issue typically happens because there are orphans in the GIVEN_NAME or SURNAME tables. A record in those tables is considered an orphan if the PROF_KEY value in the record cannot be found in the EMPLOYEE table. The following SQL statements can be used to find out whether there are such orphans:
SELECT * FROM EMPINST.GIVEN_NAME where PROF_KEY not in (select PROF_KEY from EMPINST.EMPLOYEE);
SELECT * FROM EMPINST.SURNAME where PROF_KEY not in (select PROF_KEY from EMPINST.EMPLOYEE);
ii). In the case of index search, this happens when the search indexes are out of sync with the Profiles database, that is, when the search indexes hold user records that are no longer available in the Profiles database. If this happens, rebuild the Profiles indexes.
c). The Search indexing is not building.
This could happen for many different reasons. The general steps to take are:
i). Make sure that Profiles seedlist servlet is functioning. From the browser, open the seedlist URL using the following format as a guide:
Ensure that you see the correct seedlist feed, with a default 100 records;
ii). Check whether there are any errors in both the Profiles server logs and Search server logs;
iii). Enable detailed tracing for the Profiles seedlist on Profiles server using the following setting:
d). The organization tag cloud is not displaying tags.
As explained above, the organization tag cloud display can experience delays after initial indexing. Typical things to check are as follows:
i). Confirm that index search works in Profiles; in particular, confirm that you can search tags from the Directory search with full options.
ii). Check the logs for errors indicating whether the organization tag cloud cache was built.
e). Clicking a tag in the organization tag cloud returns different results than clicking a tag from a user's profile.
As explained above, clicking a tag in the organization tag cloud uses index search, and there is a delay before new tags are indexed. However, clicking a tag from a user's profile performs a database search, and there are no delays. It is therefore expected that newly added tags do not yet appear in the organization tag cloud, and that newly tagged users do not yet appear in its search results.
f). Typing into the Display name with the "Full search options" on the Directory page returns different results than using the same query with "Search profiles by name."
This is expected because the search from the Directory page with "Full search options" searches more fields than the database search for names.
2. Trace setting to apply for more details: com.ibm.lconn.profiles.internal.service.SearchServiceImpl=all
Look for outputs like:
[2/16/12 11:16:47:125 EST] 00000087 SearchService 2 com.ibm.lconn.profiles.internal.service.SearchServiceImpl trace Entering getTagListForSearchResultsOnKeyword method, userQuery = (FIELD_PREFERRED_FIRST_NAME:amy OR FIELD_NATIVE_FIRST_NAME:amy OR FIELD_GIVEN_NAME:amy), pageNum = 1
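When scanning a large trace file, it can help to pull out just the constructed query. The Python sketch below extracts the userQuery and pageNum values from a line shaped like the sample above. The regular expression is based on that single sample, so other trace lines may need a different pattern; the function name is hypothetical.

```python
import re

# Pattern derived from the one sample trace line in this article.
TRACE_PATTERN = re.compile(r"userQuery = (?P<query>.*), pageNum = (?P<page>\d+)")


def extract_user_query(trace_line: str):
    """Return (query string, page number), or None if the line does not match."""
    match = TRACE_PATTERN.search(trace_line)
    if match is None:
        return None
    return match.group("query"), int(match.group("page"))


line = ("[2/16/12 11:16:47:125 EST] 00000087 SearchService 2 "
        "com.ibm.lconn.profiles.internal.service.SearchServiceImpl trace "
        "Entering getTagListForSearchResultsOnKeyword method, userQuery = "
        "(FIELD_PREFERRED_FIRST_NAME:amy OR FIELD_NATIVE_FIRST_NAME:amy OR "
        "FIELD_GIVEN_NAME:amy), pageNum = 1")
print(extract_user_query(line))
```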