EMC Documentum Search Development Guide
EMC® Documentum® Version 7.2 Search Development Guide EMC Corporation Corporate Headquarters: Hopkinton, MA 01748–9103 1–508–435–1000 www.EMC.com Copyright ©1999-2015 EMC Corporation. All rights reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. Adobe and Adobe PDF Library are trademarks or registered trademarks of Adobe Systems Inc. in the U.S. and other countries. All other trademarks used herein are the property of their respective owners. Documentation Feedback Your opinion matters. We want to hear from you regarding our product documentation. If you have feedback about how we can make our documentation better or easier to use, please send us your feedback directly at IIGDocumentationFeedback@emc.com. 
Table of Contents

Chapter 1 Indexing and Querying Full-text Indexes
• Introduction to Indexing
• Controlling what is indexed
• How queries are processed
• DQL hints
• Extended object search

Chapter 2 Configuring and Customizing DFC Search
• Configuring DFC search
• DFC query builder
• Transforming a query with a filter
• DFC database queries
• Hello World DFC search
• DFC customization examples

Chapter 3 Customizing Search with DFS
• DFS Search Services
• Full-text and database searches
• Constructing a search
• Search service objects
• Search service operations

Chapter 4 Configuring and Customizing Webtop Search
• About WDK search
• Wildcards, lemmatization, and word fragments
• Configuring search controls
• Configuring the basic search component
• Configuring the advanced search component
• Configuring search results
• Configuring Webtop Federated Search clustering
• Modifying search component JSP pages
• Modifying a search component query

Chapter 5 Configuring CenterStage Search
• Set Federated Search Services options
• Improving search performance

Chapter 6 Troubleshooting
• Troubleshooting Search
• Problem queries
• Debugging

Appendix A DFC schemas

Preface

This document summarizes information for developers who customize search in their Content Server client applications. When you customize search, you may need information about several different products: Content Server, xPlore index server, DQL, DFC, DFS, and WDK. The information in this document is drawn from the following sources:
• EMC Documentum Content Server Administration and Configuration Guide
• EMC Documentum Content Server DQL Reference
• DFC Javadocs
• EMC Documentum Foundation Services Development Guide
• EMC Documentum WDK Development Guide
• EMC Documentum WDK Reference Guide

Some information appears in this guide that is not available in other product guides.

When you are familiar with the Content Server data model and indexing, you can design queries and search customizations and troubleshoot query performance. Web Development Kit (WDK) provides you with tools to display query-generating pages and results pages in web-accessible applications. DFC and DFS allow you to build a query within a client application.

This document does not describe how to set up and configure an xPlore server or a Federated Search Services (FS2) server. (An FS2 server and FS2 adapters are required for federated search, that is, searches against external sources rather than Documentum repositories.) For information on installing and configuring an xPlore index server and index agent, see EMC Documentum xPlore Installation Guide and EMC Documentum xPlore Administration and Development Guide. For information on developing an FS2 adapter, see EMC Documentum Federated Search Services Development Guide.

If you need assistance in implementing your customizations, contact EMC Professional Services or EMC Developer support.
Intended Audience

This guide is directed to administrators and Java developers who are developing customized DFC, DFS, or WDK-based clients of the Content Server. The customization tasks described in this guide use Java, JSP, XML, XQuery and XPath, JavaScript, and DQL.

Conventions

This manual uses the following conventions in the syntax descriptions and examples.

Table 1 Syntax conventions
Convention: Identifies
italics: A variable for which you must provide a value
[ ] square brackets: An optional argument that is included only once
xplore_home: Installation directory for xPlore
DM_HOME: Installation directory for Content Server

Revision history

The following changes have been made to this document.

Table 2 Revision history
Revision Date: Description
February 2015: Initial publication.

Chapter 1 Indexing and Querying Full-text Indexes

This chapter contains the following topics:
• Introduction to Indexing
• Controlling what is indexed
• How queries are processed
• DQL hints
• Extended object search

Introduction to Indexing

This chapter provides a brief overview of the indexing process, the indexes, and the software components that perform indexing and searching. For information on Documentum xPlore (xPlore) installation, administration, configuration, and customization, refer to EMC Documentum xPlore Administration and Development Guide. The Content Server Installation Guide contains information on installing Content Server. The EMC Documentum xPlore Installation Guide contains information on installing the index agent and index server.

See the EMC Community Network Documentum search and analytics forum to post your questions and see solutions offered by other customers and EMC employees.
Content Server

Full-text indexing is enabled in the repository by default when the repository is created or upgraded to the latest Content Server version. However, Content Server itself does not create or maintain the full-text index. Install xPlore to create and maintain the index. Content Server manages documents in a repository, generates full-text indexing events, queries the index, and returns query results to client applications.

Index agent

The xPlore index agent is a multithreaded Java application running in the Content Server application server. Run the xPlore installer to install an index agent on a Content Server host or a separate host. The index agent processes index queue items generated by Content Server and prepares objects for indexing. The index agent creates a representation of the indexable SysObjects using the DFTXML schema. xPlore processes the DFTXML for indexing in the internal xDB database.

xPlore

The xPlore indexing server creates full-text indexes and responds to full-text queries from Content Server client applications. The index itself is a Lucene index managed by an XML database (xDB). xPlore can be installed on a Content Server host that meets the xPlore environment requirements. For better performance, install xPlore on a separate host. For complete information on installing and running xPlore, refer to EMC Documentum xPlore Installation Guide.

Controlling what is indexed

A full-text index is an index on the properties and content files of SysObjects and objects of SysObject subtypes. When you search for values in a full-text index, you can retrieve objects whose properties or content match your search terms. All characters are stored as lowercase in the index. Case sensitivity is not configurable. Content files and properties in all supported languages are indexed by default.
All standard Unicode character sets are supported. No special configuration is necessary. For tested languages in xPlore, refer to EMC Documentum xPlore Administration and Development Guide.

To control what is indexed, set the properties on individual objects, object types, or formats in Documentum Administrator. Configure stop words or special characters in xPlore. You can also limit indexing by file size or text content size. For complete information on these controls, see EMC Documentum xPlore Administration and Development Guide.

Lemmatization is applied to indexed documents and to queries. Lemmatization analyzes a word for its context (part of speech), and the canonical form of the word (lemma) is indexed. The extracted lemmas are actual words. Lemmatization saves both the indexed term and its canonical form in the index, effectively doubling the size of the index. You can turn off lemmatization in xPlore or configure lemmatization for specific elements. Refer to EMC Documentum xPlore Administration and Development Guide.

How queries are processed

FTDQL is a subset of Document Query Language (DQL) and is used for querying full-text indexes. DQL and FTDQL are fully documented in Content Server DQL Reference.

DFC- and DFS-based client applications like Webtop or TaskSpace translate queries into an XQuery statement. Your application can also issue DQL queries, which can be configured to run against the database or against the full-text index. The Content Server query plugin for xPlore translates a DQL query into an XQuery expression unless XQuery generation is turned off. (For instructions on turning off XQuery generation, see EMC Documentum xPlore Administration and Development Guide.)

Note: It is not recommended to turn off XQuery processing. If you do, you cannot use facets, native xPlore security, and other performance enhancements.
For detailed information on query processing, including wildcards and fuzzy search, see EMC Documentum xPlore Administration and Development Guide.

Security of query results

Content Server user, group, and object permissions are applied to query results either in the xPlore server (default) or in Content Server. Performance is better with native xPlore security, because results that the user is not permitted to see do not have to be sent back to Content Server and then discarded. Security is configurable in xPlore. Refer to EMC Documentum xPlore Administration and Development Guide. Clients like WDK, DFC, and DFS do not apply permissions to search results.

Changes to permissions are replicated to xPlore as they happen, with some small latency. You can decrease the latency by setting up a separate index agent dedicated to ACLs and groups.

Faceted results

Faceted search, also called guided navigation, allows users to explore large data sets to locate items of interest. You can define facets for the attributes that are used most commonly for search. After facets are computed and the results of the initial query are presented in facets, the user can drill down to areas of interest. Multiple attributes can be used to compute a facet, for example, r_modifier or keywords.

Faceted navigation has several advantages over a keyword search or explicit query:
• The user can explore an unknown data set by restricting values suggested by the search service.
• The data set is presented in a visual interface, so that the user can drill down rather than constructing a query in a complicated UI.
• Faceted navigation prevents dead-end queries by limiting the restriction values to non-empty results.

The query is reissued for the selected facets. Facets are computed on discrete values, for example, authors, categories, tags, and date or numeric ranges.
Facets are not computed on text fields such as content or object name. Facet results are not localized; the client application must provide localization. For information on creating facets, refer to EMC Documentum xPlore Administration and Development Guide.

When to use a database query

Full-text queries have more capability for natural language and free-text searching than database queries. Full-text queries generally perform better than database queries because the index is optimized and security is applied in the xPlore server. If security is applied in Content Server, non-permitted results are returned to Content Server and then discarded.

In DFC clients, all search component queries are full-text queries unless a DQL hints file is in place and you have turned off automatic XQuery generation in dfc.properties. The hints file allows you to specify certain conditions under which a database query is run in place of a full-text query. For information on the hints file, see DQL hints.

A selection in the Webtop UI labeled Include recently modified properties searches for attribute values in the database instead of the full-text index (a NOFTDQL search on attributes). This option is not enabled out of the box and requires configuration.

Note: For attributes that are queried against the database, create an index in the database.

DQL hints

DQL hints can be added to a query to change query behavior. For information on all DQL hints, refer to EMC Documentum Content Server DQL Reference. For tips on migrating DQL hints to xPlore, see EMC Documentum xPlore Administration and Development Guide.

The ENABLE(FTDQL) hint causes the Content Server to attempt to execute the query as an FTDQL query. If the remaining syntax in the query conforms to the required syntax for an FTDQL query, the query is executed as an FTDQL query.
If the syntax does not conform to FTDQL query rules, an error is returned.

The TRY_FTDQL_FIRST hint is added to all queries that are built with the DFC query builder package. This hint handles timeouts and resource exceptions returned from xPlore by querying the attributes portion of a query against the repository database.

You can turn off FTDQL for the attribute portion of a query with the hint ENABLE(NOFTDQL), as in the following query:

    SELECT r_object_id FROM dm_document
    SEARCH DOCUMENT CONTAINS 'foo'
    WHERE object_name = 'bar'
    ENABLE(NOFTDQL)

You cannot use a DQL hints file with xPlore unless you turn off automatic XQuery generation. The portion of the query covered by hints file criteria is run against the database, and the remainder of the query is run against the full-text index. However, when XQuery generation is turned off, search performance is worse. Some search features, such as facets, paging, and parallel queries, do not work without XQuery.

Using a DQL hints file

If a DQL hints file is present on the application server, and XQuery generation is turned off, DFC reads it. DFC applies the hints to queries based on conditions defined in the file. The remainder of the query is run against the full-text index. You can define conditions under which the hints are applied, for example, for certain object types, attributes, or repositories. The DQL hints topic describes the behavior governed by the hints file.

The DQL hints file location is specified in the DFC configuration file dfc.properties on the application server host. The file must be named dfc.dqlhints.xml. If the file has been modified, it is reloaded every two minutes.
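As a rough illustration of the reload behavior described above, the following Java sketch looks at a file's modification time no more than once per interval and reloads only when the time stamp has changed. It shows why an edit to the hints file can take up to two minutes to take effect. The class HintsReloadSketch, its fields, and the injected clock are hypothetical names used for illustration. This is not DFC code, and the way DFC actually detects changes is an assumption.

```java
import java.util.function.LongSupplier;

/** Illustration only (hypothetical class, not part of DFC). */
final class HintsReloadSketch {
    private final LongSupplier clock;         // current time in milliseconds
    private final LongSupplier lastModified;  // modification time of the hints file
    private final long intervalMs;            // minimum delay between two checks
    private long lastCheck;
    private long loadedStamp;

    HintsReloadSketch(LongSupplier clock, LongSupplier lastModified, long intervalMs) {
        this.clock = clock;
        this.lastModified = lastModified;
        this.intervalMs = intervalMs;
        this.lastCheck = clock.getAsLong();
        this.loadedStamp = lastModified.getAsLong(); // initial load
    }

    /** Returns true when the file was (conceptually) reloaded by this call. */
    boolean refreshIfDue() {
        long now = clock.getAsLong();
        if (now - lastCheck < intervalMs) {
            return false;                     // too soon to look at the file again
        }
        lastCheck = now;
        long stamp = lastModified.getAsLong();
        if (stamp == loadedStamp) {
            return false;                     // file unchanged, keep cached hints
        }
        loadedStamp = stamp;                  // a real loader would re-parse the XML here
        return true;
    }

    public static void main(String[] args) {
        long[] now = {0L};
        long[] modified = {1L};
        HintsReloadSketch hints =
                new HintsReloadSketch(() -> now[0], () -> modified[0], 120_000L);
        modified[0] = 2L;                     // the administrator edits the file
        now[0] = 60_000L;
        System.out.println("after 1 minute:  reloaded=" + hints.refreshIfDue());
        now[0] = 120_000L;
        System.out.println("after 2 minutes: reloaded=" + hints.refreshIfDue());
    }
}
```

The clock and the modification time are passed in as suppliers so that the timing logic can be exercised without waiting or touching the file system.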
The following line could be added to dfc.properties to specify a Windows location for the hints file:

    dfc.dqlhints.file=C:/Documentum/config/dfc-dqlhints.xml

Alternatively, you can place a DQL hints file in the application server host system classpath or specify its location with a -D option on the Java command line, for example:

    -Ddfc.dqlhints.file=path_to_hints_file

Use forward slashes for paths in a Java properties file, because the backslash is the escape character. Alternatively, the file can be loaded from the classpath or the DFC data home directory on the application server host.

See DQL hints file DTD in Appendix A for the hints file DTD.

Hints file elements

The following elements are contained within a root <RuleSet> element to define the hints passed to IDfQueryManager.

Table 3 DQL hints file elements
Element: Description
<Rule>: Can have zero to many <Condition> elements.
<DisableFullText/>: Disables full-text search on basic search or attributes for the conditions in the rule.
<DisableFTDQL/>: Disables search for metadata in the full-text index.
<Condition>: Child elements are ANDed.
<Select>, <Where>: Child <Attribute> elements can be ANDed (condition="all") or ORed (condition="any").
<SelectOption>: Adds a permission, for example, FOR READ or FOR BROWSE. For example, FOR DELETE would limit the results of a query that meets the condition to those documents on which the user has delete permission. The following example applies to all Webtop queries:

    <RuleSet>
      <Rule>
        <Condition>
          <Where>
            <Attribute operator="like">object_name</Attribute>
          </Where>
        </Condition>
        <SelectOption>FOR DELETE</SelectOption>
        <DisableFTDQL/>
      </Rule>
    </RuleSet>

<From>: Child <Type> elements can be ANDed (condition="all") or ORed (condition="any").
<Docbase>: The value of this element corresponds to a repository to which the hint applies. The descend attribute is optional. Default=false. To apply the DQL hint to a folder and all its subfolders, set descend=true.
<Attribute>, <Type>, <Docbase>: Support Java regular expressions (java.util.regex.Pattern). For example, <Type>custom.*</Type> matches all type names beginning with "custom".
<Attribute>: Operator "like" represents DQL predicates CONTAINS and LIKE. The value "is_null" represents DQL predicates NULL, NULLINT, NULLSTRING, and NULLDATE.
<FulltextExpression>: Child of <Condition>. Set the mandatory exists attribute to false to add ENABLE(NOFTDQL) to the query when there is no full-text expression in the search.
<DQLHint>: Contains any valid DQL hint. For the full list of DQL hints, refer to Content Server DQL Reference.

Hints file examples

To send all queries on attributes to the database, define the following hint. The query must not contain a full-text search expression.

    <RuleSet>
      <Rule>
        <Condition>
          <FulltextExpression exists="false"/>
        </Condition>
      </Rule>
    </RuleSet>

If you disable FTDQL for specific conditions defined within the <Rule> element, the attributes portion of the query that meets those conditions is issued against the database. A temp table is populated with the full-text result. If the full-text query is unselective, the temp table is large, which negatively impacts response time.

In the following example, FTDQL is turned off for queries on the object_name attribute that use the "like" operator. (In the Webtop UI, the like operator is "contains", "begins with", or "ends with".) Multiple attributes can be added to the rule.
    <RuleSet>
      <Rule>
        <DQLHint>ENABLE(FT_CONTAIN_FRAGMENT)</DQLHint>
      </Rule>
    </RuleSet>

In the following example, attributes for the specified object type are queried in the database, not the full-text index:

    <RuleSet>
      <Rule>
        <Condition>
          <From condition="any">
            <Type>km_message</Type>
          </From>
        </Condition>
        <DisableFTDQL/>
      </Rule>
    </RuleSet>

The following example adds two hints to wildcard queries on either of two attributes:

    <RuleSet>
      <Rule>
        <Condition>
          <Where condition="any">
            <Attribute operator="like">subject</Attribute>
            <Attribute operator="like">object_name</Attribute>
          </Where>
        </Condition>
        <DQLHint>ENABLE(SQL_DEF_RESULT_SET 100, NOFTDQL)</DQLHint>
        <DisableFTDQL/>
      </Rule>
    </RuleSet>

In the following hints file, one rule applies to queries for one attribute, and the second rule applies to a different attribute:

    <RuleSet>
      <Rule>
        <Condition>
          <Where condition="any">
            <Attribute operator="like">subject</Attribute>
          </Where>
        </Condition>
        <DQLHint>ENABLE(SQL_DEF_RESULT_SET 100, NOFTDQL)</DQLHint>
        <DisableFTDQL/>
      </Rule>
      <Rule>
        <Condition>
          <Where condition="any">
            <Attribute operator="like">object_name</Attribute>
          </Where>
        </Condition>
        <DQLHint>ENABLE(SQL_DEF_RESULT_SET 10)</DQLHint>
        <DisableFTDQL/>
      </Rule>
    </RuleSet>

Make sure that your multiple rules are mutually exclusive when applied to a single query. If not, the query generates a DQL syntax error. If the Webtop user adds both attributes to the query (subject and object_name), this hints file example throws an error.

You can turn off FTDQL for attribute queries in a repository, adding conditions as needed, as shown in the following example:

    <Rule>
      <Condition>
        <Docbase>
          <Name>support</Name>
        </Docbase>
      </Condition>
      <DisableFTDQL/>
    </Rule>

You can turn off FTDQL for FOLDER(DESCEND) queries. In Webtop, this hint turns off FTDQL for searches from the current location or some other specific location instead of from the repository root.
If there are many subfolders, FOLDER(DESCEND) queries can time out. The following example sends the attribute portion of the query to the database instead of the full-text index for the specific repository. The descend attribute specifies whether to apply the condition and hint to FOLDER(DESCEND) queries:

    <Rule>
      <Condition>
        <Docbase>
          <Name descend="true">dm_notes</Name>
        </Docbase>
      </Condition>
      <DisableFTDQL/>
    </Rule>

DQL hints and Webtop search components

The Webtop search components use the DFC query builder package to construct a query. If XQuery generation is turned off, the DFC query builder adds the DQL hint TRY_FTDQL_FIRST. This hint prevents timeouts and resource exceptions by querying the attributes portion of a query against the repository database. The query builder also bypasses lemmatization by using a DQL hint for wildcard and phrase searches.

If wildcard attribute searches ("contains", "begins with", "ends with") have many results, they can time out. These searches have been optimized in xPlore, but the optimization is not applied when XQuery generation is turned off. You can configure xPlore to support wildcard searches without using DQL and without turning off XQuery generation.

Extended object search

Extended object search (EOS) lets you search the content or attributes of more than one object when the objects are related in some way. For example, you can search both an email and its attachments for content. EOS also lets you search augmented content. For example, you can inject data from external repositories to enrich the content indexed by xPlore.

To support an extended object, you define a mapping that is independent of the storage format. For example, an extended object definition represents emails. The definition combines attributes for more than one object type. You create a mapping file for the main interface.
Your search application uses the DFC query builder API to query the join of objects or tables as though it were a single object. In the addResultAttribute() and addSimpleAttrExpression() methods, you add aliases that are defined in your mapping file. These procedures are described in detail in the following topics. You can also use the aliases in facets.

Note: Starting in version 7.0, the DQL mapping and the mapping deployment mechanism using an SBO are deprecated. They are only supported for backward compatibility.

[Diagram: steps necessary to implement EOS]

This section focuses on the last two steps: defining (and deploying) the EOS mapping and defining a custom query.

Creating a mapping file

A mapping applies to all types. Multiple mappings can apply at the same time. The mapping loader merges all the mappings. If several mappings apply to the same attribute, they are incompatible and the system throws an error at query time. For the mapping file schema, see Extended object search schema in Appendix A.

In the mapping file, you define interfaces that the DFC query builder can instantiate. The following example defines the main interface of the mapping as IDmDoc:

    <interface name='IDmDoc'>

You add aliases to the interface that can be used in your queries. An alias can map to other interfaces or to qualified Documentum attributes. Use the map-to attribute of the alias element for this mapping. The map-to value is a path within the DFTXML representation of the input document, for example, map-to="dmftcustom>mediaAnnotations>annotation>author". The DFTXML schema is documented in the appendix of EMC Documentum xPlore Administration and Development Guide.

Add interface elements that map to attributes. Add subinterfaces and reference them recursively from an alias in the main interface.
The following example shows the main interface, IDmDoc, and an alias that points to the subinterface IMgAnn. The aliases in the subinterface map to a path in the dmftcustom element of the DFTXML representation of the main document. (A TBO injected this data.)

    <!-- Main interface -->
    <interface name='IDmDoc'>
      <!-- Alias points to sub interface defined in this file -->
      <alias name='annotation' map-to='IMgAnn' cardinality="MANY"/>
    </interface>
    <!-- Subinterface with aliases -->
    <interface name='IMgAnn'>
      <alias name="author" map-to="dmftcustom>mediaAnnotations>annotation>author" cardinality="MANY"/>
      <alias name="content" map-to="dmftcustom>mediaAnnotations>annotation" cardinality="MANY"/>
    </interface>

Sample extended object mapping file

xPlore mapping (xploreMapping.xml)

    <?xml version="1.0"?>
    <doc:mapping xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:doc="http://www.documentum.com"
        xsi:schemaLocation="http://www.documentum.com ../../ressources/complex_objects_mapping.xsd">
      <!-- Main interface of the EOS mapping -->
      <interface name='IDmDoc'>
        <!-- Aliases point to sub interfaces defined in this file -->
        <alias name='annotation' map-to='IMgAnn' cardinality="MANY"/>
      </interface>
      <!-- Subinterface referenced (recursively) from the main interface.
           Aliases point to other subinterfaces or to qualified documentum attributes -->
      <interface name='IMgAnn'>
        <alias name="author" map-to="dmftcustom>mediaAnnotations>annotation>author" cardinality="MANY"/>
        <alias name="content" map-to="dmftcustom>mediaAnnotations>annotation" cardinality="MANY"/>
      </interface>
    </doc:mapping>

Note: The map-to value is a path within the DFTXML representation of the input document.

Deploying EOS mappings in the repository

Deploy mappings in the repository to the folder /System/Search/EOS/. The DFC Search Service scans the folder and loads all the files in this folder as xPlore mappings.
If you modify a mapping file, the DFC Search Service dynamically reloads it. By default, the system scans the mapping folder every minute and when a query is run. To modify this interval, set the property dfc.search.eos.mappingcache.refresh_interval in the dfc.properties file.

1. Create an XML file to define the mapping.
2. Name the file. Although the filename is ignored by the DFC Search Service, we recommend prefixing it with the namespace of the application that deploys it.
3. Import the file as a dm_document to /System/Search/EOS/. Files in subfolders are ignored.
4. Make sure that the ACL for this file allows read access to anyone. We recommend making it read-only.

Deploying EOS mappings in classpath

Use classpath deployment if you have different mappings on each Content Server repository. Instead of deploying the mapping files in the repository, a registration file defines an alternate location in the classpath. The following procedure does not describe the creation of the XML mapping file.

1. Create a property file named sco.properties.
2. Add it to your DFC classpath, for example, in the folder that contains dfc.properties.
3. Edit the file sco.properties to add properties such as:

    complextype.xploremapping[0]=<filename>
    complextype.xploremapping[1]=<filename2>

where <filename> can be an absolute filename, a relative filename (relative to the application current folder), or a file in the classpath. The DFC Search Service first looks in the file system. If it finds no matching file there, it looks in the classpath. For example, with the following property:

    complextype.xploremapping[0]=com/documentum/test/fc/client/search/TFileMappingLoader_sco.mapping.properties

the DFC Search Service looks in the classpath for a file named TFileMappingLoader_sco.mapping.properties in the package com.documentum.test.fc.client.search.
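The lookup order described above (file system first, then classpath) can be sketched in plain Java. This is a minimal illustration, not the DFC implementation: the class MappingLocatorSketch and its methods are hypothetical, and stopping at the first missing index is an assumption about how the indexed property names are read.

```java
import java.io.File;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Illustration only (hypothetical class, not part of DFC). */
final class MappingLocatorSketch {

    /** Collects complextype.xploremapping[0], [1], ... until an index is missing (assumed). */
    static List<String> mappingEntries(Properties props) {
        List<String> entries = new ArrayList<>();
        for (int i = 0; ; i++) {
            String value = props.getProperty("complextype.xploremapping[" + i + "]");
            if (value == null) {
                break;
            }
            entries.add(value.trim());
        }
        return entries;
    }

    /** Tries the file system first, then the classpath; returns null when nothing matches. */
    static String resolve(String name, ClassLoader loader) {
        File file = new File(name);               // absolute, or relative to the current folder
        if (file.isFile()) {
            return "file:" + file.getAbsolutePath();
        }
        URL url = loader.getResource(name);       // classpath lookup as the fallback
        return url != null ? "classpath:" + url : null;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("complextype.xploremapping[0]",
                "com/documentum/test/fc/client/search/TFileMappingLoader_sco.mapping.properties");
        ClassLoader loader = MappingLocatorSketch.class.getClassLoader();
        for (String entry : mappingEntries(props)) {
            System.out.println(entry + " -> " + resolve(entry, loader));
        }
    }
}
```

Run outside a DFC test environment, the sample entry resolves to null because the file exists neither on disk nor in the classpath.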
Mappings deployed in the classpath are not reloaded dynamically. You must restart the application to refresh the cache.

Adding metadata from other tables or objects to the main document

The metadata that is referenced in an alias must be denormalized into the index for the main document by a TBO or aspect. In this context, denormalization is the process of rendering normalized relational data into a single XML structure within the DFTXML representation of the main document. For the customization of injected metadata or joins, refer to EMC Documentum xPlore Administration and Development Guide. The developer must subclass DfPersistentObject and override customExportForIndexing to add custom nodes in the DFTXML.

Using extended object aliases in a DFC query

The aliases that you define in a mapping file can be used like any regular attribute in the DFC search service. They can be used in constraints or as result attributes. In a DFS query, aliases for attributes can be used in a PropertyExpression.

In the following example, the alias annotation/author is added as a result attribute and as a simple attribute expression. The alias is defined in the sample mapping file.
    IDfClient client = DfClient.getLocalClient();
    m_searchService = client.newSearchService(m_sessionManager, docbase);
    IDfQueryManager queryManager = m_searchService.newQueryMgr();
    m_queryBuilder = queryManager.newQueryBuilder("dm_document");
    m_queryBuilder.addSelectedSource(docbase);
    // annotation/author is our alias
    m_queryBuilder.addResultAttribute("annotation/author");
    m_queryBuilder.addResultAttribute("r_object_id");
    m_queryBuilder.addResultAttribute("object_name");
    // The annotation/author alias is used again, this time as a constraint
    IDfExpressionSet exprSet = m_queryBuilder.getRootExpressionSet();
    exprSet.addSimpleAttrExpression("annotation/author", IDfAttr.DM_STRING,
        IDfSimpleAttrExpression.SEARCH_OP_CONTAINS, false, true, "value1");
    m_processor = m_searchService.newQueryProcessor(m_queryBuilder, false);
    m_processor.blockingSearch(600000);

The XQuery rendering of this query is the following:

    let $libs := ('/MSSQL66ECI1/dsearch/Data')
    let $results :=
      for $dm_doc score $s in collection($libs)/dmftdoc[
          (dmftmetadata//a_is_hidden = "false")
          and (dmftversions/iscurrent = "true")
          and (dmftinternal/i_all_types = "03110a1b80000129")
          and (dmftcustom/mediaAnnotations/annotation/author
               ftcontains "value1" with stemming)]
      order by $s descending
      return $dm_doc
    return (
      for $dm_doc in subsequence($results,1,351)
      return <r>
        { for $attr in $dm_doc/dmftcustom/mediaAnnotations/annotation/author
          return <alias name='f0_f1' type='dmstring'>{string($attr)}</alias> }
        { for $attr in $dm_doc/dmftmetadata//*[local-name()=('r_object_id')]
          return <attr name='{local-name($attr)}' type='{$attr/@dmfttype}'>
            {string($attr)}</attr> }
        { xhive:highlight((
            $dm_doc/dmftcontents/dmftcontent/dmftcontentref,
            $dm_doc/dmftcustom)) }
        <attr name='score' type='dmdouble'>{string(dsearch:get-score($dm_doc))}</attr>
      </r>)

Chapter 2
Configuring and Customizing DFC Search

This chapter contains the following topics:
• Configuring DFC search
• DFC query builder
•
Transforming a query with a filter
• DFC database queries
• Hello World DFC search
• DFC customization examples

Configuring DFC search

The following options in dfc.properties configure search behavior in DFC and DFC clients such as WDK and Webtop. This file is located in the Documentum home config directory as specified by the dfc_data environment variable, for example, C:\Documentum\config or /tmp/Documentum/config. This file includes settings to enable and configure FS2 for searching external (non-Documentum) sources.

Optimizing query batch size

You can optimize query performance by setting a smaller batch size. The batch size is the number of results that xPlore returns at a time. If you are constructing the query in DFC, set the batch size for the individual query. To set it for multiple queries, set dfc.search.batch_hint_size in dfc.properties. Any value can be used for dfc.search.batch_hint_size, but larger values generally do not improve performance.

Configuring search in dfc.properties

Table 4 Search options in dfc.properties

dfc.search.docbase.broker_count
    Default value: 20
    Description: Number of broker threads supporting execution of the Documentum repository part of a query. One broker supports execution of the query for each repository selected for this query. Min value: 0, max value: 1000.

dfc.search.external_sources.broker_count
    Default value: 30
    Description: Number of broker threads supporting execution of the FS2 part of a query. One broker supports the execution of the query for all external sources selected for this query. Min value: 0, max value: 1000.

dfc.search.external_sources.enable
    Default value: false
    Description: Set to true to make DFC use FS2 in addition to Content Server’s basic search facilities. For CenterStage Pro deployments: true.

dfc.search.external_sources.host
    Default value: localhost
    Description: RMI registry host to connect to the FS2 server. For information on the RMI registry, refer to the EMC Documentum Federated Search Services Development Guide chapter on the application SDK.

dfc.search.external_sources.port
    Default value: 3005
    Description: RMI registry port to connect to the FS2 server. For information on the RMI registry, refer to the EMC Documentum Federated Search Services Development Guide chapter on the application SDK. Min value: 0, max value: 65535.

dfc.search.external_sources.username
    Default value: guest
    Description: Default credentials to connect to the FS2 server as guest.

dfc.search.external_sources.password
    Default value: askonce
    Description: Default credentials to connect to the FS2 server as guest.

dfc.search.external_sources.backup.host
    Default value: localhost
    Description: RMI registry host to connect to the backup FS2 server. The EMC Documentum Federated Search Services Development Guide chapter on the application SDK explains the RMI registry.

dfc.search.external_sources.backup.port
    Default value: 3005
    Description: RMI registry port to connect to the backup FS2 server. The EMC Documentum Federated Search Services Development Guide chapter on the application SDK explains the RMI registry. Min value: 0, max value: 65535.

dfc.search.external_sources.retry.period
    Default value: 300000
    Description: Time in milliseconds before retrying to connect to the main FS2 server (after having switched to the backup FS2 server). Min value: 0, max value: 2147483647.

dfc.search.external_sources.adapter.domain
    Default value: JSP
    Description: Subdomain containing the sources available to DFC. By default, DFC uses the default domain of the standalone FS2 web client. For CenterStage Pro deployments: CenterStage.

dfc.search.external_sources.request_timeout
    Default value: 180000
    Description: Time in milliseconds to wait for an answer from the FS2 server. Min value: 0, max value: 10000000.

dfc.search.external_sources.rmi_name
    Default value: xtrim.RmiApi
    Description: RMI registry symbolic name associated with the FS2 API.

dfc.search.external_sources.ssl.enable
    Default value: false
    Description: Enables encryption of results and content sent from the FS2 server to the DFC client.

dfc.search.external_sources.ssl.keystore
    Default value: (none)
    Description: Defines the keystore that holds the DFC client certificate and keys and the trusted certificate of the FS2 server. This keystore is a file available locally on the machine where DFC resides.

dfc.search.external_sources.ssl.keystore_password
    Default value: (none)
    Description: Defines the password for the keystore file used for communication with the FS2 server.

dfc.search.fulltext.enable
    Default value: true
    Description: Uses the Content Server full-text engine (for example, xPlore). If you set this to false, DFC replaces DQL full-text clauses with LIKE clauses on the following attributes: object_name, title, subject.

dfc.search.matching_terms_computing.enable
    Default value: true
    Description: If this property is enabled, the matching terms are not computed by the indexer but are computed locally by the DFC search service. This setting can enhance performance, but variants are not included. If the source is not indexed, this property is ignored because the matching terms are already computed by DFC.

dfc.search.max_results
    Default value: 1000
    Description: Maximum number of results that a query search retrieves. Min value: 1, max value: 10000000.

dfc.search.max_results_per_source
    Default value: 350
    Description: Maximum number of results that a query search retrieves per source. Min value: 1, max value: 10000000.

dfc.search.sourcecache.refresh_interval
    Default value: 1200000
    Description: Time in milliseconds between refreshes of the search source map cache. Min value: 0, max value: 10000000.

dfc.search.typecache.refresh_interval
    Default value: 1200000
    Description: Time in milliseconds between refreshes of the cache of type information. Min value: 0, max value: 10000000.

dfc.search.formatcache.refresh_interval
    Default value: 1200000
    Description: Time in milliseconds between refreshes of the cache of formats. Min value: 0, max value: 10000000.

dfc.search.eos.mappingcache.refresh_interval
    Default value: 60000
    Description: Time in milliseconds between refreshes of the cache of Extended Object Search (EOS) mapping information. Min value: 0, max value: 1000000000.

dfc.search.batch_hint_size
    Default value: 0
    Description: Controls both the client-to-server and the server-to-database batching of query data, for the search services only. If set, this property overrides the DFC_BATCH_HINT_SIZE property value for all queries generated by the search services. It can be used to tune performance according to the performance of the network links. It is a hint in the sense that there is no guarantee that the value will be honored; for example, if the number is too large, it is rounded down. For client-to-server traffic, it controls the number of rows transported each time a new batch of rows is needed while processing a query collection. For server-to-database traffic, it affects the number of rows returned each time a database table is accessed. The default value is usually adequate. Sometimes a larger value can improve performance in a high-latency environment. Min value: 0, max value: 1000.

Configuring federated search ranking

xPlore returns a ranking of search results. xPlore uses the relevancy scoring of the underlying Lucene index. If the DFC relevancy configuration has been customized, it can combine with or override the xPlore score. If you search over more than one source, ranking is recalculated based on the custom ranking algorithm. If you search only one source, such as xPlore or an external source, the score returned by the source is used.

You can configure the weighting of criteria used for ranking the relevancy of search results from xPlore and other sources. (For xPlore, source=<repository_name>.) A weight is a numerical value that increases or decreases the importance of a search source or set of sources. DFC combines scores for sources to produce a relevancy ranking that displays the most relevant results first.
Weights for relevancy ranking are configured in a file named dfc-searchranking.xml, located in the Documentum home /config directory, for example, C:\Documentum\config. In WDK-based applications, the Documentum home directory is under the application server executable directory. Add this file to the Documentum/config subdirectory of the binary directory, for example, CATALINA_HOME/bin/Documentum/config. You can specify an alternate location as the value of a Java system property named dfc.searchranking.file.

The following table describes the elements that configure relevancy ranking. All elements are contained within the root element <SearchRanking>.

Table 5 Relevancy ranking configuration elements

<SourceBonus>
    Specifies a source or set of sources for which to provide bonus ranking. Contains <AttributeQuery>, <FullTextQuery>, or both. The source attribute value is a source name or a regular expression that defines the source. The type attribute value can be used to restrict the source type to either repository (the value docbase in the DTD) or external.

<AttributeQuery>
    Specifies a separate bonus for attributes. The source bonus is within [-1,1].

<FullTextQuery>
    Specifies a separate bonus for full-text. The source bonus is within [-1,1].

<RankConfidence>
    Decreases confidence ranking for a specific source or set of sources. The value is within [0,1]. The source attribute value is a source name or a regular expression that defines the source. The type attribute value can be used to restrict the source type to either repository or external.

<FullText>
    Specifies a set of attributes to be added to the computation of the full-text factor. By default, as a partial representation of the full-text score for a specific document, the computation uses the concatenation of Dublin Core Metadata Elements. You can set one or more attributes to be used for the computation. Contains one or more <Attribute> elements.

<Attribute>
    Specifies an attribute to be weighted with the full-text score. The value is an attribute or a regular expression that resolves one or more attributes.

<AttributeWeight>
    Specifies the weight for a specific attribute value or for values that match a regular expression. The weight of an attribute is a non-negative number, relative to the weights of the other attributes. By default, the title attribute weight is 2 and all other attributes have a neutral weight of 1. A weight of 0 negates the effect of the attribute. The attribute attribute specifies the attribute or a regular expression that resolves one or more attributes. The value attribute is optional and specifies a value or a regular expression that resolves one or more values. The weight is within [0+].

<RatingWeight>
    Specifies the relative weight of the score from specific source types compared to the relevancy ranking score (which is assigned a neutral weight of 1). With a weight of 0, the score from the specific source is not taken into account; with a weight of 100 or greater, the relevancy ranking score is ignored (not computed). The source attribute value is a source name or a regular expression that defines the source. The type attribute value can be used to restrict the source type to either repository or external. The rating weight is within [0+]. The following example removes xPlore ranking:

    <RatingWeight source="my_repository">0</RatingWeight>

Note: Regular expression substitution is supported. For example, attribute=".*format.*" resolves any attribute with the substring format in the name. The declaration <Attribute>abstract.*|summary</Attribute> resolves any attribute starting with abstract, or the summary attribute.
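The two expressions in the note suggest that an expression must match the whole attribute name, because the substring case needs a leading and trailing .* and summary resolves only the summary attribute. The following minimal Java sketch illustrates that reading with java.util.regex. Whole-name matching is an assumption about DFC behavior, and the class AttributePatternDemo and the sample attribute names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class; it does not call DFC.
class AttributePatternDemo {

    // Returns the attribute names that the expression matches in full.
    static List<String> resolve(String expression, List<String> attributeNames) {
        Pattern pattern = Pattern.compile(expression);
        List<String> matched = new ArrayList<String>();
        for (String name : attributeNames) {
            // matches() succeeds only if the entire name fits the expression
            if (pattern.matcher(name).matches()) {
                matched.add(name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "a_content_format", "abstract_text", "summary", "summary_long", "title");
        System.out.println(resolve(".*format.*", names));
        System.out.println(resolve("abstract.*|summary", names));
    }
}
```

Under this assumption, summary_long is not resolved by abstract.*|summary. To cover it as well, the expression would have to be written as abstract.*|summary.*.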
The DTD for this file is in DFC, so you do not need to provide it in your environment:

    <!ELEMENT SearchRanking (SourceBonus*, RankConfidence*, FullText?,
        AttributeWeight*, RatingWeight*)>
    <!ELEMENT SourceBonus (AttributeQuery?, FullTextQuery?)>
    <!ATTLIST SourceBonus source CDATA #IMPLIED>
    <!ATTLIST SourceBonus type (any | docbase | external) "any">
    <!ELEMENT AttributeQuery (#PCDATA)>
    <!ELEMENT FullTextQuery (#PCDATA)>
    <!ELEMENT RankConfidence (#PCDATA)>
    <!ATTLIST RankConfidence source CDATA #IMPLIED>
    <!ATTLIST RankConfidence type (any | docbase | external) "any">
    <!ELEMENT FullText (Attribute*)>
    <!ELEMENT Attribute (#PCDATA)>
    <!ELEMENT AttributeWeight (#PCDATA)>
    <!ATTLIST AttributeWeight attribute CDATA #REQUIRED>
    <!ATTLIST AttributeWeight value CDATA #IMPLIED>
    <!ELEMENT RatingWeight (#PCDATA)>
    <!ATTLIST RatingWeight source CDATA #IMPLIED>
    <!ATTLIST RatingWeight type (any | docbase | external) "any">

Adding a bonus for a specific source

The unified ranking score takes into account only the result metadata. You can give a bonus to a specific source when you know that the source returns relevant results. In the following sample, a bonus of 0.3 is added to the score of all results returned by the source named "good_source".

    <SourceBonus source="good_source">
      <AttributeQuery>0.3</AttributeQuery>
      <FullTextQuery>0.3</FullTextQuery>
    </SourceBonus>

Emphasizing a specific attribute

You can modify the relative weight of an attribute in the score. By default, the title attribute weight is 2, while other attributes have a weight of 1, which is a neutral value. If the title attribute is not very relevant, you can assign other attributes a higher weight in the global score. You can also decrease the weight of the title attribute. The following example demonstrates how to accentuate the effect of the subject attribute in the global score.
    <SearchRanking>
      <AttributeWeight attribute="subject">4</AttributeWeight>
    </SearchRanking>

DFC query builder

For information on DFC interfaces for use with the xPlore server, refer to EMC Documentum xPlore Administration and Development Guide.

There are two ways to execute a query in DFC:
• Simple query using IDfQuery. See DFC database queries.
• Complex query using the DFC search service (query builder)

With IDfQueryBuilder, you can use DQL syntax to query one or more indexed or non-indexed Content Servers. With the Federated Search Services (FS2) product, you can query external sources and the client desktop as well. IDfQueryBuilder provides a programmatic interface to change the query structure, support external sources, support asynchronous operations, change display attributes, and perform concurrent query execution in a federation.

IDfQueryBuilder allows you to build queries with the following information:
• Data to build the query
• Source list (required)
• Set max result count
• Get hit count (setHitCountRetrieved)
• Set the locale of the query (setLocale)
• Container of source names
• Transient search metadata manager bound to the query
• Transient query validation flag
• Attributes to order the results by: addOrderByAttribute()
• Add a facet definition

IDfQueryManager is an object-oriented interface to build a query. This interface does not manipulate a String representation. It is internally responsible for translating the query to different languages and language levels: DQL, FTDQL, FS2 Query Language. In DFC or WDK-based search components, use IDfQueryBuilder to access and manipulate queries.

Pools of query brokers queue and execute synchronous and asynchronous queries. There is one queue for repositories and one queue for external sources. Each broker is a thread running in DFC that executes a query on a single source.
For example, a broker can execute an IDfQuery on a repository. Brokers for external sources connect to FS2 brokers for repositories. In the following example, 30 brokers are configured for external sources and 20 brokers for repositories in dfc.properties:

    dfc.search.external_sources.broker_count=30
    dfc.search.docbase.broker_count=20

Results and events such as progress or errors are returned as soon as they arrive. The following illustration diagrams this asynchronous process:

[Illustration: asynchronous query execution]

IDfSourceMap maps the available repositories and external sources and their capabilities. Before sending a query to a source, you can check the source capabilities. For example, you can verify whether facets, FTDQL, or wildcards are supported. Refer to the Javadocs of the interface IDfSearchSource for DFC or RepositoryProperty for DFS for more details about source capabilities. Querying external sources requires Federated Search Services.

IDfSearchMetadataManager determines for the query builder what metadata is available from the selected sources, such as available object types and data dictionary information about the types. The FS2 server store and administration tools manage external sources. The search metadata manager communicates with the FS2 server to assemble a list of available sources. The search metadata manager has methods to get types and their attributes from each source.

With FS2, if the FS2 configuration file defines external custom types, they can be searched. An external type is defined as a value of client.dfc.types. Additionally, dm_sysobject and dm_document types are queried in external sources, but not all attributes of these types are available in the external sources.

For multi-repository searches, the first repository in a client search list is used as the metadata model server. This model server is used to retrieve all data dictionary information.

Transforming a query with a filter

A search filter is a Java class or SBO that transforms a query before it is submitted or transforms the results.
For example, you can:
• Transform a query before it is sent for processing (DQL, XQuery, or FS2).
  – Add new attributes that can be transformed to internal attributes.
  – Direct which xPlore collection to query, for more efficient queries.
  – Remove attributes that the target does not support.
  – Add logging information for each query.
• Transform the query results before they are returned to the user.
  – Add computed attributes to the results.
  – Filter out results.

Implementing a filter

A search service filter implements one or more of the following interfaces in the com.documentum.fc.client.search.filter package:
• IDfQueryFilter
• IDfFacetFilter
• IDfResultFilter
• IDfCompletionFilter

A filter can modify the data structure (query, results, or facets) and context parameters. It can send an event that is retrievable by an IDfQueryStatus object. The filter accesses the execution context through the IContext interface. This interface contains runtime information: the session, application-specific properties, and backend information such as whether the target is a repository or which index server is supported.

Deploying a filter

Choose one of the following to deploy a search service filter:
• Create a searchfilter.properties file in the application classpath. The class must also be in the classpath. The file has the following form:

    filterclass[0]=com.emc.documentum.filters.MyFilter
    filterclass[1]=com.emc.documentum.filters.MyOtherFilter

• Package the filter class as an SBO. At runtime, DFC loads the filter class. This method is recommended for a multi-repository environment.

Multiple filters are supported, but the order in which they are loaded is not configurable. You have some control over filter order by implementing the interface IFilterOrderDependency.

Sample filter class

This example shows how to set the collection based on the object type set in the query.
This filter caches the collection mapping in a static field of the filter class. The field is lazily populated the first time a query is executed.

    package com.documentum.test.fc.client.search.utils;

    import com.documentum.fc.client.search.filter.IDfQueryFilter;
    import com.documentum.fc.client.search.filter.IDfContext;
    import com.documentum.fc.client.search.IDfQueryDefinition;
    import com.documentum.fc.client.search.IDfQueryBuilder;
    import com.documentum.fc.client.search.IDfSearchSourceMap;
    import com.documentum.fc.client.search.IDfSearchSource;
    import com.documentum.fc.client.*;
    import com.documentum.fc.common.DfException;

    import java.util.Map;
    import java.util.Collections;
    import java.util.HashMap;

    public class CollectionFilter implements IDfQueryFilter
    {
        // Static cache of object type -> collection, lazily populated
        private static Map<String, String> m_collectionToTypeMapping = null;

        public IDfQueryDefinition filterQuery (
            IDfContext context, IDfQueryDefinition query) throws DfException
        {
            if (query.isQueryBuilder())
            {
                IDfQueryBuilder builder = (IDfQueryBuilder) query;
                IDfSearchSourceMap sourcesMap =
                    query.getMetadataMgr().getSourceMap();
                Iterable<String> sources = context.getSources();
                for (String source : sources)
                {
                    IDfSearchSource sourceDef = sourcesMap.getSource(source);
                    // Collections only apply to repository sources
                    if (sourceDef.getType() == IDfSearchSource.SRC_TYPE_DOCBASE)
                    {
                        String collection = getCollection(context, source, builder);
                        if ((collection != null) && (collection.length() > 0))
                        {
                            builder.addPartitionScope(source, collection);
                        }
                    }
                }
            }
            return query;
        }

        private String getCollection (
            IDfContext context, String source, IDfQueryBuilder builder)
            throws DfException
        {
            String requestedType = builder.getObjectType();
            Map<String, String> collectionMapping = getCollectionMapping(context);
            String collection = collectionMapping.get(requestedType);
            if (collection == null)
            {
                IDfSessionManager sessionManager = context.getSessionManager();
                IDfSession session = sessionManager.getSession(source);
                try
                {
                    // Walk up the type hierarchy until a mapped supertype is found
                    String typeName = requestedType;
                    while ((collection == null)
                        && (typeName != null) && (typeName.length() > 0))
                    {
                        IDfType dfType = session.getType(typeName);
                        typeName = dfType.getSuperName();
                        if ((typeName != null) && (typeName.length() > 0))
                        {
                            collection = collectionMapping.get(typeName);
                        }
                    }
                }
                finally
                {
                    sessionManager.release(session);
                }
                // Cache the outcome under the requested type. An empty string
                // records that no collection is mapped for this type.
                collectionMapping.put(requestedType,
                    (collection != null) ? collection : "");
            }
            return collection;
        }

        private static synchronized Map<String, String> getCollectionMapping (
            IDfContext context)
        {
            if (m_collectionToTypeMapping == null)
            {
                m_collectionToTypeMapping = Collections.synchronizedMap(
                    new HashMap<String, String>());
                // TODO: load the collection mapping from the classpath or a
                // file in the repository. Here we hardcode the mapping
                m_collectionToTypeMapping.put("dm_folder", "collection1");
                m_collectionToTypeMapping.put("dm_document", "collection2");
            }
            return m_collectionToTypeMapping;
        }
    }

DFC database queries

You can use the IDfQuery interface, which is not part of the DFC search service, for database queries. Refer to the Javadocs for the com.documentum.fc.client package for a description of how to use this capability. The following example from the WDK GroupAttributes class executes a simple query and gets the results as an IDfCollection:

    StringBuffer query = new StringBuffer(512);
    query.append(
        "SELECT group_name FROM dm_group where ANY i_all_users_names = '");
    query.append(loginUserName);
    query.append("'");
    IDfQuery queryObject = DfcUtils.getClientX().getQuery();
    queryObject.setDQL(query.toString());
    IDfCollection collection = queryObject.execute(
        getDfSession(), IDfQuery.DF_READ_QUERY);

Hello World DFC search

You can create DFC search applications based on servlets and JSP pages and the DFC Search Service.
For information on the DFC query builder service, see DFC query builder and the Javadocs for the package com.documentum.fc.client.search. The following example takes a search input string and searches all available sources known to the search service:

    /**
     * Searches the web based on the search string and stores the
     * results in the HashMap
     */
    private void saveECISearchResults()
    {
        System.out.println("ECISearch Method :SaveECISearchResults: Start");
        IDfQueryManager queryMgr = null;
        IDfQueryProcessor idfQueryProcessor = null;
        IDfResultObjectManager idfResultObjMgr = null;
        ArrayList arrExternalSources = new ArrayList(20);
        mMap = new HashMap();
        try
        {
            IDfClient client = m_clientX.getLocalClient();
            /*
             * sessionManager - A session manager to be used for authentication
             * against search sources
             * defaultMetadataDocbase - The default repository from which to pick
             * type metadata. Can be safely set to null if the search service is
             * configured to search only repositories and not on external sources.
             * Must not be null if external sources are configured in the search
             * service. The session manager must have login info for the repository
             */
            IDfSearchService searchService = client.newSearchService(
                m_sessionManager, m_docbaseName);
            queryMgr = searchService.newQueryMgr();
            IDfQueryBuilder queryBuilder = queryMgr.newQueryBuilder("dm_sysobject");
            IDfSearchMetadataManager metadataMgr = queryBuilder.getMetadataMgr();

            // Getting the source map
            IDfSearchSourceMap searchSourceMap = searchService.getSourceMap();
            // Getting list of available external sources
            IDfEnumeration enumSearchSource = searchSourceMap.getAvailableSources(
                IDfSearchSource.SRC_TYPE_EXTERNAL);
            while (enumSearchSource.hasMoreElements())
            {
                IDfSearchSource idfsource =
                    (IDfSearchSource) enumSearchSource.nextElement();
                String[] strExternalSource = new String[2];
                strExternalSource[0] = idfsource.getName();
                System.out.println("External Sources(0):" + strExternalSource[0]);
                arrExternalSources.add(strExternalSource);
                // add source to the search metadata manager
                metadataMgr.addSelectedSource(strExternalSource[0]);
                // add the source to the query builder
                queryBuilder.addSelectedSource(strExternalSource[0]);
            }

            // Creating the search query
            IDfExpressionSet rootExp = queryBuilder.getRootExpressionSet();
            rootExp.addSimpleAttrExpression("object_name", IDfValue.DF_STRING,
                IDfSimpleAttrExpression.SEARCH_OP_CONTAINS, false, false,
                m_searchString);
            queryBuilder.addResultAttribute("object_name");

            idfQueryProcessor = searchService.newQueryProcessor(queryBuilder, true);
            idfResultObjMgr = searchService.newResultObjectManager(queryBuilder);
            idfQueryProcessor.addListener(this);
            // Asynchronous search: results arrive while this thread sleeps
            idfQueryProcessor.search();
            System.out.println("ECISearch Method: Query Failed : "
                + idfQueryProcessor.getQueryStatus().getNbrFailed());
            Thread.sleep(m_sleepTime);
            System.out.println("ECISearch Method: Query Status : "
                + idfQueryProcessor.getQueryStatus().getStatus());

            IDfResultsSet rs = idfQueryProcessor.getResults();
            System.out.println(rs.size() + " result(s)\n");
            while (rs.next())
            {
                IDfResultEntry result = rs.getResult();
                // Filter the results based on the score attribute.
                // Compare string content with equals(), not with ==.
                if ("1.0".equals(result.getString("score")))
                {
                    String objectName = result.getString("object_name");
                    mMap.put(objectName, result);
                    System.out.println(result);
                }
            }
            addExternalFilesToFolder(mMap, idfResultObjMgr);
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

Displaying the FS2 targets at design time:

    // Getting the source map
    IDfSearchSourceMap searchSourceMap = searchService.getSourceMap();
    // Getting list of available external sources
    IDfEnumeration enumSearchSource = searchSourceMap.getAvailableSources(
        IDfSearchSource.SRC_TYPE_EXTERNAL);
    while (enumSearchSource.hasMoreElements())
    {
        IDfSearchSource idfsource =
            (IDfSearchSource) enumSearchSource.nextElement();
        String[] strExternalSource = new String[2];
        strExternalSource[0] = idfsource.getName();
    }

Setting the FS2 target at query execution time:

    // Getting the source map
    IDfSearchSourceMap searchSourceMap = searchService.getSourceMap();
    // Getting list of available external sources
    IDfEnumeration enumSearchSource = searchSourceMap.getAvailableSources(
        IDfSearchSource.SRC_TYPE_EXTERNAL);
    while (enumSearchSource.hasMoreElements())
    {
        IDfSearchSource idfsource =
            (IDfSearchSource) enumSearchSource.nextElement();
        String[] strExternalSource = new String[2];
        strExternalSource[0] = idfsource.getName();
        // Custom test to check whether the source belongs to the selection
        // that the user made at design time. isSelectedAtDesignTime is a
        // hypothetical placeholder for your own test.
        if (!isSelectedAtDesignTime(strExternalSource[0]))
            continue;
        // add source to the search metadata manager
        metadataMgr.addSelectedSource(strExternalSource[0]);
        // add the source to the query builder
        queryBuilder.addSelectedSource(strExternalSource[0]);
    }

DFC customization examples

The following examples illustrate the most common scenarios when using the DFC search service. The first scenario is a simple search on one repository. The next example searches an external source (relying on the Federated Search Server) that requires authentication.
The third example creates an asynchronous search.

The source files for these examples can be found on the EMC Developer Network web site. Go to Content Management > Sample code > DFC > DFC Search API Samples and download the corresponding file: DFCSearchAPISamples.zip.

Simple search of one repository

In the following example, a login servlet (LoginServlet class) and login.jsp page handle user login. (The login servlet class is not shown in the following code.) The SearchServlet class handles query building and execution. The JSP pages search.jsp and results.jsp display a search form and results. The following illustration shows the UI that is displayed in search.jsp.

[Illustration: search form displayed by search.jsp]

The following illustration shows the directory structure for this simple application.

[Illustration: directory structure of the sample application]

The SearchServlet class gets the query builder instance to create a search. The variables from the search JSP page are saved for the QueryBuilder ("ft" for full-text, and "object_name"):

    String fulltextValue = httpServletRequest.getParameter("ft");
    String objectNameValue = httpServletRequest.getParameter("object_name");
    String docbase = httpServletRequest.getParameter("docbase");
    IDfSearchService searchService = client.newSearchService(sMgr, docbase);
    IDfQueryManager queryManager = searchService.newQueryMgr();
    IDfQueryBuilder queryBuilder = queryManager.newQueryBuilder("dm_document");

IDfSearchService (com.documentum.fc.client.search) is the entry point to search-related services: query building, query execution, results manipulation, available sources, and query metadata.

The following lines in the search servlet set the result attributes to be displayed. The servlet then adds the source repository, which can either be added to the UI or set in the servlet class. Next, the servlet builds an expression set.
The method addFullTextExpression adds the string from the search form. The method addSimpleAttrExpression adds the object name and operator from the form:

    queryBuilder.addResultAttribute("object_name");
    queryBuilder.addResultAttribute("summary");
    queryBuilder.addResultAttribute("score");
    queryBuilder.addSelectedSource(docbase);
    IDfExpressionSet rootExpressionSet = queryBuilder.getRootExpressionSet();
    if (fulltextValue != null)
        rootExpressionSet.addFullTextExpression(fulltextValue);
    if (objectNameValue != null)
        rootExpressionSet.addSimpleAttrExpression(
            "object_name", IDfValue.DF_STRING,
            IDfSimpleAttrExpression.SEARCH_OP_CONTAINS,
            false, false, objectNameValue);

The following lines execute the query synchronously by using the synchronous call blockingSearch with a timeout of 60 seconds. The query processor handles the query execution. When the query has finished, the control is forwarded to the JSP page to build the results page.

    IDfQueryProcessor queryProcessor = searchService.newQueryProcessor(
        queryBuilder, true);
    queryProcessor.blockingSearch(60000);

The following code generates the results JSP page. The interface IDfResultEntry is like IDfTypedObject but is not modifiable.
<%IDfResultsSet results = queryProcessor.getResults(); for (int index = 0; index < results.size(); index++) { IDfResultEntry result = results.getResultAt(index); %> <table border="0" cellpadding="0" cellspacing="0" width="100%" style=" margin-bottom: 8px"> <tr><td width="5"/><td width="1"> </td> <td> <div class="result-title"><%=result.getString("object_name")%> </div> <div class="result-score"> <%= result.getSource() %> - <i> <%=(int)(result.getScore() * 100)%>%</i> </div> <br/><font size="-1"><%=result.getString("summary")%></font> </td> </tr><tr height="1"></tr> </table> <% } %> EMC Documentum Version 7.2 Search Development Guide 33 Configuring and Customizing DFC Search Search an external source that requires authentication This example extends the first one and illustrates how to create a search on an external source. The Federated Search Server handles communication with the external source. The following configuration in the dfc.properties file is required: dfc.search.external_sources.enable = true dfc.search.external_sources.host = <host_name> The external source can be another repository, an eRoom, or a web site. Refer to Federated Search Services (FS2) documentation for details about out-of-the-box adapters and adapter development. The query building and query execution are similar for one or for several sources. When you query external sources, you must do three tasks: • Get the list of available sources. • Add the sources to the query. • Register the authentication information (the credentials) with the SessionManager. The following example illustrates these tasks. 
    IDfSearchSourceMap sourceMap = searchService.getSourceMap();
    // Get the list of available sources
    IDfEnumeration sources = sourceMap.getAvailableSources();
    while (sources.hasMoreElements()) {
        IDfSearchSource source = (IDfSearchSource) sources.nextElement();
        String sourceName = source.getName();
        // Add source in query builder
        queryBuilder.addSelectedSource(sourceName);
        // That would come from the custom application
        String loginName = getLoginName(sourceName);
        String loginPassword = getLoginPassword(sourceName);
        // If need be, check login capability
        // source.hasCapability(IDfSearchSource.CAP_LOGIN)
        // Set the credentials for the user
        IDfLoginInfo loginInfoObj = clientx.getLoginInfo();
        loginInfoObj.setUser(loginName);
        loginInfoObj.setPassword(loginPassword);
        // Add credentials for the source in Session manager
        sessionManager.setIdentity(sourceName, loginInfoObj);
    }

The instance of IDfSearchSourceMap is a map of all available search sources, including external sources from FS2. It is like IDfDocbaseMap, which provides information about the repositories known to a connection broker. The same interface, IDfSessionManager, holds the credentials for the current repository, for any other Documentum repository, and for external sources.

Asynchronous search example

An asynchronous search is also called a "non-blocking" search because it allows you to display results as they come in. You do not have to wait for the complete result set. You can also display and update the status of the query in real time (such as "done", "in progress", or "failed"). Several calls are made to populate the results, each time retrieving the next results. This approach is useful when retrieving large result sets or when querying sources with different response times.

This example differs from the first example in the execution part.
Instead of calling blockingSearch() and indicating a timeout, we call the search() method and provide a listener class that extends DfGenericQueryListener. The query runs in the background, and the query listener is notified of new results and execution events. The notification methods are the following:
• onQueryCompleted(): Query execution finished (successfully or with errors).
• onResultChange(): New results have been received from the sources.
• onStatusChange(): An event has occurred. It can be related to the query execution status or to possible errors.

    IDfQueryProcessor queryProcessor = searchService.newQueryProcessor(
        queryBuilder, true);
    // Add the notification interface
    QueryListener queryListener = new QueryListener(queryProcessor);
    queryProcessor.addListener(queryListener);
    // Call the asynchronous search method
    queryProcessor.search();

After you launch the search, use IDfQueryStatus to obtain information about the status of the query and the sources. Use IDfSourceStatus to obtain status information for a specific source.

Using the visitor API

You can use the visitor API in DFC to visit nodes in the expression tree. The following example creates a QueryDumper class that visits the expressions in the query.
    import java.util.Calendar;

    import com.documentum.fc.client.IDfEnumeration;
    // The search interfaces below are assumed to live in the same package
    // as DfExpressionVisitor.
    import com.documentum.fc.client.search.DfExpressionVisitor;
    import com.documentum.fc.client.search.IDfAttrExpression;
    import com.documentum.fc.client.search.IDfExpressionSet;
    import com.documentum.fc.client.search.IDfFullTextExpression;
    import com.documentum.fc.client.search.IDfRelativeDateExpression;
    import com.documentum.fc.client.search.IDfSimpleAttrExpression;
    import com.documentum.fc.client.search.IDfValueListAttrExpression;
    import com.documentum.fc.client.search.IDfValueRangeAttrExpression;
    import com.documentum.fc.common.DfException;
    import com.documentum.fc.common.IDfValue;

    // Note: s_operationMap (operator code to operator name) and ReflectionUtil
    // are not defined in this listing.
    class QueryDumper extends DfExpressionVisitor {
        private StringBuffer m_expressionDump = new StringBuffer();

        public String dump() {
            return m_expressionDump.toString();
        }

        // Expression set: open a group named after the logical operator
        public final void visit(IDfExpressionSet expr) throws DfException {
            switch (expr.getLogicalOperator()) {
                case IDfExpressionSet.LOGICAL_OP_AND:
                    m_expressionDump.append("(and ");
                    break;
                case IDfExpressionSet.LOGICAL_OP_OR:
                    m_expressionDump.append("(or ");
                    break;
            }
            super.visit(expr);
            m_expressionDump.append(")");
        }

        public void visit(IDfValueListAttrExpression expr) throws DfException {
            super.visit(expr);
            dumpAttrAndOperator(expr);
            IDfEnumeration values = expr.getValues();
            while (values.hasMoreElements()) {
                String value = (String) values.nextElement();
                m_expressionDump.append(" ").append(value);
            }
            m_expressionDump.append("]");
        }

        public void visit(IDfFullTextExpression expr) throws DfException {
            super.visit(expr);
            m_expressionDump.append("[ft ").append(expr.getValue()).append("]");
        }

        public void visit(IDfSimpleAttrExpression expr) throws DfException {
            super.visit(expr);
            dumpAttrAndOperator(expr);
            // The null-check operators carry no value
            if ((expr.getSearchOperationCode() != IDfSimpleAttrExpression.SEARCH_OP_IS_NULL)
                && (expr.getSearchOperationCode() != IDfSimpleAttrExpression.SEARCH_OP_IS_NOT_NULL)) {
                m_expressionDump.append(" ").append(expr.getValue());
            }
            m_expressionDump.append("]");
        }

        public void visit(IDfRelativeDateExpression expr) throws DfException {
            super.visit(expr);
            dumpAttrAndOperator(expr);
            String timeUnitAsAString = ReflectionUtil.getConstantName(
                Calendar.class, expr.getTimeUnit());
            m_expressionDump.append(" ").append(expr.getRelativeTime()).append(
                " ").append(timeUnitAsAString).append("]");
        }

        public void visit(IDfValueRangeAttrExpression expr) throws DfException {
            super.visit(expr);
            dumpAttrAndOperator(expr);
            m_expressionDump.append(" ").append(expr.getFromValue()).append(" ").append(
                expr.getToValue()).append("]");
        }

        // Writes "[attribute operator(dataType)" for any attribute expression
        private void dumpAttrAndOperator(IDfAttrExpression expr) {
            m_expressionDump.append("[").append(expr.getAttrName()).append(" ");
            String searchOpAsAString = s_operationMap.get(expr.getSearchOperationCode());
            String valueDataTypeAsAString = ReflectionUtil.getConstantName(
                IDfValue.class, expr.getValueDataType());
            m_expressionDump.append(searchOpAsAString).append("(").append(
                valueDataTypeAsAString).append(")");
        }
    }

You can use the expression visitor in a class that accesses the query builder, such as a customized Webtop search class.
The following example gets the query expression set:

    QueryDumper queryDumper = new QueryDumper();
    IDfExpressionSet rootExpr = queryBuilder.getRootExpressionSet();
    rootExpr.acceptVisitor(queryDumper);
    System.out.println("query =" + queryDumper.dump());

Chapter 3 Customizing Search with DFS

This chapter contains the following topics:
• DFS Search Services
• Full-text and database searches
• Constructing a search
• Search service objects
• Search service operations

DFS Search Services

Search Services provides search capabilities against EMC Documentum repositories, as well as against external sources, using the Documentum Federated Search Services (FS2) server.

The Search service provides full-text and structured search capabilities against multiple EMC Documentum repositories (termed managed repositories in DFS). You must install and configure full-text indexing on Documentum repositories.

All DFC customizations can be used in DFS client applications. For DFC filters, see Transforming a query with a filter, page 25. See the EMC Community Network Documentum search and analytics forum to post your questions and see solutions offered by other customers and EMC employees.

External sources (termed external repositories) can also be searched. You must install FS2 adapters on external repositories (registered with an FS2 server) and deploy the Clustering SBO if the Content Server version is lower than 6.7.

To use the Search service it is also helpful to understand FTDQL queries, dfc.properties settings, and DQL hint file settings.

Full-text and database searches

Search service queries can be run as full-text queries, database queries against a managed or external repository, or mixed queries (both full-text index and database). The search query is a full-text or database search depending on the following factors:
• The availability to the service of indexed repositories.
• Settings in the DQL hints file, if present.
• The presence or absence of full-text expressions (a SEARCH DOCUMENT CONTAINS clause) in a DQL query.
• Explicit setting of setDatabaseSearch in a StructuredQuery.

Searches against a full-text index are case insensitive. Database searches are by default case sensitive. If a database query includes a SEARCH DOCUMENT CONTAINS clause in a PassthroughQuery or a FullTextExpression object in a StructuredQuery, the full-text expression is evaluated against the title, subject, and object_name of dm_sysobjects. If the repository does not support full-text queries, the query is not processed.

Constructing a search

Non-blocking (asynchronous) searches

Searches can be either blocking or non-blocking, depending on the Search Profile setting. By default, searches are blocking. Non-blocking searches display results dynamically. The client application does not have to wait for all results before displaying the first results.

The Search service supports non-blocking searches because:
• DFS relies on DFC, which supports asynchronous search execution.
• Query calls are non-blocking: multiple successive calls can be made to get new results and the query status.

The query status contains the status for each source repository: successful, more results expected, or failed with errors.

Caching mechanism

The Search service relies on a caching mechanism. The cache contains the search results, which are populated in the background for every search. The cache key is built from the queryId, the query definition, and the number of results requested; together these form the search context. To leverage the cache, subsequent calls have to use the same search context. If one of the search context elements is different, the search is re-executed. The cache is used to make successive calls. This way, the first results can be displayed while subsequent calls retrieve more results.
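The search-context rule can be sketched in plain Java. This is an illustration only: SearchContext and SearchContextCache are hypothetical names, not DFS classes, and the real cache is internal to the Search service. The sketch shows only that a lookup hits the cache when the queryId, the query definition, and the requested result count all match, and that it misses (so the search is executed again) when any one of them differs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration of a search context used as a cache key.
// These are not DFS classes.
final class SearchContext {
    private final String queryId;
    private final String queryDefinition;
    private final int requestedResults;

    SearchContext(String queryId, String queryDefinition, int requestedResults) {
        this.queryId = queryId;
        this.queryDefinition = queryDefinition;
        this.requestedResults = requestedResults;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof SearchContext)) return false;
        SearchContext that = (SearchContext) other;
        return requestedResults == that.requestedResults
            && queryId.equals(that.queryId)
            && queryDefinition.equals(that.queryDefinition);
    }

    @Override
    public int hashCode() {
        return Objects.hash(queryId, queryDefinition, requestedResults);
    }
}

final class SearchContextCache {
    private final Map<SearchContext, String> entries = new HashMap<>();
    private int executions = 0;

    // Returns the cached result for this context, or "executes" the search on a miss.
    String fetch(SearchContext context) {
        String cached = entries.get(context);
        if (cached != null) {
            return cached;
        }
        executions++;
        String results = "results#" + executions; // stand-in for a real result set
        entries.put(context, results);
        return results;
    }

    int executions() {
        return executions;
    }
}
```

Calling fetch twice with equal contexts runs the stand-in search once; changing only the requested result count produces a second execution, which mirrors the re-execution behavior described above.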
If one source fails or takes too long to return results, the search is not blocked and the first available results are returned. When a query is not found in the cache (cache miss), the operation, which contains the query execution parameters, re-executes the query.

The cache clean-up mechanism is both time-based and size-based. You can modify the cache clean-up properties by editing the dfs-runtime.properties file. To modify the cache period, set the dfs.search_query_cache_house_keeper.period parameter. The default value is 10 (minutes), which leaves enough time to compute clustering operations for the result set. If you have a large number of search operations, reduce the cache period to avoid excessive memory usage. To modify the cache size, set the dfs.search_query_cache_house_keeper.max_queries parameter. The default value is 100 (queries). As a guideline, one cache entry for a simple query on dm_document with 350 results uses around 1 MB of memory. For such queries, with the default cache size value of 100, the cache does not use more than 100 MB of memory.

Computing clusters

The search results can be displayed in clusters. Clusters group results dynamically into categories based on the values of the result attributes. The clustering information is returned as soon as enough results are gathered to compute clusters. Clusters can then be used to navigate into the search results.

For each level of clusters, a strategy defines which attributes are used to compute the clusters. For example, you can define a first strategy to compute the first level of clusters on the values for Author, Source, and Owner. Define a second strategy to display clusters on a subset of the results using the values for Author, Format, and Modified Date.

Clusters can be computed on search results, but they can also be computed on a subset of the results. Query results are not cached.
If they are no longer available in the search context, execute the query again. The search context is the context in which the query was executed.

The clustering operations of the Search service (getClusters and getSubclusters) depend on the Clustering SBO. This SBO must be installed on a global registry. Starting with Content Server 6.7, the Clustering SBO is installed with Content Server.

Computing facets

A facet definition is like a cluster strategy. The definition indicates on which attribute the facet is computed. However, there are some fundamental differences:
• xPlore computes facets on the entire result set. Clusters are computed on a subset of results retrieved by the application.
• Facets are more exhaustive and use a group-by technique. The clustering algorithm uses tokenizers (often with text analytics), relative grouping sizes, and thresholds.

Consequently, clusters provide a global idea of the result set, while facets are more accurate and can be used, for example, for navigation.

Facets are like clusters. They group results into categories based on common attribute values.

Other differences:
• The tokenizers define the cluster order. Facets are sorted using the facetSort parameter.
• Clusters usually have a threshold, that is, a minimum number of documents, to optimize the number of groupings.
• It is possible to set a maximum number of facets to retrieve. In contrast, the number of clusters depends on the number of results in the result set.
• Facets must be defined before the query execution; clusters are computed after the query execution.

For full information on facets, see EMC Documentum xPlore Administration and Development Guide.

Searching external repositories

To run searches against external repositories:
• Install the FS2 server. The EMC Documentum Federated Search Services Installation Guide provides information about how to install the FS2 server.
• Install and configure FS2 adapters as described in EMC Documentum Federated Search Services Adapter Installation Guide.
• Set the following properties in the file dfc.properties:
  – dfc.search.external_sources.enable=true
  – dfc.search.external_sources.host=<fs2_host>
  – dfc.search.external_sources.port=<fs2_port> (default is 3005)

Search service objects

This section briefly describes objects used by this service. For field-level information, please refer to the Javadoc or Windows help.

PassthroughQuery

The PassthroughQuery object is a container for a DQL or FTDQL query string. It can be executed as either a full-text or database query. A PassthroughQuery can search multiple managed repositories, but does not run against external repositories. To search an external repository, a client must use a StructuredQuery.

StructuredQuery

A structured query defines a query using an object-oriented model. An ExpressionSet object defines a set of criteria that constrain the query. An ordered list of RepositoryScope objects defines the scope of the query (sources). The structured query can also contain a list of FacetDefinition objects that are used to retrieve the facets with the results and a list of PartitionScope objects to limit the search to specific partitions. If you specify several partition scopes, all the specified partitions are searched.

The ExpressionScope object allows you to add an ExpressionSet to the query for a given repository. The expression set is added to the root expression set of the query. This mechanism can be useful when executing the same query against several sources.

The following table summarizes the StructuredQuery fields.

Field | Data Type | Description
scopes | List<RepositoryScope> | Specifies the list of RepositoryScope objects that define the repositories against which the query is executed.
partitionScopes | List<PartitionScope> | (Since 6.7) Specifies the list of PartitionScope objects that define the partitions against which the query is executed for a specific source. A partition is an xPlore collection. This parameter is ignored if xPlore is not the indexing engine.
expressionScopes | List<ExpressionScope> | (Since 6.7) Specifies the list of ExpressionScope objects. An ExpressionScope object is used to specify expressions that are only added for a specific source.
isDatabaseSearch | boolean | Specifies whether the query must be executed against the database and not against the indexer. Default is false.
isIncludeAllVersions | boolean | Specifies whether the query must return all matching versions (true) or only the current version (false) of the objects. Default is false.
isIncludeHidden | boolean | Specifies whether hidden objects must be filtered from the result set (false) or kept (true). Default is false.
rootExpressionSet | ExpressionSet | Specifies the query constraints in an ExpressionSet.
orderByClauses | List<OrderByClause> | Specifies the list of OrderByClause objects.
facetDefinitions | List<FacetDefinition> | (Since 6.6) Specifies the list of FacetDefinition objects for the query.
maxResultsForFacets | int | (Since 6.6) Specifies the total number of unique results available from the source, after deduplication (if deduplication is available), that are used to compute facets. Default value is -1, which means that the configuration of the indexer is used.
isHitcountRetrieved | boolean | Specifies whether the hit count must be computed and retrieved even if no facets are requested. Default is false, which means that the hit count is only computed when facets are requested in the query.
maxHitcount | int | Specifies the maximum number of results to be returned as the hit count. A smaller number lowers the performance impact of the hit count computation. Default value is -1, which means that the DFC property dfc.search.max_results_per_source is used (10000).

Scope objects

PartitionScope allows you to specify a partition (xPlore collection) when querying a repository. It is only used with the xPlore indexer and ignored in all other cases. An xPlore partition is a storage area (or "file store") in the Content Server mapped to an xPlore collection.

RepositoryScope enables a search to be constrained to a specific folder of a repository. It can also exclude folders.

An expression set and repository name define an ExpressionScope. The expression scope allows you to add an expression set only for the specified repository. This mechanism is useful when you execute the same query against several sources.

Expression objects

An ExpressionSet is a collection of Expression objects, each of which defines either a full-text expression or a search constraint on a single property. The Expression instances comprising the ExpressionSet are related to one another by a single logical operator (either AND or OR). The ExpressionSet as a whole defines the complete set of search criteria that is applied during a search. The top-level Expression contained in a StructuredQuery is referred to as the root expression of the expression tree.

Three concrete classes extend the Expression class: FullTextExpression, PropertyExpression, and ExpressionSet.
• FullTextExpression
  FullTextExpression encapsulates a search string accessed using the getValue and setValue methods. This string supports the operators "AND", "OR", and "NOT", as well as parentheses.
• PropertyExpression
  PropertyExpression provides a search constraint based on a single property.
• ExpressionSet
  Extends Expression and contains a set of Expression instances. An ExpressionSet can nest ExpressionSet instances. Nesting allows construction of arbitrarily complex expression trees.
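The nesting described above can be pictured with a small, self-contained Java sketch. The classes Expr, FullTextExpr, PropertyExpr, and ExprSet are hypothetical stand-ins invented for this illustration; they are not the DFS Expression classes and do not reproduce their constructors or fields. The condition name EQUAL is likewise only a placeholder string. The sketch shows a root set joined by AND that contains a full-text expression and a nested set joined by OR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the Expression hierarchy; not the DFS classes.
abstract class Expr {
    abstract String render();
}

final class FullTextExpr extends Expr {
    private final String value;

    FullTextExpr(String value) {
        this.value = value;
    }

    @Override
    String render() {
        return "ft(" + value + ")";
    }
}

final class PropertyExpr extends Expr {
    private final String property;
    private final String condition;
    private final String value;

    PropertyExpr(String property, String condition, String value) {
        this.property = property;
        this.condition = condition;
        this.value = value;
    }

    @Override
    String render() {
        return property + " " + condition + " " + value;
    }
}

final class ExprSet extends Expr {
    enum Operator { AND, OR }

    private final Operator operator;
    private final List<Expr> children = new ArrayList<>();

    ExprSet(Operator operator) {
        this.operator = operator;
    }

    // Adds a child expression; a child can itself be an ExprSet (nesting).
    ExprSet add(Expr child) {
        children.add(child);
        return this;
    }

    // All children of one set are joined by the same logical operator.
    @Override
    String render() {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) {
                sb.append(" ").append(operator).append(" ");
            }
            sb.append(children.get(i).render());
        }
        return sb.append(")").toString();
    }
}
```

Because each set carries exactly one operator, mixing AND and OR in the same query requires nesting one set inside another, which is the reason the tree structure exists.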
The following table describes the concrete subtypes of the ExpressionValue class.

Table 7 ExpressionValue subtypes

Subtype | Description
SimpleValue | Contains a single String value.
RangeValue | Contains two String values representing the start and end of a range. The values can represent dates (using the DateFormat specified in the StructuredQuery) or integers.
ValueList | Contains an ordered List of String values.
RelativeDateValue | Contains a TimeUnit setting and an integer value representing the number of time units. TimeUnit values are MILLISECOND, SECOND, MINUTE, HOUR, DAY, ERA, WEEK, MONTH, YEAR. The integer value can be negative or positive to represent a past or future time.

Condition is an enumerated type that expresses the logical condition to use when comparing a repository value to a value in an Expression. A specific Condition is included in a PropertyExpression to determine precisely how to constrain the search on the property value.

QueryResult

Both the Search and Query services use the QueryResult class as a container for the set of results returned by the execute operation. The QueryResult class also contains the queryId generated for this query. Use the queryId to uniquely identify the query. The queryId is a key in the cache that identifies the query for a given user.

Status objects

QueryStatus contains status information returned by a search operation. Status information is available for each source repository.

Table 8 QueryStatus fields

Field | Data Type | Description
repositoryStatusInfos | List<RepositoryStatusInfo> | Specifies the list of RepositoryStatusInfo where the query has been executed.
hasMoreResults | boolean | Specifies if the repository can return more results.
isCompleted | boolean | Specifies if the query execution is completed.
globalResultsCount | int | Specifies the total number of unique results available from the source, after deduplication (if deduplication is available).

RepositoryStatusInfo contains data related to a query or search result regarding the status of the search in a specific repository. RepositoryStatusInfo instances are returned in a List<RepositoryStatusInfo> within the QueryStatus of a QueryResult, which is returned by a search or query operation. Starting with DFS version 6.7, RepositoryStatusInfo also contains a list of repositoryEvent objects. Use these objects to access information available at the DFC level in the IDfQueryEvent objects, such as the native query or the type of error.

RepositoryStatus provides detailed information about the status of an executed query as it pertains to a specific repository.

Cluster objects

The QueryCluster object is a container for ClusterTree objects for a given query. It also contains the queryId, which is used to uniquely identify the query. The queryId can be used to access any part of the result set. For example, you can retrieve the next set of results or clusters on all or some of the results.

A ClusterTree is a container for Cluster objects that are calculated according to a ClusteringStrategy. The field isRefreshable indicates that all clusters have been computed and the search is complete or that more results can be returned by the source.

The Cluster class represents a cluster or group of objects that have something in common. These objects are grouped into categories by comparing the values of their attributes.

Table 9 Cluster fields

Field | Data Type | Description
clusterValues | List<String> | Specifies the list of values that are used to generate the cluster name.
clusterSize | int | Specifies the number of objects in the cluster.
clusterObjectsIdentities | ObjectIdentitySet | Specifies a list of ObjectIdentity instances for the objects belonging to this cluster.
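As a rough picture of what a cluster's value and size represent, the following Java sketch groups attribute values by exact match and drops groups that fall below a minimum size. It is a simplified illustration under stated assumptions: ClusterSketch and its cluster method are hypothetical names, and the grouping is a plain count by identical value. It does not reproduce the algorithm of the Clustering SBO.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the Clustering SBO algorithm.
final class ClusterSketch {

    // Groups identical attribute values and returns value -> group size,
    // keeping only the groups whose size reaches the threshold.
    static Map<String, Integer> cluster(List<String> attributeValues, int threshold) {
        Map<String, Integer> sizes = new LinkedHashMap<>();
        for (String value : attributeValues) {
            sizes.merge(value, 1, Integer::sum);
        }
        sizes.values().removeIf(size -> size < threshold);
        return sizes;
    }
}
```

In this sketch each map key plays the role of a cluster value and each map value the role of the cluster size; with a threshold of 2, an author who appears on a single result does not produce a cluster.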
A ClusterTree object uses the ClusteringStrategy class to set the strategy for calculating clusters. The clustering strategy can use tokenizers to group the clusters (for example, dates can be grouped into quarters). In this case, you define which tokenizer to apply for a given attribute. The ClusteringStrategy class also controls the amount of data returned by the operation.

Table 10 ClusteringStrategy fields

Field | Data Type | Description
strategyName | String | Specifies the strategy name.
attributes | List<String> | Specifies the list of attributes used in this strategy.
clusteringRange | ClusteringRange | Specifies the number of clusters computed by the clustering service. Possible values are: LOW, MEDIUM, HIGH.
clusteringThreshold | int | Specifies the minimum number of results required to create a cluster.
returnIdentitySet | boolean | Specifies whether the object identities are returned.
tokenizers | PropertySet | Specifies the tokenizer to apply. The PropertySet is a set of StringProperty where the name is the attribute name and the value is the tokenizer name to apply to this attribute. Available tokenizers are listed in ClusteringStrategy.

Table 11 List of Tokenizers available for the clustering

Tokenizer name | Description
dm_object_name | Tokenizes an object name attribute. Strings are cleaned before being used: underscore characters are replaced by spaces and the extensions are removed.
dm_percentage | Tokenizes a score attribute or a numeric value between 0 and 1. The suffix "%" is added to the percentage.
dm_date_by_quarter | Tokenizes a date attribute to create clusters by quarter (2006 Q1, 2006 Q2, 2006 Q3 ...)
dm_dynamic_size | Tokenizes a string size attribute and dynamically groups the input sizes.
dm_size_by_range | Tokenizes a string size attribute and creates predefined ranges. The ranges are 0KB-100KB, 100KB-1MB, 1MB-10MB, 10MB-100MB, >100MB
dm_date_by_day | Tokenizes a string date attribute according to the "dd/MM/yyyy" pattern.
dm_exact_match | Tokenizes any string and groups the ones that are exactly the same.
dm_text | Parses, lemmatizes, and dynamically groups any string attribute.
dm_number | Tokenizes strings to obtain numbers and dynamically groups the input numbers.
dm_author | Tokenizes strings to obtain lists of authors and dynamically groups the authors. By default, the author names are expected to start with the first name.
dm_collection | Tokenizes strings of the form "category1:category2:category3" and groups dynamically according to the most significant categories or sub-categories.
dm_source | Tokenizes an r_source attribute and generates a suitable source name for the external source.

Facet objects

The QueryFacet object is a container for Facet objects for a given query. It is computed on query results. The queryId field identifies the query. The QueryFacet also contains the QueryStatus. It is like the QueryResult object.

A Facet is a container for FacetValue objects and a FacetDefinition object. xPlore computes the facet values according to the facet definition.

The FacetValue class represents a group of results having attribute values in common. A FacetValue has a value and a count indicating the number of results contained in this group. It can also have a list of subfacet values and a set of properties.

Table 12 FacetValue fields

Field | Data Type | Description
value | string | The display value or label for this FacetValue.
count | int | Specifies the number of results for the facet value.
properties | PropertySet | Specifies a list of Property instances used to define custom properties. For example, facets grouped by day are defined by a starting and an ending date and time.
subFacetValues | List<FacetValue> | Specifies the list of FacetValue objects.

A Facet object uses the FacetDefinition class to define how to build a Facet.
Table 13 FacetDefinition fields

Field | Data Type | Description
name | String | Specifies the definition name.
attributes | List<String> | Specifies the list of attributes used in this definition. If not specified, the definition name is used as an attribute.
groupBy | String | Specifies the "group by" strategy. Possible values are: string (default value), range (for numeric values), location (for CIS entities). The range grouping requires a range property that defines the subvalues to use. For dates, the possible values are: day, week, month, year, and relativeDate. The relativeDate subvalues are: today, yesterday, this week, this month, this year, last year, and older. An optional property timezone allows you to specify the client timezone, such as GMT+1.
maxFacetValues | int | Specifies the maximum number of FacetValue objects to build a Facet. If not set, it returns ten values. If set to -1, it returns all values.
facetSort | FacetSort | Specifies the sort order to apply. Possible values are: FREQUENCY (descending order based on count values), VALUE_ASCENDING (ascending order based on alphanumeric values), VALUE_DESCENDING (descending order based on alphanumeric values), NONE.
properties | PropertySet | Specifies a list of Property instances used to define custom properties.
subFacetDefinition | FacetDefinition | Specifies a FacetDefinition for subfacet values, if any.

Search service operations

The following operations are available in the search service.

getRepositoryList operation

The getRepositoryList operation provides a list of managed and external repositories that are available to the service for searching.
Java syntax

    List<Repository> getRepositoryList(OperationOptions options)
        throws SearchServiceException

C# syntax

    List<Repository> GetRepositoryList(OperationOptions options)

Parameter   Data type          Description
options     OperationOptions   Contains profiles and properties that specify operation behaviors. Not used.

Returns a List of Repository instances.

The following example demonstrates the getRepositoryList operation.

Java: Getting a repository list

    public List<Repository> repositoryList() {
        try {
            ServiceFactory serviceFactory = ServiceFactory.getInstance();
            ISearchService searchService =
                serviceFactory.getService(ISearchService.class, serviceContext);
            List<Repository> repositoryList =
                searchService.getRepositoryList(new OperationOptions());
            for (Repository r : repositoryList) {
                System.out.println(r.getName());
            }
            return repositoryList;
        } catch (Exception e) {
            e.printStackTrace();
            throw new RuntimeException(e);
        }
    }

C#: Getting a repository list

    public List<Repository> RepositoryList() {
        try {
            List<Repository> repositoryList =
                searchService.GetRepositoryList(new OperationOptions());
            foreach (Repository r in repositoryList) {
                Console.WriteLine(r.Name);
            }
            return repositoryList;
        } catch (Exception e) {
            Console.WriteLine(e.StackTrace);
            throw new Exception(e.Message);
        }
    }

execute operation

The execute operation searches a repository or set of repositories and returns search results.

Java syntax

    QueryResult execute(Query query, QueryExecution execution, OperationOptions options)
        throws SearchServiceException

C# syntax

    QueryResult Execute(Query query, QueryExecution execution, OperationOptions options)

Parameter   Data type          Description
query       Query              Either a PassthroughQuery or a StructuredQuery.
execution   QueryExecution     Object describing execution parameters. Query execution parameters are described in .
options     OperationOptions   Contains profiles and properties that specify operation behaviors. For the execute operation, the profiles primarily provide filters that modify the contents of the DataPackage returned in QueryResult. An applicable profile is the SearchProfile. In a PropertyProfile, only the property filter mode SPECIFIED_BY_INCLUDE is supported for this operation; other property filter modes are not supported. The SearchProfile sets the parameters for the search execution. Set the isAsyncCall parameter to indicate whether the search is blocking.

Returns a QueryResult instance.

Java: Simple PassthroughQuery

    public QueryResult simplePassthroughQuery() {
        QueryResult queryResult;
        try {
            ServiceFactory serviceFactory = ServiceFactory.getInstance();
            ISearchService searchService =
                serviceFactory.getService(ISearchService.class, serviceContext);
            String queryString =
                "select distinct r_object_id from dm_document order by r_object_id ";
            int startingIndex = 0;
            int maxResults = 20;
            int maxResultsPerSource = 60;
            PassthroughQuery q = new PassthroughQuery();
            q.setQueryString(queryString);
            q.addRepository(defaultRepositoryName);
            QueryExecution queryExec =
                new QueryExecution(startingIndex, maxResults, maxResultsPerSource);
            queryExec.setCacheStrategyType(CacheStrategyType.NO_CACHE_STRATEGY);
            queryResult = searchService.execute(q, queryExec, null);

            // Check the status reported for the first repository
            QueryStatus queryStatus = queryResult.getQueryStatus();
            RepositoryStatusInfo repStatusInfo =
                queryStatus.getRepositoryStatusInfos().get(0);
            if (repStatusInfo.getStatus() == Status.FAILURE) {
                System.out.println(repStatusInfo.getErrorTrace());
                throw new RuntimeException("Query failed to return result.");
            }
            System.out.println("Query returned result successfully.");
            DataPackage dp = queryResult.getDataPackage();
            System.out.println("DataPackage contains "
                + dp.getDataObjects().size() + " objects.");
            for (DataObject dataObject : dp.getDataObjects()) {
                System.out.println(dataObject.getIdentity());
            }
        } catch (Exception e) {
            e.printStackTrace();
            throw new RuntimeException(e);
        }
        return queryResult;
    }

C#: Simple PassthroughQuery

    public QueryResult SimplePassthroughQuery() {
        QueryResult queryResult;
        try {
            string queryString =
                "select distinct r_object_id from dm_document order by r_object_id ";
            int startingIndex = 0;
            int maxResults = 20;
            int maxResultsPerSource = 60;
            PassthroughQuery q = new PassthroughQuery();
            q.QueryString = queryString;
            q.AddRepository(DefaultRepository);
            QueryExecution queryExec =
                new QueryExecution(startingIndex, maxResults, maxResultsPerSource);
            queryExec.CacheStrategyType = CacheStrategyType.NO_CACHE_STRATEGY;
            queryResult = searchService.Execute(q, queryExec, null);

            // Check the status reported for the first repository
            QueryStatus queryStatus = queryResult.QueryStatus;
            RepositoryStatusInfo repStatusInfo = queryStatus.RepositoryStatusInfos[0];
            if (repStatusInfo.Status == Status.FAILURE) {
                Console.WriteLine(repStatusInfo.ErrorTrace);
                throw new Exception("Query failed to return result.");
            }
            Console.WriteLine("Query returned result successfully.");
            DataPackage dp = queryResult.DataPackage;
            Console.WriteLine("DataPackage contains " + dp.DataObjects.Count + " objects.");
            foreach (DataObject dataObject in dp.DataObjects) {
                Console.WriteLine(dataObject.Identity);
            }
        } catch (Exception e) {
            Console.WriteLine(e.StackTrace);
            throw new Exception(e.Message);
        }
        return queryResult;
    }

Java: Structured query

    public void simpleStructuredQuery() {
        try {
            ServiceFactory serviceFactory = ServiceFactory.getInstance();
            ISearchService searchService =
                serviceFactory.getService(ISearchService.class, serviceContext);
            String repoName = defaultRepositoryName;

            // Create query
            StructuredQuery q = new StructuredQuery();
            q.addRepository(repoName);
            q.setObjectType("dm_document");
            q.setIncludeHidden(true);
            q.setDatabaseSearch(true);
            ExpressionSet expressionSet = new ExpressionSet();
            expressionSet.addExpression(
                new PropertyExpression("owner_name", Condition.CONTAINS, "admin"));
            q.setRootExpressionSet(expressionSet);

            // Execute query
            int startingIndex = 0;
            int maxResults = 20;
            int maxResultsPerSource = 60;
            QueryExecution queryExec =
                new QueryExecution(startingIndex, maxResults, maxResultsPerSource);
            QueryResult queryResult = searchService.execute(q, queryExec, null);
            QueryStatus queryStatus = queryResult.getQueryStatus();
            RepositoryStatusInfo repStatusInfo =
                queryStatus.getRepositoryStatusInfos().get(0);
            if (repStatusInfo.getStatus() == Status.FAILURE) {
                System.out.println(repStatusInfo.getErrorTrace());
                throw new RuntimeException("Query failed to return result.");
            }

            // Print results
            for (DataObject dataObject : queryResult.getDataObjects()) {
                System.out.println(dataObject.getIdentity());
            }
        } catch (Exception e) {
            e.printStackTrace();
            throw new RuntimeException(e);
        }
        System.out.println("test completed - OK");
    }

C#: Structured query

    public void SimpleStructuredQuery() {
        try {
            String repoName = DefaultRepository;

            // Create query
            StructuredQuery q = new StructuredQuery();
            q.AddRepository(repoName);
            q.ObjectType = "dm_document";
            q.IsIncludeHidden = true;
            q.IsDatabaseSearch = true;
            ExpressionSet expressionSet = new ExpressionSet();
            expressionSet.AddExpression(
                new PropertyExpression("owner_name", Condition.CONTAINS, "admin"));
            q.RootExpressionSet = expressionSet;

            // Execute query
            int startingIndex = 0;
            int maxResults = 20;
            int maxResultsPerSource = 60;
            QueryExecution queryExec =
                new QueryExecution(startingIndex, maxResults, maxResultsPerSource);
            QueryResult queryResult = searchService.Execute(q, queryExec, null);
            QueryStatus queryStatus = queryResult.QueryStatus;
            RepositoryStatusInfo repStatusInfo = queryStatus.RepositoryStatusInfos[0];
            if (repStatusInfo.Status == Status.FAILURE) {
                Console.WriteLine(repStatusInfo.ErrorTrace);
                throw new Exception("Query failed to return result.");
            }

            // Print results
            foreach (DataObject dataObject in queryResult.DataObjects) {
                Console.WriteLine(dataObject.Identity);
            }
        } catch (Exception e) {
            Console.WriteLine(e.Message);
            Console.WriteLine(e.StackTrace);
            throw new Exception(e.Message);
        }
    }

stopSearch operation

The stopSearch operation stops the execution of the query passed in as parameter.
The execute operation must be called first to launch the query. Once the query is stopped, the results retrieved so far remain available. You can then call the getClusters, getSubclusters, and getResultsProperties operations, passing in the Query and QueryExecution parameters of the stopped query. To restart the stopped search, call the execute operation with the same query and query execution objects, without the queryId.

Java syntax

    QueryStatus stopSearch(Query query, QueryExecution execution)
        throws SearchServiceException

C# syntax

    QueryStatus StopSearch(Query query, QueryExecution execution)

Parameter   Data type        Description
query       Query            Either a PassthroughQuery or a StructuredQuery.
execution   QueryExecution   Object describing execution parameters. Query execution parameters are described in Documentum Enterprise Content Services Reference.

Returns a QueryStatus instance of the stopped query.

Java: stopping a search

    public QueryStatus stopSearch() throws ServiceException {
        // Specify query: can be either a PassthroughQuery or a StructuredQuery
        PassthroughQuery query = new PassthroughQuery();
        query.setQueryString("select * from dm_document");
        query.addRepository(getEnv().getDefaultDocbaseName());

        // Specify query execution
        QueryExecution queryExecution = new QueryExecution();
        queryExecution.setMaxResultCount(100);
        queryExecution.setMaxResultPerSource(350);

        // Set operation options
        OperationOptions operationOptions = new OperationOptions();
        SearchProfile searchProfile = new SearchProfile();
        searchProfile.setAsyncCall(true);
        operationOptions.setSearchProfile(searchProfile);
        PropertyProfile propertyProfile = new PropertyProfile();
        propertyProfile.setFilterMode(PropertyFilterMode.SPECIFIED_BY_INCLUDE);
        operationOptions.setPropertyProfile(propertyProfile);

        // Start the search
        QueryResult results =
            m_searchService.execute(query, queryExecution, operationOptions);

        // Set query id
        queryExecution.setQueryId(results.getQueryId());

        // Optional: check the status is RUNNING before stopping the search

        // Stop the search
        QueryStatus status = m_searchService.stopSearch(query, queryExecution);

        // Optional: check the status is STOPPED
        return status;
    }

getClusters operation

The getClusters operation computes clusters on query results. To run the query and get results, call the execute operation first. The getClusters operation uses the same Query and QueryExecution parameters. If the query has not run or if results are no longer available in the search context, you must supply these parameters to reexecute the query. Set the SearchProfile to non-blocking to compute clusters on the first available results. Set it to blocking to compute clusters only when all results are returned. By default, the execution is synchronous (blocking) and clusters are computed when all results are returned.

Java syntax

    QueryCluster getClusters(Query query, QueryExecution execution,
        OperationOptions options) throws SearchServiceException;

C# syntax

    QueryCluster GetClusters(Query query, QueryExecution execution,
        OperationOptions options)

Parameter   Data type          Description
query       Query              Contains the query definition and the repositories against which the query is run.
execution   QueryExecution     Object describing execution parameters. Query execution parameters are described in Documentum Enterprise Content Services Reference.
options     OperationOptions   Contains profiles and properties that specify operation behaviors. Only the ClusteringProfile and the SearchProfile are applicable. If this object is null or if there is no ClusteringStrategy, no clusters are returned. The ClusteringProfile contains a list of ClusteringStrategy instances. The ClusteringStrategy is used to compute the ClusterTrees and controls the amount of data returned by the operation.
Returns a QueryCluster object containing a list of ClusterTree objects and the id of the query. A SearchServiceException is thrown, in particular, when the Clustering SBO is not installed.

The following example demonstrates the getClusters operation.

    public QueryCluster getClusters() throws ServiceException {
        OperationOptions options = new OperationOptions();

        // Can be either a PassthroughQuery or StructuredQuery
        PassthroughQuery query = new PassthroughQuery();
        query.setQueryString("select * from dm_document");
        query.addRepository(YOUR_REPOSITORY);

        // Get 50 results
        QueryExecution queryExec = new QueryExecution(0, 50, 50);
        QueryResult results = searchService.execute(query, queryExec, options);

        // Get generated queryId and set it for subsequent calls
        String queryId = results.getQueryId();
        queryExec.setQueryId(queryId);

        // Get query clusters
        // Set ClusteringStrategy
        ClusteringStrategy strategy = new ClusteringStrategy();
        strategy.setStrategyName("Name");
        List<String> attrs = new ArrayList<String>(2);
        attrs.add("object_name");
        strategy.setAttributes(attrs);
        strategy.setReturnIdentitySet(true);
        strategy.setClusteringRange(ClusteringRange.HIGH);

        // Set ClusteringProfile
        ClusteringProfile profile = new ClusteringProfile(strategy);
        options.setClusteringProfile(profile);
        QueryCluster queryCluster =
            searchService.getClusters(query, queryExec, options);
        return queryCluster;
    }

getSubclusters operation

The getSubclusters operation computes clusters on a subset of the result set. The subset is specified in the ObjectIdentitySet. To run the query and get results, call the execute operation first. The getSubclusters operation uses the same Query and QueryExecution parameters. If the query has not run, or if results are no longer available in the search context, the query is executed according to the Query, QueryExecution, and OperationOptions parameters. Set the SearchProfile to non-blocking to compute clusters on the first available results. Set it to blocking to compute clusters only when all results are returned. By default, the execution is synchronous (blocking) and clusters are computed when all results are returned.

Java syntax

    QueryCluster getSubclusters(ObjectIdentitySet objectsToClusterize, Query query,
        QueryExecution execution, OperationOptions options)
        throws SearchServiceException;

C# syntax

    QueryCluster GetSubclusters(ObjectIdentitySet objectsToClusterize, Query query,
        QueryExecution execution, OperationOptions options)

Parameter             Data type           Description
objectsToClusterize   ObjectIdentitySet   Contains a list of ObjectIdentity instances specifying the objects on which the clusters are computed.
query                 Query               Contains the query definition and the repositories against which the query is run.
execution             QueryExecution      Object describing execution parameters. Query execution parameters are described in Documentum Enterprise Content Services Reference.
options               OperationOptions    Contains profiles and properties that specify operation behaviors. Only the ClusteringProfile and the SearchProfile are applicable. If this object is null or if there is no ClusteringStrategy, no clusters are returned. The ClusteringProfile contains a list of ClusteringStrategy instances. The ClusteringStrategy is used to compute the ClusterTrees and controls the amount of data returned by the operation.

Returns a QueryCluster object containing a list of ClusterTree objects and the id of the query. A SearchServiceException is thrown, in particular, when the Clustering SBO is not installed.

The following example demonstrates the getSubclusters operation.
    public QueryCluster getSubClusters() throws ServiceException {
        OperationOptions options = new OperationOptions();

        // Can be either a PassthroughQuery or StructuredQuery
        PassthroughQuery query = new PassthroughQuery();
        query.setQueryString("select * from dm_document");
        query.addRepository(YOUR_REPOSITORY);

        // Ask for 100 results
        QueryExecution queryExec = new QueryExecution(0, 100, 100);
        QueryResult results = searchService.execute(query, queryExec, options);

        // Get generated queryId and set it for subsequent calls
        String queryId = results.getQueryId();
        queryExec.setQueryId(queryId);

        // Now get query clusters
        // Set ClusteringStrategy
        ClusteringStrategy strategy = new ClusteringStrategy();
        strategy.setStrategyName("Name");
        List<String> attrs = new ArrayList<String>();
        attrs.add("object_name");
        strategy.setAttributes(attrs);
        strategy.setReturnIdentitySet(true);
        strategy.setClusteringRange(ClusteringRange.HIGH);

        // Set ClusteringProfile with strategy
        ClusteringProfile profile = new ClusteringProfile(strategy);
        options.setClusteringProfile(profile);

        // Get clusters on results retrieved so far
        QueryCluster queryCluster =
            searchService.getClusters(query, queryExec, options);

        // Get the objects belonging to the first cluster
        // and calculate new clusters on this subset
        List<ClusterTree> clusterTrees = queryCluster.getClusterTrees();
        QueryCluster subClusters = new QueryCluster();
        if (null != clusterTrees && !clusterTrees.isEmpty()) {
            // Get first ClusterTree
            ClusterTree firstTree = clusterTrees.get(0);
            List<Cluster> clusters = firstTree.getClusters();
            if (null != clusters && !clusters.isEmpty()) {
                // Get first cluster
                Cluster cluster = clusters.get(0);

                // Get identities of objects belonging to this cluster
                ObjectIdentitySet ids = cluster.getClusterObjectsIdentities();

                // Create a new strategy to get clusters based on format
                ClusteringStrategy authorStrategy = new ClusteringStrategy();
                authorStrategy.setStrategyName("Format");
                List<String> authorAttrs = new ArrayList<String>();
                authorAttrs.add("a_content_type");
                authorStrategy.setAttributes(authorAttrs);
                authorStrategy.setReturnIdentitySet(true);
                authorStrategy.setClusteringRange(ClusteringRange.HIGH);

                // Create new profile to take into account the new strategy
                ClusteringProfile newProfile = new ClusteringProfile(authorStrategy);
                options.setClusteringProfile(newProfile);

                // Get new clusters calculated on the given subset of results
                subClusters = searchService.getSubclusters(ids, query, queryExec, options);
            }
        }
        return subClusters;
    }

getResultsProperties operation

To display results, use the getResultsProperties operation. Call this operation after a call to the getClusters or getSubclusters operations. It can also be called after a search. If the search context is no longer available, the query is executed according to the Query, QueryExecution, and OperationOptions parameters. The search context is necessary to retrieve the results for the selected cluster.

Java syntax

    DataPackage getResultsProperties(ObjectIdentitySet forClustersObjects, Query query,
        QueryExecution execution, OperationOptions options)
        throws SearchServiceException;

C# syntax

    DataPackage GetResultsProperties(ObjectIdentitySet forClustersObjects, Query query,
        QueryExecution execution, OperationOptions options)

Parameter            Data type           Description
forClustersObjects   ObjectIdentitySet   Contains a list of ObjectIdentity instances specifying the results to retrieve.
query                Query               Contains the query definition and the repositories against which the query is run.
execution            QueryExecution      Object describing execution parameters. Query execution parameters are described in Documentum Enterprise Content Services Reference.
options              OperationOptions    Contains profiles and properties that specify operation behaviors. If this object is null, default operation behaviors apply.

Returns a DataPackage containing the query results, that is, the objects specified in the ObjectIdentitySet. A SearchServiceException is thrown, in particular, when the Clustering docapp is not installed.

The following example demonstrates the getResultsProperties operation.

    public DataPackage getClusterObjects() throws ServiceException {
        OperationOptions options = new OperationOptions();

        // Can be either a PassthroughQuery or StructuredQuery
        PassthroughQuery query = new PassthroughQuery();
        query.setQueryString("select * from dm_document");
        query.addRepository(YOUR_REPOSITORY);

        // Get 50 results
        QueryExecution queryExec = new QueryExecution(0, 50, 50);
        QueryResult results = searchService.execute(query, queryExec, options);

        // Get generated queryId and set it for subsequent calls
        String queryId = results.getQueryId();
        queryExec.setQueryId(queryId);

        // Get query clusters
        // Set ClusteringStrategy
        ClusteringStrategy strategy = new ClusteringStrategy();
        strategy.setStrategyName("Name");
        List<String> attrs = new ArrayList<String>(2);
        attrs.add("object_name");
        strategy.setAttributes(attrs);
        strategy.setReturnIdentitySet(true);
        strategy.setClusteringRange(ClusteringRange.HIGH);

        // Set ClusteringProfile
        ClusteringProfile profile = new ClusteringProfile(strategy);
        options.setClusteringProfile(profile);
        QueryCluster queryCluster =
            searchService.getClusters(query, queryExec, options);

        // Get objects belonging to the first cluster
        DataPackage clusterObjects = new DataPackage();
        if (null != queryCluster.getClusterTrees()
                && !queryCluster.getClusterTrees().isEmpty()) {
            ClusterTree finalTree = queryCluster.getClusterTrees().get(0);
            if (null != finalTree.getClusters()
                    && !finalTree.getClusters().isEmpty()) {
                Cluster cluster = finalTree.getClusters().get(0);
                clusterObjects = searchService.getResultsProperties(
                    cluster.getClusterObjectsIdentities(), query, queryExec, options);
            }
        }
        return clusterObjects;
    }

getFacets operation

The getFacets operation computes facets on query results. To run the query and benefit from the search cache, call the execute operation first. If the search context is no longer available, or if the query has not already been executed, the query is executed according to the Query and OperationOptions parameters. By default, the execution is synchronous and facets are computed when all results are returned. To retrieve the facets asynchronously, for example, if the query is run against several repositories, specify a SearchProfile.

Java syntax

    QueryFacet getFacets(Query query, QueryExecution execution,
        OperationOptions options) throws SearchServiceException;

C# syntax

    QueryFacet GetFacets(Query query, QueryExecution execution,
        OperationOptions options)

Parameter   Data type          Description
query       Query              Contains the query definition, the repositories against which the query is run, and the facet definitions.
execution   QueryExecution     Object describing execution parameters. Query execution parameters are described in Documentum Enterprise Content Services Reference. Only the QueryId is used to identify the query.
options     OperationOptions   Contains profiles and properties that specify operation behaviors. Only the SearchProfile is applicable.
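To make the maxFacetValues and facetSort settings from Table 13 concrete, the following is a minimal, self-contained sketch of how facet values could be counted, sorted, and truncated on the client side. It is a toy model, not the DFS implementation: the class FacetSketch, its Sort enum, and the buildFacet method are hypothetical names introduced only for illustration, and treating FREQUENCY ties in first-seen order is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of facet computation; hypothetical, not part of DFS. */
final class FacetSketch {

    /** Mirrors the FacetSort values listed in Table 13. */
    enum Sort { FREQUENCY, VALUE_ASCENDING, VALUE_DESCENDING, NONE }

    /** Table 13 states that ten values are returned when maxFacetValues is not set. */
    static final int DEFAULT_MAX_FACET_VALUES = 10;

    /**
     * Counts each attribute value, orders the counts, and keeps at most
     * maxFacetValues entries. A negative maxFacetValues (such as -1) keeps all.
     */
    static List<Map.Entry<String, Integer>> buildFacet(
            List<String> attributeValues, int maxFacetValues, Sort sort) {
        // LinkedHashMap preserves first-seen order, which is what Sort.NONE returns.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String value : attributeValues) {
            counts.merge(value, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        switch (sort) {
            case FREQUENCY:
                // Descending by count; List.sort is stable, so ties keep first-seen order.
                entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
                break;
            case VALUE_ASCENDING:
                entries.sort(Map.Entry.<String, Integer>comparingByKey());
                break;
            case VALUE_DESCENDING:
                entries.sort(Map.Entry.<String, Integer>comparingByKey().reversed());
                break;
            default:
                break; // NONE: leave first-seen order untouched
        }
        if (maxFacetValues < 0 || maxFacetValues >= entries.size()) {
            return entries;
        }
        return new ArrayList<>(entries.subList(0, maxFacetValues));
    }

    public static void main(String[] args) {
        List<String> formats = Arrays.asList("pdf", "doc", "pdf", "txt", "pdf", "doc");
        System.out.println(buildFacet(formats, DEFAULT_MAX_FACET_VALUES, Sort.FREQUENCY));
    }
}
```

In a real client, the service computes these counts; the sketch only illustrates why a facet built with maxFacetValues set to -1 can be much larger than one built with the default.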
Returns a QueryFacet containing the facets, the query id, and query status.

The following example demonstrates the getFacets operation.

    // Create the query
    StructuredQuery query = new StructuredQuery();
    query.addRepository("your_docbase");
    query.setObjectType("dm_sysobject");
    ExpressionSet set = new ExpressionSet();
    set.addExpression(new FullTextExpression("your_query_term"));
    query.setRootExpressionSet(set);

    // Add a facet definition to the query: we want a facet on r_modify_date
    // attribute.
    FacetDefinition facetDefinition = new FacetDefinition("date");
    facetDefinition.addAttribute("r_modify_date");

    // Request all facets
    facetDefinition.setMaxFacetValues(-1);

    // Set sort order
    facetDefinition.setFacetSort(FacetSort.VALUE_ASCENDING);
    query.addFacetDefinition(facetDefinition);

    // Execution options: we don't want to retrieve results, we just want
    // facets.
    QueryExecution queryExecution = new QueryExecution(0, 0);

    // Call getFacets method.
    QueryFacet queryFacet =
        service.getFacets(query, queryExecution, new OperationOptions());

    // Check the query status: it should be SUCCESS
    QueryStatus status = queryFacet.getQueryStatus();
    System.out.println(status.getRepositoryStatusInfos().get(0).getStatus());

    // Display facet values
    List<Facet> facets = queryFacet.getFacets();
    for (Facet facet : facets) {
        for (FacetValue facetValue : facet.getValues()) {
            System.out.println(facetValue.getValue() + "/" + facetValue.getCount());
        }
    }

Chapter 4
Configuring and Customizing Webtop Search

This chapter contains the following topics:
• About WDK search
• Wildcards, lemmatization, and word fragments
• Configuring search controls
• Configuring the basic search component
• Configuring the advanced search component
• Configuring search results
• Configuring Webtop Federated Search clustering
• Modifying search component JSP pages
• Modifying a search component query

About WDK search

Following is a brief general description of the WDK customization model. Information on individual search controls and components is contained in the comprehensive reference guide, EMC Documentum Web Development Kit and Webtop Reference Guide. General information on configuring and customizing features in WDK applications is described in EMC Documentum xPlore Administration and Development Guide.

The following illustration shows points at which you can configure or customize search component presentation and behavior in Webtop applications.

Key:
1. See Configuring search controls, page 67, Configuring the advanced search component, page 69, and Configuring search results, page 72.
2. See Modifying a search component query, page 79.
3. See Constructing a search, page 40.
4. See DQL hints, page 10.
5. See Debugging, page 91.

Search sources

Multiple repositories can be added to the user search preferences. With Federated Search Services, the user can select external sources for search and import results into the current repository.
Included files within HTML or XML documents are not imported.

Simple and advanced search

Simple and advanced searches query the full-text index by default. You can run a full-text query in advanced search using the Contains field. The Contains field or the simple search text box can contain a string within quotation marks to search for the exact string, for example, "this string". The box also supports the AND and OR operators. The following rules apply:
• Either operator can be followed by NOT.
• The operators are not case sensitive.
• Punctuation, accents, and other special characters are ignored (replaced with a space).
• The AND operator has priority over the OR operator. For example, if you type knowledge AND management OR discovery, the results contain both knowledge and management, or they contain discovery.
• Parentheses override the priority of operators. For example, if you type knowledge AND (management OR discovery), the results must contain knowledge and must also contain either management or discovery. The NOT operator cannot be used to qualify an expression within parentheses, for example, NOT (a and b). It can be used within parentheses, for example, a OR (b and NOT c).
• If no operators are used between words, multiple words are treated as if joined by the AND operator.

Searching attribute values

All attributes are indexed, so a query for attribute criteria is run against the full-text index by default. The attributes for search criteria are supplied by the data dictionary of the selected repository. If value assistance is defined in the data dictionary, the values are supplied for "is" and "is not" search criteria. Verity operators such as "not" or "between" are not supported.

The default search is for a string query type in a full-text search. If the Content Server is indexed, the query is performed against the full-text index including all searchable properties. For attributes-only search, or mixed DQL and full-text, disable XQuery generation.
Turn off XQuery generation by adding the following setting to dfc.properties on the DFC client application:

    dfc.search.xquery.generation.enable=false

The following procedures support attributes-only search:
• (Advanced search only) Add a checkbox for Include recently modified properties on the advanced search page. Attributes are queried against the database and not the index. To add the checkbox, uncomment the following lines in your custom advanced search JSP page (a copy of the webcomponent advanced search JSP page):

    <!--<tr class="leftAlignment" valign=top>
      <td class="leftAlignment" valign=top nowrap>
        <dmfxs:searchscopecheckbox
          name='<%=AdvSearchEx.DATABASE_SEARCH_SCOPECHECKBOX_CONTROL%>'
          scopename='<%=RepositorySearch.DATABASE_SEARCH_PROPERTY%>'
          checkedvalue='true' uncheckedvalue='false'
          nlsid='MSG_DATABASE_SEARCH'
          tooltipnlsid="MSG_DATABASE_SEARCH_TIP"/>
      </td>
    </tr>-->

• Use the DQL query type for a custom search component and pass the query string in the query parameter. (See Modifying search component JSP pages, page 75.)
• Turn off FTDQL (queries against index) using a DQL hints file. You can disable index queries for attributes without affecting the full-text string portion of a query. For more information, see DQL hints, page 10.
• Set dfc.search.fulltext.enable to false in dfc.properties, which is located in WEB-INF/classes.

Value assistance and presets

If value assistance is defined in the data dictionary, the values are supplied for "is" and "is not" search criteria. Value assistance as defined within a DAR is supported. The assistance within the DAR provides a union of values for a type across lifecycles. For information on supporting conditional value assistance in JSP pages, see Configuring the advanced search component, page 69.

Limitations:
• Not all values in value assistance are available across repositories in a logical OR operation. (This limitation does not apply to the AND operation.)
• Locale-based assistance must be present in the data dictionary for each locale.

In the Webtop presets editor, you can create a preset that limits the searchable object types. This preset overrides the <includetypes> setting in the advanced search component definition.

Clustering, templates, and monitoring

Content Server provides search results clustering, search templates, and search monitoring. Before version 6.7 of Content Server, clustering and search monitoring require a DAR file deployed to a global registry repository. The search templates DAR file must be deployed to each repository in which you wish to store search templates. Use Documentum Composer to deploy these DAR files to the repositories. Instructions for deploying the Webtop Federated Search DocApps are in the EMC Documentum Web Development Kit and Webtop Deployment Guide.

Wildcards, lemmatization, and word fragments

When the user enters an explicit wildcard (an asterisk in one-box search, for example, Docum*), the wildcard is not applied in the full-text index. It is applied only to find metadata in the index. Most queries that users make are for whole words, not parts of words. This behavior can be changed (see "Enabling the wildcard CONTAINS operator" below).

Lemmatization finds terms that are based on the root or lemma. For example, if no wildcard is present, a search for car finds auto. Lemmatization is not performed on terms that contain wildcards: a search for car* finds cars but not autos.
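The car and car* rule above can be sketched with a small toy model. This is only an illustration under stated assumptions, not how the index server works: the class LemmaSketch and its hard-coded lemma table are hypothetical, and grouping auto with car simply follows the example in the text.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Toy model of "lemmatize unless the term has a wildcard"; hypothetical. */
final class LemmaSketch {

    // Hypothetical lemma table: each known surface form maps to its lemma.
    private static final Map<String, String> LEMMAS = new HashMap<>();
    static {
        LEMMAS.put("cars", "car");
        LEMMAS.put("auto", "car");
        LEMMAS.put("autos", "car");
    }

    /** Returns the lemma of a term, or the lowercased term itself if unknown. */
    static String lemmaOf(String term) {
        String lower = term.toLowerCase(Locale.ROOT);
        return LEMMAS.getOrDefault(lower, lower);
    }

    /**
     * A query term ending in '*' is matched as a literal prefix and is not
     * lemmatized; any other query term is compared by lemma.
     */
    static boolean matches(String queryTerm, String indexedTerm) {
        if (queryTerm.endsWith("*")) {
            String prefix = queryTerm
                .substring(0, queryTerm.length() - 1)
                .toLowerCase(Locale.ROOT);
            return indexedTerm.toLowerCase(Locale.ROOT).startsWith(prefix);
        }
        return lemmaOf(queryTerm).equals(lemmaOf(indexedTerm));
    }

    public static void main(String[] args) {
        System.out.println("car  vs auto:  " + matches("car", "auto"));
        System.out.println("car* vs cars:  " + matches("car*", "cars"));
        System.out.println("car* vs autos: " + matches("car*", "autos"));
    }
}
```

The design point the sketch isolates is that the wildcard branch never consults the lemma table, which is why car* cannot reach autos even though car can reach auto.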
Enabling the wildcard CONTAINS operator for string property searches

To enable the checkbox, remove the JSP comment tags around the following tag in the advancedsearchex.jsp page:

    <!--<tr class="leftAlignment" valign=top>
      <td class="leftAlignment" valign=top nowrap>
        <dmfxs:searchscopecheckbox
          name='<%=AdvSearchEx.DATABASE_SEARCH_SCOPECHECKBOX_CONTROL%>'
          scopename='<%=RepositorySearch.DATABASE_SEARCH_PROPERTY%>'
          checkedvalue='true' uncheckedvalue='false'
          nlsid='MSG_DATABASE_SEARCH'
          tooltipnlsid="MSG_DATABASE_SEARCH_TIP"/>
      </td>
    </tr>-->

Enabling fragment or database search

You can change the behavior of the CONTAINS operator by enabling the searchscope checkbox in the advanced search JSP page. This checkbox serves the following purposes:
• Retrieve objects with recently modified properties that have not yet been indexed.
• Perform case-sensitive queries against the database:
  – DFC (and WDK/Webtop) queries: Set dfc.search.fulltext.enable to false.
  – DQL queries: Add the DQL hint ft_contain_fragment. Lemmatization is not applied when this hint is used.

Configuring search controls

See EMC Documentum Web Development Kit and Webtop Reference Guide for details on the configuration of each control. You can globally configure all instances of certain advanced search controls by modifying the control configuration definitions in wdk/config/advsearchex.xml. The following controls can be configured:
• searchattribute controls, match case attribute (does not apply to searches of the index)
• searchsizeattribute control
• searchdateattribute control
• search clusters

The following example changes the size range dropdown selections.
It modifies advsearchex.xml in a modification file located in custom/config with the following content:

    <config version='1.0'>
      <scope type='dm_sysobject'>
        <searchsizeattributerange modifies="searchsizeattributerange:wdk/config/advsearchex.xml">
          <insert>
            <option>
              <label>Any old size</label>
              <operator>LT</operator>
              <value>-1</value>
              <unit>KB</unit>
            </option>
          </insert>
        </searchsizeattributerange>
      </scope>
    </config>

The resulting UI (search size custom dropdown list) shows the new values for the size attribute range.

A search on full-text strings or attributes against an indexed repository is not case sensitive. If the repository is not indexed, queries are case sensitive by default. Case sensitivity for non-indexed repositories can be turned on or off in wdk/config/advsearchex.xml, as the value of the <defaultmatchcase> element. If you turn off case sensitivity, create functional indexes on the attributes that are queried.

You can set NOFTDQL queries to be case sensitive by setting the value of <defaultmatchcase> to true. For better performance, either set case sensitivity to true, or set it to false and create a functional index on the queried attribute columns.

Configuring the basic search component

Basic search searches all sysobjects in the current repository for the user-supplied string in the full-text index of content and attributes. The default base type for the search can be configured in the search component definition. The default preferred sources can also be specified in the component definition. If Federated Search Services is installed, its sources can include external sources.
• The list of object types and their attributes comes from the reference repository. The reference repository is the first repository selected by the user. If only external sources are selected, then the list of object types in the current repository is used.
• The search components are versioned.
If a request is made for a search component, the new component is returned by default. If you customized a supported previous version of a Webtop search component and extended it, your customization is used in place of the new search components.
• To configure basic search to perform a DQL query, create a modified JSP page. For information on this configuration, see Modifying search component JSP pages, page 75.

Configuring the advanced search component

The data dictionary provides the following data to the search UI:
• The default and other searchable attributes for a given object type.
• The list of searchable types. The presets or configuration file filters the list.
• The default and other search operators for a given type and attribute.
• Value assistance values for "=" and "<>" search operations, if defined in the data dictionary.

The WDK search UI contains search controls. To control attribute values, extend a search component and modify your custom search JSP page.

Setting the search type drop-down list

The includetypes element in the advsearch component definition configures the list of available search types. The includetypes list is comma-delimited. The descend attribute specifies whether subtypes are included. Create your modification definition in custom/config. The following example displays dm_folder and all of its subtypes, including custom types that subtype dm_folder:

    <component modifies="advsearch:webcomponent/config/library/search/searchex/advsearch_component.xml">
      <replace path="includetypes">
        <includetypes descend="true">dm_folder</includetypes>...
      </replace>
    </component>

The following illustration shows the type selection list set by includetypes with descend set to true.
The following example displays only two selections, because the descend attribute is set to false:

    <includetypes descend="false">dm_folder, my_type</includetypes>

The following illustration shows the type selection list set by includetypes with descend set to false.

Providing conditional value assistance

Use individual searchattribute control tags to provide conditional value assistance. The default value assistance must have no dependency on another attribute. Conditional value assistance depends on the display order of the constraints in the JSP page, so you must display the controls in the dependency order. The searchattributegroup tag provides only simple attribute assistance unless the constraints are entered in the correct order.

The lists of conditional values are set in Documentum Composer. Query value assistance can use a reference ($value(attribute)), for example:

    SELECT "MyDocbase"."MyTable"."MyColumn1"
    FROM "MyDocbase"."MyTable"
    WHERE "MyDocbase"."MyTable"."MyColumn2" = '$value(MyAttribute)'

The following example lists four attributes, three of which have conditional value assistance lists that were set up in Documentum Composer. The drop-down list for Make determines the list available for Model. The drop-down lists Fuel and Year both depend on Model. This UI was generated from the following set of controls in the JSP page:

    <tr>
      <td>Make:</td>
      <td><dmfxs:searchattribute name='make' attribute="make"/></td>
    </tr>
    <tr>
      <td>Model:</td>
      <td><dmfxs:searchattribute name='model' attribute="model"/></td>
    </tr>
    <tr>
      <td>Year:</td>
      <td><dmfxs:searchattribute name='year' attribute="year"/></td>
    </tr>
    <tr>
      <td>Fuel:</td>
      <td><dmfxs:searchattribute name='fuel' attribute="fuel"/></td>
    </tr>

Configuring the savesearch component

Searches are saved as smartlist objects.
Saved searches save the display configuration as well as the query, and the user has the option of saving query results with the query. Users can revise a saved search using the advanced search component. Smartlists created with Documentum Desktop can be executed or edited in the advanced search UI. After editing, they can no longer be used in Desktop. Smartlists that are created in WDK applications cannot be used or edited in Desktop.

The savesearch component displays checkboxes that allow the user to save search results with a search and to make the saved search public. These two features can be removed by setting the value of the configuration element enablesavingsearchresults to false. The following example in a modification file removes these two checkboxes:

    <component modifies="savesearch:webcomponent/config/library/savesearch/savesearchex/savesearch_component.xml">
      <replace path="enablesavingsearchresults">
        <enablesavingsearchresults>false</enablesavingsearchresults>
      </replace>
    </component>...

The configuration element <includeresults> specifies whether to save results with a search.

Configuring search results

You can configure the maximum number of search results and turn off term hit highlighting. After you have made custom types and their attributes available for search, you can configure the display of custom attributes in the search results. You can configure the display_preferences component to allow users to configure their preferences for displaying custom attributes.

The maximum number of search results, globally and per source, is configured in dfc.properties. The maximum number of search results is specified as the value of dfc.search.max_results (was maxresults_per_source in 5.3.x). The maximum number of results per source is specified as the value of dfc.search.max_results_per_source.
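As a rough illustration of how the global and per-source limits interact, the following sketch models the accumulation in plain Java. It does not call DFC: the class ResultCapSketch, its methods, and the source-by-source processing order are assumptions made for illustration only, and the search engine may interleave sources differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the two result caps; hypothetical names, no DFC calls.
final class ResultCapSketch {

    // Keeps results source by source until the per-source cap or the
    // global cap is reached (the processing order is an assumption).
    static List<String> accumulate(Map<String, List<String>> resultsBySource,
                                   int maxResults, int maxResultsPerSource) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, List<String>> source : resultsBySource.entrySet()) {
            int takenFromSource = 0;
            for (String result : source.getValue()) {
                if (kept.size() >= maxResults) {
                    return kept;              // global cap reached
                }
                if (takenFromSource >= maxResultsPerSource) {
                    break;                    // per-source cap reached
                }
                kept.add(source.getKey() + ":" + result);
                takenFromSource++;
            }
        }
        return kept;
    }

    // Generates n placeholder result names for the demonstration.
    static List<String> fakeResults(int n) {
        List<String> results = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            results.add("doc" + i);
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, List<String>> bySource = new LinkedHashMap<>();
        bySource.put("repoA", fakeResults(700));
        bySource.put("repoB", fakeResults(700));
        bySource.put("external", fakeResults(700));
        // Models dfc.search.max_results=1000 and
        // dfc.search.max_results_per_source=500.
        System.out.println(accumulate(bySource, 1000, 500).size());
    }
}
```

In this model, the first two sources each contribute 500 results, and the third contributes none because the global maximum has already been reached.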
For example, you have specified a maximum of 1000 results and a maximum per source of 500. Results are accumulated from each source until the source maximum of 500 is reached or until the global maximum of 1000 is reached.

Note: These settings can affect performance. Setting the value too high can overload xPlore, and setting it too low can frustrate users. Evaluate the best settings for your environment.

Term hit highlighting (highlighting of the search term in the results) can be set as a user preference. The default value is set as the value of the element highlight_matching_terms in the search component definition, which is located in webcomponent/config/library/search/searchex. If you are customizing Webtop or an application that extends Webtop, add a highlight_matching_terms element to the top-level search component definition.

Configuring the display of attributes in search results

Default search result columns are configured as column elements in the basic search configuration file search60_component.xml in webcomponent/config/library/search/searchex. Only attributes marked as searchable in the data dictionary can be specified as columns. Users can set a preference for search results columns in the display_preferences component, which then overrides the default settings in the configuration file.

To define default visible columns for custom attributes, your custom search component definition must specify a scope for the custom type. For example, the user selects a custom type for the advanced search. The columns specified in your scoped basic search component are displayed in the results.
Details of the columns configuration can be found in the EMC Documentum Web Development Kit Reference Guide.

In the following simple configuration, the definition extends the WDK search component definition and adds some custom attribute columns:

    <config version='1.0'>
      <scope type='technical_publications_web'>
        <component modifies="search:webcomponent/config/library/search/searchex/search60_component.xml">
          <insert path='columns_list'>
            <column>
              <attribute>tp_edition</attribute>
              <label>Edition</label>
              <visible>true</visible>
            </column>
            <column>
              <attribute>tp_web_viewable</attribute>
              <label>OK to display</label>
              <visible>true</visible>
            </column>
          </insert>
        </component>
      </scope>
    </config>

The user can select attributes for display in search results, which overrides the default display. The preferences UI allows users to specify the attributes that are displayed for specific object types. If the user configures different display columns, the query is not reissued. The new column data is not displayed until the search is performed again. For example, calculated columns such as score or summary do not display any values unless they are selected before the query is run.

Modify the definition for the display_preferences component to make columns of your custom type available to users for display.

To make a custom type available in preferences:
1. Modify the display_preferences component in your custom/config directory:

    <component modifies="display_preferences:webtop/config/display_preferences_ex_component.xml">

2. Add your custom type to the <display_docbase_types> element. For example:

    <insert path='preferences.display_docbase_types'>
      <docbase_type>
        <value>my_custom_type</value>
        <label>My type</label>
      </docbase_type>
    </insert>

3. Save this file and refresh the configuration files on the application server by navigating to wdk/refresh.jsp.
To make a calculated attribute available in search results:
1. Extend the Search60 class in the package com.documentum.webtop.webcomponent.search.
2. Override the initAttributes method and add your computed attribute. The following example adds the "myComputed" attribute:

    protected void initAttributes()
    {
        List<String> mandatoryAttrs = getAttributesManager().getMandatory();
        mandatoryAttrs.add("myComputed");
        getAttributesManager().setMandatory(mandatoryAttrs);
        super.initAttributes();
    }

3. Extend the search component definition and scope it to your custom type. Set the class element to your custom class.

Tuning results performance

To enhance query performance, turn off the display of the results folder path by setting the value of displayresultspath in webcomponent/config/library/search/searchex/search60_component.xml to false.

The summary column is calculated, which can add to query overhead. Turn off the summary column by extending the Webtop search component searchex_component.xml, which is located in webtop/config. Copy the columns_XXX elements (columns_drilldown, columns_list, and columns_saved_search) from the parent configuration file search60_component.xml in webcomponent/config/library/search/searchex. In each of the columns elements, set the value of column.attribute.visible for the summary attribute to false. Set the value of columns_XXX.loadinvisibleattribute to false to ensure that the column is not calculated.

Configuring Webtop Federated Search clustering

Install the Webtop Federated Search clustering DAR file in the global registry to support clustering of search results in groups based on their attribute values. Define the strategies, including default strategies, in clusterstrategies_config.xml, which is located in the wdk/config directory of the WDK-based application. The clusterStrategy element defines each cluster strategy.
This element contains one or more attributes specified as the value of the criterion child element. The clusterTree element governs the display. Its child elements primary and secondary have values that correspond to the IDs of strategies.

Tokenizers split attribute strings into chunks that are then used as clusters. Only one tokenizer is associated with an attribute. The default tokenizer is text, and other tokenizers are defined to tokenize on number, author, and date. Tokenizers are part of the clustering SBO.

You can add, remove, or change a strategy definition or add, remove, or change the strategies that are displayed in the default cluster tree. Users can change these defaults in their search preferences.

To add a strategy definition:
1. Create a file clusterstrategies_modifications.xml in custom/config.
2. Add the opening and closing declarations:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <config>
      <scope>
      </scope>
    </config>

3. Within the scope element, add the following element that specifies the primary element you are modifying and the file in which it exists:

    <clusterStrategies modifies="clusterStrategies:wdk/config/clusterstrategies_config.xml">
    </clusterStrategies>

4. Within the clusterStrategies element, insert the new strategy that will cluster results for a certain attribute. This example creates a cluster for the keywords attribute:

    <insert>
      <clusterStrategy id="keywords" nlsid="MSG_KEYWORDS" icon="cluster/ranking.gif" threshold="5">
        <criterion>keywords</criterion>
      </clusterStrategy>
    </insert>

Note: If you provide an nlsid value, you must have a corresponding string in clusterstrategiesNlsProp.properties. The icon path is relative to the theme folder icons/browsertree directory in the application. The threshold specifies the minimum number of documents for which to display the cluster.
5.
Refresh the configurations in memory by navigating to wdk/refresh.jsp, or restart the application server.

To display a new strategy in the default cluster tree:
1. In the modifications file you created that contains the new strategy, add the following child element to scope (sibling to clusterStrategy):

    <clusterTreeGroup modifies="clusterTreeGroup:wdk/config/clusterstrategies_config.xml">
      <insert>
        <clusterTree>
          <primary>keywords</primary>
          <secondary>topic</secondary>
        </clusterTree>
      </insert>
    </clusterTreeGroup>

Modifying search component JSP pages

Changes to JSP pages are considered to be customizations. The following examples extend Webtop search component definitions and specify a custom JSP page in which to make customizations.

Performing a DQL query

The basic search component can perform a DQL query. Basic search is launched from the titlebar component. This example replaces basic search. Alternatively, you can add a button in the titlebar that launches a DQL query, leaving basic search intact. If you add a new button, add a JavaScript event handler to launch your DQL query.
1. Create an XML modification file in /custom/config with the following contents:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <config version='1.0'>
      <scope>
        <component modifies="titlebar:webtop/config/titlebar_component.xml">
          <replace path="pages.start">
            <start>custom/titlebar/titlebar.jsp</start>
          </replace>
        </component>
      </scope>
    </config>

2. Copy titlebar.jsp from webtop/titlebar to custom/titlebar. (Create this target directory if it does not yet exist.)
3. Open titlebar.jsp in custom/titlebar and find the JavaScript function onClickSearch. Within the function, find the following line:

    postComponentJumpEvent(null, "search", "content", "query", strValue);

In this call to the basic search component, you change the query type to "dql" and the value to the DQL string.
4.
Add a query and change the query type in the onClickSearch JavaScript function, as in the following example. (This example does a wildcard search with the input string.)

    function onClickSearch ()
    {
      var contentPage = eval(getAbsoluteFramePath("content"));
      if (contentPage != null)
      {
        var text = document.getElementById("txtSearch");
        callBlur(text);
        var strValue = text.value;
        if (strValue != "" && strValue != "<%=strSearch %>")
        {
          var strDQL = "select * from dm_document where upper(object_name) like '%" + strValue.toUpperCase() + "%'";
          postComponentJumpEvent(null, "search", "content", "queryType", "dql", "query", strDQL);
          if (typeof text.autoComplete != "undefined" && text.autoComplete != null)
          {
            // add the search string to the client-side auto-complete suggestions
            text.autoComplete.addEntry(strValue);
            var prefs = InlineRequestEngine.getPreferences(InlineRequestType.JSON);
            prefs.setCallback("onUpdateACCallBack");
            postInlineServerEvent(null, prefs, null, null, "onUpdateAutoCompleteData", null, null);
          }
        }
      }
    }

Setting the default search type

To set the default search type, supply your preferred type in the JavaScript function that calls the advanced search container. In Webtop, titlebar.jsp calls advanced search. Extend the titlebar component and provide the following postComponentNestEvent calls in the onClickAdvancedSearch JavaScript function. Substitute your custom type (in quotation marks) for custom_type:

    postComponentNestEvent(null, "advsearchcontainer", "content", "component", "advsearch", "type", custom_type, "usepreviousinput", "false", "query", strValue);
    ...
    postComponentNestEvent(null, "advsearchcontainer", "content", "component", "advsearch", "type", custom_type, "usepreviousinput", "true");

This example uses simple DQL. You can take content from the user for a DQL search and construct the DQL on the fly as shown in the following example.
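As a hedged illustration of constructing DQL from user input, the following plain-Java sketch builds the same statement as the onClickSearch function and doubles any single quote in the input so that it cannot end the string literal early. The class DqlLikeSketch and its methods are hypothetical and are not part of WDK or DFC. The sketch assumes that a single quote inside a DQL string literal is written as two single quotes, and it leaves the LIKE wildcards % and _ in the input active.

```java
import java.util.Locale;

// Hypothetical helper, not part of WDK or DFC: builds a name-contains query.
final class DqlLikeSketch {

    // Doubles single quotes so the input stays inside the string literal
    // (assumes DQL uses '' to represent a quote within a literal).
    static String escapeLiteral(String value) {
        return value.replace("'", "''");
    }

    // Builds a case-insensitive "object name contains" query for dm_document.
    // The LIKE wildcards % and _ in the input are deliberately left untouched.
    static String buildNameContainsQuery(String userInput) {
        String term = escapeLiteral(userInput.trim().toUpperCase(Locale.ROOT));
        return "select * from dm_document where upper(object_name) like '%"
                + term + "%'";
    }

    public static void main(String[] args) {
        System.out.println(buildNameContainsQuery("O'Brien report"));
    }
}
```

Building the string on the server side in this way keeps the escaping in one place. A client-side script that concatenates raw input, as in the titlebar example, would need an equivalent step.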
Displaying specific attributes for search

You can specify attributes for your search rather than relying on passive generation by the searchattributegroup control. In the following example of a custom advsearch component, specific attribute controls have replaced the searchattributegroup control in the JSP page:

    ...
    <dmfxs:searchobjecttypedropdownlist name='objecttypectrl'.../></td></tr>
    <tr><td colspan='2' class='spacer' height='10'> </td></tr>
    <tr>
      <td align=right valign=top nowrap><dmf:label label='Name' cssclass="fieldlabel"/></td>
      <td align=left valign=top nowrap>
        <dmfxs:searchattribute name='searchname' attribute='object_name' andorvisible="false" removable="false">
        </dmfxs:searchattribute>
      </td>
    </tr>
    <tr>
      <td align=right valign=top nowrap><dmf:label label='Type' cssclass="fieldlabel"/></td>
      <td align=left valign=top nowrap>
        <dmfxs:searchattribute name='searchtype' attribute='r_object_type' andorvisible="false" removable="false">
        </dmfxs:searchattribute>
      </td>
    </tr>
    ...

Note: Set the andorvisible and removable attributes to false on the searchattribute control.

Before this customization, the user must select properties from a dropdown.

To display specific custom attributes as individual search criteria, extend the advanced search component. Scope the definition to your custom type and provide a custom JSP page. In that page, add attribute controls for your attributes. When the user selects the custom type, the configuration service reads the scoped definition.
The custom JSP page with custom attributes is displayed. After customization, the UI shows the individual attributes "Name" and "Type" as search criteria.

Figure: Specific attributes as search criteria

Enabling fragment search (wildcard support)

Starting with DFC 7.0 and xPlore 1.3, the support of fragment search using wildcards has changed. The default behavior in xPlore matches that of commonly used search engines. Wildcard (fragment) search is not performed in a full-text search unless the user adds an explicit wildcard. This provides faster, more precise search results than a fragment search. The EMC Documentum xPlore Administration and Development Guide provides information on the default support and wildcard configuration.

Modifying a search component query

You can access a query before it is submitted and modify it in various ways. The query is accessible by overriding the initSearch() method of the Search60 class. Your custom class must extend the Webtop version of either the Search60 or AdvSearchEx component class.

The following methods in the basic search component class Search60 provide customization points:
• initSearch(arg): Override to modify queries before execution.
• initControls(arg): Override to update custom controls.
• initAttributes(): Override to perform specific treatment for columns. Use getAttributesManager() to manipulate columns and query attributes.
• initResultsSet(): Override to manipulate the results that are fed to the datagrid.
• initSearchExecution(): Starts the actual query execution.

Adding a WHERE clause to simple search

To add a WHERE clause to the query in simple search, extend Search60 in the package com.documentum.webtop.webcomponent.search. You can add criteria other than keywords in the initSearch method. If you override buildQuery, you can break smartlist usage. The following example adds an AND clause to a query.
The query searches for a specific string in an attribute of the object, in addition to the criteria in the simple search text box. First, create your search component definition in custom/config as follows:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <config version='1.0'>
      <scope>
        <component modifies="search:webtop/config/search60_component.xml">
          <replace path='class'>
            <class>com.mycompany.SearchEx</class>
          </replace>
        </component>
      </scope>
    </config>

Next, create your custom class that extends Search60 and overrides initSearch():

    package com.mycompany;

    import com.documentum.fc.client.search.IDfExpressionSet;
    import com.documentum.fc.client.search.IDfQueryBuilder;
    import com.documentum.fc.client.search.IDfSimpleAttrExpression;
    import com.documentum.fc.common.IDfValue;
    import com.documentum.web.common.ArgumentList;
    import com.documentum.webcomponent.library.search.SearchInfo;

    public class SearchEx extends com.documentum.webtop.webcomponent.search.Search60
    {
        protected void initSearch (ArgumentList args)
        {
            super.initSearch(args);
            String queryType = args.get(ARG_QUERY_TYPE);
            // Modify only simple (string) queries; leave other query types untouched.
            if ((queryType == null) || (queryType.length() == 0) || (queryType.equals("string")))
            {
                SearchInfo info = getSearchInfo();
                IDfQueryBuilder qb = info.getQueryBuilder();
                IDfExpressionSet rootSet = qb.getRootExpressionSet();
                // Add an AND expression set to the root of the query.
                IDfExpressionSet setAnd = rootSet.addExpressionSet(IDfExpressionSet.LOGICAL_OP_AND);
                // Require that r_modifier contains "tuser".
                setAnd.addSimpleAttrExpression("r_modifier", IDfValue.DF_STRING,
                    IDfSimpleAttrExpression.SEARCH_OP_CONTAINS, true, false, "tuser");
            }
        }
    }

This example adds an AND criterion in which the modifier attribute must contain the user name "tuser". Before the customization, a search on the string "Target" in the simple search box returns three results. After the customization, only a single result, in which the object name contains "Target" and the user name contains "tuser", is returned.
(User name is displayed in the second column, as "Modifier.")

With IDfExpressionSet, you can add the following operators: LOGICAL_OP_AND, LOGICAL_OP_DEFAULT (default operator in data dictionary), and LOGICAL_OP_OR.

The following expressions, also called predicates, are available for IDfSimpleAttrExpression (names are self-explanatory):
• SEARCH_OP_BEGINS_WITH
• SEARCH_OP_CONTAINS
• SEARCH_OP_DOES_NOT_CONTAIN
• SEARCH_OP_ENDS_WITH
• SEARCH_OP_EQUAL
• SEARCH_OP_GREATER_EQUAL
• SEARCH_OP_GREATER_THAN
• SEARCH_OP_IS_NOT_NULL
• SEARCH_OP_IS_NULL
• SEARCH_OP_LESS_EQUAL
• SEARCH_OP_LESS_THAN
• SEARCH_OP_NOT_EQUAL

The following expression is available for IDfValueRangeAttrExpression:
• SEARCH_OP_BETWEEN

The following expressions can be used with IDfValueListAttrExpression:
• SEARCH_OP_IN
• SEARCH_OP_NOT_IN

Setting exact match

When you use IDfQueryBuilder to build the query, you can call the IDfSimpleAttrExpression method setExactMatchEnabled(boolean) to turn off lemmatization, stop words, thesaurus, fuzzy search, and wildcards.

Adding a WHERE clause to advanced search

In advanced search, you override buildQuery to access the user query.
The search class is as follows:

    package com.mycompany;

    import com.documentum.fc.common.IDfValue;
    import com.documentum.fc.client.search.IDfSimpleAttrExpression;
    import com.documentum.fc.client.search.IDfExpressionSet;
    import com.documentum.fc.client.search.IDfQueryBuilder;

    public class AdvSearchEx extends com.documentum.webtop.webcomponent.advsearch.AdvSearchEx
    {
        protected IDfQueryBuilder buildQuery() throws Exception
        {
            // Start from the query that the user built in the advanced search form.
            IDfQueryBuilder qb = super.buildQuery();
            IDfExpressionSet rootSet = qb.getRootExpressionSet();
            IDfExpressionSet setAnd = rootSet.addExpressionSet(IDfExpressionSet.LOGICAL_OP_AND);
            // Require that object_name contains "xpath".
            setAnd.addSimpleAttrExpression("object_name", IDfValue.DF_STRING,
                IDfSimpleAttrExpression.SEARCH_OP_CONTAINS, true, false, "xpath");
            return qb;
        }
    }

Changing the query source

You can change the location, including the source and folder path in the repository, with query builder APIs. The following example adds a source repository to the IDfQueryBuilder instance and sets a path within the repository for the query. The examples for basic and advanced search show you how to get the query builder instance (variable qb in this example):

    qb.clearSelectedSources();
    qb.addSelectedSource("dm_notes");
    // set source, path, descend flag
    qb.addLocationScope("dm_notes", "/Temp", false);

The resulting query is like the following:

    SELECT r_object_id, text, object_name FROM dm_document
    SEARCH DOCUMENT CONTAINS 'testing'
    WHERE (object_name LIKE '%testing%' ESCAPE '\') AND FOLDER('/Temp') AND (a_is_hidden = FALSE)

Hiding the customization from query editing

If you have intercepted and modified a query after form submit, the hidden query processing is displayed when the user tries to modify the query. To hide the custom modification, add the usepreviousinput parameter in the call to the advanced search component.
Modify the titlebar component definition to use your own titlebar.jsp page as follows:

    <component modifies="titlebar:webtop/config/titlebar_component.xml">
      <replace path="pages.start">
        <start>/custom/titlebar/titlebar.jsp</start>
      </replace>
    </component>

In your custom titlebar JSP page, change the call to the advanced search component to set usepreviousinput to false:

    postComponentNestEvent(null, "advsearchcontainer", "content", "component", "advsearch", "type", "dm_sysobject", "usepreviousinput", "false");

Programmatic search value assistance

Data dictionary value assistance is available in advanced search. If you have not defined value assistance for an attribute in the repository data dictionary, you can add value assistance programmatically. Define a custom tag handler to render the value assistance values. The tag handler is specified in the search configuration file advsearchex.xml as follows:

    <searchvalueassistance>
      <attribute_type_name>fully_qualified_class_name</attribute_type_name>
    </searchvalueassistance>

When the user selects an attribute for search, the values in the criteria dropdownlist control are filled by the custom tag class. To add your own custom tag class, copy the file wdk/config/advsearchex.xml to custom/config and add your handlers to the <searchvalueassistance> element. Your tag handler must implement ISearchAttributeValueTag.

Note: Do not delete the Documentum value assistance handlers. The entire contents of the <searchvalueassistance> element override the contents of the element in the WDK version of this file.

The following tag handlers render values for certain attributes. The handler classes are in com.documentum.web.formext.control.docbase.search.
• BooleanVATag: Provides values for any Boolean attribute.
• ContentTypeVATag: Provides valid a_content_type (dm_format) names and descriptions.
• ExistingValueVATag: Uncomment this tag and specify an attribute for which to populate the drop-down list with all existing values for the selected object type.
• ObjectTypeVATag: Populates the search object type drop-down list with available object types.
• PermissionVATag: Provides possible permission values (none, browse, read, relate, version, write, delete) for setting world_permit, group_permit, and owner_permit attributes.
• SearchMetaDataVATag: Gets attribute names, default value, and description for each attribute. This handler is for internal use only.

Your tag class must extend the abstract class SearchVADropDownListTag and implement ISearchAttributeValueTag. For example, the BooleanVATag class implements populateValueDropDownList to provide the two Boolean values:

    protected void populateValueDropDownList(SearchDropDownList ddList)
    {
        Option optionTrue = new Option();
        optionTrue.setValue("1");
        optionTrue.setLabel(SearchControl.getString("MSG_TRUE", ddList));
        ddList.addOption(optionTrue);

        Option optionFalse = new Option();
        optionFalse.setValue("0");
        optionFalse.setLabel(SearchControl.getString("MSG_FALSE", ddList));
        ddList.addOption(optionFalse);
    }

Chapter 5 Configuring CenterStage Search

This chapter contains the following topics:
• Set Federated Search Services options
• Improving search performance

Set Federated Search Services options

Federated search is available if your organization has enabled the connection with the Federated Search Services (FS2) server. Federated search allows users to search external and internal sources at the same time and display all results consistently. This section briefly describes the main steps to add and configure external sources.
For more information on FS2, see the EMC Documentum Federated Search Services 6.6 Administration Guide, available within the CenterStage product on EMC Online Support (https://support.emc.com).

You manage external sources using the Admin Center FS2 administration tool. Each external source in CenterStage is an information source in Admin Center. An information source relies on an adapter bundle (available as a *.jar file) and a specific configuration.

Some information sources can be available with a default configuration because they correspond to public information sources. For example, the information sources Google, Wikipedia, OpenDirectory, and YahooDirectory are already configured and available in CenterStage. Other information sources require configuration before being available to users.

The following adapter bundles are available out-of-the-box with FS2:
• EMC Documentum ECM (Enterprise Content Management)
• EMC Documentum eRoom
• EMC Documentum ApplicationXtender
• EMC Documentum EmailXtender
• EMC SourceOne
• JDBC/ODBC
• Google Desktop Enterprise
• Windows Search
• OpenSearch
• FS2 Indexing for shared drives

The configuration of each adapter is described in the EMC Documentum Federated Search Services Adapter Installation Guide.

FS2 Admin Center can be accessed using a URL such as:

    https://<FS2_server_host>:<Admin_Center_port_number>/AdminCenter

where <FS2_server_host> is the name or the IP address of the FS2 server, and <Admin_Center_port_number> is set to 3003 by default.
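The URL pattern above can be assembled as in the following minimal Java sketch. The class AdminCenterUrlSketch and the host name fs2.example.com are hypothetical placeholders. Only the https scheme, the /AdminCenter path, and the default port 3003 come from the description above.

```java
import java.net.URI;

// Hypothetical helper that assembles the FS2 Admin Center URL.
final class AdminCenterUrlSketch {

    // Default Admin Center port, as stated in this section.
    static final int DEFAULT_ADMIN_CENTER_PORT = 3003;

    // Builds https://<host>:<port>/AdminCenter for an explicit port.
    static URI adminCenterUri(String host, int port) {
        return URI.create("https://" + host + ":" + port + "/AdminCenter");
    }

    // Builds the URL with the default port.
    static URI adminCenterUri(String host) {
        return adminCenterUri(host, DEFAULT_ADMIN_CENTER_PORT);
    }

    public static void main(String[] args) {
        // fs2.example.com is a placeholder host name.
        System.out.println(adminCenterUri("fs2.example.com"));
    }
}
```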
Use FS2 Admin Center to perform the following administration tasks:
• Add information sources
• Upload new bundles
• Configure and test the adapters
• Set the authentication mode for the information sources: public access, corporate account (same account shared by all users), and user account

Improving search performance

Due to the high number of available formats in the repository, searches perform poorly when the user selects formats in the format filter. To improve search performance, configure the format filter to ignore the formats that are not used. You can restore the filters at any time.

You ignore a format by setting the format_class attribute to kw_ignore in the formats table. Ignoring some formats also reduces the list of possible formats in the Others format filter, which can be a long list.

To ignore a format:
1. In DA, open the DQL editor.
2. Run the following DQL query to get the list of available formats in the repository:
       SELECT name, mime_type, description FROM dm_format WHERE NOT ANY format_class='kw_ignore' ORDER BY name
3. Run the following DQL query, where xyz is the format to ignore:
       UPDATE dm_format OBJECTS APPEND format_class='kw_ignore' WHERE "name" = 'xyz'
4. Restart the application server to clear the cache of the formats table.

To restore a format:
1. In DA, open the DQL editor.
2. Run the following DQL query, where xyz is the format to restore:
       UPDATE dm_format OBJECTS REMOVE format_class[0] WHERE "name" = 'xyz'
   Index [0] is used if no value was already set for the repeating attribute format_class before the format was ignored. Otherwise, check for the right index.
3. Restart the application server to clear the cache of the formats table.

Chapter 6
Troubleshooting

This chapter contains the following topics:
• Troubleshooting Search
• Problem queries
• Debugging

Troubleshooting Search

Set the xPlore search service log level to WARN to log queries.
If query auditing is enabled (the default), you can view or edit reports on queries. Refer to EMC Documentum xPlore Administration and Development Guide for more information. For performance-related configuration, refer to EMC Documentum xPlore Administration and Development Guide.

Inconsistent results between database and full-text queries

Some queries generate different results when they are executed as a full-text query than when they are executed as a database query. Possible reasons for this problem are discussed in the following topics.

Document too large to be indexed

You can set a maximum size for content that is indexed by CPS. You set the actual document size, not the size of the text within the content. To set the maximum content size, edit the index agent configuration file. For more information, refer to EMC Documentum xPlore Administration and Development Guide.

You can configure xPlore CPS to change the maximum text size within a document, or change the thread pool size. You can also add a separate CPS instance that is dedicated to processing. This processor does not interfere with query processing. For more information, refer to EMC Documentum xPlore Administration and Development Guide.

Verifying the query plugin

Check the Content Server log after you start the Content Server. The file repository_name.log is located in $DOCUMENTUM/dba/log. Look for a line like the following, which references a plugin with DSEARCH in the name:

    Mon Jun 14 21:53:50 2010 031000 [DM_FULLTEXT_T_QUERY_PLUGIN_VERSION]info: "Loaded FT Query Plugin: ...C:\Documentum\product\6.5/bin/DSEARCHQueryPlugin.dll...

The Content Server query plugin properties of the dm_ftengine_config object are set during xPlore configuration. If you have changed one of the properties, like the primary xPlore host, the plugin can fail.
Verify the plugin properties, especially the query server host (dsearch_qrserver_host), with the following DQL:

    1> select param_name, param_value from dm_ftengine_config
    2> go

You see specific properties like the following:

    param_name                  param_value
    dsearch_qrygen_mode         both
    fast_wildcard_compatible    true
    query_plugin_mapping_file   C:\Documentum\fulltext\dsearch\dm_AttributeMapping.xml
    dsearch_domain              DSS_LH1
    dsearch_qrserver_host       Config8518VM0
    dsearch_qrserver_port       9300
    dsearch_qrserver_target     /dsearch/IndexServerServlet

Indexing latency

Latency is the time interval between two events. In the context of searching, the delay between a change in the repository and the corresponding update of the index can cause inconsistent results. For example, the following situations generate latency periods that result in inconsistent results:
• An object was deleted in the repository but that deletion is not yet reflected in the index.
  In this case, a query against the index returns a result, whereas the same query against the repository does not.
• An object was added to the repository but is not yet added to the index.
  In this case, a query against the repository returns the result, whereas the same query against the index does not.

Lemmatization differences

The full-text engine uses lemmatization (grammatical normalization) when conducting a search. Database searches do not support lemmatization: Content Server returns only exact matches. This means that the same query, run against the index and run again against the database, can return different numbers of results.

Case sensitivity differences

Searches on the full-text index are not case sensitive. Searches in the database are case sensitive by default. This difference can cause queries to return different numbers of results. For example, suppose you issue the following query:

    SELECT object_name,object_owner,title FROM dm_document WHERE subject = 'bread' ENABLE(FTDQL)

The example query runs as a full-text query.
This query returns all objects whose subject is 'bread', 'Bread', 'bRead', or any other combination of uppercase and lowercase letters that spell bread. If the query is run with the ENABLE(NOFTDQL) hint, it runs against the database. In that case, the query returns only those objects whose subject is 'bread', all lowercase. If you want to run that query against the database in a case-insensitive manner, you can use the UPPER (or LOWER) function:

    SELECT object_name,object_owner,title FROM dm_document WHERE UPPER(subject) = UPPER('bread')

Problem queries

A query can have the following problems:
• Foreign language not identified
  The first language that is identified is associated with the document for indexing. Content in other languages might not be properly indexed. Queries issued from Documentum clients are searched in the language of the session_locale. The search client can set the session locale through DFC or iAPI.
• Query is unselective
  A query is unselective when it searches for a property value that is common among the objects in the repository. For example, the following query is unselective if the specified property value is common:

      SELECT object_name, object_owner FROM dm_sysobject WHERE a_storage_type = "engrfilestore" ENABLE(FTDQL)

  If engrfilestore is the default file store for sysobjects, this query finds many objects but not the object the user is searching for.
• Search contains a wildcard
  Wildcards match separate terms, not fragments of a term. Fragment search support can be turned on in xPlore, but it causes slower performance. For details, refer to EMC Documentum xPlore Administration and Development Guide. Wildcards are supported in attribute searches. The operator * matches 0 or more characters.
• Query for a specific folder
  Folder descend query performance can depend on folder hierarchy and data distribution across folders.
  The following conditions can degrade query performance:
  – Many folders, a large portion of which are empty: increase folder_cache_limit in the dm_ftengine_config object.
  – The search predicate is unselective but the folder constraint is selective: decrease folder_cache_limit in the dm_ftengine_config object.
  The folder_cache_limit setting in the dm_ftengine_config object specifies the maximum number of folder IDs probed. The default is 2000. If the folder descend condition evaluates to fewer folder IDs than the folder_cache_limit value, the folder IDs are pushed into the index probe. If the condition exceeds the folder_cache_limit value, the folder constraint is evaluated separately for each result.
• Search for XML elements
  By default, XML content of an input document is not indexed. You can change XML indexing in the xml-content element of the xPlore configuration file, indexserverconfig.xml. For more information, refer to EMC Documentum xPlore Administration and Development Guide.
• Document indexed but term not found
  Because lemmatization is context-based, a word is tokenized differently depending on its context in a sentence, yielding variable results. For example, saw is lemmatized to the verb to see or to the noun saw depending on the context. A query sometimes does not have enough context to determine which of these bases is required. In another example, the noun swimming is not lemmatized to the related verb to swim. A search for swimming does not return documents containing swim. (Alternative lemmas solve this issue: both lemmas are saved for ambiguous contexts.) Lemmatization of queries is more prone to error than lemmatization during indexing because less context is available. See EMC Documentum xPlore Administration and Development Guide.
• Query contains special characters
  A search for a string containing special characters is treated as a phrase search.
  For example, when home_base is indexed, home and base are stored next to each other. A search for home_base finds the containing document but does not find other documents that contain home or base but not both. Another example is a list of names containing White,Jim. This list is tokenized as "White,Jim" because the comma is treated as a context character. A search for "White" does not return this document. You can configure the special characters list to remove the comma. See EMC Documentum xPlore Administration and Development Guide.
• xQuery with DfXQuery.java is not thread-safe
  To execute the xQuery and other queries in one session, the xQuery must be synchronized until the result stream is closed, as shown in the following example:

      synchronized(session.getDocbaseConnection())
      {
          try
          {
              xq.execute(session, target);
              InputStream in = xq.getInputStream(session);
              // Copy the result stream into a ByteArrayInputStream
              // so that xq can be closed
              byte[] buff = new byte[10000];
              int bytesRead = 0;
              ByteArrayOutputStream bao = new ByteArrayOutputStream();
              while((bytesRead = in.read(buff)) != -1)
              {
                  bao.write(buff, 0, bytesRead);
              }
              // is: an InputStream variable declared outside this block
              is = new ByteArrayInputStream(bao.toByteArray());
          }
          finally
          {
              xq.close();
          }
      }

Debugging

You can test queries in xPlore administrator. Reports on slow queries allow you to see the actual query and how it was executed.

Using Documentum Administrator, you can trace full-text querying operations. Go to Job Management > Administration Methods > MODIFY_TRACE. Two tracing levels are available:
• None: Tracing is turned off.
• All: Content Server and full-text messages resulting from queries are logged.

You can trace index agent operations. See EMC Documentum xPlore Administration and Development Guide.

If the query fails to return expected results in Webtop, perform a Ctrl-click on the Edit button in the results page.
The query is displayed in the events history as a select statement like the following:

    IDfQueryEvent(INTERNAL, DEFAULT): [dm_notes] returned [Start processing] at [2010-06-30 02:31:00:176 -0700]
    IDfQueryEvent(INTERNAL, NATIVEQUERY): [dm_notes] returned [SELECT text,object_name,score,summary,r_modify_date,... SEARCH DOCUMENT CONTAINS 'ctrl-click' WHERE (...]

This action also displays the list of events that occurred during the search: the DQL sent, the FS2 query sent, and the errors from search sources. If there is a processing error, the stack trace is shown.

Appendix A
DFC schemas

This appendix covers the following topics:
• DQL hints file DTD
• Extended object search schema

DQL hints file DTD

Following is the hints file DTD, parsed and enforced in DFC. It does not need a doctype declaration.

    <!ELEMENT RuleSet (Rule*)>
    <!ELEMENT Rule (Condition?, DQLHint?, SelectOption?, DisableFullText?, DisableFTDQL?)>
    <!ELEMENT Condition (Select?, From?, Where?, Docbase?, FulltextExpression?)>
    <!ELEMENT DQLHint (#PCDATA)>
    <!ELEMENT SelectOption (#PCDATA)>
    <!ELEMENT DisableFullText EMPTY>
    <!ELEMENT DisableFTDQL EMPTY>
    <!ELEMENT Select (Attribute+)>
    <!ATTLIST Select condition (all | any) "all">
    <!ELEMENT From (Type+)>
    <!ATTLIST From condition (all | any) "all">
    <!ELEMENT Where (Attribute+)>
    <!ATTLIST Where condition (all | any) "all">
    <!ELEMENT Docbase (Name+)>
    <!ELEMENT FulltextExpression EMPTY>
    <!ATTLIST FulltextExpression exists (true | false) #REQUIRED>
    <!ELEMENT Attribute (#PCDATA)>
    <!ATTLIST Attribute operator (equal|not_equal|greater_than|greater_equal|less_than|less_equal|like|not_like|is_null|is_not_null|in|not_in|between) #IMPLIED>
    <!ELEMENT Type (#PCDATA)>
    <!ELEMENT Name (#PCDATA)>
    <!ATTLIST Name descend (true | false) #IMPLIED>

Extended object search schema

<?xml version="1.0"?>
<xsd:schema targetNamespace="http://www.documentum.com"
    xmlns:doc="http://www.documentum.com"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.documentum.com">
  <xsd:element name="mapping" type="doc:JAXBMappingXplore"/>
  <!--==================================================-->
  <!-- Complex Types definition -->
  <!--==================================================-->
  <xsd:complexType name="JAXBMappingXplore">
    <xsd:sequence>
      <xsd:element name="interface" type="doc:JAXBSearchInterfaceXplore" minOccurs="1" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="JAXBSearchInterfaceXplore">
    <xsd:sequence>
      <xsd:element name="alias" type="doc:JAXBAliasXplore" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
    <xsd:attribute name="name" type="doc:Name" use="required"/>
    <xsd:attribute name="map-to" type="doc:Identifier" use="optional"/>
    <xsd:attribute name="primary" type="xsd:boolean" use="optional" default="false"/>
  </xsd:complexType>
  <xsd:complexType name="JAXBAliasXplore">
    <xsd:attribute name="name" type="doc:Name" use="required"/>
    <xsd:attribute name="map-to" type="doc:MixIdentifier" use="required"/>
    <xsd:attribute name="cardinality" default="ONE" type="doc:Cardinality"/>
  </xsd:complexType>
  <!--==================================================-->
  <!-- Simple Types definition -->
  <!--==================================================-->
  <xsd:simpleType name="Name">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[a-zA-Z][a-z_A-Z0-9]*"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="Identifier">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[a-zA-Z][a-z_>A-Z0-9\.]*"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="MixIdentifier">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[a-zA-Z][a-z_>A-Z]*(\.[a-zA-Z][a-z_>A-Z]*){0,2}"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="Cardinality">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="ONE"/>
      <xsd:enumeration value="MANY"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>

Index

C
case sensitivity
    in WDK basic search, 68
    of queries, 88
Content Server, 7

F
Federated Search Services
    Admin Center, setting external sources, 85

I
IDfQueryBuilder, 24
IDfQueryManager, 24
IDfQueryProcessor, 24
IDfSearchMetadataMgr, 25
index agent
    described, 7
index server
    xPlore, 8

L
languages
    indexing, 8
latency
    inconsistent query results, and, 88
lemmatization
    inconsistent query results, and, 88

M
multirepository search
    data model, 25

P
performance
    suppress folder path display, 74
    suppress summary calculation, 74

Q
queries
    case sensitivity, 88
    inconsistent results, causes, 87
    lemmatization of, 88
query
    definition, in DFC, 24

S
search
    number of results, 72
    results display, 72
    term hit highlighting, 72
slow query
    unselective, 89
special characters
    troubleshooting, 90

T
term hit highlighting
    in WDK search, 72

W
wildcard
    contains fragment, 79

X
xPlore
    index server, 8