org.opencms.search
Class CmsSearchIndex

java.lang.Object
  extended by org.opencms.search.CmsSearchIndex
All Implemented Interfaces:
I_CmsConfigurationParameterHandler

public class CmsSearchIndex
extends java.lang.Object
implements I_CmsConfigurationParameterHandler

Implements the search within an index and the management of the index configuration.

Since:
6.0.0
Version:
$Revision: 1.62 $
Author:
Carsten Weinholz, Thomas Weckert, Alexander Kandzior

Field Summary
static java.lang.String EXCERPT
          Constant for additional param to enable excerpt creation (default: true).
static java.lang.String PERMISSIONS
          Constant for additional param to enable permission checks (default: true).
static java.lang.String PRIORITY
          Constant for additional param to set the thread priority during search.
static java.lang.String REBUILD_MODE_AUTO
          Automatic ("auto") index rebuild mode.
static java.lang.String REBUILD_MODE_MANUAL
          Manual ("manual") index rebuild mode.
static java.lang.String ROOT_PATH_SUFFIX
          Special root path append token for optimized path queries.
static java.lang.String ROOT_PATH_TOKEN
          Special root path start token for optimized path queries.
 
Fields inherited from interface org.opencms.configuration.I_CmsConfigurationParameterHandler
ADD_PARAMETER_METHOD, INIT_CONFIGURATION_METHOD
 
Constructor Summary
CmsSearchIndex()
          Default constructor, intended to be used only by the XML configuration.
CmsSearchIndex(java.lang.String name)
          Creates a new CmsSearchIndex with the given name.
 
Method Summary
 void addConfigurationParameter(java.lang.String key, java.lang.String value)
          Adds a parameter.
 void addSourceName(java.lang.String sourceName)
          Adds an index source to this search index.
 boolean checkConfiguration(CmsObject cms)
          Checks if this index has been configured correctly.
 boolean equals(java.lang.Object obj)
           
 java.util.Map getConfiguration()
          Returns the configuration of this parameter configurable class instance, or null if the class does not need to be configured.
 I_CmsDocumentFactory getDocumentFactory(CmsResource res)
          Returns the document type factory used for the given resource in this index, or null in case the resource is not indexed by this index.
 java.util.List getDocumenttypes(java.lang.String path)
          Deprecated. Use getDocumentFactory(CmsResource) instead to find out if this index is 'interested' in a resource.
 CmsSearchFieldConfiguration getFieldConfiguration()
          Returns the search field configuration of this index.
 java.lang.String getFieldConfigurationName()
          Returns the name of the field configuration used for this index.
 org.apache.lucene.index.IndexWriter getIndexWriter(boolean create)
          Returns a new index writer for this index.
 java.util.Locale getLocale()
          Gets the language of this index.
 java.lang.String getLocaleString()
          Returns the locale of the index as a String.
 java.lang.String getName()
          Gets the name of this index.
 java.lang.String getPath()
          Returns the path where this index stores its data in the "real" file system.
 java.lang.String getProject()
          Gets the project of this index.
 java.lang.String getRebuildMode()
          Get the rebuild mode of this index.
 java.util.List getSourceNames()
          Returns all configured source names of this search index.
 java.util.List getSources()
          Returns all configured index sources of this search index.
 int hashCode()
           
protected  boolean hasReadPermission(CmsObject cms, org.apache.lucene.document.Document doc)
          Checks if the OpenCms resource referenced by the result document can be read by the user of the given OpenCms context.
 void initConfiguration()
          Initializes a configuration after all parameters have been added.
 void initialize()
          Initializes the search index.
 boolean isEnabled()
          Returns true if this index is currently enabled.
 void removeSourceName(java.lang.String sourceName)
          Removes an index source from this search index.
static java.lang.String rootPathRewrite(java.lang.String path)
          Rewrites a resource path for use in the CmsSearchField.FIELD_ROOT field.
static java.lang.String[] rootPathSplit(java.lang.String path)
          Splits a resource path into tokens for use in the CmsSearchField.FIELD_ROOT field and with the rootPathRewrite(String) method.
 CmsSearchResultList search(CmsObject cms, CmsSearchParameters params)
          Performs a search on the index within the given fields.
 void setEnabled(boolean enabled)
          Can be used to enable / disable this index.
 void setFieldConfiguration(CmsSearchFieldConfiguration fieldConfiguration)
          Sets the field configuration used for this index.
 void setFieldConfigurationName(java.lang.String fieldConfigurationName)
          Sets the name of the field configuration used for this index.
 void setLocale(java.util.Locale locale)
          Sets the locale to index resources.
 void setLocaleString(java.lang.String locale)
          Sets the locale to index resources as a String.
 void setName(java.lang.String name)
          Sets the logical key/name of this search index.
 void setProject(java.lang.String projectName)
          Sets the name of the project used to index resources.
 void setProjectName(java.lang.String projectName)
          Sets the name of the project used to index resources.
 void setRebuildMode(java.lang.String rebuildMode)
          Sets the rebuild mode of this search index.
 java.lang.String toString()
          Returns the name (getName()) of this search index.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

EXCERPT

public static final java.lang.String EXCERPT
Constant for additional param to enable excerpt creation (default: true).


PERMISSIONS

public static final java.lang.String PERMISSIONS
Constant for additional param to enable permission checks (default: true).


PRIORITY

public static final java.lang.String PRIORITY
Constant for additional param to set the thread priority during search.


REBUILD_MODE_AUTO

public static final java.lang.String REBUILD_MODE_AUTO
Automatic ("auto") index rebuild mode.

See Also:
Constant Field Values

REBUILD_MODE_MANUAL

public static final java.lang.String REBUILD_MODE_MANUAL
Manual ("manual") index rebuild mode.

See Also:
Constant Field Values

ROOT_PATH_SUFFIX

public static final java.lang.String ROOT_PATH_SUFFIX
Special root path append token for optimized path queries.

See Also:
Constant Field Values

ROOT_PATH_TOKEN

public static final java.lang.String ROOT_PATH_TOKEN
Special root path start token for optimized path queries.

See Also:
Constant Field Values
Constructor Detail

CmsSearchIndex

public CmsSearchIndex()
Default constructor, intended to be used only by the XML configuration.

It is recommended to use the constructor CmsSearchIndex(String) as it enforces the mandatory name argument.


CmsSearchIndex

public CmsSearchIndex(java.lang.String name)
               throws CmsIllegalArgumentException
Creates a new CmsSearchIndex with the given name.

Parameters:
name - the system-wide unique name for the search index
Throws:
CmsIllegalArgumentException - if the given name is null, empty or already taken by another search index.
Method Detail

rootPathRewrite

public static java.lang.String rootPathRewrite(java.lang.String path)
Rewrites a resource path for use in the CmsSearchField.FIELD_ROOT field.

All "/" chars in the path are replaced with the ROOT_PATH_SUFFIX token. This is required in order to use a Lucene "phrase query" on the resource path. A phrase query performs much better than a straightforward "prefix query". With a "prefix query", Lucene would internally generate a huge list of boolean sub-queries, exactly one for every document in the VFS subtree of the query. So if you query on "/sites/default/*" on a large OpenCms installation, this means thousands of sub-queries. With the "phrase query", only one (or very few) queries are generated internally, and the result is the same.

This implementation basically replaces the "/" of a path with "@o.c ". This is a trick so that the Lucene analyzer leaves the directory names untouched, since it treats them like literal email addresses. Otherwise the language analyzer might modify the directory names, leading to potential duplicates (e.g. members/ and member/ may both be trimmed to member), so that the prefix search returns more results than expected.

Parameters:
path - the path to rewrite
Returns:
the re-written path
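
The rewriting idea described above can be illustrated with a small standalone sketch. It is not the OpenCms implementation: the class name, the rewrite helper and the handling of empty path segments are assumptions made for illustration, and the suffix value is taken from the "@o.c " replacement mentioned in the description (the authoritative value is the ROOT_PATH_SUFFIX constant). Any handling of ROOT_PATH_TOKEN is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of OpenCms.
class RootPathRewriteSketch {

    // Assumed suffix, based on the "@o.c " replacement described above.
    static final String SUFFIX = "@o.c";

    // Turns each folder name of the path into one token "name@o.c",
    // with the tokens separated by single spaces.
    static String rewrite(String path) {
        List<String> tokens = new ArrayList<String>();
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                // the suffix makes the analyzer treat the name like an email address
                tokens.add(segment + SUFFIX);
            }
        }
        return String.join(" ", tokens);
    }

    public static void main(String[] args) {
        // "members" and "member" remain distinct tokens after rewriting
        System.out.println(rewrite("/sites/default/members/"));
        System.out.println(rewrite("/sites/default/member/"));
    }
}
```

Under these assumptions every folder name becomes a single token, so a query for a folder can be expressed as a phrase of consecutive tokens instead of a prefix match over all documents below that folder.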

rootPathSplit

public static java.lang.String[] rootPathSplit(java.lang.String path)
Splits a resource path into tokens for use in the CmsSearchField.FIELD_ROOT field and with the rootPathRewrite(String) method.

Parameters:
path - the path to split
Returns:
the split path
See Also:
rootPathRewrite(String)

addConfigurationParameter

public void addConfigurationParameter(java.lang.String key,
                                      java.lang.String value)
Adds a parameter.

Specified by:
addConfigurationParameter in interface I_CmsConfigurationParameterHandler
Parameters:
key - the key/name of the parameter
value - the value of the parameter

addSourceName

public void addSourceName(java.lang.String sourceName)
Adds an index source to this search index.

Parameters:
sourceName - the index source name to add

checkConfiguration

public boolean checkConfiguration(CmsObject cms)
Checks if this index has been configured correctly.

In case the check fails, the enabled property is set to false.

Parameters:
cms - an OpenCms user context to perform the checks with (should have "Administrator" permissions)
Returns:
true in case the index is correctly configured and enabled after the check
See Also:
isEnabled()

equals

public boolean equals(java.lang.Object obj)
Overrides:
equals in class java.lang.Object
See Also:
Object.equals(java.lang.Object)

getConfiguration

public java.util.Map getConfiguration()
Description copied from interface: I_CmsConfigurationParameterHandler
Returns the configuration of this parameter configurable class instance, or null if the class does not need to be configured.

All elements in the configuration are key, value String pairs, set using the I_CmsConfigurationParameterHandler.addConfigurationParameter(String, String) method during initialization of the loader.

Implementations should not return a direct reference to the internal configuration but only a copy of it, to avoid unwanted external manipulation.

Specified by:
getConfiguration in interface I_CmsConfigurationParameterHandler
Returns:
the configuration of this resource loader, or null
See Also:
I_CmsConfigurationParameterHandler.getConfiguration()
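
The recommendation to return a copy rather than the internal map can be sketched as follows. This is a minimal standalone illustration, not the code of CmsSearchIndex: the class name, the field name and the parameter key "example.key" are hypothetical, and only the two method names mirror the interface described above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class, not part of OpenCms.
class ConfigurationCopySketch {

    // internal configuration, never handed out directly
    private final Map<String, String> parameters = new HashMap<String, String>();

    void addConfigurationParameter(String key, String value) {
        parameters.put(key, value);
    }

    // returns a copy, so callers cannot modify the internal state
    Map<String, String> getConfiguration() {
        return new HashMap<String, String>(parameters);
    }

    public static void main(String[] args) {
        ConfigurationCopySketch index = new ConfigurationCopySketch();
        index.addConfigurationParameter("example.key", "true");

        // modifying the returned map only changes the copy
        Map<String, String> copy = index.getConfiguration();
        copy.put("example.key", "false");

        System.out.println(index.getConfiguration().get("example.key"));
    }
}
```

A copy costs one extra map per call, which is usually acceptable for configuration data that is read rarely and must not change behind the owner's back.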

getDocumentFactory

public I_CmsDocumentFactory getDocumentFactory(CmsResource res)
Returns the document type factory used for the given resource in this index, or null in case the resource is not indexed by this index.

A resource is indexed if the following is all true:

  1. The index contains at least one index source matching the root path of the given resource.
  2. For this matching index source, the document type factory needed by the resource is also configured.

Parameters:
res - the resource to check
Returns:
the document type factory used for the given resource in this index, or null in case the resource is not indexed by this index
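
The two conditions listed above can be sketched as a simple lookup. This is only an illustration under stated assumptions: the Source class, the findDocumentType helper and the use of a plain prefix match on the root path are all hypothetical, and document type factories are represented by their names instead of I_CmsDocumentFactory instances.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class, not part of OpenCms.
class DocumentFactoryLookupSketch {

    // Simplified stand-in for an index source: root paths plus configured document types.
    static final class Source {
        final List<String> rootPaths;
        final Set<String> documentTypes;

        Source(List<String> rootPaths, Set<String> documentTypes) {
            this.rootPaths = rootPaths;
            this.documentTypes = documentTypes;
        }
    }

    // Returns the needed document type name if some source matches the root path (condition 1)
    // and that same source also has the type configured (condition 2); otherwise null.
    static String findDocumentType(List<Source> sources, String rootPath, String neededType) {
        for (Source source : sources) {
            for (String prefix : source.rootPaths) {
                // assumption: "matching" is modelled as a plain prefix match
                if (rootPath.startsWith(prefix) && source.documentTypes.contains(neededType)) {
                    return neededType;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Source source = new Source(
            Arrays.asList("/sites/default/"),
            new HashSet<String>(Arrays.asList("html")));
        List<Source> sources = Arrays.asList(source);

        System.out.println(findDocumentType(sources, "/sites/default/index.html", "html"));
        System.out.println(findDocumentType(sources, "/system/modules/readme.html", "html"));
    }
}
```

A null result plays the same role as in the description above: the index is not "interested" in the resource.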

getDocumenttypes

public java.util.List getDocumenttypes(java.lang.String path)
Deprecated. Use getDocumentFactory(CmsResource) instead to find out if this index is 'interested' in a resource.

Returns a list of names (Strings) of configured document type factories for the given resource path.

Parameters:
path - path of the folder
Returns:
a list of names (Strings) of configured document type factories for the given resource path

getFieldConfiguration

public CmsSearchFieldConfiguration getFieldConfiguration()
Returns the search field configuration of this index.

Returns:
the search field configuration of this index

getFieldConfigurationName

public java.lang.String getFieldConfigurationName()
Returns the name of the field configuration used for this index.

Returns:
the name of the field configuration used for this index

getIndexWriter

public org.apache.lucene.index.IndexWriter getIndexWriter(boolean create)
                                                   throws CmsIndexException
Returns a new index writer for this index.

Parameters:
create - if true a whole new index is created, if false an existing index is updated
Returns:
a new instance of IndexWriter
Throws:
CmsIndexException - if the index can not be opened

getLocale

public java.util.Locale getLocale()
Gets the language of this index.

Returns:
the language of the index, e.g. de

getLocaleString

public java.lang.String getLocaleString()
Returns the locale of the index as a String.

Returns:
the locale of the index as a String
See Also:
getLocale()

getName

public java.lang.String getName()
Gets the name of this index.

Returns:
the name of the index

getPath

public java.lang.String getPath()
Returns the path where this index stores its data in the "real" file system.

Returns:
the path where this index stores its data in the "real" file system

getProject

public java.lang.String getProject()
Gets the project of this index.

Returns:
the project of the index, e.g. "online"

getRebuildMode

public java.lang.String getRebuildMode()
Get the rebuild mode of this index.

Returns:
the current rebuild mode

getSourceNames

public java.util.List getSourceNames()
Returns all configured source names of this search index.

Returns:
a list with all configured source names of this search index

getSources

public java.util.List getSources()
Returns all configured index sources of this search index.

Returns:
all configured index sources of this search index

hashCode

public int hashCode()
Overrides:
hashCode in class java.lang.Object
See Also:
Object.hashCode()

initConfiguration

public void initConfiguration()
Description copied from interface: I_CmsConfigurationParameterHandler
Initializes a configuration after all parameters have been added.

Specified by:
initConfiguration in interface I_CmsConfigurationParameterHandler
See Also:
I_CmsConfigurationParameterHandler.initConfiguration()

initialize

public void initialize()
                throws CmsSearchException
Initializes the search index.

Throws:
CmsSearchException - if the index source association failed

isEnabled

public boolean isEnabled()
Returns true if this index is currently enabled.

Returns:
true if this index is currently enabled

removeSourceName

public void removeSourceName(java.lang.String sourceName)
Removes an index source from this search index.

Parameters:
sourceName - the index source name to remove

search

public CmsSearchResultList search(CmsObject cms,
                                  CmsSearchParameters params)
                           throws CmsSearchException
Performs a search on the index within the given fields.

The result is returned as a List with entries of type I_CmsSearchResult.

Parameters:
cms - the current user's Cms object
params - the parameters to use for the search
Returns:
the List of results found or an empty list
Throws:
CmsSearchException - if something goes wrong

setEnabled

public void setEnabled(boolean enabled)
Can be used to enable / disable this index.

Parameters:
enabled - the state of the index to set

setFieldConfiguration

public void setFieldConfiguration(CmsSearchFieldConfiguration fieldConfiguration)
Sets the field configuration used for this index.

Parameters:
fieldConfiguration - the field configuration to set

setFieldConfigurationName

public void setFieldConfigurationName(java.lang.String fieldConfigurationName)
Sets the name of the field configuration used for this index.

Parameters:
fieldConfigurationName - the name of the field configuration to set

setLocale

public void setLocale(java.util.Locale locale)
Sets the locale to index resources.

Parameters:
locale - the locale to index resources

setLocaleString

public void setLocaleString(java.lang.String locale)
Sets the locale to index resources as a String.

Parameters:
locale - the locale to index resources
See Also:
setLocale(Locale)

setName

public void setName(java.lang.String name)
             throws CmsIllegalArgumentException
Sets the logical key/name of this search index.

Parameters:
name - the logical key/name of this search index
Throws:
CmsIllegalArgumentException - if the given name is null, empty or already taken by another search index.

setProject

public void setProject(java.lang.String projectName)
Sets the name of the project used to index resources.

A duplicate of setProjectName(String) that allows instances of this class to be used as a widget object (bean convention, cf. getProject()).

Parameters:
projectName - the name of the project used to index resources

setProjectName

public void setProjectName(java.lang.String projectName)
Sets the name of the project used to index resources.

Parameters:
projectName - the name of the project used to index resources

setRebuildMode

public void setRebuildMode(java.lang.String rebuildMode)
Sets the rebuild mode of this search index.

Parameters:
rebuildMode - the rebuild mode of this search index {auto|manual}

toString

public java.lang.String toString()
Returns the name (getName()) of this search index.

Overrides:
toString in class java.lang.Object
Returns:
the name (getName()) of this search index
See Also:
Object.toString()

hasReadPermission

protected boolean hasReadPermission(CmsObject cms,
                                    org.apache.lucene.document.Document doc)
Checks if the OpenCms resource referenced by the result document can be read by the user of the given OpenCms context.

Parameters:
cms - the OpenCms user context to use for permission testing
doc - the search result document to check
Returns:
true if the user has read permissions to the resource