org.opencms.search.documents
Interface I_CmsSearchExtractor

All Known Subinterfaces:
I_CmsDocumentFactory
All Known Implementing Classes:
A_CmsVfsDocument

public interface I_CmsSearchExtractor

Defines a text extractor for the integrated search engine.

The job of a search extractor is to extract indexable plain text from a resource in the OpenCms VFS. This may be from the resource content, for example from a PDF file, or from the resource properties, for example the Title, Keywords and Description properties.
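
The two text sources named above (resource content and resource properties) can be illustrated with a minimal, self-contained sketch. It does not use the OpenCms API: SketchResource, SketchExtractor, CONTENT, PROPERTIES and stripMarkup are hypothetical names introduced only for illustration, and the markup stripping is a deliberately crude regular expression rather than a real format-specific extractor.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: every type here is a hypothetical stand-in, not an OpenCms class.
class ExtractorSketch {

    /** Simplified stand-in for a VFS resource: raw content plus named properties. */
    static final class SketchResource {
        final String content;
        final Map<String, String> properties;

        SketchResource(String content, Map<String, String> properties) {
            this.content = content;
            this.properties = properties;
        }
    }

    /** Stand-in for the extractor idea: turn a resource into indexable plain text. */
    interface SketchExtractor {
        String extract(SketchResource resource);
    }

    /** Extracts text from the resource content by removing markup. */
    static final SketchExtractor CONTENT = r -> stripMarkup(r.content);

    /** Extracts text from the Title, Keywords and Description properties. */
    static final SketchExtractor PROPERTIES = r -> {
        StringBuilder text = new StringBuilder();
        for (String key : new String[] {"Title", "Keywords", "Description"}) {
            String value = r.properties.get(key);
            if (value != null) {
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(value);
            }
        }
        return text.toString();
    };

    /** Replaces tags with spaces, then collapses runs of whitespace. */
    static String stripMarkup(String markup) {
        return markup.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("Title", "Home");
        props.put("Keywords", "cms, search");
        SketchResource page = new SketchResource("<h1>Hello</h1><p>World of  search</p>", props);

        System.out.println(CONTENT.extract(page));    // Hello World of search
        System.out.println(PROPERTIES.extract(page)); // Home cms, search
    }
}
```

Keeping each text source behind the same small interface mirrors the idea that the indexer only needs plain text back, regardless of where that text came from.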

Since:
6.0.0
Version:
$Revision: 1.5 $
Author:
Carsten Weinholz

Method Summary
 I_CmsExtractionResult extractContent(CmsObject cms, A_CmsIndexResource resource, java.lang.String language)
          Extracts the content of a given resource according to the resource file type.
 

Method Detail

extractContent

public I_CmsExtractionResult extractContent(CmsObject cms,
                                            A_CmsIndexResource resource,
                                            java.lang.String language)
                                     throws CmsException
Extracts the content of a given resource according to the resource file type.

Parameters:
cms - the CmsObject used to access the VFS
resource - the resource to extract the content from
language - the requested language
Returns:
the extracted content of the resource
Throws:
CmsException - if something goes wrong
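
Because extractContent declares a checked exception, a caller has to decide what happens when extraction fails. The following sketch shows one possible caller-side pattern, under stated assumptions: SketchCmsException, SketchExtractor, BY_LANGUAGE and extractOrEmpty are hypothetical stand-ins, not OpenCms types, the cms argument is omitted, and the use of the language argument to select one of several per-language texts is an assumption made only for this example. How a real implementation interprets the language is not specified here.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: every type here is a hypothetical stand-in, not an OpenCms class.
class ExtractContentCallSketch {

    /** Stand-in for a checked exception such as CmsException. */
    static class SketchCmsException extends Exception {
        SketchCmsException(String message) {
            super(message);
        }
    }

    /** Stand-in for the extractContent contract, reduced to a resource and a language. */
    interface SketchExtractor {
        String extractContent(Map<String, String> textByLanguage, String language)
            throws SketchCmsException;
    }

    /** Assumed behavior: return the text stored for the requested language, or fail. */
    static final SketchExtractor BY_LANGUAGE = (texts, language) -> {
        String text = texts.get(language);
        if (text == null) {
            throw new SketchCmsException("no content for language: " + language);
        }
        return text;
    };

    /** Caller-side handling: a failed extraction yields empty text instead of aborting. */
    static String extractOrEmpty(SketchExtractor extractor,
                                 Map<String, String> texts,
                                 String language) {
        try {
            return extractor.extractContent(texts, language);
        } catch (SketchCmsException e) {
            return "";
        }
    }

    public static void main(String[] args) {
        Map<String, String> texts = new HashMap<>();
        texts.put("en", "Welcome");
        texts.put("de", "Willkommen");

        System.out.println(extractOrEmpty(BY_LANGUAGE, texts, "de"));           // Willkommen
        System.out.println(extractOrEmpty(BY_LANGUAGE, texts, "fr").isEmpty()); // true
    }
}
```

Returning empty text on failure is only one choice; a caller could equally log the exception or propagate it, depending on whether a single unreadable resource should stop the surrounding operation.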