Troubleshooting / FAQ

Target System: Apache Solr

Schema Handling

Before each content traversal, the Apache Solr connector fetches the schema of the collection that documents are indexed into. Only metadata fields defined in the schema are indexed for each document. The connector considers both explicit and dynamic schema fields.
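
The filtering described above can be sketched as follows. This is only an illustration, not the connector's actual implementation: the function names and sample field names are hypothetical, and the sketch assumes dynamic field patterns with a single leading or trailing `*`.

```python
def matches_dynamic(name, pattern):
    """Return True if a field name matches a dynamic field pattern.

    Assumption: the pattern carries one wildcard, either leading
    ("*_s") or trailing ("attr_*"). A bare "*" matches every name.
    """
    if pattern.startswith("*"):
        return name.endswith(pattern[1:])
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    return name == pattern


def filter_metadata(metadata, explicit_fields, dynamic_patterns):
    """Keep only the metadata whose field name is covered by the schema."""
    return {
        name: value
        for name, value in metadata.items()
        if name in explicit_fields
        or any(matches_dynamic(name, p) for p in dynamic_patterns)
    }


# Hypothetical schema and document, for illustration only.
explicit = {"id", "title"}
dynamic = ["*_s", "attr_*"]
doc = {"id": "42", "title": "Report", "color_s": "red", "unknown": "x"}
kept = filter_metadata(doc, explicit, dynamic)
# "unknown" is dropped because no explicit or dynamic field covers it.
```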

The following field type classes trigger special preprocessing:

Class                Preprocessing
solr.DatePointField  Metadata field values which match a schema field of this type are formatted to ISO-8601 standard with second precision.
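
As a rough illustration of this formatting, the sketch below converts a timestamp to the UTC form that Solr date fields expect (for example `1995-12-31T23:59:59Z`). The function name is hypothetical, and truncating sub-second digits (rather than rounding) is an assumption, since the text above only states second precision.

```python
from datetime import datetime, timedelta, timezone


def to_solr_date(value):
    """Format a timezone-aware datetime as ISO-8601 in UTC, second precision."""
    if value.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Normalize to UTC and drop microseconds (assumption: truncate, not round).
    utc = value.astimezone(timezone.utc).replace(microsecond=0)
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ")


# A timestamp at UTC+02:00 with microseconds.
sample = datetime(2024, 3, 5, 14, 30, 15, 987654,
                  tzinfo=timezone(timedelta(hours=2)))
formatted = to_solr_date(sample)
```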

Schemaless mode is not supported!

To index all extracted metadata, add the following dynamic field definition to the schema:

<dynamicField name="*"  type="text_general"  indexed="true"  stored="true"/>

This way, all fields that are not explicitly defined are interpreted as text fields.

Special Metadata

A defined set of metadata is standardized across connectors. These standard metadata fields take precedence over metadata extracted from the source system. If a field name extracted from the source system clashes with the name of a standard metadata field, only the standard metadata field is indexed.

Standard Metadata Field  Description
title                    The (interpreted) title of the document.
sourceName               The name of the source system.
sourceUrl                The URL pointing to the source system.
sourceType               The source system type.
itemType                 The type of the item / document.
clickUrl                 The clickable URL referencing the document.
lastModifiedDate         The last modification date.
createdDate              The creation date.
previewUrl               A URL which points to a preview of the document.
mimeType                 The mime type of the document.
authorId                 The ID of the author.
authorName               The display name of the author.
fileExtension            The file extension.
fileType                 The pretty name of the file type.
contributorIds           A list of contributor IDs.
contributorNames         A list of contributor names.
breadcrumbNames          A list of names referencing items in the hierarchy above the current document.
breadcrumbUrls           A list of URLs referencing items in the hierarchy above the current document.
languages                A list of languages.
keywords                 A list of keywords.
content                  The document’s content as part of the metadata.
docAllowAcl              The allow document ACL.
docDenyAcl               The deny document ACL.
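
The precedence rule for standard metadata can be pictured as a simple dictionary merge. This is a minimal sketch with a hypothetical function name and sample values; it is not taken from the connector itself.

```python
def merge_metadata(source_metadata, standard_metadata):
    """Merge both sets of fields; standard metadata wins on name clashes."""
    merged = dict(source_metadata)
    # Standard fields are applied last, so they overwrite clashing names.
    merged.update(standard_metadata)
    return merged


# Hypothetical values: the source system also delivers a "title" field.
from_source = {"title": "raw_title.docx", "department": "sales"}
standard = {"title": "Quarterly Report", "sourceType": "fileshare"}
merged = merge_metadata(from_source, standard)
# merged["title"] is the standard value; "department" is kept unchanged.
```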

Scheduling Commits

The connector does not trigger commits to the index. To make data searchable, either commit manually or schedule automatic commits. For further information, see the official Apache Solr documentation.
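
A manual commit can be issued by sending a commit command to the collection's update handler. The sketch below uses only the Python standard library; the base URL and collection name are placeholders, and the helper names are hypothetical. Scheduled commits are usually configured on the Solr side (for example with `autoCommit` in `solrconfig.xml`) rather than in client code.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_commit_request(base_url, collection):
    """Build a POST request carrying a JSON commit command for /update."""
    url = f"{base_url.rstrip('/')}/{quote(collection, safe='')}/update"
    body = json.dumps({"commit": {}}).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


def commit(base_url, collection, timeout=30):
    """Send the commit command and return the HTTP status code."""
    request = build_commit_request(base_url, collection)
    with urlopen(request, timeout=timeout) as response:
        return response.status


# Placeholder address and collection name; adjust to the actual installation.
request = build_commit_request("http://localhost:8983/solr/", "docs")
```

Calling `commit("http://localhost:8983/solr", "docs")` would send the request; it is left out of the example so that the sketch does not depend on a running Solr instance.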

Unable to access jarfile error (installation on Windows)

This error occurs when the installation path exceeds the maximum Windows path length of 260 characters. Ensure that the full path to bin\connector.bat does not exceed 260 characters.
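
To check a planned installation directory in advance, the length of the resulting path can be computed. This is a small sketch with hypothetical helper names and example directories; the 260-character limit is taken from the description above.

```python
from pathlib import PureWindowsPath

MAX_WINDOWS_PATH = 260  # limit as stated in the text above


def connector_bat_path(install_dir):
    """Return the full path to bin\\connector.bat below the install directory."""
    return PureWindowsPath(install_dir) / "bin" / "connector.bat"


def path_too_long(install_dir, limit=MAX_WINDOWS_PATH):
    """Return True if the path to connector.bat exceeds the limit."""
    return len(str(connector_bat_path(install_dir))) > limit


# Hypothetical installation directories.
short_dir = r"C:\connectors\solr"
long_dir = "C:\\" + "a" * 300
```

`PureWindowsPath` handles Windows path syntax on any operating system, so the check can also run on a non-Windows machine before the files are copied.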