7.3.1 Fulltext/Terminology Indexing

 

Smile CDR uses Hibernate Search to provide fast indexing capabilities for specific query patterns that benefit from this type of index. Using this feature is completely optional, and Smile CDR will function without it.

However, its use can significantly improve performance and enable additional functionality on terminology operations, so it is recommended for use in environments where validation or other terminology operations will be heavily used. For more information, see Terminology and Lucene Indexing.

7.3.1.1 Indexes

When Hibernate Search Indexing is enabled, one or more of the following indexes should be enabled:

  • FullText Content Index – The FullText Content index indexes all string element values within every resource stored to the repository. This enables fulltext searches, but it can consume a large amount of space, so plan capacity carefully before enabling this feature on a repository that contains a large number of resources.

  • Search Parameter Index – The Search Parameter index extends token indexing with support for the :text modifier, as well as indexing string parameters with support for the :exact, :contains, and :text search modifiers. This will increase space and memory requirements.

  • Store Resources – The Store Resources option indexes all resource data values within every resource stored to the repository. This can be useful in allowing fulltext searches, but it can consume a large amount of space, so plan capacity carefully before enabling this feature on a repository that contains a large number of resources.

Note that enabling any of these options on a populated index requires reindexing existing resources.
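As an illustration of the search modifiers that the Search Parameter index supports, the following minimal sketch builds FHIR search URLs that use `:contains`, `:exact`, and `:text`. The base URL, the `search_url` helper, and the example resource types and values are assumptions chosen for illustration; they are not part of Smile CDR.

```python
from urllib.parse import urlencode

# Hypothetical FHIR endpoint base URL; substitute your own.
BASE_URL = "http://localhost:8000"


def search_url(resource_type, param, value, modifier=None):
    """Build a FHIR search URL, optionally applying a search modifier."""
    key = f"{param}:{modifier}" if modifier else param
    # Keep ':' unescaped so the modifier stays readable in the URL.
    query = urlencode({key: value}, safe=":")
    return f"{BASE_URL}/{resource_type}?{query}"


# String parameter with the :contains and :exact modifiers
print(search_url("Patient", "name", "smi", "contains"))
print(search_url("Patient", "family", "Smith", "exact"))
# Token parameter with the :text modifier
print(search_url("Observation", "code", "glucose", "text"))
```

Whether a given modifier is served by the index depends on which of the options above are enabled on the repository.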

7.3.1.2 Hibernate Search Providers

Smile CDR supports three providers for indexing services. Each one has its own characteristics:

  • LUCENE_MEMORY – This provider keeps a copy of the index in memory. This provider is only recommended for testing and non-production scenarios. It is very fast, but will lose all stored data when the system shuts down.

  • LUCENE_DISK – This provider uses a local disk file to store indexing information. This option is a production-ready solution, but is only recommended for non-clustered environments.

  • ELASTICSEARCH – This provider uses an Elasticsearch/OpenSearch cluster to provide indexing. This option is suitable for clustered environments. Note that to use this option, you will need to create your own Elasticsearch/OpenSearch cluster.
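The guidance above can be summarized as a small decision helper. This is only a sketch of the recommendations in this section: the function name and its two boolean inputs are hypothetical, and a real deployment decision may involve other factors.

```python
def recommend_provider(production: bool, clustered: bool) -> str:
    """Return the provider name suggested by the guidance in this section."""
    if clustered:
        # Only the Elasticsearch/OpenSearch provider is described as
        # suitable for clustered environments.
        return "ELASTICSEARCH"
    if production:
        # Production-ready, but recommended only for non-clustered setups.
        return "LUCENE_DISK"
    # In-memory index: fast, but all data is lost on shutdown.
    return "LUCENE_MEMORY"


print(recommend_provider(production=True, clustered=False))
```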

7.3.2 Lucene Disk Provider

 

When using the Lucene Disk Provider, a directory on the local filesystem should be configured using the Hibernate Search Directory option.

This directory should be created on a disk partition that has enough capacity to store all content for the selected indexes.
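A quick way to check the capacity of the partition that holds the index directory is sketched below. The directory path and the required size are placeholders; the actual space needed depends on which indexes are enabled and on the size of the repository.

```python
import shutil


def has_capacity(index_dir: str, required_bytes: int) -> bool:
    """Return True if the partition holding index_dir has enough free space."""
    usage = shutil.disk_usage(index_dir)  # named tuple: total, used, free
    return usage.free >= required_bytes


# Hypothetical values: check the current directory for 50 GiB of free space.
required = 50 * 1024**3
print(has_capacity(".", required))
```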

7.3.3 Elasticsearch/OpenSearch Provider

 

When using the Elasticsearch/OpenSearch provider, the following settings should be provided:

  • URL – The Elasticsearch/OpenSearch cluster endpoint URL, without the protocol.

  • Protocol – The Elasticsearch/OpenSearch protocol (http or https).

  • Username – The Elasticsearch/OpenSearch connection username.

  • Password – The Elasticsearch/OpenSearch connection password.
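Because the URL and Protocol settings are supplied separately, a full endpoint URL has to be split before it is entered. The sketch below shows one way to do that; the `split_endpoint` helper, the dictionary keys, and the example hostname are illustrative assumptions, not Smile CDR property names.

```python
from urllib.parse import urlsplit


def split_endpoint(full_url: str) -> dict:
    """Split a full endpoint URL into a protocol and a protocol-less URL."""
    parts = urlsplit(full_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"Unsupported protocol: {parts.scheme!r}")
    # Host and port, plus any path prefix, without a trailing slash.
    return {
        "protocol": parts.scheme,
        "url": parts.netloc + parts.path.rstrip("/"),
    }


# Hypothetical cluster endpoint
print(split_endpoint("https://search.example.com:9200"))
```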

For Smile CDR to connect to the Elasticsearch/OpenSearch cluster, the connecting user must have at least the permissions listed below.

Each index permission (those prefixed with indices:) must be granted on both of the following index patterns:

  • resourcetable-*
  • termconcept-*

7.3.3.1 OpenSearch Permissions

| Permission | Reason |
|---|---|
| indices:admin/get | Listing existing indices |
| indices:admin/aliases | Ability to create write and read aliases for necessary indices |
| indices:admin/create | Ability to create required indices |
| indices:admin/mapping/put | Custom index settings for Hibernate Search managed indices |
| indices:data/write/bulk* | Hibernate Search will often buffer and bulk-write in high throughput situations |
| indices:data/write/delete | Data removal or reindexing needs to be updated to the fulltext index |
| indices:data/write/index | Storing data to Elasticsearch/OpenSearch |
| indices:data/read/search | Terminology/Fulltext search |
| cluster:monitor/main | Hibernate Search monitoring indices for ILM rollover |
| cluster:monitor/health | Hibernate Search health checks for the cluster |
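The OpenSearch permissions above could be collected into a single role definition. The sketch below builds a JSON body in the shape used by the OpenSearch Security plugin's role definitions (`cluster_permissions` and `index_permissions`); treat that shape as an assumption to verify against the documentation for your OpenSearch version. The function name is hypothetical.

```python
import json

# Index patterns and actions taken from this section.
INDEX_PATTERNS = ["resourcetable-*", "termconcept-*"]
INDEX_ACTIONS = [
    "indices:admin/get",
    "indices:admin/aliases",
    "indices:admin/create",
    "indices:admin/mapping/put",
    "indices:data/write/bulk*",
    "indices:data/write/delete",
    "indices:data/write/index",
    "indices:data/read/search",
]
CLUSTER_ACTIONS = ["cluster:monitor/main", "cluster:monitor/health"]


def build_opensearch_role() -> dict:
    """Assemble a role body granting the minimal permissions listed above."""
    return {
        "cluster_permissions": CLUSTER_ACTIONS,
        "index_permissions": [
            {
                "index_patterns": INDEX_PATTERNS,
                "allowed_actions": INDEX_ACTIONS,
            }
        ],
    }


print(json.dumps(build_opensearch_role(), indent=2))
```

The resulting role would then be mapped to the user configured in the Username setting.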

7.3.3.2 Elasticsearch Permissions

| Permission | Reason |
|---|---|
| indices:manage | Creation of indices and mapping management, as well as ILM |
| indices:read | Ability to read and search indices |
| indices:write | Ability to write to required indices |
| cluster:monitor | Hibernate Search health checks for the cluster |
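For Elasticsearch, an equivalent role body might look like the sketch below. It assumes that the permissions listed above correspond to the built-in Elasticsearch privileges named `manage`, `read`, `write` (index level) and `monitor` (cluster level), and that the body follows the shape accepted by the Elasticsearch security role API; both points should be checked against the documentation for your Elasticsearch version. The function name is hypothetical.

```python
import json

# Index patterns taken from this section.
INDEX_PATTERNS = ["resourcetable-*", "termconcept-*"]


def build_elasticsearch_role() -> dict:
    """Assemble a role body with the minimal privileges listed above.

    Assumption: 'indices:manage' etc. map to the built-in privilege
    names without the 'indices:' or 'cluster:' prefix.
    """
    return {
        "cluster": ["monitor"],
        "indices": [
            {
                "names": INDEX_PATTERNS,
                "privileges": ["manage", "read", "write"],
            }
        ],
    }


print(json.dumps(build_elasticsearch_role(), indent=2))
```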