7.2.1 Performance Tuning for FHIR Storage (Relational)

The default configuration of the FHIR Storage (Relational) module prioritizes complete FHIR specification conformance and a full feature set over performance.

This page outlines settings that can be adjusted to improve performance.

7.2.2 Improving Write Performance

In many scenarios, large amounts of data need to be written to the CDR (e.g. during initial backloads, or for update-heavy applications). Even for routine operation, if your architecture involves lots of resource writing (create/update/patch/etc.), then it is worth tuning the system for write performance.

The following considerations may be helpful in planning for high-volume write scenarios:

7.2.2.1 Smile CDR Settings

The following settings can be used to tune your system for the fastest write performance.

  • Disable deletes: If the Delete Enabled setting is turned off, a number of deletion checks can be skipped when writing data, and an additional client-assigned ID cache is automatically enabled. This reduces the number of reads required during a resource create/update, especially if your resources have lots of references to other resources. Consider turning this setting off (even if only temporarily) during bulk loading exercises. If you do not intend to use FHIR Delete operations, it is a good idea to leave it turned off permanently. You can re-enable deletes at any time without issue if your needs change in the future.

  • Enable Match URL Cache: If the Match URL Cache setting is enabled, the resolution of any conditional URLs used in your write operations (e.g. conditional create, conditional update) will be cached in an in-memory cache. This can improve overall write performance, especially in cases where conditional creates are frequently being used to resolve references to the same targets. For example, suppose you are uploading many ExplanationOfBenefit resources in a FHIR transaction bundle, and each one has a reference to a Patient and a Practitioner resource, each one using a conditional create. With this setting enabled, the two lookups for each ExplanationOfBenefit can be served from the cache whenever the same targets have already been resolved. Note that this setting should not be used if your write patterns will change the targets of your conditional URLs (this is generally not the case, but should be considered).

  • Disable logs: The Audit Log and Transaction Log both require database processing, and add to the overall load during a write operation. Consider disabling one or both, especially during backloading activities.

  • Disable Unnecessary Features: The following features should be disabled (even if temporarily during backload) unless they are needed, as they add additional processing time for each resource being loaded.

  • Enable Mass Ingestion Mode: The Mass Ingestion Mode setting tunes the system to prioritize write operations over read operations.

  • Tune Search Parameters: The FHIR specification describes a rich set of default Search Parameters for every resource type, and these are all enabled by default. Every enabled search parameter means additional processing work when a resource is written, so disabling search parameters that are not used can have a significant impact on write performance. See Search Parameter Tuning for more information.
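The ExplanationOfBenefit scenario described under the Match URL Cache setting can be sketched as follows. This is a minimal illustration only, written in Python and using the standard FHIR conditional create mechanism (the ifNoneExist element of a transaction entry). The identifier systems under example.org and the helper function names are hypothetical, and the ExplanationOfBenefit omits elements that a real resource would require.

```python
import uuid


def conditional_create_entry(resource, match_url):
    """Build a transaction entry that creates `resource` only if nothing matches `match_url`."""
    return {
        # Placeholder ID; other entries in the same bundle can reference it.
        "fullUrl": f"urn:uuid:{uuid.uuid4()}",
        "resource": resource,
        "request": {
            "method": "POST",
            "url": resource["resourceType"],
            "ifNoneExist": match_url,  # the conditional URL the server must resolve
        },
    }


def build_eob_bundle(eob_id, patient_mrn, practitioner_npi):
    # Hypothetical identifier systems, for illustration only.
    mrn_system = "http://example.org/mrn"
    npi_system = "http://example.org/npi"
    eob_system = "http://example.org/eob"

    patient = conditional_create_entry(
        {"resourceType": "Patient",
         "identifier": [{"system": mrn_system, "value": patient_mrn}]},
        f"identifier={mrn_system}|{patient_mrn}",
    )
    practitioner = conditional_create_entry(
        {"resourceType": "Practitioner",
         "identifier": [{"system": npi_system, "value": practitioner_npi}]},
        f"identifier={npi_system}|{practitioner_npi}",
    )
    eob = {
        "fullUrl": f"urn:uuid:{uuid.uuid4()}",
        "resource": {
            "resourceType": "ExplanationOfBenefit",
            "identifier": [{"system": eob_system, "value": eob_id}],
            # References point at the placeholder IDs of the entries above.
            "patient": {"reference": patient["fullUrl"]},
            "provider": {"reference": practitioner["fullUrl"]},
        },
        "request": {"method": "POST", "url": "ExplanationOfBenefit"},
    }
    return {
        "resourceType": "Bundle",
        "type": "transaction",
        "entry": [patient, practitioner, eob],
    }
```

Every ExplanationOfBenefit that names the same Patient and Practitioner repeats the same two conditional URLs, which is the pattern the Match URL Cache is intended to help with.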

7.2.3 Example Properties File

The following snippet may be used in your configuration properties file in order to enable many of the settings described above.

# Disable Audit and Transaction Logs
module.clustermgr.config.audit_log.db.enabled=false
module.clustermgr.config.audit_log.broker.enabled=false
module.clustermgr.config.transactionlog.enabled=false

# Read performance
# Note: Setting this property to true will prevent pagination from working with the $everything operation
module.persistence.config.always_use_offset_searches               =true

# Write Performance
module.persistence.config.seed_default_search_params               =false
module.persistence.config.suppress_string_indexing_in_tokens       =true
module.persistence.config.dao_config.tag_storage_mode              =INLINE
module.persistence.config.dao_config.delete_enabled                =false
module.persistence.config.dao_config.match_url_cache.enabled       =true
module.persistence.config.dao_config.mass_ingestion_mode           =true
module.persistence.config.dao_config.enforce_reference_target_types=false

# Reduce the number of active Search Parameters. The values below are an example only;
# your specific needs may be different.
module.persistence.config.search_parameter_seeding.disable_patterns=*
module.persistence.config.search_parameter_seeding.enable_patterns =\
  *:patient\
  Practitioner:identifier\
  Location:identifier\
  Organization:identifier\
  ExplanationOfBenefit:identifier\
  ExplanationOfBenefit:type\
  ExplanationOfBenefit:service-date

7.2.3.1 Environment Preparation

7.2.3.2 Data Design

  • Use transactions: FHIR Transactions allow multiple operations to be batched into a single database transaction. Submitting multiple resources in a single transaction is almost always faster than submitting them individually (i.e. each one in its own HTTP request), especially if those resources have references to each other. Note that you do not want to create transactions of unlimited size: the entire transaction bundle is loaded into memory during processing, so available memory is a practical limit to consider. Bundles containing hundreds or sometimes thousands of resources are common.

    • Avoid Small FHIR Transactions: While FHIR transactions are a great tool for improving performance, a FHIR Transaction Bundle with a small number of entries (e.g. 1-2) will often perform slightly worse than equivalent individual operations due to the extra processing required to support placeholder ID resolution in transactions. It is often worth testing whether breaking up a transaction will yield better performance for your specific use case.
  • Avoid client-assigned IDs: The FHIR "create with client-assigned ID" interaction uses an update/PUT operation to let the client control the ID of the resource inserted in the database, rather than relying on the server to assign one. This can be handy if you are replicating data from another system and want your IDs to match between systems. Client-assigned IDs come at a price, however, as the system needs to first perform a read before every write to ensure that a resource doesn't already exist with the specified ID. It is often feasible to use the identifier field instead of the id field to store these source system IDs.
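The data design suggestions above can be combined in a small loader sketch. The Python below is illustrative only: the function names, the default of 500 entries per bundle, and the example.org identifier system in the usage are assumptions, not Smile CDR recommendations. It moves each source-system ID from the id field into identifier so that the server assigns the resource ID, and groups the resources into bounded transaction bundles. Note that references between resources that relied on the original IDs would need to be rewritten separately, and that plain creates will produce duplicates if the same data is submitted twice.

```python
def to_create_entry(resource, source_system):
    """Move a source-system ID from `id` to `identifier` and build a create (POST) entry."""
    resource = dict(resource)  # shallow copy so the caller's dict is left untouched
    source_id = resource.pop("id", None)
    if source_id is not None:
        identifiers = list(resource.get("identifier", []))
        identifiers.append({"system": source_system, "value": source_id})
        resource["identifier"] = identifiers
    return {
        "resource": resource,
        # POST lets the server assign the ID, avoiding the read-before-write of a PUT.
        "request": {"method": "POST", "url": resource["resourceType"]},
    }


def chunk_into_transactions(resources, source_system, max_entries=500):
    """Yield transaction bundles holding at most `max_entries` resources each."""
    if max_entries < 1:
        raise ValueError("max_entries must be at least 1")
    for start in range(0, len(resources), max_entries):
        batch = resources[start:start + max_entries]
        yield {
            "resourceType": "Bundle",
            "type": "transaction",
            "entry": [to_create_entry(r, source_system) for r in batch],
        }
```

For example, list(chunk_into_transactions(resources, "http://example.org/source-id")) turns 1,200 resources into three bundles of 500, 500 and 200 entries. The right bundle size depends on resource size and available memory, so it is worth measuring a few values for your own data.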

7.2.4 Disabling Non Resource DB History

This setting controls whether DB history is enabled for non-resource entities (e.g. Patient is a resource; MdmLink is not). Presently, this only affects the history for MDM links, but the functionality may be extended to other domains.

Clients may want to disable this setting for performance reasons as it populates a new set of database tables when enabled.

Setting this property explicitly to false disables the feature: Non Resource DB History

7.2.5 Enabling Index Storage Optimization (Trial)

If enabled, the server will not write data to the SP_NAME, RES_TYPE, and SP_UPDATED columns of any HFJ_SPIDX_xxx table.

This setting may be enabled on servers where HFJ_SPIDX_xxx tables are expected to hold a large amount of data (millions of rows) in order to reduce overall storage size. The size of the HFJ_SPIDX_xxx tables can be lowered by up to 10%, saving up to 5% of persistent database space.

Setting this property explicitly to true enables the feature: Optimize index storage

7.2.5.1 Limitations

  • Note that this setting only applies to newly inserted and updated rows in HFJ_SPIDX_xxx tables. In order to apply this setting to existing HFJ_SPIDX_xxx index rows, Manual Search Parameter Reindexing should be executed at the instance or server level.

  • If this setting is enabled along with the Index Missing Search Params setting, the following index may need to be added to the HFJ_SPIDX_xxx tables to improve search performance: (HASH_IDENTITY, SP_MISSING, RES_ID, PARTITION_ID).
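As a rough sketch of the index mentioned above, the Python below generates one CREATE INDEX statement per table. The column list comes from the text; the index naming scheme and the table names passed in the usage are assumptions for illustration, so check them against your own schema and your database's DDL conventions before running anything.

```python
# Column order as given in the limitation above.
INDEX_COLUMNS = ("HASH_IDENTITY", "SP_MISSING", "RES_ID", "PARTITION_ID")


def missing_index_ddl(tables):
    """Return one CREATE INDEX statement for each table name in `tables`."""
    statements = []
    for table in tables:
        # Hypothetical index name derived from the table suffix, e.g. TOKEN.
        suffix = table.split("HFJ_SPIDX_", 1)[-1]
        index_name = f"IDX_SP_{suffix}_MISSING_EX"
        columns = ", ".join(INDEX_COLUMNS)
        statements.append(f"CREATE INDEX {index_name} ON {table} ({columns});")
    return statements


if __name__ == "__main__":
    # Example table names; adjust the list to the index tables present in your schema.
    for statement in missing_index_ddl(["HFJ_SPIDX_TOKEN", "HFJ_SPIDX_STRING"]):
        print(statement)
```

Generating the statements rather than executing them leaves room to review them, and to add database-specific options, before they are applied.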

7.2.5.2 Search Parameter Hash Identity in Capability Statement

If the Optimize index storage setting is enabled, the Capability Statement (/metadata endpoint) will return a hashIdentity value for each active Search Parameter, for troubleshooting purposes.