Smile CDR v2023.08.PRE
On this page:

7.1Resource Storage Mode

 

This page describes settings which can be used to optimize the way that resources are stored in the database.

7.1.1Resource Body Storage

 

By default, the FHIR Storage (Relational) module uses a database LOB column to store resource body text in a separate storage area. See the RES_TEXT column on the HFJ_RES_VER table in the HAPI FHIR Database Schema for more information.

This storage mode allows for resources of unlimited size, limited only by available memory. Data is compressed using the GZip compression algorithm in order to reduce storage requirements. This is called Compressed BLOB Storage.

Resources can also be stored in a simple text column called RES_TEXT_VC using a SQL datatype that stores arbitrarily long text strings (text on PostgreSQL, varchar(MAX) on MSSQL, long on Oracle, etc.). This is called Inline Storage.

The Inline Resource Storage Below Size (chars) is used to configure a size threshold below which resources will be stored using Inline Storage. For example, if this setting is set to 10000, any resources whose total length (measured using non-pretty-printed JSON FHIR encoding) is less than 10000 will be stored using Inline Storage and any resources above than this threshold will be stored using compressed BLOB Storage.

Optimizing for Performance

Inline Storage will always perform better than Compressed Blob Storage, at the expense of needing more disk storage space and potentially more memory. The default threshold value of 10000 is intended to provide a suitable balance between these concerns.

Any resources that need to be read or written using Compressed Blob Storage require an extra database round-trip due to the use of a LOB column. This additional overhead is generally not worth worrying about if you are performing small-scale operations (i.e. writing one or a few resources, or submitting a FHIR transaction Bundle which creates 20 resources). However, at larger scale this time can add up to have a significant impact on your system's performance.

Operations which read or write 1000 resources (e.g. large FHIR transaction Bundles, Bulk Export jobs, etc.) are often up to 10x faster if most/all resources are stored using inline storage. If you are designing a system with this type of operation in mind, you should consider increasing the threshold to a size that is larger than the size of the majority of your resources.

Adjusting the Threshold

It is always safe to modify this setting, even if data is already loaded. If the threshold is changed, any data that is stored after the change will be stored using the appropriate mechanism for the new threshold. Any data that has already been stored will remain in the column determined by the previous setting. This means that large-scale read operations may still not see any improvement right away.

If you wish to migrate existing data in place to account for your new threshold, you can perform a server-level $reindex with the optimizeStorage parameter set to true.

7.1.2Practical Limitations

 

Response: Generally speaking, the only real limitation is for fields that you want to index for searching. So for example, the length of Observation.note is effectively unbounded. The full resource body is temporarily stored in memory, so there are practical scalability limits if you had massive strings (ie multiple megabytes) and lots of concurrency, but other than that there is no limit.

However, in the case of fields that are indexed for searching the indexer may impose a maximum length on the field. For example, Patient.name.family will be indexed because it has a search parameter by default. The maximum length for string indexes is 200 characters. Note that this does not mean that you can not store a family name longer than 200 characters, only that the indexed value used for supporting searching will be truncated.