14.5.1 MegaScale
Trial

 
MegaScale has limitations, and will not be suitable for every use case. See Limitations below.

MegaScale is a mechanism for storing virtually unlimited amounts of data in a single FHIR server. It uses multiple database instances to create discrete pools of data which are logically separate, but are managed under a single Smile CDR FHIR Storage (RDBMS) module.

In its simplest terms, a MegaScale-enabled server can be thought of as a partitioned FHIR repository where individual partitions or groups of partitions are stored in separate database schemas, and potentially in separate physical database instances.

Using this strategy can be helpful in cases such as:

  • The total amount of data to be stored exceeds the natural limit of a single instance of the database technology being used.
  • Data needs to be kept physically separate due to privacy or legal rules.

14.5.2 Architecture
Trial

 

In MegaScale mode, one or more FHIR Endpoint modules are combined with a single FHIR Storage (RDBMS) module. Each incoming FHIR request includes a tenant identifier that maps to a particular partition, and that partition in turn determines the target database. This architecture is shown in the diagram below.

[Figure: MegaScale Architecture]
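The routing chain described above (tenant identifier, then partition, then database) can be sketched as two lookups. This Python is illustrative only and is not Smile CDR code: the lookup tables, the database names, the function `resolve_database`, and the assumption that the tenant identifier is the first path segment of an endpoint served at the root path are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical lookup tables; real values come from your own configuration.
PARTITION_BY_TENANT = {"P1": 1, "P2": 2, "P3": 3}
DATABASE_BY_PARTITION = {1: "db_east", 2: "db_east", 3: "db_west"}


def resolve_database(request_url: str) -> str:
    """Resolve the target database for a request URL such as /P1/Patient."""
    # Assumption: the tenant identifier is the first path segment.
    segments = [s for s in urlsplit(request_url).path.split("/") if s]
    if not segments:
        raise ValueError("request URL does not contain a tenant identifier")
    tenant = segments[0]
    try:
        partition_id = PARTITION_BY_TENANT[tenant]
    except KeyError:
        raise ValueError(f"unknown tenant: {tenant}") from None
    # The partition, not the tenant name, decides which database is used.
    return DATABASE_BY_PARTITION[partition_id]
```

In this sketch, two tenants (P1 and P2) share one database while P3 is stored elsewhere, which mirrors the idea that several partitions can live in the same database.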

14.5.3 Partitions and Shards

 

A MegaScale architecture uses two concepts, Partitions and Shards. The two terms are related but mean different things.

A Partition is a single grouping of resources. Any individual resource must be assigned to a single partition, and that partition will generally contain multiple resources.

One or more partitions are assigned to a given database schema. This grouping of Partitions to a single database schema is called a Shard.

The following diagram shows one possible mapping of the 15000 partitions defined in Patient ID Partition Mode to 3 shards. This is only one possible mapping, however; fewer or more shards can be used depending on anticipated storage and scaling requirements.

[Figure: MegaScale Shards and Partitions]
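To make the partition-to-shard grouping concrete, here is a minimal Python sketch. It assumes, purely for illustration, that partition IDs run from 1 to 15000 and that each shard holds a contiguous range of 5000 partitions; the shard names and the function `shard_for_partition` are hypothetical, and an actual deployment uses whatever mapping its configuration defines.

```python
from bisect import bisect_right

# Hypothetical layout: partitions 1..15000 split into 3 shards of
# 5000 contiguous partition IDs each. Ranges and names are assumptions.
SHARD_STARTS = [1, 5001, 10001]  # first partition ID held by each shard
SHARD_NAMES = ["shard_a", "shard_b", "shard_c"]
PARTITION_COUNT = 15000


def shard_for_partition(partition_id: int) -> str:
    """Return the name of the shard (database schema) holding a partition."""
    if not 1 <= partition_id <= PARTITION_COUNT:
        raise ValueError(f"unknown partition: {partition_id}")
    # bisect_right counts how many range starts are <= partition_id.
    index = bisect_right(SHARD_STARTS, partition_id) - 1
    return SHARD_NAMES[index]
```

Changing the number of shards in this sketch only means editing the two lists; the partitions themselves are unchanged, which is the distinction between the two terms.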

14.5.4 Patient ID Partition Selection Mode

 

See MegaScale Patient ID Partition Selection Modes for details on how to use these partition selection modes with MegaScale.

14.5.5 FHIR Transactions Spanning Multiple Shards

 

MegaScale creates an architecture where different partitions are stored on different shards (see Partitions and Shards above). This has several implications for the semantics and operation of FHIR Transaction processing, but it does not mean that FHIR transactions cannot be used, even when they span multiple shards.

When loading data using a FHIR Transaction Bundle, the system will automatically attempt to respect the semantics of the FHIR transaction as much as possible, but will make compromises where necessary if a transaction needs to span multiple shards.

When a FHIR Transaction Bundle needs to write to multiple shards, it is automatically split into multiple discrete FHIR Transaction Bundles, which are executed in sequence. The server orders these Bundles according to the resource dependencies within the original Bundle, and uses the outcome of earlier Bundles to inform the processing of later ones.

For example, suppose you have configured your Partition Selection Mode to Patient ID Partition Selection Mode, with your Ancillary resources on a separate MegaScale database from your Patient resources. In this example, you might have Patient and Encounter resources referencing Organization and Location resources in the same FHIR Transaction Bundle.

In this scenario, the server will automatically process the Ancillary resources first. Any newly assigned resource IDs will be used in references from the subsequent Patient and Encounter resources.

As a result, it is not possible to have circular dependencies in FHIR Transaction Bundles executed on a MegaScale server where the cycle crosses shard boundaries. For example, if your Patient and Ancillary data are on separate shards, attempting to process a FHIR Transaction Bundle with a reference from a Patient to an Organization where the Organization also holds a reference to the Patient would result in an error.
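The splitting and ordering behaviour described in this section can be approximated with a topological sort over shards. The Python below is a simplified sketch and not the server's implementation: it orders whole shards rather than individual resources, and the function `plan_sub_bundles`, its input shapes, and the shard names in the usage example are all hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def plan_sub_bundles(entries, shard_of):
    """Group Bundle entries by shard and order the groups by dependency.

    entries:  dict mapping a resource key (e.g. "Patient/p1") to the list
              of resource keys it references.
    shard_of: dict mapping each resource key to its shard name.
    Returns a list of (shard, [resource keys]) in execution order.
    """
    groups = {}      # shard -> resource keys written to that shard
    depends_on = {}  # shard -> shards that must be processed first
    for key, references in entries.items():
        shard = shard_of[key]
        groups.setdefault(shard, []).append(key)
        depends_on.setdefault(shard, set())
        for target in references:
            if target not in entries:
                continue  # reference to a resource outside this Bundle
            target_shard = shard_of[target]
            if target_shard != shard:  # only cross-shard references matter
                depends_on[shard].add(target_shard)
    try:
        order = list(TopologicalSorter(depends_on).static_order())
    except CycleError as err:
        raise ValueError("circular reference crosses shard boundaries") from err
    return [(shard, groups[shard]) for shard in order]
```

Using the example above, with Patient and Encounter on one shard and Organization and Location on another:

```python
entries = {
    "Patient/p1": ["Organization/o1"],
    "Encounter/e1": ["Patient/p1", "Location/l1"],
    "Organization/o1": [],
    "Location/l1": [],
}
shard_of = {
    "Patient/p1": "patient_shard",
    "Encounter/e1": "patient_shard",
    "Organization/o1": "ancillary_shard",
    "Location/l1": "ancillary_shard",
}
plan = plan_sub_bundles(entries, shard_of)
# The ancillary group is planned first, then the patient group.
```

If the Organization also referenced the Patient, the two shards would depend on each other and the sketch would raise an error, matching the circular-dependency restriction described above.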

14.5.6 Limitations
Trial

 

This section lists the known limitations of this feature.

14.5.6.1 FHIR Interactions

The following FHIR interactions have been tested:

  • Create/Update
  • Search
  • $reindex – It is possible to reindex a single partition by providing the tenant name in the URL (e.g. /P1/$reindex), or all partitions using _ALL as the tenant name (e.g. POST /_ALL/$reindex).
  • $validate
  • $expunge – Expunge everything is verified to work, and will only expunge everything for a single MegaScale database at a time.
  • $delete-expunge – Delete expunge is verified to work, and will only delete and expunge for a single MegaScale database at a time.
  • FHIR Transactions

No other features, operations, or interactions have been tested or are expected to work with MegaScale.
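For the `$reindex` operation listed above, the tenant name in the URL decides whether one partition or all partitions are reindexed. The small Python helper below only builds the request URL and is a sketch: the base URL in the usage lines and the function name `reindex_url` are hypothetical, while the `_ALL` tenant name and the `/P1/$reindex` path shape come from the list above.

```python
from urllib.parse import quote

ALL_PARTITIONS = "_ALL"  # tenant name that targets every partition


def reindex_url(base_url: str, tenant: str = ALL_PARTITIONS) -> str:
    """Build the URL for a POST $reindex request scoped to a tenant."""
    # safe="" also escapes "/" so the tenant stays a single path segment.
    return f"{base_url.rstrip('/')}/{quote(tenant, safe='')}/$reindex"


single = reindex_url("http://localhost:8000", "P1")
everything = reindex_url("http://localhost:8000")
```

Here `single` targets only the P1 partition, and `everything` uses the `_ALL` tenant name to target all partitions.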

You must ensure that all updates within a single Bundle target a single MegaScale database. This is always the case in REQUEST_TENANT partitioning mode, but may not be the case in other partitioning modes, such as Patient ID partitioning or custom partitioning solutions.

14.5.6.2 Cross-Partition Searching

Search requests will only include results from a single database.

14.5.7 Configuration
Trial

 

To enable MegaScale mode, the following settings must be configured.

On the FHIR Storage (RDBMS) module:

On the FHIR Endpoint module:

14.5.8 Connection Provider Interceptor
Trial

 

MegaScale connection details are supplied by a Java Smile CDR Interceptor that implements the STORAGE_MEGASCALE_PROVIDE_DB_INFO pointcut.

See Example: MegaScale Connection Provider for an illustration of how this pointcut can be used. The example is also available in the Interceptor Starter Project.