14.6.1 MegaScale Patient ID Partition Selection Modes

 

MegaScale can be used in other ways, but it is designed to work well with Patient ID Partition Mode and Bucketed Patient ID Partition Mode. These are referred to as Patient ID Partition Modes on this page.

These modes are a good fit if you need to store large amounts of data and the majority of your queries will be patient-oriented (i.e. queries about a single patient or a list of patients). This generally means performing FHIR searches with a subject or patient parameter, such as Observation?patient=Patient/37510001&category=vital-signs and DocumentReference?patient=Patient/37510001&date=gt2025.

In these modes, all resources belonging to a single Patient compartment are colocated in the same partition. This means that reads and writes to data belonging to this compartment can be performed efficiently even within a large MegaScale deployment, since these operations need to access only a single Partition on a single Shard.

14.6.2 Required and Suggested Settings

 

When using MegaScale in Patient ID Partition Modes, the following settings should be considered:

  • Cross-Partition Reference Mode should be set to ALLOWED_UNQUALIFIED if you want to allow resources in the Patient compartment to hold references to resources in the default partition (such as Ancillary resources).
  • Any Patient.identifier.system values that will be used for conditional operations on Patient resources should be declared using Configuring Pre-Assigned Patient Identifier Systems.
  • Server ID Mode does not need a specific setting in order to work. However, since MegaScale encodes the partition IDs in the server-assigned resource IDs, if this setting is set to SEQUENTIAL_NUMERIC it will be possible to perform efficient FHIR Read operations on resources with these server-assigned IDs without needing to scan multiple shards.

14.6.3 Searching for Data

 

The following search patterns are supported for efficient queries in MegaScale in Patient ID Partition Modes:

  • Search Pattern: Search for ancillary resource by identifier.
    Example URL: http://base/Practitioner?identifier=http://pract|0
  • Search Pattern: Search for Patient Resource by identifier.
    Example URL: http://base/Patient?identifier=http://patient|1
    Notes: Only a single Patient identifier may be placed in the URL.
  • Search Pattern: Search for resources in the Patient Compartment by patient identifier. This search finds any resources that belong to the patient with the given identifier.
    Example URL: http://base/Encounter?patient.identifier=http://patient|1
    Notes: Only a single Patient identifier may be placed in the URL. The _include parameter may be used to include any referenced Patient compartment or ancillary resources.
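
As an illustration of the supported patterns, the following minimal Python sketch builds the example URLs shown above. The helper names are hypothetical and not part of any product API, and http://base is the placeholder base URL from the examples. The sketch rejects commas because a comma in a FHIR search value expresses an OR list, which would place more than one identifier in the URL.

```python
from typing import Optional
from urllib.parse import quote

BASE_URL = "http://base"  # placeholder base URL, as used in the examples above


def identifier_token(system: str, value: str) -> str:
    """Format a single FHIR token search value as system|value."""
    if "," in system or "," in value:
        # A comma would be interpreted as an OR list of several identifiers.
        raise ValueError("only a single identifier may be placed in the URL")
    # Keep ':', '/' and '|' readable so the result matches the examples above.
    return quote(f"{system}|{value}", safe=":/|")


def search_by_identifier(resource_type: str, system: str, value: str) -> str:
    """Search an ancillary resource or a Patient resource by its own identifier."""
    return f"{BASE_URL}/{resource_type}?identifier={identifier_token(system, value)}"


def search_compartment(resource_type: str, system: str, value: str,
                       include: Optional[str] = None) -> str:
    """Search Patient compartment resources by the owning patient's identifier."""
    url = f"{BASE_URL}/{resource_type}?patient.identifier={identifier_token(system, value)}"
    if include is not None:
        url += f"&_include={quote(include, safe=':')}"
    return url


print(search_by_identifier("Practitioner", "http://pract", "0"))
print(search_by_identifier("Patient", "http://patient", "1"))
print(search_compartment("Encounter", "http://patient", "1", include="Encounter:subject"))
```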

14.6.4 Patient Resource ID-Identifier Mapping

 

In large infrastructures involving storage of Patient data from multiple sources, data is often read and written using a Patient identifier as the search key, as opposed to the Patient ID. For example, a common pattern is to perform a conditional update operation on a Patient resource using a conditional URL such as Patient?identifier=http://example.org|123.

In Patient ID Partition Mode, the server needs to resolve this search to determine the Patient resource ID, which is then used to determine the actual shard and partition to access. This creates a circular dependency since the server needs to know which partition to search, but needs to search to determine the partition.

To avoid this circular dependency, MegaScale leverages an identifier mapping table located on the default partition. When performing a conditional operation on a patient identifier, the server performs an initial lookup on the default partition to check whether the given identifier is known. If it is, the server uses it to select a partition for the rest of the transaction. This lookup is cached to avoid unnecessary database queries if repeated operations access the same identifier.
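
The lookup flow described above can be sketched conceptually in Python. This is not the server's implementation: the class, the in-memory dictionary standing in for the mapping table, the partition count, and the CRC-based partition function are all assumptions made for illustration. Only the overall shape (check the mapping first, cache known identifiers, then pick a partition from the resolved Patient ID) comes from the text.

```python
import zlib
from typing import Dict, Optional, Tuple

IdentifierKey = Tuple[str, str]  # (system, value)


class IdentifierResolver:
    """Conceptual model of the identifier lookup on the default partition."""

    def __init__(self, mapping_table: Dict[IdentifierKey, str], partition_count: int):
        # Hypothetical stand-in for the mapping table on the default partition.
        self.mapping_table = mapping_table
        self.partition_count = partition_count
        self.cache: Dict[IdentifierKey, str] = {}
        self.table_lookups = 0  # counts lookups that were not served from the cache

    def resolve_patient_id(self, system: str, value: str) -> Optional[str]:
        key = (system, value)
        if key in self.cache:
            return self.cache[key]
        self.table_lookups += 1
        patient_id = self.mapping_table.get(key)
        if patient_id is not None:
            # Only known identifiers are cached in this sketch.
            self.cache[key] = patient_id
        return patient_id

    def select_partition(self, patient_id: str) -> int:
        # Placeholder partition function; the real selection logic is not shown here.
        return zlib.crc32(patient_id.encode("utf-8")) % self.partition_count


resolver = IdentifierResolver({("http://example.org", "123"): "Patient/37510001"},
                              partition_count=8)
patient_id = resolver.resolve_patient_id("http://example.org", "123")
if patient_id is not None:
    print(patient_id, "-> partition", resolver.select_partition(patient_id))
```

Repeating the same call resolves the identifier from the cache without touching the mapping table again.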

14.6.4.1 Configuring Pre-Assigned Patient Identifier Systems

Identifier systems that will be used for conditional operations on Patient resources must be declared in the FHIR Storage module configuration, using the Patient Identifier Systems for Pre-Assignment setting. Any identifier systems that have not been pre-declared in configuration will not be available for use in conditional operations, and trying to use them will result in an error.

This setting accepts multiple identifier systems, each separated by whitespace (space or newline). Values can be a fixed value, e.g. http://example.org/practitioner. Values can also be specified as a regular expression by adding a prefix of ^ and a suffix of $, e.g. ^http://example.org/practitioner/[0-9]+$.
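
To make the matching rules concrete, here is a minimal Python sketch of how such a setting value could be parsed and applied. The function names are hypothetical, and Python's re module is used only for illustration; the regular expression dialect the server actually uses may differ.

```python
import re
from typing import List, Pattern, Set, Tuple


def parse_identifier_systems(setting: str) -> Tuple[Set[str], List[Pattern[str]]]:
    """Split a whitespace-separated setting into fixed values and regular expressions."""
    fixed: Set[str] = set()
    patterns: List[Pattern[str]] = []
    for entry in setting.split():  # splits on spaces and newlines
        if entry.startswith("^") and entry.endswith("$"):
            patterns.append(re.compile(entry))
        else:
            fixed.add(entry)
    return fixed, patterns


def is_pre_assigned(system: str, fixed: Set[str], patterns: List[Pattern[str]]) -> bool:
    """Return True if the system matches a fixed value or any declared pattern."""
    return system in fixed or any(p.fullmatch(system) for p in patterns)


setting = """http://example.org/practitioner
^http://example.org/practitioner/[0-9]+$"""
fixed, patterns = parse_identifier_systems(setting)
print(is_pre_assigned("http://example.org/practitioner/42", fixed, patterns))
print(is_pre_assigned("http://example.org/other", fixed, patterns))
```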

Values should not be added to this list if they have already been used in stored data in the repository. Values may be added to the list at any time, however, as long as this is done before adding any data using the new identifier system.

Pre-assignment creates a permanent 1:1 mapping between the identifier and the resource ID assigned to this identifier. This has several important consequences:

  • Any resource with an identifier that has a Pre-Assigned Patient Identifier System can never have that identifier removed or changed. Other identifiers may be added and removed as long as they do not also have a Pre-Assigned Patient Identifier System. Any attempt to remove or change the identifier with the Pre-Assigned Patient Identifier System will result in an error.
  • No resource may have multiple identifiers with system values which are matched by the Pre-Assigned Patient Identifier Systems list.
  • All identifiers with system values which are matched by the Pre-Assigned Patient Identifier Systems list have uniqueness enforced automatically, meaning that no two resources may have the same identifier with the same Pre-Assigned Patient Identifier System and value.
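
The three consequences above can be read as validation rules. The following Python sketch checks them against plain dictionaries; it is a conceptual model only, and the function names, the matches callback, and the owners index are hypothetical. Whether a pre-assigned identifier may be added to an existing resource that has none is not covered by the rules above, so the sketch does not check it.

```python
from typing import Callable, Dict, List, Optional, Tuple

Identifier = Tuple[str, str]  # (system, value)


def pre_assigned_identifiers(resource: dict, matches: Callable[[str], bool]) -> List[Identifier]:
    """Collect the identifiers whose system is in the pre-assigned list."""
    return [(i.get("system", ""), i.get("value", ""))
            for i in resource.get("identifier", [])
            if matches(i.get("system", ""))]


def validate_write(new: dict, old: Optional[dict], matches: Callable[[str], bool],
                   owners: Dict[Identifier, str], resource_id: str) -> None:
    """Raise ValueError if writing `new` would break one of the three rules."""
    new_ids = pre_assigned_identifiers(new, matches)
    # Rule 2: at most one identifier matched by the pre-assigned list.
    if len(new_ids) > 1:
        raise ValueError("a resource may carry only one pre-assigned identifier")
    # Rule 1: an existing pre-assigned identifier can never be removed or changed.
    if old is not None:
        old_ids = pre_assigned_identifiers(old, matches)
        if old_ids and old_ids != new_ids:
            raise ValueError("a pre-assigned identifier cannot be removed or changed")
    # Rule 3: the same system and value may not appear on two different resources.
    for ident in new_ids:
        owner = owners.get(ident)
        if owner is not None and owner != resource_id:
            raise ValueError("pre-assigned identifier already belongs to another resource")


def matches(system: str) -> bool:
    return system == "http://patient"  # hypothetical pre-assigned list


old = {"identifier": [{"system": "http://patient", "value": "1"},
                      {"system": "http://other", "value": "a"}]}
new = {"identifier": [{"system": "http://patient", "value": "1"}]}
# Removing the non-pre-assigned identifier is allowed, so this does not raise.
validate_write(new, old, matches, {("http://patient", "1"): "Patient/1"}, "Patient/1")
```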

14.6.5 Loading Data

 

When loading data into a MegaScale repository in these modes, it is recommended to always use a FHIR Transaction and to group resources together by Patient as much as possible. In other words, if you are loading Patient resources as well as multiple Observation resources for each Patient, your data will load much faster if you put as many Observation resources referring to the same Patient as possible (along with other resources referring to that same Patient) in the same transaction Bundle.

MegaScale will split the transaction Bundle into multiple transactions (one for each Shard) and will load these sub-transactions one after another. This means that it is possible for the overall transaction to fail if a later sub-transaction fails after an earlier sub-transaction has succeeded. If you want to avoid this possibility entirely, ensure that any FHIR Transaction Bundles contain only resources that belong to a single Patient, or resources that are not in any Patient compartment (such as Ancillary Resources).

A good compromise is to include resources in a single Patient compartment as well as any ancillary resources referenced by these Patient resources in a single Transaction Bundle. All resources should be included in the Bundle as either a Conditional Create, a Conditional Update or a plain Update so that the transaction can be retried if it fails without creating duplicate resources.
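
One way to prepare data along these lines is to group transaction entries by Patient before submitting them. The Python sketch below is a simplified illustration, not a product feature: the helper names are hypothetical, and compartment membership is approximated by looking only at the subject and patient elements, whereas real Patient compartment membership covers more reference paths. Ancillary entries are copied into every Bundle whose resources reference them, which is only safe because conditional and plain update entries can be repeated without creating duplicates. Ancillary entries that nothing references are left out of this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Optional


def references(node) -> Iterator[str]:
    """Yield every 'reference' string found anywhere inside a resource."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "reference" and isinstance(value, str):
                yield value
            else:
                yield from references(value)
    elif isinstance(node, list):
        for item in node:
            yield from references(item)


def patient_key(entry: dict) -> Optional[str]:
    """Return the fullUrl of the owning Patient, or None for ancillary resources."""
    resource = entry["resource"]
    if resource["resourceType"] == "Patient":
        return entry["fullUrl"]
    for field in ("subject", "patient"):  # simplification, see lead-in
        ref = resource.get(field, {}).get("reference")
        if ref:
            return ref
    return None


def group_by_patient(entries: List[dict]) -> List[dict]:
    """Build one transaction Bundle per Patient, plus the ancillary entries it references."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    ancillary: List[dict] = []
    for entry in entries:
        key = patient_key(entry)
        if key is None:
            ancillary.append(entry)
        else:
            groups[key].append(entry)
    bundles = []
    for members in groups.values():
        referenced = {ref for m in members for ref in references(m["resource"])}
        shared = [a for a in ancillary if a["fullUrl"] in referenced]
        bundles.append({"resourceType": "Bundle", "type": "transaction",
                        "entry": shared + members})
    return bundles
```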

14.6.5.1 Conditionally Creating Patient by Identifier

Patient resources and other resources can be created using a Conditional Create, as shown in the example below. Other resources belonging to the same Patient compartment should reference the Patient using the Patient entry's fullUrl, which must contain a Placeholder ID. For this example to work, http://patient must be declared as a Pre-Assigned Patient Identifier System.

{
  "resourceType": "Bundle",
  "type": "transaction",
  "entry": [ {
    "fullUrl": "urn:uuid:c4592eed-14b7-4a19-9ec0-bff03965d489",
    "resource": {
      "resourceType": "Patient",
      "identifier": [ {
        "system": "http://patient",
        "value": "1"
      } ]
    },
    "request": {
      "method": "POST",
      "url": "Patient",
      "ifNoneExist": "Patient?identifier=http://patient|1"
    }
  }, {
    "fullUrl": "urn:uuid:958c64e5-83c1-4261-8174-b0cc210dddd4",
    "resource": {
      "resourceType": "Encounter",
      "identifier": [ {
        "system": "http://encounter",
        "value": "1"
      } ],
      "subject": {
        "reference": "urn:uuid:c4592eed-14b7-4a19-9ec0-bff03965d489"
      }
    },
    "request": {
      "method": "POST",
      "url": "Encounter",
      "ifNoneExist": "Encounter?identifier=http://encounter|1"
    }
  } ]
}
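
A Bundle of this shape can also be assembled programmatically. The Python sketch below uses a hypothetical helper, conditional_create_entry, to generate a fresh placeholder ID for each entry and derive the ifNoneExist URL from the identifier. It only builds the JSON and does not send it to a server.

```python
import json
import uuid


def conditional_create_entry(resource: dict, system: str, value: str) -> dict:
    """Build a transaction entry that creates the resource only if its identifier is new."""
    resource = dict(resource)  # shallow copy so the caller's dict is not modified
    resource["identifier"] = [{"system": system, "value": value}]
    resource_type = resource["resourceType"]
    return {
        "fullUrl": f"urn:uuid:{uuid.uuid4()}",  # placeholder ID
        "resource": resource,
        "request": {
            "method": "POST",
            "url": resource_type,
            "ifNoneExist": f"{resource_type}?identifier={system}|{value}",
        },
    }


patient = conditional_create_entry({"resourceType": "Patient"}, "http://patient", "1")
encounter = conditional_create_entry(
    # The Encounter points at the Patient through the Patient entry's placeholder fullUrl.
    {"resourceType": "Encounter", "subject": {"reference": patient["fullUrl"]}},
    "http://encounter", "1",
)
bundle = {"resourceType": "Bundle", "type": "transaction", "entry": [patient, encounter]}
print(json.dumps(bundle, indent=2))
```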

14.6.5.2 Conditionally Updating Patient by Identifier

Patient resources and other resources can be created or updated using a Conditional Update, as shown in the example below. Other resources belonging to the same Patient compartment should reference the Patient using the Patient entry's fullUrl, which must contain a Placeholder ID. For this example to work, http://patient must be declared as a Pre-Assigned Patient Identifier System.

This example also demonstrates a Conditional Update on an Ancillary Resource (the Practitioner), which will be stored in the default partition but may be referenced by resources in other partitions as long as Cross-Partition Reference Mode is set to ALLOWED_UNQUALIFIED.

{
	"resourceType": "Bundle",
	"type": "transaction",
	"entry": [ {
		"fullUrl": "urn:uuid:71f0cbca-7d53-4ca3-a685-f23b0f455256",
		"resource": {
			"resourceType": "Practitioner",
			"identifier": [ {
				"system": "http://practitioner",
				"value": "1"
			} ]
		},
		"request": {
			"method": "PUT",
			"url": "Practitioner?identifier=http://practitioner|1"
		}
	}, {
		"fullUrl": "urn:uuid:5cdc41d7-f1b4-408e-8719-789797c080eb",
		"resource": {
			"resourceType": "Patient",
			"identifier": [ {
				"system": "http://patient",
				"value": "1"
			} ],
			"generalPractitioner": [ {
				"reference": "urn:uuid:71f0cbca-7d53-4ca3-a685-f23b0f455256"
			} ]
		},
		"request": {
			"method": "PUT",
			"url": "Patient?identifier=http://patient|1"
		}
	}, {
		"fullUrl": "urn:uuid:16b16967-bf3a-4e58-82ed-c3c5d8a0605b",
		"resource": {
			"resourceType": "Encounter",
			"identifier": [ {
				"system": "http://encounter",
				"value": "1"
			} ],
			"subject": {
				"reference": "urn:uuid:5cdc41d7-f1b4-408e-8719-789797c080eb"
			},
			"participant": [ {
				"individual": {
					"reference": "urn:uuid:71f0cbca-7d53-4ca3-a685-f23b0f455256"
				}
			} ]
		},
		"request": {
			"method": "PUT",
			"url": "Encounter?identifier=http://encounter|1"
		}
	} ]
}