Partitioning allows every resource on the server to be placed in a partition, which is essentially just an arbitrary identifier grouping a set of resources together.
Partitioning is designed to be flexible, and can be used to achieve different outcomes. For example:
Partitioning could be used to achieve multitenancy, where there are multiple logically separate pools of resources on the server. Traditionally this kind of setup is desired when each of these pools belongs to a distinct user group / organization / customer / etc. (a "tenant"), and each of these tenants should not be able to access or modify data belonging to another tenant.
Partitioning could also be used to logically separate data coming from distinct sources within an organization. For example, patient records might be placed in one partition, lab data sourced from a lab system might be placed in a second partition and patient surveys from a survey app might be placed in another. In this situation data does not need to be completely segregated (lab Observation records may have references to Patient records in the patient partition) but these partitions might be used to support security groups, retention policies, etc.
Partitioning could be used for geographic sharding, keeping data in a partition that is geographically closest to where it is likely to be used.
Partitioning can be used for scalability, as a mechanism to take advantage of native partitioning/sharding capabilities in the underlying database platform.
These examples each have different properties in terms of security rules, and how data is organized and searched.
See the HAPI FHIR Partitioning Documentation for a general overview of the concepts related to partitioning.
Storage Module Type | Supported | Additional Documentation |
---|---|---|
FHIR Storage (RDBMS) | ✔ | MegaScale |
FHIR Storage (MongoDB) | ✔ | Sharding / Partitioning on MongoDB |
Every partition in Smile CDR is assigned a unique ID, which is an integer in the range of MIN_INT to MAX_INT.
In some partitioning schemes, it is also helpful to think of resources in terms of Patient Resources and Ancillary Resources. Because many health system architectures are focused on patient-oriented queries, it is often helpful to partition resources based on the Patient resource that they are associated with. This means that you might choose to host all Encounters, Observations, MedicationPrescriptions, etc. that have the same subject/patient reference in the same partition as the Patient resource they all refer to. On this page, we call these resources Patient Resources.
In this kind of scheme, there are also shared resources such as Location, Practitioner, Organization, etc. that are not associated with a specific Patient resource and will generally be referred to by many different Patient resources. These resources are called Ancillary Resources.
If you are implementing a partitioning scheme purely for multitenancy, the distinction between Patient Resources and Ancillary Resources may not be important. If you are implementing a partitioning scheme for scalability, you may need to consider the impact of Patient Resources and Ancillary Resources residing in different partitions.
One ID is reserved as a special partition called the Default Partition. This partition holds all resources of types that are considered non-partitionable by Smile CDR.
For example, it is not possible to partition SearchParameter resources. This means that no matter how you have chosen to implement partitioning, you will only have one collection of SearchParameter resources, and they will always be stored in the Default Partition. See Non-Partitionable Resources for a list of these resource types.
The ID associated with the Default Partition is controlled by setting the Default Partition ID property.
To enable partitioning, the Partitioning Enabled property on the FHIR Storage (RDBMS) module must be enabled.
Once partitioning is enabled, you will have two new concerns in your server:
Request Partition Selection: Every incoming FHIR request (e.g. a FHIR read, create, transaction, etc.) will now need to identify the partition ID for the given request. For a FHIR create this means identifying the partition ID that will be stored with the resource. For a FHIR read this means limiting the read to only selecting resources with the given partition ID.
Request Partition Security: When a partition ID is selected, the requesting user must also have appropriate access rights (permissions) to be able to access the given partition. The appropriate setting for your use cases will depend on the Request Partition Selection Mode, and on your overall security model.
If you are building a new deployment, you may also consider enabling Database Partition Mode.
When using partitioning, the system needs a way of determining which partition is being accessed during every read or write operation. Some modes rely on automatic identification of the partition ID, while others rely on explicit request properties.
Name | Partition Selection Mode | Description | Example Use Cases |
---|---|---|---|
Manual | MANUAL | Smile CDR will not attempt to automatically determine the partition ID for requests, but will rather rely on a customer-supplied Interceptor. The interceptor should include hooks to Identify partitions for Create and Read as described in the HAPI FHIR Partition Interceptors Documentation. | Any |
Request Tenant | REQUEST_TENANT | Smile CDR will allow the FHIR Endpoint module to determine the request partition based on the Request Tenant ID. This means that a Tenant Identification Strategy must also be configured on the FHIR Endpoint module. See Request Tenant Partition Selection Mode below for more information. | Multitenancy / Data Segmentation |
Request Header | REQUEST_HEADER | Smile CDR will determine the request partition based on a custom HTTP request header named `X-Request-Partition-IDs`. See Request Header Partition Selection Mode below for more information. | Multitenancy / Data Segmentation |
Patient ID | PATIENT_ID | Smile CDR will automatically partition based on the ID of the relevant Patient Compartment. See Patient ID Partition Mode below for more information. See MegaScale Patient ID Selection Mode if you are using this mode with a MegaScale repository. | Scalability |
Bucketed Patient ID | BUCKETED_PATIENT_ID | Smile CDR will use a custom HTTP request header named `X-Request-Partition-IDs` to select a range of partition IDs, and will then automatically select a partition ID from that range based on the relevant Patient Compartment. See Patient ID Partition Mode below for more information. See MegaScale Patient ID Selection Mode if you are using this mode with a MegaScale repository. | Scalability with Data Segmentation or Multitenancy |
In Request Tenant Partition Selection Mode, the Partition ID is determined by using the Tenant ID that was supplied with the incoming HTTP request.
This mode uses several related but distinct concepts, and it is important to understand how they relate to each other.
`Patient/ABC` and Partition ID `123`. The `123` portion is invisible to the REST client, and is determined based on the Partition Selection Mode.
The following diagram shows the relationship between the Tenant ID, Partition ID, and Partition Name.
When the FHIR Endpoint Module Tenant Identification Strategy is set to `URL_BASED`, the request partition ID will be determined by an extra element in the request path.
For example, if a FHIR Endpoint module is listening on port 8000, in a non-partitioned server a request to search for all Patients named `smith` would use the following URL:
http://localhost:8000/Patient
In URL-Based Tenant Selection mode, the Partition Name must be added to the base URL for the server. This means that the example above can be applied to a partition named `TENANT-A` by using the following URL:
http://localhost:8000/TENANT-A/Patient
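The URL construction described above can be sketched in a few lines of Python. This is only an illustration: the helper name `tenant_url` is hypothetical and not part of Smile CDR, and the `name` search parameter is used purely as an example of a query string.

```python
from urllib.parse import quote, urlencode


def tenant_url(base_url, partition_name, resource_path, params=None):
    """Build a request URL with the partition name as the first path element.

    Hypothetical helper for illustration; not a Smile CDR API.
    """
    base = base_url.rstrip("/")
    # Percent-encode the partition name so it stays a single path segment.
    segment = quote(partition_name, safe="")
    url = f"{base}/{segment}/{resource_path.lstrip('/')}"
    if params:
        url += "?" + urlencode(params)
    return url


print(tenant_url("http://localhost:8000", "TENANT-A", "Patient"))
print(tenant_url("http://localhost:8000", "TENANT-A", "Patient", {"name": "smith"}))
```

Keeping the partition name in a single helper like this makes it harder for a client to accidentally issue a request without a tenant segment.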
An internal mapping between Partition Names (corresponding to the Tenant ID) and Partition IDs is maintained in a table within the FHIR Storage module database. These mappings must be created and maintained manually, using the Partition Management Operations.
In URL-Based Tenant Selection mode, server-level operations such as the Partition Management Operations should be performed against the `DEFAULT` partition, as shown below. For server-level operations, the partition name may also be omitted entirely from the request URL.
http://localhost:8000/DEFAULT/$partition-management-create-partition
For cases where a server-level request needs to be performed against all partitions, this can be made explicit by using `_ALL` as the tenant name. At the moment, this is only supported for the `$reindex` operation.
http://localhost:8000/_ALL/$reindex
Upon startup, the system will always create a single partition with the name `DEFAULT` and an ID corresponding to the configured Default Partition ID.
To create additional partitions, you can either use the Partition Management Operations or you can create a seed file.
The Partition Seed File property of the FHIR Storage (Relational) module can be used to automatically create one or more additional partitions when the system starts up.
The value of this property is a Resource Path that should point to a file containing PartitionDefinitions content in JSON format.
A sample partition file is found in the Smile CDR distribution at `classes/config_seeding/fhir-partitions.json`. This file can be activated by setting the Partition Seed File configuration to a value of `classpath:/config_seeding/fhir-partitions.json`.
In Patient ID Partition Mode, the partition ID is determined by the resource ID of the Patient resource associated with the request. A hash function is used to create a partition ID that will be consistently used for all resources belonging to a given patient.
Importantly, this does not mean that each Patient has a unique partition ID belonging only to that Patient. Instead, it means that for any given Patient and all of its associated data, the same partition ID will be used. Other Patients and their associated data might be on the same partition, and might be on other partitions.
This is helpful if you have a system that will be used to satisfy Patient-oriented queries and needs to scale to very large amounts of data, since it means that any query for an individual Patient and its associated data will only need to query one partition.
For example, suppose the ID `Patient/ABC` results in a hash value of `111`. This means that:
When creating data:
When retrieving data:
This mode imposes several important limitations. These technical limitations are caused by current constraints of the HAPI FHIR partitioning system and should be relaxed in a future release. Please get in touch if you have specific needs that are impacted by these limitations.
`Observation?identifier=http://foo|123` will not be permitted, but the search `Observation?subject=Patient/ABC&identifier=http://foo|123` will be.
In Patient ID Partition Mode, non-partitionable resources (e.g. StructureDefinition) and ancillary resources (e.g. Location) are always placed on the default partition.
On non-MegaScale repositories, Patient compartment resources (e.g. Patient, Encounter) are distributed evenly across the range of 0-14999, using a stable hash of the Patient resource ID. On MegaScale repositories this scheme is modified slightly, see Partition Distribution for details.
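To make the idea of a stable hash concrete, here is a minimal Python sketch that maps a Patient resource ID onto the 0-14999 range. The choice of SHA-256 and the function name `partition_for_patient` are assumptions made for illustration; the hash function actually used by the server is not specified on this page, so the values produced here will not match real partition assignments.

```python
import hashlib

# Standard partitions span 0-14999 on non-MegaScale repositories (see above).
PARTITION_COUNT = 15000


def partition_for_patient(patient_id: str) -> int:
    """Map a Patient resource ID to a partition ID in the range 0..14999.

    SHA-256 is an illustrative assumption, not the server's actual hash.
    """
    digest = hashlib.sha256(patient_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce to the range.
    return int.from_bytes(digest[:8], "big") % PARTITION_COUNT


# The same Patient ID always yields the same partition, so the Patient and
# all of its compartment resources end up together.
print(partition_for_patient("ABC") == partition_for_patient("ABC"))
```

Because the mapping depends only on the Patient ID, different Patients may share a partition, which matches the behaviour described earlier in this section.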
To use Patient ID Partition Mode:
`UNNAMED`.
`false`.
Because Partition IDs for standard partitions will always fall between 0 and 14999, it is recommended to use a Default Partition ID of `0`.
The following diagram shows an example of how Patient ID Mode partitions data. Note that the resource types shown are only examples, as there are other types of resources which must be placed in the default partition, and in the patient compartment partitions.
Please read the Patient ID Partition Mode section first if you have not already, as this mode builds on top of the semantics of Patient ID Partition Mode.
This mode creates multiple "buckets" of partition IDs, each containing a range of partition IDs. When a read or a write operation is being performed, the system will determine the bucket based on a custom HTTP request header named `X-Request-Partition-IDs`. Within the bucket, the specific partition ID is chosen based on the resource ID of the Patient resource associated with the request.
This mode is useful for systems that hold resources from multiple sources (e.g. different health systems, each with their own clinical systems), which is often the case for Community Information Exchange (CIE) systems.
When making requests to the FHIR repository in Bucketed Patient ID Partition Mode, the `X-Request-Partition-IDs` header is used to specify the range of partition IDs to be used.
To select a bucket, the header must have a value beginning with `_` followed by a number which is a multiple of 100. For example, the following header specifies that the request should target the bucket with a range of partition IDs from 400 to 499:
X-Request-Partition-IDs: _400
It is not currently possible to specify multiple buckets in a single request.
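The header format described above can be sketched as a small parser. The function names are hypothetical, and the way a partition is picked inside the bucket (`select_partition`) is an assumption for illustration only; the page states that the choice is based on the Patient resource ID but does not give the exact formula.

```python
import re

BUCKET_SIZE = 100  # each bucket covers 100 consecutive partition IDs


def parse_bucket_header(value: str) -> range:
    """Parse an X-Request-Partition-IDs value such as "_400" into a range."""
    match = re.fullmatch(r"_([0-9]+)", value.strip())
    if match is None:
        raise ValueError(f"expected '_' followed by a number, got {value!r}")
    start = int(match.group(1))
    if start % BUCKET_SIZE != 0:
        raise ValueError(f"bucket start must be a multiple of {BUCKET_SIZE}")
    return range(start, start + BUCKET_SIZE)


def select_partition(bucket: range, patient_hash: int) -> int:
    """Pick a partition inside the bucket (illustrative assumption only)."""
    return bucket.start + patient_hash % len(bucket)


bucket = parse_bucket_header("_400")
print(bucket.start, bucket[-1])  # first and last partition ID in the bucket
```

Rejecting values such as `_450` or `400` on the client side gives earlier feedback than waiting for the server to refuse the request.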
The following diagram shows an example of how Patient ID Mode partitions data. Note that the resource types shown are only examples, as there are other types of resources which must be placed in the default partition, and in the patient compartment partitions.
When the Partition Selection Mode is set to `REQUEST_HEADER`, Smile CDR will determine the request partition based on a custom HTTP header named `X-Request-Partition-IDs`.
The `X-Request-Partition-IDs` header value can contain:
A single partition ID (e.g. `123`)
Multiple comma-separated partition IDs (e.g. `1,2,3`)
Partition IDs together with the default partition (e.g. `1,2,DEFAULT`)
For read operations (searches, reads, history), all specified partition IDs will be used. For example:
GET /Patient
X-Request-Partition-IDs: 1,2,3
This request will retrieve Patient resources from partitions with IDs 1, 2, and 3.
To search across all partitions without specifying individual IDs:
GET /Patient
X-Request-Partition-IDs: _ALL
To search specific partitions plus the default partition:
GET /Patient
X-Request-Partition-IDs: 1,2,DEFAULT
For create operations, only the first partition ID from the header is used. Any subsequent IDs are ignored. For example, both of the following requests will create a Patient resource in partition 1:
POST /Patient
X-Request-Partition-IDs: 1
POST /Patient
X-Request-Partition-IDs: 1,2,3
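The read and create behaviour shown in the examples above can be modelled with a short Python sketch. The function names are hypothetical, and the exact validation rules are an assumption here; this sketch simply rejects any token that is not an integer, `DEFAULT`, or a lone `_ALL`.

```python
import re

_INTEGER = re.compile(r"-?[0-9]+")


def parse_partition_header(value):
    """Split an X-Request-Partition-IDs value into a list of partition tokens."""
    tokens = [token.strip() for token in value.split(",")]
    if tokens == ["_ALL"]:
        return ["_ALL"]
    parsed = []
    for token in tokens:
        if token == "DEFAULT":
            parsed.append("DEFAULT")
        elif _INTEGER.fullmatch(token):
            parsed.append(int(token))
        else:
            raise ValueError(f"invalid partition token: {token!r}")
    return parsed


def partitions_for(operation, header_value):
    """Return the partitions an operation would target.

    Reads use every listed partition; creates use only the first one.
    """
    ids = parse_partition_header(header_value)
    return ids[:1] if operation == "create" else ids


print(partitions_for("read", "1,2,3"))    # all listed partitions
print(partitions_for("create", "1,2,3"))  # only the first partition
```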
The system will validate the header syntax and return an error if:
The `X-Request-Partition-IDs` header is supported across multiple messaging platforms and integration points:
When using `ResourceOperationJsonMessage` objects in messaging scenarios (such as Channel Import or Camel routes), the system applies the following precedence rules for partition selection:
1. If `partitionId` is already set on the inbound `ResourceOperationJsonMessage`, it takes highest precedence.
2. Header-level partition: The `X-Request-Partition-IDs` header value is used only if no `partitionId` is set on the message.
In Request Header Partition Selection Mode, a transaction bundle can target multiple partitions within the same database by using a custom FHIR extension. The extension can be added to the `entry.request` element to specify the partition IDs to be used for that transaction bundle entry.
For example,
POST /transaction
X-Request-Partition-IDs: 1
{
  "resourceType": "Bundle",
  "type": "transaction",
  "entry": [
    {
      "resource": {
        "resourceType": "Observation"
      },
      "request": {
        "method": "POST",
        "url": "Observation"
      }
    },
    {
      "resource": {
        "resourceType": "Patient"
      },
      "request": {
        "method": "POST",
        "url": "Patient",
        "extension": [
          {
            "url": "http://hapifhir.io/fhir/ns/StructureDefinition/request-partition-ids",
            "valueString": "2"
          }
        ]
      }
    }
  ]
}
In this FHIR transaction request, the `X-Request-Partition-IDs` header specifies partition ID 1. The Observation resource will be created in the partition with ID 1, as it has no overriding extension. However, for the Patient resource, the `entry.request.extension` overrides the header, and the Patient resource will be created in the partition with ID 2.
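The override rule can be expressed as a small lookup over the bundle entries. This Python sketch is only a model of the behaviour described above, not the server's implementation; the function name `entry_partition` is hypothetical, while the extension URL is the one used in the example bundle.

```python
EXTENSION_URL = "http://hapifhir.io/fhir/ns/StructureDefinition/request-partition-ids"


def entry_partition(entry, header_value):
    """Return the partition value that applies to one transaction entry.

    The entry-level extension wins; otherwise the header value is used.
    """
    for extension in entry.get("request", {}).get("extension", []):
        if extension.get("url") == EXTENSION_URL:
            return extension["valueString"]
    return header_value


bundle = {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [
        {
            "resource": {"resourceType": "Observation"},
            "request": {"method": "POST", "url": "Observation"},
        },
        {
            "resource": {"resourceType": "Patient"},
            "request": {
                "method": "POST",
                "url": "Patient",
                "extension": [{"url": EXTENSION_URL, "valueString": "2"}],
            },
        },
    ],
}

for entry in bundle["entry"]:
    # Header value "1" mirrors the X-Request-Partition-IDs header in the example.
    print(entry["resource"]["resourceType"], entry_partition(entry, "1"))
```

Running this prints `Observation 1` followed by `Patient 2`, matching the outcome described for the example request.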
When partitioning is enabled, users must by default be explicitly granted permission to access each partition they need. This is helpful in cases where you want to enforce security rules on a per-partition basis, such as when partitioning is being used as a multitenancy solution. This security is controlled by the Partitioning Security Enabled setting, which defaults to `true`.
Access is granted by assigning any of the following user permissions to the user or to their session.
FHIR_ACCESS_PARTITION_ALL – This permission grants the user access to all partitions.
FHIR_ACCESS_PARTITION_NAME – This permission grants the user access to the given partition name(s). The argument to this permission is a comma separated list of partition names.
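As a rough model of how these two permissions combine, the following Python sketch checks whether a user may access a named partition. Representing permissions as `(name, argument)` tuples and the function name `can_access_partition` are assumptions made for this illustration; they do not reflect how Smile CDR stores or evaluates permissions internally.

```python
def can_access_partition(permissions, partition_name):
    """Return True if any granted permission covers the given partition.

    `permissions` is a list of (permission_name, argument) tuples, a
    hypothetical representation used only for this sketch.
    """
    for name, argument in permissions:
        if name == "FHIR_ACCESS_PARTITION_ALL":
            return True
        if name == "FHIR_ACCESS_PARTITION_NAME" and argument:
            # The argument is a comma separated list of partition names.
            allowed = {item.strip() for item in argument.split(",")}
            if partition_name in allowed:
                return True
    return False


granted = [("FHIR_ACCESS_PARTITION_NAME", "TENANT-A,TENANT-B")]
print(can_access_partition(granted, "TENANT-A"))
print(can_access_partition(granted, "TENANT-C"))
```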
In other cases, however, partitioning is used purely as a way of segmenting data for better scalability. For example, if Patient ID Partition Mode is being used, the specific partition ID will likely not have any useful meaning, since it represents only an arbitrary collection of patients and their data. In this case, it is recommended to disable the Partitioning Security Enabled setting.
The Cross-Partition Reference Mode setting can be used to allow references to be created between two resources that are in separate partitions.
By default this type of reference is forbidden.
In ALLOWED_UNQUALIFIED mode, all references are allowed.
Note that this functionality will likely be enhanced based on future requirements. Please get in touch if you would like to discuss more advanced partitioning and multitenancy strategies.
The list below shows all the available Smile CDR modules which may currently support Partitioning. If you need a module to support Partitioning please contact your Account Representative or Customer Success Manager.
Module | Supports Partitioning |
---|---|
Bulk Export | Yes |
Bulk Import | Yes |
CDA Export | Yes * |
CDA Import | Yes * |
Channel Import | Yes |
CQL | No |
CRUD & Delete Expunge | Yes |
Database: MongoDB | Yes |
Database: Relational | Yes |
DQM | No |
FHIR Endpoint | Yes |
ETL Import | Yes |
FHIR Gateway | Yes |
FHIR Web | Yes |
HL7 Listening | Yes |
HL7 Outbound | Yes |
JSON Admin (Bulk Import) | No |
LiveBundle | No |
MDM | Yes |
MegaScale | Yes |
Realtime Export | Yes |
SmileUtil | Yes |
Subscription | Yes |
\* Use/Apply CDA Template does not enforce partitioning.