Partitioning allows every resource on the server to be placed in a partition, which is essentially just an arbitrary identifier grouping a set of resources together.
Partitioning is designed to be flexible, and can be used to achieve different outcomes. For example:
Partitioning could be used to achieve multitenancy, where there are multiple logically separate pools of resources on the server. Traditionally this kind of setup is desired when each of these pools belongs to a distinct user group / organization / customer / etc. (a "tenant"), and each of these tenants should not be able to access or modify data belonging to another tenant.
Partitioning could also be used to logically separate data coming from distinct sources within an organization. For example, patient records might be placed in one partition, lab data sourced from a lab system in a second partition, and patient surveys from a survey app in a third. In this situation data does not need to be completely segregated (lab Observation records may have references to Patient records in the patient partition), but these partitions might be used to support security groups, retention policies, etc.
Partitioning could be used for geographic sharding, keeping data in a partition that is geographically closest to where it is likely to be used.
Partitioning can be used for scalability, as a mechanism to take advantage of native partitioning/sharding capabilities in the underlying database platform.
These examples each have different properties in terms of security rules, and how data is organized and searched.
See the HAPI FHIR Partitioning Documentation for a general overview of the concepts related to partitioning.
Storage Module Type | Supported | Additional Documentation |
---|---|---|
FHIR Storage (RDBMS) | ✔ | |
FHIR Storage (MongoDB) | ✔ | Sharding / Partitioning on MongoDB |
To enable partitioning, the Partitioning Enabled property on the FHIR Storage (RDBMS) module must be enabled.
Once partitioning is enabled, you will have two new concerns in your server:
Request Partition Selection: Every incoming FHIR request (e.g. a FHIR read, create, transaction, etc.) will now need to identify the partition ID for the given request. For a FHIR create this means identifying the partition ID that will be stored with the resource. For a FHIR read this means limiting the read to only selecting resources with the given partition ID.
Request Partition Security: When a partition ID is selected, the requesting user must also have appropriate access rights (permissions) to be able to access the given partition.
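The two behaviours described under Request Partition Selection can be pictured with a small in-memory model. This is only a sketch: the class `PartitionedStoreSketch` and its methods are hypothetical and are not part of Smile CDR or HAPI FHIR. It shows that a create stores the selected partition ID with the resource, and that a read only sees resources carrying the partition ID selected for the request.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical in-memory model; it illustrates partition-scoped create and read only. */
class PartitionedStoreSketch {

    // Partition ID -> (resource ID -> resource body)
    private final Map<Integer, Map<String, String>> byPartition = new HashMap<>();

    /** Create: the partition ID selected for the request is stored with the resource. */
    void create(int requestPartitionId, String resourceId, String body) {
        byPartition.computeIfAbsent(requestPartitionId, k -> new HashMap<>())
                .put(resourceId, body);
    }

    /** Read: only resources stored under the request's partition ID are visible. */
    Optional<String> read(int requestPartitionId, String resourceId) {
        Map<String, String> partition =
                byPartition.getOrDefault(requestPartitionId, Collections.emptyMap());
        return Optional.ofNullable(partition.get(resourceId));
    }

    public static void main(String[] args) {
        PartitionedStoreSketch store = new PartitionedStoreSketch();
        store.create(1, "Patient/1234", "{\"resourceType\":\"Patient\"}");
        System.out.println(store.read(1, "Patient/1234").isPresent()); // same partition
        System.out.println(store.read(2, "Patient/1234").isPresent()); // different partition
    }
}
```

In this model a resource created in partition 1 is found by a read against partition 1 and is invisible to a read against partition 2.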
If you are building a new deployment, you may also consider enabling Database Partition Mode.
There are several strategies available for partition selection:
If the Partition Selection Mode is set to MANUAL, Smile CDR will not attempt to automatically determine the partition ID for requests, but will rather rely on a customer-supplied Interceptor. The interceptor should include hooks to identify partitions for Create and Read as described here.
If the Partition Selection Mode is set to REQUEST_TENANT, Smile CDR will allow the FHIR Endpoint module to determine the request partition based on the Request Tenant ID.
This means that a Tenant Identification Strategy must also be configured on the FHIR Endpoint module. See Tenant Identification Strategies below for more information.
If the Partition Selection Mode is set to PATIENT_ID, Smile CDR will partition based on the ID of the relevant Patient Compartment. See Patient ID Partition Mode below for more information.
In addition to providing a way for the server to determine which partition is being accessed, when partitioning is enabled, the requesting user session will also need to have appropriate permissions to access that partition.
This is done by assigning any of the following user permissions to the user or to their session.
FHIR_ACCESS_PARTITION_ALL – This permission grants the user access to all partitions.
FHIR_ACCESS_PARTITION_NAME – This permission grants the user access to the given partition name(s). The argument to this permission is a comma separated list of partition names.
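The way these two permissions combine can be sketched as a simple check. The class and method names below are hypothetical, and the handling of whitespace and letter case in the comma separated list is an assumption rather than documented Smile CDR behaviour. Permissions are modelled as a map from permission name to its argument.

```java
import java.util.Arrays;
import java.util.Map;

/** Hypothetical helper; it is not Smile CDR's actual authorization code. */
class PartitionPermissionSketch {

    /**
     * @param permissions   permission name -> argument (an empty string when there is no argument)
     * @param partitionName the partition selected for the request
     */
    static boolean canAccess(Map<String, String> permissions, String partitionName) {
        // FHIR_ACCESS_PARTITION_ALL grants access to every partition.
        if (permissions.containsKey("FHIR_ACCESS_PARTITION_ALL")) {
            return true;
        }
        // FHIR_ACCESS_PARTITION_NAME carries a comma separated list of partition names.
        String names = permissions.get("FHIR_ACCESS_PARTITION_NAME");
        if (names == null) {
            return false;
        }
        return Arrays.stream(names.split(","))
                .map(String::trim) // assumption: surrounding whitespace is ignored
                .anyMatch(partitionName::equals);
    }

    public static void main(String[] args) {
        Map<String, String> user = Map.of("FHIR_ACCESS_PARTITION_NAME", "TENANT-A,TENANT-B");
        System.out.println(canAccess(user, "TENANT-A"));
        System.out.println(canAccess(user, "TENANT-C"));
    }
}
```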
When the partition selection is based on explicit (generally client-supplied) request properties, the partition is also called a Tenant since this is generally the configuration used for a strict multitenant solution.
When the Tenant Identification Strategy is set to URL_BASED, the request partition ID will be determined by an extra element in the request path.
For example, if a FHIR Endpoint module is listening on port 8000, in a non-partitioned server a request to search for all Patients named smith would use the following URL:
http://localhost:8000/Patient?name=smith
In URL Based Tenant Selection mode, the Partition Name must be added to the base URL for the server. This means that the example above can be applied to a partition named TENANT-A by using the following URL:
http://localhost:8000/TENANT-A/Patient?name=smith
Note that the path element refers to the Partition Name and not the Partition ID.
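A client could build tenant-qualified URLs with a small helper along the following lines. This is a minimal sketch: `TenantUrlSketch` and `tenantUrl` are hypothetical names, and the base URL is the example endpoint used in this section.

```java
/** Hypothetical client-side helper for URL_BASED tenant selection. */
class TenantUrlSketch {

    /**
     * Inserts the Partition Name (not the Partition ID) between the server
     * base URL and the rest of the FHIR request.
     */
    static String tenantUrl(String baseUrl, String partitionName, String relativeRequest) {
        // Drop a trailing slash so the result never contains "//" after the host.
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return base + "/" + partitionName + "/" + relativeRequest;
    }

    public static void main(String[] args) {
        // prints http://localhost:8000/TENANT-A/Patient
        System.out.println(tenantUrl("http://localhost:8000", "TENANT-A", "Patient"));
    }
}
```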
In URL Based Tenant Selection mode, server level operations such as the Partition Management Operations should be performed against the DEFAULT partition, e.g.
http://localhost:8000/DEFAULT/$partition-management-create-partition
If the tenant is omitted from a server level operation request, then DEFAULT will be assumed. For example, the following URL would be considered equivalent to the preceding one when using URL_BASED tenant selection:
http://localhost:8000/$partition-management-create-partition
Only server level requests will assume the DEFAULT partition. All other requests must explicitly specify a partition. For example, the following request would fail in a multi-tenant configuration:
http://localhost:8000/Patient/1234
In order to succeed, the request would need to specify a partition:
http://localhost:8000/DEFAULT/Patient/1234
For cases where a server level request needs to be performed against all partitions, this can be made explicit by using _ALL as the tenant name. Currently this is only supported for the $reindex operation.
http://localhost:8000/_ALL/$reindex
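The URL_BASED rules in this section can be summarized as a small decision function. The sketch below is an illustration of those rules as described here, not Smile CDR's routing code: the class name, the method name and the `knownPartitionNames` parameter (a stand-in for the server's list of partitions) are all hypothetical, and query strings are not handled.

```java
import java.util.Set;

/** Illustrative only: mirrors the rules described in this section. */
class UrlTenantResolverSketch {

    /** Returns the tenant a request path targets, or throws if none can be determined. */
    static String resolveTenant(String requestPath, Set<String> knownPartitionNames) {
        String path = requestPath.startsWith("/") ? requestPath.substring(1) : requestPath;
        int slash = path.indexOf('/');
        String first = slash >= 0 ? path.substring(0, slash) : path;
        String rest = slash >= 0 ? path.substring(slash + 1) : "";

        // A server level operation with no tenant is assumed to target DEFAULT.
        if (first.startsWith("$")) {
            return "DEFAULT";
        }
        // _ALL explicitly targets all partitions, and only for $reindex.
        if (first.equals("_ALL")) {
            if (rest.startsWith("$reindex")) {
                return "_ALL";
            }
            throw new IllegalArgumentException("_ALL is only supported for $reindex");
        }
        // Every other request must name a partition as its first path element.
        if (knownPartitionNames.contains(first)) {
            return first;
        }
        throw new IllegalArgumentException("No partition specified in request: " + requestPath);
    }

    public static void main(String[] args) {
        Set<String> known = Set.of("DEFAULT", "TENANT-A");
        System.out.println(resolveTenant("/TENANT-A/Patient", known));
        System.out.println(resolveTenant("/$partition-management-create-partition", known));
    }
}
```

Under these assumptions `/Patient/1234` is rejected because `Patient` is not a partition name, while `/DEFAULT/Patient/1234` resolves to DEFAULT.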
Upon startup, the system will always create a single partition with an ID of 0 (zero) and a name of DEFAULT.
To create additional partitions, you can either use the Partition Management Operations or you can create a seed file.
The Partition Seed File property of the FHIR Storage (Relational) module can be used to automatically create one or more additional partitions when the system starts up.
The value for this property is a Resource Path that should point to a file containing PartitionDefinitions content in JSON format.
A sample partition file is found in the Smile CDR distribution at classes/config_seeding/fhir-partitions.json. This file can be activated by setting the Partition Seed File configuration to a value of classpath:/config_seeding/fhir-partitions.json.
If the Partition Selection Mode is set to PATIENT_ID, Patient ID Partition Mode will be activated.
In this mode, the partition ID is determined by the resource ID of the Patient resource associated with the request. A hash function is used to create a partition ID that will be consistently used for all resources belonging to a given patient. This is helpful if you have a system that will be used to satisfy Patient-oriented queries and needs to scale to very large amounts of data.
For example, suppose the ID Patient/ABC results in a hash value of 111. This means that:
When creating data:
When retrieving data:
This mode imposes several important limitations. These technical limitations are caused by current constraints of the HAPI FHIR partitioning system and should be relaxed in a future release. Please get in touch if you have specific needs that are impacted by these limitations.
For example, the search Observation?identifier=http://foo|123 will not be permitted, but the search Observation?subject=Patient/ABC&identifier=http://foo|123 will be.
Partition IDs are generated by hashing the ID part of the broader resource ID. For example, given the resource Patient/ABC, the string ABC is hashed using the following function. The hashCode function is the Java String hashCode function, which provides a stable and fast string hashing algorithm.
int partitionId = Math.abs("ABC".hashCode()) % 15000;
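To see the formula applied to a full resource ID, the following sketch wraps it in a helper. The class name `PatientIdPartitionSketch` and the method `partitionIdFor` are hypothetical, and the helper assumes an unversioned ID of the form Type/id. With this formula the string ABC maps to 4578; the value 111 in the earlier example is only illustrative.

```java
/** Hypothetical wrapper around the hashing formula shown above. */
class PatientIdPartitionSketch {

    // Modulus taken from the formula above.
    static final int PARTITION_MODULUS = 15000;

    /** Computes the partition ID for an ID such as "Patient/ABC" or "ABC". */
    static int partitionIdFor(String resourceId) {
        // Keep only the ID part: "Patient/ABC" -> "ABC".
        // Assumption: the ID is unversioned (no "/_history/..." suffix).
        String idPart = resourceId.substring(resourceId.lastIndexOf('/') + 1);
        return Math.abs(idPart.hashCode()) % PARTITION_MODULUS;
    }

    public static void main(String[] args) {
        // "ABC".hashCode() is 64578, and 64578 % 15000 is 4578.
        System.out.println(partitionIdFor("Patient/ABC")); // prints 4578
    }
}
```

Because String.hashCode is deterministic, the same Patient ID always yields the same partition ID, which is what allows all resources belonging to a patient to be placed together.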
To use Patient ID Partition Mode, the Partition Naming Mode setting must be set to UNNAMED.
The Cross-Partition Reference Mode setting can be used to allow references to be created between two resources that are in separate partitions.
By default this type of reference is forbidden.
In ALLOWED_UNQUALIFIED mode, all references are allowed. This setting is currently only useful if the server is in Manual Request Partition Selection Mode, as resources in different partitions will not be visible to each other in Request Tenant Selection Mode.
Note that this functionality will likely be enhanced based on future requirements. Please get in touch if you would like to discuss more advanced partitioning and multitenancy strategies.
If you are only using partitioning to achieve scalability, taking advantage of native partitioning/sharding capabilities in the underlying database platform, and you do not need to restrict access based on partitions, you can disable partition-based security by setting Partition Security Enabled to false. (By default this setting is true.)
The table below lists the available Smile CDR modules and whether each currently supports Partitioning. If you need a module to support Partitioning, please contact your Account Representative or Customer Success Manager.
Module | Supports Partitioning |
---|---|
Bulk Export | Yes |
Bulk Import | Yes |
CDA Export | Yes * |
CDA Import | Yes * |
Channel Import | Yes |
CQL | No |
CRUD & Delete Expunge | Yes |
Database: MongoDB | Yes |
Database: Relational | Yes |
DQM | No |
FHIR Endpoint | Yes |
ETL Import | Yes |
FHIR Gateway | Yes |
FHIR Web | Yes |
HL7 Listening | Yes |
HL7 Outbound | Yes |
JSON Admin (Bulk Import) | No |
LiveBundle | No |
MDM | Yes |
MegaScale | Yes |
Realtime Export | Yes |
SmileUtil | Yes |
Subscription | Yes |
* Use/Apply CDA Template does not enforce partitioning.