8.1.1 Sharding / Partitioning on MongoDB
Trial

MongoDB supports a concept called sharding, which achieves horizontal scaling by dividing a database into a collection of smaller logical databases called shards. Many read and write operations can then be performed more efficiently, because they access only the relevant shard server(s) instead of the entire database.

Smile CDR supports sharded databases using built-in partitioning.

In this documentation, the term partitioning is used in preference to sharding; the two terms refer to the same concept.

8.1.2 Scope and Limitations

Note the following points, which are specific to the implementation of partitioning in the FHIR Storage (MongoDB) module:

  • Server-level history must be disabled on a partitioned server, because it is not currently possible to maintain a global resource history efficiently across shards. This limitation may be removed in a future release. Instance-level history can be left enabled on partitioned servers.
  • Partitioning has only been tested with, and is only supported with, Patient ID Partition Mode. In other words, the only supported way to shard a MongoDB FHIR repository is by patient identity: all data belonging to a given patient is stored in the same shard, and all non-patient-specific data lives on one shard. As a result, patient-specific queries see the most benefit from a sharded environment.
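The routing idea behind Patient ID Partition Mode can be sketched as follows. This is an illustration only, not the Smile CDR implementation: the function name `select_partition`, the `DEFAULT` partition name, and the short list of reference elements are all assumptions made for this example. Real FHIR patient compartment rules cover many more resource types and elements.

```python
# Assumption: name of the single partition that holds non-patient-specific data.
DEFAULT_PARTITION = "DEFAULT"

# Assumption: elements checked for a patient reference in this sketch.
PATIENT_REFERENCE_ELEMENTS = ("subject", "patient")


def select_partition(resource: dict) -> str:
    """Return an illustrative partition name for a FHIR resource given as a dict."""
    if resource.get("resourceType") == "Patient":
        # A Patient resource is placed in the partition named after its own ID.
        return resource.get("id") or DEFAULT_PARTITION
    for element in PATIENT_REFERENCE_ELEMENTS:
        reference = (resource.get(element) or {}).get("reference") or ""
        if reference.startswith("Patient/"):
            # "Patient/A" yields the partition name "A".
            return reference.split("/", 1)[1]
    # No link to a patient: the resource goes to the shared partition.
    return DEFAULT_PARTITION


observation = {"resourceType": "Observation", "subject": {"reference": "Patient/A"}}
code_system = {"resourceType": "CodeSystem", "id": "cs1"}
print(select_partition(observation))  # A
print(select_partition(code_system))  # DEFAULT
```

Because every resource linked to the same patient resolves to the same partition name, a query scoped to one patient only needs the shard that holds that partition.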

8.1.3 Enabling Sharding

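To enable sharding on a FHIR Storage (MongoDB) module, you need a sharded MongoDB environment and the module settings described in the following subsections.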

8.1.3.0.1 Sharded MongoDB Environment

Setting up a MongoDB database cluster with sharding enabled is a non-trivial activity that is beyond the scope of this documentation. It is described in the MongoDB documentation.

For testing purposes only, a Docker Compose file which builds an entire local cluster can be found in this GitHub Repository: https://github.com/pkdone/sharded-mongodb-docker

8.1.3.0.2 Required Settings

The following snippet shows a Smile CDR configuration document containing the basic settings for a FHIR Storage (MongoDB) module with partitioning enabled.

module.persistence.type                                             =PERSISTENCE_MONGODB

# Set these appropriately
module.persistence.config.db.url                                    =mongodb://localhost:27017/cdr
module.persistence.config.db.username                               =cdr
module.persistence.config.db.password                               =cdr

# Enable only the resource types you need to support
module.persistence.config.resource_types.supported.whitelist        =Patient, Observation, CodeSystem

# Disable server-level history
module.persistence.config.history.server.enabled                    =false

# Enable partitioning in Patient ID mode
module.persistence.config.partitioning.enabled                      =true
module.persistence.config.partitioning.partition_selection_mode     =PATIENT_ID
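As a quick sanity check, the constraints above (server-level history disabled, Patient ID mode selected) can be verified against a properties file before deployment. The following is a minimal sketch, not part of Smile CDR: the helper names `parse_properties` and `check_partitioning` are hypothetical, and the sketch assumes that a missing `history.server.enabled` entry should be reported rather than treated as disabled.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_partitioning(props: dict, module: str = "persistence") -> list:
    """Return a list of problems found in the partitioning settings of a module."""
    prefix = f"module.{module}.config."
    problems = []
    if props.get(prefix + "partitioning.enabled", "false").lower() != "true":
        return problems  # Partitioning is off, so there is nothing to check.
    # Assumption: require an explicit "false" instead of relying on a default.
    if props.get(prefix + "history.server.enabled", "").lower() != "false":
        problems.append("history.server.enabled must be false")
    if props.get(prefix + "partitioning.partition_selection_mode") != "PATIENT_ID":
        problems.append("partitioning.partition_selection_mode must be PATIENT_ID")
    return problems


SAMPLE = """
# Disable server-level history
module.persistence.config.history.server.enabled                =false
module.persistence.config.partitioning.enabled                  =true
module.persistence.config.partitioning.partition_selection_mode =PATIENT_ID
"""
print(check_partitioning(parse_properties(SAMPLE)))  # []
```

An empty list means the sample satisfies both constraints; any other result lists the settings that need attention.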

8.1.4 Architecture

When partitioning is enabled, an additional element called _PartitionName is added to every document stored in a partitioned resource collection. This element is used as the shard key on these collections.

The following example shows a document with this element present:

{
    "_id": "63172d3a506fbb5d211ca92b",
    "resource": {
        "resourceType": "Observation",
        "id": "426c6336-047b-4611-a38a-58b18bfc3fcb",
        "meta": { "versionId": "1", "lastUpdated": "2022-09-06T07:21:30.804-04:00" },
        "identifier": [ { "system": "http://identifier", "value": "A1" } ],
        "subject": { "reference": "Patient/A" }
    },
    "_HashSha256": "d64a8d0b3e8a2f825f1f1208cf3b09295e25d6ba44bf7943249e45b627f3634f",
    "_UniqueKey": "identifier=http%3A%2F%2Fidentifier%7CA1",
    "_PartitionName": "A",
    "_lastUpdated-time": 1662463290804,
    "identifier": [ "http://identifier|A1", ",|A1" ],
    "patient": [ "Patient/A" ],
    "subject": [ "Patient/A" ]
}
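In a sharded MongoDB cluster, a query whose filter includes the shard key can be routed to the shard(s) holding that key value, whereas a query without it is broadcast to all shards. The sketch below shows what a filter that includes _PartitionName might look like for the document above. It is an illustration under assumptions, not a description of the queries Smile CDR generates: the function name `targeted_filter` is hypothetical, and the field names are taken from the example document.

```python
from typing import Optional


def targeted_filter(patient_id: str, extra: Optional[dict] = None) -> dict:
    """Build a MongoDB filter document that includes the _PartitionName shard key."""
    query = {
        # Shard key: lets the router target the shard for this partition.
        "_PartitionName": patient_id,
        # Equality on an array field matches documents containing this element.
        "patient": f"Patient/{patient_id}",
    }
    query.update(extra or {})
    return query


# Find the Observation from the example above by patient and identifier.
print(targeted_filter("A", {"identifier": "http://identifier|A1"}))
```

With a driver such as PyMongo, a filter like this could be passed to `find()` on the relevant collection; the collection name depends on the deployment and is not assumed here.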