Smile CDR v2024.05.PRE

22.0.1Master Data Management (MDM)
Experimental

 
This is a new feature initially introduced in Smile CDR 2020.08 and HAPI FHIR 5.1.0 as EMPI and now updated to support MDM. Feedback is welcome. Please get in touch to discuss if you are interested in implementing MDM during the early availability of this new feature.

Smile CDR MDM works for most HAPI FHIR resources, independently for each resource type. In the MDM documentation, we use "Patient" to illustrate MDM use cases.

MDM maintains links between a HAPI FHIR resource and its real-life representation, known as a Golden Record. MDM keeps track of when multiple resources refer to the same Golden Record. The Golden Record id plays the role of a unique identifier for that resource across all of the different source systems. For example, if the FHIR repository consolidates lab data, medication data, claims data, etc. from disparate systems that all maintain their own Patient records, all of those records will be tied together as belonging to the same Patient.

Most of the Smile CDR MDM details can be found in the HAPI FHIR MDM documentation:

22.0.1.1HAPI FHIR MDM Table of Contents

22.0.2Getting Started with Smile CDR MDM
Experimental

 

If you'd like to jump right in and start trying things out, a quick walkthrough is available in our Smile CDR MDM Quickstart Guide. Otherwise, if you'd like to understand in detail how MDM is configured within Smile CDR, read on.

22.0.3Enabling and Configuring MDM within Smile CDR
Experimental

 

To enable MDM on a Smile CDR FHIR repository, several modules are used together. The following diagram shows how these different modules relate to each other.

[Figure: MDM Components]

This diagram shows the following modules:

22.0.3.1Cluster Manager Module

The Cluster Manager Module contains the configuration used to connect to the selected message broker.

See Message Broker for information on how to select and configure a message broker. By default, an embedded Apache ActiveMQ server is used. This is acceptable for testing purposes, but an external broker should be used in a production scenario.

22.0.3.2FHIR Storage Module

MDM uses a Subscription Module and the configured message broker to process incoming resources asynchronously.
MDM uses the FHIR Storage Module specified by the Subscription Matching Module it depends on. MDM support is configured on the FHIR Storage Module via the mdm.enabled and subscription.message.enabled properties, both of which must be enabled.
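
As a small illustration of the "both properties must be enabled" requirement, the following Python sketch checks a configuration held as a plain dictionary. The function name and the dictionary representation are assumptions made for this example only; the actual property keys in a Smile CDR configuration file may carry a module-specific prefix.

```python
def mdm_storage_config_problems(config: dict) -> list:
    """Return the required FHIR Storage properties that are not set to true.

    `config` is a hypothetical dict of property name -> value, used here
    only to illustrate that both properties have to be enabled together.
    """
    required = ("mdm.enabled", "subscription.message.enabled")
    return [
        key for key in required
        if str(config.get(key, "false")).strip().lower() != "true"
    ]


# Only one of the two properties is enabled, so the other one is reported.
print(mdm_storage_config_problems({"mdm.enabled": "true"}))
```

An empty list means both properties are enabled.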

22.0.3.3Subscription Matching Module

A Subscription Matching module should be created with a module dependency on the chosen FHIR Storage module. When MDM starts up, it will create a subscription for each MDM type. These subscriptions have a "message" channel type and submit the incoming MDM resources on the "mdm" channel.

22.0.3.4MDM Module

The MDM Module subscribes to the "mdm" channel and processes the incoming MDM resources, creating MDM links according to the rules configured in this module. See HAPI FHIR MDM and MDM Rule Definition for details on how the MDM rules are configured.

The Message Broker configured in the Cluster Manager Module might be used extensively by the MDM module, especially if operations like $mdm-submit are used. Many FHIR resources could potentially be pushed to the "mdm" channel of the message broker, so a reliable external broker is recommended.

22.0.4MDM Search Expansion

 

Users of MDM often want to query across all the resources associated with all linked patients in a single query. This is called MDM Search Expansion. For example, a user may want to retrieve all the Observations of a given Patient along with the Observations of all matched patients. The following query uses the :mdm parameter modifier to expand a search across all linked patients.

GET [base]/Observation?patient:mdm=Patient/123

The above query will return all Observations for Patient/123 and all Observations for all linked patients.

MDM Expansion is also supported on the $everything operation via the _mdm query parameter. Below is an example:

GET [base]/Patient/123/$everything?_mdm=true
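
The two requests above can be assembled programmatically. The following Python sketch only builds the URLs and does not send them; the function names and the base URL in the usage lines are hypothetical and stand in for your own FHIR endpoint.

```python
from urllib.parse import quote


def mdm_search_url(base: str, resource_type: str, param: str, reference: str) -> str:
    """Build a search URL that applies the :mdm modifier to a reference parameter."""
    # Keep "/" unescaped so that a reference such as Patient/123 stays readable.
    return f"{base}/{resource_type}?{param}:mdm={quote(reference, safe='/')}"


def mdm_everything_url(base: str, patient_id: str) -> str:
    """Build a $everything URL with MDM expansion requested via _mdm=true."""
    return f"{base}/Patient/{quote(patient_id)}/$everything?_mdm=true"


# Hypothetical base URL, for illustration only.
base = "http://localhost:8000"
print(mdm_search_url(base, "Observation", "patient", "Patient/123"))
print(mdm_everything_url(base, "123"))
```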

To support expanded reference searches using the :mdm search parameter qualifier, you must enable the mdm.search_expansion.enabled property. If expanded reference searches are enabled and the user has the FHIR_AUTO_MDM permission, search parameters in the FHIR Patient Compartment will be expanded even when the :mdm qualifier is not explicitly provided. This will also automatically convert any $everything operation invocations to use MDM expansion.
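
The conditions in the paragraph above can be summarized as a small decision function. This Python sketch is a reading of the documented behaviour, not the server's implementation; the function and parameter names are made up for the example, and it simply reports "not expanded" when the property is disabled rather than modelling how the server responds to such a request.

```python
def is_search_expanded(
    expansion_enabled: bool,
    has_auto_mdm_permission: bool,
    uses_mdm_qualifier: bool,
    in_patient_compartment: bool,
) -> bool:
    """Decide whether a reference search parameter would be MDM-expanded."""
    if not expansion_enabled:
        # mdm.search_expansion.enabled is off: no expansion at all.
        return False
    if uses_mdm_qualifier:
        # The client asked for expansion explicitly with :mdm.
        return True
    # Automatic expansion: FHIR_AUTO_MDM permission plus a parameter
    # that belongs to the FHIR Patient Compartment.
    return has_auto_mdm_permission and in_patient_compartment


# A user with FHIR_AUTO_MDM searching on a Patient Compartment parameter
# gets expansion without writing :mdm.
print(is_search_expanded(True, True, False, True))
```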

22.0.5Troubleshooting
Experimental

 

The MDM Troubleshooting Log can be helpful in diagnosing issues relating to MDM processing.

22.0.6MDM User Interface (MDM UI)
Experimental

 

This is a preview feature introduced in the Smile CDR 2023.05 release. The MDM UI may require additional licensing in a future release.

See MDM UI for more information about this feature.

22.0.7MDM Scenarios
Experimental

 

Smile MDM is designed to be flexible enough to work in different kinds of enterprise environments. Below are some example MDM scenarios.

22.0.7.1Create-only EID mode and multiple EID mode

Some enterprises have a strict intake process that identifies all patients with a uniquely assigned Enterprise Identifier (EID) and all interactions identify that Patient via their EID.

Smile MDM has EID management configuration options to support this scenario. By default, the MDM system does not allow updates to an EID. This restriction can be disabled via a property. A similar property controls the ability for a resource to hold multiple EIDs simultaneously; this is disabled by default.

See the Using Enterprise Identifiers in MDM Rule Definition section for more details about using EIDs and how these options change the way incoming resources are processed by the MDM Module.

22.0.7.2Rule-based matching

Other enterprises, however, need to consolidate records from different systems where it is not known beforehand which records refer to the same Golden Record. To support this scenario, Smile MDM provides a rich set of MDM Matching Rules to algorithmically detect when two Patient records refer to the same person.
Based on the rule configuration, some patients will be identified as exact matches and automatically linked. Others may be flagged as possible matches, and the rules may also indicate that two Golden Records in the system are possible duplicates. Smile CDR provides an MDM User Interface to manually resolve these possible matches and duplicates. The MDM Rule Definition section provides details on how to create MDM rules for resource matching. See Sample MDM Rules below for an example of a rules file.
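
As a rough illustration of how rule-based matching turns field-level results into an overall outcome, the following Python sketch evaluates a map of rules against the set of match fields that succeeded for a pair of records. It is not the HAPI FHIR implementation: it assumes that a rule applies when every field it lists has matched, and that MATCH takes precedence over POSSIBLE_MATCH. The function name is hypothetical; the rule keys are taken from the sample rules file in this section.

```python
def classify(matched_fields, match_result_map: dict) -> str:
    """Return MATCH, POSSIBLE_MATCH or NO_MATCH for one pair of records."""
    matched = set(matched_fields)
    outcome = "NO_MATCH"
    for rule, result in match_result_map.items():
        required = {name.strip() for name in rule.split(",")}
        if required <= matched:  # every field named by the rule matched
            if result == "MATCH":
                return "MATCH"  # assumed to win over any weaker result
            outcome = result
    return outcome


match_result_map = {
    "given-name-double-metaphone,family-name,birthdate,gender,"
    "address-line,city,state,postal-code": "MATCH",
    "given-name-double-metaphone,family-name,birthdate,gender": "POSSIBLE_MATCH",
}

# Name, birth date and gender agree, but the address fields do not.
print(classify(
    ["given-name-double-metaphone", "family-name", "birthdate", "gender"],
    match_result_map,
))
```

With only the name, birth date and gender fields matching, the pair is reported as a possible match; adding the four address fields would promote it to a match.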

22.0.7.3Blocking MDM Matching

MDM can be configured to block certain resources from MDM matching entirely using a set of JSON rules.

For more information on block list rules, see the HAPI FHIR customizations section.

Additional examples can be found here.

22.0.7.4Analytics

Smile MDM can also be used to support business analytics, automatically linking batch data from different systems. In this scenario, the FHIR repository might be reset before each load, or it may link to external records maintained in a data warehouse.

22.0.7.5Sample MDM Rules

{
   "version": "1",
   "mdmTypes": ["Patient", "Practitioner"],
   "candidateSearchParams": [
      {
         "resourceType": "Patient",
         "searchParams": [
            "birthdate"
         ]
      },
      {
         "resourceType": "*",
         "searchParams": [
            "identifier"
         ]
      }
   ],
   "candidateFilterSearchParams": [
      {
         "resourceType": "*",
         "searchParam": "active",
         "fixedValue": "true"
      }
   ],
   "matchFields": [
      {
         "name": "family-name-double-metaphone",
         "resourceType": "*",
         "resourcePath": "name.family",
         "matcher": {
            "algorithm": "DOUBLE_METAPHONE"
         }
      },
      {
         "name": "given-name-double-metaphone",
         "resourceType": "*",
         "resourcePath": "name.given",
         "matcher": {
            "algorithm": "DOUBLE_METAPHONE"
         }
      },
      {
         "name": "given-name",
         "resourceType": "*",
         "resourcePath": "name.given",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "family-name",
         "resourceType": "*",
         "resourcePath": "name.family",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "birthdate",
         "resourceType": "*",
         "resourcePath": "birthDate",
         "matcher": {
            "algorithm": "DATE"
         }
      },
      {
         "name": "gender",
         "resourceType": "*",
         "resourcePath": "gender",
         "matcher": {
            "algorithm": "STRING",
            "exact": true
         }
      },
      {
         "name": "family-name-soundex",
         "resourceType": "*",
         "resourcePath": "name.family",
         "matcher": {
            "algorithm": "SOUNDEX"
         }
      },
      {
         "name": "given-name-soundex",
         "resourceType": "*",
         "resourcePath": "name.given",
         "matcher": {
            "algorithm": "SOUNDEX"
         }
      },
      {
         "name": "city",
         "resourceType": "*",
         "resourcePath": "address.city",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "address-line",
         "resourceType": "*",
         "resourcePath": "address.line",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "state",
         "resourceType": "*",
         "resourcePath": "address.state",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "postal-code",
         "resourceType": "*",
         "resourcePath": "address.postalCode",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "family-name-caverphone1",
         "resourceType": "*",
         "resourcePath": "name.family",
         "matcher": {
            "algorithm": "CAVERPHONE1"
         }
      },
      {
         "name": "family-name-caverphone2",
         "resourceType": "*",
         "resourcePath": "name.family",
         "matcher": {
            "algorithm": "CAVERPHONE2"
         }
      },
      {
         "name": "name-prefix",
         "resourceType": "*",
         "resourcePath": "name.prefix",
         "matcher": {
            "algorithm": "STRING"
         }
      },
      {
         "name": "family-name-normalize-substring",
         "resourceType": "*",
         "resourcePath": "name.family",
         "matcher": {
            "algorithm": "SUBSTRING"
         }
      },
      {
         "name": "given-name-normalize-substring",
         "resourceType": "*",
         "resourcePath": "name.given",
         "matcher": {
            "algorithm": "SUBSTRING"
         }
      }
   ],
   "matchResultMap": {
      "given-name-double-metaphone,family-name,birthdate,gender,address-line,city,state,postal-code": "MATCH",
      "given-name-double-metaphone,family-name,birthdate,gender": "POSSIBLE_MATCH"
   },
   "eidSystems": {
      "*": "http://hl7.org/fhir/sid/us-ssn"
   }
}
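
When editing a rules file like the one above, a common slip is to reference a field name in matchResultMap that is not defined in matchFields. The following Python sketch shows one way to check for that before loading the rules. The function name is hypothetical and the embedded rules are a trimmed-down example with a deliberate mistake, not a complete rules file.

```python
import json


def undefined_rule_fields(rules: dict) -> dict:
    """Map each matchResultMap key to the names it uses that matchFields lacks."""
    defined = {field["name"] for field in rules.get("matchFields", [])}
    problems = {}
    for rule in rules.get("matchResultMap", {}):
        missing = [
            name.strip() for name in rule.split(",")
            if name.strip() not in defined
        ]
        if missing:
            problems[rule] = missing
    return problems


# Trimmed-down rules with a deliberate mistake: "birthday" is not defined.
rules = json.loads("""
{
   "matchFields": [
      {"name": "family-name", "resourceType": "*", "resourcePath": "name.family"},
      {"name": "birthdate", "resourceType": "*", "resourcePath": "birthDate"}
   ],
   "matchResultMap": {
      "family-name,birthdate": "MATCH",
      "family-name,birthday": "POSSIBLE_MATCH"
   }
}
""")

print(undefined_rule_fields(rules))
```

An empty result means every name used in matchResultMap is defined in matchFields.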

22.0.7.6Multithreaded Performance Considerations

By default, MDM runs on a single thread in the background. If your MDM rules are defined by an EID and your message broker is Kafka, you can run MDM on multiple threads. To enable multiple consumers for MDM, perform the following steps:

  1. Manually set the partition count for the mdm Kafka topic to the number of threads you want doing MDM processing.
  2. In the MDM module config, set the consumer count to the same value as the partition count. For example, if your topic has 3 partitions, set the consumer count to 3.
  3. (Optional) Provide a Kafka partition key generator script using the MDM Partition Key Script Text or MDM Partition Key Script File properties.
  4. Restart the MDM module.

On boot, each consumer will consume from a single partition. Note that this only works if you define one of the following:

  • an eidSystem or eidSystems in your MDM rules, in which case the value of the eidSystem becomes the partition key (if multiple eidSystems are defined, the value of the first eidSystem becomes the partition key), or
  • an MDM Partition Key Script, as per step 3 above, in which case the script must provide the partition key for each resource.
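
To see why a stable partition key matters, the following Python sketch derives a key from a resource's EID and maps it to one of the topic's partitions, so that every resource carrying the same EID is handled by the same consumer. This is a simplified model under stated assumptions: it reads "the value of the eidSystem" as the identifier value the resource carries in that system, and it uses CRC32 as a stand-in hash, whereas Kafka's default partitioner uses murmur2. The function names are hypothetical.

```python
import zlib


def eid_partition_key(resource: dict, eid_system: str):
    """Return the identifier value in the EID system, or None if absent."""
    for identifier in resource.get("identifier", []):
        if identifier.get("system") == eid_system:
            return identifier.get("value")
    return None


def partition_for(key: str, partition_count: int) -> int:
    """Map a key to a partition index (stand-in hash, not Kafka's murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


eid_system = "http://hl7.org/fhir/sid/us-ssn"
patient_a = {
    "resourceType": "Patient",
    "identifier": [{"system": eid_system, "value": "000-00-0001"}],
}
patient_b = {
    "resourceType": "Patient",
    "identifier": [{"system": eid_system, "value": "000-00-0001"}],
}

# Two records with the same EID always land on the same partition.
key_a = eid_partition_key(patient_a, eid_system)
key_b = eid_partition_key(patient_b, eid_system)
print(partition_for(key_a, 3) == partition_for(key_b, 3))
```

Because the mapping depends only on the key and the partition count, changing the partition count later redistributes keys across partitions.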

22.0.7.7MDM and Partitioning

  • If both MDM and multitenancy are enabled, resources can be configured to match either against resources from all partitions or only against resources in the same partition, using the MDM_SEARCH_ALL_PARTITION_FOR_MATCH setting.
  • All golden resources will be stored on the same partition as the resource that triggered the creation of the golden resource, unless a partition is designated as the golden resource partition using the MDM_GOLDEN_RESOURCE_PARTITION config, in which case all golden resources will be stored on that partition.
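
The two rules above can be expressed as a pair of small functions. This Python sketch only models the documented behaviour; the function and parameter names are invented for the example, and partitions are represented as plain strings.

```python
def candidate_partitions(source_partition, all_partitions, search_all_partitions: bool) -> list:
    """Partitions searched for match candidates for an incoming resource."""
    # Models MDM_SEARCH_ALL_PARTITION_FOR_MATCH being on or off.
    return list(all_partitions) if search_all_partitions else [source_partition]


def golden_resource_partition(source_partition, configured_partition=None):
    """Partition on which a newly created golden resource is stored."""
    # Models MDM_GOLDEN_RESOURCE_PARTITION: when set, it always wins.
    if configured_partition is not None:
        return configured_partition
    return source_partition


partitions = ["tenant-a", "tenant-b", "golden"]
print(candidate_partitions("tenant-a", partitions, search_all_partitions=False))
print(golden_resource_partition("tenant-a"))
print(golden_resource_partition("tenant-a", configured_partition="golden"))
```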

22.0.7.8Fully Removing MDM

To fully remove MDM functionality from the system, first turn off the MDM module. You will then also need to manually set the status of all subscriptions to "Off".
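
The manual step above can be scripted against the FHIR API. The following Python sketch takes the searchset Bundle returned by a Subscription search and builds a transaction Bundle that sets each subscription's status to "off", the code defined by the FHIR Subscription resource. It does not send anything to the server; the function name is hypothetical, and which subscriptions to include is decided by the search that produced the input Bundle.

```python
import copy


def build_deactivation_bundle(search_bundle: dict) -> dict:
    """Build a transaction Bundle that switches Subscriptions to status "off"."""
    entries = []
    for entry in search_bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Subscription":
            continue
        if resource.get("status") == "off":
            continue  # already disabled, nothing to update
        updated = copy.deepcopy(resource)  # leave the input untouched
        updated["status"] = "off"
        entries.append({
            "resource": updated,
            "request": {"method": "PUT", "url": f"Subscription/{updated['id']}"},
        })
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}


# Example input shaped like the result of GET [base]/Subscription
found = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "Subscription", "id": "sub-1", "status": "active"}},
        {"resource": {"resourceType": "Subscription", "id": "sub-2", "status": "off"}},
    ],
}
print(build_deactivation_bundle(found))
```

The resulting Bundle can be posted to [base] as a transaction once the MDM module has been turned off.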