Smile CDR v2023.02.PRE

22.0 Master Data Management (MDM)
Trial

 
This feature was first introduced in Smile CDR 2020.08 and HAPI FHIR 5.1.0 as EMPI and has since been updated to support MDM. Feedback is welcome. Please get in touch if you are interested in implementing MDM during the early availability of this feature.

Smile CDR MDM works for most HAPI FHIR resources, independently for each resource type. In the MDM documentation, we use "Patient" to illustrate MDM use cases.

MDM maintains links between a HAPI FHIR resource and its real-life representation, known as a Golden Record. MDM keeps track of when multiple resources refer to the same Golden Record. The Golden Record id plays the role of a unique identifier for that resource across all of the different source systems. For example, if the FHIR repository consolidates lab data, medication data, claims data, etc. from disparate systems that all maintain their own Patient records, all of those records will be tied together as belonging to the same Patient.

Most of the Smile CDR MDM details can be found in the HAPI FHIR MDM documentation:

HAPI FHIR MDM Table of Contents

22.0.1 Getting Started with Smile CDR MDM
Trial

 

If you'd like to jump right in and start trying things out, a quick walkthrough is available in our Smile CDR MDM Quickstart Guide. Otherwise, if you'd like to understand in detail how MDM is configured within Smile CDR, read on.

22.0.2 Enabling and Configuring MDM within Smile CDR
Trial

 

To enable MDM on a Smile CDR FHIR repository, several modules are used together. The following diagram shows how these different modules relate to each other.

MDM Components

This diagram shows the following modules:

Cluster Manager Module

The Cluster Manager Module contains the configuration used to connect to the selected message broker.

See Message Broker for information on how to select and configure a message broker. By default, an embedded Apache ActiveMQ server is used. This is acceptable for testing purposes, but an external broker should be used in a production scenario.

FHIR Storage Module

The FHIR Storage Module must use a relational database (MDM on Mongo is not currently supported). MDM uses a Subscription Module and the configured message broker to process incoming resources asynchronously.
MDM uses the FHIR Storage Module specified by the Subscription Matching Module it depends on. MDM support is configured on the FHIR Storage Module via the mdm.enabled and subscription.message.enabled properties, both of which must be enabled.
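
As a rough illustration of the "both properties must be enabled" requirement, the following Python sketch checks a set of storage module settings. Only the two property names come from the text above; the dictionary layout and the helper name missing_mdm_flags are assumptions made for this example and are not part of any Smile CDR API.

```python
# Hypothetical helper: the dict-based config layout is an assumption made
# for illustration. Only the two property names come from the documentation.
REQUIRED_FLAGS = ("mdm.enabled", "subscription.message.enabled")


def missing_mdm_flags(storage_config):
    """Return the required flags that are absent or not set to true."""
    return [
        flag
        for flag in REQUIRED_FLAGS
        if str(storage_config.get(flag, "false")).strip().lower() != "true"
    ]


if __name__ == "__main__":
    config = {"mdm.enabled": "true", "subscription.message.enabled": "false"}
    # Lists every flag that still needs to be enabled before MDM can run.
    print(missing_mdm_flags(config))
```

An empty list means both prerequisites are satisfied for the FHIR Storage Module in question.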

MDM Search Expansion

Users of MDM often want to query across all the resources associated with all linked patients in a single query. This is called MDM Search Expansion. For example, a user may want to retrieve all the Observations of a given Patient together with the Observations of all matched Patients. The following query uses the :mdm parameter modifier to expand a search across all linked patients.

GET [base]/Observation?patient:mdm=Patient/123

The above query will return all Observations for Patient/123 and all Observations for all linked patients.

MDM Expansion is also supported on the $everything operation via the _mdm query parameter. Below is an example:

GET [base]/Patient/123/$everything?_mdm=true

To support expanded reference searches using the :mdm search parameter qualifier, you must enable the mdm.search_expansion.enabled property. If expanded reference searches are enabled and the user has the FHIR_AUTO_MDM permission, search parameters in the FHIR Patient Compartment will be expanded even when the :mdm qualifier is not explicitly provided. In this case, any $everything operation invocations are also automatically converted to use MDM expansion.
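
The two request forms shown above can be assembled programmatically. The following is a minimal Python sketch that only builds the URLs; the base URL http://localhost:8000 and both function names are assumptions for illustration, and no request is actually sent.

```python
from urllib.parse import quote


def mdm_search_url(base, resource_type, ref_param, target):
    """Build a search URL that applies the :mdm modifier to a reference parameter."""
    # "/" is kept readable in the reference value; other unsafe characters are escaped.
    return f"{base.rstrip('/')}/{resource_type}?{ref_param}:mdm={quote(target, safe='/')}"


def mdm_everything_url(base, patient_id):
    """Build a $everything URL that requests MDM expansion via _mdm=true."""
    return f"{base.rstrip('/')}/Patient/{quote(patient_id)}/$everything?_mdm=true"


if __name__ == "__main__":
    base = "http://localhost:8000"  # hypothetical FHIR endpoint
    print(mdm_search_url(base, "Observation", "patient", "Patient/123"))
    print(mdm_everything_url(base, "123"))
```

Either URL can then be issued with any HTTP client, using whatever authentication the FHIR endpoint requires.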

Subscription Matching Module

A Subscription Matching module should be created, with a module dependency on the chosen FHIR Storage module. When MDM starts up, it will create a subscription for each MDM type. These subscriptions have a "message" channel type and submit the incoming MDM resources on the "mdm" channel.
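
To make the shape of these subscriptions more concrete, the Python sketch below builds one plausible FHIR Subscription body per MDM type. Only the "message" channel type and the "mdm" channel name come from the text above. The criteria string, the status value, and the channel: endpoint prefix are assumptions, and the subscriptions that MDM actually creates at startup may differ.

```python
def mdm_subscription(resource_type, channel_name="mdm"):
    """Sketch of a message-channel Subscription for one MDM type."""
    return {
        "resourceType": "Subscription",
        "status": "active",  # assumption: subscription is active once created
        "criteria": f"{resource_type}?",  # assumption: match every resource of the type
        "channel": {
            "type": "message",  # from the text: "message" channel type
            "endpoint": f"channel:{channel_name}",  # assumption: endpoint naming form
        },
    }


if __name__ == "__main__":
    # One subscription per MDM type, e.g. the mdmTypes listed in a rules file.
    for mdm_type in ("Patient", "Practitioner"):
        print(mdm_subscription(mdm_type))
```

These subscriptions are created by MDM itself, so a sketch like this is useful mainly for recognizing them when inspecting the repository.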

MDM Module

The MDM Module subscribes to the "mdm" channel and processes the incoming MDM resources, creating MDM links according to the rules configured in this module. See HAPI FHIR MDM and MDM Rule Definition for details on how the MDM rules are configured.

This means that the message broker configured in the Cluster Manager Module may be used heavily by the MDM Module, especially if operations such as $mdm-submit are used. Many FHIR resources could be pushed to the "mdm" channel of the message broker, so a robust external broker is recommended in those cases.

22.0.3 Troubleshooting
Trial

 

The MDM Troubleshooting Log can be helpful in diagnosing issues relating to MDM processing.

22.0.4 MDM User Interface
Trial

 

The MDM User Interface is currently under development. See MDM Operations for a description of the back-end operations that support this user interface. Smile CDR also provides simplified REST versions of these operations through the JSON Admin API. The HAPI FHIR versions of these operations are FHIR-Compliant Operations and use FHIR Parameters resources for input and output, so the payloads can be quite heavy. The Smile CDR JSON payloads available through the JSON Admin API are lighter weight and may be easier for front-ends to work with.

The $mdm-clear operation is a batch job that can be managed on the Batch Jobs page of the Web Admin Console. The $mdm-submit operation can also be run as a batch job using the appropriate header.

22.0.5 MDM Scenarios
Trial

 

Smile MDM is designed to be flexible enough to work in different kinds of enterprise environments. Below are some example MDM scenarios.

Create-only EID mode and multiple EID mode

Some enterprises have a strict intake process that identifies all patients with a uniquely assigned Enterprise Identifier (EID) and all interactions identify that Patient via their EID.

Smile MDM has EID management configuration options to support this scenario. By default, the MDM system does not allow updates to an EID. This restriction can be lifted via a property. A similar property controls whether a resource can hold multiple EIDs simultaneously; this is disabled by default.

See Using Enterprise Identifiers in the MDM Rule Definition section for more details about using EIDs and how these options change the way incoming resources are processed by the MDM Module.

Rule-based matching

Other enterprises, however, need to consolidate records from different systems where it is not known beforehand which records refer to the same Golden Record. To support this scenario, Smile MDM provides a rich set of MDM Matching Rules to algorithmically detect when two Patient records refer to the same person.
Based on the rule configuration, some patients will be identified as exact matches and automatically linked. Others may be flagged as possible matches, and two Golden Records in the system may be flagged as possible duplicates. Smile CDR provides an MDM User Interface to manually resolve these possible matches and duplicates. The MDM Rule Definition section provides details on how to create MDM rules for resource matching. See Sample MDM Rules below for an example of a rules file.

Analytics

Smile MDM can also be used to support business analytics, automatically linking batch data from different systems. In this scenario, the FHIR repository might be reset before each load, or it may link to external records maintained in a data warehouse.

Sample MDM Rules

{
   "version": "1",
   "mdmTypes": ["Patient", "Practitioner"],
   "candidateSearchParams": [
      {
         "resourceType": "Patient",
         "searchParams": [
            "birthdate"
         ]
      },
      {
         "resourceType": "*",
         "searchParams": [
            "identifier"
         ]
      }
   ],
   "candidateFilterSearchParams": [
      {
         "resourceType": "*",
         "searchParam": "active",
         "fixedValue": "true"
      }
   ],
   "matchFields": [
      {
         "name": "family-name-double-metaphone",
         "resourceType": "*",
         "resourcePath": "name.family",
         "matcher": {
            "algorithm": "DOUBLE_METAPHONE"
         }
      },
      {
         "name": "given-name-double-metaphone",
         "resourceType": "*",
         "resourcePath": "name.given",
         "matcher": {
            "algorithm": "DOUBLE_METAPHONE"
         }
      },
      {
         "name": "given-name",
         "resourceType": "*",
         "resourcePath": "name.given",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "family-name",
         "resourceType": "*",
         "resourcePath": "name.family",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "birthdate",
         "resourceType": "*",
         "resourcePath": "birthDate",
         "matcher": {
            "algorithm": "DATE"
         }
      },
      {
         "name": "gender",
         "resourceType": "*",
         "resourcePath": "gender",
         "matcher": {
            "algorithm": "STRING",
            "exact": true
         }
      },
      {
         "name": "family-name-soundex",
         "resourceType": "*",
         "resourcePath": "name.family",
         "matcher": {
            "algorithm": "SOUNDEX"
         }
      },
      {
         "name": "given-name-soundex",
         "resourceType": "*",
         "resourcePath": "name.given",
         "matcher": {
            "algorithm": "SOUNDEX"
         }
      },
      {
         "name": "city",
         "resourceType": "*",
         "resourcePath": "address.city",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "address-line",
         "resourceType": "*",
         "resourcePath": "address.line",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "state",
         "resourceType": "*",
         "resourcePath": "address.state",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "postal-code",
         "resourceType": "*",
         "resourcePath": "address.postalCode",
         "similarity": {
            "algorithm": "LEVENSCHTEIN",
            "matchThreshold": 0.8
         }
      },
      {
         "name": "family-name-caverphone1",
         "resourceType": "*",
         "resourcePath": "name.family",
         "matcher": {
            "algorithm": "CAVERPHONE1"
         }
      },
      {
         "name": "family-name-caverphone2",
         "resourceType": "*",
         "resourcePath": "name.family",
         "matcher": {
            "algorithm": "CAVERPHONE2"
         }
      },
      {
         "name": "name-prefix",
         "resourceType": "*",
         "resourcePath": "name.prefix",
         "matcher": {
            "algorithm": "STRING"
         }
      },
      {
         "name": "family-name-normalize-substring",
         "resourceType": "*",
         "resourcePath": "name.family",
         "matcher": {
            "algorithm": "SUBSTRING"
         }
      },
      {
         "name": "given-name-normalize-substring",
         "resourceType": "*",
         "resourcePath": "name.given",
         "matcher": {
            "algorithm": "SUBSTRING"
         }
      }
   ],
   "matchResultMap": {
      "given-name-double-metaphone,family-name,birthdate,gender,address-line,city,state,postal-code": "MATCH",
      "given-name-double-metaphone,family-name,birthdate,gender": "POSSIBLE_MATCH"
   },
   "eidSystems": {
      "*": "http://hl7.org/fhir/sid/us-ssn"
   }
}
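
To illustrate how the matchResultMap in the sample above can be read, here is a simplified Python sketch. It assumes that a rule applies when every field named in its comma-separated key has matched, that MATCH takes precedence over POSSIBLE_MATCH, and that NO_MATCH is returned when no rule applies. The function name classify is hypothetical, and this is not the actual HAPI FHIR implementation.

```python
# The two rules are copied from the matchResultMap in the sample rules file.
MATCH_RESULT_MAP = {
    "given-name-double-metaphone,family-name,birthdate,gender,"
    "address-line,city,state,postal-code": "MATCH",
    "given-name-double-metaphone,family-name,birthdate,gender": "POSSIBLE_MATCH",
}


def classify(matched_fields, result_map=MATCH_RESULT_MAP):
    """Classify a candidate pair from the names of the match fields that matched."""
    matched = set(matched_fields)
    # Collect the outcome of every rule whose fields are all present.
    outcomes = {
        outcome
        for fields, outcome in result_map.items()
        if set(fields.split(",")) <= matched
    }
    if "MATCH" in outcomes:  # assumption: the strongest outcome wins
        return "MATCH"
    if "POSSIBLE_MATCH" in outcomes:
        return "POSSIBLE_MATCH"
    return "NO_MATCH"  # assumption: label used when no rule applies


if __name__ == "__main__":
    print(classify(["given-name-double-metaphone", "family-name", "birthdate", "gender"]))
```

Under these assumptions, a pair that agrees on name, birthdate, and gender but not on address is only a possible match, which is the kind of case that is resolved manually.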

Multithreaded Performance Considerations

By default, MDM runs on a single thread in the background. If your MDM rules define an EID and your message broker is Kafka, you can run MDM on multiple threads. To enable multiple consumers for MDM, perform the following steps:

  1. Manually set the partition count for the mdm Kafka topic to the number of threads you want to perform MDM processing.
  2. In the MDM module config, set the consumer count to the same value as the partition count. For example, if your topic has 3 partitions, set the consumer count to 3.
  3. (Optional) Provide a Kafka partition key generator script using the MDM Partition Key Script Text or MDM Partition Key Script File properties.
  4. Restart the MDM module.

On boot, each consumer will consume from a single partition. Note that this only works if one of the following is defined:

  • an eidSystem or eidSystems in your MDM rules, in which case the value of the eidSystem becomes the partition key (if multiple eidSystems are defined, the value of the first eidSystem becomes the partition key); or
  • an MDM Partition Key Script, as per step 3 above, in which case the script must provide the partition key for each resource.
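
The reason a partition key is needed can be sketched in Python. The example assumes that "the value of the eidSystem" means the resource's identifier value under the configured EID system. It uses CRC32 as a stand-in hash, whereas Kafka's default partitioner uses a different hash function (murmur2). All function names are hypothetical.

```python
import zlib

EID_SYSTEM = "http://hl7.org/fhir/sid/us-ssn"  # taken from the sample rules file


def partition_key(resource, eid_system=EID_SYSTEM):
    """Return the resource's EID value under eid_system, or None if it has none."""
    for identifier in resource.get("identifier", []):
        if identifier.get("system") == eid_system:
            return identifier.get("value")
    return None


def partition_for(key, partition_count):
    """Map a key to a partition. CRC32 is a stand-in for the broker's real hash."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


if __name__ == "__main__":
    a = {"resourceType": "Patient", "id": "a",
         "identifier": [{"system": EID_SYSTEM, "value": "123-45-6789"}]}
    b = {"resourceType": "Patient", "id": "b",
         "identifier": [{"system": EID_SYSTEM, "value": "123-45-6789"}]}
    # Resources sharing an EID always map to the same partition.
    print(partition_for(partition_key(a), 3) == partition_for(partition_key(b), 3))
```

Because each consumer reads from a single partition, keying by EID means that resources sharing an EID are handled by the same consumer rather than being processed concurrently on different threads.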

MDM and Partitioning

If both MDM and multitenancy are enabled, resources can only be matched against resources in the same partition.