31.1.1 MDM Rule Definition

In the MDM User Interface, you need to provide the MDM Rule Definition Script, a JSON document describing exactly how and why two FHIR Resources should be linked together.

Here are the minimal fields that must be included in this JSON document:

{
  "version": "1",
  "mdmTypes": [],
  "candidateSearchParams": [],
  "candidateFilterSearchParams": [],
  "matchFields": [],
  "matchResultMap": {}
}

These fields are introduced in the HAPI FHIR MDM Rules section. More detailed information about each field is given below. Additionally, to test how different MDM algorithms work, see the $mdm-evaluate operation.
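As a quick sanity check before pasting a rule definition into the MDM User Interface, the minimal field list above can be verified with a few lines of Python. This is only an illustrative sketch: the helper name `missing_fields` is hypothetical and is not part of the MDM module.

```python
import json

# Top-level fields listed as the minimum in the JSON document above.
REQUIRED_FIELDS = (
    "version",
    "mdmTypes",
    "candidateSearchParams",
    "candidateFilterSearchParams",
    "matchFields",
    "matchResultMap",
)


def missing_fields(rule_json):
    """Return the required top-level fields absent from a rule definition."""
    rule = json.loads(rule_json)
    return [name for name in REQUIRED_FIELDS if name not in rule]


minimal_rule = """
{
  "version": "1",
  "mdmTypes": [],
  "candidateSearchParams": [],
  "candidateFilterSearchParams": [],
  "matchFields": [],
  "matchResultMap": {}
}
"""

print(missing_fields(minimal_rule))        # []
print(missing_fields('{"version": "1"}'))  # lists the five other field names
```

The check only looks at the presence of the top-level keys; it does not validate their contents.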

MDM processing is divided into two main phases. The first phase finds candidate FHIR Resources. The second phase matches the newly created or updated input FHIR resource more precisely against the candidates found in the first phase, and finally creates links between them.

The first phase uses candidateSearchParams and candidateFilterSearchParams on the FHIR Resource types listed in mdmTypes to find the candidates; the second phase uses matchFields and matchResultMap to create the MDM links.

Also, the optional eidSystems field can be used to change how the MDM module creates links and Golden Record resources. See the Using Enterprise Identifiers (EIDs) in MDM Rule Definition section for more details.

31.1.2 Finding Candidates

Field Name	Brief Description	Notes and Comments
`mdmTypes`	Lists the FHIR Resource types that will be analyzed by the MDM module.	Any FHIR Resource type having an active `identifier` SearchParameter can be configured in MDM. Most common FHIR Resources used in MDM: Patient, Practitioner, Organization, Location, Person. Others: Account, CareTeam, RelatedPerson, Group, Device, InsurancePlan, etc. Even if multiple types are listed, the MDM module will only link resources of the same type together. For example, if `mdmTypes` lists Patient and Practitioner, no Patient will be linked to a Practitioner, and vice versa.
`candidateSearchParams`	Lists the SearchParameters used to find candidates; at least one of them must match exactly before two resources are considered for matching.	Candidate FHIR Resources are searched using the listed SearchParameters and the field values of the newly created or updated input FHIR resource. The candidates found by these searches are compared more precisely with the input FHIR resource in the second phase. One database search is done for each entry in the `candidateSearchParams` array, all executed in parallel. The FHIR Resources must already be indexed with each SearchParameter used. Custom SearchParameters can be created and indexed beforehand and then used. The performance of the first phase depends entirely on the speed of these searches. Note that if the search queries find too many candidates, the matching process slows down considerably. For that reason, prefer fields that divide the data into numerous small groups. Example: most identifiers, phone number, date of birth. Fields that divide the data into a few large groups are poor choices performance-wise. Example: gender, given name, some identifiers that group many resources, province or state. Also, data associated with locations, or data that changes regularly within a resource, might prevent finding appropriate candidate resources that could have been used during the matching phase. Example: street address.
`candidateFilterSearchParams`	Adds filters to the `candidateSearchParams` searches to further reduce the number of candidates to analyze.	The fields used in these filters must already be indexed with a SearchParameter. Custom SearchParameters can also be created and indexed beforehand and used here. An optional qualifier can be used with the SearchParameter; its value is one of `ABOVE`, `BELOW`, `NOT`, `IN`, `NOT_IN`, `TEXT` or `OF_TYPE`. Examples of filtering: select only resources with active status, or exclude resources whose family name equals a particular test value. Typically, filters are used to exclude resources from the groups of data selected by `candidateSearchParams`. Example: select resources excluding identifiers of a particular system or value. Other examples: `[ { "resourceType": "Patient", "searchParam": "family", "qualifier": "NOT", "fixedValue": "TestFamilyName" }, { "resourceType": "Patient", "searchParam": "active", "fixedValue": "true" }, { "resourceType": "Patient", "searchParam": "language", "qualifier": "NOT", "fixedValue": "fr-FR" } ]`
`eidSystems`	Optional field specifying which identifiers can be expected and used as unique identifiers on incoming resources.	Using this field affects the way the MDM module processes incoming resources. During the first phase, it first tries to find Golden Record resources having the specified EID before finding candidates. See the 'Using Enterprise Identifiers (EIDs) in MDM Rule Definition' section for more details.

It is usually possible to test the searches that the first phase will run to find candidate resources, in order to assess their performance.

For example, using this sample MDM rule definition (excluding matchFields and matchResultMap for simplicity):

{
  "version": "v2022-10-01",
  "mdmTypes": [ "Organization" ],
  "candidateSearchParams": [
      {
          "resourceType": "*",
          "searchParams": [
              "identifier"
          ]
      },
      {
          "resourceType": "Organization",
          "searchParams": [
              "name"
          ]
      }
  ],
  "candidateFilterSearchParams": [
      {
          "resourceType": "Organization",
          "searchParam": "active",
          "fixedValue": "true"
      },
      {
          "resourceType": "Organization",
          "searchParam": "type",
          "qualifier": "NOT",
          "fixedValue": "other"
      }
  ]
}

Here, the JSON document specifies that two searches should be done to find candidates, one with the identifier SearchParameter and one with the name SearchParameter. Also, the results of both searches should be filtered to include active resources only and to exclude all resources having a type code value of 'other'.

When adding this new Organization resource:

{
  "resourceType": "Organization",
  "identifier": [
      {
          "system": "http://mysite.com/fhir/system/our-internal-organization-id",
          "value": "MyOrg-123"
      }
  ],
  "name": "MyOrganization",
  "active": true,
  "type": [
      {
          "coding": [
              {
                  "system": "http://terminology.hl7.org/CodeSystem/organization-type",
                  "code": "edu",
                  "display": "Educational Institute"
              }
          ]
      }
  ]
}

These equivalent searches will be run in parallel to find the candidate resources:

http://localhost:8000/Organization?identifier=http://mysite.com/fhir/system/our-internal-organization-id|MyOrg-123&active=true&type:not=other
http://localhost:8000/Organization?name=MyOrganization&active=true&type:not=other

All Organization resources found by these queries will be kept for the second phase and matched more precisely with the new 'MyOrg-123' Organization resource. If no Organization resource is returned by the search queries, then the second phase is skipped, and no new MDM link is created (besides the one to a new Golden Record).

These searches are specific and fast, and should return few Organization resources as candidates, which is better for performance.

Ideally, these searches should be tested beforehand to make sure they run quickly and don't return too many resources.
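To test the candidate searches beforehand, the equivalent search URLs can be derived from the rule definition and a sample resource. The Python sketch below reproduces the two URLs from the Organization example. It is not the MDM module's implementation: the `EXTRACTORS` table, the function names, the comma-joining of multiple values, the handling of `*` in filters, and the mapping of qualifiers to search modifiers (for example `NOT_IN` to `:not-in`) are all assumptions made for illustration, and values are left unencoded for readability.

```python
# Hypothetical extractors: how each SearchParameter value is read from the
# input resource. Only the two parameters used in the sample rule are covered.
EXTRACTORS = {
    "identifier": lambda r: [
        f"{i['system']}|{i['value']}" for i in r.get("identifier", [])
    ],
    "name": lambda r: [r["name"]] if r.get("name") else [],
}


def filter_terms(rule, resource_type):
    """Turn candidateFilterSearchParams into query-string terms."""
    terms = []
    for f in rule.get("candidateFilterSearchParams", []):
        if f["resourceType"] not in ("*", resource_type):
            continue
        # Assumption: a qualifier maps to the lower-case, hyphenated modifier.
        qualifier = f.get("qualifier")
        modifier = ":" + qualifier.lower().replace("_", "-") if qualifier else ""
        terms.append(f"{f['searchParam']}{modifier}={f['fixedValue']}")
    return terms


def candidate_search_urls(rule, resource, base_url):
    """Build one search URL per applicable candidateSearchParams entry."""
    resource_type = resource["resourceType"]
    if resource_type not in rule["mdmTypes"]:
        return []
    filters = filter_terms(rule, resource_type)
    urls = []
    for entry in rule.get("candidateSearchParams", []):
        if entry["resourceType"] not in ("*", resource_type):
            continue
        terms = []
        for param in entry["searchParams"]:
            values = EXTRACTORS[param](resource)
            if values:
                terms.append(f"{param}={','.join(values)}")
        # Skip the search when the input resource lacks one of its values.
        if len(terms) == len(entry["searchParams"]):
            urls.append(f"{base_url}/{resource_type}?" + "&".join(terms + filters))
    return urls


rule = {
    "version": "v2022-10-01",
    "mdmTypes": ["Organization"],
    "candidateSearchParams": [
        {"resourceType": "*", "searchParams": ["identifier"]},
        {"resourceType": "Organization", "searchParams": ["name"]},
    ],
    "candidateFilterSearchParams": [
        {"resourceType": "Organization", "searchParam": "active",
         "fixedValue": "true"},
        {"resourceType": "Organization", "searchParam": "type",
         "qualifier": "NOT", "fixedValue": "other"},
    ],
}

organization = {
    "resourceType": "Organization",
    "identifier": [{
        "system": "http://mysite.com/fhir/system/our-internal-organization-id",
        "value": "MyOrg-123",
    }],
    "name": "MyOrganization",
    "active": True,
}

urls = candidate_search_urls(rule, organization, "http://localhost:8000")
for url in urls:
    print(url)
```

Running each printed URL against a test server, and checking both the response time and the number of resources returned, gives a rough idea of how the first phase will behave.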

31.1.3 Matching and Creating Links

Field Name	Brief Description	Notes and Comments
`version`	Identifies the current version of your rule definition JSON document. Mandatory field; can be any non-empty string of at most 16 characters.	Useful for debugging purposes: newly created MDM links will be associated with this version. It is highly recommended to change this version whenever you change your rule definition JSON document, as this makes it easier to identify unwanted MDM links and understand why they were created.
`matchFields`	Specifies exactly how to compare one or more fields of the incoming resource to the candidates found.	Many different comparison algorithms are described in the HAPI-FHIR documentation. Each entry can use either a `resourcePath` or a more custom `fhirPath` to retrieve the resource field values used for the comparisons. An entry from the `matchFields` array will be used only if its `name` appears in at least one of the `matchResultMap` keys. Unused entries can be safely removed from the array. The `name` field should not contain any comma character because of the way names are used in `matchResultMap`.
`matchResultMap`	This map lists the specific ways for two resources to match and be linked.	The key of each map entry lists, by name, the `matchFields` entries to be used for comparisons. The `matchFields` names in the key are separated by commas: `matchFieldNameA,matchFieldNameB,matchFieldNameC` for example. Each `matchFields` entry in the key will be evaluated against the incoming resource. The value of each map entry is the resulting link type: `MATCH` or `POSSIBLE_MATCH`. It is not necessary to include an entry in the map for a `NO_MATCH` result, since by default no link is created when two resources don't match. Links are created for `MATCH` and `POSSIBLE_MATCH` results only. Only one link is kept between two resources. If multiple entries in the map are found to be true between two resources, only one link type is kept, and `MATCH` results always take precedence over `POSSIBLE_MATCH` results. For better performance, try to minimize the number of entries in the map and the overlap between comparisons. Example of unnecessary overlapping comparisons: `{ "matchFieldA,matchFieldB" : "MATCH", "matchFieldA,matchFieldB,matchFieldC" : "MATCH" }` Here, the second entry `matchFieldA,matchFieldB,matchFieldC` is redundant with the first entry `matchFieldA,matchFieldB` and would not make the process create any additional link, because the first entry makes the MDM module create a `MATCH` link whether or not the `matchFieldC` comparison is true. The second entry should be removed, as it only makes the matching process slower for no additional result. Another example of unnecessary overlapping comparisons: `{ "matchFieldA" : "MATCH", "matchFieldA,matchFieldB,matchFieldC" : "MATCH" }` Again, the second entry is not required and should be removed, as the first entry always creates `MATCH` links whether or not the `matchFieldB` or `matchFieldC` comparisons are true. A useful way of using overlapping comparisons: `{ "matchFieldA" : "POSSIBLE_MATCH", "matchFieldA,matchFieldB,matchFieldC" : "MATCH" }` This is a valid example of overlapping comparisons. The first entry creates `POSSIBLE_MATCH` links when the `matchFieldA` comparison is true; however, if the `matchFieldB` and `matchFieldC` comparisons are also true, a `MATCH` link is created instead of a `POSSIBLE_MATCH` link, as the `MATCH` result takes precedence. Finally, the order of match fields in the map key doesn't matter, as all comparisons are done regardless: `{ "matchFieldA,matchFieldB,matchFieldC" : "MATCH", "matchFieldC,matchFieldA,matchFieldB" : "MATCH" }` Both entries would produce exactly the same links, so only one entry should be kept.
`eidSystems`	Optional field specifying which identifiers can be expected and used as unique identifiers on incoming resources.	Using this field affects the way the MDM module processes incoming resources. During the second phase, MDM links and Golden Record resources are created differently depending on whether the `Prevent modification of External EIDs` and `Prevent multiple EIDs from existing simultaneously on a target resource` properties are enabled in the MDM Configuration. See the 'Using Enterprise Identifiers (EIDs) in MDM Rule Definition' section for more details.
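The `matchResultMap` behaviour described in the table (comma-separated keys, `MATCH` taking precedence over `POSSIBLE_MATCH`, and redundant overlapping entries) can be modelled in a few lines of Python. This is a sketch of the rules as described here, not the MDM module's actual implementation; the function names `resolve_link` and `redundant_keys` are hypothetical, and the set of match field names whose comparison returned true is assumed to be already known.

```python
# Strength of each result; NO_MATCH means that no link is created.
PRECEDENCE = {"NO_MATCH": 0, "POSSIBLE_MATCH": 1, "MATCH": 2}


def field_names(key):
    """Split a matchResultMap key into its set of matchFields names."""
    return frozenset(name.strip() for name in key.split(","))


def resolve_link(match_result_map, true_fields):
    """Return the strongest link type whose key fields all compared true."""
    true_fields = set(true_fields)
    best = "NO_MATCH"
    for key, link_type in match_result_map.items():
        if field_names(key) <= true_fields:
            if PRECEDENCE[link_type] > PRECEDENCE[best]:
                best = link_type
    return best


def redundant_keys(match_result_map):
    """Return keys that can never change the link produced by the map.

    A key is redundant when another entry needs a subset of its fields and
    yields an equal or stronger result. For entries with the same fields and
    the same result, the first one is kept.
    """
    entries = [(k, field_names(k), v) for k, v in match_result_map.items()]
    redundant = []
    for i, (key, names, link_type) in enumerate(entries):
        for j, (_, other_names, other_type) in enumerate(entries):
            if i == j or not other_names <= names:
                continue
            stronger = PRECEDENCE[other_type] > PRECEDENCE[link_type]
            equal = PRECEDENCE[other_type] == PRECEDENCE[link_type]
            if stronger or (equal and (other_names < names or j < i)):
                redundant.append(key)
                break
    return redundant


useful_overlap = {
    "matchFieldA": "POSSIBLE_MATCH",
    "matchFieldA,matchFieldB,matchFieldC": "MATCH",
}
print(resolve_link(useful_overlap, {"matchFieldA"}))
print(resolve_link(useful_overlap, {"matchFieldA", "matchFieldB", "matchFieldC"}))
print(redundant_keys(useful_overlap))

needless_overlap = {
    "matchFieldA,matchFieldB": "MATCH",
    "matchFieldA,matchFieldB,matchFieldC": "MATCH",
}
print(redundant_keys(needless_overlap))
```

With the useful overlap, the first call yields `POSSIBLE_MATCH`, the second yields `MATCH`, and no key is reported as redundant. With the needless overlap, the three-field key is reported, matching the first example in the table.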
