5.14 Creating Data

 

This section contains information about methods for creating data in the CDR.

5.14.1 Transactions and Submitting Bundles

 
  • If you POST to [baseUrl]/Bundle, you are submitting the bundle for storage as-is. In other words, the bundle is stored as a Bundle resource, and the server does not process its contents (aside from any validation that is enabled). This mode is generally used to store Bundle resources with Bundle.type values such as document and collection. In its default configuration, Smile CDR will reject a Bundle with a type value of transaction or batch that is submitted this way, as this is generally a sign that the client is attempting to perform one of the operations described below but with an incorrect request URL.

    Naming: This is not a FHIR Transaction. It is an ordinary resource create where the resource happens to be a Bundle resource.

  • If you POST to [baseUrl] and your Bundle has a Bundle.type value of transaction, you are performing a FHIR “transaction operation”, meaning that all of the individual resources inside the bundle will be processed. It is also possible to include other REST operations such as searches in this kind of bundle. The processing works as an atomic unit, meaning that if anything fails (e.g. invalid data in an individual element) the entire transaction will be rolled back.

    Naming: This operation is referred to as a FHIR Transaction operation.

  • If you POST to [baseUrl] and your Bundle has a Bundle.type value of batch, the same processing as for a transaction applies, except that each operation is executed in its own database transaction, so an individual failure doesn’t cause the entire operation to be rolled back. In this case, the response Bundle returned by the server will include status entries indicating the outcome of each individual operation. Note that the batch operation still requires the entire Bundle to be valid FHIR at a minimum. This means that it can’t contain non-existent resource types, malformed datatypes, etc.

    Naming: This operation is referred to as a FHIR Batch operation.
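
The three cases above can be summarized in a small client-side helper. This is a minimal sketch rather than part of any Smile CDR client library: the function names `post_url_for_bundle` and `failed_entries` are hypothetical, and the base URL in the usage note is a placeholder. The second function relies on the FHIR convention that each `entry.response.status` in a response Bundle begins with a three-digit HTTP status code.

```python
def post_url_for_bundle(base_url, bundle):
    """Pick the request URL for a Bundle based on its Bundle.type.

    transaction/batch Bundles go to the server root so that their
    entries are processed; any other type is stored as-is at /Bundle.
    """
    base = base_url.rstrip("/")
    if bundle.get("type") in ("transaction", "batch"):
        return base
    return base + "/Bundle"


def failed_entries(response_bundle):
    """Return (index, status) pairs for response entries that are not 2xx.

    Mostly useful for batch responses, where individual operations can
    fail without rolling back the others.
    """
    failures = []
    for index, entry in enumerate(response_bundle.get("entry", [])):
        status = entry.get("response", {}).get("status", "")
        if not status.startswith("2"):
            failures.append((index, status))
    return failures
```

For example, `post_url_for_bundle("http://localhost:8000", {"type": "document"})` yields the `/Bundle` URL, while a Bundle of type batch or transaction yields the server root.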

5.14.2 Auto Creating Reference Targets

 

Often when batch processing data from multiple sources, you will have data from one source that has references to data from other sources.

For example, a collection of Observation resources could be imported from a lab system data source at the same time that a collection of Patient resources is created from a patient administration data source. The Observation resources would have references to the Patient resources. Under ideal conditions, the Patient resource would be processed first and be present for the Observation to link to. In practice, however, it is often hard to control the order in which transactions occur, so an Observation might be processed before its Patient. By default this would cause an error, since the Observation would have an invalid reference, and nothing would be stored.

The following sections describe several strategies for solving this issue.

5.14.3 Transaction With Conditional Create

 

The semantics of a conditional create are roughly described as "use an existing resource that matches specific criteria if one exists (and do not modify that resource), or create a new one if not". The specific criteria in question can be any set of FHIR search parameters that could be used to otherwise locate the resource to use. The resource identifier field/search parameter is often used for this purpose, but other search parameters can also be used.

In a FHIR Transaction operation, this "use the existing resource or create a new one" behavior is achieved using a conditional create.

This involves creating a Transaction Bundle with the following properties (this example uses an Observation that references a conditionally created Patient):

  • One or more entries containing an Observation resource with a request.method value of POST. This means that the server should create the new Observation resources, and automatically assign them new IDs.

  • An entry containing a Patient resource with:

    • A request.method value of POST

    • A fullUrl value containing a temporary UUID. This is used as the target for references to this resource from other resources.

    • A request.ifNoneExist value containing a search URL that could be used to find this resource (in the example below, a search for the Patient by identifier). This indicates to the server that this resource should only be created if no existing resource already matches the given search criteria.

  • The Observation resources contain a reference where the target is the fullUrl UUID for the Patient entry. If the Patient target was created (because it did not already exist) the reference will automatically be replaced with a reference to the newly created resource. If the Patient target was not created (because it already existed), the reference will automatically be replaced with a reference to the pre-existing Patient resource.

An example Transaction Bundle is shown below. It should be POSTed to the root of the FHIR Endpoint module server.

{
  "resourceType": "Bundle",
  "type": "transaction",
  "entry": [ {

    "request": {
      "method": "POST",
      "url": "Observation"
    },
    "resource": {
      "resourceType": "Observation",
      "status": "final",
      "code": {
        "coding": [ {
          "system": "http://loinc.org",
          "code": "789-8",
          "display": "Erythrocytes [#/volume] in Blood by Automated count"
        } ]
      },
      "subject": {
        "reference": "urn:uuid:3bc44de3-069d-442d-829b-f3ef68cae371"
      },
      "valueQuantity": {
        "value": 4.12,
        "unit": "10^12/L",
        "system": "http://unitsofmeasure.org",
        "code": "10*12/L"
      }
    }

  },{

    "fullUrl": "urn:uuid:3bc44de3-069d-442d-829b-f3ef68cae371",
    "request": {
      "method": "POST",
      "url": "Patient",
      "ifNoneExist": "identifier=http://acme.org/mrns|12345"
    },
    "resource": {
      "resourceType": "Patient",
      "identifier": [ {
        "system": "http://acme.org/mrns",
        "value": "12345"
      } ],
      "name": [ {
        "family": "Jameson",
        "given": [ "J", "Jonah" ]
      } ],
      "gender": "male"
    }

  } ]
}
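
A Bundle with the properties listed above could also be assembled programmatically. The following is a minimal sketch under stated assumptions: the function name `conditional_create_transaction` and its parameters are hypothetical, the Patient is matched on a single identifier, and every Observation is assumed to reference that Patient through its subject element.

```python
import uuid


def conditional_create_transaction(patient, identifier_system, identifier_value, observations):
    """Build a transaction Bundle that conditionally creates one Patient
    and creates Observations that reference it."""
    # Temporary UUID used as the reference target inside the Bundle.
    patient_full_url = f"urn:uuid:{uuid.uuid4()}"

    entries = []
    for observation in observations:
        linked = dict(observation)  # shallow copy; the caller's dict is left untouched
        linked["subject"] = {"reference": patient_full_url}
        entries.append({
            "request": {"method": "POST", "url": "Observation"},
            "resource": linked,
        })

    entries.append({
        "fullUrl": patient_full_url,
        "request": {
            "method": "POST",
            "url": "Patient",
            # Only create the Patient if this search matches nothing.
            "ifNoneExist": f"identifier={identifier_system}|{identifier_value}",
        },
        "resource": patient,
    })

    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}
```

The returned dictionary can be serialized with `json.dumps` and POSTed to the root of the FHIR endpoint, as with the JSON example above.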

5.14.4 Auto-Create Placeholder Reference Targets

 

If the Auto-Create Placeholder Reference Targets setting is enabled in the FHIR Storage module configuration, the server can automatically create an empty "placeholder" resource with a pre-assigned ID when a resource being stored references a target that does not yet exist.

This technique is somewhat less complex than the transaction approach above, since it does not require a transaction bundle to be created. With this technique, the ID (not the identifier) of the target resource must be known. For example, the following payload could be POSTed to [baseUrl]/Observation, and would result in the creation of an empty Patient resource with the ID Patient/ABC if one does not already exist. Note that if you want to use this technique with purely numeric resource IDs, you will also need to adjust the Client ID Mode.

{
  "resourceType": "Observation",
  "status": "final",
  "code": {
    "coding": [ {
      "system": "http://loinc.org",
      "code": "789-8"
    } ]
  },
  "subject": {
    "reference": "Patient/ABC"
  },
  "valueQuantity": {
    "value": 4.12,
    "system": "http://unitsofmeasure.org",
    "code": "10*12/L"
  }
}
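
Before relying on placeholder creation, it can help to check which reference targets a payload actually names. The sketch below is a client-side illustration only and says nothing about how the server implements the feature: the helper names `literal_reference_targets` and `numeric_id_targets` are hypothetical. It collects literal references of the form ResourceType/id and flags purely numeric IDs, which are the ones affected by the Client ID Mode note above.

```python
import re

# Literal references such as "Patient/ABC". Match URLs ("Patient?identifier=..."),
# absolute URLs and urn:uuid placeholders do not match this pattern.
LITERAL_REFERENCE = re.compile(r"^([A-Z][A-Za-z]+)/([A-Za-z0-9\-.]{1,64})$")


def literal_reference_targets(node):
    """Recursively collect (resource_type, id) pairs from "reference" values."""
    targets = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "reference" and isinstance(value, str):
                match = LITERAL_REFERENCE.match(value)
                if match:
                    targets.append((match.group(1), match.group(2)))
            else:
                targets.extend(literal_reference_targets(value))
    elif isinstance(node, list):
        for item in node:
            targets.extend(literal_reference_targets(item))
    return targets


def numeric_id_targets(targets):
    """Return the targets whose ID consists only of digits."""
    return [(rtype, rid) for rtype, rid in targets if rid.isdigit()]
```

Applied to the Observation above, `literal_reference_targets` would report a single target, the Patient with ID ABC.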

5.14.5 Auto-Create Placeholder Reference Targets with Identifier

 

If the Auto-Create Placeholder Reference Targets setting is enabled in the FHIR Storage module configuration (as described above), and the Allow Inline Match URL References Enabled setting is also enabled, you can refine this behavior further.

In this case, it is possible to use an inline match URL instead of a hardcoded resource ID, which achieves behavior similar to the Transaction Bundle use case.

Consider the following Observation being POSTed to [baseUrl]/Observation.

{
  "resourceType": "Observation",
  "status": "final",
  "code": {
    "coding": [ {
      "system": "http://loinc.org",
      "code": "789-8"
    } ]
  },
  "subject": {
    "reference": "Patient?identifier=http://foo|1234",
    "identifier": {
      "system": "http://foo",
      "value": "1234"
    }
  },
  "valueQuantity": {
    "value": 4.12,
    "system": "http://unitsofmeasure.org",
    "code": "10*12/L"
  }
}

Here, the reference is treated as a local search (for a Patient with the identifier shown) and executed as such. If the search finds zero results, a new Patient resource is created and its Patient.identifier value is populated with the Reference.identifier value shown. The reference is then automatically replaced with a reference to this new Patient. If the search finds one result, the reference is automatically replaced with a reference to the found Patient.

It is important to note that a resource automatically created in this manner will only be populated with an identifier if Reference.identifier is provided and the new resource type itself has an identifier element. If Reference.identifier is omitted, the reference target cannot be subsequently updated using an inline match URL. For this reason, it is recommended to always populate Reference.identifier when using inline match URLs to facilitate auto-creation and updating of placeholder reference targets.
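
One way to follow this recommendation consistently is to build the match URL and Reference.identifier from the same values. This is a minimal sketch: the helper name `inline_match_reference` is hypothetical, and the match URL is left unencoded to mirror the example payload above.

```python
def inline_match_reference(resource_type, system, value):
    """Build a Reference that carries both an inline match URL and the
    matching Reference.identifier, so the two can never disagree."""
    return {
        "reference": f"{resource_type}?identifier={system}|{value}",
        "identifier": {"system": system, "value": value},
    }


# Usage: attach the reference to an Observation before POSTing it.
observation = {
    "resourceType": "Observation",
    "status": "final",
    "subject": inline_match_reference("Patient", "http://foo", "1234"),
}
```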

5.14.6 Creation Performance

 

In scenarios where you want to ingest a large amount of data (e.g. for backloading prior to an initial go-live), there are some subtle tweaks that can have a significant impact on how quickly data is ingested. This section applies to data ingested through the FHIR REST API (e.g. a FHIR Endpoint module in Smile CDR) as well as to data ingested through ETL and Channel Import modules.

There is no silver bullet for achieving the best performance. The strategies below will each yield incremental improvements, but not all of them may be possible or appropriate for your situation, and that is fine.

  • Avoid client assigned IDs: The FHIR "create with client assigned ID" operation uses an update/PUT operation to let the client control the ID of the resource inserted in the database, rather than relying on the server to assign one. This can be handy if you are replicating data from another system and want your IDs to match between systems. Client-assigned IDs come at a price, however, as the system needs to first perform a read before every write to ensure that a resource doesn't already exist with the specified ID. It is often feasible to use the identifier field instead of the id field to store these source system IDs.

  • Disable Unnecessary Search Parameters: Some resources generate a particularly large number of search parameter indexes. Every enabled FHIR search parameter will result in extra processing time during writes, so it can be beneficial to consider whether you can disable parameters you know you will not need. The Patient and Observation resources are particularly costly in this regard, but this suggestion applies to some extent to all resource types.

  • Disable Unnecessary Features: The following features should be disabled (even if temporarily during backload) unless they are needed, as they add additional processing time for each resource being loaded.

  • Avoid Small FHIR Transactions: A FHIR Transaction object with a small number of entries will often perform slightly worse than equivalent individual operations due to the extra processing required to support placeholder ID resolution in transactions. It is often worth testing whether breaking up a transaction will yield better performance for your specific use case.

  • Large Transactions Can Perform Better: On the other hand, if you are ingesting large numbers of related resources (for example, a collection of Patient resources, plus an even larger collection of Observation resources for each patient) it can be much faster to batch these into a collection of FHIR transaction Bundle resources where each transaction has hundreds or even thousands of resources to be processed at the same time.
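
Two of the strategies above (storing source system IDs in identifier rather than id, and grouping resources into large transactions) can be sketched together. This is an illustration under stated assumptions, not a tuned loader: the function names are hypothetical, the identifier system URL in the usage note is a placeholder, and the default bundle size of 500 is an arbitrary starting point that should be tested against your own data.

```python
from itertools import islice


def source_id_to_identifier(resource, system):
    """Return a copy of the resource with its client-assigned id moved
    into the identifier list, so the server can assign its own ID."""
    converted = {key: value for key, value in resource.items() if key != "id"}
    if "id" in resource:
        converted["identifier"] = list(resource.get("identifier", [])) + [
            {"system": system, "value": resource["id"]}
        ]
    return converted


def transaction_bundles(resources, bundle_size=500):
    """Yield transaction Bundles containing up to bundle_size creates each."""
    iterator = iter(resources)
    while True:
        chunk = list(islice(iterator, bundle_size))
        if not chunk:
            return
        yield {
            "resourceType": "Bundle",
            "type": "transaction",
            "entry": [
                {
                    "request": {"method": "POST", "url": resource["resourceType"]},
                    "resource": resource,
                }
                for resource in chunk
            ],
        }
```

A loader might call `source_id_to_identifier(resource, "http://example.org/source-ids")` on each resource and then POST every Bundle produced by `transaction_bundles` to [baseUrl]. Note that this sketch does not rewrite references between resources: any reference that pointed at a removed id would need to be replaced, for example with a placeholder UUID or a conditional create as described earlier on this page.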