Smile CDR v2024.05.PRE

31.2.1 FHIR Bulk Export Operation
Trial

 

The FHIR Bulk Data Access protocol describes a mechanism for efficiently requesting large amounts of data in a convenient format. This protocol can be used to request all data on a server, or a specific subset that is useful for a given use case.

31.2.2 Enabling Bulk Export
Trial

 

Bulk export support is configured on the FHIR Storage module via the bulk_export.enabled property.

Note that Bulk Export file generation can involve the creation of many large Binary resources, so it is recommended to enable Externalized Binary Storage when using this feature.

The Maximum Bulk Export File Capacity property may be used to control the size of the generated files.

31.2.3 Types of Bulk Export Requests
Trial

 

The FHIR Bulk Data Access specification describes several invocation styles. The following table outlines currently supported functionality.

Name | Description | URL Syntax | Supported
Endpoint - Group of Patients | Export a detailed set of FHIR resources of diverse resource types pertaining to all patients in the specified Group. | [fhir base]/Group/[id]/$export |
Endpoint - System Level Export | Export data from a FHIR server, whether or not it is associated with a patient. | [fhir base]/$export |
Endpoint - All Patients | Export a detailed set of FHIR resources of diverse resource types pertaining to all patients. | [fhir base]/Patient/$export |

31.2.4 Request Parameters
Trial

 
Name | Optional? | Multiple Allowed? | Description | Example | Supported In
_outputFormat | Yes | No | Specifies the output encoding style that should be used. Currently only application/fhir+ndjson is supported. | application/fhir+ndjson | Group, Patient, System
_type | Yes | No | Specifies a comma-separated list of resource types to include. If this parameter is absent, all resource types the user has authorization to access, except Binary, will be exported (provided they were created/updated in the past 24 hours - see `_since` description below). The Binary type must not be included in this list. Note that if you are doing a Patient or Group Bulk Export, only resource types which refer to patients via some search parameter can be used here. | Patient, Practitioner | Group, Patient, System
_since | Yes | No | Only resources that were last updated on or after the given time will be included. If this parameter is absent, only resources created or updated in the past 24 hours will be exported. | 2019-10-25T11:14:00Z | Group, Patient, System
_typeFilter | Yes | Yes | Specifies a search URL that can be used to narrow the results for one or more of the resource types being exported. Can be used in conjunction with the _type parameter to limit the resource types as well as to specific search sets within one or more of the resource types. To support multiple typeFilters, separate them by a comma. See the FHIR specification for more details. | Patient?identifier=foo,Practitioner?name=jim | Group, Patient, System
_mdm | Yes | No | Specifies whether you want to perform MDM expansion on the group members. For example, if Patient/1 is a member of a group, and they are MDM Matched to Patient/2 who is not in the group, the results for Patient/2 will be exported alongside the results for Patient/1. Setting this to true will also include any resources which refer to the Golden Resource for any members. | true | Group
_typePostFetchFilterUrl | Yes | Yes | Specifies a search URL that will be applied by an In-Memory matcher against resources after they have been fetched from the database. This can be used to efficiently filter out results in cases where very large numbers of resources are being returned from the database, and complex search parameters such as very long lists of tokens are being used for filters. Note that in many cases it is much more efficient to leverage the database, so putting filters in this parameter should be tested and compared against the standard _typeFilter parameter. Values in this field must take the form [resourceType]?[parameters]. If any values of this parameter are provided for a given resource type being exported, then any candidate files to be included in the bulk export will only be included if they are matched by at least one _typePostFetchFilterUrl filter. | Patient?identifier=foo,Practitioner?name=jim | Group, Patient, System
_exportId | Yes | No | When Bulk Export generates Binary resources, if an `_exportId` is specified, it will be included in the `binary.meta.extension` field. This can be used to correlate the Binary resources with the original request after the fact. | my-patient-export | Group, Patient, System

31.2.5 Requesting A Bulk Extract
Trial

 
Initiating a Bulk Export requires the FHIR_OP_INITIATE_BULK_DATA_EXPORT permission.

This section describes the user flow when requesting a bulk extract.

First, a user invokes the $export operation to initiate the request. This can be done using either an HTTP GET request with export request parameters included in the URL or as an HTTP POST with the export request parameters included in a FHIR Parameters resource.

31.2.5.1 Initiating Using HTTP GET

The following example shows a URL that can be used to initiate a Bulk Export using an HTTP GET. For readability, the URL is split over multiple lines, but it would not be in a real invocation.

https://fhir.example.com:8000/$export
   ?_outputFormat=application%2Ffhir%2Bndjson
   &_type=Patient%2C%20Practitioner
   &_since=2019-10-25T10%3A07%3A58.788-04%3A00
   &_typeFilter=Practitioner%3Fidentifier%3Dbar

The complete HTTP Request is shown below:

GET /$export?[params]
Prefer: respond-async
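
As an illustration of the percent-encoding used in the URL above, the following minimal Python sketch assembles a system-level $export URL with the standard library. The helper name build_export_url is hypothetical and not part of any Smile CDR client; the base URL and parameter values are taken from the example above.

```python
from urllib.parse import quote, urlencode


def build_export_url(base_url, params):
    """Return a system-level $export URL with percent-encoded parameters."""
    # quote (rather than the default quote_plus) encodes spaces as %20;
    # safe="" also encodes "/", "+", ":" and "," inside the values.
    query = urlencode(params, safe="", quote_via=quote)
    return f"{base_url}/$export?{query}"


export_url = build_export_url(
    "https://fhir.example.com:8000",
    {
        "_outputFormat": "application/fhir+ndjson",
        "_type": "Patient,Practitioner",
        "_since": "2019-10-25T10:07:58.788-04:00",
        "_typeFilter": "Practitioner?identifier=bar",
    },
)
print(export_url)
```

The request itself would still need the Prefer: respond-async header shown above.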

A similar Group Bulk Export Request can be seen below:

GET /Group/123/$export?_type=Immunization
Prefer: respond-async

Here's an example of fetching all Immunizations within a Group that have a vaccine-code of COVID-19:

GET /Group/123/$export?_type=Immunization&_typeFilter=Immunization%3Fvaccine-code%3DCOVID-19
Prefer: respond-async

To request multiple resource types, use a comma-separated list for the _type parameter but use multiple instances of the _typeFilter parameter:

GET /$export?_type=Patient,ExplanationOfBenefit&_typeFilter=Patient?_id=Patient/123&_typeFilter=ExplanationOfBenefit?patient=Patient/123
Prefer: respond-async
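
The combination of a single comma-separated _type value with repeated _typeFilter parameters can be sketched as follows. This is an illustration only: build_multi_type_query is a hypothetical helper, and it percent-encodes the filter values, which the example above leaves unencoded for readability.

```python
from urllib.parse import urlencode


def build_multi_type_query(filters_by_type):
    """Build a query string from a mapping of resource type -> list of filters."""
    # One _type parameter carrying every resource type, comma-separated.
    pairs = [("_type", ",".join(filters_by_type))]
    # One _typeFilter parameter instance per filter.
    for filters in filters_by_type.values():
        pairs.extend(("_typeFilter", search_url) for search_url in filters)
    return urlencode(pairs)


query = build_multi_type_query(
    {
        "Patient": ["Patient?_id=Patient/123"],
        "ExplanationOfBenefit": ["ExplanationOfBenefit?patient=Patient/123"],
    }
)
print("/$export?" + query)
```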

31.2.5.2 Initiating Using HTTP POST

The following example shows a Parameters resource that can be used to initiate a Bulk Export using an HTTP POST.

{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "_outputFormat",
      "valueString": "application/fhir+ndjson"
    },
    {
      "name": "_type",
      "valueString": "Patient, Practitioner"
    },
    {
      "name": "_since",
      "valueInstant": "2019-10-25T11:01:45.660-04:00"
    },
    {
      "name": "_typeFilter",
      "valueString": "Patient?identifier=foo"
    }
  ]
}

The complete HTTP Request is shown below:

POST /$export
Prefer: respond-async
Content-Type: application/fhir+json

{ ...payload... }
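
A client could assemble the same Parameters payload and POST request programmatically. The sketch below uses only the Python standard library and builds the request object without sending it; the helper names build_export_parameters and build_export_request are hypothetical, and authentication headers are omitted.

```python
import json
from urllib.request import Request


def build_export_parameters(types, since=None, type_filters=(),
                            output_format="application/fhir+ndjson"):
    """Build a FHIR Parameters resource for a $export kick-off request."""
    parameter = [
        {"name": "_outputFormat", "valueString": output_format},
        {"name": "_type", "valueString": ",".join(types)},
    ]
    if since is not None:
        # _since is an instant, so it uses valueInstant rather than valueString.
        parameter.append({"name": "_since", "valueInstant": since})
    for search_url in type_filters:
        parameter.append({"name": "_typeFilter", "valueString": search_url})
    return {"resourceType": "Parameters", "parameter": parameter}


def build_export_request(base_url, parameters):
    """Wrap the Parameters resource in an (unsent) HTTP POST request."""
    return Request(
        f"{base_url}/$export",
        data=json.dumps(parameters).encode("utf-8"),
        headers={
            "Prefer": "respond-async",
            "Content-Type": "application/fhir+json",
        },
        method="POST",
    )


payload = build_export_parameters(
    ["Patient", "Practitioner"],
    since="2019-10-25T11:01:45.660-04:00",
    type_filters=["Patient?identifier=foo"],
)
request = build_export_request("https://fhir.example.com:8000", payload)
```

Passing the request to urllib.request.urlopen would submit it; that step is left out here so the sketch has no network dependency.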

31.2.5.3 Initiation Response

Once the server has processed the request, it will return a response similar to the following:

HTTP/1.1 202 Accepted
Content-Location: https://fhir.example.com:8000/$export-poll-status?_jobId=0000000-1111111-2222222

The status of 202 means that the server has successfully received the request and has scheduled a background job that will assemble the Bulk Export payload. The Content-Location header indicates a URL that the client may use to request an update on the status of the background job.

If an identical job has been requested in the last hour, the server will not start a new job and will instead return the URL of the previously stored job. To override this behavior and force a new job to start, send the request header Cache-Control: no-cache. This forces a new job regardless of any duplicate jobs in the system.
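
On the client side, handling the initiation response amounts to checking for the 202 status and reading the Content-Location header. The following is a minimal sketch under that assumption; both function names are hypothetical, and the response is represented as a plain status code and header dictionary rather than any particular HTTP library's response type.

```python
def kickoff_headers(force_new_job=False):
    """Return request headers for a kick-off, optionally bypassing job reuse."""
    headers = {"Prefer": "respond-async"}
    if force_new_job:
        # Asks the server to start a new job even if an identical one exists.
        headers["Cache-Control"] = "no-cache"
    return headers


def polling_url_from_kickoff(status, headers):
    """Extract the status-polling URL from a kick-off response."""
    if status != 202:
        raise ValueError(f"expected 202 Accepted, got {status}")
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    if "content-location" not in normalized:
        raise ValueError("202 response did not include Content-Location")
    return normalized["content-location"]


poll_url = polling_url_from_kickoff(
    202,
    {"Content-Location": "https://fhir.example.com:8000/$export-poll-status?_jobId=0000000-1111111-2222222"},
)
```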

31.2.6 Authorization for Bulk Export
Trial

 

A user must have appropriate permissions in order to initiate a bulk export job. For bulk export jobs, often the user is actually a Client that has authenticated using the Client Credentials with JWT Credential system flow, but it can also be a regular user.

If the user/client has authorized using SMART on FHIR, the approved scopes must also permit the requested export.

The following permissions can be used:

  • FHIR_OP_INITIATE_BULK_DATA_EXPORT – This permission allows the user to initiate a bulk export of any kind (system/group/patient).

    • If the permission has no argument, this permission implies that the user can initiate a bulk export for any or all resource types in the system. This is a very powerful permission and should not be used unless you intend to grant the user virtually unlimited access to the data in the repository.
    • The permission may also take an argument, which is a space-separated list of resource types the user is allowed to request a bulk export for. For example, if the user has the permission FHIR_OP_INITIATE_BULK_DATA_EXPORT/QuestionnaireResponse Questionnaire the user will only be allowed to initiate bulk exports for QuestionnaireResponse and Questionnaire resources.
  • FHIR_OP_INITIATE_BULK_DATA_EXPORT_SYSTEM – This permission allows the user to initiate a system-level bulk export (i.e. a bulk export that is not limited in scope to data about any one patient or group)

    • If the permission has no argument, this permission implies that the user can initiate a bulk export for any or all resource types in the system. This is a very powerful permission and should not be used unless you intend to grant the user virtually unlimited access to the data in the repository.
    • The permission may also take an argument, which is a space-separated list of resource types the user is allowed to request a bulk export for. For example, if the user has the permission FHIR_OP_INITIATE_BULK_DATA_EXPORT_SYSTEM/QuestionnaireResponse Questionnaire the user will only be allowed to initiate bulk exports for QuestionnaireResponse and Questionnaire resources.
  • FHIR_OP_INITIATE_BULK_DATA_EXPORT_GROUP – This permission allows the user to initiate a Group-Level bulk export, meaning an export that contains only files relating to a specific Group resource and its members.

    • This permission must have an argument, containing at a minimum the ID of the Group to export. For example, the permission FHIR_OP_INITIATE_BULK_DATA_EXPORT_GROUP/Group/123 allows the user to initiate a group-level bulk export for the group Group/123.
    • The argument may start with the group ID and optionally be followed by a space-separated list of resource types the user is allowed to request as a part of the export. For example, the permission FHIR_OP_INITIATE_BULK_DATA_EXPORT_GROUP/Group/123 Patient Observation allows the user to initiate a group-level export, but disallows the user from requesting resource types other than Patient or Observation as a part of that request.
  • FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENT – This permission allows the user to initiate a Patient-Level bulk export, meaning an export that contains only files relating to a specific Patient resource.

    • This permission must have an argument, containing at a minimum the ID of the Patient to export. For example, the permission FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENT/Patient/123 allows the user to initiate a patient-level bulk export for the patient Patient/123.
    • The argument may start with the patient ID and optionally be followed by a space-separated list of resource types the user is allowed to request as a part of the export. For example, the permission FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENT/Patient/123 Patient Observation allows the user to initiate a patient-level export, but disallows the user from requesting resource types other than Patient or Observation as a part of that request.
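
To make the argument format for the group-level and patient-level permissions concrete, the sketch below models how such an argument could be interpreted. It is an illustration of the rules described above, not Smile CDR's implementation; the function names are hypothetical, and it assumes that an argument with no resource type list places no restriction on the requested types.

```python
def parse_scoped_argument(argument):
    """Split 'Group/123 Patient Observation' into its target ID and type set."""
    target, *resource_types = argument.split()
    return target, set(resource_types)


def may_export(argument, target, requested_types):
    """Check a requested export against a group/patient permission argument."""
    allowed_target, allowed_types = parse_scoped_argument(argument)
    if target != allowed_target:
        return False
    # Assumption: an argument without a type list does not restrict types.
    return not allowed_types or set(requested_types) <= allowed_types


print(may_export("Group/123 Patient Observation", "Group/123", ["Patient"]))
print(may_export("Group/123 Patient Observation", "Group/123", ["Encounter"]))
```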

31.2.6.1 Approved SMART Scopes

If the user/client has authenticated using SMART on FHIR, the approved scopes will also be factored into the resource types that are permitted to be requested via bulk export.

For example, suppose a user has the following permission:

FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENT/Patient/123 Patient Observation Encounter

And the client has approved the following scopes:

system/Patient.read system/Observation.read

In this case, the user will be allowed to request a bulk export for the Patient and Observation resources, but will be blocked from requesting a bulk export containing Encounter resources.
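
The outcome of this example is effectively a set intersection between the resource types in the permission and the resource types named in the approved scopes. The sketch below illustrates that reasoning only; the function names are hypothetical, and wildcard scopes and finer-grained scope constraints are deliberately not handled.

```python
def types_from_scopes(scopes):
    """Collect resource types from scopes such as 'system/Patient.read'."""
    types = set()
    for scope in scopes.split():
        # "system/Patient.read" -> "Patient"; scopes without "/" are skipped.
        _, _, remainder = scope.partition("/")
        resource_type, _, _ = remainder.partition(".")
        if resource_type:
            types.add(resource_type)
    return types


def exportable_types(permitted_types, scopes):
    """Return the types allowed by both the permission and the scopes."""
    return set(permitted_types) & types_from_scopes(scopes)


allowed = exportable_types(
    {"Patient", "Observation", "Encounter"},
    "system/Patient.read system/Observation.read",
)
print(sorted(allowed))
```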

If the client scopes include filter constraints (see SMART finer-grained resource constraints) such as patient/Observation?category=vital-signs, then the export contents will be restricted to match.

31.2.7 Polling for Job Status
Trial

 
Polling for Bulk Export job status requires the FHIR_OP_INITIATE_BULK_DATA_EXPORT permission.

Once a job has been initiated, the client may poll the server to request information about the status. This is done by requesting an HTTP GET for the URL specified in the Content-Location header in the previous step.

GET https://fhir.example.com:8000/$export-poll-status?_jobId=0000000-1111111-2222222

The server will return a response similar to the following while the job is still being built.

HTTP/1.1 202 Accepted
X-Progress: Build in progress - Status set to BUILDING at 2019-10-25T11:27:16.763-04:00
Retry-After: 120
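
A polling client typically waits for the interval suggested by Retry-After and stops as soon as the status changes from 202. The sketch below shows one way to structure such a loop. The function poll_until_complete and its parameters are hypothetical, and the HTTP call is injected as a callable returning (status, headers, body) so the loop is independent of any HTTP library.

```python
import time


def poll_until_complete(fetch_status, sleep=time.sleep,
                        default_delay=120, max_polls=100):
    """Poll until the job finishes and return the final response body."""
    for _ in range(max_polls):
        status, headers, body = fetch_status()
        if status == 200:
            return body
        if status != 202:
            raise RuntimeError(f"unexpected polling status {status}")
        # Retry-After may also be an HTTP date; this sketch only handles
        # the delay-in-seconds form and otherwise uses default_delay.
        retry_after = headers.get("Retry-After", "")
        delay = int(retry_after) if retry_after.isdigit() else default_delay
        sleep(delay)
    raise TimeoutError("export did not complete within max_polls attempts")
```

Injecting sleep as a parameter keeps the loop easy to exercise in tests without real delays.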

When the server has completed assembling the export payload, the same polling request will return a response similar to the following. The response payload is a list of files that were assembled by the Binary export job. If no URLs are returned, it means that no resources met the request parameter(s) criteria. Note that the payloads are assembled using Binary resources, so the client will need to have appropriate permission on the server to download Binary resources in order to be able to access the export contents.

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 311 

{
  "transactionTime" : "2019-10-25T11:31:22.487-04:00",
  "output" : [ {
    "type" : "Patient",
    "url" : "https://fhir.example.com:8000/Binary/111"
  }, {
    "type" : "Patient",
    "url" : "https://fhir.example.com:8000/Binary/222"
  }, {
    "type" : "Patient",
    "url" : "https://fhir.example.com:8000/Binary/333"
  } ]
}
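
Because a single resource type can be spread over several files, a client may find it convenient to group the returned URLs by type before downloading. A minimal sketch, assuming the response body has the shape shown above (urls_by_type is a hypothetical helper):

```python
import json
from collections import defaultdict


def urls_by_type(manifest_json):
    """Group the Binary URLs in a completed-export response by resource type."""
    manifest = json.loads(manifest_json)
    grouped = defaultdict(list)
    # An absent or empty "output" list means no resources matched the request.
    for entry in manifest.get("output", []):
        grouped[entry["type"]].append(entry["url"])
    return dict(grouped)


example_manifest = """
{
  "transactionTime": "2019-10-25T11:31:22.487-04:00",
  "output": [
    {"type": "Patient", "url": "https://fhir.example.com:8000/Binary/111"},
    {"type": "Patient", "url": "https://fhir.example.com:8000/Binary/222"}
  ]
}
"""
files = urls_by_type(example_manifest)
```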

31.2.8 Accessing Bulk Export Files

 
Bulk export files are stored internally as Binary resources. Users therefore need an appropriate set of permissions allowing them to read Binary resources by ID.

When the polling process has completed, the FHIR endpoint will respond with a list of Binary file URLs that can be used to access the NDJSON data gathered as a part of the export.

If Restrict Download to Initiating User has been enabled on the FHIR Endpoint module (this setting defaults to being enabled) then only the same user on the same node will be able to access the generated Binary files.
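
Once a file has been downloaded, its NDJSON content holds one JSON-encoded resource per line. The following sketch parses such content into Python dictionaries; parse_ndjson is a hypothetical helper, the sample resources are invented for illustration, and the download step (including authentication) is left out.

```python
import json


def parse_ndjson(text):
    """Parse NDJSON text into a list of resources, skipping blank lines."""
    resources = []
    for line in text.split("\n"):
        line = line.strip()
        if line:
            resources.append(json.loads(line))
    return resources


sample = (
    '{"resourceType": "Patient", "id": "1"}\n'
    '{"resourceType": "Patient", "id": "2"}\n'
)
patients = parse_ndjson(sample)
print([p["id"] for p in patients])
```

For very large files, iterating over the response line by line instead of loading the whole body into memory would be the more economical design.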