The FHIR Bulk Data Access protocol describes a mechanism for efficiently requesting large amounts of data in a convenient format. This protocol can be used to request all data on a server, or a specific subset that is useful for a given use case.
Bulk export support is configured on the FHIR Storage module via the bulk_export.enabled property.
Note that Bulk Export file generation can involve the creation of many large Binary resources, so it is recommended to enable Externalized Binary Storage when using this feature.
The Maximum Bulk Export File Capacity property may be used to control the size of the generated files.
The FHIR Bulk Data Access specification describes several invocation styles. The following table outlines currently supported functionality.
Name | Description | URL Syntax | Supported |
---|---|---|---|
Endpoint - Group of Patients | Export a detailed set of FHIR resources of diverse resource types pertaining to all patients in the specified Group. | [fhir base]/Group/[id]/$export | |
Endpoint - System Level Export | Export data from a FHIR server, whether or not it is associated with a patient. | [fhir base]/$export | |
Endpoint - All Patients | Export a detailed set of FHIR resources of diverse resource types pertaining to all patients. | [fhir base]/Patient/$export | |
Name | Optional? | Multiple Allowed? | Description | Example | Supported In |
---|---|---|---|---|---|
_outputFormat | Yes | No | Specifies the output encoding style that should be used. Currently only application/fhir+ndjson is supported. | application/fhir+ndjson | Group, Patient, System |
_type | Yes | No | Specifies a comma-separated list of resource types to include. If this parameter is absent, all resource types the user has authorization to access, except Binary, will be exported (provided they were created/updated in the past 24 hours - see `_since` description below). The Binary type must not be included in this list. Note that if you are doing a Patient or Group Bulk Export, only resource types which refer to patients via some search parameter can be used here. | Patient, Practitioner | Group, Patient, System |
_since | Yes | No | Only resources that were last updated on or after the given time will be included. If this parameter is absent only resources created or updated in the past 24 hours will be exported. | 2019-10-25T11:14:00Z | Group, Patient, System |
_typeFilter | Yes | Yes | Specifies a search URL that can be used to narrow the results for one or more of the resource types being exported. Can be used in conjunction with the _type parameter to limit the resource types as well as to specific search sets within one or more of the resource types. To support multiple typeFilters, separate them by a comma. See the FHIR specification for more details. | Patient?identifier=foo,Practitioner?name=jim | Group, Patient, System |
_mdm | Yes | No | Specifies whether you want to perform MDM expansion on the group members. For example, if Patient/1 is a member of a group, and they are MDM Matched to Patient/2 who is not in the group, the results for Patient/2 will be exported alongside the results for Patient/1. Setting this to true will also include any resources which refer to the Golden Resource for any members. | true | Group |
_typePostFetchFilterUrl | Yes | Yes | Specifies a search URL that will be applied by an In-Memory matcher against resources after they have been fetched from the database. This can be used to efficiently filter out results in cases where very large numbers of resources are being returned from the database, and complex search parameters such as very long lists of tokens are being used for filters. Note that in many cases it is much more efficient to leverage the database, so putting filters in this parameter should be tested and compared against the standard _typeFilter parameter. Values in this field must take the form [resourceType]?[parameters]. If any values of this parameter are provided for a given resource type being exported, then any candidate files to be included in the bulk export will only be included if they are matched by at least one _typePostFetchFilterUrl filter. | Patient?identifier=foo,Practitioner?name=jim | Group, Patient, System |
_exportId | Yes | No | When Bulk Export generates Binary resources, if an `_exportId` is specified, it will be included in the `binary.meta.extension` field. This can be used to correlate the Binary resources with the original request after the fact. | my-patient-export | Group, Patient, System |
This section describes the user flow when requesting a bulk export.
First, a user invokes the $export operation to initiate the request. This can be done using either an HTTP GET request with the export request parameters included in the URL, or an HTTP POST request with the export request parameters included in a FHIR Parameters resource.
The following example shows a URL that can be used to initiate a Bulk Export using an HTTP GET. For readability, the URL is split over multiple lines but it would not be in a real invocation.
https://fhir.example.com:8000/$export
The complete HTTP Request is shown below:
GET /$export?[params]
Prefer: respond-async
A similar Group Bulk Export Request can be seen below:
GET /Group/123/$export?_type=Immunization
Prefer: respond-async
The following example fetches all Immunizations within a Group that have a vaccine-code of COVID-19:
GET /Group/123/$export?_type=Immunization&_typeFilter=Immunization%3Fvaccine-code%3DCOVID-19
Prefer: respond-async
To request multiple resource types, use a comma-separated list for the _type parameter but use multiple instances of the _typeFilter parameter:
GET /$export?_type=Patient,ExplanationOfBenefit&_typeFilter=Patient?_id=Patient/123&_typeFilter=ExplanationOfBenefit?patient=Patient/123
Prefer: respond-async
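The GET examples above can be assembled programmatically. The following is a minimal sketch in Python, not part of the product: the helper name `build_export_url` is hypothetical, and it assumes the caller passes the FHIR base URL (optionally followed by /Group/[id] or /Patient). It percent-encodes each `_typeFilter` value and emits one `_typeFilter` instance per filter.

```python
from urllib.parse import urlencode


def build_export_url(base_url, types=None, type_filters=None, since=None):
    """Return a $export kickoff URL with percent-encoded query parameters.

    base_url is the FHIR base, optionally followed by /Group/[id] or /Patient.
    This helper is illustrative only; its name and signature are assumptions.
    """
    params = []
    if types:
        # _type takes a single comma-separated list of resource types.
        params.append(("_type", ",".join(types)))
    for type_filter in type_filters or []:
        # Each filter is sent as its own _typeFilter instance.
        params.append(("_typeFilter", type_filter))
    if since:
        params.append(("_since", since))
    # safe="," keeps the commas in _type readable; "?" and "=" inside
    # filter values are still percent-encoded.
    query = urlencode(params, safe=",")
    url = base_url.rstrip("/") + "/$export"
    return f"{url}?{query}" if query else url


print(build_export_url(
    "https://fhir.example.com:8000/Group/123",
    types=["Immunization"],
    type_filters=["Immunization?vaccine-code=COVID-19"],
))
```

The printed URL has the same query string as the Group example above. The request must still carry the `Prefer: respond-async` header.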
The following example shows a Parameters resource that can be used to initiate a Bulk Export using an HTTP POST.
{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "_outputFormat",
      "valueString": "application/fhir+ndjson"
    },
    {
      "name": "_type",
      "valueString": "Patient, Practitioner"
    },
    {
      "name": "_since",
      "valueInstant": "2019-10-25T11:01:45.660-04:00"
    },
    {
      "name": "_typeFilter",
      "valueString": "Patient?identifier=foo"
    }
  ]
}
The complete HTTP Request is shown below:
POST /$export
Prefer: respond-async
Content-Type: application/fhir+json
{ ...payload... }
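As an illustration of the POST style, the sketch below builds the Parameters payload in Python. The function name `build_export_parameters` is hypothetical. The parameter names and value types (`valueString`, `valueInstant`) follow the example payload shown above.

```python
import json


def build_export_parameters(output_format=None, types=None, since=None,
                            type_filters=None):
    """Build a FHIR Parameters resource for a POST-style $export kickoff.

    Illustrative helper only; the name and signature are assumptions.
    """
    parameter = []
    if output_format:
        parameter.append({"name": "_outputFormat", "valueString": output_format})
    if types:
        # Resource types are sent as one comma-separated string.
        parameter.append({"name": "_type", "valueString": ",".join(types)})
    if since:
        parameter.append({"name": "_since", "valueInstant": since})
    for type_filter in type_filters or []:
        # One parameter entry per filter.
        parameter.append({"name": "_typeFilter", "valueString": type_filter})
    return {"resourceType": "Parameters", "parameter": parameter}


payload = build_export_parameters(
    output_format="application/fhir+ndjson",
    types=["Patient", "Practitioner"],
    since="2019-10-25T11:01:45.660-04:00",
    type_filters=["Patient?identifier=foo"],
)
print(json.dumps(payload, indent=2))
```

The serialized payload would be sent as the body of the POST request, with `Content-Type: application/fhir+json` and `Prefer: respond-async`.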
Once the server has processed the request, it will return a response similar to the following:
HTTP/1.1 202 Accepted
Content-Location: https://fhir.example.com:8000/$export-poll-status?_jobId=0000000-1111111-2222222
The status of 202 means that the server has successfully received the request and has scheduled a background job that will assemble the Bulk Export payload. The Content-Location header indicates a URL that may be used by the client to request an update on the status of the background job. If an identical job has been requested in the last hour, the server returns the URL of the previously stored job instead of starting a new one. To override this and force a new job to start, set the following request header: Cache-Control: no-cache. This forces a new job regardless of duplicate jobs in the system.
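The kickoff exchange can be sketched on the client side as two small Python helpers. Both function names are hypothetical, and the code makes no network calls. It only shows which request headers are involved and how the polling URL is read from the 202 response.

```python
def kickoff_headers(force_new_job=False):
    """Return request headers for a $export kickoff (illustrative helper)."""
    headers = {"Prefer": "respond-async"}
    if force_new_job:
        # Bypass reuse of an identical job requested within the last hour.
        headers["Cache-Control"] = "no-cache"
    return headers


def poll_url_from_kickoff(status, headers):
    """Extract the status-polling URL from a kickoff response."""
    if status != 202:
        raise ValueError(f"Expected 202 Accepted, got {status}")
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    location = normalized.get("content-location")
    if not location:
        raise ValueError("Kickoff response has no Content-Location header")
    return location


print(kickoff_headers(force_new_job=True))
print(poll_url_from_kickoff(202, {
    "Content-Location":
        "https://fhir.example.com:8000/$export-poll-status?_jobId=0000000-1111111-2222222",
}))
```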
A user must have appropriate permissions in order to initiate a bulk export job. For bulk export jobs, the user is often a Client that has authenticated using the Client Credentials with JWT Credential system flow, but it can also be a regular user.
If the user has authorized using SMART on FHIR, the approved scopes must also be appropriate.
The following permissions can be used:
- FHIR_OP_INITIATE_BULK_DATA_EXPORT – This permission allows the user to initiate a bulk export of any kind (system/group/patient).
  - With `FHIR_OP_INITIATE_BULK_DATA_EXPORT/QuestionnaireResponse Questionnaire`, the user will only be allowed to initiate bulk exports for QuestionnaireResponse and Questionnaire resources.
- FHIR_OP_INITIATE_BULK_DATA_EXPORT_SYSTEM – This permission allows the user to initiate a system-level bulk export (i.e. a bulk export that is not limited in scope to data about any one patient or group).
  - With `FHIR_OP_INITIATE_BULK_DATA_EXPORT_SYSTEM/QuestionnaireResponse Questionnaire`, the user will only be allowed to initiate bulk exports for QuestionnaireResponse and Questionnaire resources.
- FHIR_OP_INITIATE_BULK_DATA_EXPORT_GROUP – This permission allows the user to initiate a Group-Level bulk export, meaning an export that contains only files relating to a specific Group resource and its members.
  - `FHIR_OP_INITIATE_BULK_DATA_EXPORT_GROUP/Group/123` allows the user to initiate a group-level bulk export for the group Group/123.
  - `FHIR_OP_INITIATE_BULK_DATA_EXPORT_GROUP/Group/123 Patient Observation` allows the user to initiate a group-level export, but disallows the user from requesting resource types other than Patient or Observation as a part of that request.
- FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENTS – This permission allows the user to initiate a Patient-Level bulk export, meaning an export that contains only files relating to one or more specific Patient resources.
  - The argument ends with `*` for all resources or the specific list of permitted resources. Multiple IDs should be comma-separated.
  - For example, the permission `FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENTS/Patient/a,Patient/b *` allows the user to initiate a patient-level bulk export for the patients Patient/a and Patient/b for all resources.
  - `FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENTS/Patient/a,Patient/b Patient Group` allows the user to initiate a patient-level export, but disallows the user from requesting resource types other than Patient or Group as a part of that request.
- FHIR_OP_INITIATE_BULK_DATA_EXPORT_ALL_PATIENTS – This permission allows the user to initiate a Patient-Level bulk export with NO restrictions and NO arguments. This permission should take the form of `FHIR_OP_INITIATE_BULK_DATA_EXPORT_ALL_PATIENTS`.
- FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENT – Deprecated: Users should use FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENTS or FHIR_OP_INITIATE_BULK_DATA_EXPORT_ALL_PATIENTS instead. This permission allows the user to initiate a Patient-Level bulk export, meaning an export that contains only files relating to a specific Patient resource.
  - `FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENT/Patient/123` allows the user to initiate a patient-level bulk export for the patient Patient/123.
  - `FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENT/Patient/123 Patient Observation` allows the user to initiate a patient-level export, but disallows the user from requesting resource types other than Patient or Observation as a part of that request.

If the user/client has authenticated using SMART on FHIR, the approved scopes will also be factored into the resource types that are permitted to be requested via bulk export.
For example, suppose a user has the following permission:
FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENT/Patient/123 Patient Observation Encounter
And the client has approved the following scopes:
system/Patient.read system/Observation.read
In this case, the user will be allowed to request a bulk export for the Patient and Observation resources, but will be blocked from requesting a bulk export containing Encounter resources.
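The rule in this example amounts to intersecting two sets of resource types. The Python sketch below only illustrates that reasoning and is not the server's implementation. The function name is hypothetical, and the parsing assumes a permission string of the form shown above and simple scopes of the form `context/ResourceType.access`. It does not handle the `*` wildcard.

```python
def permitted_export_types(permission, scopes):
    """Return resource types allowed by both the permission and the scopes.

    Illustrative only: assumes the permission argument lists resource types
    after the first space, and each scope looks like "system/Patient.read".
    """
    # Everything after the first whitespace-separated token is a resource type.
    _, *permission_types = permission.split()
    scope_types = set()
    for scope in scopes.split():
        # "system/Patient.read" -> "Patient.read" -> "Patient"
        _, _, remainder = scope.partition("/")
        scope_types.add(remainder.split(".")[0].split("?")[0])
    return set(permission_types) & scope_types


allowed = permitted_export_types(
    "FHIR_OP_INITIATE_BULK_DATA_EXPORT_PATIENT/Patient/123 Patient Observation Encounter",
    "system/Patient.read system/Observation.read",
)
print(sorted(allowed))
```

Encounter is absent from the result because no approved scope covers it, which matches the outcome described above.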
If the client scopes include filter constraints (see SMART finer-grained resource constraints) such as patient/Observation?category=vital-signs, then the export contents will be restricted to match.
Once a job has been initiated, the client may poll the server to request information about the status. This is done by requesting an HTTP GET for the URL specified in the Content-Location header in the previous step.
GET https://fhir.example.com:8000/$export-poll-status?_jobId=0000000-1111111-2222222
While the job is still being built, the server will return a response similar to the following.
HTTP/1.1 202 Accepted
X-Progress: Build in progress - Status set to BUILDING at 2019-10-25T11:27:16.763-04:00
Retry-After: 120
When the server has completed assembling the export payload, the same polling request will return a response similar to the following. The response payload is a list of files that were assembled by the Bulk Export job. If no URLs are returned, no resources met the criteria in the request parameters. Note that the payloads are assembled using Binary resources, so the client needs appropriate permission on the server to download Binary resources in order to access the export contents.
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 311
{
"transactionTime" : "2019-10-25T11:31:22.487-04:00",
"output" : [ {
"type" : "Patient",
"url" : "https://fhir.example.com:8000/Binary/111"
}, {
"type" : "Patient",
"url" : "https://fhir.example.com:8000/Binary/222"
}, {
"type" : "Patient",
"url" : "https://fhir.example.com:8000/Binary/333"
} ]
}
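The polling flow (202 while building, 200 when complete) can be sketched as a loop. This is a minimal Python illustration under stated assumptions. `fetch_status` is a hypothetical callable supplied by the caller that performs the HTTP GET and returns `(status, headers, parsed_body)`. `Retry-After` is assumed to be a number of seconds, as in the example above.

```python
import time


def poll_until_complete(fetch_status, poll_url, max_attempts=10,
                        default_delay=5.0, sleep=time.sleep):
    """Poll the status URL until the export manifest is available.

    fetch_status(poll_url) must return (status, headers, parsed_body);
    it is injected so this sketch stays independent of any HTTP library.
    """
    for _ in range(max_attempts):
        status, headers, body = fetch_status(poll_url)
        if status == 200:
            # Job finished: the body is the manifest listing Binary URLs.
            return body
        if status != 202:
            raise RuntimeError(f"Unexpected polling status: {status}")
        # Still building: wait as long as the server asks before retrying.
        delay = float(headers.get("Retry-After", default_delay))
        sleep(delay)
    raise TimeoutError("Export did not complete within the allowed attempts")
```

Passing the fetch and sleep functions as arguments keeps the loop easy to test. A real client would supply a function that issues the GET request with its own authentication.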
When the polling process has completed, the FHIR endpoint will respond with a list of Binary file URLs that can be used to access the NDJSON data gathered as a part of the export.
If Restrict Download to Initiating User has been enabled on the FHIR Endpoint module (this setting defaults to being enabled) then only the same user on the same node will be able to access the generated Binary files.
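Once the manifest is available, a client typically groups the Binary URLs by resource type and parses each downloaded file as NDJSON (one JSON resource per line). The following Python sketch shows both steps without any network access. The function names are hypothetical, and the sample NDJSON text is made up for illustration.

```python
import json


def urls_by_type(manifest):
    """Group the manifest's output URLs by resource type."""
    grouped = {}
    for entry in manifest.get("output", []):
        grouped.setdefault(entry["type"], []).append(entry["url"])
    return grouped


def parse_ndjson(text):
    """Parse NDJSON text into a list of resources, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


manifest = {
    "transactionTime": "2019-10-25T11:31:22.487-04:00",
    "output": [
        {"type": "Patient", "url": "https://fhir.example.com:8000/Binary/111"},
        {"type": "Patient", "url": "https://fhir.example.com:8000/Binary/222"},
    ],
}
print(urls_by_type(manifest))

# Hypothetical file contents; a real file is downloaded from a Binary URL.
sample = '{"resourceType": "Patient", "id": "a"}\n{"resourceType": "Patient", "id": "b"}\n'
print([resource["id"] for resource in parse_ndjson(sample)])
```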