5.14 Binary Data

In many scenarios, resources such as DocumentReference are used to store large files such as scanned PDFs and images. These resources use the Attachment datatype, which stores a content type and a base64-encoded representation of the binary content.

For large files, base64 encoding inflates the payload by roughly one third, and placing these binary attachments inline within a FHIR resource can be cumbersome for clients.
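To make the size cost concrete, the following minimal sketch measures the base64 overhead for a payload. The 3 MB random payload is a hypothetical stand-in for a scanned document and is not taken from Smile CDR itself.

```python
import base64
import os

# Hypothetical 3 MB payload standing in for a scanned PDF or image.
raw = os.urandom(3_000_000)

# Base64 emits 4 output characters for every 3 input bytes.
encoded = base64.b64encode(raw)

overhead = len(encoded) / len(raw) - 1
print(f"raw: {len(raw)} bytes, base64: {len(encoded)} bytes, overhead: {overhead:.0%}")
```

The encoded form also has to be wrapped in JSON or XML and parsed by the client, which is part of why inline attachments become awkward as files grow.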

This page describes strategies for dealing with large binary content.

5.14.1 Binary Storage

Smile CDR can optionally be configured to move binary content into secondary storage that is better suited to large binary payloads than a relational database would be.

Configuring this setting away from the default is not always required. If your use cases don't involve storing lots of binary data, or if you will rarely be doing so, it may be less hassle to simply store the binary data in your database. On the other hand, if you need to store lots of binary data, a relational database is not always the most efficient way of doing this and you should consider alternatives.

Important Note on Scope: This feature is currently only used for Attachment data and Binary resource data submitted and retrieved via the Binary Access Operations described below. Over time we plan to add other storage options as well as expand the data that can be stored with binary storage.

The following sections describe options for binary storage:

Binary Storage Mode: Database (Default)

By default, all binary content is stored in the database, directly embedded within FHIR resources as base64-encoded content. This is an easy configuration to use, especially for testing setups.

Binary Storage Mode: Database Blob

In this configuration, binary data will be stored in the same relational database as other FHIR resource contents, but it will be stored separately in a BLOB column in the HFJ_BINARY_STORAGE_BLOB table.

Unlike the default Database mode, binary contents are not stored inline as base64-encoded content, and will generally be streamed directly to the database instead of being loaded into memory.

Binary Storage Mode: Filesystem

In Filesystem mode, individual files are used within a directory structure to store binary content. Each file is assigned a globally unique name upon creation, so it is fine to use a shared directory such as a network share, even if the directory is shared by multiple nodes in a cluster.

When setting up Filesystem based binary storage, the following settings apply:

  • binary_storage.filesystem.directory: This specifies the path (either absolute or relative to Smile CDR) that is used as the base path to store binary files. Smile CDR will create and manage a directory structure beneath this path.

5.14.2 Binary Access Operations

HAPI FHIR provides two custom FHIR operations that can be used to interact directly with binary content contained within resources such as DocumentReference. These operations can be used both to write and read back binary content.

These operations can be enabled/disabled using the Binary Access Operations Enabled property.

Note that these operations are subject to all of the same security restrictions as a standard FHIR read/write. In other words, a user needs to have appropriate write permissions to the DocumentReference resource in question in order to be able to write binary content within it, and a user needs to have appropriate read permissions in order to read binary content from it.

Binary Access Write Operation

Writing a binary payload to a FHIR Endpoint using the Binary Access Write Operation is a two-step process. First, the container resource must be created on the server, with a placeholder Attachment that will be populated afterward. Second, the Binary Access Write Operation is invoked to directly populate the content.

The following shows a simple example of a create for a DocumentReference with a placeholder Attachment. Note the almost empty attachment element that must be created in order to create a place for the attachment reference.

POST /DocumentReference
Content-Type: application/fhir+json

{
  "resourceType": "DocumentReference",
  "subject": {
    "reference": "Patient/123"
  },
  "content": [
    {
      "attachment": {
        "contentType": "image/jpeg"
      }
    }
  ]
}

The server will reply with a Location header containing the ID of the newly created resource.

Location: http://localhost:8000/DocumentReference/1623/_history/1
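A client typically needs to pull the resource type and logical ID out of that Location value before calling the next operation. The helper below is a minimal sketch of one way to do that; the function name resource_id_from_location is hypothetical, and it assumes the header follows the [base]/[type]/[id]/_history/[version] shape shown above.

```python
from urllib.parse import urlparse


def resource_id_from_location(location: str) -> tuple[str, str]:
    """Return (resource type, logical ID) from a FHIR Location header value."""
    segments = [s for s in urlparse(location).path.split("/") if s]
    # Drop a trailing "_history/<version>" pair if the server included one.
    if len(segments) >= 2 and segments[-2] == "_history":
        segments = segments[:-2]
    return segments[-2], segments[-1]


resource_type, resource_id = resource_id_from_location(
    "http://localhost:8000/DocumentReference/1623/_history/1"
)
print(resource_type, resource_id)
```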

This ID is then used in the Binary Access Write Operation to set the binary content. Note the path parameter, which specifies a FHIRPath expression to the attachment element within the DocumentReference resource. It is important to provide the appropriate content type via the Content-Type header in the operation HTTP request. Smile CDR does not validate this content type, but it will be faithfully preserved and returned if the payload is requested via the Binary Access Read operation.

POST /DocumentReference/1623/$binary-access-write?path=DocumentReference.content.attachment
Content-Type: image/png

(... binary content ...)
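The write request above could be assembled from a client roughly as follows. This is a sketch using only the Python standard library; the helper name build_binary_write_request, the base URL, and the placeholder payload are assumptions for illustration, and authentication is omitted.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_binary_write_request(base_url, resource_type, resource_id,
                               path, content_type, payload: bytes) -> Request:
    """Build (but do not send) a $binary-access-write request."""
    # The FHIRPath expression travels in the "path" query parameter.
    query = urlencode({"path": path})
    url = f"{base_url}/{resource_type}/{resource_id}/$binary-access-write?{query}"
    # The Content-Type header carries the content type of the raw payload.
    return Request(url, data=payload, method="POST",
                   headers={"Content-Type": content_type})


request = build_binary_write_request(
    "http://localhost:8000",                # assumed base URL
    "DocumentReference", "1623",
    "DocumentReference.content.attachment",
    "image/png",
    b"placeholder image bytes",             # hypothetical stand-in for real content
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it; the sketch stops short of that so it can be inspected without a running server.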

Binary Access Read Operation

The Binary Access Read Operation can be used to read back binary content from Attachment elements in a similar way to the write operation above.

The following example shows a read operation:

GET /DocumentReference/1623/$binary-access-read?path=DocumentReference.content.attachment

The server will then respond by serving the binary content with the correct Content-Type header.
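A client-side counterpart for the read operation might look like the sketch below. The helper names build_binary_read_request and fetch_attachment and the base URL are hypothetical; fetch_attachment is defined but not called here, since calling it requires a reachable server.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_binary_read_request(base_url, resource_type, resource_id, path) -> Request:
    """Build (but do not send) a $binary-access-read request."""
    query = urlencode({"path": path})
    url = f"{base_url}/{resource_type}/{resource_id}/$binary-access-read?{query}"
    return Request(url, method="GET")


def fetch_attachment(request: Request):
    """Send the request and return (content type, raw bytes)."""
    with urlopen(request) as response:
        return response.headers.get_content_type(), response.read()


request = build_binary_read_request(
    "http://localhost:8000",                # assumed base URL
    "DocumentReference", "1623",
    "DocumentReference.content.attachment",
)
print(request.get_method(), request.full_url)
```

Because the response body is the raw payload rather than a FHIR resource, the bytes can be written straight to disk or passed to an image or PDF viewer without any base64 decoding.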

5.14.3 Serving Raw Media Resources

The Media resource is used to store media such as photos in the FHIR repository. A Media resource has fields for storing metadata such as the subject of the media and the body site, but also has two primary fields for storing the media itself:

  • The Media.content.contentType field stores the MIME type of the media, e.g. image/png
  • The Media.content.data field stores the media itself

When retrieving the resource via a standard FHIR operation (e.g. a read or a search) the data is represented as base64 encoded data.
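Without raw serving, the client has to undo the base64 encoding itself. The sketch below illustrates that round trip with a hypothetical Media resource built in code; the eight-byte PNG signature stands in for a real image.

```python
import base64
import json

# Hypothetical stand-in for real image bytes: the 8-byte PNG file signature.
original = b"\x89PNG\r\n\x1a\n"

# What a standard FHIR read would return: the media as a base64 string.
media = {
    "resourceType": "Media",
    "content": {
        "contentType": "image/png",
        "data": base64.b64encode(original).decode("ascii"),
    },
}
wire = json.dumps(media)

# Client side: parse the resource, then decode the data element.
parsed = json.loads(wire)
decoded = base64.b64decode(parsed["content"]["data"])
print(parsed["content"]["contentType"], len(decoded), "bytes")
```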

If the Serve Raw Media Resources property is enabled, clients may request the raw contents of the Media resource.

Enabling this setting causes two things to happen:

  1. Raw content is served if the Accept header matches the content type exactly

For example, consider the following (abbreviated) Media resource:

{
  "resourceType": "Media",
  "id": "example999",
  "subject": { "reference": "Patient/123" },
  "content": {
    "contentType": "image/png",
    "data": "R0lGODlhfgCRAPcAAAAAAIAAAACAAICAAAAAgIAA"
  }
}

This resource will be served as a raw binary image if the following HTTP request is used:

GET /Media/example999
Accept: image/png

  2. Raw content is served if the client explicitly requests it

The _output parameter may be used with a value of data to indicate to the server that this resource should be served raw.

For example, the following request retrieves the resource above as raw content.

GET /Media/example999?_output=data
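The two ways of asking for raw content can be compared side by side in client code. This is a minimal sketch; the helper name raw_media_requests and the base URL are assumptions, and the requests are built but not sent.

```python
from urllib.request import Request


def raw_media_requests(base_url, media_id, content_type):
    """Return the two request variants that ask for raw Media content."""
    # Variant 1: the Accept header matches the stored content type exactly.
    by_accept = Request(f"{base_url}/Media/{media_id}",
                        headers={"Accept": content_type})
    # Variant 2: the _output=data parameter requests raw content explicitly.
    by_param = Request(f"{base_url}/Media/{media_id}?_output=data")
    return by_accept, by_param


by_accept, by_param = raw_media_requests(
    "http://localhost:8000",  # assumed base URL
    "example999",
    "image/png",
)
print(by_accept.full_url, by_accept.get_header("Accept"))
print(by_param.full_url)
```

The parameter form can be convenient where the client cannot control request headers, for example a plain hyperlink or an image tag in a web page.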