37.2.1HFQL SQL Syntax: Selects

 

This module uses a proprietary flavour of SQL that is specific to HAPI FHIR. This syntax is referred to as HFQL, the HAPI FHIR Query Language.

HFQL keywords are case-insensitive, although expressions referring to elements within FHIR resources are case-sensitive. All examples here use CAPITALIZED SQL keywords for easier readability but this is not required.

37.2.2Selecting Columns

 

Unlike RDBMS tables, FHIR resources are document based with many repeating elements. Simply selecting columns like in standard SQL can be complicated by this structure. In order to provide flexibility, SELECT expressions can be expressed as FHIRPath Expressions. The FROM keyword indicates which FHIR resource type to search for and return values from.

The following example searches for all Patient resources, and returns their respective ID, family name, and given name in 3 separate columns.

SELECT id, name[0].family, name[0].given[0]
FROM Patient

37.2.2.1Selecting Repeating Elements

One of the most common challenges when writing HFQL expressions is in dealing with repeating elements. This challenge arises because FHIR is a document format, and many elements can repeat. This naturally conflicts with the traditional RDBMS/SQL model where each column can hold a single value within a given row.

Consider the Patient.name element. This element can repeat within a document, and holds values of the FHIR general purpose datatype HumanName. The HumanName datatype in turn holds an element called family which is non-repeatable, and an element called given which is repeatable.

Because of this, the simple path Patient.name.given can potentially return multiple values, either because there are multiple name repetitions, or multiple given repetitions, or both. As a result, selecting a path which can potentially return multiple values will result in JSON output.

For example, consider the following example resource:

{
   "resourceType": "Patient",
   "name": [{
       "family": "Simpson",
       "given": ["Homer", "Jay"]
   }]
}

Selecting the path Patient.name.given will result in a column with the value ["Homer", "Jay"]. You can use array index markers if you want to explicitly select a specific repetition:

SELECT id, name[0].family, name[0].given[0]
FROM Patient

37.2.2.2Selecting Specific Repetitions

FHIRPath also supports various other mechanisms for picking exactly which leaf in a document you want to return. For example, the where operator could be used to pick a specific name regardless of which repetition it was found in.

SELECT id, name.where(use = 'maiden').first().family
FROM Patient

37.2.2.3Selecting Complex/Non-Primitive Objects

If your FHIRPath expression points to a non-primitive element within a FHIR resource, the element will be encoded using FHIR JSON encoding rules and returned as a STRING column value.

For example, consider the following query:

SELECT name
FROM Patient

The name element in FHIR is a complex object, so column values would resemble the following:

{"family":"Simpson","given":["Homer", "Jay"]}

37.2.3Column Name Aliases

 

If you want to control the column name for a given SELECT expression, you can use a standard SQL AS clause. For example:

SELECT id AS ID, 
       name[0].family AS FamilyName, 
       name[0].given[0] AS GivenName,
       name[0].given[1] AS MiddleName
FROM Patient

Alternately, you can also create an alias by prefixing the expression by an alias and a colon. For example:

SELECT ID: id, 
       FamilyName: name[0].family, 
       GivenName: name[0].given[0],
       MiddleName: name[0].given[1]
FROM Patient

37.2.4Column Data Types and Choice[x] Elements

 

The HFQL processor will always try to use the most appropriate SQL datatype for column values that it returns. It does this by inspecting the FHIRPath expression and trying to determine the FHIR datatype associated with the leaf node in the path.

In the case of choice[x] elements, where the datatype can be one of several possibilities, use the FHIRPath ofType(...) operator to select the type you want to return. This also acts as a hint to the HFQL parser about which SQL datatype should be returned. In the example below, only Observation values with a Quantity type are returned, and the Quantity.value element is returned. Because this element has a FHIR decimal datatype, this column is returned by the HFQL processor as a SQL Decimal value.

SELECT
   id, 
   value.ofType(Quantity).system AS Quantity_System,
   value.ofType(Quantity).unit   AS Quantity_Unit,
   value.ofType(Quantity).value  AS Quantity_Value
FROM
   Observation

If you expect different choice[x] types to be available in different resources being selected (e.g. if some Observations will have a valueQuantity but others will have a valueCoding) you may wish to include columns for each. An example is shown below. When using a construct like this one, the Coding_* columns will be NULL if the Quantity columns are not, and vice-versa which may make processing easier.

SELECT
   id,
   value.ofType(Quantity).system AS Quantity_System,
   value.ofType(Quantity).unit   AS Quantity_Unit,
   value.ofType(Quantity).value  AS Quantity_Value,
   value.ofType(Coding).system   AS Coding_System,
   value.ofType(Coding).code     AS Coding_Code,
   value.ofType(Coding).display  AS Coding_Display
FROM
   Observation

Of course, you can always simply include the choice element name as shown below. This will result in JSON content being emitted for whatever type is found.

SELECT
   id,
   value
FROM
   Observation

37.2.5Count and Group-By Expressions

 

The COUNT operator can be used to return counts of various items.

Be warned that this is potentially a very slow operation, since the HFQL executor needs to examine every single Patient resource in the database.

If you are returning any other SELECT expressions, you must also include those expressions in a corresponding GROUP BY clause.

The following example returns a table containing each distinct family name in the database, and the number of times it appears.

SELECT COUNT(*), name[0].family
FROM Patient
GROUP BY name[0].family

37.2.5.1Grouping Repeatable Elements

When queries with a GROUP BY expression containing a repeating element, each repetition will be counted once. For example, consider the following resource:

{
    "resourceType": "Patient",
    "name": [{
        "family": "Simpson",
        "given": ["Homer"]
    }, {
        "family": "Simpson",
        "given": ["Jay"]
    }]
}

Using the following query:

SELECT COUNT(*), name.family
FROM Patient
GROUP BY name.family

Results in the following values:

COUNT(*)name.given
2Simpson

37.2.6Ordering Results

 

The SQL ORDER BY keywords can be used to suggest a specific order for the search results.

Note that ordering will currently only work on statements that return 10,000 rows or less. This limitation may be removed in a future release.

The following statement searches for Patients with a family name of Simpson, counts the frequencies of their first names, and orders by the most frequent names.

SELECT name.given, count(*)
FROM Patient
WHERE family = 'Simpson'
GROUP BY name.given
ORDER BY count(*) DESC

37.2.7Select Data Manipulation and Concatenation

 

FHIRPath contains various helpful keywords that can be used to manipulate the data being returned.

A simple example showing string concatenation is shown below. Note the use of the [0] indexes on repeatable fields. Without these this expression will fail with a FHIRPath execution error if multiple names are present, as FHIRPath does not allow string concatenation using a list of strings on either side of the + operator.

SELECT 
    Patient.name[0].given[0] + ' ' + Patient.name[0].family AS FullName
FROM Patient

37.2.8Limiting Results

 

The SQL LIMIT clause can be used to restrict the number of results that can be fetched. For example:

SELECT
    id, name[0].family, name[0].given[0]
FROM
    Patient
LIMIT
    100