34.3.1 OpenTelemetry Integration
Trial

OpenTelemetry is a framework and toolkit for creating and managing telemetry data such as traces, metrics, and logs. Starting with the 2024.02 release, Smile CDR supports instrumentation with the OpenTelemetry Java agent. With this instrumentation enabled, Smile CDR generates telemetry data that can be used to monitor its performance.

Smile CDR currently generates traces and metrics using the OpenTelemetry Java agent. This feature is in its trial phase and consists of auto-instrumentation with a small number of Smile CDR-specific customizations. Auto-instrumentation with the OpenTelemetry Java agent generates telemetry data for the libraries and frameworks that the agent has built-in support for, which include many of the libraries and frameworks used or supported by Smile CDR.

As this integration is still in trial, the details of the generated data are subject to change. These details include names (such as trace span names and metric names), the trace structure, and the attributes exposed by trace spans and metrics. Feedback on this feature is welcome.

34.3.2 Observability Backends

To consume the OpenTelemetry data generated by Smile CDR, you need an observability backend. An observability backend collects and persists telemetry data and makes it available for monitoring (querying, visualizing, and alerting when the system is not performing as desired). There are many open source and commercial backends that support OpenTelemetry. Smile CDR does not recommend any particular observability backend because the choice depends on your needs, preferences, and deployment environment. Since OpenTelemetry is an open standard, you should be able to work with any backend that supports it.

Some example observability backends and environments that support OpenTelemetry are:

We also have a very basic otel-backend-starter project for learning purposes. This project provides a docker-compose setup to run Jaeger, Prometheus, and OpenTelemetry Collector locally.

34.3.3 Enabling OpenTelemetry Instrumentation in Smile CDR

To enable OpenTelemetry instrumentation in Smile CDR, you need to set an environment variable called CDR_OTEL when running Smile CDR.

For example,

CDR_OTEL=y bin/smilecdr start

The value of the CDR_OTEL variable has no significance; instrumentation is enabled as long as the variable is set. When this variable is set, the Smile CDR process is auto-instrumented with the OpenTelemetry Java agent. A version of the OpenTelemetry Java agent is bundled with the Smile CDR release, so there is no need to download it separately.

34.3.4 Agent Configuration

By default, the Java agent is configured using the following properties file:

otel.service.name=smilecdr
# Send OTEL logging to slf4j so that javaagent logs get logged in smile.log
otel.javaagent.logging=application
# Capture X-Request-ID as a http span attribute
otel.instrumentation.http.server.capture-response-headers=X-Request-ID
# Use stable semantic conventions for http spans
otel.semconv-stability.opt-in=http
# Disable exporting logs by default, you can set environment variable OTEL_LOGS_EXPORTER to 'otlp' to enable it
otel.logs.exporter=none
# Logback appender related options below take effect only if exporting logs is enabled.
# Enable the capture of experimental log attributes 'thread.name' and 'thread.id'
otel.instrumentation.logback-appender.experimental-log-attributes=true
# add all mdc attributes to exported log record, these include 'requestId' and 'moduleId'
otel.instrumentation.logback-appender.experimental.capture-mdc-attributes=*

If you would like to use your own agent configuration file instead of this default configuration, you need to set the OTEL_JAVAAGENT_CONFIGURATION_FILE environment variable to specify the path to your agent configuration file. For example,

CDR_OTEL=y OTEL_JAVAAGENT_CONFIGURATION_FILE=<path_to_your_agent_config_file> bin/smilecdr start

Alternatively, you can override or set individual configuration options for the agent using other OpenTelemetry Java environment variables.
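The environment variable names follow the convention described in the OpenTelemetry Java agent documentation: the property name is upper-cased and every . and - is replaced with _. The Python sketch below illustrates that mapping for properties taken from the default configuration shown earlier. The helper name to_env_var is hypothetical and exists only for illustration; it is not part of Smile CDR or OpenTelemetry.

```python
def to_env_var(property_name: str) -> str:
    """Derive the environment variable name for an agent property.

    Follows the documented OpenTelemetry Java convention:
    upper-case the name, then replace '.' and '-' with '_'.
    """
    return property_name.upper().replace(".", "_").replace("-", "_")


# Properties taken from the default agent configuration.
for prop in (
    "otel.service.name",
    "otel.logs.exporter",
    "otel.instrumentation.http.server.capture-response-headers",
):
    print(f"{prop} -> {to_env_var(prop)}")
```

For instance, otel.logs.exporter maps to OTEL_LOGS_EXPORTER, which is the variable used in the next section to enable log export.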

34.3.5 Enabling/Disabling Exported Data

By default, exporting logs is disabled, whereas exporting traces and metrics is enabled.

To enable exporting logs from Smile CDR directly, set the OTEL_LOGS_EXPORTER environment variable to otlp when running Smile CDR, in addition to setting CDR_OTEL.

For example,

CDR_OTEL=y OTEL_LOGS_EXPORTER=otlp bin/smilecdr start

To disable the trace or metric exporter, set the OTEL_TRACES_EXPORTER or OTEL_METRICS_EXPORTER environment variable to none, respectively.

34.3.6 Correlating Logs and Traces

If exporting logs via the agent is enabled, the agent also exports the current trace id and span id as part of each log record. These ids are also available in the Smile CDR system logs, where the current trace_id and span_id appear on the log lines with the T: and S: prefixes, respectively.
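As a rough illustration of using the T: and S: prefixes for correlation, the Python sketch below extracts both ids from a log line so they can be looked up in a tracing backend. The exact layout of Smile CDR log lines is not specified here, so the pattern (ids immediately following the prefixes) and the sample line are assumptions; the id lengths (32 and 16 lowercase hex characters) come from the OpenTelemetry trace and span id formats. The function name extract_ids is hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumption: ids directly follow the prefixes, e.g. "T:<32 hex> S:<16 hex>".
_TRACE_RE = re.compile(r"\bT:\s*([0-9a-f]{32})\b")
_SPAN_RE = re.compile(r"\bS:\s*([0-9a-f]{16})\b")


def extract_ids(line: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (trace_id, span_id) found in a log line, or None for each id not found."""
    trace = _TRACE_RE.search(line)
    span = _SPAN_RE.search(line)
    return (
        trace.group(1) if trace else None,
        span.group(1) if span else None,
    )


# Hypothetical log line, for illustration only.
sample = (
    "2024-02-01 10:00:00.000 [T:4bf92f3577b34da6a3ce929d0e0e4736 "
    "S:00f067aa0ba902b7] INFO Example message"
)
print(extract_ids(sample))
```

Adjust the regular expressions to match the log pattern actually configured in your deployment.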

34.3.7 Vendor-Specific OpenTelemetry Tools and Agents

Some cloud vendors, such as AWS and Azure, provide their own distributions of OpenTelemetry tools. With such vendors, there are two general approaches you can take:

The first approach is to use the OpenTelemetry Java agent bundled with Smile CDR and configure the OpenTelemetry Collector to convert the data to the vendor-specific format. If you follow this approach, you need to run Smile CDR with the CDR_OTEL environment variable set, as explained in Enabling OpenTelemetry Instrumentation in Smile CDR, so that Smile CDR is instrumented with the bundled Java agent.

The second approach is to use the OpenTelemetry Java agent distribution provided by the vendor, if there is one. In this approach, you do not set the CDR_OTEL environment variable when running Smile CDR; instead, you set the JAVA_TOOL_OPTIONS environment variable to instrument the Smile CDR process.

Both of these approaches are explained in detail next for AWS and Azure.

34.3.7.1 AWS OpenTelemetry

AWS provides its own distribution of the OpenTelemetry Java agent and collector.

You can use the AWS Distro for OpenTelemetry Collector to export telemetry data in AWS formats that can be consumed by AWS CloudWatch and AWS X-Ray. For this to work, you run Smile CDR with the CDR_OTEL environment variable set and configure AWS Distro for OpenTelemetry Collector to export data in AWS formats. You can see some examples for AWS OpenTelemetry Collector Configurations in the AWS Observability repo.

You may also decide to use the AWS distribution of the Java agent instead of the Java agent bundled with Smile CDR. In that case, do not set the CDR_OTEL environment variable when running Smile CDR; instead, set the JAVA_TOOL_OPTIONS environment variable to instrument Smile CDR. For example,

JAVA_TOOL_OPTIONS=-javaagent:<path-to-aws-otel-java-agent-jar> OTEL_SERVICE_NAME=smilecdr bin/smilecdr start

You would still need to use the AWS Distro for OpenTelemetry Collector to convert telemetry data to AWS-specific formats so that it can be consumed in AWS CloudWatch and AWS X-Ray.

34.3.7.2 Azure OpenTelemetry

Azure provides its own Application Insights OpenTelemetry Java agent, and there is a community-provided Azure Monitor Exporter for use with the OpenTelemetry Collector.

If you decide to use the Azure Monitor Exporter for the OpenTelemetry Collector, you need to configure your OpenTelemetry Collector to export to azuremonitor as explained in its readme, and run Smile CDR with the CDR_OTEL environment variable set.

Alternatively, you may decide to use the Application Insights Java agent directly, instead of the Java agent bundled with Smile CDR. In this approach, you do not use the OpenTelemetry Collector. When running Smile CDR, do not set the CDR_OTEL environment variable; instead, set the JAVA_TOOL_OPTIONS environment variable to instrument Smile CDR. You also need to configure the Application Insights agent according to the instructions provided by Azure.

For example,

JAVA_TOOL_OPTIONS=-javaagent:<path-to-azure-application-insights-agent-jar> bin/smilecdr start

This assumes an applicationinsights.json configuration file in the same directory as the Application Insights agent jar, with content similar to the following:

{
  "connectionString":"<your_connection_string>",
  "role": {
    "name": "smilecdr"
  }
}

34.3.8 Custom Telemetry Data Provided by Smile CDR

34.3.8.1 FHIR Endpoint HTTP Traces

The following additional attributes are added to the root span in a FHIR Endpoint HTTP trace:

  • smilecdr.fhir_endpoint.request_id: The request id. The request id is also available through the http.response.header.x-request-id attribute, because the default agent configuration instructs the agent to capture it from the response header. The difference between the two is that smilecdr.fhir_endpoint.request_id is a string-valued attribute, whereas http.response.header.x-request-id is an array-valued attribute. The string-valued version is added because it is easier to search for in backends that do not yet support searching array-valued attributes.
  • smilecdr.fhir_endpoint.tenant_id: This will be present if partitioning is enabled, and indicates the id of the tenant.
  • smilecdr.fhir_endpoint.username: The name of the user making the request. This attribute is not present if there is no user involved, for example when using Client Credentials Authorization flow of OIDC, which is a system flow.
  • smilecdr.fhir_endpoint.oidc_client_id: The OIDC client id. This attribute is present only when using Smart Auth and the OIDC clients are managed by Smile CDR.
  • smilecdr.fhir_endpoint.restful_interaction_code: The interaction code for the request, e.g. read, vread, transaction etc.
  • smilecdr.fhir_endpoint.request.path.operation_name: If the request is a FHIR extended operation, this attribute is present, and is the name of the operation, e.g. $everything, $meta etc.
  • smilecdr.fhir_endpoint.request.path.resource_type: The resource type from the request path, e.g. for a request path Patient/1234, the value is Patient.
  • smilecdr.fhir_endpoint.request.path.logical_id: The logical id from the request path, e.g. for a request path Patient/1234, the value is 1234.
  • smilecdr.fhir_endpoint.response.resource_type: If a successful request returns a FHIR resource as a response, this attribute is present and is the type of that resource.
  • smilecdr.fhir_endpoint.response.resource_logical_id: If a successful request returns a FHIR resource that has an id in the response, this attribute is present, and it is the logical id of that resource. This is useful for requests that create a resource with a server-generated id (such as a POST request that creates a resource).

Note: These additional FHIR span attributes (except for the http.response.header.x-request-id) are not available for unauthenticated requests (i.e. the requests that result in a 401 HTTP status code) as they are currently added after a request is authenticated.

If you would like to implement an interceptor that adds your own span attributes to the root span in a trace, see Accessing the Local Root Span from an Interceptor.

34.3.8.2 HL7 v2.x Inbound Message Processing Traces and Metrics

For HL7 v2.x inbound messages ingested by Smile CDR through an HL7 v2.x endpoint, the following additional telemetry data are available.

34.3.8.2.1 Traces

A parent span named smilecdr.hl7v2.inbound_message.process is generated with the following additional span attributes that contain details of the message that is processed:

  • smilecdr.hl7v2.inbound_message.type (the type of the HL7 v2.x message, e.g. ADT_A01)
  • smilecdr.hl7v2.inbound_message.version (the version of the HL7 v2.x message, e.g. 2.5)
  • smilecdr.hl7v2.inbound_message.control_id (the control id of the HL7 v2.x message)

This parent span also captures any conversion issues that are added to the conversion result during HL7 v2.x to FHIR conversion. Each issue is recorded as a span event with the following attributes:

  • smilecdr.hl7v2.inbound_message.conversion_issue.level (the severity of the conversion issue)
  • smilecdr.hl7v2.inbound_message.conversion_issue.message (the description of the conversion issue)
  • smilecdr.hl7v2.inbound_message.conversion_issue.path (the location of the conversion issue)

34.3.8.2.2 Metrics

The following two metrics are generated for counting received and failed messages:

  • smilecdr.hl7v2.inbound_message.count (a counter that is incremented for each HL7 v2.x message received for processing)
  • smilecdr.hl7v2.inbound_message.error_count (a counter that is incremented for each HL7 v2.x message that fails to be processed)

Both of these metrics have smilecdr.hl7v2.inbound_message.type and smilecdr.hl7v2.inbound_message.version as attributes, so the counts are available per message type and version pair.
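To illustrate what the per-pair attributes make possible, the Python sketch below combines the two counters into a failure ratio for each message type and version pair. The input shape (a list of attribute/value tuples) is a hypothetical stand-in for whatever your observability backend returns, the sample values are made up, and the function name error_ratios is not part of Smile CDR.

```python
from collections import defaultdict

TYPE_KEY = "smilecdr.hl7v2.inbound_message.type"
VERSION_KEY = "smilecdr.hl7v2.inbound_message.version"


def error_ratios(received, failed):
    """Compute failed/received per (type, version) pair.

    Both arguments are iterables of (attributes, count) tuples, a
    hypothetical representation of exported counter data points.
    """
    totals = defaultdict(lambda: [0, 0])  # pair -> [received, failed]
    for attrs, value in received:
        totals[(attrs[TYPE_KEY], attrs[VERSION_KEY])][0] += value
    for attrs, value in failed:
        totals[(attrs[TYPE_KEY], attrs[VERSION_KEY])][1] += value
    return {
        pair: (err / total if total else 0.0)
        for pair, (total, err) in totals.items()
    }


# Made-up data points for illustration.
received_points = [
    ({TYPE_KEY: "ADT_A01", VERSION_KEY: "2.5"}, 200),
    ({TYPE_KEY: "ORU_R01", VERSION_KEY: "2.3"}, 50),
]
failed_points = [
    ({TYPE_KEY: "ADT_A01", VERSION_KEY: "2.5"}, 10),
]
print(error_ratios(received_points, failed_points))
```

In practice this kind of ratio is usually expressed in the query language of the metrics backend rather than in application code.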

The following metric is generated for counting the FHIR resources included in the FHIR transaction bundles generated by HL7 v2.x to FHIR conversions.

  • smilecdr.hl7v2.inbound_message.conversion_resources_count (the number of FHIR resources in the transaction bundles generated by HL7 v2.x to FHIR conversions)

Note that this metric does not report the number of resources actually persisted. It is published after the conversion but before the transaction is processed. A resource in a transaction bundle may not be persisted if it is part of a conditional update/create or if the transaction fails. For the counts of resources actually persisted, see Storage Metrics. The metric counts the resources that would be created or updated: bundle entries with the "PATCH", "PUT", or "POST" HTTP verbs are included, whereas "DELETE" entries are ignored. Contained resources are not counted.

This metric has smilecdr.hl7v2.inbound_message.type, smilecdr.hl7v2.inbound_message.version, and smilecdr.hl7v2.inbound_message.conversion_resource_type as attributes, which makes the counts available per message type, version, and FHIR resource type.
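The counting rule described above can be sketched in Python as follows. This is an illustration of the rule as documented, not Smile CDR's implementation: it takes a FHIR transaction bundle as a plain dictionary, counts entries whose request method is PATCH, PUT, or POST, skips DELETE entries, and never looks inside contained resources. Reading the resource type from entry.resource.resourceType is an assumption, and the function name count_conversion_resources is hypothetical.

```python
from collections import Counter

# Verbs whose entries would create or update a resource.
COUNTED_VERBS = {"PATCH", "PUT", "POST"}


def count_conversion_resources(bundle: dict) -> Counter:
    """Count top-level bundle resources per resource type.

    DELETE entries are skipped, and contained resources are never
    visited, so they do not contribute to the counts.
    """
    counts = Counter()
    for entry in bundle.get("entry", []):
        verb = entry.get("request", {}).get("method", "").upper()
        if verb not in COUNTED_VERBS:
            continue
        resource_type = (entry.get("resource") or {}).get("resourceType")
        if resource_type:
            counts[resource_type] += 1
    return counts


# Minimal example bundle, for illustration only.
example_bundle = {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [
        {
            "resource": {
                "resourceType": "Patient",
                "contained": [{"resourceType": "Organization"}],
            },
            "request": {"method": "POST", "url": "Patient"},
        },
        {
            "resource": {"resourceType": "Encounter"},
            "request": {"method": "PUT", "url": "Encounter/1234"},
        },
        {"request": {"method": "DELETE", "url": "Observation/5678"}},
    ],
}
print(count_conversion_resources(example_bundle))
```

In this example the contained Organization and the deleted Observation do not appear in the result; only the Patient and Encounter entries are counted.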

34.3.8.3 Camel Route Traces

Traces for Camel Routes are generated without requiring any additional configuration. For Smile Component processors, the base URI of the processor is used as the span name. That is, each Camel processor span for the Smile Component is named using the format smile://[moduleId]/[processorName].

34.3.8.4 JavaScript Callback Spans

An OpenTelemetry span is generated for any JavaScript callback execution. Such spans are named smilecdr.javascript_callback. The name of the JavaScript callback function is available as a span attribute named smilecdr.javascript_callback.function_name.

34.3.8.5 Interceptor Method Spans

An OpenTelemetry span is generated for any interceptor method execution. Such spans are named hapifhir.interceptor. The name of the pointcut, the interceptor class name, and the interceptor method name are available as the following span attributes:

  • hapifhir.interceptor.pointcut_name
  • hapifhir.interceptor.class_name
  • hapifhir.interceptor.method_name

34.3.8.5.1 Accessing the Local Root Span from an Interceptor

If you would like to author an interceptor that updates the local root span in a trace, you can use LocalRootSpan.current() from the opentelemetry-instrumentation-api library to access the local root span. Using Span.current() will not work, because it returns the span that is created for the interceptor method invocation.

34.3.8.6 Batch Job Spans

When processing a batch job, spans named hapifhir.batch_job.execute are generated by the worker threads. These spans have the following span attributes related to the batch job:

  • hapifhir.batch_job.definition_id: The name of the job, such as BULK_EXPORT, REINDEX.
  • hapifhir.batch_job.definition_version: The job definition version.
  • hapifhir.batch_job.instance_id: The job id.
  • hapifhir.batch_job.step_id: The name of the step being executed.
  • hapifhir.batch_job.chunk_id: The id of the work chunk being processed. This is not applicable to reduction steps.

34.3.8.7 Storage Metrics

The following metrics are generated for resource creations and updates:

  • smilecdr.storage.created_resources_count
  • smilecdr.storage.updated_resources_count

Both metrics have smilecdr.storage.resource_type as an attribute so that the counts are available per FHIR resource type.