4.14.1Production Checklist

 

This page lists considerations to review when preparing a production deployment. Because no two deployments are the same, it is a starting point rather than a complete list, but it can serve as a reminder of important points.

4.14.2Host Server Setup

 

When deploying to a server (e.g. a virtual machine running RHEL or Ubuntu with Smile CDR installed) the following considerations apply:

4.14.2.1Ensure that **ulimit** is set appropriately

The ulimit setting is used in Linux to limit the number of processes, open files, and other resources that an individual user can have open. On some Linux variants (RHEL in particular) this value can be set to a low default in order to prevent individual processes from consuming too many resources.

The normal operation of a high-throughput Smile CDR installation can open large numbers of files, so it is recommended to ensure that the Maximum Number of Open File Descriptors value (known as nofile in limits.conf) is set appropriately. A value of 5000 is recommended.

See this page for information on configuring limits.conf.
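As a quick sanity check, the limit that actually applies to a process can be read at runtime. The following is a minimal sketch using Python's standard `resource` module (Unix only). The helper names are illustrative and not part of Smile CDR, and the script needs to be run as the same user account that runs Smile CDR for the result to be meaningful.

```python
import resource

RECOMMENDED_NOFILE = 5000  # value recommended in this checklist


def nofile_limit_ok(soft_limit: int, recommended: int = RECOMMENDED_NOFILE) -> bool:
    """Return True if the soft open-file limit meets the recommended value."""
    # RLIM_INFINITY means "no limit", which always satisfies the recommendation.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= recommended


def current_nofile_limits() -> tuple[int, int]:
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = current_nofile_limits()
print(f"nofile soft={soft} hard={hard} ok={nofile_limit_ok(soft)}")
```

The soft limit is the one enforced on the process, so it is the value compared against the recommendation here.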

4.14.2.2Ensure that Smile CDR does not run as root

Running user applications as the root user is generally considered an anti-pattern in Unix system administration. Smile CDR should always be run using a dedicated user account.

4.14.2.3Ensure that NTP is enabled

When issues arise, it is vitally important that you are able to trust timestamps in log files. For this reason, it is critical to have a network time service running on your server.

4.14.2.4Disable Unneeded Services

Smile CDR has very minimal platform requirements. Consider disabling mail servers, FTP servers, NFS/Samba daemons, etc. if they are not needed.

4.14.2.5Discouraging Search Indexes

We recommend that you prevent search engines from crawling your Smile CDR servers. You can do this by serving a robots.txt file with the following contents:

User-agent: *
Disallow: /
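
To confirm that a robots.txt file like the one above really excludes every crawler, it can be parsed with Python's standard `urllib.robotparser`. This is a sketch only: the function name, the sample URL, and the `ExampleBot` user agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt contents recommended above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def blocks_all_crawlers(robots_txt: str, sample_url: str) -> bool:
    """Return True if the rules deny sample_url to generic and named crawlers."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # "ExampleBot" is a made-up crawler name; it falls back to the "*" rules.
    return not parser.can_fetch("*", sample_url) and not parser.can_fetch(
        "ExampleBot", sample_url
    )


print(blocks_all_crawlers(ROBOTS_TXT, "https://cdr.example.com/fhir/Patient"))
```

Note that robots.txt is advisory: well-behaved crawlers honour it, but it is not an access control mechanism.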

4.14.3Network Infrastructure

 

4.14.3.1Configure Max Request Size in Reverse Proxies

If you are using a reverse proxy such as NGINX in front of Smile CDR, a common source of issues is the proxy rejecting large requests. The default maximum request body size in NGINX is 1 MB, which is sufficient for many use cases but too small for others (for example, large transaction Bundles).

The following example shows an NGINX configuration allowing a larger payload size, and increasing proxy timeouts.

# Time out after 5 minutes (300 seconds)
proxy_connect_timeout       300;
proxy_send_timeout          300;
proxy_read_timeout          300;
send_timeout                300;
# Allow request bodies of up to 200 MB
client_max_body_size        200M;
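
When sizing this limit, it can help to compare the payloads you expect against the configured value. The sketch below converts an NGINX size string into bytes and checks whether a payload fits. It assumes the k/m/g suffixes are multiples of 1024 and that a value of 0 disables the check, as described in the NGINX documentation. The function names are illustrative.

```python
import re

# Multipliers for NGINX size suffixes (case-insensitive, 1024-based).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_nginx_size(value: str) -> int:
    """Convert an NGINX size such as '200M' or '1m' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value.strip())
    if match is None:
        raise ValueError(f"not a recognised NGINX size: {value!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix.lower()]


def fits_within_limit(payload_bytes: int, limit: str) -> bool:
    """Return True if a request body of payload_bytes is within the limit."""
    limit_bytes = parse_nginx_size(limit)
    # A client_max_body_size of 0 disables the body size check entirely.
    return limit_bytes == 0 or payload_bytes <= limit_bytes


five_mb = 5 * 1024 * 1024
print(fits_within_limit(five_mb, "1m"), fits_within_limit(five_mb, "200M"))
```

A request that exceeds the limit is rejected by NGINX with a 413 (Request Entity Too Large) response before it ever reaches Smile CDR.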

4.14.4CDR Process Settings

 

4.14.4.1Ensure that Smile CDR system logging settings are appropriate

The Smile CDR System Logging can be useful for understanding what is happening with Smile CDR at runtime. Consider adjusting the following:

  • Retention: By default the primary log files will keep a maximum of 30 days' worth of logs, but this can be adjusted using the Logback settings. Ensure that the chosen settings are appropriate for your organizational policies.

  • Location: You may wish to dedicate separate disk partitions for logging, or centralize your log files in /var/log. The location of log files defaults to [base dir]/log but this can be changed in your Logback settings.

4.14.5FHIR Server Performance

 

4.14.5.1Don't use Embedded Databases

The Smile CDR Cluster Manager and FHIR Storage databases use the embedded H2 database by default, as this is a nice option for testing purposes. However, this solution is not recommended for production use, and should be replaced with another supported database.

4.14.5.2Tune the Transaction and Audit Log

It is strongly recommended that you leave the Smile CDR Transaction Log disabled, as enabling it can cause the following issues:

  • Performance issues with the Smile CDR application, as keeping a detailed log of every transaction has an impact on system throughput.
  • A large increase in the size of the clustermgr database.

Having said that, you may wish to make use of this tool for troubleshooting.

As of the 2023-02 release, the retention period defaults to 7 days (previously 90 days).

Consider reducing retention time, disabling request/response body capture, or even disabling the transaction log entirely on production systems to mitigate the problems outlined above.

The Smile CDR Audit Log keeps a record of accesses and modifications to data and system settings.

Consider disabling the audit log if it is providing functionality that is duplicated in other parts of your overall solution.

4.14.5.3Disable Resource Counts

By default Smile CDR attempts to maintain accurate counts of the numbers of each resource type in the repository. This is useful in testing scenarios, but can be a very expensive and repetitive query with little value in a production system, especially as the system grows to have larger amounts of data stored.

Consider disabling the Include Resource Counts setting.

4.14.5.4Disable Pretty-Print by Default

FHIR Endpoint modules will pretty-print (i.e. indent) responses by default. This is helpful during development, but increases the size of response payloads and therefore decreases performance. Consider disabling this on production systems. Note that clients may disable pretty printing on a request-by-request basis using the _pretty parameter, but this is often forgotten when left to the client.
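
The size overhead of indentation is easy to illustrate. The sketch below serializes a small, made-up Patient resource with Python's standard `json` module, once indented and once compact. It only approximates the effect: the serializer used by Smile CDR will produce different exact sizes.

```python
import json

# A small illustrative FHIR Patient resource (sample data, not from a real system).
patient = {
    "resourceType": "Patient",
    "id": "example",
    "name": [{"family": "Smith", "given": ["Jane", "Q"]}],
    "gender": "female",
    "birthDate": "1980-01-01",
}


def encoded_size(resource: dict, pretty: bool) -> int:
    """Return the UTF-8 size in bytes of the resource as JSON."""
    if pretty:
        text = json.dumps(resource, indent=2)
    else:
        # Compact separators drop the whitespace after ',' and ':'.
        text = json.dumps(resource, separators=(",", ":"))
    return len(text.encode("utf-8"))


pretty_size = encoded_size(patient, pretty=True)
compact_size = encoded_size(patient, pretty=False)
print(f"pretty={pretty_size} bytes, compact={compact_size} bytes")
```

The relative overhead tends to grow with nesting depth, so large Bundles are affected more than small single-resource responses.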

4.14.5.5Disable Unnecessary Search Parameter Features

See the Search Parameter Features page for details on optional indexing features. Disabling features you are not using can reduce the amount of disk space required as you scale up, and improve write performance.

4.14.5.6Review Creation Performance Tips

If you are intending on ingesting a large amount of data prior to go-live, see the Performance Tuning page for tips on improving performance.

4.14.6Monitoring

 

4.14.6.1Use a Monitoring Tool

The use of a third-party monitoring tool or framework is recommended. Smile CDR provides many useful hooks that can interact with these tools. One commonly used option is Nagios, but there are many others as well.

4.14.6.2Monitor Endpoint Availability

Configuring your monitoring tool to regularly check the Endpoint Health endpoint ensures that you will be notified if an endpoint is down.
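
Most monitoring tools can run a script as a probe. The following is a minimal sketch of such a probe using only the Python standard library. The function name is illustrative, and the URL you pass must be the health endpoint configured for your own deployment, which is not shown here.

```python
import urllib.error
import urllib.request


def endpoint_is_up(url: str, timeout_seconds: float = 5.0) -> bool:
    """Return True if a GET request to url answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses (HTTPError is a URLError subclass),
        # refused connections, DNS failures, and timeouts.
        return False
```

A probe script would typically call `endpoint_is_up(...)` with the deployment's health URL and exit with a non-zero status when it returns False, so that the monitoring tool raises an alert.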

4.14.6.3Monitor System Health

Configuring your monitoring tool to regularly check the Runtime Status Health Checks endpoint for module health is a good way to ensure that the system is functioning smoothly. See Runtime Health Checks for information on this feature.

4.14.6.4Monitor Disk Space

If it is not already configured, it is important to ensure that alerts will be raised if a disk becomes critically full.
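
If your monitoring tool has no built-in disk check, a simple one can be scripted. This sketch uses Python's standard `shutil.disk_usage`. The 90% threshold and the function names are assumptions chosen for illustration, not Smile CDR recommendations.

```python
import shutil


def disk_used_fraction(path: str) -> float:
    """Return the fraction (0.0 to 1.0) of the filesystem at path that is used."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def is_critically_full(used_fraction: float, threshold: float = 0.90) -> bool:
    """Return True if usage is at or above the alert threshold (assumed 90%)."""
    return used_fraction >= threshold


fraction = disk_used_fraction("/")
print(f"used={fraction:.0%} critical={is_critically_full(fraction)}")
```

In practice the check should cover every partition that Smile CDR writes to, including the log directory and any database storage hosted on the same server.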

4.14.7Security

 

4.14.7.1Review HTTP Server Setup

The HTTP Server Setup page outlines a number of configuration options on HTTP Endpoint modules that should be set appropriately for your specific environment. Please review this page and ensure that you are using sensible settings. Some settings to consider include:

  • Ensure that Respect Forward Headers is enabled if your environment includes reverse proxies, load balancers, etc. Verify that client request IP addresses that appear in transaction/audit logs are correct and do not simply correspond to the IP address of the network device.

  • If you are using TLS/SSL terminated by Smile CDR, consider enabling a protocol/cipher whitelist.

  • Consider setting Suppress Platform Information and Suppress Error Details on your end-user facing endpoint modules (FHIR Endpoint, SMART Outbound Security, etc.), in order to avoid exposing internal detail about your specific Smile CDR instance.

4.14.7.2Review FHIR Security Checklist

The FHIR specification contains a Security Page with a number of best practices and suggestions. These should be considered in detail.

4.14.7.3Review User Permissions

Ensure that users have been granted only the permissions necessary to perform the tasks they need to do. For example:

  • If possible, avoid granting superuser permissions to users.

  • If possible, grant FHIR permissions that are as specific as possible. For example, if a user does not need to be able to write data, they should be granted only READ-level permissions.

  • If you are using the Group resource and compartment-based security, note that the same Group resource may be in multiple Patient compartments, meaning that if a user has access to a single Patient compartment, they may be able to modify the resource by adding/removing other Patients.

4.14.8Message Broker

 

Note that this section applies only if you are using a message broker (e.g. for Subscription delivery). You may skip it if you are not using a message broker in your configuration.

4.14.8.1Pick an Appropriate Broker

The default installation of Smile CDR uses an embedded instance of Apache ActiveMQ. This is suitable for testing and low-volume settings but should be replaced with something more robust for high-volume setups.

Either a standalone ActiveMQ instance or a standalone Kafka instance should be created and configured. See Message Brokers for more information.

4.14.8.2ActiveMQ: Set Resource Limits

If you are using standalone Apache ActiveMQ as your message broker, it is important to configure the Broker Resource Limits appropriately.

4.14.9Cluster Design

 

4.14.9.1Put Administrative Modules on Separate Nodes

The Web Admin Console module should be placed on a separate node if it is being used in a large cluster, so that it can be scaled separately from the end user services (e.g. FHIR endpoint modules).

4.14.9.2Disable Unneeded Modules

The default configuration of Smile CDR includes a number of modules. Any modules that are not needed in order to support the production configuration should be archived. For example, you may not need:

  • The SMART Outbound Security module - If you are not using Smile CDR as an OIDC Identity Provider.

  • The SMART App Demo Host - If you are not using the demonstration SMART applications.