Production Checklist
This page lists considerations to review when preparing a production deployment. Because no two deployments are the same, treat it as a starting point rather than a final list. It is intended as a reminder of important considerations.
When deploying to a server (e.g. a virtual machine running RHEL or Ubuntu with Smile CDR installed) the following considerations apply:
The ulimit setting is used in Linux to limit the number of processes, open files, and other resources that an individual user can consume. On some Linux variants (RHEL in particular) this value can be set to a low default in order to prevent individual processes from consuming too many resources.
The normal operation of a high-throughput Smile CDR installation can open large numbers of files, so it is recommended to ensure that the Maximum Number of Open File Descriptors value (known as nofile in limits.conf) is set appropriately. A value of 5000 is recommended.
See this page for information on configuring limits.conf.
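As a quick sanity check, the open file limit can be inspected from the account that runs Smile CDR. The following is a minimal sketch using Python's standard `resource` module. The helper name `nofile_is_sufficient` is hypothetical, and the threshold of 5000 is taken from the recommendation above.

```python
import resource

RECOMMENDED_NOFILE = 5000  # value recommended in the text above


def nofile_is_sufficient(soft_limit, recommended=RECOMMENDED_NOFILE):
    """Return True if the soft open-file limit meets the recommendation."""
    # RLIM_INFINITY means no limit is enforced for this resource.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= recommended


# Report the limits that apply to the current process (and so the current user).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"nofile soft={soft} hard={hard} sufficient={nofile_is_sufficient(soft)}")
```

Run this as the dedicated Smile CDR user, since limits are applied per user and the result for another account may differ.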
Running user applications as the root user is generally considered an anti-pattern in Unix system administration. Smile CDR should always be run using a dedicated user account.
When issues arise, it is vitally important that you are able to trust timestamps in log files. For this reason, it is critical to have a network time service running on your server.
Smile CDR has very minimal platform requirements. Consider disabling mail servers, FTP servers, NFS/Samba daemons, etc. if they are not needed.
We recommend that you prevent search engines from crawling your Smile CDR servers. You can do this by including a robots.txt file with the following:
User-agent: *
Disallow: /
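To confirm that a robots.txt file really blocks all crawlers, its content can be checked with Python's standard `urllib.robotparser` module. This is a sketch only: the helper name `blocks_all_crawlers` and the sample path `/fhir/Patient` are hypothetical and should be replaced with paths that exist on your own endpoints.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def blocks_all_crawlers(robots_txt, sample_paths=("/", "/fhir/Patient")):
    """Return True if a generic crawler may fetch none of the sample paths."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # can_fetch() returns True when the given user agent is allowed to crawl the path.
    return not any(parser.can_fetch("*", path) for path in sample_paths)


print(blocks_all_crawlers(ROBOTS_TXT))
```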
If you are using a reverse proxy such as NGINX in front of Smile CDR, a common source of issues is the proxy blocking large requests. The default maximum request body size in NGINX is 1 MB, which is sufficient in many use cases but can prove too small in others.
The following example shows an NGINX configuration allowing a larger payload size, and increasing proxy timeouts.
# Timeout after 5 minutes (300 seconds)
proxy_connect_timeout 300;
proxy_send_timeout 300;
proxy_read_timeout 300;
send_timeout 300;
# Allow request bodies of up to 200 MB to accommodate larger files
client_max_body_size 200M;
The Smile CDR System Logging can be useful for understanding what is happening with Smile CDR at runtime. Consider adjusting the following:
Retention: By default the primary log files will keep a maximum of 30 days' worth of logs, but this can be adjusted using the Logback settings. Ensure that the chosen settings are appropriate for your organizational policies.
Location: You may wish to dedicate separate disk partitions for logging, or centralize your log files in /var/log. The location of log files defaults to [base dir]/log but this can be changed in your Logback settings.
The Smile CDR Cluster Manager and FHIR Storage databases use the embedded H2 database by default, as this is convenient for testing purposes. However, H2 is not recommended for production use and should be replaced with another supported database.
It is strongly recommended that you leave the Smile CDR Transaction Log disabled as this can cause the following issues:
Having said that, you may wish to make use of this tool for troubleshooting.
As of the 2023-02 release, the retention period defaults to 7 days (previously 90 days).
Consider reducing retention time, disabling request/response body capture, or even disabling the transaction log entirely on production systems to mitigate the problems outlined above.
The Smile CDR Audit Log keeps a record of accesses and modifications to data and system settings.
Consider disabling the audit log if it is providing functionality that is duplicated in other parts of your overall solution.
By default Smile CDR attempts to maintain accurate counts of the numbers of each resource type in the repository. This is useful in testing scenarios, but can be a very expensive and repetitive query with little value in a production system, especially as the system grows to have larger amounts of data stored.
Consider disabling the Include Resource Counts setting.
FHIR Endpoint modules will pretty-print (i.e. indent) responses by default. This is helpful during development, but increases the size of response payloads and therefore decreases performance. Consider disabling this on production systems. Note that clients may disable pretty printing on a request-by-request basis using the _pretty parameter, but this is often forgotten when left to the client.
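For clients that do need to control this per request, the following sketch shows one way to force `_pretty=false` onto a request URL using Python's standard `urllib.parse` module. The helper name `with_pretty_disabled` and the host, port, and path in the example are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_pretty_disabled(url):
    """Return the URL with any existing _pretty parameter replaced by _pretty=false."""
    parts = urlsplit(url)
    # Keep every query parameter except an existing _pretty value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "_pretty"
    ]
    query.append(("_pretty", "false"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical FHIR endpoint address, for illustration only.
print(with_pretty_disabled("http://localhost:8000/Patient?name=smith"))
```

Disabling pretty printing in the endpoint module configuration remains the more reliable option, because it does not depend on every client remembering to send the parameter.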
See the Search Parameter Features page for details on optional indexing features. Disabling features you are not using can reduce the amount of disk space required as you scale up, and improve write performance.
If you intend to ingest a large amount of data prior to go-live, see the Performance Tuning page for tips on improving performance.
The use of a third-party monitoring tool or framework is recommended. Smile CDR provides many useful hooks that can interact with these tools. One commonly used option is Nagios but there are many others as well.
Configuring your monitoring tool to regularly check the Endpoint Health endpoint ensures that you will be notified if an endpoint is down.
Configuring your monitoring tool to regularly check the Runtime Status Health Checks endpoint for module health is a good way to ensure that the system is functioning smoothly. See Runtime Health Checks for information on this feature.
If it is not already configured, it is important to ensure that alerts will be raised if a disk becomes critically full.
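If your monitoring tool does not provide a disk usage check out of the box, a simple one can be scripted. The following is a minimal sketch using Python's standard `shutil.disk_usage`. The 90% threshold, the function names, and the choice of `/` as the path to check are assumptions that should be adapted to your own partitions (for example, the one holding your log files).

```python
import shutil

CRITICAL_THRESHOLD = 0.90  # assumed threshold; tune for your environment


def is_critically_full(used, total, threshold=CRITICAL_THRESHOLD):
    """Return True if the used fraction of the disk meets or exceeds the threshold."""
    if total <= 0:
        # Some pseudo filesystems report a size of zero; treat them as not full.
        return False
    return used / total >= threshold


def check_path(path):
    """Return True if the filesystem containing the given path is critically full."""
    usage = shutil.disk_usage(path)
    return is_critically_full(usage.used, usage.total)


print(f"root filesystem critically full: {check_path('/')}")
```

A script like this would typically be run on a schedule and wired into whatever alerting mechanism your monitoring tool already uses.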
The HTTP Server Setup page outlines a number of configuration options on HTTP Endpoint modules that should be set appropriately for your specific environment. Please review this page and ensure that you are using sensible settings. Some settings to consider include:
Ensure that Respect Forward Headers is enabled if your environment includes reverse proxies, load balancers, etc. Verify that client request IP addresses that appear in transaction/audit logs are correct and do not simply correspond to the IP address of the network device.
If you are using TLS/SSL terminated by Smile CDR, consider enabling a protocol/cipher whitelist.
Consider setting Suppress Platform Information and Suppress Error Details on your end-user facing endpoint modules (FHIR Endpoint, SMART Outbound Security, etc.), in order to avoid exposing internal detail about your specific Smile CDR instance.
The FHIR specification contains a Security Page with a number of best practices and suggestions. These should be considered in detail.
Ensure that users have been granted only the permissions necessary to perform the tasks that they need to do. For example:
If possible, avoid granting superuser permissions to users.
If possible, grant FHIR permissions that are as specific as possible. For example, if a user does not need to be able to write data, they should be granted only READ-level permissions.
If you are using the Group resource and compartment-based security, note that the same Group resource may be in multiple Patient compartments, meaning that if a user has access to a single Patient compartment, they may be able to modify the resource by adding/removing other Patients.
Note that this section applies only if you are using a message broker (i.e. for Subscription delivery). You may skip it if you are not using a message broker in your configuration.
The default installation of Smile CDR uses an embedded instance of Apache ActiveMQ. This is good for testing and low-volume settings but should be replaced with something more robust for high-volume setups.
Either a standalone ActiveMQ instance or a standalone Kafka instance should be created and configured. See Message Brokers for more information.
If you are using standalone Apache ActiveMQ as your message broker, it is important to configure the Broker Resource Limits appropriately.
The Web Admin Console module should be placed on a separate node if it is being used in a large cluster, so that it can be scaled separately from the end user services (e.g. FHIR endpoint modules).
The default configuration of Smile CDR includes a number of modules. Any modules that are not needed in order to support the production configuration should be archived. For example, you may not need:
The SMART Outbound Security module - If you are not using Smile CDR as an OIDC Identity Provider.
The SMART App Demo Host - If you are not using the demonstration SMART applications.