Designing a Cluster
Smile CDR is designed to be deployed in horizontal clusters of any size. This means that you can add an arbitrary number of servers to your installation, and they can be used to share the load of incoming requests.
The built-in clustering capability is designed to be flexible. You can build active/active clusters, active/passive clusters, or any combination of the two in order to meet your specific needs.
All components in Smile CDR are designed to be capable of operating without keeping any local state within a single server. This means that a deployment can grow to a very large number of servers as needed. This design also means that nodes can be added and removed from the cluster at any time (i.e. without requiring a restart of the entire cluster).
The general approach in designing a cluster is to create one or more nodes with distinct IDs that will act as "templates" for configuration.
For example, suppose you create a node called FHIR_Support on server HOST1.acme.org, which is configured with a listener port of 8000 and a database connection URL of jdbc:postgres:cdr.
You can now start up as many Smile CDR processes with the same Node ID as you wish. Each process will use the same settings when it starts: it will connect to the same backing data store, and it will also listen on port 8000.
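As a rough sketch, the properties for such a node might look like the following. Only the node name, the port, and the database URL come from the example above; the module IDs and the remaining property keys are assumptions and will depend on the modules you actually define.

    # Node ID shared by every process that starts with this configuration
    node.id                            =FHIR_Support
    # Hypothetical FHIR listener module serving requests on port 8000
    module.fhir_endpoint.config.port   =8000
    # Hypothetical persistence module pointing at the shared backing database
    module.persistence.config.db.url   =jdbc:postgres:cdr
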
Suppose you start two such processes. The two listeners you have created are both able to handle parallel requests. You can now place a network switch (or load balancer, failover device, reverse proxy, etc.) in front of these two ports, and your requests will be served by both processes (or by the active process, depending on the configuration).
This same strategy applies to all types of modules that can be created within Smile CDR. Security modules will seamlessly share sessions across all clones, Web and JSON admin APIs will expose their ports and service requests across each node, etc.
The following terms are the key concepts in Smile CDR clustering.
Node IDs may contain only letters, numbers, and the characters _ - . (i.e. spaces are not allowed). The maximum length is 30 characters.
The clustering capabilities of Smile CDR rely heavily on having access to a clustered underlying database instance. Setting up a cluster of your chosen database platform (PostgreSQL, Oracle, etc.) is beyond the scope of this documentation, but Smile CDR does expect the chosen cluster configuration to be globally consistent.
The Smile CDR FHIR Storage modules may be configured to use Apache Lucene to provide indexing, which is used for certain types of queries. See Lucene Indexing for more information on how this is configured.
If Smile CDR will be used in a cluster (i.e. multiple processes will be created for a single node), Elasticsearch-based indexing must be used. Using Lucene in Memory or Disk mode may cause inconsistent results, as indexes are not propagated across the cluster.
Individual processes in a Smile CDR cluster will all have a user-assigned Node ID, which will be the same for all processes that share the same Node configuration. Each process will also have a Process ID, which uniquely identifies the process across the cluster.
Process IDs are automatically assigned by Smile CDR and do not need to be explicitly set by the user (nor can they be).
Smile CDR is able to handle an arbitrary number of processes being added, and these processes can be started or stopped at any time.
The very first time a process is started with a given Node ID, the configuration for that Node ID is saved in the cluster manager database. There is nothing special about this process, however: once other processes for the same node have been started, it may be shut down without any adverse effects (note that this was not the case in previous versions of Smile CDR).
The node.server_port_offset property indicates an integer value to apply as an offset to server port numbers on the clone node. For example, if the master node has a FHIR Endpoint module listening on port 8000 and this property has a value of 10000, the same FHIR Endpoint on the clone node will listen on port 18000.
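As a concrete illustration, the relevant entry in a clone node's configuration might look like the following; node.server_port_offset is the property described above, and the value matches the example.

    # Shift every server port on this clone up by 10000 (so 8000 becomes 18000)
    node.server_port_offset =10000
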
In many cases it is desirable to have multiple nodes within a cluster, each with its own independent set of modules. This is useful if you are designing a cluster with two independent roles that you want to scale independently.
For example, suppose you are planning a deployment of Smile CDR that will consist of a Web Admin Console, a FHIR Endpoint module, and a SMART Outbound Security module. If all of these modules are on the same node, then they will all be scaled together as more processes are added to the cluster. This has an impact on startup time, memory consumption, etc.
An alternate design is to place each function on its own node. In the example above, the Web Admin Console, the FHIR Endpoint module, and the SMART Outbound Security module would each be placed on a separate node.
With this design, clones could be made of any of these master nodes in order to scale the system up accordingly.
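Purely as an illustration, the configuration for processes in three such independent nodes might be sketched as follows, with each block living in its own configuration. The node names, module IDs, and ports shown here are assumptions.

    # Configuration used by processes in a node hosting the Web Admin Console
    node.id                            =Administration
    module.admin_web.config.port       =9100

    # Configuration used by processes in a node hosting the FHIR Endpoint
    node.id                            =FHIR
    module.fhir_endpoint.config.port   =8000

    # Configuration used by processes in a node hosting the SMART Outbound Security module
    node.id                            =Security
    module.smart_auth.config.port      =9200
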
Be aware that the web console and the admin JSON modules can only access batch job status for persistence modules defined in the same node. Batch job information from other nodes will not normally be visible.
To support sharing batch job information across different node groups, you should define duplicate persistence modules that share the connection configuration of the original module. This will provide access to batch job status information in the web console and via admin JSON. You should:
- Give the duplicate module the same name as the original (e.g. if the original module is named persistenceR4, name the new module persistenceR4 as well). This will reduce any confusion when viewing the information on different consoles.
- Configure the duplicate module in read-only mode (read_only_mode.enabled=true), otherwise this module will block batch 2 and scheduler processes.
- Use the same database driver as the original module (db.driver).
- Use the same database URL (db.url).
- Use the same database username (db.username).
- Use the same database password (db.password).
- Configure the connection pool's maximum idle connections as appropriate (module.persistence_r4.config.db.connectionpool.maxidle).
- Configure the connection pool's maximum total connections as appropriate (module.persistence_r4.config.db.connectionpool.maxtotal).
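Putting the points above together, a rough sketch of such a duplicate module definition might look like the following. Only read_only_mode.enabled, the db.* keys, and the connection pool keys come from the list above; the module type and all concrete values are illustrative assumptions and should be replaced with the original module's settings.

    # Duplicate persistence module, sharing the original module's connection settings
    module.persistence_r4.type                                =PERSISTENCE_R4
    # Read-only so this module does not block batch 2 and scheduler processes
    module.persistence_r4.config.read_only_mode.enabled       =true
    module.persistence_r4.config.db.driver                    =POSTGRES_9_4
    module.persistence_r4.config.db.url                       =jdbc:postgres:cdr
    module.persistence_r4.config.db.username                  =cdr_user
    module.persistence_r4.config.db.password                  =changeme
    # Keep the pool small, since this module only reads batch job status
    module.persistence_r4.config.db.connectionpool.maxidle    =1
    module.persistence_r4.config.db.connectionpool.maxtotal   =5
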
Because Smile CDR is built from modules that can be used in various combinations and configurations, there is a wide variety of cluster designs that can be built.
The following diagram shows a sample design that follows a fairly common pattern:
These two nodes are scaled independently, so during slow periods each node might be served by 1-2 processes. During busy periods the FHIR node might scale up to many times more processes, while the Administration node might not.