Message Broker Failure Management
Several Smile CDR modules make use of message brokers, acting as consumers. For example, the Channel Import module consumes messages from a broker.
In the normal course of operation, a consumer ingests messages from one or more brokers and performs whatever processing is necessary internally. However, if something goes wrong during that processing, messages may be dropped or silently lost.
To mitigate this issue, any Smile CDR module that requires an intake channel can also specify a retry channel, a failed channel, and a collection of other settings that control retry behaviour:
module.module_name.config.channel.retry.name=my-retry-channel
module.module_name.config.channel.retry.delay_milliseconds=5000
module.module_name.config.channel.retry.maximum_delay_milliseconds=6000
module.module_name.config.channel.retry.maximum_attempts=3
module.module_name.config.channel.retry.strategy=CONSTANT
module.module_name.config.channel.retry.retriable_exceptions=ca.uhn.fhir.rest.server.exceptions.InvalidRequestException
module.module_name.config.channel.failed.name=my-failure-channel
Setting these properties in a module will cause Smile CDR to automatically wrap any message handling in a retry mechanism following the rules outlined in the properties. Note that all the configuration settings above must be set (i.e. no blank, null or zero values) for the retry mechanism to be enabled. Below is a rough explanation of how the retry mechanism works.
If message handling fails with one of the exceptions listed in channel.retry.retriable_exceptions, the message handler will set headers on the message indicating the retry count, first failure timestamp, and last failure timestamp. The message then moves onto the channel defined in channel.retry.name.
channel.retry.delay_milliseconds is the minimum number of milliseconds between attempts.
channel.retry.strategy determines whether the backoff is exponential or constant. If the delay is 5000ms, a constant backoff would retry every 5 seconds, whereas an exponential backoff would retry after 5 seconds, then 10, then 20, and so on.
channel.retry.maximum_delay_milliseconds provides an upper bound for the delay in case of exponential growth.
If the number of attempts exceeds channel.retry.maximum_attempts, then the message handler will publish the message to the failed channel as defined in channel.failed.name. The message handler will also add the unhandled exception to the headers of the message. Smile CDR does not consume messages from this channel; they should be consumed by an external reporting system, or potentially a dead-letter consumer.
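The rules above can be illustrated with a small sketch. This is not Smile CDR's implementation; it is a minimal Python model written under several assumptions: the class and function names (RetryConfig, delay_for, route_failure) are hypothetical, the header keys ("retry-count", "first-failure", "last-failure", "exception") are invented for illustration, the strategy value "EXPONENTIAL" is assumed as the counterpart of CONSTANT, and a message is treated as exhausted once its retry count has reached maximum_attempts.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class RetryConfig:
    """Hypothetical model of the channel.retry.* and channel.failed.* settings."""
    retry_channel: str
    failed_channel: str
    delay_ms: int
    maximum_delay_ms: int
    maximum_attempts: int
    strategy: str  # "CONSTANT", or "EXPONENTIAL" (assumed value name)
    retriable_exceptions: Tuple[str, ...]

    def enabled(self) -> bool:
        # The retry mechanism is on only if no setting is blank, empty or zero.
        values = (self.retry_channel, self.failed_channel, self.delay_ms,
                  self.maximum_delay_ms, self.maximum_attempts,
                  self.strategy, self.retriable_exceptions)
        return all(v.strip() if isinstance(v, str) else v for v in values)

    def delay_for(self, attempt: int) -> int:
        """Delay in milliseconds before retry number `attempt` (1-based)."""
        if self.strategy == "EXPONENTIAL":
            # Double the delay on each attempt, bounded by the maximum delay.
            return min(self.delay_ms * 2 ** (attempt - 1), self.maximum_delay_ms)
        return self.delay_ms


def route_failure(config: RetryConfig, headers: Dict[str, object],
                  exception_name: str, now_ms: int
                  ) -> Optional[Tuple[str, Dict[str, object]]]:
    """Pick the destination channel and headers for a message that failed."""
    if exception_name not in config.retriable_exceptions:
        # The text does not describe this case, so the sketch leaves it open.
        return None
    retries = int(headers.get("retry-count", 0))
    if retries >= config.maximum_attempts:
        # Attempts exhausted: publish to the failed channel with the exception.
        return config.failed_channel, {**headers, "exception": exception_name}
    updated = {
        **headers,
        "retry-count": retries + 1,
        "first-failure": headers.get("first-failure", now_ms),
        "last-failure": now_ms,
    }
    return config.retry_channel, updated
```

With the example settings shown earlier (CONSTANT, 5000ms delay), delay_for returns 5000 for every attempt. Under the assumed exponential strategy with the same 6000ms maximum, the first retry waits 5000ms and every later retry is capped at 6000ms, which shows why the maximum delay only matters when the delay grows.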