12.3 Debezium

As mentioned in the overview, Realtime Export's data source can be configured in one of two ways. The first is POINTCUT, which Smile CDR manages internally. It has relatively low latency, but use cases that require ultra-low latency can use the CHANNEL source instead. The CHANNEL source does not rely on the Storage module; it relies on third-party software to populate the channel. Tools such as Debezium are built for this purpose.

What is Debezium?

Debezium is a Kafka Connect connector that takes change events from a database and puts them on a Kafka topic. It is a Change Data Capture (CDC) technology that captures changes to a database table by reading the database's transaction log. Because it tails the log rather than repeatedly querying the tables, it places little additional load on the database compared to query-based CDC.
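To make the shape of a change event more concrete, the following Python sketch unpacks a single event as it might appear on the topic when the JSON converter is used with schemas enabled. The envelope fields (before, after, op, ts_ms) are the ones Debezium documents for its change events; the sample event itself, including the column names and values, is an illustrative assumption and not output captured from a real system.

```python
import json

# Illustrative event, trimmed for brevity. Column names and values are
# assumptions made for this example, not real output.
SAMPLE_EVENT = """
{
  "schema": {},
  "payload": {
    "before": null,
    "after": {"PID": 101, "RES_ID": 42, "RES_VER": 1},
    "op": "c",
    "ts_ms": 1700000000000
  }
}
"""

# Operation codes used in the Debezium event envelope.
OPERATIONS = {"c": "create", "u": "update", "d": "delete", "r": "snapshot read"}


def summarize_change(raw_event: str) -> dict:
    """Return the operation and row images carried by one change event."""
    event = json.loads(raw_event)
    # With schemas.enable=true the JSON converter wraps the event in a
    # schema/payload pair; without it, the event is the payload itself.
    payload = event.get("payload", event)
    return {
        "operation": OPERATIONS.get(payload.get("op"), "unknown"),
        "before": payload.get("before"),
        "after": payload.get("after"),
        "timestamp_ms": payload.get("ts_ms"),
    }


if __name__ == "__main__":
    print(summarize_change(SAMPLE_EVENT))
```

For an insert, before is null and after holds the new row; for a delete, the reverse is true.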

Configuring Debezium for Realtime Export

This example shows how to set up Debezium with an MS SQL Server database.

First, enable change data capture on the persistence database in MS SQL Server, and then on the HFJ_RES_VER table:

sqlcmd -S localhost -U SA -P <PASSWORD>
1> USE cdr;
2> GO
Changed database context to 'cdr'.
1> EXEC sys.sp_cdc_enable_db  
2> GO
1> EXEC sys.sp_cdc_enable_table
2> @source_schema = N'dbo',
3> @source_name = N'HFJ_RES_VER',
4> @role_name = NULL,
5> @filegroup_name = NULL,
6> @capture_instance = N'DBO_HFJ_RES_VER_2',
7> @supports_net_changes = 0
8> GO

Then create a configuration file for the Kafka Connect process:

connect-standalone.properties:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# These are defaults. This file just demonstrates how to override some settings.
bootstrap.servers=localhost:9092

# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
key.converter.schemas.enable=true
value.converter.schemas.enable=true

offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
offset.flush.interval.ms=10000

# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include 
# any combination of: 
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples: 
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
plugin.path=/usr/share/java/

Finally, create the Debezium properties file:

debezium-mssql-connector.properties:

name=mssqlDebezium
connector.class=io.debezium.connector.sqlserver.SqlServerConnector
database.hostname=<DATABASE_HOSTNAME>
database.port=1433
database.user=SA
database.password=<PASSWORD>
database.dbname=cdr
database.server.name=mssql_events
table.whitelist=dbo.hfj_res_ver
database.history.kafka.bootstrap.servers=localhost:9092
database.history.kafka.topic=testing_mssql

Download the Debezium Connector for MS SQL Server and place the extracted directory in the plugin.path specified in connect-standalone.properties (e.g. /usr/share/java).

Ensure that the Channel Name specified in the Realtime Export configuration matches the value of database.history.kafka.topic in debezium-mssql-connector.properties. In this example, that value is testing_mssql.
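The check described above can be scripted. The Python sketch below reads the connector properties and compares database.history.kafka.topic with the Channel Name you intend to use. The helper names load_properties and channel_matches are hypothetical, and the parser is deliberately simplified: it handles only plain key=value lines and comments, not the full Java properties syntax.

```python
# Contents of debezium-mssql-connector.properties, inlined so the example
# is self-contained. In practice you would read the file from disk.
CONNECTOR_PROPERTIES = """
name=mssqlDebezium
connector.class=io.debezium.connector.sqlserver.SqlServerConnector
database.dbname=cdr
database.server.name=mssql_events
table.whitelist=dbo.hfj_res_ver
database.history.kafka.bootstrap.servers=localhost:9092
database.history.kafka.topic=testing_mssql
"""


def load_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def channel_matches(properties_text: str, channel_name: str) -> bool:
    """True if the configured topic equals the Realtime Export Channel Name."""
    props = load_properties(properties_text)
    return props.get("database.history.kafka.topic") == channel_name


if __name__ == "__main__":
    print(channel_matches(CONNECTOR_PROPERTIES, "testing_mssql"))
```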

Running Debezium

To run the Debezium Kafka Connect process, run the command below, replacing /etc/kafka with the directory that contains your Kafka Connect properties files.

connect-standalone /etc/kafka/connect-standalone.properties /etc/kafka/debezium-mssql-connector.properties
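Once the process is up, one way to confirm that the connector started is to query the Kafka Connect REST interface. The Python sketch below assumes that the worker's REST interface is reachable at http://localhost:8083 (the Kafka Connect default port) and uses the connector name mssqlDebezium from the properties file above; adjust both for your environment. The function names are hypothetical.

```python
import json
import urllib.request


def connector_is_running(status: dict) -> bool:
    """True if the connector and all of its tasks report the RUNNING state."""
    connector_ok = status.get("connector", {}).get("state") == "RUNNING"
    tasks = status.get("tasks", [])
    tasks_ok = bool(tasks) and all(t.get("state") == "RUNNING" for t in tasks)
    return connector_ok and tasks_ok


def fetch_status(name: str, base_url: str = "http://localhost:8083") -> dict:
    """Fetch the status document for one connector from the Connect worker."""
    url = f"{base_url}/connectors/{name}/status"
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


# Usage, which requires a running Kafka Connect worker:
#   print(connector_is_running(fetch_status("mssqlDebezium")))
```

A task in the FAILED state usually includes a trace field in the same status document, which is a good first place to look when no events arrive on the channel.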

Limitations of Debezium

  1. Debezium only works with Kafka as a broker. It is unavailable to users of other brokers such as ActiveMQ.
  2. Debezium currently cannot be used for this purpose with PostgreSQL because of how PostgreSQL stores large objects.
  3. Oracle is compatible with Debezium, but it requires a subscription to Oracle GoldenGate. The steps above still apply, but refer to Debezium's driver notes for Oracle-specific options, as they differ from the MS SQL Server example shown here.