In modern data architectures, Change Data Capture (CDC) is crucial for real-time data replication across systems. This article demonstrates how to integrate a SingleStore database with Apache Kafka using the Debezium connector for SingleStore, which under the hood tracks all create, update and delete operations on SingleStore tables.
Apache Kafka
Apache Kafka is a messaging system that allows clients to publish and read streams of data (also called events). It has an ecosystem of open-source solutions that you can combine to store, process and integrate these data streams with other parts of your system in a secure, reliable and scalable way.
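As a quick illustration, assuming a broker reachable on localhost:9092 and Kafka's CLI tools on your PATH, publishing and reading a stream of events can be as simple as:

```shell
# publish events to a topic (type one event per line, Ctrl+C to stop)
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic events

# read the same events back, from the beginning of the topic
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic events --from-beginning
```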
Kafka Connect
To build integration solutions, you can use the Kafka Connect framework, which provides a suite of connectors to integrate Kafka with external systems. There are two types of Kafka connectors:
- Source connectors, which move data from source systems into Kafka topics
- Sink connectors, which send data from Kafka topics into a target (sink) system (see the configuration sketch after this list)
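For a feel of what a connector definition looks like, here is a minimal sketch of a sink connector configuration. The connector class, topic and connection URL are hypothetical placeholders, not part of this demo:

```json
{
  "name": "example-sink-connector",
  "config": {
    "connector.class": "org.example.kafka.ExampleSinkConnector",
    "tasks.max": "1",
    "topics": "orders",
    "connection.url": "jdbc:mysql://target-host:3306/target_db"
  }
}
```

The source connector we register later in this article follows the same shape.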
Debezium
Debezium is a set of distributed services that capture row-level changes in your databases so that your applications can see and respond to those changes. Debezium records in a transaction log all row-level changes committed to each database table. Each application simply reads the transaction logs it's interested in, seeing all the events in the same order in which they occurred.
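To make this concrete, here is a trimmed sketch of the change event Debezium could emit for an UPDATE on the demo table used later in this article. The field values are illustrative, and the exact envelope depends on the connector and converter settings:

```json
{
  "before": { "id": 1, "Student_Name": "Alice", "age": 21, "marks": 85 },
  "after":  { "id": 1, "Student_Name": "Alice", "age": 21, "marks": 92 },
  "source": { "connector": "singlestore", "db": "CDCOUT_KAFKA", "table": "Kafka_test_table" },
  "op": "u",
  "ts_ms": 1723224140000
}
```

The "op" field distinguishes creates ("c"), updates ("u") and deletes ("d"), which is how downstream consumers react to each kind of change.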
Demo details
Check out this blog to learn more about SingleStore’s CDC capabilities.
Prerequisites
The demo will have the following components:
- Docker and Docker Compose: ensure both are installed on your machine
- SingleStore: the database hosting the table to be streamed
- Apache Kafka: the messaging system
- Zookeeper: the distributed coordination service that helps manage Kafka
- Debezium: captures and streams changes from SingleStore
Docker and Docker Compose installation
You can install Docker using the guide found here. Depending on your platform, also make sure Docker Compose is available.
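Once installed, you can verify both are available from a terminal:

```shell
docker --version
docker compose version
```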
Setting up the environment
We will use Docker Compose to run the following components:
- SingleStore database dev image: ghcr.io/singlestore-labs/singlestoredb-dev:latest
- Zookeeper image: wurstmeister/zookeeper
- Kafka image: wurstmeister/kafka
- Kafka Connect: debezium/connect:1.6
- Kafdrop: obsidiandynamics/kafdrop (optional, but it lets us browse topics through a GUI)

Also download the SingleStore Debezium Connector from this link and extract it; the extracted path will be used in the compose file.
Here is a sample compose file for this demo, which we're calling S2cdc.yml:
```yaml
version: '3.8'

services:
  # SingleStore
  singlestore:
    image: ghcr.io/singlestore-labs/singlestoredb-dev:latest
    platform: linux/x86_64
    ports:
      - 3306:3306
      - 8080:8080
    environment:
      # use the LICENSE_KEY environment variable set in the terminal:
      - SINGLESTORE_LICENSE=<YOUR LICENSE KEY OR TRIAL LICENSE KEY>
      - ROOT_PASSWORD=<YOUR ROOT PASSWORD>

  zookeeper:
    image: wurstmeister/zookeeper
    platform: linux/x86_64
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    container_name: zookeeper
    ports:
      - "2181:2181"

  kafka:
    image: wurstmeister/kafka
    container_name: kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_LISTENERS: INSIDE://kafka:9092,OUTSIDE://localhost:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
      KAFKA_LISTENERS: INSIDE://0.0.0.0:9092,OUTSIDE://0.0.0.0:9093
      KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_CREATE_TOPICS: "baeldung:1:1"
    depends_on:
      - zookeeper

  kafka-connect:
    image: debezium/connect:1.6
    platform: linux/x86_64
    container_name: kafka-connect
    ports:
      - "8083:8083"
    environment:
      - BOOTSTRAP_SERVERS=kafka:9092
      - GROUP_ID=connect-cluster
      - CONFIG_STORAGE_TOPIC=my_connect_configs
      - OFFSET_STORAGE_TOPIC=my_connect_offsets
      - STATUS_STORAGE_TOPIC=my_connect_statuses
    depends_on:
      - kafka
      - zookeeper
    volumes:
      - <path to your SingleStore Debezium Connector>/singlestore-debezium-connector:/kafka/connect/singlestore

  kafdrop:
    image: obsidiandynamics/kafdrop
    platform: linux/x86_64
    container_name: kafdrop
    ports:
      - "9000:9000"
    environment:
      KAFKA_BROKERCONNECT: kafka:9092
    depends_on:
      - kafka
```
Create the containers
```shell
SingleStore_CDCOUT_TO_KAFKA % docker compose -f S2cdc.yml up -d
```
Once the containers are created, you should see the following status:
```shell
[+] Running 6/6
 ✔ Network singlestore_cdcout_to_kafka_kafka-net         Created  0.0s
 ✔ Container zookeeper                                   Started  0.4s
 ✔ Container singlestore_cdcout_to_kafka-singlestore-1   Started  0.4s
 ✔ Container kafka                                       Started  0.4s
 ✔ Container kafdrop                                     Started  0.5s
 ✔ Container kafka-connect                               Started
```
Validate the containers
```shell
SingleStore_CDCOUT_TO_KAFKA % docker ps
CONTAINER ID   IMAGE                                               COMMAND                  CREATED              STATUS                        PORTS                                                      NAMES
c307da3f3dbb   debezium/connect:1.6                                "/docker-entrypoint.…"   About a minute ago   Up About a minute             8778/tcp, 9092/tcp, 0.0.0.0:8083->8083/tcp, 9779/tcp       kafka-connect
b260d3082396   obsidiandynamics/kafdrop                            "/kafdrop.sh"            About a minute ago   Up About a minute             0.0.0.0:9000->9000/tcp                                     kafdrop
3ff856911f96   wurstmeister/kafka                                  "start-kafka.sh"         About a minute ago   Up About a minute             0.0.0.0:9092->9092/tcp                                     kafka
3f4425daec34   wurstmeister/zookeeper                              "/bin/sh -c '/usr/sb…"   About a minute ago   Up About a minute             22/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp         zookeeper
799a0ee5a935   ghcr.io/singlestore-labs/singlestoredb-dev:latest   "/scripts/start.sh"      About a minute ago   Up About a minute (healthy)   0.0.0.0:3306->3306/tcp, 0.0.0.0:8080->8080/tcp, 9000/tcp   singlestore_cdcout_to_kafka-singlestore-1
```
Exec into your Kafka Connect container to verify the SingleStore Debezium connector JARs are present:
```shell
SingleStore_CDCOUT_TO_KAFKA % docker exec -it kafka-connect /bin/bash
[kafka@c307da3f3dbb ~]$ cd connect
[kafka@c307da3f3dbb connect]$ ls
debezium-connector-db2       debezium-connector-mongodb    debezium-connector-mysql   debezium-connector-oracle
debezium-connector-postgres  debezium-connector-sqlserver  debezium-connector-vitess  singlestore
[kafka@c307da3f3dbb connect]$ cd singlestore
[kafka@c307da3f3dbb singlestore]$ pwd
/kafka/connect/singlestore
```
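You can also confirm that Kafka Connect has loaded the connector by querying its REST API; the SingleStore connector class should appear in the returned plugin list:

```shell
curl -s localhost:8083/connector-plugins
# the output should include "com.singlestore.debezium.SingleStoreConnector"
```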
Log in to your SingleStore cluster and create the table that will be streamed.
If you are using the dev image, SingleStore Studio is available on port 8080 (check your Docker Compose YAML).
```sql
CREATE DATABASE CDCOUT_KAFKA;
USE CDCOUT_KAFKA;
CREATE TABLE Kafka_test_table (
  id INT,
  Student_Name VARCHAR(50),
  age INT,
  marks INT
);
```
Enable observe for the cluster
Register the SingleStore Debezium connector through Kafka Connect's REST API; under the hood, the connector relies on SingleStore's OBSERVE capability to stream changes from the table:

```shell
curl -i -X POST \
  -H "Accept:application/json" \
  -H "Content-Type:application/json" \
  127.0.0.1:8083/connectors/ \
  -d '
  {
    "name": "singlestore-debezium-connector",
    "config":
    {
      "connector.class": "com.singlestore.debezium.SingleStoreConnector",
      "tasks.max": "1",
      "database.hostname": "singlestore",
      "database.port": "3306",
      "database.user": "root",
      "database.password": "Root",
      "topic.prefix": "SingleStore_CDCOUT_KAFKA_TEST",
      "database.dbname": "CDCOUT_KAFKA",
      "database.table": "Kafka_test_table",
      "delete.handling.mode": "none",
      "topic.name": "S2_CDCOUT_KAFKA"
    }
  }'
```
You should see that the operation was successful:
```shell
HTTP/1.1 201 Created
Date: Fri, 09 Aug 2024 17:22:20 GMT
Location: http://172.18.0.6:8083/connectors/singlestore-debezium-connector
Content-Type: application/json
Content-Length: 449
Server: Jetty(9.4.38.v20210224)

{"name":"singlestore-debezium-connector","config":{"connector.class":"com.singlestore.debezium.SingleStoreConnector","tasks.max":"1","database.hostname":"singlestore","database.port":"3306","database.user":"root","database.password":"Root","topic.prefix":"SingleStore_CDCOUT_KAFKA","database.dbname":"CDCOUT_KAFKA","database.table":"Kafka_test_table","delete.handling.mode":"none","name":"singlestore-debezium-connector"},"tasks":[],"type":"source"}
[kafka@c307da3f3dbb connect]$
```
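At any point, you can also check the connector and task state through the Kafka Connect REST API:

```shell
curl -s 127.0.0.1:8083/connectors/singlestore-debezium-connector/status
# expect "state":"RUNNING" for both the connector and its task
```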
Navigate to the Kafdrop UI to see the topics. It should be accessible on port 9000 (check your Docker Compose file).


Validate existing entries

Add a few entries to see if the changes are streamed to Kafka.
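For example, using sample values that match the table created earlier:

```sql
USE CDCOUT_KAFKA;
INSERT INTO Kafka_test_table (id, Student_Name, age, marks) VALUES (1, 'Alice', 21, 85);
INSERT INTO Kafka_test_table (id, Student_Name, age, marks) VALUES (2, 'Bob', 22, 78);
UPDATE Kafka_test_table SET marks = 92 WHERE id = 1;
DELETE FROM Kafka_test_table WHERE id = 2;
```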
In Kafdrop, you can see that the topic's offset count has increased.

Now you can proceed to validate the data.
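One way to inspect the change events is Kafka's console consumer inside the broker container; substitute the topic name shown in Kafdrop (Debezium derives it from topic.prefix, the database and the table):

```shell
docker exec -it kafka kafka-console-consumer.sh \
  --bootstrap-server kafka:9092 \
  --topic <your topic name from Kafdrop> \
  --from-beginning
```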


You can see how easy it is to set up a streaming CDC feed out to Kafka from SingleStore. Check out our SingleStore documentation for more info, and start your free trial today.