MySQL RDS Data Integration Into SingleStore

4 min read

Nov 4, 2024

With more than 300 database systems available in 2024, choosing the right database, equipped with the necessary features, flexibility and cost profile, is more important than ever.

There are several factors to consider, from a database's query language to its performance, and even the way it handles data consistency. One major capability that can aid in this choice is data integration.

Integrating with a new database can be an incredibly cumbersome process. There are many factors to take into account, from the integrity and consistency of data to schema differences, and even potential downtime and performance impact.

To make life a little simpler, let’s walk through a step-by-step tutorial for integrating an Amazon RDS MySQL instance with SingleStore.

If you aren’t familiar, SingleStore is a real-time database designed to power modern, intelligent applications — delivering maximum performance for both transactional (OLTP) and analytical (OLAP) workloads in a single unified engine. SingleStore also features native data integration services, including connection links. These links automatically detect table schemas and provision pipelines supporting high-speed consistent data ingestion.

Before proceeding, make sure you have the proper accounts set up and packages installed:

SingleStore setup

Navigate to SingleStore’s Cloud Portal, SingleStore Helios®. Provision a new workspace group containing a SingleStore (S-00) workspace with the preset configurations.

Under the workspace, navigate to the “Firewall” tab and select “Outbound” under secure connections. Note down the four outbound IP addresses. These IP addresses will be used to create inbound connections to the MySQL RDS instance.

AWS

We will now set up an Amazon RDS MySQL instance in a newly provisioned Amazon VPC. In the root directory of the cloned GitHub repo, run the following command to provision these resources:

./scripts/deploy.sh

The AWS CloudFormation stack includes outputs from the provisioned architecture. These outputs serve as parameters for updating the environment to support the data integration. Note down the RDS Identifier output; we will use it when creating the connection link.

echo "Pulling environment variables configuration"
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export AWS_REGION=$(aws configure get region)
echo "Setting environment variables from CloudFormation Outputs"
export RDS_ENDPOINT=$(aws cloudformation describe-stacks \
--stack-name MySQLStack \
--query 'Stacks[0].Outputs[?OutputKey==`RdsEndpointExport`].OutputValue' \
--region $AWS_REGION \
--output text)
echo "RDS Endpoint: $RDS_ENDPOINT"
export RDS_IDENTIFIER=$(aws cloudformation describe-stacks \
--stack-name MySQLStack \
--query 'Stacks[0].Outputs[?OutputKey==`RdsInstanceIdentifier`].OutputValue' \
--region $AWS_REGION \
--output text)
echo "RDS Identifier: $RDS_IDENTIFIER"
export RDS_PG_NAME=$(aws cloudformation describe-stacks \
--stack-name MySQLStack \
--query 'Stacks[0].Outputs[?OutputKey==`RdsParameterGroupName`].OutputValue' \
--region $AWS_REGION \
--output text)
echo "RDS Parameter Group Name: $RDS_PG_NAME"
export SG_ID=$(aws cloudformation describe-stacks \
--stack-name MySQLStack \
--query 'Stacks[0].Outputs[?OutputKey==`RdsSecurityGroup`].OutputValue' \
--region $AWS_REGION \
--output text)
echo "RDS Security Group Id: $SG_ID"
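Before moving on, it can save debugging time to confirm that all four outputs actually resolved. A small sanity check over the variables set above (a bash-specific sketch, using indirect variable expansion):

```shell
# Warn early if any CloudFormation output failed to resolve.
for VAR in RDS_ENDPOINT RDS_IDENTIFIER RDS_PG_NAME SG_ID; do
  if [ -z "${!VAR}" ]; then
    echo "Missing value for $VAR -- check that MySQLStack deployed successfully"
  fi
done
```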

To support a connection from SingleStore, the AWS environment must include:

  1. Security group inbound request
  2. Binlog enabled
  3. Tables with primary keys
  4. Replication privileges for the user (by default, the primary user has these privileges)

Security group inbound request

To support the security group inbound request, run the following command once for every outbound IP address noted down from your SingleStore workspace, replacing <INSERT IP ADDRESS HERE> each time.

IP_ADDRESS="<INSERT IP ADDRESS HERE>"
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 3306 \
--cidr $IP_ADDRESS/32
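Rather than running the command four separate times, the same authorization can be scripted as a loop. A minimal sketch, assuming the placeholder addresses below are replaced with the four outbound IPs you noted from your workspace’s Firewall tab:

```shell
# Placeholder addresses -- replace with the four outbound IPs
# noted from your SingleStore workspace.
SINGLESTORE_IPS=("203.0.113.10" "203.0.113.11" "203.0.113.12" "203.0.113.13")

# Authorize inbound MySQL traffic (port 3306) from each IP as a /32 CIDR.
for IP_ADDRESS in "${SINGLESTORE_IPS[@]}"; do
  aws ec2 authorize-security-group-ingress \
    --group-id "$SG_ID" \
    --protocol tcp \
    --port 3306 \
    --cidr "$IP_ADDRESS/32" \
    || echo "Failed to authorize $IP_ADDRESS (check SG_ID and AWS credentials)"
done
```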

Enable binlog

The binary logs contain a record of all changes to the databases, both data and structure. Data integration into SingleStore requires this to be enabled on the MySQL instance. The following commands update the parameter group to support this.

aws rds modify-db-parameter-group \
--db-parameter-group-name $RDS_PG_NAME \
--parameters "ParameterName=binlog_format,ParameterValue=ROW,ApplyMethod=immediate" \
> /dev/null 2>&1
aws rds modify-db-instance \
--db-instance-identifier $RDS_IDENTIFIER \
--db-parameter-group-name $RDS_PG_NAME \
--apply-immediately \
> /dev/null 2>&1

Restart the RDS instance upon completion to apply the changes. This may take a few minutes.

aws rds wait db-instance-available --db-instance-identifier $RDS_IDENTIFIER
aws rds reboot-db-instance --db-instance-identifier $RDS_IDENTIFIER \
--output text
aws rds wait db-instance-available --db-instance-identifier $RDS_IDENTIFIER
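To confirm the new binlog settings actually took effect after the reboot, you can query the server variables directly. A quick check, assuming the mysql client, the CA bundle from the repo, and the same testuser credentials used later in this tutorial:

```shell
# After the reboot, binlog_format should report ROW and log_bin should be ON.
QUERY="SHOW VARIABLES WHERE Variable_name IN ('binlog_format', 'log_bin');"
mysql -h "$RDS_ENDPOINT" --ssl-ca=global-bundle.pem -P 3306 \
  -u testuser -p'TestPassword123$' \
  -e "$QUERY" \
  || echo "Could not connect -- check RDS_ENDPOINT and credentials"
```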

Populate the data + create metadata storage

Connect to the MySQL RDS instance from the project root directory.

mysql -h $RDS_ENDPOINT --ssl-ca=global-bundle.pem -P 3306 -u testuser -p --local-infile=1

Enter the password: TestPassword123$.

We will first populate the instance with a database containing the tables ride_data and user_data.

-- Create the database
CREATE DATABASE IF NOT EXISTS migration;
USE migration;
-- Create tables
CREATE TABLE ride_data (
ride_id VARCHAR(50) PRIMARY KEY,
user_id VARCHAR(50),
vehicle_id VARCHAR(50),
timestamp DATETIME,
pickup_location_lat FLOAT,
pickup_location_long FLOAT,
dropoff_location_lat FLOAT,
dropoff_location_long FLOAT,
ride_status VARCHAR(50)
);
CREATE TABLE user_data (
user_id VARCHAR(50) PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100),
registration_date DATETIME
);
-- Load data
LOAD DATA LOCAL INFILE 'data/ride_data.csv'
INTO TABLE ride_data
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(ride_id, user_id, vehicle_id, @timestamp, pickup_location_lat,
pickup_location_long, dropoff_location_lat,
dropoff_location_long, ride_status)
SET timestamp = STR_TO_DATE(@timestamp, '%Y-%m-%dT%H:%i:%s');
LOAD DATA LOCAL INFILE 'data/user_data.csv'
INTO TABLE user_data
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(user_id, name, email, @registration_date)
SET registration_date = STR_TO_DATE(@registration_date,
'%Y-%m-%dT%H:%i:%s');

We will now create a new database named singlestore. This database stores the connection link’s pipeline metadata, which is used to confirm the status of the pipeline and other operations.

CREATE DATABASE IF NOT EXISTS singlestore;

Everything is now set up in our AWS environment to support a connection link to SingleStore. Let’s navigate to the SQL editor in SingleStore to set up a connection with your provisioned workspace!

SingleStore ingestion

In the SQL editor, we will create a database and a connection link, provision tables automatically from the link, and pipeline the data into SingleStore.

Creating a database

CREATE DATABASE IF NOT EXISTS migration;
USE migration;

Replace <hostname> with the RDS_ENDPOINT you noted earlier.

CREATE LINK mysql_link AS MYSQL
CONFIG '{
"database.hostname": "<hostname>",
"database.port": 3306,
"database.ssl.mode":"required"
}'
CREDENTIALS '{
"database.password": "TestPassword123$",
"database.user": "testuser"
}';

Migration

Auto-detect the table schemas through inference from the connector.

CREATE TABLES IF NOT EXISTS AS INFER PIPELINE AS LOAD DATA
LINK mysql_link "*" FORMAT AVRO;

The tables have now been provisioned, along with their corresponding pipelines. Start the pipelines to populate the tables.

START ALL PIPELINES;
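To watch the pipelines as they run, SingleStore provides a SHOW PIPELINES command. Since SingleStore workspaces speak the MySQL wire protocol, you can run it from the SQL editor or from any MySQL-compatible client; the endpoint and password below are placeholders for your own workspace connection details:

```shell
# Placeholders -- substitute the endpoint and password from your
# SingleStore workspace's connection details.
S2_ENDPOINT="<your-workspace-endpoint>"
S2_PASSWORD="<your-password>"

# List the pipelines in the migration database and their current state.
mysql -h "$S2_ENDPOINT" -P 3306 -u admin -p"$S2_PASSWORD" migration \
  -e "SHOW PIPELINES;" \
  || echo "Could not connect -- check the endpoint and password"
```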

Your tables have now been populated! Retrieve the counts to double-check your records.

SELECT COUNT(*) FROM ride_data; -- 1000 rows expected
SELECT COUNT(*) FROM user_data; -- 100 rows expected

Congratulations! You have successfully migrated an Amazon RDS MySQL instance to SingleStore using a connection link.

Cleanup

If you decide you no longer want to keep the AWS resources, you can run the following script to tear down the resources within AWS:

./scripts/teardown.sh

If you decide you no longer want to keep the SingleStore resources, spin down the workspace, database and workspace group.

In this blog, you provisioned an Amazon RDS MySQL instance and updated the configurations to support an integration to SingleStore. The updates included:

  1. Enabling binlog
  2. Updating the security group for SingleStore inbound access
  3. Ensuring primary keys are included in the tables

Following data integration, you may consider migrating over to SingleStore entirely. One example of a customer who migrated from MySQL to SingleStore is 6sense. By making this move, 6sense achieved 5x lower TCO, 5x faster performance and 5x faster time to market with SingleStore for its real-time analytics use cases.

What were some of the tradeoffs between MySQL and SingleStore that 6sense took into consideration? Check out a breakdown of key metrics between the two products here.

Try SingleStore free.

