5 Steps to Migrate From MySQL to SingleStore

6 min read

Nov 25, 2024

As your business grows, you might start hitting the limitations of single-node databases like MySQL — especially when dealing with large datasets and the need for real-time analytics. But migrating to a more powerful database doesn't have to be a complex, daunting task.

Enter SingleStore's Change Data Capture (CDC) through pipelines — a feature that allows you to replicate your MySQL data into SingleStore in real time. With CDC pipelines, you can seamlessly migrate your data without significant downtime or complexity.

In this post, we'll show you how to use MySQL CDC through pipelines to migrate from MySQL to SingleStore in five easy steps:

  1. Prepare your environment. Set up your SingleStore workspace and access to your MySQL instance.
  2. Configure MySQL for replication. Enable binary logging, create the necessary databases and set up a replication user in MySQL.
  3. Establish a connection between MySQL and SingleStore. Create a link using SingleStore's CDC pipelines to connect to your MySQL database.
  4. Migrate data using CDC pipelines. Create tables in SingleStore with inferred schemas and start data ingestion.
  5. Verify and monitor the migration. Validate data integrity and monitor real-time updates.

By the end of this guide, you'll have a fully functioning SingleStore database that's synchronized with your MySQL instance, ready to handle high-performance workloads and provide real-time analytics — all made possible by using MySQL CDC through pipelines.

Why choose SingleStore over MySQL?

You might be wondering, why should I migrate from MySQL to SingleStore? Let's break it down.

MySQL vs. SingleStore: A comparison

| Feature | MySQL | SingleStore |
| --- | --- | --- |
| Architecture | Single-node or manual sharding; limited distributed capabilities | Fully distributed, scale-out architecture with built-in sharding and rebalancing |
| Scalability | Limited horizontal scalability; complex sharding required for large datasets | High scalability; designed to handle terabytes to petabytes of data effortlessly |
| Data ingestion speed | Slower ingestion rates; not optimized for high-throughput data | High-throughput parallel data ingestion with real-time capabilities |
| Analytical performance | Optimized for OLTP; struggles with complex analytical queries | Hybrid OLTP and OLAP capabilities; optimized for real-time analytics and complex queries |
| Storage format | Row-based storage, less efficient for read-heavy analytical workloads | Supports both rowstore and columnstore formats for optimal performance based on workload |
| Real-time analytics | Limited real-time analytics; often requires additional tools or data warehouses | Built-in support for real-time analytics with rapid query execution |
| Vector storage and search | Not natively supported; requires external systems | Native support for vector data types and similarity searches, beneficial for AI/ML applications |
| Cost efficiency | Hidden costs due to additional tools, manual scaling efforts and maintaining multiple databases | Cost-effective scaling; reduces the need for multiple specialized databases, lowering total costs |
| Complexity | Increased complexity when integrating with other systems for analytics and scaling | Simplifies architecture by consolidating workloads into a single, unified platform |
| Data compression | Limited compression capabilities, leading to higher storage costs | Advanced compression techniques reduce storage costs and improve performance |

Prerequisites

Before we dive in, make sure you have the following ready:

  • SingleStore workspace. Sign up for a free SingleStore Helios® account and create a workspace.
  • MySQL instance. Access to your source MySQL database.
  • SingleStore Notebooks. An interactive environment within SingleStore for running SQL commands.
  • MySQL user with replication privileges. A user account in MySQL configured for replication.
  • Network access. Ensure your SingleStore workspace can connect to your MySQL instance.
  • Primary keys on all source tables. All tables in your MySQL database must have primary keys.

Step 1. Prepare your environment

Set up your SingleStore workspace

  • Create a workspace. If you haven't already, sign up for SingleStore Helios and create a new workspace.
  • Access Notebooks. Navigate to the Develop tab and open Notebooks. This is where we'll execute our SQL commands.

Verify access to your MySQL instance

Ensure you have the necessary credentials and network access to your MySQL database. You'll need this in the upcoming steps.

Step 2. Configure MySQL for replication

Time to get your MySQL instance ready for migration.

1. Enable binary logging. Modify your MySQL configuration file (my.cnf or my.ini) to include:

log-bin = mysql-bin
binlog_format = ROW
binlog_row_image = FULL

Restart your MySQL server to apply these changes.
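
To confirm the changes took effect after the restart, you can check the relevant server variables from any MySQL client:

SHOW VARIABLES LIKE 'log_bin';          -- expect ON
SHOW VARIABLES LIKE 'binlog_format';    -- expect ROW
SHOW VARIABLES LIKE 'binlog_row_image'; -- expect FULL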

2. Create the singlestore database. This is required for internal usage by SingleStore.

CREATE DATABASE IF NOT EXISTS singlestore;

Note: Do not replicate the singlestore database; it's used for metadata and internal operations.

3. Set up a replication user

CREATE USER 'repl_user'@'%' IDENTIFIED BY 'Password@123';
GRANT CREATE, INSERT, DELETE, DROP, SELECT ON singlestore.* TO 'repl_user'@'%';
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'repl_user'@'%';
GRANT SELECT ON *.* TO 'repl_user'@'%';

Ensure the user uses the mysql_native_password authentication plugin. This user will be used by SingleStore to connect to your MySQL instance.
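
If the user was created with a different authentication plugin (for example caching_sha2_password, the default in MySQL 8.0), you can switch it and confirm, reusing the same repl_user from above:

ALTER USER 'repl_user'@'%' IDENTIFIED WITH mysql_native_password BY 'Password@123';

-- Confirm the plugin in use
SELECT user, host, plugin FROM mysql.user WHERE user = 'repl_user';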

4. Ensure all tables have primary keys. CDC requires primary keys to track changes accurately.
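
If you're not sure every table qualifies, a query like the following against MySQL's information_schema (with <your_database> substituted) lists tables that lack a primary key:

SELECT t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints tc
  ON tc.table_schema = t.table_schema
  AND tc.table_name = t.table_name
  AND tc.constraint_type = 'PRIMARY KEY'
WHERE t.table_schema = '<your_database>'
  AND t.table_type = 'BASE TABLE'
  AND tc.constraint_name IS NULL;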

5. Check network accessibility. Make sure your SingleStore workspace can connect to your MySQL instance. Adjust firewalls and security groups as needed.

Step 3. Establish a connection between MySQL and SingleStore

Now, let's bridge the gap between MySQL and SingleStore using Change Data Capture (CDC) pipelines.

1. Create a database in SingleStore. In your SingleStore Notebook, run:

DROP DATABASE IF EXISTS mysql_replicate_cdc;
CREATE DATABASE IF NOT EXISTS mysql_replicate_cdc;

Select this database for subsequent operations.
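
One way to do this in SQL (rather than through the Notebook's database selector) is:

USE mysql_replicate_cdc;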

2. Ensure the database is empty

SHOW TABLES;

You should see that no tables are present.

3. Create a link to MySQL

CREATE LINK mysql_replicate_link AS MYSQL
CONFIG '{
"database.hostname": "<your_mysql_host>",
"database.port": 3306,
"database.ssl.mode": "required"
}'
CREDENTIALS '{
"database.user": "repl_user",
"database.password": "Password@123"
}';

  • Replace <your_mysql_host> with your actual MySQL host
  • The database.ssl.mode parameter specifies the SSL mode for the connection
  • Note: Do not include table.include.list or database.exclude.list in the CREATE LINK statement; these parameters are specified later

4. Verify the link

SHOW LINKS;

You should see mysql_replicate_link listed.

Step 4. Migrate data using CDC pipelines

Here's where the magic happens.

Option 1. Replicate a single table

1. Create the your_table table with inferred schema

CREATE TABLE your_table AS
INFER PIPELINE AS
LOAD DATA LINK mysql_replicate_link "<your_database>.<your_table>"
FORMAT AVRO;

Replace <your_database> and <your_table> with your MySQL database and table names.

This command:

  • Creates a your_table table in SingleStore
  • Infers the schema from your MySQL table
  • Sets up a CDC pipeline for real-time data ingestion

2. Inspect the created table

DESCRIBE your_table;

Ensure the schema matches your expectations.
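
You can also inspect the pipeline that was generated alongside the table. Assuming the pipeline takes the same name as the table (the typical result of CREATE TABLE ... AS INFER PIPELINE), run:

SHOW CREATE PIPELINE your_table;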

Option 2. Replicate multiple tables

If you want to replicate all tables from your MySQL database:

1. Create tables and pipelines for all tables

CREATE TABLES AS INFER PIPELINE AS
LOAD DATA LINK mysql_replicate_link "*"
FORMAT AVRO;

This command will create tables and pipelines for all tables in your MySQL database.
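
To see what was created, list the tables and pipelines in the current database:

SHOW TABLES;
SHOW PIPELINES;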

2. Filter specific tables or databases (optional)

To include only specific tables or databases, specify parameters in the CONFIG clause:

CREATE TABLES AS INFER PIPELINE AS
LOAD DATA MYSQL "*"
CONFIG '{
"database.include.list": "<your_database>",
"table.include.list": "<your_database>.<your_table>"
}'
CREDENTIALS '{
"database.user": "repl_user",
"database.password": "Password@123"
}'
FORMAT AVRO;

Note: The table.include.list and database.include.list parameters are supported in the CREATE TABLES AS INFER PIPELINE statement, not in the CREATE LINK statement.

Start data ingestion

1. Start the pipelines

START ALL PIPELINES;

2. Monitor the pipelines

SHOW PIPELINES;

The status should show as Running.
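
If you prefer a query over the SHOW output, the same state information is available from SingleStore's information_schema (a quick sketch; filter on your own database name):

SELECT pipeline_name, state
FROM information_schema.pipelines
WHERE database_name = 'mysql_replicate_cdc';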

3. Verify initial data

SELECT COUNT(*) FROM your_table;
SELECT * FROM your_table LIMIT 5;

You should see data flowing in from MySQL.

Step 5. Verify and monitor the migration

Let's make sure everything is working as it should.

Validate data integrity

1. Insert new data into MySQL. In your MySQL client, run:

INSERT INTO <your_database>.<your_table> (columns...)
VALUES (...);

Add new records to test real-time replication.
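
For example, assuming a hypothetical orders table with id and amount columns, the test insert might look like:

-- Hypothetical table and columns; substitute your own
INSERT INTO <your_database>.orders (id, amount)
VALUES (1001, 49.99);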

2. Check for new data in SingleStore. Back in SingleStore, run:

SELECT COUNT(*) FROM your_table;
SELECT * FROM your_table ORDER BY <primary_key_column> DESC LIMIT 5;

The new records should appear almost instantly, thanks to CDC pipelines.
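
A simple end-to-end sanity check is to run the same row count on both sides once the pipeline has caught up and confirm they match:

-- On MySQL
SELECT COUNT(*) FROM <your_database>.<your_table>;

-- On SingleStore
SELECT COUNT(*) FROM your_table;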

Monitor real-time data updates

1. Check CDC pipeline status

SELECT SOURCE_PARTITION_ID,
EARLIEST_OFFSET,
LATEST_OFFSET,
LATEST_EXPECTED_OFFSET - LATEST_OFFSET AS DIFF,
UPDATED_UNIX_TIMESTAMP
FROM information_schema.pipelines_cursors;

This helps you see if there are any lags.
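
To roll this up into a single lag figure per pipeline, you can sum the difference across partitions (a sketch that assumes a pipeline_name column alongside the offset columns shown above):

SELECT pipeline_name,
SUM(LATEST_EXPECTED_OFFSET - LATEST_OFFSET) AS total_lag
FROM information_schema.pipelines_cursors
GROUP BY pipeline_name;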

2. Look for pipeline errors

SELECT * FROM information_schema.pipelines_errors;

Ensure there are no errors.

3. Review pipeline performance

SELECT * FROM information_schema.pipelines_batches_summary;

Get insights into batch sizes and ingestion rates.

Get started with SingleStore

Migrating from MySQL to SingleStore doesn't have to be complicated. By leveraging MySQL CDC through pipelines, you can replicate your data in real time while maintaining data consistency and keeping downtime to a minimum.

Key points to remember

  • All source tables must have primary keys. CDC requires primary keys to track changes accurately
  • Do not alter source tables after starting CDC. Avoid running ALTER TABLE statements on the source MySQL database once the CDC operation has started
  • Use supported parameters appropriately. Parameters like table.include.list are specified in the CREATE TABLES AS INFER PIPELINE statement, not in the CREATE LINK statement

Why SingleStore?

  • Performance at scale. Handle massive datasets effortlessly, and experience high-throughput data ingestion and lightning-fast queries
  • Real-time analytics. Get immediate insights without impacting transactional workloads
  • Simplified architecture. Replace multiple specialized databases with a unified platform
  • Cost efficiency. Lower your total cost of ownership by consolidating systems
  • Future-proofing. Stay ahead with built-in support for modern data types, including vectors for AI and machine learning applications

Ready to take the leap? Start your migration to SingleStore today and empower your business with real-time analytics and unparalleled performance — all made possible through MySQL CDC pipelines.

For more information, visit the SingleStore Documentation or contact our support team for assistance.

