How We Built Our Real-Time Digital Marketing App

7 min read

Aug 27, 2024

SingleStore’s real-time digital marketing demo combines both OLTP and OLAP workloads, and is a great example of how to build real-time hybrid transactional and analytical processing (HTAP) applications on top of SingleStore.

In this blog, we’ll provide a high-level overview of the app and dive deeper into the key SingleStore features that make building this type of application possible — with one database.

App overview

This demo simulates a fictional telco with millions of subscribers, serving targeted offers to customers in real time. Targeted advertisements are selected based on demographics, geolocation, purchase history and much more. SingleStore evaluates 80 million ad-serving opportunities and serves 30,000 to 50,000 ads and offers to cell phones — all in real time. This digital marketing app sends out offer notifications based on matching algorithms and complex customer segmentation over real-time and batch data.

Additionally, there is an operational analytics side of the application showing real-time conversion metrics for advertisers, among other statistics. The following is a picture of one of the dashboards presented in the app.

Being able to handle both OLTP and OLAP workloads in one database, across a variety of data types, is one of SingleStore’s key strengths. Let’s see how this is done by walking through the key features used to power this application.

Schema design

SingleStore has three different table types with differing strengths that can be used to optimize performance for HTAP apps.

Universal Storage

SingleStore’s default table type is its patented Universal Storage table, which combines SingleStore’s three storage layers: memory, disk and object storage. When writing to this table type, data first lands in memory and is immediately available to query. Because data is written to memory first, we achieve extremely high throughput. Data is then flushed to the on-disk columnar store.

In this layer, data is typically compressed by 75-80%, allowing for performant scans of the data while still supporting fast single-row lookups via sub-segment access and hash indexes. In addition, column group indexes can provide a further performance boost on OLTP queries. These performance characteristics make Universal Storage performant in both OLTP and OLAP workloads.
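To sketch what that last point looks like in DDL (this table is not from the demo, and to our understanding only full-row column groups, written as (*), are supported — check the docs for your version), a column group index is declared alongside the other keys:

-- Sketch only: a full-row column group speeds up wide-row reads
-- (e.g. SELECT *) against a Universal Storage table.
CREATE TABLE example_events (
    id BIGINT NOT NULL,
    payload JSON,
    SHARD KEY (id),
    SORT KEY (id),
    COLUMN GROUP cg_full (*)
);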

In the demo, we used this table type for our insert-heavy tables that serve OLAP-type queries — for example, the locations table, where data is continuously appended. Here is the DDL for that table:

CREATE TABLE IF NOT EXISTS locations (
    city_id BIGINT NOT NULL,
    subscriber_id BIGINT NOT NULL,
    ts DATETIME(6) NOT NULL SERIES TIMESTAMP,
    lonlat GEOGRAPHYPOINT NOT NULL,
    -- open location code length 8 (275m resolution)
    olc_8 TEXT NOT NULL,
    SHARD KEY (city_id, subscriber_id),
    SORT KEY (ts),
    KEY (city_id, subscriber_id) USING HASH,
    KEY (olc_8) USING HASH
);

You may have noticed that the preceding CREATE TABLE DDL defines a shard key and a sort key. These are two important design decisions in SingleStore, and both influence performance. The shard key controls how rows are distributed across partitions, while the sort key determines how data is ordered within each partition — and therefore how much data gets scanned to run your queries.
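To make this concrete, here are two illustrative queries against the locations table (the literal IDs and time windows are made up for this post). The first is an OLAP-style aggregation that the sort key keeps narrow; the second is an OLTP-style point lookup served by the hash index:

-- Time-bounded aggregation: SORT KEY (ts) lets the engine skip
-- column segments that fall entirely outside the five-minute window.
SELECT olc_8, COUNT(DISTINCT subscriber_id) AS subscribers
FROM locations
WHERE ts > NOW(6) - INTERVAL 5 MINUTE
GROUP BY olc_8;

-- Point lookup: the hash key on (city_id, subscriber_id) avoids a scan.
SELECT ts, lonlat
FROM locations
WHERE city_id = 1 AND subscriber_id = 42
ORDER BY ts DESC
LIMIT 1;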

Moreover, the locations table exemplifies SingleStore's multi-model capabilities by integrating relational, geo-spatial and time-series data. This ability to handle structured, semi-structured and unstructured data types within a single database underscores SingleStore's versatility and power in managing diverse workloads, eliminating the need for multiple specialized databases.
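For example, a single hypothetical query (the point and radius are invented for illustration) can combine the time-series and geospatial columns to find subscribers near a given spot:

-- Hypothetical: subscribers seen within 500 meters of a point
-- during the last minute.
SELECT subscriber_id, ts, lonlat
FROM locations
WHERE ts > NOW(6) - INTERVAL 1 MINUTE
  AND GEOGRAPHY_WITHIN_DISTANCE(lonlat, 'POINT(-73.9857 40.7484)', 500);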

Rowstore tables

Although Universal Storage handles OLTP workloads well, our in-memory rowstore table is available when you need even more performance. Because it is in-memory and stored in row format, this table type is ideal for high-concurrency OLTP workloads with heavy updates and deletes. This is why it was chosen for our subscribers_last_notification table, where the majority of the write workload consists of updates. The following is the DDL used to create that table:

CREATE ROWSTORE TABLE IF NOT EXISTS subscribers_last_notification (
    city_id BIGINT NOT NULL,
    subscriber_id BIGINT NOT NULL,
    last_notification DATETIME(6),
    PRIMARY KEY (city_id, subscriber_id),
    INDEX (last_notification)
);
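A typical write against this table is an upsert like the following (the literal values are illustrative), which is exactly the update-heavy pattern the in-memory rowstore excels at under high concurrency:

-- Record the time a subscriber was last notified, updating in place
-- if a row for (city_id, subscriber_id) already exists.
INSERT INTO subscribers_last_notification (city_id, subscriber_id, last_notification)
VALUES (1, 42, NOW(6))
ON DUPLICATE KEY UPDATE last_notification = VALUES(last_notification);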

Reference tables

The final table type available in SingleStore is our reference tables, which are great for dimension-type tables that frequently get joined to large fact tables. A full copy of the table gets stored on each leaf node, eliminating the need to move data across the network when joining these tables.

The cities table found in this demo illustrates the power of reference tables: 

CREATE ROWSTORE REFERENCE TABLE IF NOT EXISTS cities (
    city_id BIGINT NOT NULL PRIMARY KEY,
    city_name TEXT NOT NULL,
    center GEOGRAPHYPOINT NOT NULL,
    diameter DOUBLE
);

This lightweight table houses four columns and is important to expanding the application. Notice that each column holds data that won’t change any time soon, making it a perfect fit for a table that should remain identical on every node. For example, joining the locations table to cities results in local joins at the leaf level, as no data needs to move across the network to complete the join.
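As an illustration (this query is made up for this post), each leaf can join its local locations partitions against its own full copy of cities and aggregate without any shuffling:

-- Local join: every leaf holds a complete copy of cities, so no
-- rows move across the network to satisfy the join.
SELECT c.city_name, COUNT(*) AS pings
FROM locations l
JOIN cities c ON c.city_id = l.city_id
WHERE l.ts > NOW(6) - INTERVAL 5 MINUTE
GROUP BY c.city_name;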

Ingest

To simulate subscribers walking around the city and making purchases, we create synthetic data using a simulator written in Golang. This synthetic data is written out to Parquet files and uploaded to AWS S3 buckets.

From here, ingesting this data into SingleStore is handled via Pipelines, which are designed for large-scale, parallel data ingestion. SingleStore Pipelines are capable of handling both real-time and batch data sources, like Kafka for streaming data and S3 for batch data stored in files. This flexibility ensures that regardless of how your data is generated or stored, it can be efficiently ingested into SingleStore.
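For instance, a streaming source looks nearly identical to a file-based one. Here is a hypothetical Kafka pipeline (the broker, topic, table and column mappings are invented) to contrast with the real S3 pipeline shown below:

-- Hypothetical streaming pipeline: only the source clause changes.
CREATE PIPELINE click_events
AS LOAD DATA KAFKA 'broker-host:9092/click-events'
INTO TABLE click_events
FORMAT JSON (
    subscriber_id <- subscriberId,
    offer_id <- offerId
);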

Creating a pipeline in SingleStore is straightforward, with a simple syntax that allows developers to quickly set up data ingestion. Here is one of the pipeline statements used in the demo. The statement specifies the location of the data, the credentials to use, and where to point the pipeline:

CREATE OR REPLACE PIPELINE locations
AS LOAD DATA S3 'singlestore-realtime-digital-marketing/locations.*'
CREDENTIALS '{}'
CONFIG '{ "region": "us-east-1" }'
INTO PROCEDURE process_locations FORMAT PARQUET (
    subscriber_id <- subscriberid,
    offset_x <- offsetX,
    offset_y <- offsetY
);
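Note that creating a pipeline does not start it. As a usage sketch, you start the pipeline and can then monitor its state and any per-batch errors through information_schema:

-- Start ingesting, then check pipeline health.
START PIPELINE locations;

SELECT * FROM information_schema.PIPELINES
WHERE PIPELINE_NAME = 'locations';

SELECT * FROM information_schema.PIPELINES_ERRORS
WHERE PIPELINE_NAME = 'locations';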

Pipelines usually point to a table where data is directly inserted or updated. However, if you want to add extra logic to your writes, you can point the pipeline at a stored procedure — which is what we’re leveraging in this demo to write to multiple tables from one pipeline.

The following is the procedure the locations pipeline writes to. The procedure updates the subscriber’s current location in the subscribers table, and also inserts the location data into the locations table.

DELIMITER //
CREATE OR REPLACE PROCEDURE process_locations (
    _batch QUERY(
        subscriber_id BIGINT NOT NULL,
        offset_x DOUBLE NOT NULL,
        offset_y DOUBLE NOT NULL
    )
)
AS
DECLARE
    -- Expand each batch row into an absolute point, using the city's
    -- center and diameter from the replicated cities reference table.
    _expanded QUERY(city_id BIGINT, subscriber_id BIGINT, lonlat GEOGRAPHYPOINT) = SELECT
        city_id, subscriber_id,
        GEOGRAPHY_POINT(
            GEOGRAPHY_LONGITUDE(center) + (offset_x * diameter),
            GEOGRAPHY_LATITUDE(center) + (offset_y * diameter)
        ) AS lonlat
    FROM _batch, cities;
BEGIN
    -- Upsert the subscriber's current position.
    INSERT INTO subscribers (city_id, subscriber_id, current_location)
    SELECT city_id, subscriber_id, lonlat
    FROM _expanded
    ON DUPLICATE KEY UPDATE current_location = VALUES(current_location);

    -- Append the ping to the time-series locations table.
    INSERT INTO locations (city_id, subscriber_id, ts, lonlat, olc_8)
    SELECT
        city_id,
        subscriber_id,
        NOW(6) AS ts,
        lonlat,
        encode_open_location_code(lonlat, 8) AS olc_8
    FROM _expanded;
END //
DELIMITER ;

The ingest side of this demo shows how SingleStore efficiently manages both streaming and batch data ingestion, thanks to its three-tiered architecture. It also handles update and delete operations with high performance, making it an ideal solution for use cases involving heavy ingestion workloads and rapidly changing, dynamic data.

Queries

Our real-time digital marketing app also showcases the power of operational analytics in serving targeted ads and offers to users based on their behavior and real-time location. The app utilizes three key components:

  1. Offers. Offers are stored in a table containing details like offer ID, vendor, description and expiration.
  2. Segments. Segments are defined in an in-memory table, allowing for super fast lookups of customer characteristics.
  3. Matching. The matching process combines these elements with real-time customer data to determine which offers are most relevant, and sends out notifications in real time. This is achieved through a sophisticated SQL query that joins customer information, segments and offers, applying complex filtering and ranking logic to select the best matches — all running in under 50 milliseconds (a simplified sketch follows this list).
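The following is a heavily simplified sketch of that matching query, not the demo’s actual SQL; the subscriber_segments table and the maximum_bid_cents ranking column are hypothetical stand-ins:

-- Simplified sketch: pick the highest-value eligible offer per
-- subscriber, skipping anyone notified in the last minute.
WITH candidates AS (
    SELECT
        s.subscriber_id,
        o.offer_id,
        ROW_NUMBER() OVER (
            PARTITION BY s.subscriber_id
            ORDER BY o.maximum_bid_cents DESC  -- hypothetical ranking column
        ) AS rnk
    FROM subscribers s
    JOIN subscriber_segments ss ON ss.subscriber_id = s.subscriber_id  -- hypothetical
    JOIN offers o ON o.segment_id = ss.segment_id
    LEFT JOIN subscribers_last_notification n
        ON n.city_id = s.city_id AND n.subscriber_id = s.subscriber_id
    WHERE n.last_notification IS NULL
       OR n.last_notification < NOW(6) - INTERVAL 1 MINUTE
)
SELECT subscriber_id, offer_id
FROM candidates
WHERE rnk = 1;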

The app’s frontend is built as a single-page web application using React, connecting to SingleStore’s data API for seamless data retrieval. The application lets users select a city of interest and watch offers being sent out on a map in real time. In addition, the app displays metrics for query runtimes, the amount of data ingested, how many subscribers we’re reaching and how many offers we’re sending out.

SingleStore also provides an analytics tab, showing how the offers are performing. This page shows us both the gross and net conversion rate in real time, as well as a sortable dashboard that displays individual company performance. The dashboard can be filtered to sort the companies in ascending or descending order based on their conversion rates, total conversions or total notifications.

This is all powered by SingleStore’s data API, which enables direct SQL queries from the browser, eliminating the need for (and latency from) a separate backend server. This architecture allows for real-time updates and interactions, like displaying customer locations on a map and showing offer details.

Build your real-time applications with SingleStore

The app demonstrates SingleStore's capability to handle both transactional and analytical workloads in a single system, evaluating millions of ad-serving opportunities and simulating thousands of ads and offers in real time. This unified approach to operational analytics enables businesses to make data-driven decisions, and deliver personalized experiences at scale.

Want to build your own real-time application? Start free with SingleStore. 

