This guide walks you through integrating Presto, a distributed SQL query engine, with SingleStoreDB, covering architecture, installation, sample queries and more.
What Is Presto?
Presto is a distributed SQL query engine for big data. Its architecture lets users query data sources such as Hadoop, Cassandra, Kafka, AWS S3, Alluxio, MySQL, MongoDB and Teradata, and a single query can combine data from multiple sources.
Simply put, Presto offers compute that runs on top of storage.
Presto architecture
What Is SingleStoreDB?
SingleStoreDB is a real-time, distributed SQL database. With familiar SQL tooling and MySQL wire protocol compatibility, SingleStoreDB eliminates the need for specialized databases and simplifies database architectures.
SingleStoreDB is also built to handle multiple data types (including JSON, time-series, geospatial and full-text search) — delivering high-speed data ingest on a unified transactional and analytical foundation.
SingleStoreDB architecture
Using Presto with SingleStoreDB
Presto is more performant when integrated with SingleStoreDB for a few key reasons:
- Similar to SingleStoreDB, Presto supports in-memory processing
- Presto uses a pull model
- Like SingleStoreDB, Presto supports columnar storage and execution in its query engine
- Presto supports multi-level caching
Sample use case
Now, let’s take a look at installing and running Presto with SingleStoreDB.
Presto Installation
Presto on EC2 Amazon Linux:
First, switch to the root user:
sudo su
Then update yum:
yum update -y
Now install Amazon Corretto 11, Amazon's distribution of OpenJDK:
yum install java-11-amazon-corretto.x86_64
Check that Java 11 is correctly installed:
java --version
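If you want to script the check above, the following is a minimal sketch. It assumes the first line of the java --version output has the form "vendor version date" (for example "openjdk 11.0.17 2022-10-18 LTS"); the helper name java_major is hypothetical and not part of any tool.

```shell
# Hypothetical helper: print the major version from `java --version` output.
# Assumes the first line looks like: openjdk 11.0.17 2022-10-18 LTS
java_major() {
  printf '%s\n' "$1" | head -n 1 | awk '{ split($2, v, "."); print v[1] }'
}
```

One way to use it: run java_major "$(java --version 2>&1)" and compare the result with 11 before continuing.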
Install the Presto binaries
Download the Presto release binaries onto the EC2 instance:
wget https://repo.maven.apache.org/maven2/io/prestosql/presto-server/330/presto-server-330.tar.gz
Extract the archive, which creates a directory named presto-server-330:
tar xvzf presto-server-330.tar.gz
Configure Presto and add a data source
Let’s provide a set of configuration files in presto-server-330/etc, add a data source and start the Presto daemon:
- Presto server configuration: etc/config.properties
- Presto node configuration: etc/node.properties
- JVM configuration: etc/jvm.config
- Catalog properties file for the MySQL connector: etc/catalog/mysql.properties
Create the etc directory in presto-server-330:
cd presto-server-330
mkdir etc
Then create the configuration files:
etc/config.properties
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8081
query.max-memory=5GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://172.31.21.146:8081
etc/node.properties
node.environment=demo
etc/jvm.config
-server
-Xmx4G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
-Djdk.nio.maxCachedBufferSize=2000000
-Djdk.attach.allowAttachSelf=true
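etc/catalog/mysql.properties (create the etc/catalog directory first if it does not exist). Because SingleStoreDB is compatible with the MySQL wire protocol, Presto connects to it through the MySQL connector.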
connector.name=mysql
connection-url=jdbc:mysql://localhost:3306
connection-user=root
connection-password=Singlestore@123
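The catalog step can also be scripted. The sketch below is one possible approach, not the article's own method: it creates etc/catalog if needed and writes the same four properties, taking the user and password as arguments so the password does not have to be typed into the file by hand. The function name write_mysql_catalog is hypothetical, and the connection URL is the localhost address used above.

```shell
# Hypothetical helper: write etc/catalog/mysql.properties under a Presto home.
# Usage: write_mysql_catalog <presto_home> <user> <password>
write_mysql_catalog() {
  presto_home="$1"
  user="$2"
  password="$3"
  # Create the catalog directory if it is missing.
  mkdir -p "$presto_home/etc/catalog"
  cat > "$presto_home/etc/catalog/mysql.properties" <<EOF
connector.name=mysql
connection-url=jdbc:mysql://localhost:3306
connection-user=$user
connection-password=$password
EOF
  # The file holds a credential, so restrict it to the owner.
  chmod 600 "$presto_home/etc/catalog/mysql.properties"
}
```

For example, from the directory that contains presto-server-330, you could call write_mysql_catalog presto-server-330 root followed by your own password.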
Run Presto
Let’s start Presto! Run it as a foreground process:
bin/launcher run
In the output of the previous command, you should see the following line:
INFO main io.prestosql.server.PrestoServer ======== SERVER STARTED
This indicates that you have a running instance of Presto.
You can access the Presto UI at http://{ec2-public-ip}:8081
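If you later run Presto in the background with bin/launcher start instead of bin/launcher run, the startup line goes to a log file rather than the terminal. The sketch below polls a log file for that line. The function name wait_for_started is hypothetical, and the log location var/log/server.log under the installation directory is an assumption based on the launcher defaults when node.data-dir is not set.

```shell
# Hypothetical helper: wait until a log file contains the startup marker.
# Usage: wait_for_started <log_file> [timeout_seconds]
# Returns 0 once "SERVER STARTED" appears, 1 if the timeout expires first.
wait_for_started() {
  log_file="$1"
  timeout="${2:-60}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    if grep -q "SERVER STARTED" "$log_file" 2>/dev/null; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}
```

A typical call, under the assumptions above, would be bin/launcher start followed by wait_for_started var/log/server.log 120.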
SingleStoreDB Installation
You can deploy SingleStoreDB using any of the methods listed in our deployment documentation.
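The sample session that follows runs a ./presto executable, which is the Presto CLI rather than part of the server archive. A minimal sketch for fetching it is shown here; it assumes the CLI is published in the same Maven repository as the server, as the artifact presto-cli with an -executable.jar suffix. The function names cli_url and install_cli are hypothetical.

```shell
# Assumption: the CLI version matches the server version used in this article.
PRESTO_VERSION="${PRESTO_VERSION:-330}"

# Hypothetical helper: print the download URL of the executable CLI jar.
cli_url() {
  printf 'https://repo.maven.apache.org/maven2/io/prestosql/presto-cli/%s/presto-cli-%s-executable.jar\n' "$1" "$1"
}

# Hypothetical helper: download the CLI as ./presto and make it executable.
install_cli() {
  wget -O presto "$(cli_url "$PRESTO_VERSION")" && chmod +x presto
}
```

Running install_cli inside presto-server-330 should leave a ./presto file that can be started as in the session below.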
Here is a sample session using the Presto CLI against SingleStoreDB:
[ec2-user@ip-172-31-21-146 presto-server-330]$ ./presto --server 172.31.21.146:8081 --catalog mysql --schema test
presto:test> show catalog;
Query 20230110_130053_00002_7qnha failed: line 1:6: mismatched input 'catalog'. Expecting: 'CATALOGS', 'COLUMNS', 'CREATE', 'CURRENT', 'FUNCTIONS', 'GRANTS', 'ROLE', 'ROLES', 'SCHEMAS', 'SESSION', 'STATS', 'TABLES'
show catalog
presto:test> show catalogs;
Catalog
---------
mysql
system
(2 rows)
Query 20230110_130101_00003_7qnha, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
185ms [0 rows, 0B] [0 rows/s, 0B/s]
presto:test> show schemas from mysql;
Schema
--------------------
cluster
information_schema
memsql
test
(4 rows)
Query 20230110_130111_00004_7qnha, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
167ms [4 rows, 55B] [23 rows/s, 329B/s]
presto:test> use mysql
-> ;
Query 20230110_130307_00005_7qnha failed: Schema does not exist: mysql.mysql
presto:test> use test;
USE
presto:test> create table presto(id int);
CREATE TABLE
presto:test> insert into presto values (222);
INSERT: 1 row
Query 20230110_130420_00010_7qnha, FINISHED, 1 node
Splits: 35 total, 35 done (100.00%)
0:01 [0 rows, 0B] [0 rows/s, 0B/s]
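To confirm that the inserted row is readable, you can query the table back through Presto. The sketch below wraps the CLI in a small function for non-interactive use. It assumes the server address, catalog and schema from the session above; the function name presto_query and the PRESTO_CLI and PRESTO_SERVER variables are hypothetical.

```shell
# Assumptions: CLI path and server address as used in the session above.
PRESTO_CLI="${PRESTO_CLI:-./presto}"
PRESTO_SERVER="${PRESTO_SERVER:-172.31.21.146:8081}"

# Hypothetical helper: run one SQL statement against the mysql catalog, test schema.
presto_query() {
  "$PRESTO_CLI" --server "$PRESTO_SERVER" --catalog mysql --schema test --execute "$1"
}
```

For example, presto_query "SELECT * FROM presto" should return the row inserted above, provided the server and SingleStoreDB are both running.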
Get Started Today with SingleStoreDB
In addition to integrating seamlessly with Presto, SingleStoreDB also works with a variety of analytics and BI tools, ETL platforms, security and governance tools, and monitoring technology. To see the full capabilities of SingleStoreDB integrations, get started with a free trial today.