Apache Iceberg has emerged as a powerful open table format that enables organizations to store, manage and access large-scale datasets efficiently across multiple query engines.

By decoupling storage from compute and supporting ACID transactions, Iceberg provides a foundation for data interoperability, allowing businesses to leverage multiple platforms to meet different analytical needs.
As organizations increasingly adopt hybrid and multi-cloud architectures, they require flexible solutions that integrate seamlessly with various data platforms. In this blog, we explore how Snowflake and SingleStore Helios® can work together using Iceberg to optimize data processing and analytics.
How SingleStore supports Iceberg: Key use cases
SingleStore provides native support for Apache Iceberg, enabling organizations to use it as a high-performance analytics engine for Iceberg-managed data. Some key use cases include:
- Acceleration for BI and customer-facing analytics. SingleStore’s distributed SQL engine delivers ultra-low latency and high concurrency, making it an ideal layer for dashboards and external-facing applications that need real-time access to data.
- Low-latency application layer. SingleStore's real-time ingestion and fast query execution make it a perfect fit for serving data directly to applications that require millisecond-level response times, including recommendation engines, fraud detection systems and interactive analytics applications.
Step-by-step guide: Using Snowflake and SingleStore for fast analytics
In this scenario, Snowflake acts as the data owner, storing Iceberg tables backed by S3 storage in AWS while SingleStore Helios, our cloud managed service, accelerates analytics by ingesting the data into its high-performance engine.
This is what our high-level architecture looks like:
[Architecture diagram: Snowflake stores Iceberg tables backed by S3, and SingleStore Helios ingests that data through pipelines for fast analytics]
Here is a step-by-step guide on how to set this up (note that parts of this tutorial reference Snowflake’s Iceberg guide):
1. AWS set up. First, we need to create an S3 bucket to hold the Iceberg data and metadata; it needs to be in the same region as your Snowflake account.
a. Create your S3 bucket. Sign in to your AWS console and make sure you are positioned in the same region as your Snowflake account. Go to S3 management and hit the “Create bucket” button. Give your bucket a name, leave all other settings as they are, and create the bucket.
b. Create an IAM policy. Go to IAM management, click “Account settings” under “Access management” and make sure the STS status is active for your AWS region. Then click “Policies” under “Access management” and hit the “Create policy” button. Use the JSON policy editor and paste in the following:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": "arn:aws:s3:::<my_bucket>/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::<my_bucket>",
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "*"
          ]
        }
      }
    }
  ]
}
Here, replace "<my_bucket>" with the name of your S3 bucket. This creates a policy with permissions to read and write data to the bucket (Snowflake will use this to manage data in Iceberg). Give your policy a name and create it.
c. Create an IAM role. Next, go to “Roles” under “Access management” and hit the “Create role” button. Select “AWS account”, “This account” and the “Require external ID” option. Input a descriptor here, like “iceberg_table_external_id”.
Click “Next”, select the policy you created in the previous step, give your role a name and create it. Make a note of your new role’s ARN (Amazon Resource Name), since you will need it in the next step.
2. Snowflake set up. We will leverage Snowflake’s TPC-H sample dataset for this demo.
a. Create external volume. The first step here is to create an external volume pointing to the S3 bucket where the data resides. This allows Snowflake to interact with Iceberg data on S3. Make sure to input the bucket name, your AWS role ARN and the external ID you used in the previous steps.
CREATE OR REPLACE EXTERNAL VOLUME iceberg_external_volume
  STORAGE_LOCATIONS =
    (
      (
        NAME = 'my-s3-us-west-2'
        STORAGE_PROVIDER = 'S3'
        STORAGE_BASE_URL = 's3://<my_bucket>/'
        STORAGE_AWS_ROLE_ARN = '<my-arn>'
        STORAGE_AWS_EXTERNAL_ID = 'iceberg_table_external_id'
      )
    );
Then, run the following command to retrieve Snowflake’s ARN:
DESC EXTERNAL VOLUME iceberg_external_volume;
Record the “STORAGE_AWS_IAM_USER_ARN” property. Go back to your AWS console, find the role you created in IAM management and edit the “Trust policy” (under the “Trust relationships” tab).
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "<snowflake_user_arn>"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<iceberg_table_external_id>"
        }
      }
    }
  ]
}
Replace the values in brackets above with your Snowflake ARN (from the external volume) and the External ID you used previously to define your AWS role.
b. Create Iceberg table. We then create Iceberg tables in Snowflake using the CREATE ICEBERG TABLE syntax and load the data. This will load data for two entities — customers (through an INSERT statement) and nations (through a CTAS statement).
CREATE OR REPLACE ICEBERG TABLE customer_iceberg (
  c_custkey INTEGER,
  c_name STRING,
  c_address STRING,
  c_nationkey INTEGER,
  c_phone STRING,
  c_acctbal INTEGER,
  c_mktsegment STRING,
  c_comment STRING
)
  CATALOG = 'SNOWFLAKE'
  EXTERNAL_VOLUME = 'iceberg_external_volume'
  BASE_LOCATION = 'customer_iceberg';

INSERT INTO customer_iceberg
  SELECT * FROM snowflake_sample_data.tpch_sf1.customer;

CREATE OR REPLACE ICEBERG TABLE nation_iceberg (
  n_nationkey INTEGER,
  n_name STRING
)
  BASE_LOCATION = 'nation_iceberg'
AS SELECT
  N_NATIONKEY,
  N_NAME
FROM snowflake_sample_data.tpch_sf1.nation;
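If you want to see where Snowflake wrote the Iceberg data and metadata in your bucket, you can ask Snowflake for each table’s metadata location. This is an optional check, sketched below assuming the table names from above (the exact output format depends on your Snowflake version):

-- optional: show the S3 metadata location Snowflake manages for each Iceberg table
SELECT SYSTEM$GET_ICEBERG_TABLE_INFORMATION('customer_iceberg');
SELECT SYSTEM$GET_ICEBERG_TABLE_INFORMATION('nation_iceberg');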
c. Configure catalog. In this case, Snowflake will also serve as the Iceberg catalog, so we need to configure that:
ALTER DATABASE iceberg_tutorial_db SET CATALOG = 'SNOWFLAKE';
ALTER DATABASE iceberg_tutorial_db SET EXTERNAL_VOLUME = 'iceberg_external_volume';
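To confirm the database-level defaults took effect, a quick sketch like the following should work (assuming the iceberg_tutorial_db database from the statements above; output details vary by Snowflake version):

-- optional: verify the catalog and external volume set on the database
SHOW PARAMETERS LIKE 'CATALOG' IN DATABASE iceberg_tutorial_db;
SHOW PARAMETERS LIKE 'EXTERNAL_VOLUME' IN DATABASE iceberg_tutorial_db;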
3. SingleStore set up. Now that we have set up the Snowflake side, we can continue with configuring SingleStore. The first step is to create the two tables (customer and nation) in your Helios organization:
-- create the demo database
CREATE DATABASE iceberg_demo;
USE iceberg_demo;

-- create the customer table
CREATE TABLE customer (
  C_CUSTKEY INTEGER,
  C_NAME TEXT,
  C_ADDRESS TEXT,
  C_NATIONKEY INTEGER,
  C_PHONE TEXT,
  C_ACCTBAL INTEGER,
  C_MKTSEGMENT TEXT,
  C_COMMENT TEXT,
  PRIMARY KEY(C_CUSTKEY)
);

-- create the nation table
CREATE TABLE nation (
  N_NATIONKEY INTEGER,
  N_NAME TEXT,
  PRIMARY KEY(N_NATIONKEY)
);
4. We will use SingleStore’s pipelines feature to load data into our tables. To do this, we enable Iceberg ingest via a global variable, and set the timeout for pipelines.
SET GLOBAL enable_iceberg_ingest = ON;
SET GLOBAL pipelines_extractor_get_offsets_timeout_ms = 600000;
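As an optional sanity check before creating the pipelines, you can confirm the ingest variable is enabled (a minimal sketch using standard SHOW syntax):

-- optional: confirm Iceberg ingest is enabled
SHOW GLOBAL VARIABLES LIKE 'enable_iceberg_ingest';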
5. Once this is configured, we create our pipelines using the following code. Make sure you replace the configuration details in brackets, like “<your-snowflake-database>”, with the actual names of those resources. Also, this assumes you have a Snowflake user that can access the warehouse, database, schema and tables, as well as an AWS access key and secret that SingleStore will use to access the S3 bucket with the Iceberg data.
-- create the pipeline to load customer data
CREATE OR REPLACE PIPELINE customer_iceberg_pipeline AS
LOAD DATA S3
'<your-snowflake-database>.<snowflake-schema>.<snowflake-customer-table-name>'
CONFIG '{"region": "<your-snowflake-region>",
  "catalog_type": "SNOWFLAKE",
  "catalog.uri": "jdbc:snowflake://<your-snowflake-account>.snowflakecomputing.com",
  "catalog.jdbc.user": "<your-snowflake-user>",
  "catalog.jdbc.password": "<your-snowflake-password>",
  "catalog.jdbc.role": "<your-snowflake-role>"}'
CREDENTIALS '{"aws_access_key_id": "<your-S3-key>",
  "aws_secret_access_key": "<your-S3-secret>"}'
REPLACE INTO TABLE customer (
  C_CUSTKEY <- C_CUSTKEY,
  C_NAME <- C_NAME,
  C_ADDRESS <- C_ADDRESS,
  C_NATIONKEY <- C_NATIONKEY,
  C_PHONE <- C_PHONE,
  C_ACCTBAL <- C_ACCTBAL,
  C_MKTSEGMENT <- C_MKTSEGMENT,
  C_COMMENT <- C_COMMENT
)
FORMAT ICEBERG;

-- create the pipeline to load nation data
CREATE OR REPLACE PIPELINE nation_iceberg_pipeline AS
LOAD DATA S3
'<your-snowflake-database>.<snowflake-schema>.<snowflake-nation-table-name>'
CONFIG '{"region": "<your-snowflake-region>",
  "catalog_type": "SNOWFLAKE",
  "catalog.uri": "jdbc:snowflake://<your-snowflake-account>.snowflakecomputing.com",
  "catalog.jdbc.user": "<your-snowflake-user>",
  "catalog.jdbc.password": "<your-snowflake-password>",
  "catalog.jdbc.role": "<your-snowflake-role>"}'
CREDENTIALS '{"aws_access_key_id": "<your-S3-key>",
  "aws_secret_access_key": "<your-S3-secret>"}'
REPLACE INTO TABLE nation (
  N_NATIONKEY <- N_NATIONKEY,
  N_NAME <- N_NAME
)
FORMAT ICEBERG;
6. You can now test and start your pipelines:
-- test and start the customer pipeline
TEST PIPELINE customer_iceberg_pipeline;
START PIPELINE customer_iceberg_pipeline;

-- test and start the nation pipeline
TEST PIPELINE nation_iceberg_pipeline;
START PIPELINE nation_iceberg_pipeline;
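Pipelines run in the background, so it helps to watch the initial load as it progresses. A minimal monitoring sketch, assuming the pipeline names above:

-- check pipeline status and look for ingestion errors
SHOW PIPELINES;
SELECT * FROM information_schema.PIPELINES_ERRORS
WHERE PIPELINE_NAME IN ('customer_iceberg_pipeline', 'nation_iceberg_pipeline');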
7. Data is now being ingested into SingleStore, and you can query it and compare the results with Snowflake:
SELECT COUNT(*) FROM customer;
SELECT * FROM customer;

-- you can now power your customer-facing analytics with SingleStore, and run
-- queries like this with high concurrency and low latency:
SELECT c.c_name AS customer_name
FROM customer c
INNER JOIN nation n ON c.c_nationkey = n.n_nationkey
WHERE n.n_name = 'UNITED KINGDOM' AND c.c_mktsegment = 'BUILDING'
LIMIT 15;

SELECT COUNT(*) FROM nation;
SELECT * FROM nation;
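To compare against the source, run the matching counts on the Snowflake side against the Iceberg tables created earlier; once the pipelines have caught up, the numbers should line up with the SingleStore results:

-- run in Snowflake to compare row counts with SingleStore
SELECT COUNT(*) FROM customer_iceberg;
SELECT COUNT(*) FROM nation_iceberg;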
Try SingleStore free
By leveraging Snowflake and Iceberg for storage and SingleStore as an acceleration layer, organizations can achieve the best of both worlds — cost-effective data warehousing and high-performance analytics. Iceberg acts as the glue, ensuring interoperability and flexibility in modern data architectures.
You can also see that the configuration and setup of this architecture is fairly simple, involving only a few steps in SingleStore — creating the tables and pipelines to load data, just as with S3, Kafka or any other data source.
Ready to try it yourself? Start free with SingleStore today.