Bi-Directional Integration for Apache Iceberg

Announcing: Bi-Directional Integration for Apache Iceberg

I found this quite interesting: being able to write to Iceberg tables. But when will the capabilities described in that blog post become available?
Can't wait for that.

We're excited about the potential of writing to Iceberg tables as well. Select customers are currently previewing this feature. I'd love to hear more about your use case: what are you looking to achieve, and what's top of mind for you?

I'm particularly interested in whether this feature can handle massive amounts of data, potentially terabytes or even petabytes. How does it manage Iceberg table metadata under a high volume of write operations, especially at scale?

Also, do you have plans to release this feature in the near future? And when can we expect support for S3-compatible storage or the Hive Metastore?

Our use case is to use SingleStore as the ingestion layer for transforming raw data into Iceberg tables. From what I understand, Spark is still one of the most common choices for this task, but we've found SingleStore Pipelines to be highly efficient and effective for importing large volumes of data daily. Once SingleStore writes the data in Iceberg format, we'd like to use it for querying and analysis, either in SingleStore itself or in other engines such as Presto.
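For concreteness, the ingestion half of this pattern might look like the sketch below, which uses standard SingleStore Pipelines syntax to continuously load raw files from S3. All table names, bucket paths, column definitions, and credentials here are placeholders, not part of the announcement; the Iceberg write-out itself would rely on the preview feature discussed above.

```sql
-- Hypothetical landing table; schema is illustrative only.
CREATE TABLE raw_events (
  event_id   BIGINT,
  event_time DATETIME,
  payload    JSON
);

-- Pipeline that continuously ingests CSV files from an S3 prefix.
-- Bucket, region, and credentials are placeholders.
CREATE PIPELINE load_raw_events AS
LOAD DATA S3 'my-bucket/raw/events/'
CONFIG '{"region": "us-east-1"}'
CREDENTIALS '{"aws_access_key_id": "<key>", "aws_secret_access_key": "<secret>"}'
INTO TABLE raw_events
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

START PIPELINE load_raw_events;
```

With data landed this way, the bi-directional integration would then be responsible for materializing the table in Iceberg format so that engines like Presto can query it alongside SingleStore.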

Would love to hear your thoughts!

We can definitely meet the scale you're describing. I'd be happy to walk you through how we handle high-scale writes, and to discuss our roadmap for S3-compatible storage and Hive Metastore support.

Let’s set up some time to go over your use case in more detail—feel free to reach out to me at akoller@singlestore.com, and we can coordinate a time that works for you. Looking forward to the conversation!