Hi, I need to connect my Databricks notebooks to a native SingleStore 3 leaf instance in AWS. Is there any documentation on setting up the connection?
Welcome Wrachilla1!
Can you take a look at this and let me know if it is helpful?
Process & Analyze SingleStore Data in Databricks (AWS) (cdata.com)
Thanks
I have used SingleStoreDB Cloud on AWS with Databricks and the setup required for the Databricks notebook to communicate with SingleStoreDB is quite straightforward.
You need to add the SingleStore Spark Connector using Maven and also a JDBC driver jar file. Once that is done, there are a few simple steps in the notebook to add the address of the SingleStoreDB cluster and a few Spark settings.
There is a set of instructions in this 3-part series:
Parts 1 and 2 would be the most relevant for setup and connection.
Note that you can use higher versions of the Databricks Runtime (DBR), and you should choose a Spark Connector version that matches your Spark version.
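For illustration, the two cluster libraries (added via the Libraries tab with Maven as the source) follow coordinates like the ones below; the versions are placeholders, so check Maven Central for the release that matches your Spark version:

com.singlestore:singlestore-spark-connector_2.12:<connector-version>-spark-<spark-version>
com.singlestore:singlestore-jdbc-client:<driver-version>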
In the examples, I use a Setup notebook holding the address of the SingleStoreDB server and the password, and a second notebook with the Spark settings. However, you could combine both in a single notebook. For example:
# Connection details for the SingleStoreDB cluster
server = "<TO DO>"    # hostname of the SingleStoreDB server
password = "<TO DO>"  # password for the database user
port = "3306"         # default SingleStoreDB port

cluster = server + ":" + port

# Point the SingleStore Spark Connector at the cluster and set credentials
spark.conf.set("spark.datasource.singlestore.ddlEndpoint", cluster)
spark.conf.set("spark.datasource.singlestore.user", "admin")
spark.conf.set("spark.datasource.singlestore.password", password)
# Keep SQL pushdown enabled so eligible work is pushed down to SingleStoreDB
spark.conf.set("spark.datasource.singlestore.disablePushdown", "false")