As a component of the IBM watsonx platform, watsonx.ai is designed to merge groundbreaking generative AI technologies with conventional machine learning.
This all-in-one studio simplifies the AI development process, providing a streamlined environment for training, validating, tuning and deploying generative AI, foundation models and machine learning capabilities. With watsonx.ai, you can create AI quickly across the entire enterprise — even when working with limited datasets.
Clients can also integrate the platform with SingleStoreDB for real-time contextual data, enabling organizations to customize large language models (LLMs) to meet specific business requirements. SingleStoreDB combines hybrid search and analytics capabilities to deliver high performance, serving as a knowledge base for generative AI applications and feeding accurate contextual data to watsonx.ai's LLMs in milliseconds.
Deploying watsonx.ai and SingleStoreDB together can help reduce network costs and performance bottlenecks, while adding layers of security and simplifying deployment. This unified platform is designed to deliver a holistic solution for the many AI needs of a business.
SingleStoreDB is known for its HTAP (hybrid transactional/analytical processing) storage, which delivers strong performance on both transactional and analytical queries and enables real-time analytics. Those capabilities extend to vectors: fast ingestion and efficient storage let it serve applications like ibm.com with low-latency responses to semantic searches.
Let's walk through a simple example that you can run yourself through the IBM watsonx sample notebooks. It uses watsonx.ai and SingleStoreDB to answer natural language questions with the Retrieval Augmented Generation (RAG) approach. We'll use the LangChain integration to keep the developer experience simple.
Outline + steps
- Setup and configuration. We ensure all the required packages are installed and configuration information (e.g. credentials) is provided.
- Define query. We establish the query up front because we will use the same query for both the basic LLM completion and the RAG pattern.
- Initialize language model. We select and configure the LLM.
- Perform basic completion. We perform a basic completion with our query and LLM.
- Get data for documents. We get and preprocess (e.g. split) the data we want to use in our knowledge base.
- Initialize embedding model. We select and configure the embedding model we would like to use to encode our data for our knowledge base.
- Initialize vector store. We initialize our vector store with our data and embedding model.
- Perform similarity search. We use our initialized vector store and perform a similarity search with our query.
- Perform RAG. We perform a completion with a RAG pipeline. In this version, we are explicitly passing the relevant docs (from our similarity search).
- Perform RAG with Q+A chain. We perform a completion with a RAG pipeline. In this version, there is no explicit passing of relevant docs.
Setup and configuration
Dev settings
# Ignore warnings
import warnings
warnings.filterwarnings("ignore")
Packages
!pip install langchain -q
!pip install ibm-watson-machine-learning -q
!pip install wget -q
!pip install sentence-transformers -q
!pip install singlestoredb -q
!pip install sqlalchemy-singlestoredb -q
- `langchain`: orchestration framework
- `ibm-watson-machine-learning`: for IBM LLMs
- `wget`: to download the knowledge base data
- `sentence-transformers`: for the embedding model
- `singlestoredb` and `sqlalchemy-singlestoredb`: for the SingleStoreDB connection
Import utility packages
import os
import getpass
Environment variables and keys
watsonx URL
try:
    wxa_url = os.environ["WXA_URL"]
except KeyError:
    wxa_url = getpass.getpass("Please enter your watsonx.ai URL domain (hit enter): ")
watsonx API key
try:
    wxa_api_key = os.environ["WXA_API_KEY"]
except KeyError:
    wxa_api_key = getpass.getpass("Please enter your watsonx.ai API key (hit enter): ")
watsonx project ID
try:
    wxa_project_id = os.environ["WXA_PROJECT_ID"]
except KeyError:
    wxa_project_id = getpass.getpass("Please enter your watsonx.ai Project ID (hit enter): ")
SingleStoreDB connection
If you do not have a SingleStoreDB instance, you can start today with a free trial here. To get the connection strings:
- Select a workspace
- If the workspace is suspended, click Resume
- Click Connect
- Click Connect Directly
- Click SQL IDE, which gives you SINGLESTORE_USER (admin for trials), SINGLESTORE_PASS (your password), SINGLESTORE_HOST and SINGLESTORE_PORT (usually 3306)
- Pick a name for your SINGLESTORE_DATABASE
try:
    connection_user = os.environ["SINGLESTORE_USER"]
except KeyError:
    connection_user = getpass.getpass("Please enter your SingleStore username (hit enter): ")

try:
    connection_password = os.environ["SINGLESTORE_PASS"]
except KeyError:
    connection_password = getpass.getpass("Please enter your SingleStore password (hit enter): ")

try:
    connection_port = os.environ["SINGLESTORE_PORT"]
except KeyError:
    connection_port = input("Please enter your SingleStore port (hit enter): ")

try:
    connection_host = os.environ["SINGLESTORE_HOST"]
except KeyError:
    connection_host = input("Please enter your SingleStore host (hit enter): ")

try:
    database_name = os.environ["SINGLESTORE_DATABASE"]
except KeyError:
    database_name = input("Please enter your SingleStore database name (hit enter): ")

try:
    table_name = os.environ["SINGLESTORE_TABLE"]
except KeyError:
    table_name = input("Please enter your SingleStore table name (hit enter): ")
Query
query = "What did the president say about Ketanji Brown Jackson?"
Language model
For our language model we will use Granite, an IBM-developed LLM.
from ibm_watson_machine_learning.foundation_models.utils.enums import ModelTypes
from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
from ibm_watson_machine_learning.foundation_models.utils.enums import DecodingMethods

parameters = {
    GenParams.DECODING_METHOD: DecodingMethods.GREEDY,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.MAX_NEW_TOKENS: 100
}

model = Model(
    model_id=ModelTypes.GRANITE_13B_CHAT,
    params=parameters,
    credentials={
        "url": wxa_url,
        "apikey": wxa_api_key
    },
    project_id=wxa_project_id
)

from ibm_watson_machine_learning.foundation_models.extensions.langchain import WatsonxLLM

granite_llm_ibm = WatsonxLLM(model=model)
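If you're unsure which model IDs your SDK version exposes, you can enumerate the ModelTypes enum (an optional aside, not part of the original sample):
# Optional: list the foundation model IDs exposed by this SDK version
for model_type in ModelTypes:
    print(model_type.name)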
Basic completion
result = granite_llm_ibm(query)
print("Query: " + query)
print("Response: " + result)
Response from LLM: The president said that Ketanji Brown Jackson is an “incredible judge” and that he is “proud” to have nominated her to the Supreme Court.<|endoftext|>
Data for documents
Let's now download the knowledge base (the State of the Union address) and load it into documents.
import wget
filename = './state_of_the_union.txt'
url = 'https://raw.github.com/IBM/watson-machine-learning-samples/master/cloud/data/foundation_models/state_of_the_union.txt'
if not os.path.isfile(filename):
    wget.download(url, out=filename)
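The vector store step below expects a list of chunked documents in a `texts` variable, but the chunking code itself isn't shown above. Here's a minimal sketch using LangChain's TextLoader and CharacterTextSplitter (the chunk size of 1,000 characters is an assumption; tune it for your data):
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Load the downloaded file and split it into chunks for the knowledge base
# (chunk_size=1000 is an assumed value, not from the original sample)
loader = TextLoader(filename)
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
print("Number of text chunks: " + str(len(texts)))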
Embeddings
By default, we will use the LangChain Hugging Face embedding model, which at the time of this writing is sentence-transformers/all-mpnet-base-v2.
Let's initialize it:
from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings()
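As a quick sanity check (an illustrative addition, not part of the original sample), you can embed the query and inspect the vector dimension; all-mpnet-base-v2 produces 768-dimensional vectors:
# Embed the query and check the vector dimension (768 for all-mpnet-base-v2)
query_vector = embeddings.embed_query(query)
print("Embedding dimension: " + str(len(query_vector)))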
Vector store
We are going to store the embeddings in SingleStoreDB.
Create a SingleStore SQLAlchemy engine
from sqlalchemy import create_engine, text

# Connection URL without a database name - we use this connection to create the database first
connection_url = f"singlestoredb://{connection_user}:{connection_password}@{connection_host}:{connection_port}"
engine = create_engine(connection_url)
Create database for embeddings (if one doesn’t already exist)
# Create database in SingleStoreDB
with engine.connect() as conn:
    result = conn.execute(text("CREATE DATABASE IF NOT EXISTS " + database_name))
Verify the database exists
print("Available databases:")
with engine.connect() as conn:
    result = conn.execute(text("SHOW DATABASES"))
    for row in result:
        print(row)
Drop table for embeddings (if exists)
with engine.connect() as conn:
    result = conn.execute(text("DROP TABLE IF EXISTS " + database_name + "." + table_name))
Instantiate SingleStoreDB in LangChain
# Connection string to use LangChain with SingleStoreDB
os.environ["SINGLESTOREDB_URL"] = f"{connection_user}:{connection_password}@{connection_host}:{connection_port}/{database_name}"
from langchain.vectorstores import SingleStoreDB
vectorstore = SingleStoreDB.from_documents(
    texts,
    embeddings,
    table_name=table_name
)
Check table
with engine.connect() as conn:
    result = conn.execute(text("DESCRIBE " + database_name + "." + table_name))
    print(database_name + "." + table_name + " table schema:")
    for row in result:
        print(row)
    result = conn.execute(text("SELECT COUNT(vector) FROM " + database_name + "." + table_name))
    print("\nNumber of rows in " + database_name + "." + table_name + ": " + str(result.first()[0]))
Perform similarity search
Here, we'll find the texts similar (i.e. relevant) to our query. You can modify the number of results returned via the `k` parameter of the `similarity_search` method.
texts_sim = vectorstore.similarity_search(query, k=5)
print("Number of relevant texts: " + str(len(texts_sim)))
Response: Number of relevant texts: 5
print("First 100 characters of relevant texts.")
for i in range(len(texts_sim)):
    print("Text " + str(i + 1) + ": " + str(texts_sim[i].page_content[0:100]))
Response:
First 100 characters of relevant texts.
Text 1: Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Ac
Text 2: A former top litigator in private practice. A former federal public defender. And from a family of p
Text 3: As Frances Haugen, who is here with us tonight, has shown, we must hold social media platforms accou
Text 4: And I’m taking robust action to make sure the pain of our sanctions is targeted at Russia’s economy
Text 5: But cancer from prolonged exposure to burn pits ravaged Heath’s lungs and body.
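For the curious, the LangChain SingleStoreDB vector store scores rows with SingleStoreDB's DOT_PRODUCT function under the hood. Here's a hand-rolled sketch of a roughly equivalent query; it assumes the default LangChain table schema with `content` and `vector` columns and dot-product scoring:
import json

# Hand-rolled equivalent of similarity_search (a sketch; assumes the default
# LangChain schema with `content` and `vector` columns and dot-product scoring)
q_vec = embeddings.embed_query(query)
with engine.connect() as conn:
    result = conn.execute(
        text(
            "SELECT content, DOT_PRODUCT(vector, JSON_ARRAY_PACK(:v)) AS score FROM "
            + database_name + "." + table_name + " ORDER BY score DESC LIMIT 5"
        ),
        {"v": json.dumps(q_vec)}
    )
    for content, score in result:
        print(str(round(score, 4)) + ": " + content[0:100])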
Perform RAG with explicit context control
We’ll perform RAG using our model and explicit relevant knowledge (documents) from our similarity search.
from langchain.chains.question_answering import load_qa_chain
chain = load_qa_chain(granite_llm_ibm, chain_type="stuff")
result = chain.run(input_documents=texts_sim, question=query)
print("Query: " + query)
print("Result:" + result)
Response:
Query: What did the president say about Ketanji Brown Jackson?
Response: The president said that Ketanji Brown Jackson is a consensus builder who will continue Justice Breyer's legacy of excellence.<|endoftext|>
RAG Q+A chain
Here, we perform RAG using a chain that combines our model and vector store. The chain retrieves the relevant knowledge (texts) under the hood, so there is no explicit passing of documents.
from langchain.chains import RetrievalQA
qa = RetrievalQA.from_chain_type(llm=granite_llm_ibm, chain_type="stuff", retriever=vectorstore.as_retriever())
response = qa.run(query)
print("Query: " + query)
print("Response: " + response)
Response:
Query: What did the president say about Ketanji Brown Jackson?
Response: The president said that Ketanji Brown Jackson is a consensus builder who will continue Justice Breyer's legacy of excellence.<|endoftext|>
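The retriever returns the vector store's default number of documents per query. If you want the same k=5 behavior as the earlier similarity search, you can pass it through `as_retriever` (a sketch; verify `search_kwargs` against your installed LangChain version):
# Retrieve 5 documents per query, matching the earlier similarity search
qa_k5 = RetrievalQA.from_chain_type(
    llm=granite_llm_ibm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 5})
)
print(qa_k5.run(query))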
Conclusion
We saw how easy it is to integrate SingleStoreDB with IBM watsonx.ai, enhancing your LLM with a knowledge base colocated with your watsonx stack for fast retrieval across hybrid search and analytics. Start with watsonx.ai and SingleStoreDB today!