In this example, we’ll demonstrate how we use the dot_product function (for cosine similarity) to find a matching image of a celebrity from among 7000 records in just three milliseconds!
Vector functions in SingleStoreDB make it possible to solve AI problems, including face matching, product photo matching, object recognition, text similarity matching and sentiment analysis.
Step 1: Signup for a free SingleStoreDB trial account at https://portal.singlestore.com/
Step 2: Create a workspace (S00 is enough)
Step 3: Create a database called image_recognition in the SQL Editor
Create database image_recognition;
Step 4: Go to ‘connect’ on your workspace in the portal and copy the workspace URL, your username and password to connect to your database using sqlalchemy.
Step 5: Import the the following libraries into your python kernel or Jupyter notebook
!pip3 install pymysql boto3 sqlalchemy
from sqlalchemy import *
import pymysql, boto3, requests, json
import matplotlib.pyplot as plt
import ipywidgets as widgets
from botocore import UNSIGNED
from botocore.client import Config
import botocore.exceptions
import urllib.request
Step 6: Create the connection string
UserName='<Username usually admin>'
Password='<Password for that user>'
DatabaseName='image_recognition'
URL='<Host that you copied above>:3306'
db_connection_str = "mysql+pymysql://"+UserName+":"+Password+"@"+URL+"/"+DatabaseName
db_connection = create_engine(db_connection_str)
Step 7: Create a table — named “people” — in your database
query = 'create table people (filename varchar(255), vector blob,
shard(filename))'
db_connection.execute(query)
Step 8: Import our sample dataset into your database. Alternatively, follow the instructions in our previous blog to learn how you can create your vectors for your own images using facet and insert them into SingleStoreDB.
Note: The notebook or python script will run thousands of commands sequentially from the SQL script stored in Github repo and might take 15-20 minutes to be fully executed.
url =
'https://raw.githubusercontent.com/singlestore-labs/singlestoredb-samples/m
ain/Tutorials/Face%20matching/celebrity_data.sql'
response = requests.get(url)
sql_script = response.text
new_array = sql_script.split('\n')
for i in new_array:
if (i != ''):
db_connection.execute(i)
Step 9: Run our image matching algorithm using just two lines of SQL. In this example, we use Adam Sandler:
selected_name = "Adam_Sandler/Adam_Sandler_0003.jpg"
query1 = 'set @v = (select vector from people where filename = "' +
selected_name + '");'
query2 = 'select filename, dot_product(vector, @v) as score from people
order by score desc limit ' + str(num_matches) +';'
db_connection.execute(query1)
result = db_connection.execute(query2)
for res in result:
print(res)
(Optional) Step 10: Use our visualizer and drop down to see this image matching in action!
s3 = boto3.resource('s3',region_name='us-east-1',
config=Config(signature_version=UNSIGNED))
bucket = s3.Bucket('studiotutorials')
prefix = 'face_matching/'
names=[]
for obj in bucket.objects.all():
if (obj.key.startswith(prefix)):
drop = obj.key[len(prefix):]
names.append(drop)
def on_value_change(change):
selected_name = change.new
print(selected_name)
num_matches = 5
#SingleStore query to find the vectors of images that match
countquery = 'select count(*) from people where filename = "' +
selected_name + '";'
countdb = db_connection.execute(countquery)
for i in countdb:
x = i[0]
if (int(x) > 0):
query1 = 'set @v = (select vector from people where filename = "' +
selected_name + '");'
query2 = 'select filename, dot_product(vector, @v) as score from
people order by score desc limit ' + str(num_matches) +';'
db_connection.execute(query1)
result = db_connection.execute(query2)
original = "/original.jpg"
images = []
matches = []
try:
bucket.download_file(prefix + selected_name, original)
images.append(original)
except botocore.exceptions.ClientError as e:
if e.response['Error']['Code'] == "404":
urllib.request.urlretrieve('https://i.redd.it/mn9c32es4zi21.png', original)
else:
raise
cnt = 0
for res in result:
print(res)
temp_file = "/match" + str(cnt) + ".jpg"
images.append(temp_file)
matches.append(res[1])
try:
bucket.download_file(prefix + res[0], temp_file)
except botocore.exceptions.ClientError as e:
if e.response['Error']['Code'] == "404":
urllib.request.urlretrieve('https://i.redd.it/mn9c32es4zi21.png',
temp_file)
else:
raise
cnt += 1
fig, axes = plt.subplots(nrows=1, ncols=num_matches+1, figsize=(40,
40))
for i in range(num_matches+1):
axes[i].imshow(plt.imread(images[i]))
axes[i].set_xticks([])
axes[i].set_yticks([])
axes[i].set_xlabel('')
axes[i].set_ylabel('')
if i == 0:
axes[i].set_title("Original Image", fontsize=14)
else:
axes[i].set_title("Match " + str(i) + ". Score: " +
str(matches[i-1]), fontsize=14)
plt.show()
else:
print("No match for this image as it was not inserted into the
People Database")
dropdown = widgets.Dropdown(
options=names,
description='Select an Image:',
value=names[0]
)
dropdown.observe(on_value_change, names='value')
display(dropdown)
Look at SingleStoreLab's github repository on image recognition to follow this demo with our example python notebook!