As businesses increasingly rely on data-driven strategies, graph databases provide a powerful solution for navigating and leveraging data interconnectivity.
A great example of this is Klarna’s use of Neo4j for their internal knowledge graph that serves their internal chatbot, Kiki.
However, newcomers and experienced database users alike often find the setup of graph databases, with their intricate ETL processes and technical hurdles, daunting compared to familiar SQL environments.
By combining SingleStore with PuppyGraph, SQL developers can seamlessly integrate graph querying into their existing setups. SingleStore provides both the OLTP performance needed for the point lookups that graph traversals rely on and the analytical performance that lets graph analytics be pushed down to the database, while PuppyGraph makes it possible to run highly complex graph queries against the same SingleStore environment.
This is a unique and exclusive combination of interoperability and performance. We see the productive value that Klarna demonstrated with Kiki and Neo4j, and want to enable many other enterprises to unlock the potential of their data through knowledge graphs.
This article covers the following:
- Foundational concepts of graph databases
- Advantages/disadvantages of graph and SQL querying
- Practical challenges of adopting graph technology
- How PuppyGraph provides a seamless solution with its graph query engine
- SQL tutorial using SingleStore and PuppyGraph to effectively enable graph capabilities
What is a graph database?
In graph databases, data is stored and depicted using nodes, edges and properties. Nodes symbolize entities like individuals, locations, or objects; edges illustrate the connections between them, often enriched with extra information through edge properties. This format allows for straightforward visualization and exploration of both direct and indirect relationships.
Graph databases have become essential for managing intricate networks of data. These databases surpass traditional storage solutions by adeptly managing connected data and presenting relationships in ways that reflect actual interactions.
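To make the node, edge and property vocabulary concrete, here is a minimal in-memory sketch in Python. The `Node` and `Edge` classes and the `neighbors` helper are hypothetical names chosen for illustration, and the data is a tiny made-up sample; this is not how any particular graph database stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    label: str                      # the kind of entity, e.g. "person"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    id: str
    label: str                      # the kind of relationship, e.g. "knows"
    from_id: str
    to_id: str
    properties: dict = field(default_factory=dict)


# Illustrative data only: two people and one piece of software.
nodes = {
    "v1": Node("v1", "person", {"name": "marko", "age": 29}),
    "v4": Node("v4", "person", {"name": "josh", "age": 32}),
    "v3": Node("v3", "software", {"name": "lop", "lang": "java"}),
}
edges = [
    Edge("e8", "knows", "v1", "v4", {"weight": 1.0}),
    Edge("e11", "created", "v4", "v3", {"weight": 0.4}),
]


def neighbors(node_id: str, label: str) -> list[Node]:
    """Return the nodes reached from node_id by following edges with this label."""
    return [nodes[e.to_id] for e in edges if e.from_id == node_id and e.label == label]


# A direct relationship: who does v1 know?
direct = neighbors("v1", "knows")
# An indirect relationship: what did the people v1 knows create?
indirect = [s for p in direct for s in neighbors(p.id, "created")]

print([n.properties["name"] for n in direct])
print([n.properties["name"] for n in indirect])
```

Exploring an indirect relationship is just a second call to `neighbors` on the results of the first, which is the intuition behind graph traversal.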
Graph queries vs. SQL queries
One of the standout features of graph databases is their specialized query languages, which make it much simpler to express connected data and the queries that run over it. Notably, the SQL:2023 standard added property graph queries (SQL/PGQ), bringing graph representations and graph-specific querying capabilities into SQL itself. Graph databases excel at naturally mapping and querying interconnected data for pattern detection in a manner that aligns with our intuitive grasp of relationships. This capability is crucial for effectively managing and querying complex networks within the data.
Graph query languages offer a syntax that adeptly handles scenarios involving pattern searches within node relationships. Graph query languages also excel in efficiently handling complex traversal queries across densely linked data networks. Traditional SQL queries can become unwieldy and verbose when applied to graph data, except for cases involving simple graph structures.
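The verbosity gap is easiest to see with a small experiment. The sketch below uses Python's built-in `sqlite3` module as a stand-in SQL engine, with a made-up `person`/`knows` dataset, to show what a fixed two-hop pattern and a variable-length traversal look like in plain SQL. The table names, column names and data are assumptions for illustration only.

```python
import sqlite3

# In-memory stand-in database with illustrative data: a -> b -> c -> d.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id TEXT, name TEXT);
    CREATE TABLE knows (from_id TEXT, to_id TEXT);
    INSERT INTO person VALUES ('a', 'alice'), ('b', 'bob'), ('c', 'carol'), ('d', 'dave');
    INSERT INTO knows VALUES ('a', 'b'), ('b', 'c'), ('c', 'd');
""")

# Fixed two-hop pattern: every extra hop costs another self-join.
# In Gremlin the same pattern reads roughly as:
#   g.V().has("name", "alice").out("knows").out("knows").values("name")
two_hop_sql = """
    SELECT p2.name
    FROM person AS p0
    JOIN knows AS k1 ON k1.from_id = p0.id
    JOIN knows AS k2 ON k2.from_id = k1.to_id
    JOIN person AS p2 ON p2.id = k2.to_id
    WHERE p0.name = ?
"""
two_hop = [row[0] for row in conn.execute(two_hop_sql, ("alice",))]

# Variable-length traversal: SQL needs a recursive CTE, with an explicit
# hop limit so that cycles in the data cannot recurse forever.
reachable_sql = """
    WITH RECURSIVE reach(id, hops) AS (
        SELECT id, 0 FROM person WHERE name = ?
        UNION
        SELECT k.to_id, r.hops + 1
        FROM reach AS r
        JOIN knows AS k ON k.from_id = r.id
        WHERE r.hops < 10
    )
    SELECT p.name, r.hops
    FROM reach AS r
    JOIN person AS p ON p.id = r.id
    WHERE r.hops > 0
    ORDER BY r.hops
"""
reachable = conn.execute(reachable_sql, ("alice",)).fetchall()

print(two_hop)
print(reachable)
```

For a simple chain like this the SQL is manageable, but each additional hop, edge type or filter adds more joins or CTE logic, while the graph form only adds another step to the traversal.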
Challenges of implementing and running graph databases
ETL, architectural complexity and maintenance demands
Traditionally, to implement complex graph queries (like 10-hop and shortest path queries) on data stored in SQL databases, a common approach is extracting, transforming and loading (ETL) data into a graph database. This transition requires expert knowledge and continuous effort to:
- Develop and manage intricate ETL pipelines to morph relational data into graph-compatible formats of nodes, edges and properties
- Ensure the graph database performs optimally and remains up-to-date as data evolves
Frequent schema changes can complicate these processes, potentially leading to extended periods — over six months — dedicated solely to pipeline development before any data insights are derived. Additionally, numerous custom ETL pipelines clutter your architecture and slow down query response times.
This route is often fraught with challenges while navigating the differences in optimization strategies and storage mechanisms inherent to graph databases. The complexity and effort required for this transition partly explain the rarity of running graph queries on SQL databases.
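As a rough illustration of what such a pipeline does, the sketch below extracts rows from relational tables and reshapes them into node and edge records. It uses Python's built-in `sqlite3` module as a stand-in source database, and the function name `extract_transform` and the record layout are hypothetical. Real pipelines also have to handle loading, scheduling, schema changes and failures, which is where most of the effort goes.

```python
import sqlite3

# Stand-in relational source with a little illustrative data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id TEXT, name TEXT, age INT);
    CREATE TABLE knows (id TEXT, from_id TEXT, to_id TEXT, weight REAL);
    INSERT INTO person VALUES ('v1', 'marko', 29), ('v2', 'vadas', 27);
    INSERT INTO knows VALUES ('e7', 'v1', 'v2', 0.5);
""")


def extract_transform(source):
    """Extract rows from the source tables and reshape them into nodes and edges."""
    nodes = [
        {"id": pid, "label": "person", "properties": {"name": name, "age": age}}
        for pid, name, age in source.execute("SELECT id, name, age FROM person")
    ]
    edges = [
        {"id": eid, "label": "knows", "from": src, "to": dst,
         "properties": {"weight": weight}}
        for eid, src, dst, weight in source.execute(
            "SELECT id, from_id, to_id, weight FROM knows")
    ]
    return nodes, edges


# The "load" step is simulated by simply keeping this second copy around.
graph_nodes, graph_edges = extract_transform(conn)

# The source keeps changing after the pipeline has run...
conn.execute("INSERT INTO person VALUES ('v4', 'josh', 32)")

# ...so the graph copy is stale until the pipeline runs again.
copied_count = len(graph_nodes)
source_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(copied_count, source_count)
```

The mismatch between the two counts is the staleness problem in miniature: every copy of the data needs its own refresh process to stay in sync with the source.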
Understanding and adoption hurdles
The leap from traditional relational databases to graph databases requires a fundamental change in approach to data architecture, which can be daunting. In a graph database, every operation must be expressed as a graph query, so developers face a significant learning curve as they master both a new query language and a new way of thinking about data. This can create a roadblock for even the most enthusiastic developers, and it makes it harder to convince stakeholders of a graph solution's value.
Scaling difficulties
Graph databases often struggle with scalability issues. As the network of nodes and edges grows, the data becomes more complex, leading to unique challenges in both computational and horizontal scaling that are not typically present in SQL databases. The tightly knit nature of graph data complicates matters further; simply adding additional hardware doesn't necessarily enhance performance and may necessitate a reassessment of the graph model, or the adoption of more sophisticated scaling methods.
Resource and time investment
Setting up infrastructure, mapping data and maintaining graph databases typically requires more resources and time compared to conventional databases. The complexities involved in graph data modeling lead to increased costs and extended development durations.
Higher cost with specialized tooling and integration
Graph databases require specialized tooling that supports unique graph operations, creating a gap with existing SQL tools and infrastructure. This often leads to additional investment in new tools and training, further complicating integration and adoption efforts.
Meet PuppyGraph: Zero ETL graph query engine
PuppyGraph is the first graph query engine that allows developers to enable graph capabilities on SQL data stores. As a result, users can perform graph queries on their existing data stores without complex ETL processes. PuppyGraph supports a variety of data storage systems, including SingleStore, Apache Iceberg, Delta Lake, Apache Hive and several other SQL databases. The platform integrates easily and, within minutes, lets users run Apache TinkerPop Gremlin and openCypher queries against their SQL data.
Just like SingleStore, PuppyGraph allows lightning-fast query speeds (faster than traditional graph databases), enabled by high-performance auto-sharding. It offers scalability and low-latency responses to even the most complex queries (10-hop queries returning in two seconds).
Data management is also streamlined, since PuppyGraph does not require complex ETL to move data from a SQL source to a graph database target. This means no ETL pipelines to maintain and no additional persistent data copies outside of SingleStore. PuppyGraph also operates within your infrastructure, ensuring complete control and adherence to any data governance policies you must enforce.
PuppyGraph allows for the direct execution of graph queries on data within SQL databases and lakes, serving as a bridge that treats tabular data as if it were a graph. This innovation not only simplifies the execution of graph operations on existing SQL datasets, but also avoids the pitfalls associated with data duplication and the traditional ETL journey.
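The idea of treating tables as a graph without copying them can be sketched in a few lines. The example below is a conceptual illustration only and says nothing about how PuppyGraph is actually implemented: it uses Python's built-in `sqlite3` module as a stand-in SQL store, and `EDGE_TABLES` and `out` are hypothetical names. Each traversal step is translated into a SQL lookup at query time, so no second copy of the data exists.

```python
import sqlite3

# Stand-in SQL store holding edge tables with illustrative data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE knows (from_id TEXT, to_id TEXT);
    CREATE TABLE created (from_id TEXT, to_id TEXT);
    INSERT INTO knows VALUES ('v1', 'v2'), ('v1', 'v4');
    INSERT INTO created VALUES ('v4', 'v5');
""")

# Hypothetical schema mapping: edge label -> table that stores those edges.
EDGE_TABLES = {"knows": "knows", "created": "created"}


def out(ids, edge_label):
    """Follow outgoing edges with the given label by querying the table directly."""
    if not ids:
        return []
    table = EDGE_TABLES[edge_label]  # looked up from a fixed mapping, never user input
    marks = ", ".join("?" for _ in ids)
    sql = f"SELECT to_id FROM {table} WHERE from_id IN ({marks}) ORDER BY to_id"
    return [row[0] for row in conn.execute(sql, list(ids))]


# Two-hop traversal: v1 -knows-> someone -created-> something.
before = out(out(["v1"], "knows"), "created")

# New data arrives in the SQL table...
conn.execute("INSERT INTO created VALUES ('v4', 'v3')")

# ...and the very next traversal sees it, because nothing was copied.
after = out(out(["v1"], "knows"), "created")
print(before, after)
```

Because the traversal reads the source tables on every call, there is no refresh step and no duplicate dataset to keep in sync.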
PuppyGraph’s compatibility with various data storage solutions — including SQL-centric systems like SingleStore — paves the way for leveraging graph query capabilities.
For those seeking the analytical depth of graph queries without overhauling existing data infrastructure, using PuppyGraph with SingleStore offers a streamlined path to integrating graph analytics within SQL data environments. This development is a significant leap forward for companies that previously viewed graph capabilities as overly complex or out of reach, bridging the gap between the structured world of SQL and the interconnected realm of graph querying and analytics.
This approach is particularly beneficial for applications requiring network analysis, complex data hierarchies and other graph-intensive operations while sidestepping the resource-intensive demands of managing a separate graph database and its ETL pipelines.
What is SingleStore?
SingleStore is a real-time data platform that caters to both analytical (OLAP) and transactional (OLTP) processing with its unified architecture. It offers two deployment options:
- SingleStore as a real-time data warehouse, which allows for manual deployment on your own hardware and database management systems
- SingleStore Helios®, a fully managed service that supports hosting on major cloud platforms
Some of SingleStore’s key capabilities include:
Speed
- Features extremely low latency for data ingestion and querying
- Supports high levels of concurrency
- Utilizes efficient data storage and retrieval mechanisms through row and columnstore formats
Scalability
- Separates compute from storage capabilities
- Enables horizontal scaling of application data
- Offers cost-effective scaling while maintaining high performance
Security
- Designed with security as a fundamental feature
- Provides end-to-end encryption
- Includes built-in mechanisms for authentication and authorization
- Holds proven enterprise compliance validations with certifications like ISO/IEC 27001 and SOC 2 Type 2
When combining SingleStore with PuppyGraph, SingleStore users benefit from a rapid, scalable graph model that unveils previously inaccessible, complex business insights. Within as little as 10 minutes, users can deploy a unified graph model and begin querying petabytes of data in mere seconds — transforming the way you interact with your data as is.
Step-by-step tutorial: SingleStore and PuppyGraph
Let’s walk through a quick, step-by-step demo. In this tutorial, we will use PuppyGraph to query data in SingleStore as a graph.
Bootstrap
Let’s start a PuppyGraph instance and a SingleStore instance together using Docker Compose. Create a docker-compose.yaml file as follows:
services:
  puppygraph:
    image: puppygraph/puppygraph-dev:stable
    pull_policy: always
    container_name: puppygraph
    environment:
      - PUPPYGRAPH_USERNAME=puppygraph
      - PUPPYGRAPH_PASSWORD=puppygraph123
    networks:
      puppy_net:
    ports:
      - "8081:8081"
      - "8182:8182"
      - "7687:7687"
  singlestoredb:
    image: ghcr.io/singlestore-labs/singlestoredb-dev:latest
    container_name: singlestoredb
    environment:
      - ROOT_PASSWORD=puppy
    networks:
      puppy_net:
    ports:
      - "3306:3306"
      - "8080:8080"
      - "9000:9000"
networks:
  puppy_net:
    name: puppy-singlestore
Run docker-compose up -d to start the instances in Docker.
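The containers can take a little while to start accepting connections. As an optional convenience, the following Python helper polls a TCP port until it responds. The function name `wait_for_port` and its defaults are our own choices for illustration, and an open port only shows that the container is listening, not that the service has finished initializing.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError if nothing is listening yet.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            time.sleep(0.5)  # brief pause before the next attempt
    return False
```

With the port mappings from the compose file, calling `wait_for_port("localhost", 8081)` checks the PuppyGraph web UI and `wait_for_port("localhost", 3306)` checks SingleStore.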
Creating example data using SingleStore
Run the command docker exec -it singlestoredb singlestore -p to access the SingleStore database.
Input the password (puppy) to enter the interactive shell. Run the following commands to create a database named “modern” with several tables in it.
-- Start from a clean database
drop database if exists modern;
create database if not exists modern;

-- Vertex tables
create table modern.person (id text, name text, age int);
insert into modern.person values
    ('v1', 'marko', 29),
    ('v2', 'vadas', 27),
    ('v4', 'josh', 32),
    ('v6', 'peter', 35);

create table modern.software (id text, name text, lang text);
insert into modern.software values
    ('v3', 'lop', 'java'),
    ('v5', 'ripple', 'java');

-- Edge tables: from_id and to_id reference vertex ids
create table modern.created (id text, from_id text, to_id text, weight double);
insert into modern.created values
    ('e9', 'v1', 'v3', 0.4),
    ('e10', 'v4', 'v5', 1.0),
    ('e11', 'v4', 'v3', 0.4),
    ('e12', 'v6', 'v3', 0.2);

create table modern.knows (id text, from_id text, to_id text, weight double);
insert into modern.knows values
    ('e7', 'v1', 'v2', 0.5),
    ('e8', 'v1', 'v4', 1.0);
Querying the data as a graph using PuppyGraph
The tabular data we created naturally forms a graph, and it would be interesting to analyze it as one. PuppyGraph lets you query the data in SingleStore as a graph without any ETL.
Open localhost:8081 in the browser to access the PuppyGraph web UI.
Input the username, puppygraph, and the password, puppygraph123, to log in.
After logging in, the next step is to define a schema. This schema guides PuppyGraph in how to transform data from SingleStore into a graph structure for querying. PuppyGraph offers various methods for schema creation — for this tutorial, we are going to use PuppyGraph’s schema builder.
Click on “Create Graph Schema” to launch the schema builder.
The first step is to input the connection information to the SingleStore instance:
Catalog name: choose an arbitrary name for this connection to use it later
User name: root
Password: puppy
JDBC URI: jdbc:singlestore://singlestoredb:3306
JDBC Driver Class: com.singlestore.jdbc.Driver
JDBC Driver URL: https://github.com/memsql/S2-JDBC-Connector/releases/download/v1.2.3/singlestore-jdbc-client-1.2.3.jar
PuppyGraph will test the connection and list available tables once the connection information is submitted.
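The schema builder shows the four tables we created under the “modern” database, and we can now map them into a graph. First, add a vertex from the “person” table.
The same applies to the remaining tables: the software table becomes a second vertex type, while the created and knows tables become edges whose from_id and to_id columns identify the vertices they connect.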
The schema builder visualizes the pending schema as vertices and edges are being created.
Everything looks good. Submit the schema, and you now have a graph!
PuppyGraph has an interactive shell for querying the graph. Let’s give it a try. Click on the Query tab.
First query:
Getting Marko’s personal information.
g.V().has("name", "marko").valueMap()
Second query:
Getting all the software created by Marko’s acquaintances.
g.V().has("name", "marko").out("knows").out("created").valueMap()
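If you want to reason about what these two traversals should return before running them, the logic can be emulated in plain Python over the rows we inserted earlier. This is only a sketch of the traversal semantics: the `out` helper is a hypothetical stand-in for the Gremlin step of the same name, and the exact output format and result ordering in PuppyGraph may differ.

```python
# The tutorial data, written out as plain Python structures.
vertices = {
    "v1": {"name": "marko", "age": 29},
    "v2": {"name": "vadas", "age": 27},
    "v4": {"name": "josh", "age": 32},
    "v6": {"name": "peter", "age": 35},
    "v3": {"name": "lop", "lang": "java"},
    "v5": {"name": "ripple", "lang": "java"},
}
knows = [("v1", "v2"), ("v1", "v4")]                                # (from_id, to_id)
created = [("v1", "v3"), ("v4", "v5"), ("v4", "v3"), ("v6", "v3")]  # (from_id, to_id)


def out(ids, edge_list):
    """Follow outgoing edges from each id, like one out() step in the traversal."""
    return [to_id for src in ids for from_id, to_id in edge_list if from_id == src]


# g.V().has("name", "marko")
start = [vid for vid, props in vertices.items() if props.get("name") == "marko"]

# First query: the properties of the matching vertex.
first = [vertices[vid] for vid in start]

# Second query: marko -> knows -> created.
second = [vertices[vid] for vid in out(out(start, knows), created)]

print(first)
print(sorted(v["name"] for v in second))
```

Marko knows Vadas and Josh, and only Josh has created software, so the second traversal ends at the two projects Josh created.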
We’ve walked you through how the landscape of data management is being transformed through the integration of graph analytics. Graph databases, while powerful, often present a steep learning curve and technical challenges including complex ETL processes and the need for new database setups.
But through our partnership, PuppyGraph and SingleStore simplify these challenges, enabling SQL developers to seamlessly perform graph queries within their existing data stores — no separate graph database required. This harmonious integration not only opens up new avenues for performant graph queries, but also maintains the simplicity of data management by leveraging existing permissions, thus democratizing access to advanced data analysis techniques.
Get started for free
Ready to get started? Download the forever free PuppyGraph Developer Edition and start free with SingleStore to create your first graph model in minutes.