GCP – From keywords to relationships: Reveal deeper insights with full-text search and Spanner Graph
In today’s data-driven landscape, organizations grapple with the challenge of extracting valuable insights from rapidly expanding volumes of information. Graph databases excel at modeling complex relationships, while full-text search efficiently retrieves pertinent information from unstructured data. However, maintaining separate systems for these capabilities can lead to operational overhead and delayed insights. Spanner Graph offers a unified solution that tightly integrates both functionalities. In this blog post, we’ll dive into the capabilities and advantages of leveraging full-text search and Spanner Graph.
Graph and full-text search are a powerful combination
Graph provides a natural mechanism for representing relationships in data, making it well-suited for analyzing interconnected data, uncovering hidden patterns, and powering applications that rely on understanding connections. From social networks and recommendation engines to fraud detection and supply chain management, graph databases excel at navigating and extracting insights from the intricate web of relationships within data.
On the other hand, unstructured data, encompassing a wide range of formats like text documents, emails, social media posts, and customer reviews, also represents a wealth of information. Full-text search engines provide a powerful mechanism for indexing, searching, and retrieving relevant information from these vast repositories of unstructured data. By enabling users to quickly and easily find the information they need, full-text search plays a crucial role in knowledge discovery, customer support, and content management.
While graph databases and full-text search are valuable tools individually, their true potential lies in their combination. Imagine you are building an e-commerce website. When a customer searches for “waterproof hiking boots,” your site pinpoints matching products by applying full-text search to product descriptions. Then, your recommendation engine uses the graph to analyze the customer’s past purchases of hiking socks and backpacks, along with the purchase histories of others who bought similar boots. By combining these insights, your site not only returns waterproof hiking boots that match the user’s search but also recommends complementary items, such as trekking poles, that the user has not purchased yet, cross-selling and improving the shopping experience. This personalized approach is made possible by the combination of full-text search and graph.
Using multiple specialized systems is suboptimal
Currently, handling graph and full-text data often involves two distinct types of systems: graph databases and search engines, which introduces a series of challenges:
Data duplication and synchronization challenges: Maintaining consistency between separate graph and full-text search systems necessitates complex ETL pipelines for data copying and transformation. This not only consumes valuable resources but also introduces the risk of errors and delays.
Elevated operational and maintenance overhead: Running a multitude of specialized services translates to increased operational complexity and maintenance overhead. Each service necessitates configuration, monitoring, security updates, and potential troubleshooting, demanding dedicated expertise and time.
Obstacles to integrated queries and analysis: The separation of graph and full-text search capabilities creates barriers to performing integrated queries and analysis. Unifying insights from both domains often requires manually merging and correlating results from separate queries.
Impact on real-time application responsiveness: The inherent latency caused by data synchronization and separate query processing can negatively affect the real-time responsiveness of applications, like fraud detection, customer support, and real-time recommendations, where immediate insights are crucial.
Spanner Graph unites graph and full-text search in one system
Spanner Graph unites purpose-built graph capabilities with Spanner, our always-on, globally consistent, and virtually unlimited-scale database. Graph, full-text search, and AI capabilities are tightly integrated in one system. The integrated full-text search provides battle-tested technology that powers many existing Google products. It goes beyond exact or partial matches to include fuzzy matching for character variations and similar-sounding names, along with result scoring and automatic language detection. It also leverages AI to understand the meaning of search queries, handling synonyms and spell correction. In addition, it supports complex search queries using logical operators similar to Google web search.
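To make "fuzzy matching for character variations" concrete, here is a minimal Python sketch of one common technique: comparing the character trigrams of two strings. It is only an illustration of the general idea, not a description of how Spanner implements fuzzy matching; the function names and the choice of trigrams with Jaccard similarity are assumptions made for this example.

```python
def ngrams(text, n=3):
    """Returns the set of character n-grams of the text, padded with spaces."""
    padded = f" {text.lower()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarity(a, b, n=3):
    """Jaccard similarity of the two n-gram sets, between 0.0 and 1.0."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# A misspelled query still shares most trigrams with the intended word,
# while an unrelated word shares none.
print(similarity("waterproof", "waterprof"))
print(similarity("waterproof", "backpack"))
```

A search system built on this idea would treat strings above some similarity threshold as matches, which is why a typo such as "waterprof" can still find "waterproof" products.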
Graph query and full-text search are tightly integrated, allowing you to traverse relationships within graph structures using graph queries, while simultaneously retrieving nodes or edges based on their text contents using full-text search. By integrating these complementary techniques, this unified capability enables you to uncover hidden connections, patterns, and insights across graph and unstructured data, all within one system as your single source of truth. In the rest of this post, we’ll discuss how to use full-text search and Spanner Graph together.
Create a full-text search index on graph nodes and edges
To use full-text search with Spanner Graph, the first step is to create a search index on the graph node and edge properties that you want to search on. Spanner Graph maps tables and their columns to graph nodes, edges and their properties. To enable full-text search on a text property of graph nodes and edges, you can create a search index on the corresponding table columns.
Using a simplified example from the e-commerce domain, you first create a retail graph with user and product as entities and past purchases as relationships:
```sql
-- Creates a table to store product information.
CREATE TABLE Product (
  productId INT64 NOT NULL,
  name STRING(MAX),
  description STRING(MAX),
) PRIMARY KEY (productId);

-- Creates User and Purchase tables (skipped in this example for simplicity).
-- Then creates the retail graph.
CREATE OR REPLACE PROPERTY GRAPH RetailGraph
  NODE TABLES (User, Product)
  EDGE TABLES (
    Purchase
      SOURCE KEY(userId) REFERENCES User
      DESTINATION KEY(productId) REFERENCES Product
  );
```
To make products searchable on their description property, you add a new column descriptionToken of type TOKENLIST to the underlying Product table for storing the tokenized content of the description column:
```sql
-- Adds a new column to store the tokenized description.
ALTER TABLE Product
ADD COLUMN descriptionToken TOKENLIST
AS (TOKENIZE_FULLTEXT(description))
STORED HIDDEN;
```
You can then create a search index on the tokenized description:
```sql
CREATE SEARCH INDEX ProductDescriptionIdx
ON Product(descriptionToken)
OPTIONS (sort_order_sharding = true);
```
Use full-text search to find graph nodes
After the search index is created, you can use the SEARCH function to invoke full-text search on Spanner Graph. The SEARCH function takes two parameters: the property to search on and the input search query. In the following example, the Product nodes whose description contains the keywords “waterproof hiking boots” are returned:
```sql
-- Finds waterproof hiking boots using full-text search on product descriptions.
GRAPH RetailGraph
MATCH (product:Product)
WHERE SEARCH(product.descriptionToken, "waterproof hiking boots")
RETURN product.name;
```
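For readers new to full-text search, the following Python sketch models what the tokenized column and search index do conceptually: descriptions are split into tokens, an inverted index maps each token to the products that contain it, and a query matches products containing every query term. This is a toy model under stated assumptions, not Spanner's implementation. The sample rows, the simple lowercase word tokenizer, and the all-terms-must-match rule are all hypothetical choices made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical sample rows standing in for the Product table.
PRODUCTS = {
    1: "Waterproof leather hiking boots with ankle support",
    2: "Lightweight trail running shoes",
    3: "Insulated waterproof boots for winter hiking",
    4: "Waterproof rain jacket",
}


def tokenize(text):
    """Lowercases the text and splits it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(rows):
    """Maps each token to the set of product ids whose description contains it."""
    index = defaultdict(set)
    for product_id, description in rows.items():
        for token in tokenize(description):
            index[token].add(product_id)
    return index


def search(index, query):
    """Returns the ids of products whose description contains every query token."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    matches = [index.get(token, set()) for token in tokens]
    return set.intersection(*matches)


index = build_index(PRODUCTS)
print(sorted(search(index, "waterproof hiking boots")))  # [1, 3]
```

Note that product 3 matches even though its words appear in a different order than in the query, because the lookup works on individual tokens rather than on the exact phrase.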
Combine full-text search and graph traversals
The real power emerges when you combine full-text search and graph traversals. You can leverage full-text search to quickly pinpoint relevant graph nodes as the starting points for further exploration. Then, you can employ graph query to traverse relationships. This combination uncovers hidden connections, patterns, and insights that would be difficult to discover using either method alone.
Building upon the previous example, the following scenario demonstrates graph traversal starting from products identified through full-text search, in the context of recommendation engines. For each product found, the graph query retrieves users who previously purchased that item. Subsequently, it identifies other products those users have also bought, which are then returned as product recommendations ranked by their popularity. The product popularity is calculated based on the number of unique customers who have previously purchased the item. Note that the products already purchased by the current user are not returned. This process effectively reveals purchase patterns and interconnected preferences within the customer base.
```sql
-- Finds waterproof hiking boots using full-text search on product descriptions.
-- Same as the previous example.
GRAPH RetailGraph
MATCH (product:Product)
WHERE SEARCH(product.descriptionToken, "waterproof hiking boots")
RETURN product

NEXT

-- Recommends products using collaborative filtering by finding other products
-- frequently purchased by customers who bought waterproof hiking boots before.
MATCH (product)<-[:Purchase]-(otherUser:User)-[:Purchase]->(alsoBought:Product)
WHERE product <> alsoBought -- Excludes the original products.
  AND NOT EXISTS { -- Excludes products the customer has already purchased.
    (:User {userId: @userId})-[p:Purchase]->
    (:Product {productId: alsoBought.productId})
  }
RETURN
  alsoBought.name,
  COUNT(DISTINCT otherUser.userId) AS popularity
GROUP BY alsoBought.name
ORDER BY popularity DESC;
```
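To see the logic of the recommendation query step by step, the sketch below mirrors the same traversal over an in-memory list of purchases in Python. It is a conceptual model only: the sample data, the `recommend` function, and the alphabetical tie-break for equally popular products are assumptions introduced for this example and are not part of the query above.

```python
from collections import defaultdict

# Hypothetical (userId, productName) pairs standing in for Purchase edges.
PURCHASES = [
    ("alice", "Summit Hiking Boots"),
    ("alice", "Trekking Poles"),
    ("alice", "Wool Socks"),
    ("bob", "Summit Hiking Boots"),
    ("bob", "Trekking Poles"),
    ("carol", "Summit Hiking Boots"),
    ("carol", "Wool Socks"),
    ("carol", "Trail Backpack"),
    ("dave", "Wool Socks"),
]


def recommend(purchases, matched_products, current_user):
    """Ranks products co-purchased with the matched products by distinct buyers."""
    bought_by = defaultdict(set)  # product -> users who purchased it
    basket = defaultdict(set)     # user -> products they purchased
    for user, product in purchases:
        bought_by[product].add(user)
        basket[user].add(product)

    buyers = defaultdict(set)  # candidate product -> distinct other users
    for product in matched_products:
        for other_user in bought_by[product]:
            for also_bought in basket[other_user]:
                if also_bought == product:
                    continue  # Excludes the original product.
                if also_bought in basket[current_user]:
                    continue  # Excludes products the customer already purchased.
                buyers[also_bought].add(other_user)

    # Most popular first; ties broken by name (an assumption of this sketch).
    ranked = sorted(buyers.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(name, len(users)) for name, users in ranked]


# Products returned by the full-text search step, for customer "dave".
print(recommend(PURCHASES, {"Summit Hiking Boots"}, "dave"))
# [('Trekking Poles', 2), ('Trail Backpack', 1)]
```

Here "Wool Socks" is dropped because dave already owns them, and "Trekking Poles" ranks first because two distinct boot buyers purchased them.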
Query graph and table data together
Spanner Graph bridges the graph and relational worlds by allowing full interoperability between the Graph Query Language (GQL) and SQL. You can traverse your graphs using GQL, then join the results with tables, all in the same query. This gives you the flexibility to choose the best tool for each part of the job. The following example shows a simplified version of the recommendation algorithm above, using full-text search and graph. In this query, besides recommending products, we also use SQL to retrieve the price history of the recommended products from a table, giving users insight into price fluctuations and trends:
```sql
SELECT priceHistory.*
-- Recommends products using full-text search and graph
-- (a simplified version of the example above).
FROM GRAPH_TABLE(
  RetailGraph
  MATCH (product)<-[:Purchase]-(:User)-[:Purchase]->(alsoBought:Product)
  WHERE SEARCH(product.descriptionToken, "waterproof hiking boots")
  RETURN alsoBought.productId
) AS alsoBought
-- Fetches product price history from table.
LEFT JOIN ProductPriceHistory priceHistory
  ON alsoBought.productId = priceHistory.productId;
```
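The join step in the query above can be pictured with a small Python sketch. It assumes the graph part has already produced a list of recommended product ids, and it models the LEFT JOIN with `SELECT priceHistory.*`: every recommended product yields at least one output row, and a product with no price history yields a row of empty values. The sample data and the column names `date` and `price` are hypothetical, since the post does not show the schema of the price history table.

```python
from collections import defaultdict

# Hypothetical output of the graph query: recommended product ids.
RECOMMENDED_IDS = [101, 102, 103]

# Hypothetical rows standing in for the ProductPriceHistory table.
PRICE_HISTORY = [
    {"productId": 101, "date": "2024-01-01", "price": 49.99},
    {"productId": 101, "date": "2024-06-01", "price": 44.99},
    {"productId": 103, "date": "2024-03-15", "price": 19.50},
]


def left_join_price_history(product_ids, price_history):
    """Returns price history rows per product id, or an all-None row if none exist."""
    by_product = defaultdict(list)
    for row in price_history:
        by_product[row["productId"]].append(row)

    joined = []
    for product_id in product_ids:
        rows = by_product.get(product_id)
        if rows:
            joined.extend(rows)
        else:
            # No match: only priceHistory columns are selected, so all are None.
            joined.append({"productId": None, "date": None, "price": None})
    return joined


for row in left_join_price_history(RECOMMENDED_IDS, PRICE_HISTORY):
    print(row)
```

If you want only products that actually have price history, an inner join would drop the empty rows; selecting the product id from the graph side as well would keep unmatched products identifiable.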
Get started
The combination of full-text search and graph, together with SQL interoperability, unlocks hidden insights in your data and transforms the way you explore, analyze, and understand it. We can’t wait to see what you build with Spanner Graph. To get started, you can follow the links below. If you have questions along the way, tag “google-spanner-graph” in community forums or email spanner-graph-feedback@google.com.