GCP – Enhancing AlloyDB vector search with inline filtering and enterprise observability
Many specialized vector databases today require you to build complex pipelines and applications to get the data you need. AlloyDB for PostgreSQL offers ScaNN, Google Research’s state-of-the-art vector search index, enabling you to optimize the end-to-end retrieval of the freshest, most relevant data with a single SQL statement.
Today, we are introducing a set of new enhancements to help you get even more out of vector search in AlloyDB. First, we are launching inline filtering, a major performance enhancement to filtered vector search in AlloyDB. One of the most powerful features in AlloyDB is the ability to perform filtered vector search directly in the database, instead of post-processing on the application side. Inline filtering helps ensure that these types of searches are fast, accurate, and efficient — automatically combining the best of vector indexes and traditional indexes on metadata columns to achieve better query performance.
Second, we are launching enterprise-grade observability and management tooling for vector indexes to help you ensure stable performance and the highest quality search results. This includes a new recall evaluator: built-in tooling for measuring recall, a key metric of vector search quality. That means you no longer have to build your own measurement pipelines and processes for your applications to deliver good results. We’re also introducing vector index distribution statistics, allowing customers with rapidly changing real-time data to achieve more stable, consistent performance.
Together, these launches further strengthen our mission of providing performant, flexible, high-quality end-to-end solutions for vector search that enterprises can rely on.
A review of filtered vector search in AlloyDB
Many customers start their journey with vector search trying simple search on a single column. For example, a retailer might want to perform a semantic search on product descriptions to surface the right products to match end-user queries.
```sql
-- Rank products by cosine distance (<=>) to the query embedding.
SELECT * FROM product
ORDER BY embedding <=> embedding('text-embedding-005', 'red cotton crew neck shirt')::vector
LIMIT 50;
```
However, very quickly, as you look to productionize these solutions and improve the quality of your results, you may find that the queries themselves get more interesting. You might iterate — add filters, perform joins with other tables, and aggregate your data. For example, the retailer might want to allow users to filter by size, price, and more.
```sql
-- Same semantic search, restricted by metadata filters.
SELECT * FROM product
WHERE category = 'shirt' AND size = 'S' AND price < 100
ORDER BY embedding <=> embedding('text-embedding-005', 'red cotton crew neck')::vector
LIMIT 50;
```
AlloyDB’s PostgreSQL interface provides a strong developer experience for these types of workloads. Because vector search is integrated into the SQL interface, developers can very easily query structured and unstructured data together in a single SQL statement, as opposed to writing complex application code that pulls data from multiple sources.
Moreover, changing requirements such as adding new query filters typically don’t require schema or index updates. If our retailer, for example, wants to only show in-stock items at the end user’s local store, they can very easily join their products table with an existing store inventory table via the SQL interface.
```sql
-- Join with the inventory table so only in-stock products are returned.
SELECT * FROM product p
JOIN product_inventory pi ON p.id = pi.product_id
WHERE category = 'shirt' AND pi.inventory > 0
ORDER BY embedding <=> embedding('text-embedding-005', 'red cotton crew neck')::vector
LIMIT 50;
```
All of this, and more, is possible in AlloyDB!
Inline filtering
But as a developer, you don’t just want to execute the query; you also want excellent performance and recall. To deliver the best performance, the AlloyDB query optimizer makes choices on how to execute a query with filters. Inline filtering is a new query optimization technique that allows the AlloyDB query optimizer to evaluate both the metadata filtering conditions and the vector search in tandem, leveraging both vector indexes and indexes on the metadata columns. Inline filtering is now available for AlloyDB’s ScaNN index, a search technology based on over a decade of Google research into semantic search algorithms.
AlloyDB intelligently and automatically employs this technique when it’s most beneficial. Depending on the query and the distribution of the underlying data, the query planner automatically chooses the execution plan with the best performance. When filters are very selective, i.e., when a very small number of rows matches the filter, the query planner typically executes a pre-filter. This can leverage an index on a metadata column to find the small subset of rows that match the filter, and then perform a nearest-neighbor search on only those rows. Alternatively, the query planner may decide to execute a post-filter in cases of low selectivity — i.e., if a large percentage of rows match the filtered condition. Here, the query planner starts with the vector index to come up with a list of relevant candidates, and then removes results that do not match the predicates on the metadata columns.
Inline filtering, on the other hand, is best for cases with medium selectivity. As AlloyDB searches through the vector index, it only computes distances for vectors that match the metadata filtering conditions. This massively improves performance for these queries, complementing the advantages of pre-filtering and post-filtering. With this feature, AlloyDB delivers strong filtered vector search performance across the full range of filter selectivities.
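To make the three strategies concrete, here is a minimal Python sketch over a toy, two-partition "index". It is an illustration only, not AlloyDB's implementation: the `PARTITIONS` data, the function names, and the `leaves` parameter are all hypothetical, and the second value each function returns simply counts how many row-level distances were computed.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (0.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


# Hypothetical partitioned index. Each row is (id, category, embedding).
PARTITIONS = [
    {"centroid": (1.0, 0.0), "rows": [
        (1, "shirt", (0.9, 0.1)), (3, "pants", (0.95, 0.05)),
        (4, "shirt", (0.7, 0.3)), (6, "shoes", (0.85, 0.15)),
    ]},
    {"centroid": (0.0, 1.0), "rows": [
        (2, "shirt", (0.2, 0.8)), (5, "pants", (0.1, 0.9)),
    ]},
]


def nearest_partitions(query, leaves):
    """Pick the `leaves` partitions whose centroids are closest to the query."""
    return sorted(PARTITIONS, key=lambda p: cosine_distance(p["centroid"], query))[:leaves]


def pre_filter(predicate, query, k):
    """Find matching rows first (as a metadata index would), then rank only those."""
    matching = [r for p in PARTITIONS for r in p["rows"] if predicate(r)]
    ranked = sorted(matching, key=lambda r: cosine_distance(r[2], query))
    return [r[0] for r in ranked[:k]], len(matching)


def post_filter(predicate, query, k, leaves=1):
    """Rank every row in the searched partitions, then drop non-matching rows."""
    candidates = [r for p in nearest_partitions(query, leaves) for r in p["rows"]]
    ranked = sorted(candidates, key=lambda r: cosine_distance(r[2], query))
    return [r[0] for r in ranked if predicate(r)][:k], len(candidates)


def inline_filter(predicate, query, k, leaves=1):
    """Walk the searched partitions, computing distances only for matching rows."""
    candidates = [
        r for p in nearest_partitions(query, leaves) for r in p["rows"] if predicate(r)
    ]
    ranked = sorted(candidates, key=lambda r: cosine_distance(r[2], query))
    return [r[0] for r in ranked[:k]], len(candidates)


def is_shirt(row):
    return row[1] == "shirt"


query = (1.0, 0.0)
print(pre_filter(is_shirt, query, k=2))     # ([1, 4], 3)
print(post_filter(is_shirt, query, k=2))    # ([1, 4], 4)
print(inline_filter(is_shirt, query, k=2))  # ([1, 4], 2)
```

All three return the same two shirts here, but the inline variant computes the fewest row distances because it skips non-matching rows while it scans the partition. A real planner would weigh these costs against actual data statistics rather than a fixed toy layout.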
Enterprise-grade observability
If you’re running similarity search or generative AI workloads in production, you need stable performance and quality of results, just as you do for any other database workload. Observability and manageability tooling are key to achieving that.
With the new recall evaluator, built directly into the database, you can now more systematically measure, and ultimately tune, search quality with a single stored procedure in the database rather than build custom evaluation pipelines.
Recall in similarity search is the fraction of relevant instances that were retrieved from a search, and is the most common metric used for measuring search quality. One source of recall loss comes from the difference between approximate nearest neighbor search, or aNN, and k (exact) nearest neighbor search, or kNN. Vector indexes like AlloyDB’s ScaNN implement aNN algorithms, allowing you to speed up vector search on large datasets in exchange for a small tradeoff in recall. Now, AlloyDB provides you with the ability to measure this tradeoff directly in the database for individual queries and ensure that it is stable over time. You can update query and index parameters in response to this information to achieve better results and performance. This management tooling is critical if you care deeply about stable, high-quality results.
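As a rough illustration of the metric itself (not of AlloyDB's built-in evaluator), recall for a single query can be computed by comparing the ids returned by an approximate search against the ids returned by an exact search. The function name and the sample ids below are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact nearest neighbors that the approximate search returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


# Hypothetical top-5 results for one query.
exact_top5 = [11, 42, 7, 93, 58]   # from an exact (kNN) scan
approx_top5 = [11, 7, 42, 58, 64]  # from an approximate (aNN) index search

print(recall_at_k(approx_top5, exact_top5))  # 0.8 (4 of the 5 true neighbors found)
```

Tracking this value for a representative set of queries over time is one way to notice when index or query parameters need retuning.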
In addition to the recall evaluator, we’re also introducing vector index distribution statistics for the ScaNN index, allowing developers to see the distribution of vectors within the index. This is particularly useful for workloads with high write throughput or data change rates. In these scenarios, new real-time data is automatically added to the index and is ready for querying right away. Now, you can monitor any changes in vector-index distribution, and ensure that performance stays robust through these data changes.
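To see why the distribution matters, consider a simplified sketch that assumes an index assigns each vector to its nearest centroid. It does not reflect the statistics AlloyDB actually exposes: the centroids, the sample vectors, and the `imbalance` measure are all hypothetical. It shows how a burst of similar new writes can skew partition sizes.

```python
from collections import Counter


def nearest_centroid(vector, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((v - c) ** 2 for v, c in zip(vector, centroids[i])),
    )


def partition_sizes(vectors, centroids):
    """Number of vectors assigned to each centroid, in centroid order."""
    counts = Counter(nearest_centroid(v, centroids) for v in vectors)
    return [counts.get(i, 0) for i in range(len(centroids))]


def imbalance(sizes):
    """Largest partition size divided by the mean size; 1.0 means perfectly even."""
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean if mean else 0.0


centroids = [(0.0, 0.0), (1.0, 1.0)]
initial = [(0.1, 0.0), (0.0, 0.2), (0.9, 1.0), (1.0, 0.8)]
new_writes = [(0.95, 0.9), (1.1, 1.0), (0.8, 0.9), (1.0, 1.2)]

before = partition_sizes(initial, centroids)
after = partition_sizes(initial + new_writes, centroids)
print(before, imbalance(before))  # [2, 2] 1.0
print(after, imbalance(after))    # [2, 6] 1.5
```

In this toy case all new writes land in one partition, so searches that touch it must scan three times as many vectors as searches that touch the other. That kind of drift is what distribution statistics are meant to surface.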
Getting started
ScaNN for AlloyDB is generally available. To get started, follow our quickstart guide to create an AlloyDB instance, then read the documentation for some fast and easy vector queries. You can also try AlloyDB for free with a 30-day free trial.
You can also tune into our live Data Cloud Innovation Webcast: Data Analytics and Databases to learn more about these new technologies.
To learn more, check out our introduction to the ScaNN for AlloyDB index, or read our ScaNN for AlloyDB whitepaper, which introduces vector search at large and then takes a deep dive into the ScaNN algorithm and how we implemented it in PostgreSQL and AlloyDB.