GCP – Manage dynamic query concurrency with BigQuery query queues
BigQuery is a powerful cloud data warehouse that can handle demanding workloads. BigQuery users benefit from continuous improvements in performance, durability, efficiency, and scalability, without downtime or upgrades.
Today, we are pleased to announce the general availability of query queues in BigQuery.
What are query queues and why use them?
BigQuery query queues introduce a dynamic concurrency limit and enable queuing. Query queues are enabled by default for all BigQuery customers. Previously, BigQuery supported a fixed concurrency limit of 100 queries per project. When the number of queries exceeded this limit, users received a quota exceeded error when attempting to submit an interactive job.
Concurrency is now calculated dynamically based on the available slot capacity and the number of queries that are currently running. While most customers will use the dynamic concurrency calculation, administrators can also choose to set a maximum concurrency target for a reservation to ensure that each query has enough slot capacity to run. This also means that queries that cannot be processed immediately are added to a queue and run as soon as resources become available, instead of failing.
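To make the relationship between a concurrency target and per-query capacity concrete, here is a minimal sketch. It assumes, purely for illustration, that a reservation's slots are shared evenly among the queries running at the target; BigQuery's actual scheduling is more dynamic than that. The function name slots_per_query and the 2,000-slot reservation are hypothetical.

```python
def slots_per_query(reservation_slots: int, target_concurrency: int) -> float:
    """Rough per-query slot share, ASSUMING an even split across running queries."""
    if reservation_slots <= 0 or target_concurrency <= 0:
        raise ValueError("reservation_slots and target_concurrency must be positive")
    return reservation_slots / target_concurrency


# Hypothetical 2,000-slot reservation at three different concurrency targets.
for target in (100, 40, 10):
    share = slots_per_query(2000, target)
    print(f"target concurrency {target:>3}: about {share:.0f} slots per query")
```

Under this simplified model, lowering the target raises the share each running query can expect, at the cost of more queries waiting in the queue.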
Here is what happens with query queues enabled:
Using query queues
Dynamic concurrency: BigQuery dynamically determines the concurrency based on available resources and can automatically set and manage the concurrency based on reservation size and usage patterns. The default concurrency configuration is zero, which enables dynamic configuration, but experienced administrators can manually override it by specifying a target concurrency limit. The admin-specified limit can’t exceed the maximum concurrency provided by available slots. The limit is not configurable by administrators for on-demand workloads.
Queuing: Query queues help manage scenarios where peak workloads generate a sudden increase in queries that exceeds the maximum concurrency limit. With queuing enabled, BigQuery can queue up to 1,000 interactive queries and 20,000 batch queries, ensuring that they are scheduled for execution rather than being terminated due to concurrency limits, as was previously the case. Users no longer need to look for idle times or periods of low usage when deciding when to submit their workloads. BigQuery either runs their requests immediately or places them in a queue to run as soon as the currently running workloads have finished.
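The admission behavior described above can be sketched as a small in-memory model. This is an illustration only, not BigQuery's scheduler: the class QueueModel, its method names, the fixed concurrency value standing in for the dynamically computed limit, and the single first-in-first-out waiting line are all assumptions. Only the queue limits of 1,000 interactive and 20,000 batch queries come from the text.

```python
from collections import Counter, deque

# Per-project queue limits quoted in the post.
MAX_QUEUED = {"interactive": 1_000, "batch": 20_000}


class QueueModel:
    """Toy model: run a query if there is room, otherwise queue it, otherwise reject it."""

    def __init__(self, concurrency: int) -> None:
        self.concurrency = concurrency   # stand-in for the dynamic limit
        self.running = set()             # ids of queries currently executing
        self.waiting = deque()           # (job_id, priority) pairs in arrival order
        self.queued_count = Counter()    # queued queries per priority

    def submit(self, job_id: str, priority: str = "interactive") -> str:
        if len(self.running) < self.concurrency:
            self.running.add(job_id)
            return "running"
        if self.queued_count[priority] < MAX_QUEUED[priority]:
            self.waiting.append((job_id, priority))
            self.queued_count[priority] += 1
            return "queued"
        return "rejected"

    def finish(self, job_id: str) -> None:
        """Mark a query as done and promote waiting queries while capacity allows."""
        self.running.discard(job_id)
        while self.waiting and len(self.running) < self.concurrency:
            next_id, priority = self.waiting.popleft()
            self.queued_count[priority] -= 1
            self.running.add(next_id)


model = QueueModel(concurrency=2)
print([model.submit(f"q{i}") for i in range(4)])
model.finish("q0")
print(sorted(model.running))
```

In this model, a query submitted above the concurrency limit waits instead of failing, and finishing a running query immediately promotes the oldest waiting one.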
Key metrics and highlights
Target job concurrency: Setting a lower target_job_concurrency for a reservation increases the minimum number of slots allocated per query, which can result in faster or more consistent performance, particularly for complex queries. Changes to concurrency are only supported at the reservation level.
Specs: Within each project, up to 1,000 interactive queries and 20,000 batch queries can be queued at once. Batch queries use the same resources as interactive queries.
Timeouts: Users can now configure a queue timeout for queries and jobs. If a query can’t start executing within the specified time, BigQuery attempts to cancel it instead of leaving it queued for an extended period. The default timeout is 6 hours for interactive queries and 24 hours for batch queries, and it can be set at the organization or project level.
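As a sketch of how project-level queue timeouts might be set, the helper below builds an ALTER PROJECT statement from timeout values given in hours. The helper name queue_timeout_ddl and the project and region values are hypothetical. The option names default_interactive_query_queue_timeout_ms and default_batch_query_queue_timeout_ms are written as recalled from the BigQuery configuration settings documentation and are not taken from this post, so verify them and the statement syntax before running anything.

```python
def queue_timeout_ddl(project_id: str, region: str,
                      interactive_hours: float = 6,
                      batch_hours: float = 24) -> str:
    """Build a project-level statement that sets queue timeouts in milliseconds.

    Defaults mirror the 6-hour and 24-hour values quoted in the post.
    Option names are an assumption; check them against the current documentation.
    """
    def to_ms(hours: float) -> int:
        return int(hours * 3_600_000)

    prefix = f"region-{region}"
    return (
        f"ALTER PROJECT `{project_id}`\n"
        "SET OPTIONS (\n"
        f"  `{prefix}.default_interactive_query_queue_timeout_ms` = {to_ms(interactive_hours)},\n"
        f"  `{prefix}.default_batch_query_queue_timeout_ms` = {to_ms(batch_hours)}\n"
        ");"
    )


# Hypothetical project and region: cancel interactive queries that wait over 30 minutes.
print(queue_timeout_ddl("my-project", "us", interactive_hours=0.5))
```

The helper only produces text; submitting the statement requires the appropriate administrator permissions on the project or organization.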
For more information, read the query queues documentation.