Harnessing Snowflake’s QAS for Big Data on Small Warehouses

chahatchhabra
Sep 20, 2024
5 min read

Updated: Oct 25, 2024

In today’s data-driven world, optimizing both performance and cost-efficiency is a top priority for organizations working in the cloud. For my team, managing data processing and analysis efficiently in Snowflake has always been critical, especially when handling large data sets. Previously, we relied heavily on large warehouses to run procedures and complex queries, given the scale of our data. However, this came with a steep cost.

Fortunately, with the advent of Snowflake's Query Acceleration Service (QAS), we found a way to significantly reduce our dependence on XL warehouses by leveraging XS and S warehouses. This shift has not only reduced our cloud infrastructure expenses but has also prevented the accidental use of large warehouses for smaller queries.

Understanding Query Acceleration Service (QAS)

QAS is Snowflake’s mechanism to optimize the performance of complex queries without having to scale up to a larger warehouse size. When enabled, it allows Snowflake to dynamically scale resources for specific queries, without changing the underlying size of the warehouse. This is a game-changer, particularly for users like us who need to balance performance with cost.

Instead of jumping directly to an XL warehouse for heavy workloads, we can use a smaller warehouse like XS, S or M and let QAS allocate extra resources only when needed for the query. This allows us to handle complex queries without continuously operating at a higher warehouse tier, saving both time and money.

The diagram below shows the architecture used by Query Acceleration Service. Using this approach, when Snowflake detects a massive query that will scan gigabytes of data then additional database servers are requested from QAS to take on the heavy lifting.

Our Use Case: Reducing Dependency on XL Warehouses

In our Snowflake environment, running procedures and large queries typically required XL warehouses due to the scale of our datasets, which include billions of records. However, with QAS, we found that we could continue to use smaller warehouses like XS or S, leveraging QAS when needed. QAS kicks in for those queries that actually read the large datasets – and get this, it does so automatically!

By enabling QAS on our XS warehouses, we observed a significant performance boost for complex queries, which previously only ran efficiently on larger warehouses. This allowed us to reduce costs by minimizing the usage of XL warehouses for everyday operations.

How We Implemented QAS

Implementing QAS in Snowflake is a simple process. We enabled this feature by following the below steps:

Enable QAS on the XS warehouse using the following command:
Set the query acceleration max scale factor to define how much additional resources Snowflake can allocate for a query. This factor allows us to control how aggressively Snowflake can scale up resources for a query. This means that the XS warehouse with just one database server can automatically scale up to 16 servers. That's the equivalent of automatically scaling from an XSMALL to an X2LARGE.
Monitor query performance using the Query Acceleration History. This returns the most recent queries that utilized QAS, including details such as credit_used. The credit_used column is particularly important as it shows how many additional Snowflake credits were consumed by QAS to speed up query execution. This helps you assess the cost associated with query acceleration and ensures that resources are used efficiently, providing a clear balance between performance gains and cost control.

By enabling QAS, we ensured that Snowflake automatically optimized our queries without the need for manual warehouse scaling.

QAS Execution and Insights

To test the effect of Query Acceleration Service, we executed a series of queries against our Sales data. Before we begin it is important to note that the QAS triggers only under specific conditions, which are -

The query must require scanning large volumes of data, ranging from gigabytes to terabytes.
It must include filter conditions that greatly reduces the amount of data processed.

Hence, we chose and ran the following query with multiple filter conditions against a sales table carrying 525 GB of data with 16 billion records on one of our XS warehouse in the Snowflake, both with and without using QAS by following the above mentioned 1 and 2 steps. To check whether the QAS has indeed been enabled for our XS warehouse, I further ran the SHOW WAREHOUSES command and got the following result.

After executing the same query on two XS warehouses - one with QAS enabled and the other without, we observed the following results. With QAS disabled, the query took 5 minutes 46 seconds to complete. However, with the QAS enabled XS warehouse, the query runtime was reduced to just 42 seconds meaning it was 8.25 times faster! The screenshot below shows the query profile of the query above.

Furthermore, looking at the below Statistics screenshots, we can see that the XS warehouse processed just 6348 micro-partitions while the Query Acceleration Service processed 27258 entries which led to the 8.25 times massive improvement in execution time on the same query and the same size warehouse. Lastly, we observed that with QAS enabled, only 20.05 GB of data was scanned, compared to 101.13 GB without QAS, a reduction of nearly fivefold in the amount of data scanned. This difference is due to the QAS server optimizing the query by accelerating targeted scans, avoiding unnecessary partition scans, thereby reducing the overall workload on the warehouse and significantly improving performance.

With QAS enabled (on XS):

Without QAS (on XS):

The following screenshot shows the credit cost of the above QAS operation, and the number of bytes scanned by the query acceleration service.

This difference in performance is noteworthy. By simply enabling QAS, we saw a substantial reduction in query execution time, all while staying on an XS warehouse. The total credits used by a series of queries (some of which triggered QAS) on the QAS-enabled XS warehouse were 0.4532 credits, whereas running the same queries on an XL warehouse consumed 2.6670 total credits. This demonstrates a significant cost efficiency, as QAS not only speeds up queries but also reduces overall credit consumption, making it a highly effective solution for managing performance and cost in Snowflake.

Conclusion: QAS as a Cost-Saving Powerhouse

The introduction of QAS in our Snowflake environment has been transformative. We’ve decreased our dependence on large warehouses and now utilize small warehouses with QAS for different types of workloads.

Furthermore, it addresses the issue of inefficient resource utilization effectively. It helps users to avoid running small queries unnecessarily on a large warehouse by enabling them to run both small and larger queries by offloading the large table scans and filtering operations to dedicated QAS servers, dynamically providing additional resources when needed. The best part is that it operates quietly in the background, improving query performance without any manual intervention. Users only pay for the extra resources consumed during query acceleration, making it both cost-effective and easy to implement.

Insurance

Enterprise Data Hub

Pharma

Data Observability

Commercial Data Warehouse (CDW)