As healthcare organizations grow and take in ever-larger volumes of electronic health records (EHRs) and other data from across the patient lifecycle, new strains on their data architecture often come to light. While this strain tends to be especially burdensome for organizations still trapped in costly legacy data systems, its effects can be felt even in modern cloud platforms like Google BigQuery.
BigQuery, a fully managed, serverless enterprise data warehouse, is a secure and scalable cloud data service in its own right. However, key differences in pricing, performance, and ease of use between BigQuery and Snowflake can make one platform a better fit than the other, depending on the organization.
In this blog, we will examine some of the major differences between Google BigQuery and the Snowflake Data Cloud, comparing performance benchmarks, breaking down their respective pricing models, and highlighting some of the reasons an organization might consider migrating to Snowflake.
BigQuery vs. Snowflake: A Side-by-Side Comparison
BigQuery and Snowflake are both excellent data warehousing solutions, with a few key differences.
Architecture: BigQuery’s architecture differs from traditional, node-based cloud data warehouse solutions by decoupling storage and compute. This decoupling allows both to be scaled independently on demand. Snowflake, meanwhile, leverages a hybrid of traditional shared-disk and shared-nothing database architectures. Data is stored in a central cloud storage layer, while each compute cluster operates on cached portions of the dataset, allowing for faster querying.
Pricing Model: Snowflake’s pricing model is based on the usage of computing resources, which means that users are charged for execution time. BigQuery’s on-demand model, on the other hand, uses query-based pricing for computing resources, meaning users are charged for the amount of data their queries process rather than for how long those queries run. While BigQuery storage is slightly cheaper per terabyte than Snowflake storage, the actual cost-effectiveness of either platform is contingent on the size of your datasets and the volume and complexity of queries run on that data. In short, Snowflake tends to be the more cost-effective solution for users running complex queries on large datasets, while BigQuery may be more cost-effective when running simpler queries on smaller datasets.
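To make that trade-off concrete, here is a rough sketch comparing billing by data volume with billing by execution time. Every rate, credit figure, and workload in it is a placeholder assumption chosen for illustration, not a published price from either vendor; substitute the numbers from your own contracts before drawing any conclusions.

```python
# Hypothetical rates for illustration only; replace with your contract pricing.
PRICE_PER_TB_BILLED = 6.00   # assumed USD per TB under query-based pricing
PRICE_PER_CREDIT = 3.00      # assumed USD per compute credit
CREDITS_PER_HOUR = 8         # assumed credit burn rate for the chosen warehouse size


def volume_based_cost(tb_billed: float) -> float:
    """Cost when billing follows the volume of data tied to a query."""
    return tb_billed * PRICE_PER_TB_BILLED


def time_based_cost(runtime_seconds: float) -> float:
    """Cost when billing follows how long the compute resources run."""
    hours = runtime_seconds / 3600
    return hours * CREDITS_PER_HOUR * PRICE_PER_CREDIT


# Two made-up workloads at opposite ends of the spectrum.
workloads = {
    "simple lookup on a small table": {"tb_billed": 0.01, "runtime_seconds": 60},
    "complex join over a large table": {"tb_billed": 20.0, "runtime_seconds": 900},
}

for name, w in workloads.items():
    by_volume = volume_based_cost(w["tb_billed"])
    by_time = time_based_cost(w["runtime_seconds"])
    print(f"{name}: volume-based ${by_volume:.2f}, time-based ${by_time:.2f}")
```

Under these assumed numbers, the small lookup is cheaper with volume-based billing, while the large join is cheaper with time-based billing. The crossover point depends entirely on the rates and workloads you plug in.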
Platform Integration: Because BigQuery operates on the Google Cloud Platform, it is designed to integrate well with other Google Cloud solutions, including Google Data Studio and Google AI Platform. This smooth integration is a considerable advantage for organizations that use Google offerings across their data stack, but can become a double-edged sword for users looking to shop around for best-in-breed offerings from, say, Amazon Web Services or Microsoft Azure. Integration can be a significant factor for healthcare organizations that operate on multi-cloud data strategies, especially in light of Epic Cogito’s impending move to Azure. Snowflake, on the other hand, is platform agnostic, making it a strong choice for organizations looking to avoid vendor lock-in, sidestep overcommitment to any one cloud provider, or go multi-cloud. Snowflake users are free to mix and match Azure, AWS, and even GCP offerings. These customers can keep what they like from Epic and GCP while also benefiting from the lengthy catalog of Snowflake features, including Snowflake Native Apps.
Performance: In a series of benchmark tests performed by GigaOm in 2019, Snowflake considerably outperformed BigQuery in completing 103 TPC-DS queries, with a completion time of 5,793 seconds compared to 37,283 seconds for BigQuery. While it is important to caveat that Snowflake is not necessarily faster than BigQuery under every circumstance, and that both platforms are regularly updated with performance enhancements, the difference in these completion times is still considerable, especially when contextualized by Snowflake’s cost-efficiency benefits when running complex queries on large data sets.
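For a sense of scale, the short calculation below works out the gap between the two reported completion times. It uses only the figures cited from the GigaOm benchmark; the variable names are our own.

```python
# Completion times (in seconds) for the 103 TPC-DS queries, as cited above.
snowflake_seconds = 5_793
bigquery_seconds = 37_283

# How many times longer the BigQuery run took, and the absolute gap in hours.
ratio = bigquery_seconds / snowflake_seconds
hours_saved = (bigquery_seconds - snowflake_seconds) / 3600

print(f"BigQuery took {ratio:.1f}x as long as Snowflake")
print(f"Difference: {hours_saved:.1f} hours across the full query set")
```

This works out to roughly 6.4 times the completion time, or a gap of about 8.7 hours over the full run.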
Roles and Permissions: Snowflake also allows for more granular permission assignments, whereas BigQuery assigns access permissions based on Google Cloud roles. In Snowflake, you can also assign roles to other roles, allowing for more customization of role hierarchies.
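The idea of assigning roles to other roles can be modeled in a few lines. The sketch below is a conceptual illustration in plain Python, not Snowflake's API; the role names and privileges are hypothetical. It shows how a role picks up the privileges of every role granted to it, directly or indirectly.

```python
# Hypothetical roles and privileges, for illustration only.
# Each role maps to the list of roles that have been granted to it.
granted_roles = {
    "ANALYST": [],
    "CLINICAL_ANALYST": ["ANALYST"],
    "DATA_ADMIN": ["CLINICAL_ANALYST"],
}

# Privileges assigned directly to each role.
direct_privileges = {
    "ANALYST": {"SELECT on reporting views"},
    "CLINICAL_ANALYST": {"SELECT on patient tables"},
    "DATA_ADMIN": {"CREATE TABLE in the warehouse schema"},
}


def effective_privileges(role: str) -> set[str]:
    """Collect a role's own privileges plus those inherited from granted roles."""
    seen = set()
    stack = [role]
    privileges = set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against visiting the same role twice
        seen.add(current)
        privileges |= direct_privileges.get(current, set())
        stack.extend(granted_roles.get(current, []))
    return privileges


for role in granted_roles:
    print(role, "->", sorted(effective_privileges(role)))
```

In this toy hierarchy, DATA_ADMIN ends up with all three privileges because it inherits from CLINICAL_ANALYST, which in turn inherits from ANALYST.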
Data Retention: Whereas BigQuery’s Time Travel only lets you view data that has been changed or deleted within the last seven days, Snowflake lets you view historical data from up to 90 days back, depending on your edition and configured retention period. The length of this retention window may be an important consideration for highly regulated industries like healthcare, where historical data may help organizations remain compliant.
Ease of Use: While both Snowflake and BigQuery score highly for usability, the unified, easy-to-use UI that Snowflake provides with Snowsight gives it a slight upper hand. Snowflake also supports a broad range of data formats, including JSON, XML, Avro, and Parquet, among others, making it a somewhat more flexible option.
Snowflake and the Healthcare Sector
Having established how Snowflake measures up against BigQuery in several key areas, let’s take a moment to examine some of its healthcare-specific offerings.
Snowflake’s Healthcare and Life Sciences Data Cloud is an open, interoperable data ecosystem that helps providers and payers alike break down data silos, simplify complex data architectures, and deploy cutting-edge AI tools and technologies to streamline operations and cut costs.
In an industry where upwards of 80% of all patient data is unstructured, offerings like this joint end-to-end image classification solution by Hakkoda and Snowflake also show how the accelerated compute power of the Snowflake environment can be leveraged to drive clinical innovation.
Navigating a BigQuery to Snowflake Migration with Hakkōda
Hakkoda’s Healthcare and Life Sciences team is well-versed in the complex regulatory frameworks that surround electronic health records and other patient records. We also understand the nuances of a successful data migration project, and have helped many healthcare organizations future-proof their data strategies in the Snowflake Data Cloud.
For one client, a large healthcare organization, moving from a legacy Teradata system to Snowflake resulted in an 85% increase in query efficiency, 16 times faster testing, and a cost reduction of 60%.
Private Health Management (PHM), another client, saw efficiency gains of 3x after migrating while building a data architecture in Snowflake that enabled them to predict 26 different kinds of cancer among at-risk groups from employer data across the country.
Our data experts also come prepared with best-in-class solutions and accelerators that can help you avoid common migration pitfalls, whether by automating query testing with AQPT or making it easier to load interoperable health resources into Snowflake with the FHIR Data Loader.
Ready to learn more about how Snowflake can help your organization lead the charge for healthcare data innovation with a cost-effective, platform-agnostic architecture that is perfect for cross-cloud or multi-cloud strategies? Let’s talk.