Cloud giants get tooled up for data-driven battle royale

451 Research recently predicted that databases are likely to be the competitive battlefield in the ongoing price wars between cloud providers. One reason for this is that the three major cloud providers have reached relative parity in their assembled database, data management and analytics portfolios. In this report, we survey their respective data-related arsenals for functionality differentiators, and potential pricing weapons.

The 451 Take

It's by no means guaranteed that database and analytics services will be the next focus of the cloud price wars, but for the reasons outlined below, we believe data and analytics in general – and relational databases in particular – are likely to be the next competitive battlefield. The table below illustrates that, while there are some gaps, the portfolios of Amazon Web Services, Microsoft Azure and Google Cloud are broadly comparable. There remain functional differences between the various services, of course, but the more similar the services, the greater the pressure for price wars. Even where functional differences exist, price remains a competitive weapon to be used in conjunction with differentiated functionality.

451 Research recently said it expects that database services will likely be the next focus of the competitive price battles between the cloud services providers. With the cloud-price battlefield having shifted from virtual machines (VMs) to object storage, we have identified databases as a probable candidate to undergo the same pricing pressures over the next 18 months.

There are a number of reasons why databases and data-related services are good candidates for price cuts.

  • First, what percentage of enterprises have at least one relational database? The answer has to be close to 100%. As such, the database is a major candidate for migration to the cloud simply because everyone has one, while adoption of database cloud services has, to date, lagged behind other services, such as compute and storage. Generally, despite the emergence of sexier cloud services such as serverless and containers, most enterprises are still operating using old-school technologies – virtual machines, simple storage and relational databases. For those that want to migrate, these line items are the ones CIOs will be worrying about when it comes to cost, driving price pressure as they hunt for the best deals. Virtual machines and storage have already been the subject of this pressure, and since database services are among the next set of services for which adoption can be expected to expand, they are likely to be next.

  • Second, data has gravity, storing that data costs money, and nearly all end users want to store some data. Virtual machine expenditure is under the cloud/infrastructure administrator's control (even if the administrator provisions auto-scaling), whereas storage and database expenditure is often controlled by line-of-business users. A typical end user (let's say a janitorial assistant) at an enterprise won't impact virtual machine costs because the user wouldn't have the ability to provision one (or the desire to). But that same end user might upload a 100GB file (let's say a roster of cleaning duties in our scenario) to a storage service, which would have a cost implication that the end user is likely to be unaware of. The result is that storage and database costs will probably grow rapidly, and reducing this expenditure through negotiation will become increasingly important to CIOs.

  • Third, relational database as a service (DBaaS) has an indirect but important competitor – hosting one's own database on a cloud provider's virtual machine. Administrators can freely install a database on any cloud provider's virtual machine at little or no premium. Of course, the benefit of DBaaS is that the administrator doesn't have to worry about the infrastructure or most of the platform – but we have a hunch that DBaaS is overpriced compared to hosting the database on a VM. Economically, supporting your own database on a VM is a 'substitute good,' and has the potential to force prices down as CIOs become more aware of the premium they are paying relative to self-managing. We think margins on relational DBaaS are still substantial, with plenty of room for movement.

  • Finally, having expanded their data-related services portfolios over recent years, the major cloud providers are now at relative parity. There are some functional differences between services, of course, but relatively few gaps remain to be filled in the portfolios of AWS, Microsoft and Google, which leaves price reductions as the next competitive leverage point.

Choose your weapons – how AWS, Azure and Google Cloud data-services portfolios compare

The table below illustrates how the database, data management and analytics portfolios of AWS, Azure and Google compare. We chose to limit this analysis to those three providers because they are the three most popular cloud providers, according to the responses of end users to our Voice of the Enterprise surveys, and also because they have been involved most directly in price skirmishes to date. However, we also expect them to face significant competition from the incumbent database and analytics providers given their evolving cloud strategies, and in a forthcoming report will be similarly examining the data and analytics cloud services of IBM, Oracle and SAP.

Most of the services listed are generally available, although some are in public or private preview. Below, we will review some of the more interesting functional differences, the gaps that need to be filled, and the primary potential opportunities for price skirmishes.

Mind the gaps

The table above illustrates our point about Google's cloud data-services portfolio: while the company has strengthened and expanded the portfolio in recent years, gaps do remain, most notably in terms of database and data-migration services, as well as data cataloging and data transformation.

While Google does offer functionality for moving data into and out of Google Cloud Storage, BigQuery and Cloud SQL, it currently doesn't offer anything to directly compare with Microsoft's Data Migration Assistant, let alone AWS's combination of AWS Snowball, AWS Snowball Edge and AWS Snowmobile. Microsoft also indicated that this is an area it plans to make improvements in, with its new Azure Data Migration Service now in private preview.

As we also recently noted, data cataloging is a potential area for development and/or acquisition for Google. AWS is currently testing the Glue managed data catalog and ETL service, while Microsoft offers Azure Data Catalog to provide a catalog of all data assets in Azure.

Google isn't the only cloud provider with gaps in its data-services portfolio. Search-based analytics is a gap for Microsoft's Azure, which has nothing to directly compare with the Amazon Elasticsearch Service. Google recently announced plans to plug this gap itself by partnering with Elastic to launch a managed Elasticsearch service in the second half of 2017.

Differentiation

While Google has some gaps, it also offers differentiated services that cannot be found elsewhere. For example, although Amazon RDS, Amazon Aurora, Microsoft Azure SQL Database and Google Cloud SQL are broadly comparable, we previously explained how and why Google Cloud Spanner is a truly differentiated globally distributed transactional database service.

Similarly, while Google BigQuery competes with Amazon Redshift and Microsoft Azure SQL Data Warehouse for data-warehousing workloads, it is based on quite different underlying technology and architectural approaches, to the extent that it also competes with services from AWS and Microsoft for big-data processing, such as Amazon Athena and Microsoft Data Lake Analytics.

Launched in December 2016, Amazon Athena enables SQL-based analysis of data stored in S3. Athena is based on the open source Presto distributed SQL query engine – which originated at Facebook but was inspired by Dremel, Google's internal research project (commercialized as BigQuery) that effectively provides an ad hoc SQL query system to analyze data stored in the Colossus file system.

Similarly, Microsoft's Azure Data Lake Analytics offering (including the U-SQL query language) exposes the results of a number of internal Microsoft big-data projects (including the Dryad distributed data-processing framework, the Cosmos storage and computation framework, and the Scope language for parallel query execution).

Meanwhile, AWS also recently introduced Redshift Spectrum, a Redshift capability for processing complex queries on data in S3 without needing to load the data into Amazon Redshift tables.

The three cloud giants also offer services for data management and preparation that could be described as differentiating, but that actually offer overlapping functionality. For example, AWS Data Pipeline automates the movement and transformation of data, while Microsoft offers Azure Data Factory for creating, managing and scheduling data integration and transformation pipelines.

Google's Cloud Dataflow is an SDK and managed service for building data pipelines to read, transform and analyze data, which can be run in either streaming mode for real-time data or batch mode for historical data. Azure Data Factory also offers data-preparation functionality – as does AWS Glue – which would be comparable to Google Cloud Dataprep, a self-service data-preparation offering currently in private beta that Google developed in conjunction with Trifacta.

Analytics is an area where Microsoft arguably has a significant advantage over its rivals, thanks to its massive on-premises installed base for Excel and the Power extensions that it packaged together as Power BI. Microsoft also recently announced the general availability of Azure Analysis Services, an OLAP engine and modeling platform based on the in-memory analytics engine used by SQL Server Analysis Services.

In comparison, AWS offers Amazon QuickSight for ad hoc analysis and visualization, as well as QuickSight's SPICE in-memory data store, which enables integration with third-party visualization software. Meanwhile, Google offers Cloud Datalab, which is based on the open source Jupyter notebook and primarily targeted at data scientists, as well as the more broadly applicable Google Data Studio visual-analysis cloud service.

A significant area of potential competition is machine learning. All three cloud providers offer a core machine-learning service and associated APIs and dedicated services. We will explore the competition and pricing in this space in more detail in a forthcoming report.

Potential pricing focus areas

In theory, the most differentiated offerings should be subject to the lowest price pressure; there is less need to differentiate on price if the product is differentiated on capability. With this in mind, Google Cloud Spanner seems like the least likely candidate to undergo price cuts. But there are substitute goods in terms of all the other database services available – they aren't necessarily equivalent, but could provide alternatives to administrators if Cloud Spanner is seen as too expensive or unsuitable.

Microsoft's new Azure Cosmos DB is a case in point. Although it is globally distributed like Google Cloud Spanner, it is a multi-model NoSQL database rather than a relational database, and therefore arguably not directly comparable to Google Cloud Spanner. However, it does support the SQL dialect of DocumentDB (its predecessor) and can be tuned for strong consistency. Microsoft can be expected to argue that those capabilities, combined with 99.99% availability, will be enough for many applications that don't require the 99.999% availability and transactional consistency available with Google Cloud Spanner. Microsoft has also shown that it is willing to transfer existing DocumentDB customers to Cosmos DB and its additional functionality at no extra cost.
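
To put the two availability figures in perspective, the minimal sketch below converts each percentage into the downtime it permits per year. It is our own illustration rather than either vendor's SLA calculation: the function name is hypothetical, and it assumes a 365-day year with availability measured across the whole year.

```python
# Illustrative arithmetic only; not taken from any vendor's SLA terms.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def max_downtime_minutes_per_year(availability_pct: float) -> float:
    """Return the yearly downtime (in minutes) permitted by an availability target."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


for target in (99.99, 99.999):
    minutes = max_downtime_minutes_per_year(target)
    print(f"{target}% -> {minutes:.2f} minutes/year")
# 99.99% -> 52.56 minutes/year
# 99.999% -> 5.26 minutes/year
```

On these assumptions, the step from 'four nines' to 'five nines' is the difference between roughly 53 minutes and roughly five minutes of permitted downtime per year, which is the gap Microsoft can be expected to argue many applications will tolerate.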

Although Google Cloud Spanner offers scalability advantages over other relational database services, it lacks compatibility with existing database applications. This is an advantage enjoyed by AWS's Amazon RDS, as well as Microsoft's Azure SQL Database and the new Azure Database for MySQL or PostgreSQL, and even Google Cloud SQL.

Amazon RDS, Azure Database and Google Cloud SQL, especially, are (relatively speaking) commodity services, and would be more likely candidates for price cuts. Additionally, we believe AWS has the potential to be more aggressive with Amazon Aurora, which offers greater scalability and performance than standard relational database services, although less than the globally distributed Google Cloud Spanner.

Making an argument that most users don't currently need the full scalability that Spanner offers for the majority of their applications – while also highlighting Aurora's compatibility with MySQL and PostgreSQL, combined with aggressive price cuts – could drive customers to AWS based on a 'more capability for less cash' perception, just as Microsoft is looking to do with Azure Cosmos DB.

Cloud services are economically 'complementary goods' – AWS and Microsoft might make less revenue on Aurora and Cosmos DB than they could if they charged more. But the gains they would make in complementary storage, compute and other services might offset this difference and then some.

Our Cloud Price Index has quantitatively shown that while price is a major contributor to buyers' decisions, other things such as features, relationship and support matter more. The Cloud Price Index provides benchmark prices for a variety of databases and storage services, together with distributions of quotations in the market.

Our advice is always: price reasonably against the market price, and don't be afraid to charge a premium – just make sure the premium is justifiable. And stop kidding yourself that price doesn't matter – enterprises can play with your competitors' services in less than five minutes with a credit card. Service providers and vendors should charge so that buyers are content to pay the premium and are not driven to get hands-on with competitors' services.
