The disrupted datacenter: Chiller-free cooling
Cooling has been the largest facilities cost in datacenters for decades. The combined capital and operational expenditures associated with keeping data halls within a narrow and low temperature band, still the norm at many enterprise and multi-tenant sites, can be between one-third and one-half of the total lifecycle cost of a facility, depending on engineering choices. This is changing. New cooling products that remove heat efficiently, combined with the adoption of (slightly) wider temperature bands, have gone a long way to reduce cooling expenses. As a result, efficient datacenters spend relatively little energy on cooling: 10-25% of the IT load is typical as an annual average. This compares to the typical overhead of 50-100% in older, less optimized facilities built in the previous decade.
However, 451 Research data indicates that restricting data hall temperatures to narrow bands is probably unjustified. This will increasingly be the case in the future as the costs of IT hardware continue to fall and as workloads become increasingly decoupled from hardware components. For some future builds, it may be financially prudent to further widen climatic set points to eliminate mechanical refrigeration, which will save on overall costs and may reduce the risk of datacenter failures or a site outage. Chiller-free datacenter cooling is one of more than a dozen technologies that we are evaluating as part of our upcoming Disruptive Technologies in the Datacenter report, a follow-on from our widely read and referenced 2013 report.
The 451 Take
Chiller-free datacenters, while a very small minority, are a reality today. It is within the reach of any operator to build an exceptionally energy- and cost-efficient yet mission-critical facility, as some have already demonstrated. However, barriers to wider adoption of the concept remain. There is a tendency to reject wide operating temperature bands, a prerequisite for chiller-free datacenters in most climates, out of fear for IT system health. 451 Research believes such fears will diminish over time as operational confidence with wider bands builds up, and as IT infrastructures are increasingly made up of components that are disposable from the application's viewpoint. Chiller-free designs are the next major step in shedding excess capital costs in datacenters.
Technology and context
Traditional datacenter cooling involves some sort of mechanical refrigeration. In a 'typical' facility, chillers or direct-expansion (DX) units are used. Chillers produce cold water that is circulated through the building to heat exchangers, while DX-based air conditioning units cool the air directly in the data hall where IT systems produce heat. Even though both technologies have become much more efficient in recent years, the process remains energy-intensive, which raises both energy costs and the capital outlay for the additional power capacity needed to support the peak electrical load of the compressors.
At the heart of the issue is the long-held industry consensus that IT systems need to be protected in a tightly controlled environment to minimize the chances of hardware component failures. Historically, it was considered best practice, and for good reason, to keep server inlet temperatures in the data hall between 19°C and 21°C (66-70°F). Relative humidity is another factor controlled at great expense, with targets in the 40-55% range. Such care was deliberate: Server and storage systems used to be rather expensive (costing tens if not hundreds of thousands of dollars each), came in a variety of shapes and sizes, and dictated strict climatic specifications. Also, there were relatively few of them in any datacenter by today's standards, and they consumed much less energy overall. Datacenter builds were dominated by IT systems, and applications depended greatly on the reliability and availability of hardware. Optimizing climatic conditions in the data hall for minimized hardware failure rates made business sense.
However, circumstances have changed profoundly in the last 10 years. By and large, IT infrastructures today are built with inexpensive commodity components. Mechanical means of data storage, such as tapes and hard drives that are sensitive to climatic conditions, are giving way to more resilient solid-state memory in production (primary) systems. The environmental requirements of IT systems from all major suppliers are far more relaxed than they were, and numerous suppliers support high-temperature operations in excess of 32°C (90°F). ASHRAE, the industry body that publishes widely accepted guidelines on climatic controls, recommends a wider operating air temperature range of 18°C-27°C (64.4°F-80.6°F), with allowances for even wider bands of 15°C-32°C (59°F-89.6°F). This means operators are free to let temperatures drift to optimize for cooling costs.
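As a rough illustration of the two bands quoted above, the sketch below classifies a server inlet temperature as falling inside the recommended range, inside the wider allowance, or outside both. The function and constant names are hypothetical, and the thresholds are simply the figures cited in this report, not a complete statement of the ASHRAE guidelines (which also cover humidity and equipment classes).

```python
# Illustrative sketch only: band limits are the figures quoted in the text.
RECOMMENDED_C = (18.0, 27.0)  # recommended inlet range, degrees Celsius
ALLOWABLE_C = (15.0, 32.0)    # wider allowance, degrees Celsius


def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def classify_inlet(temp_c: float) -> str:
    """Return which quoted band a server inlet temperature falls into."""
    low, high = RECOMMENDED_C
    if low <= temp_c <= high:
        return "recommended"
    low, high = ALLOWABLE_C
    if low <= temp_c <= high:
        return "allowable"
    return "out_of_range"


if __name__ == "__main__":
    for reading in (20.0, 30.0, 35.0):
        print(f"{reading:.1f}C / {c_to_f(reading):.1f}F -> {classify_inlet(reading)}")
```

A monitoring script built along these lines would flag only readings outside the wider allowance, rather than treating every excursion above 21°C as an alarm.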
Regardless of these developments, the mainstream datacenter industry has yet to fully embrace relaxed climatic controls. New sites largely operate within relatively conservative bands and retain mechanical cooling as an insurance policy. The exceptions are mostly hyperscale datacenters and some IT services providers.
This is despite the emergence of new cooling systems that can meet industry thermal guidelines without the use of compressors. Advanced indirect air-to-air heat exchangers with evaporative cooling are effective enough in removing excess heat from the data hall in virtually all but the hottest and most humid climates. Direct outside air systems, which tend to be simpler and lower cost, can also deliver on that promise in cool climates.
This is not to say that there has been no progress in the industry. Facility energy efficiency on the whole is already good, if not excellent, for most new datacenter builds compared to the norm at the start of the decade. Numerous operators today achieve a power usage effectiveness (PUE), which shows how much energy overhead the facility represents compared to the IT load, of 1.2 or better on an annual basis using some form of cooling economization. Even more traditional recent builds (with chillers or air conditioners, and less regulated airflow) claim a PUE of 1.4 or better, which is less efficient but still compares favorably to the industry average of 1.7-1.8. But chiller-free designs take the pursuit of cooling efficiency further.
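To make the PUE figures concrete, the sketch below uses the standard definition of PUE as total facility energy divided by IT energy, so that the overhead relative to the IT load is simply PUE minus one. The function names and the sample energy figures are illustrative assumptions, not data from this report.

```python
# Illustrative sketch: PUE = total facility energy / IT energy.
def pue(it_energy_kwh: float, overhead_energy_kwh: float) -> float:
    """Compute PUE from annual IT energy and annual non-IT (overhead) energy."""
    if it_energy_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return (it_energy_kwh + overhead_energy_kwh) / it_energy_kwh


def overhead_fraction(pue_value: float) -> float:
    """Facility overhead as a fraction of IT energy for a given PUE."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return pue_value - 1.0


if __name__ == "__main__":
    # PUE levels mentioned in the text: economized, traditional recent build,
    # and the midpoint of the quoted industry average range.
    for value in (1.2, 1.4, 1.75):
        print(f"PUE {value:.2f} -> overhead {overhead_fraction(value):.0%} of IT energy")
```

Read this way, a PUE of 1.2 means the facility spends about 20% on top of the IT energy, against roughly 70-80% at the quoted industry average.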
Drivers for adoption
The biggest difference with a chiller-free (and DX-free) datacenter is the markedly lower peak power requirement. This has the dual benefit of shedding capital outlays on both the compressors and the electrical infrastructure that's needed to support them, while also freeing up power capacity to be used for IT in the future. In a traditional design, the additional power provision to support the cooling systems can amount to as much as 40-50% on top of the maximum IT load, even if a more recent and efficient chiller or DX-based design is used.
This is not only expensive, it also reduces site power capacity available for IT production, making the marginal cost of future capacity expansion potentially exorbitant (by requiring new utility substations or a wholly new site). Chiller-free designs may require as little as 10% power overprovisioning. This reallocation of power capacity alone should make it worthwhile for many operators to consider a chiller-free datacenter design. Shedding around 10-15% of capital outlays for the datacenter infrastructure (driven largely by lowered capacity requirements on generators, switchgear and uninterruptible power supplies) is an added bonus of a chiller-free cooling design: about $1m in savings per megawatt of IT capacity, 451 Research estimates.
There are also operational savings. The regular maintenance of chillers and computer room air-conditioning units is another significant cost and requires facility engineers to enter data halls with live IT equipment to perform tasks on the cooling units. Water piping brings further maintenance requirements and the risk of leaks. Operational simplicity tends to mean fewer errors and, in turn, a more reliable and available facility. According to data from the Uptime Institute, a 451 Group company, most unexpected incidents in the datacenter are caused by poor facility management practices, rather than inherent issues with equipment.
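The effect of overprovisioning on usable IT capacity can be sketched with simple arithmetic. The example below assumes, as a simplification, that the cooling power provision is the only overhead on top of the maximum IT load and that the site has a fixed power envelope; the 10 MW site size and the function name are hypothetical, while the 10%, 40% and 50% provisioning levels are the figures quoted above.

```python
# Illustrative sketch: usable IT capacity within a fixed site power envelope.
# Simplifying assumption: cooling provision is the only overhead on IT load.
def it_capacity_mw(site_power_mw: float, cooling_provision: float) -> float:
    """Maximum IT load (MW) when cooling needs `cooling_provision` x IT load."""
    if site_power_mw <= 0 or cooling_provision < 0:
        raise ValueError("site power must be positive, provision non-negative")
    return site_power_mw / (1.0 + cooling_provision)


if __name__ == "__main__":
    site_mw = 10.0  # hypothetical site power envelope
    chiller_free = it_capacity_mw(site_mw, 0.10)
    for provision in (0.40, 0.50):
        traditional = it_capacity_mw(site_mw, provision)
        gain = chiller_free - traditional
        print(
            f"{provision:.0%} provision: {traditional:.2f} MW IT; "
            f"chiller-free at 10%: {chiller_free:.2f} MW IT (+{gain:.2f} MW)"
        )
```

Under these assumptions, the same 10 MW envelope supports roughly 2 MW to 2.4 MW more IT load with a chiller-free design, which is the capacity reallocation the argument above relies on.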
Impediments to adoption
The most important barrier to adoption of chiller-free datacenters is the business reality of operators. Most tenants, whether internal IT departments or external customers, are opposed to the idea of significantly relaxed climatic controls, let alone the complete elimination of mechanical cooling units. Nearly a decade after ASHRAE published its recommended operating range of up to 27°C (80.6°F), most facilities still operate at a more conservative range (24-25°C is considered to be progressive), dictated by tenants. Multi-tenant datacenter operators, in particular, simply cannot afford to jump ahead of the market because such a move would dramatically narrow their addressable base of customers.
Behind this reluctance to accept wider temperature ranges is a fear of increased IT hardware failure rates. Indeed, elevated temperatures lead to, statistically speaking, more component failures in servers and storage arrays. Most of these failures may be a non-issue for availability or performance because most of these systems themselves have redundancy, yet some might cause loss of performance or even downtime for an application. An administrator's nightmare is the concurrent failure of multiple hard drives in a storage array that might lead to loss of data and service for hours.
An analysis of ASHRAE's failure rate guidance against temperatures suggests that such fears may be overplayed. First, the effect of temperature on failure rates, although marked in relative terms, is rather moderate in absolute terms. For example, a data hall operating at 27°C at all times will likely observe about one-third more failures than the same data hall at 20°C. Assuming that 3% of servers experience a failure annually at the baseline of 20°C, the elevated operation adds another 1 percentage point on top of that, for a total of about 4%. We believe that this is a marginal layer of additional risk and cost.
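The arithmetic behind this example can be written out as a short sketch. The 3% baseline and the one-third relative increase are the figures used above; the fleet size of 1,000 servers and the function names are assumptions added purely for illustration.

```python
# Illustrative sketch of the constant-temperature failure rate example.
def elevated_failure_rate(baseline_rate: float, relative_increase: float) -> float:
    """Annual failure rate after applying a relative increase to a baseline."""
    return baseline_rate * (1.0 + relative_increase)


def extra_failures(servers: int, baseline_rate: float, relative_increase: float) -> float:
    """Expected additional failures per year across a fleet of servers."""
    elevated = elevated_failure_rate(baseline_rate, relative_increase)
    return servers * (elevated - baseline_rate)


if __name__ == "__main__":
    baseline = 0.03        # 3% of servers fail annually at 20C (from the text)
    increase = 1.0 / 3.0   # about one-third more failures at 27C (from the text)
    fleet = 1000           # hypothetical fleet size

    rate = elevated_failure_rate(baseline, increase)
    print(f"Failure rate at 27C: {rate:.1%}")
    print(f"Extra failures per {fleet} servers: {extra_failures(fleet, baseline, increase):.0f}")
```

For the hypothetical 1,000-server fleet, that is about 10 additional failed servers a year, which an operator can weigh directly against the capital and energy savings.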
Our second point is that no operator should or would operate to a constant temperature. Calculations performed by 451 Research suggest that if an operator allowed temperatures to drift (at controlled speed) across the full range of ASHRAE's allowed band (15°C-32°C) and deployed highly effective indirect air-to-air systems, annualized failure rates would not significantly deteriorate in most climates. They should actually improve in many cases because external conditions would allow the data hall to be cooled below the baseline of 20°C for many more hours than it spends above it. Our data suggests operators should investigate the adoption of wide operating bands coupled with highly efficient cooling systems.
Chiller-free datacenters are disruptive to design and operations. Those operators that successfully take the leap will gain sizeable financial, or even strategic, advantages by being able to deliver more IT capacity in a given site power envelope than competitors. For suppliers of direct and indirect air-to-air cooling units, this represents an opportunity to stir up competitive positions in a saturated and slow marketplace. Clearly, it challenges major datacenter cooling vendors such as Vertiv, Schneider Electric and Nortek, which may lose some market share to smaller challengers.
Overall, the move toward chiller-free facilities continues. During the next decade, we believe it will not just be hyperscale and high-performance computing datacenters that eschew mechanical cooling, but that chiller-free will become the norm. Datacenter operators will increasingly be under pressure to shed as many capital costs as possible that are not essential to the core function of a datacenter. Chiller and DX units are prime candidates for elimination.