Latency questions give edge datacenter forecasters the jitters
As the next generation of the IT buildout unfolds, there is a widespread expectation that a new tier of datacenters, or at least of compute and storage capacity, will need to be built. This is the 'edge' or 'near edge' – a tier that consists of computing, routing, caching and storage, localized analytics, and some automation and policy management. It will be needed to support anything from a few devices, sensors, actuators and client systems to hundreds, thousands or, in some cases, hundreds of thousands. Expectations of a surge in demand for edge computing mean that a large number of edge datacenters will be needed. But there is considerable uncertainty over the size and shape of this demand. At the root of this uncertainty is the issue of latency – the most often cited reason for the expected edge datacenter buildout. In this report, we consider some of the issues around latency and edge datacenters. We will examine edge datacenters and all the various demand drivers in a forthcoming report.
The 451 Take
Will there be a wave of demand for new edge processing/storage capacity to support low-latency applications? The complicated answer runs like this: it depends on how the edge is defined, what the latency requirements are, and the capabilities of the networks linking the edge to the core. It also depends on how far Moore's Law and custom electronics develop, because miniaturization may mean that devices, not datacenters, will be needed. And, of course, it depends on the adoption of new technologies and applications that scarcely exist, such as augmented reality and the tactile internet. But the simpler answer is that latency is only one driver among many for edge datacenters, and each driver (latency, data volume, availability, control) is strong enough to drive some demand. Add these up, and predictions for a wave of new microdatacenters at or near the edge seem well-founded.
Expectations of a surge in demand for edge computing – and, therefore, new edge datacenters – are already driving the roadmaps of key suppliers: Among (many) major suppliers with products or propositions specifically around edge datacenters are Dell, HPE, Ericsson, Huawei, Cisco, Schneider Electric and Vertiv. Their products include servers, routers, microdatacenters, power management systems and management systems. Most expect this to be a multi-billion-dollar opportunity – an opportunity that exists separately from all the sensors, networking and software tools and applications.
These plans and products are based on the current projections for (and the huge expected investments in) the IoT and for 5G mobile networking. Both of these will support a wave of new applications, connecting both new devices and existing ones that have until now not been IP-enabled. They are certain to drive the creation of a tsunami of data, some of which will need to be processed very near the source of data and the devices or users.
We discuss these developments and their likely evolution in detail in our reports Clearing the Fog Around Edge Computing in the Internet of Things and 5G networks: A catalyst for innovation across IT, as well as in the forthcoming May report, Datacenters at the Edge: Diverse, Cloudy and Connected. In summary, all these reports broadly support the proposition that a large number of edge datacenters will be needed. But there is considerable uncertainty over the size and shape of this demand: Some believe that the edge datacenter opportunity has been exaggerated, and that most of the new traffic can be supported by lightweight access devices and services, large urban datacenters, existing telco central offices (although arguably in need of upgrading) and the back-end cloud.
There are multiple reasons why 451 Research believes there will be increased demand for computing (and datacenters) at the edge of the network. Three key ones are:
- Low latency – the need to process/store information near to the point of generation/use.
- Mission-criticality – the need to ensure data is immediately secured, copied, fully compliant and made available to others (regardless of network connections).
- Volume/bandwidth – volumes of data are high, and the data is, therefore, better stored locally, at least until later.
Of these, latency is the most difficult to solve without considerable edge investment. If 5G applications or IoT systems, for example, involve applications or services that need immediate responses, then at least some of the processing needs to happen nearby, and some of the data must equally be stored nearby. But this raises the question: What does low latency really mean at the edge? Which applications demand it? And even if there is demand for secure edge processing and storage, will it be enough to require secure, available and managed datacenter-type capacity? On all these points, there is still considerable discussion.
What is meant by low latency?
Latency is the time taken – usually measured in networking as a round trip – for a system to complete its expected task. As a general rule, it increases with the number of hops and with distance, and it benefits from dedicated pathways and plenty of bandwidth. Latency also needs to take into account processing time. It isn't just one number, however. Latency jumps up and down constantly, and the variation (jitter) can have as large an impact on some applications as the latency itself. Latency is often reported as a trio of maximum, minimum and average (or mean). Average latency can be estimated by modeling distance, hops and the performance of the technology being used.
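The trio of maximum, minimum and average, plus jitter, can be computed from a list of round-trip measurements. The following is a minimal sketch, not a standard tool: the function name and sample values are hypothetical, and jitter is taken here as the mean absolute difference between consecutive samples, which is only one of several definitions in use.

```python
from statistics import mean


def summarize_latency(samples_ms):
    """Return min, max, mean and jitter for round-trip times given in ms."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples to measure jitter")
    # Assumption: jitter = mean absolute change between consecutive samples.
    deltas = [abs(b - a) for a, b in zip(samples_ms, samples_ms[1:])]
    return {
        "min_ms": min(samples_ms),
        "max_ms": max(samples_ms),
        "mean_ms": mean(samples_ms),
        "jitter_ms": mean(deltas),
    }


# Two hypothetical links with the same average but very different jitter.
steady = [10.0, 10.0, 10.0, 10.0]
bursty = [2.0, 18.0, 2.0, 18.0]

print(summarize_latency(steady))
print(summarize_latency(bursty))
```

Both links report the same mean, which is why an average alone can hide behavior that matters to a jitter-sensitive application.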
There is no 'official' definition of 'low' latency. The table below gives an approximate view of latency ranges and the types of applications that may benefit – or may only work – in these ranges. However, latency is a complicated, fluid issue, and applications that may appear similar may actually vary widely in their latency needs, according to the design, deployment and business purpose.
*These are just examples. Every application differs according to particular needs.
There are no hard latency limits for particular applications, although network architects may use rules of thumb: 250 milliseconds for voice, 100ms for VMware migration, 10ms for Oracle clusters, and a smaller number for closed-loop or interactive control – perhaps sub-1ms. In high-frequency financial trading, sub-1ms is often cited.
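As a rough illustration, the rules of thumb above can be written as a lookup that shows which workloads a measured round-trip time would satisfy. The dictionary keys and the function name are hypothetical labels for this sketch, and 'sub-1ms' is approximated as a 1ms ceiling.

```python
# Rule-of-thumb round-trip budgets quoted in the text, in milliseconds.
# Keys are hypothetical labels; closed-loop control is approximated as 1 ms.
LATENCY_BUDGET_MS = {
    "voice": 250.0,
    "vmware_migration": 100.0,
    "oracle_cluster": 10.0,
    "closed_loop_control": 1.0,
}


def workloads_supported(measured_rtt_ms):
    """Return the workloads whose rule-of-thumb budget the measured RTT meets."""
    return sorted(
        name
        for name, budget in LATENCY_BUDGET_MS.items()
        if measured_rtt_ms <= budget
    )


print(workloads_supported(40.0))  # a hypothetical 40 ms round trip
```

These are planning heuristics only; as noted above, similar-looking applications can differ widely in what they actually need.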
Many network latency issues have little to do with distance or the speed of light; they have more to do with network design, router protocols, equipment capability and application design. Reliable low latency can only be achieved if the network is designed to support it, and often it is not. Networks that are shared or overloaded, or where there are many hops, or where the equipment is not optimized may not perform reliably even if distances are short. Delay can be caused by queuing in routers and by the indirect routes that packets may take (the internet protocols don't optimize performance or latency).
Many applications are susceptible to jitter; this is often caused by problems in the network that can be hard to eliminate without considerable control and investment – and which may be more easily solved by local processing/storage, which ensures control. It is not just low-latency applications that can be affected: the growing interdependencies of applications can mean that many applications are slowed down or fail because of a need to wait for other services (complex webpages that link to many other URLs illustrate this problem). In other words, delays can add up.
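The way delays add up across dependent services can be shown with simple arithmetic. This sketch assumes a service that spends a fixed time on its own work and then waits for its dependencies, either one after another or all at once; the function name and timings are hypothetical.

```python
def response_time_ms(own_ms, dependency_ms, parallel=False):
    """Estimate response time for a service that waits on other services."""
    if not dependency_ms:
        return own_ms
    # Sequential calls accumulate; parallel calls wait for the slowest one.
    waiting = max(dependency_ms) if parallel else sum(dependency_ms)
    return own_ms + waiting


# A hypothetical page that does 20 ms of its own work and calls three services.
deps = [30.0, 45.0, 120.0]
print(response_time_ms(20.0, deps))                 # called one after another
print(response_time_ms(20.0, deps, parallel=True))  # called concurrently
```

Even when calls are made concurrently, the slowest dependency sets the floor, so a single delayed service can hold back an otherwise fast application.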
Packet loss causes particular problems. At the IP level, packet delivery is best effort, not assured: if a router is overloaded, it will throw away packets (TCP detects the loss and retransmits the missing packets). But each time this happens, the sender slows down, and then speeds up again when it can.
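The slow-down-and-recover pattern can be sketched with a simplified additive-increase/multiplicative-decrease loop. This is an illustration of the general idea only: it omits slow start, timeouts and fast recovery, does not model any particular TCP variant, and all names and parameter values are hypothetical.

```python
def aimd_window(loss_rounds, rounds, start=1.0, increase=1.0, decrease=0.5):
    """Trace a sender's window (in packets) over a number of round trips.

    loss_rounds is the set of round indices in which a loss is detected.
    """
    window = start
    trace = []
    for rnd in range(rounds):
        if rnd in loss_rounds:
            # Loss detected: cut the sending window sharply.
            window = max(1.0, window * decrease)
        else:
            # No loss: probe for more bandwidth one step at a time.
            window += increase
        trace.append(window)
    return trace


# A single hypothetical loss in round 4 of 8.
print(aimd_window({4}, 8))
```

In this run, the window is halved at the loss and takes several round trips to climb back, which is why even occasional loss on a congested path can hurt throughput and responsiveness.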
One tricky problem to overcome is the way that the Border Gateway Protocol (BGP) is used for routing internet traffic: BGP picks a path across the network to the destination based largely on routing policy and the number of networks (autonomous systems) the route crosses, without considering load or quality of service. Similarly, peering points can be arranged without regard for best performance, and can become overloaded so that the routers dump traffic (this usually has to do with the number and speed of ports rather than raw router performance).
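The gap between a fewest-hops route and a lowest-latency route can be illustrated with a toy graph. This is not a model of BGP itself, which operates between autonomous systems and applies operator policy; the topology, node names and link latencies below are hypothetical and chosen only to show how a hop-count choice can ignore a congested link.

```python
import heapq
from collections import deque

# Hypothetical topology: each link maps to its current latency in ms.
# The A-B-D route is short in hops but congested.
LINKS = {
    "A": {"B": 40.0, "C": 5.0},
    "B": {"A": 40.0, "D": 40.0},
    "C": {"A": 5.0, "E": 5.0},
    "E": {"C": 5.0, "D": 5.0},
    "D": {"B": 40.0, "E": 5.0},
}


def fewest_hops(src, dst):
    """Breadth-first search: shortest path by hop count, ignoring latency."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in LINKS[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def lowest_latency(src, dst):
    """Dijkstra's algorithm: shortest path by total link latency."""
    heap = [(0.0, [src])]
    settled = {}
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == dst:
            return path, cost
        if node in settled and settled[node] <= cost:
            continue
        settled[node] = cost
        for nxt, ms in LINKS[node].items():
            heapq.heappush(heap, (cost + ms, path + [nxt]))
    return None, float("inf")


def path_latency(path):
    """Sum the link latencies along a path."""
    return sum(LINKS[a][b] for a, b in zip(path, path[1:]))


hop_path = fewest_hops("A", "D")
print(hop_path, path_latency(hop_path))
print(lowest_latency("A", "D"))
```

In this made-up topology, the two-hop route is several times slower than the three-hop alternative, which is the kind of outcome a load-blind path choice can produce.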
Solving network performance is not just an edge problem: WAN performance is becoming critical as cloud technologies encourage distributed, dynamic applications. Many large organizations (such as financial trading companies) and big cloud providers invest heavily in private networks. Google, for example, has not invested heavily in edge capacity, but it is investing heavily in fiber and networks.
In order to improve or ensure good performance, then, operators will need to continue to pay close attention to the end network components (as they always have done), and decisions on whether to step up capacity at the edge will be taken in the context of overall performance. As we discussed above, improving the overall network fabric is important anyway to support distributed applications and the growing use of replication for resiliency and recovery. But the question for many suppliers, operators and enterprises is: To what extent will IoT and 5G lead to a new wave of applications that need more capacity at the edge? How much new processing, storage and facilities capacity (secure site, power, cooling and radio/fiber networks) will be needed on a new edge frontier?
As it stands, only a small fraction of applications, even at the edge, need low or ultra-low latency (meaning below 5ms, according to our table above). The great majority of 'things' or IoT devices, including cellphones and cell towers, will spend all or most of their lives within a few miles (say, less than 15) of several large datacenters that can support tens or hundreds of racks of IoT computing at relatively low cost. These distances are near enough to support low/medium-latency applications, provided that the network is sufficiently fast and robust.
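A back-of-the-envelope estimate shows why 15 miles is near enough in principle, and why the network still matters. The sketch below assumes light travels about 200km per millisecond in optical fiber (roughly two-thirds of its vacuum speed); the hop counts, per-hop delays and processing time are hypothetical values, not measurements.

```python
KM_PER_MILE = 1.609344
# Assumption: signal speed in optical fiber is roughly 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def propagation_rtt_ms(distance_miles):
    """Round-trip propagation delay over fiber for a one-way distance."""
    return 2 * distance_miles * KM_PER_MILE / FIBER_KM_PER_MS


def estimate_rtt_ms(distance_miles, hops, per_hop_ms, processing_ms):
    """Rough round trip: propagation + per-hop delay + processing time."""
    forwarding = 2 * hops * per_hop_ms  # each hop is crossed in both directions
    return propagation_rtt_ms(distance_miles) + forwarding + processing_ms


# Distance alone uses only a small part of a 5 ms budget.
print(round(propagation_rtt_ms(15), 3))

# Hypothetical well-run network: 4 hops at 0.1 ms each, 1 ms of processing.
print(round(estimate_rtt_ms(15, hops=4, per_hop_ms=0.1, processing_ms=1.0), 3))

# Same distance with congested hops at 1 ms each.
print(round(estimate_rtt_ms(15, hops=4, per_hop_ms=1.0, processing_ms=1.0), 3))
```

Under these assumptions, propagation accounts for about a quarter of a millisecond, so whether the 5ms threshold is met depends mostly on queuing at each hop and on processing time rather than on distance.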
For many applications, the devices themselves will have onboard processing to deal with immediate situations. The driverless car, for example, is often cited as a 5G ultra-low-latency application; however, driverless vehicles must be designed to run safely with intermittent network connections – so the network services will, in most cases, provide contextual and secondary management data only, and this may be provided by lightweight access/switching and existing or planned metro datacenters. (This does not mean, however, that they won't have features that can benefit from low-latency connectivity where available.)
This suggests that very often, the existing infrastructure – at least in terms of datacenters – will be sufficient in architecture if not in scale. Many suppliers of edge IT equipment – such as Zscaler and Lenovo – have designed their edge access devices to run in two to four racks. In most cases, that is all that will be needed at the edge, and they would not need to be located in a datacenter. Others – such as CDNs – are buying edge caching/processing equipment, but placing it in the nearest colocation datacenters.
'Edging' their bets
These cases suggest that some of the expectations may have been too high. But there is an aggregating effect: As more services are needed, there will be more edge devices that must be placed nearer the edge – and often, it will make sense to locate all of them in microdatacenters. Intel, when outlining its IoT strategy, made the point that coverage holes or incomplete/uneven services will require edge devices to ensure consistent latency and performance. Another point is that CDNs that invest in edge capacity for volume (cost) and bandwidth reasons can also have a latency issue – even though they are often delivering non-latency-sensitive applications. That is because the peering congestion at the backbone datacenters leads to a slowdown of all applications.
In 5G networks, for example, low-latency applications may be provided by colocating access and processing technology – perhaps the size of a few pizza boxes or less – with the cell tower. 5G's design accommodates this architecture, with low-powered and small masts added to provide local availability/performance to existing 'macro towers.' If there is a need for more edge IT, it seems likely that these existing larger towers will be the best sites because many already have the fiber connections, management, security and backup power in place. There will not be a new microdatacenter, even a small one, for each new mast, but there may be a need for microdatacenter facilities to support clusters of masts.
This demand – one step back from the edge – will likely turn out to be considerable, although it may take many years to evolve. Designers of 5G, for example, envisage that over time, 5G will consist of multiple networks with different performance characteristics, working mostly on standard IT hardware using network function virtualization. Many of these networks will have a capability, at the edge, for ultra-reliable low-latency communications (URLLC) in order to support mission-critical real-time control and automation, tactile internet, intelligent transport systems and other emerging applications such as augmented reality. These sub-millisecond URLLC networks must be highly reliable and resilient; edge processing capacity will clearly be needed.
Emerging markets are notoriously difficult to predict, especially when the drivers and markets are fragmented. But in the case of edge datacenters, there are clearly enough use cases, drivers and situations for suppliers to plan for real demand. The challenge is ensuring the products are sufficiently and economically optimized for each market.