What is latency?
In IT, latency is the time that elapses between a user request and the completion of that request. Even processes that seem instantaneous have some measurable delay. Reducing such delays has become an important business goal.
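In code, measuring latency is exactly this elapsed time between issuing a request and its completion. A minimal Python sketch, using a `time.sleep` stand-in for a real request:

```python
import time

def measure_latency(request_fn):
    """Time a single request from issue to completion, in seconds."""
    start = time.perf_counter()
    request_fn()
    return time.perf_counter() - start

# Simulated request that takes roughly 50 ms to complete.
elapsed = measure_latency(lambda: time.sleep(0.05))
print(f"latency: {elapsed * 1000:.1f} ms")
```

Even this "instantaneous" sleep call reports a measurable, nonzero delay.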
Why is low latency needed?
Most applications need low latency to improve the user experience and support customer satisfaction: lower delays help applications run faster and more smoothly. Examples include cloud-hosted applications, online meeting software, and mission-critical computational workloads.
What causes latency?
When a user, application, or system requests information from another system, that request is processed locally, then sent over the network to a server or system. There, it is processed again, and a response is formed, starting the reply transmission process for the return trip.
Along the way, in each direction, the request passes through network components such as switches and routers, undergoes protocol conversions and translations, and transitions between copper cabling, fiber, and wireless transmission. Each step introduces a tiny delay, and these delays can add up to discernible wait times for the user.
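The way per-hop delays accumulate can be sketched with a simple sum. The figures below are illustrative assumptions, not measurements:

```python
# Illustrative (not measured) per-hop delays in milliseconds, one direction.
hop_delays_ms = {
    "local processing": 0.5,
    "access switch": 0.1,
    "router 1": 0.3,
    "protocol translation": 0.2,
    "router 2": 0.3,
    "server processing": 1.0,
}

one_way_ms = sum(hop_delays_ms.values())
round_trip_ms = 2 * one_way_ms  # the reply traverses a similar path back
print(f"one-way: {one_way_ms:.1f} ms, round trip: {round_trip_ms:.1f} ms")
```

No single hop is slow here, yet the round trip approaches 5 ms before any queuing or distance effects are counted.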
As overall network traffic grows, latency increases for every user: transmissions queue up behind one another and micro-latencies accumulate. The result is high latency, such as a frustrating delay before a webpage even begins to load.
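The effect of rising traffic on waiting time can be illustrated with a classic queueing model. This is a simplifying assumption (a single M/M/1 queue), not a model of any specific network, but it shows how delay explodes as traffic approaches capacity:

```python
def mm1_wait_ms(arrival_rate, service_rate):
    """Average time in an M/M/1 queueing system, in ms: W = 1 / (mu - lambda).

    Rates are in requests per second.
    """
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrivals meet or exceed capacity")
    return 1000.0 / (service_rate - arrival_rate)

# Capacity fixed at 1000 requests/s; watch delay grow as traffic backs up.
for load in (500, 900, 990):
    print(f"{load} req/s -> {mm1_wait_ms(load, 1000):.1f} ms")
```

At 50% load the wait is 2 ms; at 99% load it is 100 ms, with no change to the hardware at all.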
The geographical distance that data must travel can also have a significant effect. This is why edge computing, the practice of locating data and applications closer to users, is a well-known strategy for reducing latency. In some cases, reducing this distance is a smart, effective way to lower network latency.
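Distance imposes a hard floor on latency because signals cannot outrun the speed of light; in optical fiber, light travels at roughly two-thirds of its vacuum speed, about 200,000 km/s. A quick sketch of the best-case round-trip propagation delay, ignoring every other source of delay:

```python
# Light in optical fiber travels at roughly 200,000 km/s (~2/3 of c).
FIBER_SPEED_KM_PER_MS = 200.0

def propagation_rtt_ms(distance_km):
    """Best-case round-trip propagation delay over fiber, in milliseconds."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_MS

# Moving a workload from a distant region to a nearby edge site:
print(f"6000 km away: {propagation_rtt_ms(6000):.0f} ms RTT")
print(f"  50 km away: {propagation_rtt_ms(50):.1f} ms RTT")
```

No amount of network tuning can remove that 60 ms for the distant server; relocating the workload to an edge site can.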
What is a low latency network?
A low latency network is one that has been designed and optimized to reduce latency as much as possible. However, a low latency network can't improve latency caused by factors outside the network.
What is latency jitter?
Jitter is unpredictable variation in latency around its average: latency is low at one moment and high the next. For some applications, such as real-time voice and video, this unpredictability is more problematic than consistently high latency.
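Jitter can be quantified from a set of latency samples; one common measure is the standard deviation. The sample values below are hypothetical, chosen so both series share the same 20 ms average but differ sharply in jitter:

```python
import statistics

# Hypothetical round-trip samples (ms): identical averages, different jitter.
steady = [19.8, 20.1, 20.0, 19.9, 20.2]
jittery = [5.0, 41.0, 12.0, 33.0, 9.0]

for name, samples in (("steady", steady), ("jittery", jittery)):
    mean = statistics.mean(samples)
    jitter = statistics.stdev(samples)  # one common jitter measure
    print(f"{name}: mean {mean:.1f} ms, jitter {jitter:.1f} ms")
```

An average latency figure alone would rate these two links as identical; the jitter figure exposes the difference that a video call would actually feel.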
What is ultra-low latency?
Ultra-low latency is measured in nanoseconds, while low latency is measured in milliseconds. Therefore, ultra-low latency delivers a response much faster, with fewer delays than low latency.
How is low latency achieved?
For new deployments, latency is improved through the use of a next-generation programmable network platform built on software-defined hardware, programmable network switches, smart network interface cards, and FPGA-based software applications.
To reduce latency in an existing network, follow the steps below:
- Identify the network problems and impacts
- Confirm that the problems and impacts are caused by high latency
- Identify the IT infrastructure that is contributing to the high-latency problem
- Evaluate which network switches and network interface cards can be replaced to create a low latency environment
- Evaluate which network functions can be offloaded to a field-programmable gate array (FPGA)-programmable switch or smart network interface cards (SmartNICs) to reduce latency to milliseconds or nanoseconds
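The second step above, confirming that the problem really is latency, starts with repeated measurements. A minimal sketch using TCP connection setup time as a rough latency probe (the local listener here is a throwaway stand-in; in practice you would point the probe at the real service under investigation):

```python
import socket
import threading
import time

def tcp_connect_ms(host, port, timeout=2.0):
    """Measure TCP connection setup time, in ms, as a rough latency probe."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000

# Throwaway local listener so the sketch is self-contained.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen()
port = listener.getsockname()[1]
threading.Thread(
    target=lambda: [listener.accept() for _ in range(5)], daemon=True
).start()

# Probe several times: a consistently high floor points to latency,
# while a single spike suggests a transient cause instead.
samples = [tcp_connect_ms("127.0.0.1", port) for _ in range(5)]
print(f"min {min(samples):.2f} ms, max {max(samples):.2f} ms")
```

Re-running the same probe after each infrastructure change (replaced switches, offloaded functions) gives a before/after comparison that confirms whether the change actually lowered latency.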