Edge Intelligence: the Upcoming Challenger to Cloud Intelligence

Written by riemannian | Published 2022/02/21

TL;DR: The network edge is a volatile environment with constantly changing topology and devices. Orchestration algorithms for dynamically scheduling payloads exist in the cloud, but for edge devices they run into problems of scale (millions of devices) and far more heterogeneous power envelopes and computational capacities. Data collection and models are increasingly fragmented: each device holds only a fraction of the entire dataset rather than having access to the whole. Edge devices are also becoming ever more miniaturised and low-powered, with limited compute. Future work should therefore concentrate on decentralised algorithms, federating dynamically changing communication patterns, chips with optimal TOPS/W in a small form factor and, most importantly, making computation interwoven with communication.

Cloud services and various artificial intelligence API offerings have catapulted digital transformation across multiple industries. In the last several years, the Internet of Things (IoT) concept has given rise to millions of data points of sensor data being sent to, and analysed using, infrastructure on public and private clouds. Use cases range from measuring electricity usage across several households and dynamically pricing electricity, to collating information from vision feeds that tells us whether a specific person has entered a building (surveillance and the like). In the first case, a time series is consistently sent from the edge-based power meter to cloud data-analytics services. In the second, image grabs are processed with deep-learning models on the cloud using person detection, tracking and re-identification modules. In both cases, the edge device is simply a conduit for transferring information to the cloud.

Why do we need Edge Intelligence?

The core requirements for edge intelligence are a network of sensors that runs closer to end-users, providing them with reduced latency; bandwidth savings; pre-processing of data on the device before passing it on to larger cloud compute infrastructure; and guarantees about the privacy of the data being collected and inferred from. These edge devices or sensors with a low level of compute elastically create a network (a mesh) wherein the devices – an Amazon Echo, a dashcam, your smartphone, a temperature sensor, etc. – intermittently join the network to collect, compute and share information.

In yesteryears – I mean, the IoT world – these sensors sensed the world and faithfully transmitted signals to a mothership (a public or private cloud). But consider this: if two or more edge devices can share not only their inputs but also their limited onboard compute to attain a goal, then a lot of individually unreliable edge devices can be brought together under a mesh network to solve, say, an asset-tracking problem, estimate vehicle congestion, and so on. Loosely, use cases range from autonomous devices (drones, robots, autonomous vehicles, etc.) and immersive experiences (AR/VR, wearable devices, etc.) to IoT analytics (industrial and home sensors, etc.), among others.

There are essentially two ways to operationalise machine learning on the edge: a centralised topology using tools from centralised federated learning, and the mathematically obtuse field of decentralised and distributed (no-oracle) federated learning algorithms. Federated learning (Figure 1) relies on training machine learning models on devices with lower computational power and transferring the locally learned weights/models to the oracle for further processing. As a first step, a cloud-trained model (the oracle) is sent to individual edge devices; this model is then fine-tuned with local data, and the fine-tuned models are sent back so that the oracle's model can be updated. The communication pattern here can be either synchronous or asynchronous.
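To make the centralised pattern concrete, here is a minimal sketch of one synchronous federated-averaging round in Python. The toy linear model, the device shards and the `local_update` routine are illustrative assumptions, not any particular framework's API:

```python
import numpy as np

def local_update(global_weights, local_data, lr=0.1, epochs=1):
    """Fine-tune the oracle's weights on one device's private data.
    A toy linear model trained with plain gradient descent on MSE."""
    w = global_weights.copy()
    X, y = local_data
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

def federated_averaging_round(global_weights, devices):
    """One synchronous round: push the model out, collect the locally
    fine-tuned copies, and average them back at the oracle."""
    local_weights = [local_update(global_weights, data) for data in devices]
    return np.mean(local_weights, axis=0)

# Three devices, each holding a private shard of the dataset.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
devices = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    devices.append((X, X @ true_w + rng.normal(scale=0.1, size=50)))

w = np.zeros(2)                      # the cloud-trained "oracle" model
for _ in range(20):                  # communication rounds
    w = federated_averaging_round(w, devices)
print(w)                             # approaches true_w without pooling any raw data
```

Note that only weights cross the network; each device's raw data never leaves it, which is the privacy property federated learning is built around.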

Incarnations like SimFL enable edge devices to send messages (gradients of a loss function) to one another, reducing the communication load of sending the local model. Decentralised federated learning is often the preferable way of deploying machine learning models: information is spread across devices rather than concentrated at a single point, decreasing the attack surface for any cybersecurity attack. Communication patterns for the decentralised formalism can be via a graph, a distributed ledger or simply peer-to-peer (like SimFL).
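For contrast, a decentralised round can be sketched as gossip averaging over a peer-to-peer graph: each device mixes its weights with those of its neighbours, and no oracle ever sees a model. The ring topology and uniform mixing rule below are illustrative assumptions, not SimFL's actual protocol:

```python
import numpy as np

def gossip_round(weights, adjacency):
    """One synchronous gossip step: every node replaces its weights with
    the average over itself and its neighbours. No central oracle."""
    n = len(weights)
    mixed = []
    for i in range(n):
        neighbours = [j for j in range(n) if adjacency[i][j]] + [i]
        mixed.append(np.mean([weights[j] for j in neighbours], axis=0))
    return mixed

# A 4-device ring; each device starts from its own locally trained model.
adjacency = [[0, 1, 0, 1],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [1, 0, 1, 0]]
weights = [np.array([1.0]), np.array([3.0]), np.array([5.0]), np.array([7.0])]
for _ in range(10):
    weights = gossip_round(weights, adjacency)
print(weights)   # every node converges to the global mean, 4.0
```

The choice of graph matters: denser topologies converge in fewer rounds but cost more communication per round, which is exactly the computation-versus-communication trade-off discussed next.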

In summary, true edge intelligence lies in this decentralised topology, which will give rise to a new generation of chip companies focusing not only on computation (optimal TFLOPS/W) but also on co-designing computation to go hand-in-glove with communication, i.e., the mesh topology (peer-to-peer, distributed ledger, a graph, etc.).

Computation versus communication

In the centralised topology, the oracle becomes the infrastructure on the cloud while the edge device is a sensor with lower compute power. One can go a bit further and have an oracle that sits close to the edge device (for CCTV cameras, it may be an NVIDIA Xavier kit that sits on-premise: 11 TFLOPS (fp16) at up to 30 W). We now have the emergence of fog computing: slightly diluted hardware capabilities deployed very close to the edge device (a slightly larger form factor can house a single A6000 blade as the oracle on the fog – 39 TFLOPS (fp16) at up to 300 W – instead of hundreds of A100 oracles on the cloud – 78 TFLOPS (fp16) at up to 400 W).

Let us take our reasoning one step further and remove the oracle altogether, leaving a mesh of many less powerful processors (now think of a single Jetson Nano: 470 GFLOPS (fp16) at up to 10 W of power consumption). Hopefully, we can now appreciate that edge deployment of machine learning models is a delicate balance between processing capacity, power envelope and form factor.
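A quick back-of-the-envelope comparison makes that balance explicit. The figures below are the vendor peak numbers quoted above; real workloads deliver less:

```python
# Peak fp16 throughput vs. power envelope for the devices quoted above.
devices = {
    "Jetson Nano (edge)": (0.47, 10),    # TFLOPS, watts
    "Xavier kit (fog)":   (11.0, 30),
    "A6000 blade (fog)":  (39.0, 300),
    "A100 (cloud)":       (78.0, 400),
}
for name, (tflops, watts) in devices.items():
    print(f"{name:20s} {tflops / watts:.3f} TFLOPS/W")
```

Raw throughput rises by two orders of magnitude from the Nano to the A100, yet performance per watt does not improve monotonically along the way, which is precisely why power envelope and form factor, not just peak FLOPS, drive the choice of where a model runs.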

People often ask if 5G is required for Edge AI. Absolutely not. Many edge applications can leverage current technologies like 4G LTE-M (Long Term Evolution category M1), NB-IoT (Narrowband IoT) and CBRS (Citizens Broadband Radio Service); the critical metrics to consider are proximity, latency and mobility. A CCTV camera is tethered to a wall, while a mobile phone or an autonomous vehicle is mobile. Similarly, a car travelling at 80 km/h requires a quicker response from its AI than a voice agent that can wait a few seconds for a reply. On the notion of proximity, having an edge data-centre (oracle) close to a picking robot might be required, whereas a drone, by construction, can be optimised to stay close to an access edge or a regional edge provided by the telecommunications network.

So, the specific use case, along with these three metrics (proximity, latency and mobility), can quickly get us to a list of what is necessary through the lens of deploying a product at the edge.
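As a worked example of the mobility metric, consider how far that 80 km/h car travels during a single network round-trip. The round-trip times below are rough illustrative figures, not measurements:

```python
# Distance a vehicle at 80 km/h covers during one network round-trip.
speed_m_per_s = 80 * 1000 / 3600          # 80 km/h ≈ 22.2 m/s
for network, rtt_ms in [("on-device inference", 1),
                        ("edge data-centre",    10),
                        ("regional cloud",      100)]:
    print(f"{network:20s} {speed_m_per_s * rtt_ms / 1000:5.2f} m travelled")
```

At cloud-scale latencies, the car has moved more than two metres before an answer arrives; on-device or fog inference keeps that drift down to centimetres.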

Deploying models on the Edge

Deployment of models on the edge is the lowest-hanging fruit, as one can lift and shift many container deployment strategies to the fog/edge. Container engines that manage the containers deployed to a device, and container orchestrators that distribute tasks across computational nodes, are essential for edge computing.
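As a toy illustration of the lift-and-shift idea, the Docker SDK for Python can launch a container with a CPU and memory envelope that mimics a constrained edge node. The image name and the limits here are arbitrary examples, not a recommendation:

```python
import docker

client = docker.from_env()

# Run an inference container pinned to roughly half a CPU core and 256 MiB,
# approximating the envelope of a small edge device.
container = client.containers.run(
    image="my-registry/edge-inference:latest",   # hypothetical image
    detach=True,
    mem_limit="256m",          # memory envelope
    nano_cpus=500_000_000,     # 0.5 CPU, in units of 1e-9 CPUs
    restart_policy={"Name": "on-failure", "MaximumRetryCount": 3},
)
print(container.short_id)
```

An orchestrator then does the same across thousands of such nodes, subject to each node's power and compute envelope rather than a data centre's.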

Another important virtualisation technology to watch is unikernels. These are pieces of software that avoid context switches by running in kernel mode, saving on memory footprint. Since the images include only the kernel functions they need, image sizes are much reduced. By construction, unikernels also shrink the attack surface and resource footprint, helping ensure a secure deployment.

Similarly, Kata Containers combine a secure container runtime with lightweight virtual machines, providing more substantial workload isolation using hardware virtualisation technology.

Without such deployment strategies, most downstream tasks simply remain a theoretical construct. Copy-pasting the Kubernetes model of deployment to resource-constrained devices such as a microcontroller unit, a mobile phone or a CCTV camera may not be trivial. Kubernetes ought to evolve too.

Towards Edge Intelligence

The network edge is a volatile environment with constantly changing topology and devices. Orchestration algorithms for dynamically scheduling payloads exist in the cloud; for edge devices, though, they run into problems of scale (millions of devices) and far more heterogeneous power envelopes and computational capacities across devices. Data collection and models are increasingly fragmented: each device holds only a fraction of the entire dataset rather than having access to the whole. And edge devices keep becoming more miniaturised and low-powered, with limited compute.

Future work should, therefore, concentrate on decentralised algorithms, federating dynamically changing communication patterns, chips with optimal TOPS/W in a small form factor and, most importantly, making computation interwoven with communication.

