Data strategies for efficient and secure edge-computing services
The challenges of building and properly managing an Internet of Things (IoT) network have grown alongside the benefits of the technology. At the end of the day, IoT is a distributed processing framework that comes with the challenges of distributed systems. As a result, developers and architects have to weigh the business needs for the data (latency, security, and volume requirements), cost, and other factors to determine how best to architect a distributed environment.
Data Considerations for Edge Computing Services
There is a long list of design questions that comes with building an IoT network: where does computation happen? Where and how do you store and encrypt data? Do you require encryption for data in motion or just at rest? How do you coordinate workflows across devices? And finally, how much does this cost? While this is an intimidating list, we can draw on good practices that evolved both before the advent of IoT and more recently with the growing use of edge computing.
First, let’s take a look at computation and data storage. When possible, computation should happen close to the data: by minimizing transmission time, you reduce the overall latency for receiving results. Remember, though, that distributing computation can increase overall system complexity and create new vulnerabilities at the endpoints, so it’s important to keep it simple.
One approach is to do minimal processing on IoT devices themselves. A data collection device may just need to package a payload of data, add routing and authentication to the payload, then send it to another device for further processing. There are some instances, however, where computing close to the collection site is necessary.
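As a sketch of this minimal-processing approach, the snippet below wraps a raw reading with routing metadata and an HMAC tag so the next hop can authenticate it. The device ID, destination field names, and shared key are illustrative assumptions, not a specific protocol.

```python
import hashlib
import hmac
import json
import time

DEVICE_ID = "sensor-042"           # hypothetical device identifier
SHARED_KEY = b"per-device-secret"  # provisioned out of band, never sent on the wire

def package_reading(value, destination):
    """Wrap a raw reading with routing metadata and an HMAC tag."""
    body = json.dumps({
        "device": DEVICE_ID,
        "dest": destination,  # routing hint for the next hop
        "ts": time.time(),
        "value": value,
    }, sort_keys=True).encode()
    tag = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()
    return {"payload": body.decode(), "hmac": tag}

def verify(message):
    """Receiving side: recompute the tag before trusting the payload."""
    expected = hmac.new(SHARED_KEY, message["payload"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["hmac"])
```

Keeping the device's job this small (package, authenticate, forward) pushes the heavier analysis to a better-provisioned edge or cloud node.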
One example of computing close to a sensor is anomaly detection. If an IoT device is monitoring equipment for correct operation, you want to know about a malfunction as soon as possible.
On a factory floor, for instance, you may send sensor data to an edge device that analyzes data from all sensors on the floor, so analysis and alerting happen quickly after a malfunction. If immediate analysis is not necessary, however, it may be more cost-effective to send data to a centralized ingestion point, such as an ingestion service in the cloud that then writes the processed data to a data store. This delayed option makes sense when collecting data to train machine learning models, for example. Assigning data to different processing locations thus involves understanding the business purpose of the data, which helps you decide how to architect your networks for edge and cloud processing.
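A minimal sketch of the edge-side anomaly check described above: a rolling-baseline z-score detector that flags readings far from recent history. The window size and threshold are illustrative assumptions; a real deployment would tune them per sensor.

```python
import math
from collections import deque

class AnomalyDetector:
    """Flag readings that deviate sharply from a rolling baseline."""

    def __init__(self, window=50, threshold=3.0):
        self.window = deque(maxlen=window)  # recent readings only
        self.threshold = threshold          # z-score above which we alert

    def observe(self, value):
        """Return True if the reading looks anomalous, then add it to the baseline."""
        anomalous = False
        if len(self.window) >= 10:  # wait for a minimal baseline first
            mean = sum(self.window) / len(self.window)
            var = sum((x - mean) ** 2 for x in self.window) / len(self.window)
            std = math.sqrt(var) or 1e-9  # avoid dividing by zero on flat data
            anomalous = abs(value - mean) / std > self.threshold
        self.window.append(value)
        return anomalous
```

Running one detector per sensor on the edge device keeps the alert path local and fast, while the raw readings can still be batched to the cloud later for model training.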