Multi-Processor System-on-Chip 1 - Liliana Andrade
1.2. Versatile processors for low-power IoT edge devices
1.2.1. Control processing, DSP and machine learning
Low-power IoT edge devices typically perform a range of different functions locally on the device. They run a local application that controls the device, its sensors and other interfaces, such as a communications interface to the network and a user interface. For this purpose, a processor must have capabilities for efficient processing of control code, including low branch overheads, efficient interrupt handling, timers, efficient integration with peripherals, support for real-time kernels, etc.
Furthermore, IoT edge devices typically perform some processing on the data acquired through their sensors. These can be sensors that monitor physical phenomena, such as thermometers, gyroscopes, accelerometers and magnetometers. Consider, for example, a personal health device or the smart sensing devices used in agriculture, mentioned above. Data rates for this type of sensor are typically low. Microphones are another type of sensor, with higher data rates. For example, a 16 kHz sample rate is often used for voice data. Even higher data rates can be observed in IoT edge devices that use a camera, such as a smart doorbell performing face detection. Data rates for image and video data vary widely with resolution and frame rate. Data rates of hundreds of MB/s are not unusual in high-end devices, but much lower data rates can be observed in more power-sensitive camera-based applications.
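To make these orders of magnitude concrete, the raw (uncompressed) data rate of a sensor can be estimated as the product of sample rate, sample size and channel count. The sketch below is illustrative only. The function names are hypothetical, and apart from the 16 kHz voice sample rate, all parameters used in the note that follows (16-bit samples, resolutions, pixel formats, frame rates) are assumptions chosen for the example.

```c
#include <stdint.h>

/* Raw data rate, in bytes per second, of a sampled sensor.
   Hypothetical helper for illustration. */
static uint64_t sensor_rate_bytes_per_s(uint32_t sample_rate_hz,
                                        uint32_t bytes_per_sample,
                                        uint32_t channels)
{
    return (uint64_t)sample_rate_hz * bytes_per_sample * channels;
}

/* Raw data rate, in bytes per second, of an uncompressed camera stream.
   Hypothetical helper for illustration. */
static uint64_t video_rate_bytes_per_s(uint32_t width, uint32_t height,
                                       uint32_t bytes_per_pixel,
                                       uint32_t frames_per_s)
{
    return (uint64_t)width * height * bytes_per_pixel * frames_per_s;
}
```

Under these assumptions, a three-axis accelerometer sampled at 100 Hz with 16-bit samples produces 600 bytes/s, a 16 kHz mono voice stream with 16-bit samples produces 32,000 bytes/s, a 320×240 8-bit grayscale stream at 5 frames/s produces 384,000 bytes/s, and a 1920×1080 stream at 2 bytes per pixel and 30 frames/s produces about 124 MB/s.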
The processing of sensor data typically involves digital signal processing (DSP) with functions such as filtering (e.g. FIR, correlation, biquad), transforms (e.g. FFT, DCT), and vector and matrix operations. Voice data can be processed by various DSP functions, including noise reduction and echo cancellation. In addition, the IoT edge device can perform encoding and/or decoding of voice or audio data. For example, consider an audio playback function on the device.
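As an illustration of one of the filtering functions named above, the following is a minimal sketch of a single biquad (second-order IIR) section in transposed direct form II, written in plain floating-point C. The type and function names are hypothetical, and the sketch is not tuned for any particular processor.

```c
#include <stddef.h>

/* One biquad section:
   y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
   with a0 normalized to 1. Names are hypothetical. */
typedef struct {
    float b0, b1, b2; /* feed-forward coefficients */
    float a1, a2;     /* feedback coefficients */
    float s1, s2;     /* filter state, initially zero */
} biquad;

/* Process one sample (transposed direct form II). */
static float biquad_step(biquad *f, float x)
{
    float y = f->b0 * x + f->s1;
    f->s1 = f->b1 * x - f->a1 * y + f->s2;
    f->s2 = f->b2 * x - f->a2 * y;
    return y;
}

/* Process a block of n samples from in[] to out[]. */
static void biquad_process(biquad *f, const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = biquad_step(f, in[i]);
    }
}
```

Each output sample costs five multiplications and four additions, which is why multiply-accumulate throughput matters for this kind of kernel.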
Communicating data involves further DSP functions. For example, some key functions in an NB-IoT protocol stack involve FFT, auto- and cross-correlations, and complex multiplications and convolutions. Furthermore, trigonometric functions such as sine and cosine must be evaluated. In addition, such protocol stacks use convolutional codes, which are typically decoded with the Viterbi algorithm.
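The complex arithmetic behind such correlations can be sketched as follows. This is a generic illustration, not code from an NB-IoT stack: the types `cplx16` and `cplx64` and the function names are hypothetical, and the choice of 16-bit samples with 64-bit accumulators is an assumption made to keep the arithmetic free of overflow.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int16_t re, im; } cplx16; /* complex sample (assumed 16-bit) */
typedef struct { int64_t re, im; } cplx64; /* wide complex accumulator */

/* Complex multiply: (a.re + j*a.im) * (b.re + j*b.im). */
static cplx64 cmul(cplx16 a, cplx16 b)
{
    cplx64 r;
    r.re = (int64_t)a.re * b.re - (int64_t)a.im * b.im;
    r.im = (int64_t)a.re * b.im + (int64_t)a.im * b.re;
    return r;
}

/* Cross-correlation at one lag:
   r = sum over n of x[lag + n] * conj(ref[n]), for n = 0 .. ref_len-1.
   The caller must ensure that x holds at least lag + ref_len samples. */
static cplx64 xcorr_at_lag(const cplx16 *x, const cplx16 *ref,
                           size_t ref_len, size_t lag)
{
    cplx64 acc = { 0, 0 };
    for (size_t n = 0; n < ref_len; n++) {
        cplx16 a = x[lag + n];
        cplx16 b = ref[n];
        /* complex MAC with the conjugate of b */
        acc.re += (int64_t)a.re * b.re + (int64_t)a.im * b.im;
        acc.im += (int64_t)a.im * b.re - (int64_t)a.re * b.im;
    }
    return acc;
}
```

Every term of the sum requires four real multiplications, so a processor with native complex multiply and MAC instructions can reduce the instruction count of such loops considerably.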
We conclude that the efficient processing of sensor data on an IoT edge device requires processors equipped with DSP capabilities. The relevant DSP capabilities are:
– support for fixed-point data types and arithmetic, including fixed-point multiply-accumulate (MAC) instructions, wide accumulators, and efficient saturation and rounding;
– support for floating-point data types and instructions, including fused multiply-add instructions;
– advanced address generation for efficient memory access, including circular and bit-reversed addressing for DSP kernels such as FIR filters and FFTs;
– zero-overhead loops;
– support for complex data types and arithmetic, including complex multiply and MAC instructions;
– support for vector or SIMD processing to enable increased efficiency by exploiting data parallelism;
– efficient divide and square root operations;
– high load/store bandwidth, as DSP functions can be memory-access intensive.
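Several of these capabilities can be seen together in a fixed-point FIR filter. The sketch below emulates them in portable C: Q15 multiply-accumulate into a wide accumulator, rounding, saturation, and circular addressing of the delay line by index masking. All names, as well as the filter length of four taps, are hypothetical choices for illustration. On a DSP-enhanced processor, the inner loop would typically map to MAC instructions, hardware circular addressing and a zero-overhead loop rather than to the explicit operations shown here.

```c
#include <stdint.h>

#define FIR_TAPS 4u /* hypothetical length; must be a power of two */

typedef struct {
    int16_t  coeff[FIR_TAPS]; /* Q15 coefficients */
    int16_t  delay[FIR_TAPS]; /* circular delay line */
    uint32_t head;            /* index of the most recent sample */
} fir_q15;

/* Saturate a wide value to the 16-bit range. */
static int16_t sat_q15(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Push one input sample and return one output sample. */
static int16_t fir_q15_step(fir_q15 *f, int16_t x)
{
    /* Circular addressing emulated by masking the index. */
    f->head = (f->head + 1u) & (FIR_TAPS - 1u);
    f->delay[f->head] = x;

    int64_t  acc = 0; /* wide accumulator for Q30 products */
    uint32_t idx = f->head;
    for (uint32_t k = 0; k < FIR_TAPS; k++) {
        acc += (int32_t)f->coeff[k] * f->delay[idx]; /* MAC */
        idx = (idx - 1u) & (FIR_TAPS - 1u);          /* step back, wrap */
    }

    acc += 1 << 14; /* add half an output LSB before the shift */
    /* Assumes arithmetic right shift of negative values, which is
       implementation-defined in C but common in practice. */
    return sat_q15(acc >> 15);
}
```

With all four coefficients set to 8192 (0.25 in Q15), the filter computes a four-sample moving average. With coefficients close to 1.0, the sum exceeds the 16-bit range and the output saturates instead of wrapping around.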
In addition to control processing and DSP, machine learning has recently emerged in various application areas as a technology for building IoT edge devices with advanced functionalities. Some illustrative examples are smart speakers, wearable activity trackers and smart doorbells. These devices apply machine learning technology that has been trained to recognize certain complex patterns (e.g. voice commands, human activity, faces) from data captured by one or more sensors (e.g. a microphone, a gyroscope, a camera). When such a pattern is recognized, the device can perform an appropriate action. For example, when the voice command “play music” is recognized, a smart speaker can initiate the playback of a song. In the following sections, we dig deeper into the requirements and processor capabilities for efficient machine learning in low-power IoT devices.
Integrated circuits for low-power IoT edge devices may use one or more processors for implementing the different types of processing. Multiple processors are required if a single processor cannot handle the complete software workload. A further reason for using multiple processors is that specialized processors can be used for the different types of processing. More specifically, different processors can be used for control processing, DSP and machine learning.
However, there are also good reasons to reduce the number of processors. Lower cost is a key benefit, which is particularly relevant for low-cost IoT edge devices that are produced in high volumes. The use of fewer processors also reduces design complexity, as it simplifies the interconnect and memory subsystem required to integrate the processors. Furthermore, if multiple interacting functions are executed on a single processor, less data has to be moved and the software overhead for communication is reduced. An additional benefit for software developers is that a single tool chain can be used. To enable the flexible combination of functions, we need versatile processors that can efficiently execute different types of workloads, including control tasks, DSP and machine learning. Such processors are also referred to as DSP-enhanced RISC cores. They add a broad set of instructions for DSP and machine learning to a RISC core. If done well, the hardware overhead of these additions is small. This can be achieved, for example, by sharing the register file and by using unified functional units (e.g. a multiplier) for control processing, DSP and machine learning. Today, optimized DSP-enhanced RISC cores are available from IP vendors.