We’ve generally become comfortable with the idea that when you need digital signal processing – in the physical layers of a modem, or in microphone beamforming of a smart speaker or in geolocation of a tracking device – you use a DSP. And when you need digital control – to run a protocol stack, or manage control aspects of audio codecs or GNSS – you use an MCU. Since both functions are needed in a typical IoT device, you have to sign up for two or more cores. This may not be a big deal in high-margin devices with modest expectations for time between recharges but can be a significant downside for many IoT applications. For these applications, an optimized hybrid processor could meet both needs more cost-effectively and with longer recharge cycle times. We’ve performed a pretty detailed analysis to figure out what this would take; we think such a solution is not only possible but can be very competitive in a wide range of IoT applications.
Think about shared bikes or scooters. You obviously need to track these, so your embedded device must be able to determine location, for which the modern solution is GNSS. It also has to be able to communicate, commonly nowhere near a Bluetooth mesh or Wi-Fi access point, so cellular access is the ideal platform. But you don’t need to communicate a lot of data, pointing to NB-IoT as the best protocol. There’s also going to be a need for some level of local compute, maybe a lot more than you might expect given increased demand for security and privacy.
Now your simple device must support 4G (maybe 5G), GNSS, an application, encryption and maybe secure enclaves/secure boot. But you want to put thousands of these things around the city, in many cities, and your enterprise is going to live or die based on whether you can provide the best availability and pricing. Getting cost and power (to minimize maintenance) as low as possible becomes an existential concern.
We thought it was worth looking harder at the compute demands in these applications, particularly at the balance between digital signal processing and digital control. We started by looking at the underlying algorithms of NB-IoT connectivity, GNSS and security standards. We broke down activity in an asset tracker application running at around 100MHz into DSP functions (baseband modem and some parts of physical layer control) and control functions (protocol stack, security, and general system housekeeping). For a lightweight application in which NB-IoT is communicating infrequently, we found cycles broke down this way:
- Modem PHY (mostly DSP) – ~35%
- L1 Control (DSP and control) – ~25%
- Protocol stack (mostly control) – ~40%
Here, cycles are distributed quite evenly between signal processing and control, making a case for combining processors. Should we worry about performance impact if we can’t run both types of function simultaneously? Not really. These aren’t high performance applications. Where you need speed, as in the latest rev of eNB-IoT, you can often reduce net energy consumption by serializing functions. Each in turn runs fast then stops, a common practice in energy management.
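As a rough check on the "quite evenly" claim, the breakdown above can be tallied in a few lines. This is only a sketch: the per-category DSP fractions are our own illustrative assumptions (the "mostly" categories are treated as pure DSP or pure control, and L1 control is split half and half), not measured values, and the names `CYCLE_SHARE`, `DSP_FRACTION` and `dsp_control_split` are hypothetical.

```python
# Cycle shares from the asset tracker breakdown (percent of total cycles).
CYCLE_SHARE = {"modem_phy": 35.0, "l1_control": 25.0, "protocol_stack": 40.0}

# ASSUMPTION (illustrative only): fraction of each category that is DSP work.
# "Mostly DSP" is treated as 1.0, "mostly control" as 0.0, and the mixed
# L1 control category is split evenly.
DSP_FRACTION = {"modem_phy": 1.0, "l1_control": 0.5, "protocol_stack": 0.0}


def dsp_control_split(shares, dsp_fraction):
    """Return (dsp_percent, control_percent) for the given cycle shares."""
    dsp = sum(shares[name] * dsp_fraction[name] for name in shares)
    total = sum(shares.values())
    return dsp, total - dsp


if __name__ == "__main__":
    dsp, control = dsp_control_split(CYCLE_SHARE, DSP_FRACTION)
    print(f"DSP: {dsp:.1f}%  control: {control:.1f}%")
```

Under these assumptions the split comes out at 47.5% DSP and 52.5% control. Shifting the assumed fractions by ten points or so in either direction still leaves both sides with a large share of the cycles.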
To expand the scope of our analysis, we took a look at a different and very popular application – sound processing and voice control. Think here of smart speakers, wireless earbuds, hearables, voice-activated devices and security devices activated by distinctive noises such as breaking glass. In these applications you have a different mix of needs: audio codecs such as Dolby for music playback, noise reduction in voice/sound pickup and neural network processing to recognize trigger phrases or even a limited vocabulary for appliance control.
Here we used Dolby Atmos as well as our internal noise reduction and voice recognition benchmarks to characterize activity by cycles, and found these rough splits:
- Audio codecs – 70% in control, 30% in DSP
- Noise reduction – 90% in DSP (many filters), 10% in control
- RNN/LSTM neural networks – 60% in DSP, 40% in control
These use cases lean more toward DSP activity but still have a significant control component, so a combined core should make sense.
The case for combining both functions in one processor looks quite convincing, but this isn’t simply a matter of dropping a few MACs into a controller. The DSP has to live up to serious DSP applications, such as the latest comms standards. So it will need 16×16 and 32×32 MACs, SIMD, and native support for floating point, including double precision for GNSS. The architecture has to be flexible enough to adapt at the software level as standards like NB-IoT and different GNSS constellations continue to evolve. At the same time, such a solution has to function very effectively as a controller, supporting very compact code size (where a number-crunching DSP would not be as efficient) and efficient out-of-the-box C development support, to connect to legacy, open and ecosystem code sources.
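For readers less familiar with the term, a 16×16 MAC multiplies two 16-bit values and adds the product to a wider accumulator in a single operation. The sketch below models that arithmetic in plain Python for a small FIR filter over Q15 fixed-point samples. It illustrates the operation only, not how any particular core implements it; the function names and the choice of a saturating 32-bit accumulator are our own assumptions.

```python
INT16_MIN, INT16_MAX = -(2**15), 2**15 - 1
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def saturate(value, low, high):
    """Clamp value to [low, high], as saturating DSP arithmetic would."""
    return max(low, min(high, value))


def fir_q15(samples, coeffs):
    """FIR filter on Q15 integers, modeling 16x16 MACs into a 32-bit accumulator.

    samples and coeffs are signed 16-bit integers in Q15 format
    (value = integer / 32768). Returns a list of Q15 outputs.
    """
    outputs = []
    for n in range(len(samples)):
        acc = 0  # models a 32-bit accumulator holding Q30 products
        for k, coeff in enumerate(coeffs):
            if n - k >= 0:
                # One multiply-accumulate: 16-bit x 16-bit product added to acc.
                acc = saturate(acc + samples[n - k] * coeff, INT32_MIN, INT32_MAX)
        # Scale Q30 back to Q15 and saturate to the 16-bit output range.
        outputs.append(saturate(acc >> 15, INT16_MIN, INT16_MAX))
    return outputs
```

Calling `fir_q15([32767, 32767, 0], [16384, 16384])`, for example, applies a two-tap moving average with both coefficients equal to 0.5 in Q15. On a DSP each pass through the inner loop maps to a single MAC instruction, and SIMD lets several run in parallel, which is why a few MACs bolted onto a controller fall short for filter-heavy workloads.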
We’ve developed our CEVA-BX1 and CEVA-BX2 cores based on this philosophy. These have been reviewed in a recent Linley Group report, which provides more technical detail and supports the view that this platform can, on its own, accomplish what other solutions need both a DSP IP and an MCU IP to achieve. Food for thought for anyone concerned about power and cost.
Published on Embedded Computing Design.