Overview

NeuPro-S™ is a low-power AI processor architecture for on-device deep learning inference, imaging and computer vision workloads.

NeuPro-S is a self-contained, specialized AI processor that also supports heterogeneous co-processing with custom AI engines, allowing additional customer differentiation and coverage of specific application needs. This lets it fit a broad range of end markets, including IoT, smartphones, surveillance, automotive, robotics, medical and industrial.

NeuPro-S builds on CEVA’s industry-leading position and experience in deep neural networks for computer vision applications. Dozens of customers are already deploying the CEVA-XM4, CEVA-XM6 and NeuPro vision platforms, along with the CDNN Compiler, in consumer, surveillance and ADAS products.

This new AI processor architecture covers a wide range of performance points, from 2 Tera Ops Per Second (TOPS) up to 12.5 TOPS per core, and scales to more than 100 TOPS using multi-core instantiations. NeuPro-S was designed to meet the most stringent safety compliance standards and comes with a complete, complementary software stack including CDNN, CEVA-CV, the CEVA-SLAM SDK and wide-angle imaging algorithms.
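
As a rough illustration of the multi-core scaling figure, the sketch below estimates how many top-end cores an aggregate throughput target would need. It assumes ideal linear scaling across cores, which real systems rarely achieve because of memory bandwidth and interconnect limits. The function name `cores_needed` is hypothetical and is not part of any CEVA tool.

```python
import math

PEAK_TOPS_PER_CORE = 12.5  # top-end per-core figure quoted in the overview


def cores_needed(target_tops: float, tops_per_core: float = PEAK_TOPS_PER_CORE) -> int:
    """Return the minimum core count for a target, assuming ideal linear scaling."""
    if target_tops <= 0 or tops_per_core <= 0:
        raise ValueError("throughput values must be positive")
    return math.ceil(target_tops / tops_per_core)


if __name__ == "__main__":
    print(cores_needed(100.0))  # prints 8
```

Under this idealized assumption, eight cores reach exactly 100 TOPS, so any target above 100 TOPS needs at least nine.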

Benefits

The NeuPro AI processor family was designed to lower the high barriers to entry into the AI space in terms of both architecture and software. It provides an optimized, cost-effective standard AI platform that can be used for a multitude of AI-based workloads and applications.

  • Self-contained, unified imaging, computer vision and AI processor in a single architecture
  • Up to 4096 native 8x8-bit MAC units, enabling up to 12.5 TOPS for a single core and 100+ TOPS for multi-core instantiations
  • System-aware architecture, optimized for memory bandwidth, power and performance efficiency
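
The relationship between the MAC count and the quoted peak figure can be checked with simple arithmetic. The sketch below assumes the common convention that one MAC counts as two operations (a multiply and an accumulate) and that every MAC unit is busy on every cycle. Neither assumption is stated in the text, and the clock frequency it derives is an implied value, not a published specification.

```python
OPS_PER_MAC = 2  # assumption: one multiply plus one accumulate per MAC per cycle


def peak_tops(mac_units: int, clock_hz: float) -> float:
    """Peak throughput in TOPS at full MAC utilization."""
    return mac_units * OPS_PER_MAC * clock_hz / 1e12


def implied_clock_hz(mac_units: int, tops: float) -> float:
    """Clock frequency that would yield `tops` at full MAC utilization."""
    return tops * 1e12 / (mac_units * OPS_PER_MAC)


if __name__ == "__main__":
    clock = implied_clock_hz(4096, 12.5)
    print(f"implied clock: {clock / 1e9:.2f} GHz")  # implied clock: 1.53 GHz
```

Sustained throughput on real networks will be lower than this peak, since utilization depends on layer shapes and memory traffic.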

Main Features

  • NeuPro-S AI processor consists of NeuPro-S Engine and CEVA-XM Vision DSP
    • NeuPro-S Engine - Specialized engines for Convolution, Activation and Pooling layers as well as weights decompression
    • CEVA-XM6 - Fully programmable vector DSP for complementary NN functions and simultaneous processing of computer vision, imaging and customer-extension workloads
    • Supports mixed 8-bit and 16-bit quantization, enabling a real-time tradeoff between precision and performance
  • Multi-level memory system hierarchy that enables multi-core scalability
  • Optimized DDR bandwidth through weight compression and exploitation of network sparsity
  • Advanced hardware DMA controllers for parallel processing and minimizing system overhead
  • The NeuPro-S AI processor architecture includes the following processor options:
    • NPS1000 includes 1024 8x8-bit MAC units
    • NPS2000 includes 2048 8x8-bit MAC units
    • NPS4000 includes 4096 8x8-bit MAC units
  • Supports heterogeneous scalability and co-processing with custom AI engines to enable further customer differentiation
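
To make the 8-bit versus 16-bit precision tradeoff listed above concrete, the sketch below quantizes a handful of made-up weights at both bit widths and compares the worst-case rounding error. It uses generic symmetric per-tensor quantization as an assumption; it is not the scheme used by CDNN or the NeuPro-S hardware, and all function names are hypothetical.

```python
def quantize(values, bits):
    """Symmetric uniform quantization to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate real values."""
    return [x * scale for x in q]


def max_error(values, bits):
    """Largest absolute difference between values and their quantized form."""
    q, scale = quantize(values, bits)
    return max(abs(v - r) for v, r in zip(values, dequantize(q, scale)))


if __name__ == "__main__":
    weights = [0.8131, -0.4527, 0.0312, -0.9990, 0.2504]  # illustrative values only
    print("8-bit max error: ", max_error(weights, 8))
    print("16-bit max error:", max_error(weights, 16))
```

The 16-bit error is far smaller, but each 16-bit operand doubles storage and bandwidth relative to 8-bit, which is the kind of cost a mixed-precision scheme trades against accuracy.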

Block Diagram

Microprocessor Report

New IP Comprises a General-Purpose Machine-Learning Processor

NeuPro’s target applications include advanced driver-assistance systems (ADASs), augmented-reality (AR) headsets, drones, smartphones, and surveillance cameras.