EdgeCortix Unveils Revolutionary AI Edge Accelerator

EdgeCortix, a five-year-old Tokyo-based startup specializing in AI at the edge, has introduced the SAKURA-II Edge AI accelerator.

This new accelerator, combined with EdgeCortix’s Dynamic Neural Accelerator (DNA) architecture, is designed to handle Generative AI tasks efficiently.

SAKURA-II emphasizes flexibility and power efficiency, making it well suited to workloads such as Large Language Models (LLMs), Large Vision Models (LVMs), and multi-modal transformer-based applications at the edge.

“SAKURA-II delivers an impressive 60 TOPS performance within 8W of typical power consumption. With its mixed-precision and built-in memory compression capabilities, it’s a game-changer for the latest Generative AI solutions at the edge,” says EdgeCortix founder and CEO Sakyasingha Dasgupta. “Whether running traditional AI models or the latest Llama 2/3, Stable-diffusion, Whisper, or Vision-transformer models, SAKURA-II offers deployment flexibility with superior performance per watt and cost-efficiency.”

SAKURA-II is versatile, serving sectors such as manufacturing and Industry 4.0, security, robotics, aerospace, and telecommunications. It uses EdgeCortix’s reconfigurable DNA-II neural processing engine to deliver power efficiency and real-time processing while handling multiple deep neural network models simultaneously with low latency.

The SAKURA-II can achieve up to 60 TOPS of 8-bit integer performance and 30 TFLOPS of 16-bit floating-point performance, with built-in mixed-precision support to tackle demanding AI tasks.
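A quick back-of-the-envelope check puts those headline numbers in perspective. The figures below come straight from the announcement (60 TOPS, 30 TFLOPS, 8 W typical); the efficiency ratios are simple arithmetic, not measured benchmarks:

```python
# Figures quoted in the announcement; derived ratios are illustrative only.
int8_tops = 60        # peak INT8 performance, TOPS
fp16_tflops = 30      # peak 16-bit floating-point performance, TFLOPS
typical_power_w = 8   # typical power consumption, watts

int8_tops_per_watt = int8_tops / typical_power_w      # 7.5 TOPS/W
fp16_tflops_per_watt = fp16_tflops / typical_power_w  # 3.75 TFLOPS/W

print(f"INT8 efficiency: {int8_tops_per_watt:.2f} TOPS/W")
print(f"FP16 efficiency: {fp16_tflops_per_watt:.2f} TFLOPS/W")
```

That 7.5 TOPS/W figure is the basis of the "performance per watt" claim in the CEO's quote above.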

The SAKURA-II platform, powered by the advanced MERA software suite, includes a heterogeneous compiler platform, sophisticated quantization, and model calibration features. It supports leading development frameworks like PyTorch, TensorFlow Lite, and ONNX. MERA’s adaptable host-to-accelerator unified runtime can scale across single, multi-chip, and multi-card systems at the edge, making AI inferencing smoother and shortening deployment times for data scientists.
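The quantization and calibration step MERA performs can be illustrated with a generic symmetric INT8 scheme. This is a textbook sketch of the technique, not EdgeCortix's actual algorithm or API:

```python
# Illustrative symmetric per-tensor INT8 quantization, the general
# technique behind compiler-side quantization; not MERA's real API.
def quantize_int8(weights, scale=None):
    """Map float weights to INT8: w_q = clamp(round(w / scale))."""
    if scale is None:
        # Calibration: choose scale so the largest-magnitude weight maps to 127.
        scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from INT8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, s = quantize_int8(weights)
print(q)                      # [50, -127, 3, 100]
print(dequantize_int8(q, s))  # close to the original weights
```

Real toolchains calibrate scales per layer (or per channel) from sample activations, which is what "model calibration features" refers to.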

The integration with the MERA Model Library and Hugging Face Optimum provides users access to a wide array of the latest transformer models, ensuring a smooth transition from training to edge inference.

Key Benefits of SAKURA-II include:

1. Optimized for Generative AI: Designed to handle Generative AI tasks at the edge with minimal power usage.
2. Complex Model Handling: Capable of managing multi-billion parameter models like Llama 2, Stable Diffusion, DETR, and ViT within an 8W power envelope.
3. Seamless Software Integration: Fully compatible with the MERA software suite for easy transitions from model training to deployment.
4. Enhanced Memory Bandwidth: Provides up to four times the DRAM bandwidth of comparable AI accelerators, improving performance for LLMs and LVMs.
5. Real-Time Data Streaming: Optimized for low-latency operations during real-time data streaming.
6. Advanced Precision: Offers software-enabled mixed-precision support for near FP32 accuracy.
7. Sparse Computation: Supports sparse computation to reduce memory usage and optimize bandwidth.
8. Versatile Functionality: Supports arbitrary activation functions with hardware approximation for better adaptability.
9. Efficient Data Handling: Features a dedicated reshaper engine to manage complex data permutations on-chip and reduce the load on the host CPU.
10. Power Management: Includes on-chip power-gating and power management features for ultra-high efficiency modes.
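
The bandwidth benefit (item 4) matters because LLM token generation at the edge is typically memory-bound: each generated token requires streaming the model's weights from DRAM, so bandwidth, not raw TOPS, often sets the ceiling. A rough model of that ceiling, using hypothetical numbers rather than published SAKURA-II specifications:

```python
# Simple memory-bound decode model: every token reads all weights once.
def max_tokens_per_sec(params_billion, bytes_per_param, dram_gb_s):
    """Upper bound on LLM decode speed when DRAM bandwidth is the bottleneck."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return dram_gb_s * 1e9 / bytes_per_token

# Hypothetical example: a 7B-parameter model in INT8 (1 byte/param) on
# 68 GB/s of DRAM bandwidth (an assumed figure, not an EdgeCortix spec).
print(max_tokens_per_sec(7, 1, 68))  # ~9.7 tokens/s ceiling
```

Under this model, quadrupling bandwidth quadruples the decode ceiling, which is why the announcement pairs the bandwidth claim with LLM and LVM workloads; the memory compression mentioned in the CEO's quote attacks the same bottleneck from the other side.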

SAKURA-II will be available as a stand-alone device, two different M.2 modules with varying DRAM capacity, and single or dual-device low-profile PCIe cards. Customers can reserve these modules and PCIe cards now, with delivery expected in the second half of 2024.
