
INTRODUCTION
This project focuses on the design and implementation of a custom chip that accelerates convolutional neural network (CNN) inference for handwritten digit recognition. The chip, fabricated and tested as part of a larger system, is optimized for efficient processing of pre-trained CNN models, demonstrating the potential for high-performance AI-specific hardware.
The chip is designed to handle the classification of 16x16 grayscale images using a CNN. It incorporates custom hardware modules tailored for convolution operations, pooling, and fully connected layers. By integrating quantization, the chip achieves both computational efficiency and reduced power consumption without compromising accuracy. These features make the chip well-suited for real-time applications in resource-constrained environments.
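To make this pipeline concrete, the sketch below traces the same sequence of operations in software on a 16x16 grayscale input: a convolution layer, 2x2 max pooling, 8-bit quantization, and a fully connected output layer over ten digit classes. The filter count, kernel size, quantization scheme, and random stand-in weights are illustrative assumptions, not the chip's actual configuration or trained parameters.

    import numpy as np

    def relu(x):
        return np.maximum(x, 0)

    def conv2d(image, kernels, bias):
        """Valid 2D convolution: image (H, W), kernels (K, kh, kw), bias (K,)."""
        K, kh, kw = kernels.shape
        H, W = image.shape
        out = np.zeros((K, H - kh + 1, W - kw + 1))
        for k in range(K):
            for i in range(H - kh + 1):
                for j in range(W - kw + 1):
                    out[k, i, j] = np.sum(image[i:i+kh, j:j+kw] * kernels[k]) + bias[k]
        return out

    def max_pool2d(fmaps, size=2):
        """2x2 max pooling over each feature map (K, H, W)."""
        K, H, W = fmaps.shape
        out = np.zeros((K, H // size, W // size))
        for k in range(K):
            for i in range(0, H - H % size, size):
                for j in range(0, W - W % size, size):
                    out[k, i // size, j // size] = np.max(fmaps[k, i:i+size, j:j+size])
        return out

    def quantize(x, scale=127):
        """Symmetric 8-bit quantization: map floats to signed integers."""
        return np.clip(np.round(x * scale), -128, 127).astype(np.int8)

    # Forward pass with random stand-in weights; on the real chip these
    # come from the pre-trained, quantized model.
    rng = np.random.default_rng(0)
    image = rng.random((16, 16))                     # 16x16 grayscale input
    kernels = rng.standard_normal((4, 3, 3)) * 0.1   # 4 assumed 3x3 filters
    conv_bias = np.zeros(4)
    fmaps = relu(conv2d(image, kernels, conv_bias))  # -> (4, 14, 14)
    pooled = max_pool2d(fmaps)                       # -> (4, 7, 7)
    flat = pooled.reshape(-1)                        # -> (196,)
    fc_w = rng.standard_normal((10, flat.size)) * 0.1
    # 8-bit weights and activations keep the on-chip multiply-accumulate
    # units small and power-efficient.
    fc_w_q = quantize(fc_w).astype(np.int32)
    flat_q = quantize(flat / max(flat.max(), 1e-6)).astype(np.int32)
    logits = fc_w_q @ flat_q                         # integer MACs, 10 digit classes
    print("Predicted digit:", int(np.argmax(logits)))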
In operation, the chip processes sequential pixel data streamed from an FPGA, which handles input formatting and timing. The on-chip logic executes the CNN computations and delivers high-speed classification results, while the FPGA manages communication between the chip and external systems such as the Raspberry Pi and display interfaces. This streamlined design allows the chip to classify digits with high accuracy, demonstrating the feasibility of hardware-accelerated AI.
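As a simple illustration of that sequential interface, the following sketch models the FPGA streaming one 16x16 frame to the accelerator a pixel at a time; the row-major ordering and end-of-frame flag are assumptions made for illustration rather than the project's actual signaling.

    import numpy as np
    from typing import Iterator, Tuple

    def stream_pixels(frame: np.ndarray) -> Iterator[Tuple[int, bool]]:
        """Emit one 8-bit pixel per transfer in row-major order, mimicking the
        sequential stream the FPGA sends to the accelerator. The end-of-frame
        flag is an assumed framing convention, not the project's protocol."""
        flat = frame.reshape(-1)
        for i, px in enumerate(flat):
            yield int(px) & 0xFF, i == flat.size - 1

    # Example: transfer one 16x16 frame and confirm it takes 256 transfers.
    frame = np.random.default_rng(1).integers(0, 256, size=(16, 16))
    received = []
    for pixel, end_of_frame in stream_pixels(frame):
        received.append(pixel)
        if end_of_frame:
            break
    print("Pixels received:", len(received))  # 256 for a 16x16 frame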
This project highlights the end-to-end process of chip design, from architecture definition and implementation to integration and testing within a complete system. It addresses key challenges such as data sequencing, timing synchronization, and efficient hardware realization of neural network operations, providing a practical foundation for future AI hardware development.
