Performance Analysis of MAC Units Using Various Multiplication Algorithms for Deep Learning Applications

Date

2018

Authors

Jha, Gunjan

Abstract

A wide range of applications, such as deep neural networks, DSP, multimedia, and image processing, use the multiply-accumulate (MAC) unit as a basic building block. The computing efficiency of these applications relies on the speed of the adders and multipliers used in the MAC unit, which can be increased by using high-speed adders and multipliers. In the past, the main focus was on circuit speed. Currently, however, low power consumption in these circuits has become increasingly important.

In deep neural networks, or deep learning, convolution operations account for more than 90% of the overall computation, and they are primarily performed using MAC units. The primary objective of this thesis is to design, implement, and analyze an IEEE 754 single-precision floating-point MAC and a 32-bit signed integer MAC using various multiplication algorithms. A MAC unit multiplies two numbers and adds the product to an accumulator. In this research, the performance of MACs built with different multipliers is analyzed.
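The multiply-then-accumulate step described above, and the way a convolution reduces to repeated MAC steps, can be sketched in a few lines of Python. This is an illustrative software model only, not the thesis's Verilog design: the function names (`to_int32`, `mac`, `conv1d_valid`) are hypothetical, and wrapping the accumulator to 32 bits is an assumption made here for simplicity, since the abstract does not state the accumulator width.

```python
def to_int32(value: int) -> int:
    """Wrap a Python integer to the 32-bit two's-complement range."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def mac(acc: int, a: int, b: int) -> int:
    """One MAC step: multiply a by b and add the product to the accumulator.

    Assumption: the accumulator wraps at 32 bits (not stated in the source).
    """
    return to_int32(acc + a * b)


def conv1d_valid(signal, kernel):
    """1-D sliding dot product (no padding), built only from MAC steps."""
    out = []
    for start in range(len(signal) - len(kernel) + 1):
        acc = 0  # accumulator is cleared for each output sample
        for k, weight in enumerate(kernel):
            acc = mac(acc, signal[start + k], weight)
        out.append(acc)
    return out
```

Each output sample costs one MAC step per kernel tap, which is why the speed, power, and area of the MAC unit dominate the cost of convolution-heavy workloads.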

The MAC units are implemented in Verilog, and synthesis and simulation are done using Xilinx ISE Design Suite 13.4. The performance metrics considered for the analysis are power, delay, and area. The realization is carried out using Synopsys Design Compiler on 45 nm and 90 nm technology nodes. The results show that single-precision floating-point MACs outperform 32-bit signed integer MACs, even though the single-precision floating-point (FP) format can represent a much greater range of values than the 32-bit signed integer format. The MAC with the Wallace Tree multiplier performs best in terms of speed, whereas the Radix-8 Modified Booth multiplier-based MAC consumes the least power and area among the implemented MACs. Power, delay, and area follow the same trends for both the 90 nm and 45 nm processes.
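For readers unfamiliar with the Radix-8 Modified Booth multiplier named above, the following Python sketch shows the standard textbook recoding it is based on. It is a behavioral model under stated assumptions, not the thesis's hardware implementation: the function names are hypothetical, and the multiplier is assumed to fit in `bits`-wide two's complement. Each recoded digit covers three multiplier bits plus one overlap bit and takes a value from -4 to 4, so a 32-bit multiplier yields 11 partial products instead of 32.

```python
def booth_radix8_digits(multiplier: int, bits: int = 32):
    """Recode a signed multiplier into radix-8 Booth digits in [-4, 4].

    Assumes `multiplier` fits in `bits`-wide two's complement.
    Digits are returned least significant first.
    """
    num_digits = -(-bits // 3)  # ceil(bits / 3)

    def bit(i: int) -> int:
        # Bit -1 is defined as 0; Python's arithmetic right shift
        # sign-extends negative values for indices past the top bit.
        return 0 if i < 0 else (multiplier >> i) & 1

    return [
        -4 * bit(3 * i + 2) + 2 * bit(3 * i + 1) + bit(3 * i) + bit(3 * i - 1)
        for i in range(num_digits)
    ]


def booth_radix8_multiply(multiplicand: int, multiplier: int, bits: int = 32) -> int:
    """Sum the partial products digit * multiplicand * 8**i."""
    product = 0
    for i, digit in enumerate(booth_radix8_digits(multiplier, bits)):
        product += (digit * multiplicand) << (3 * i)
    return product
```

In hardware, the digit values ±3 require a precomputed 3× multiple of the multiplicand, which is the usual cost of radix-8 recoding in exchange for fewer partial products.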

Keywords

Deep Learning, Low Power, MAC

Department

Electrical and Computer Engineering