# Design and Analysis of 8-bit Array, Carry Save Array, Braun, Wallace Tree and Vedic Multipliers

S. Nagaraj\*, B. Vamsi Krishna, Ganekanti Naresh and Dr.P.K. Anand Prem

Abstract--- Multipliers are basic building blocks for applications such as digital signal processing processors and digital image processing. In this paper, we design 8-bit array, carry save array, Braun, Wallace tree and Vedic multipliers and analyze their speed, area and power. The designs were implemented in Verilog HDL, and simulated and synthesized using the Xilinx tool.

Index Terms--- Array Multiplier, Carry Save Array Multiplier, Wallace Tree Multiplier, Braun Multiplier, Vedic Multiplier.

# I. INTRODUCTION

Multipliers play an important role in signal processing, according to A. V. Oppenheim and R. W. Schafer [1], and in various other applications. Multiplication is a mathematical operation in which a number is added to itself a specified number of times. Multipliers take more time and area than other arithmetic operations. Multipliers are used in Digital Signal Processing applications such as convolution, filtering and the Fast Fourier Transform (FFT), and in the Arithmetic Logic Unit (ALU) of microprocessors. 8.72 % of all the instructions in a scientific program are multiplications as per A. V. Oppenheim and R. W. Schafer [1]. Research has been carried out by Poras T. Balsara et al. [7], R. Gnanasekaran [8], Gensuke Goto, Tomio Sato et al. [9] and H. I. Saleh, A. H. Khalil et al. [10] to develop new techniques and algorithms for high speed and optimized area.

An efficient multiplier has the following characteristics:

- Speed: The multiplier should perform its operations at high speed.
- Accuracy: The results of the multiplier should be correct.
- Area: The multiplier should occupy little area, that is, it should use a minimum number of transistors.
- Power: The power consumption of the multiplier should be low.

Multiplication generally comprises two steps: partial products are generated, and each one is then added to the accumulated sum of the previous partial products. Every bit of the multiplicand is multiplied by every bit of the multiplier to get the partial products, so multiplying two N-bit numbers generates N partial product rows of N bits each. Each partial product bit is produced by one AND gate, so $N^2$ AND gates are needed.
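The partial product generation described above can be sketched as a small behavioural model. This is only an illustrative Python sketch, not the Verilog used in this work; the function names `partial_products` and `sum_shifted_rows` are hypothetical.

```python
def partial_products(multiplicand, multiplier, n=8):
    """Return n rows of n bits, where rows[j][i] = A_i AND B_j."""
    rows = []
    for j in range(n):
        b_j = (multiplier >> j) & 1
        # One AND gate per partial product bit: n gates per row, n*n in total.
        rows.append([((multiplicand >> i) & 1) & b_j for i in range(n)])
    return rows


def sum_shifted_rows(rows):
    """Add the rows, shifting row j left by j positions."""
    total = 0
    for j, row in enumerate(rows):
        row_value = sum(bit << i for i, bit in enumerate(row))
        total += row_value << j
    return total


# 4-bit example: 1011 (11) times 0110 (6)
rows = partial_products(0b1011, 0b0110, n=4)
product = sum_shifted_rows(rows)  # 66
```

Each row is either all zeros or a copy of the multiplicand, depending on the corresponding multiplier bit.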

The general process can be broken down into three steps, as per H. A. Al-Twaijry [2], as shown in Figure 1.

S. Nagaraj\*, Associate Professor, Department of ECE, SVCET, RVS Nagar, Chittoor, AP. E-mail: nagarajsubramanyam@gmail.com

B. Vamsi Krishna, Assistant Professor, Department of ECE, CMR College of Engineering & Technology, Telangana. E-mail: vamsi.kurnool@gmail.com

Ganekanti Naresh, Assistant Professor, Center of VLSI and Embedded System, Sree Vidyanikethan Engineering College, Tirupati, AP. E-mail:ganekantinaresh@gmail.com

Dr.P.K. Anand Prem, Assistant Professor, Department of ECE, SVCET, RVS Nagar, Chittoor, AP. E-mail: anandprem.2008@gmail.com

- 1. Partial products are generated in the first step. They are generated in parallel, and there are several ways to generate them.
- 2. The partial products are reduced from N rows to two rows, called the sum and carry rows. Special adder architectures are used to produce these two rows. The delay of this step can be reduced by up to 30 % as per Earl E. Swartzlander, Jr. [5].
- 3. The sum and carry rows are added using an adder to get the final product of the input operands.

The following multipliers are simulated and synthesized:

- 1. Array multiplier
- 2. Wallace multiplier
- 3. Braun multiplier
- 4. Carry save array multiplier
- 5. Vedic multiplier

In this paper, we analyze and compare the area, speed and power of different multipliers using Verilog HDL. Figure 3 shows a serial multiplier. Figure 4 shows a parallel multiplier.

### Array Multiplier

Figure 5 shows the regular, well-known structure of the array multiplier. In this multiplier the partial products are generated by multiplying one multiplier bit with the multiplicand, and the partial products are shifted and added accordingly. The operation is therefore a shift-and-add process. The array method of partial product accumulation was proposed by C.R. Baugh and B.A. Wooley [18] and P.E. Blankenship [19]. Array multipliers are well suited to digital signal processing applications because their interconnections are simple and regular. An array multiplier for n x n bits requires $n^2$ AND gates and n(n-1) adders. Figure 5 shows a 4x4 array multiplier. The array multiplier is easy to design, but it is slow because of its long critical path.

Fig. 1: Multiplication Steps

|     |     |     |     |     |     |     |     | A7   | A6   | A5   | A4   | A3   | A2   | A1   | A0   | Multiplicand                   |
|-----|-----|-----|-----|-----|-----|-----|-----|------|------|------|------|------|------|------|------|--------------------------------|
|     |     |     |     |     |     |     |     | B7   | B6   | B5   | B4   | B3   | B2   | B1   | B0   | Multiplier                     |
|     |     |     |     |     |     |     |     | A7B0 | A6B0 | A5B0 | A4B0 | A3B0 | A2B0 | A1B0 | A0B0 | 1. Partial products generation |
|     |     |     |     |     |     |     | A7B1 | A6B1 | A5B1 | A4B1 | A3B1 | A2B1 | A1B1 | A0B1 |      |                                |
|     |     |     |     |     |     | A7B2 | A6B2 | A5B2 | A4B2 | A3B2 | A2B2 | A1B2 | A0B2 |      |      |                                |
|     |     |     |     |     | A7B3 | A6B3 | A5B3 | A4B3 | A3B3 | A2B3 | A1B3 | A0B3 |      |      |      |                                |
|     |     |     |     | A7B4 | A6B4 | A5B4 | A4B4 | A3B4 | A2B4 | A1B4 | A0B4 |      |      |      |      |                                |
|     |     |     | A7B5 | A6B5 | A5B5 | A4B5 | A3B5 | A2B5 | A1B5 | A0B5 |      |      |      |      |      |                                |
|     |     | A7B6 | A6B6 | A5B6 | A4B6 | A3B6 | A2B6 | A1B6 | A0B6 |      |      |      |      |      |      |                                |
|     | A7B7 | A6B7 | A5B7 | A4B7 | A3B7 | A2B7 | A1B7 | A0B7 |      |      |      |      |      |      |      |                                |
|     | S14 | S13 | S12 | S11 | S10 | S9  | S8  | S7   | S6   | S5   | S4   | S3   | S2   | S1   | S0   | 2. Two rows                    |
| C14 | C13 | C12 | C11 | C10 | C9  | C8  | C7  | C6   | C5   | C4   | C3   | C2   | C1   | C0   |      |                                |
| P15 | P14 | P13 | P12 | P11 | P10 | P9  | P8  | P7   | P6   | P5   | P4   | P3   | P2   | P1   | P0   | 3. Final product               |

Fig. 2: Multiplication Steps
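The shift-and-add structure of the array multiplier in this section can be modelled at the gate level. The following is a hedged Python sketch, assuming a row-by-row ripple-carry arrangement in which every adder cell is counted as a full adder; the names `full_adder` and `array_multiply` are hypothetical and the cell wiring is not taken from Figure 5.

```python
def full_adder(a, b, cin):
    """One-bit full adder returning (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def array_multiply(x, y, n=8):
    """Unsigned n x n array multiplication.

    Returns (product, and_gate_count, adder_cell_count).
    """
    and_gates = 0
    adders = 0
    rows = []
    for j in range(n):
        row = []
        for i in range(n):
            row.append(((x >> i) & 1) & ((y >> j) & 1))
            and_gates += 1
        rows.append(row)

    product_bits = [rows[0][0]]   # P0 comes straight from the first row
    upper = rows[0][1:] + [0]     # running sum aligned with the next row
    for j in range(1, n):
        carry = 0
        sums = []
        for i in range(n):        # n adder cells per row, rippling the carry
            s, carry = full_adder(upper[i], rows[j][i], carry)
            adders += 1
            sums.append(s)
        product_bits.append(sums[0])   # lowest sum bit is product bit P_j
        upper = sums[1:] + [carry]
    product_bits.extend(upper)         # remaining n high-order product bits

    product = sum(bit << k for k, bit in enumerate(product_bits))
    return product, and_gates, adders


result = array_multiply(255, 255)  # (65025, 64, 56)
```

For n = 8 the model counts 64 AND gates and 8 x 7 = 56 adder cells, which agrees with the $n^2$ and n(n-1) figures quoted above. The carry ripples through every row, which is the source of the long critical path.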

### Carry Save Array Multiplier

The Carry Save Array Multiplier uses carry save adders to reduce the critical path delay. A carry propagation adder is used in the final stage to generate the final product. Figure 6 shows a 4x4 Carry Save Array Multiplier.
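The carry save idea can be illustrated with a word-level Python sketch. It assumes a simple linear chain of carry save adders followed by one carry propagation addition; the names `carry_save_add` and `carry_save_array_multiply` are hypothetical, and the exact cell arrangement of Figure 6 is not reproduced.

```python
def carry_save_add(a, b, c):
    """Add three words without propagating carries.

    Returns (sum_row, carry_row) such that sum_row + carry_row == a + b + c.
    """
    sum_row = a ^ b ^ c
    carry_row = ((a & b) | (b & c) | (a & c)) << 1  # carries move one column left
    return sum_row, carry_row


def carry_save_array_multiply(x, y, n=8):
    """Accumulate the partial product rows in carry save form."""
    sum_row, carry_row = 0, 0
    for j in range(n):
        partial = (x << j) if (y >> j) & 1 else 0
        sum_row, carry_row = carry_save_add(sum_row, carry_row, partial)
    return sum_row, carry_row


s, c = carry_save_array_multiply(200, 123)
product = s + c  # the only carry propagating addition; 24600
```

Because no stage waits for a carry to ripple across the word, each stage has the delay of a single full adder, and carry propagation is paid once at the end.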

### Wallace Tree Multiplier

The Wallace Tree Multiplier was proposed in 1964 by C.S. Wallace [3]. It is a fast method of performing multiplication, and its speed advantage grows for larger operands. The partial product matrix of an array multiplier is rearranged to form a tree-like structure as shown in the figure. This reduces the number of sequential adder stages and therefore the critical path. The Wallace Tree Multiplier uses a column compression technique, at the cost of a more complex design. Column compression can use (2:2), (3:2), (4:2) and (5:2) compressors.



Fig. 3: Serial Multiplier





Fig. 4: Parallel Multiplier

Figure 7 shows an 8-bit Wallace Tree Multiplier using 3:2 compressors.
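Column compression can be sketched in Python as follows. This is a simplified model that assumes only 3:2 compressors are used in the reduction stages and that leftover bits in a column pass through unchanged; it is not the exact wiring of Figure 7, and the names `full_adder` and `wallace_multiply` are hypothetical.

```python
def full_adder(a, b, c):
    """3:2 compressor returning (sum, carry)."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def wallace_multiply(x, y, n=8):
    """Unsigned multiply by column compression.

    Returns (product, number_of_reduction_stages).
    """
    # cols[k] holds every partial product bit of weight 2**k.
    cols = [[] for _ in range(2 * n)]
    for j in range(n):
        for i in range(n):
            cols[i + j].append(((x >> i) & 1) & ((y >> j) & 1))

    stages = 0
    while max(len(col) for col in cols) > 2:
        nxt = [[] for _ in range(len(cols) + 1)]
        for k, col in enumerate(cols):
            full = len(col) - len(col) % 3
            for t in range(0, full, 3):
                s, carry = full_adder(col[t], col[t + 1], col[t + 2])
                nxt[k].append(s)          # sum stays in the same column
                nxt[k + 1].append(carry)  # carry moves to the next column
            nxt[k].extend(col[full:])     # leftover bits pass through
        cols = nxt
        stages += 1

    # At most two bits remain per column: add the two rows with a final adder.
    row_a = sum(col[0] << k for k, col in enumerate(cols) if len(col) > 0)
    row_b = sum(col[1] << k for k, col in enumerate(cols) if len(col) > 1)
    return row_a + row_b, stages


product, stages = wallace_multiply(200, 123)  # product is 24600
```

All compressors within one stage operate in parallel, so the delay grows with the number of stages rather than with the number of partial product rows.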

#### A. 2:2 Compressors

A half adder is a (2:2) compressor: it takes two bits from a column of the partial product matrix and produces two output bits, a sum bit that stays in the same column and a carry bit that goes to the next column. Figure 8 shows the (2:2) compressor.



Fig. 5: Array Multiplier



Fig. 6: Carry save Array Multiplier

#### B. 3:2 Compressors

The Wallace tree can use the 3:2 compressors proposed in Ahmed M. Shams, Tarek K. Darwish et al. [20] and Hung Tine Bui, Yuke Wang et al. [21]. A full adder is a (3:2) compressor: it takes three bits from a column of the partial product matrix and produces two output bits, a sum bit that stays in the same column and a carry bit that goes to the next column. Figure 9 shows the (3:2) compressor.
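The (2:2) and (3:2) compressors described above are simply the half adder and the full adder. A minimal Python sketch, with the hypothetical names `compressor_2_2` and `compressor_3_2`, shows that each one preserves the arithmetic value of its input bits:

```python
from itertools import product as all_inputs


def compressor_2_2(a, b):
    """Half adder: two bits in, (sum, carry) out."""
    return a ^ b, a & b


def compressor_3_2(a, b, c):
    """Full adder: three bits in, (sum, carry) out."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


# The carry has twice the weight of the sum, so sum + 2*carry
# must equal the number of ones among the inputs.
for a, b in all_inputs((0, 1), repeat=2):
    s, carry = compressor_2_2(a, b)
    assert s + 2 * carry == a + b

for a, b, c in all_inputs((0, 1), repeat=3):
    s, carry = compressor_3_2(a, b, c)
    assert s + 2 * carry == a + b + c
```

Only the (3:2) compressor actually reduces the number of bits in a column; the (2:2) compressor moves one bit into the next column without changing the total count.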

### Braun Multiplier

The Braun array is the simplest parallel multiplier. All the partial products are computed in parallel and then collected through a cascade of carry save adders. The completion time is limited by the depth of the carry save array and by the carry propagation in the final adder. Note that this multiplier is suited only for positive (unsigned) operands. The structure of the Braun algorithm for unsigned binary multiplication is shown in Figure 10.
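The restriction to positive operands can be seen with a short Python sketch. Here `unsigned_multiply` is a hypothetical behavioural stand-in for any unsigned array such as the Braun multiplier, and `to_bits` and `to_signed` are illustrative helpers for 8-bit two's complement patterns.

```python
def unsigned_multiply(a_bits, b_bits, n=8):
    """Behavioural stand-in for an unsigned n x n array multiplier."""
    product = 0
    for j in range(n):
        if (b_bits >> j) & 1:
            product += a_bits << j
    return product & ((1 << (2 * n)) - 1)


def to_bits(value, bits):
    """Two's complement bit pattern of value in the given width."""
    return value & ((1 << bits) - 1)


def to_signed(pattern, bits):
    """Interpret a bit pattern as a two's complement number."""
    pattern &= (1 << bits) - 1
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


# -3 is the pattern 0xFD in 8 bits; the array treats it as 253.
raw = unsigned_multiply(to_bits(-3, 8), to_bits(2, 8))
as_signed = to_signed(raw, 16)  # 506, not the expected -6
```

The array computes 253 x 2 = 506 correctly as an unsigned product, but that 16-bit pattern does not represent -6, so signed operands need a different scheme such as the Baugh-Wooley algorithm [18].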

### Vedic Multiplier



Fig. 7: Wallace Tree Multiplier using 3:2 Compressor

Vedic Mathematics is an ancient Indian system of mathematics. It is based on 16 Vedic sutras, which describe ways of solving problems mathematically. Vedic mathematics algorithms have been used in the multiplication process of 8085 and 8086 microprocessors, with good time savings. Vedic multipliers can be used for high speed and low power applications and are less complex than a Booth multiplier. A Vedic multiplier also requires less hardware, and hence less area. Vedic multipliers therefore have advantages in terms of speed, area, complexity and power. Out of the 16 sutras, we use the Urdhva Tiryagbhyam sutra technique for multiplication.



Fig. 8: 2:2 Compressor

Figure 11 shows the 2x2 multiplication process.

Figure 12 shows the 4x4 multiplication process.
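The Urdhva Tiryagbhyam ("vertically and crosswise") process can be sketched in Python. The sketch below assumes the common hierarchical arrangement in which a 4x4 multiplier is built from four 2x2 blocks and an 8x8 multiplier from four 4x4 blocks; whether Figures 11 and 12 use exactly this decomposition is an assumption, and the names `vedic_2x2` and `vedic_multiply` are hypothetical.

```python
def vedic_2x2(a, b):
    """2x2 Urdhva Tiryagbhyam block built from AND gates and half adders."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    p0 = a0 & b0                          # vertical: least significant bits
    cross1, cross2 = a1 & b0, a0 & b1     # crosswise products
    p1 = cross1 ^ cross2                  # half adder sum
    c1 = cross1 & cross2                  # half adder carry
    top = a1 & b1                         # vertical: most significant bits
    p2 = top ^ c1                         # second half adder sum
    p3 = top & c1                         # second half adder carry
    return p0 | (p1 << 1) | (p2 << 2) | (p3 << 3)


def vedic_multiply(a, b, n):
    """Hierarchical n x n multiply for n = 2, 4, 8, ... (a power of two)."""
    if n == 2:
        return vedic_2x2(a, b)
    h = n // 2
    mask = (1 << h) - 1
    a_lo, a_hi = a & mask, a >> h
    b_lo, b_hi = b & mask, b >> h
    low = vedic_multiply(a_lo, b_lo, h)       # vertical, low halves
    cross = (vedic_multiply(a_hi, b_lo, h)
             + vedic_multiply(a_lo, b_hi, h))  # crosswise terms
    high = vedic_multiply(a_hi, b_hi, h)      # vertical, high halves
    return low + (cross << h) + (high << n)


product = vedic_multiply(200, 123, 8)  # 24600
```

The four sub-products at each level are independent, so in hardware they can be computed in parallel before the adders combine them.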



Fig. 9: 3:2 Compressor



Fig. 10: Braun Multiplier

# II. METHODOLOGY

The multipliers are designed in Verilog. A full adder is designed using Verilog gate primitives, and the Array Multiplier is built from these full adders. The Carry Save Array Multiplier, Wallace Tree Multiplier, Braun Multiplier and Vedic Multiplier are designed in the same manner. The designed multipliers are synthesized using the Xilinx tool to find the power, time delay and area.

# III. RESULTS

The 8-bit Array Multiplier, Wallace Tree Multiplier, Braun Multiplier, Carry Save Array Multiplier and Vedic Multiplier were simulated and synthesized using Xilinx ISE 9.1i. Table 1 shows the area, time delay and power consumption of the multipliers. Figure 13 compares the area of the multipliers, Figure 14 compares their time delay, and Figure 15 compares their power consumption.



Fig. 12: Vedic 4x4 Multiplier

# IV. CONCLUSION

The performance of the Array Multiplier, Wallace Tree Multiplier, Braun Multiplier, Carry Save Array Multiplier and Vedic Multiplier has been compared in terms of speed, area and power. The results show that the Wallace Tree Multiplier has the lowest time delay (3.749 ns) of the five designs. The power consumption of all the multipliers is the same (73 mW). The Array Multiplier occupies the least area (73 LUTs).

| Multiplier                  | No. of LUTs | Time delay (ns) | Power consumption (mW) |
|-----------------------------|-------------|-----------------|------------------------|
| Array Multiplier            | 73          | 3.849           | 73                     |
| Carry Save Array Multiplier | 84          | 3.876           | 73                     |
| Wallace Tree Multiplier     | 89          | 3.749           | 73                     |
| Braun Multiplier            | 81          | 4.037           | 73                     |
| Vedic Multiplier            | 99          | 3.796           | 73                     |

Table 1: Comparison of 8-Bit Multipliers
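The comparison in Table 1 can be reproduced with a few lines of Python. The values are copied from the table; the variable names are illustrative only.

```python
# (LUTs, time delay in ns, power in mW), copied from Table 1
table_1 = {
    "Array": (73, 3.849, 73),
    "Carry Save Array": (84, 3.876, 73),
    "Wallace Tree": (89, 3.749, 73),
    "Braun": (81, 4.037, 73),
    "Vedic": (99, 3.796, 73),
}

fastest = min(table_1, key=lambda name: table_1[name][1])
slowest = max(table_1, key=lambda name: table_1[name][1])
smallest = min(table_1, key=lambda name: table_1[name][0])
largest = max(table_1, key=lambda name: table_1[name][0])
same_power = len({row[2] for row in table_1.values()}) == 1

# Relative gap between the slowest and fastest designs, in percent.
delay_spread_pct = 100 * (table_1[slowest][1] - table_1[fastest][1]) / table_1[slowest][1]

print(f"fastest: {fastest}, slowest: {slowest}")
print(f"smallest: {smallest}, largest: {largest}")
print(f"delay spread: {delay_spread_pct:.1f} %, identical power: {same_power}")
```

On these numbers the Wallace tree design is the fastest, the array design is the smallest, and the reported power is identical for all five designs.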













Fig. 15: Power (mW)

# References

- [1] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing. 2nd edition, Prentice Hall, 1999.
- [2] H. A. Al-Twaijry, Area and performance optimized CMOS multipliers, Ph.D. dissertation, Stanford University, Aug. 1997.
- [3] C. S. Wallace, A Suggestion for a Fast Multiplier, *IEEE Transactions on Electronic Computers*, vol. 13, pp. 14-17, 1964.
- [4] L. Dadda, Some Schemes for Parallel Multipliers, *Alta Frequenza*, vol. 34, pp. 349-356, 1965.
- [5] Earl E. Swartzlander, Jr., High-Speed Computer Arithmetic, Chapter 22 in 2ed. *Computer Science Handbook*, Boca Raton, FL: Chapman and Hall/CRC, 2004.
- [6] Gowthami. P and R.V.S. Satyanarayana, "Design of an Efficient Multiplier Using Vedic Mathematics and Reversible Logic", 2016 *IEEE International Conference on Computational Intelligence and Computing Research.*
- [7] Poras T. Balsara and David T. Harper, Understanding VLSI bit serial multiplier, *IEEE Trans. on Education*, Vol. 39, No. 1, Feb. 1996.
- [8] R. Gnanasekaran, A fast serial-parallel binary multiplier, *IEEE Trans. Computer*, Vol. C-34, No. 8, pp. 741-744, Aug. 1985.
- [9] Gensuke Goto, Tomio Sato, Masao Nakajima, and Takao Sukemura, A 54X54 regular structured tree multiplier, *IEEE J. Of Solid State Circuits*, Vol. 27, No. 9, pp. 1229-1236, Sept. 1992.
- [10] H. I. Saleh, A. H. Khalil, M. A. Ashour, and A. E. Salama, Novel serial parallel multipliers, *IEE Proc. Circuits Devices System*, Vol. 148, No. 4, pp. 183-189, Aug. 2001.
- [11] A.D. Booth, A signed binary multiplication technique, *Quarterly J. Mech. Appl. Math*, Vol. 4, Part 2, pp. 236-240, 1951.
- [12] L. P. Rubinfield, A proof of the modified booth's algorithm for multiplication, *IEEE Trans. on Computers*, Vol. 24, No. 10, pp. 1014-1015, Oct. 1975.
- [13] K. Choi and M. Song, Design of a high performance 32\*32-bit multiplier with a novel sign select Booth encoder, in *Proc. 2001 IEEE Int. Symp. Circuits and Systems*, Vol. 2, pp. 701-704, May 2001.
- [14] K-Y. Khoo, Z. Yu, and A.N. Willson, Improved-Booth encoding for low-power multipliers, in *Proc. 1999 IEEE Int. Symp. Circuits and Systems*, Vol. 1, pp. 62-65, May 1999.
- [15] Villeger and V. G. Oklobdzija, Analysis of Booth encoding efficiency in parallel multipliers using compressors for reduction of partial products, *Proceedings of the 27th Asilomar Conference on Signals, Systems and Computers*, pp. 781-784, 1993.
- [16] H. Sam and A. Gupta, A generalized multibit recoding of twos complement binary numbers and its proof with application in multiplier implementations, *IEEE Trans. Computer*, Vol. 39, No. 8, pp. 1006-1015, Aug. 1990.
- [17] E. Atkin, Design of arithmetic units of Illiac III: Use of redundancy and higher radix method, *IEEE Trans. on Computer*, Vol. C-19, No. 8, pp. 720-733, Aug. 1970.
- [18] C.R. Baugh and B.A. Wooley, A two's complement parallel array multiplication algorithm, *IEEE Trans. Computers*, Vol. 22, No. 12, pp. 1045-1047, Dec. 1973.
- [19] P.E. Blankenship, Comments on a two's complement parallel array multiplication algorithm, *IEEE Trans. Computers*, Vol. 23, pp. 1327, 1974.
- [20] Ahmed M. Shams, Tarek K. Darwish and Magdy A. Bayoumi, Performance analysis of low-power 1-bit CMOS full adder cells, *IEEE Transactions On Very Large Scale Integration (VLSI) Systems*, Vol. 10, No. 1, pp. 20-29, Feb. 2002.
- [21] Hung Tine Bui, Yuke Wang, and Yingtao Jiang, Design and analysis of low power 10-transistor full adder using novel XOR XNOR gates, *IEEE Trans. On Circuits And Systems-II: Analog And Digital Signal Processing*, Vol. 49, No. 1, pp. 25-30, Jan. 2002.
- [22] Hiroshi Makino, Yasunobu Nakase, Hiroaki Suzuki, Hiroyuki Morinaka, Hirofumi Shinohara, and Koichiro Mashiko, An 8.8-ns 54 x 54-Bit multiplier with high speed redundant binary architecture, *IEEE J. Of Solid State Circuits*, Vol. 31, No. 6, pp. 773-783, June 1996.
- [23] T. Shen and A. Weinberger, 4-2 carry-save adder implementation using send circuits, in *IBM Technical Disclosure Bulletin*, pp. 3594-3597, February 1978.
- [24] Qi Wang and Yousef R. Shayan, A versatile signed array multiplier suitable for VLSI implementation, in Proc. *IEEE CCECE 2003*, Vol. 1, pp. 199-202, May 2003.
- [25] K. Prasad and K.K. Parhi, Low-power 4-2 and 5-2 compressors, *Proc. of 2001 Asilomar Conf. on Signals, Systems and Computers, Pacific Grove, CA*, USA, Vol. 1, pp. 129-133, Nov. 2001.
- [26] P.J. Song and G. De Micheli, Circuit and architecture trade-off for high-speed multiplication, *IEEE J. Solid-State Circuits*, Vol. 26, pp. 1184-1198, Sept. 1991.
- [27] Shen-Fu Hsiao, Ming-Roun Jiang, and Jia-Sien Yeh, Design of high-speed low-power 3-2 counter and 4-2 compressor for fast multipliers, *IEE Electronics Letters*, Vol. 34, No. 4, pp. 341-343, Feb. 1998.
- [28] Chidgupkar P.D. and Karad M.T. (2004), The Implementation of Vedic Algorithms in Digital Signal Processing, *Global Journal of Engg Education*, Vol. 8, No. 2, Australia.
- [29] Thapliyal H. and Srinivas M.B. (2004), High Speed Efficient N x N Bit Parallel Hierarchical Overlay Multiplier Architecture Based on Ancient Indian Vedic Mathematics, *Transactions on Engineering, Computing and Technology*, Vol. 2.
- [30] Rabaey J., Chandrakasan A. and Nikolic B. (2002), Digital Integrated Circuits: A Design Perspective, 2nd edition, Prentice Hall.