New and improved word-based unified and scalable architecture for radix 2 Montgomery modular multiplication algorithm

Atef Ibrahim; Fayez Gebali; Hamed Elsimary

doi:10.1109/PACRIM.2013.6625466

Computer Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

This paper presents a new and improved word-based processor array architecture for the unified and scalable radix-2 Montgomery modular multiplication algorithm. In this architecture, the multiplicand and modulus words are allocated to each processing element rather than pipelined between the processing elements, as in the earlier architecture of Ç. Koç, and the multiplier bits are fed serially to the first processing element of the processor array every odd clock cycle. Moreover, the architecture was modified to reduce the critical path delay and area by replacing the two levels of carry-save adder (CSA) logic with a modified 4-to-2 CSA that uses only one level of dual-field adder (DFA) logic, taking advantage of the fact that the same processing element (PE) of the processor array processes two operand words. An ASIC implementation of the proposed architecture shows that it can perform a 1024-bit modular multiplication (for word size w = 32) in about 17.07 μs. The results also show that its area × time values are smaller than those of all existing designs by 11.6 % to 47.8 %, which makes it suitable for implementations where both area and performance are of concern. Moreover, its throughput is 1.8 % to 39.5 % higher than that of most published unified and scalable architectures; the exception is the architecture of Harris, whose throughput is slightly higher (4.5 %) than that of the proposed one.
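For readers unfamiliar with the underlying algorithm, the following is a minimal software sketch of radix-2 Montgomery multiplication over GF(p), first bit-serially and then with the operands split into w-bit words. It is not the paper's processor array: the function names, the word-scanning order, and the final conditional subtraction are illustrative assumptions, and the GF(2^n) half of the dual-field design is omitted.

```python
def mont_mul_radix2(x, y, m, n):
    """Bit-serial radix-2 Montgomery product x * y * 2^(-n) mod m.

    Assumes m is odd and 0 <= x, y < m < 2^n (hypothetical helper).
    """
    s = 0
    for i in range(n):
        s += ((x >> i) & 1) * y   # add multiplicand if multiplier bit is set
        if s & 1:                 # make the partial sum even
            s += m
        s >>= 1                   # exact division by 2
    return s - m if s >= m else s


def mont_mul_radix2_words(x, y, m, n, w=32):
    """Same product, with y, m and the partial sum held as w-bit words."""
    e = (n + w) // w              # words needed to hold n + 1 bits
    mask = (1 << w) - 1
    Y = [(y >> (w * j)) & mask for j in range(e)]
    M = [(m >> (w * j)) & mask for j in range(e)]
    S = [0] * e
    for i in range(n):
        xi = (x >> i) & 1
        # Parity of the low word decides whether the modulus is added.
        q = (S[0] + xi * Y[0]) & 1
        carry = 0
        for j in range(e):        # word-by-word accumulate with carry
            t = S[j] + xi * Y[j] + q * M[j] + carry
            S[j] = t & mask
            carry = t >> w
        # One-bit right shift across word boundaries.
        for j in range(e - 1):
            S[j] = (S[j] >> 1) | ((S[j + 1] & 1) << (w - 1))
        S[e - 1] = (S[e - 1] >> 1) | ((carry & 1) << (w - 1))
    s = sum(word << (w * j) for j, word in enumerate(S))
    return s - m if s >= m else s
```

Both functions return the product in the Montgomery domain, so multiplying the result by 2^n modulo m recovers x * y mod m. With n = 1024 and w = 32, the word-based version works on 33 words per operand, which is the granularity at which a word-based hardware design would distribute work across processing elements.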

Original language	English
Title of host publication	2013 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2013
Pages	153-158
Number of pages	6
DOIs	https://doi.org/10.1109/PACRIM.2013.6625466
State	Published - 2013
Event	14th IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, PACRIM 2013 - Vancouver, BC, Canada
Duration	27 Aug 2013 → 29 Aug 2013

Publication series

Name	IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings

Conference

Conference	14th IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, PACRIM 2013
Country/Territory	Canada
City	Vancouver, BC
Period	27/08/13 → 29/08/13

Access to Document

10.1109/PACRIM.2013.6625466

Cite this

Ibrahim, A., Gebali, F., & Elsimary, H. (2013). New and improved word-based unified and scalable architecture for radix 2 Montgomery modular multiplication algorithm. In 2013 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2013 (pp. 153-158). Article 6625466 (IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings). https://doi.org/10.1109/PACRIM.2013.6625466

Ibrahim, Atef ; Gebali, Fayez ; Elsimary, Hamed. / New and improved word-based unified and scalable architecture for radix 2 Montgomery modular multiplication algorithm. 2013 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2013. 2013. pp. 153-158 (IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings).

@inproceedings{151952abcd924a33876d800bdfe8f8ac,

title = "New and improved word-based unified and scalable architecture for radix 2 {M}ontgomery modular multiplication algorithm",

abstract = "This paper presents a new and improved word-based processor array architecture for the unified and scalable radix-2 Montgomery modular multiplication algorithm. In this architecture, the multiplicand and modulus words are allocated to each processing element rather than pipelined between the processing elements, as in the earlier architecture of {\c C}. Ko{\c c}, and the multiplier bits are fed serially to the first processing element of the processor array every odd clock cycle. Moreover, the architecture was modified to reduce the critical path delay and area by replacing the two levels of carry-save adder (CSA) logic with a modified 4-to-2 CSA that uses only one level of dual-field adder (DFA) logic, taking advantage of the fact that the same processing element (PE) of the processor array processes two operand words. An ASIC implementation of the proposed architecture shows that it can perform a 1024-bit modular multiplication (for word size w = 32) in about 17.07 μs. The results also show that its area × time values are smaller than those of all existing designs by 11.6 \% to 47.8 \%, which makes it suitable for implementations where both area and performance are of concern. Moreover, its throughput is 1.8 \% to 39.5 \% higher than that of most published unified and scalable architectures; the exception is the architecture of Harris, whose throughput is slightly higher (4.5 \%) than that of the proposed one.",

author = "Atef Ibrahim and Fayez Gebali and Hamed Elsimary",

year = "2013",

doi = "10.1109/PACRIM.2013.6625466",

language = "English",

isbn = "9781479915019",

series = "IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings",

pages = "153--158",

booktitle = "2013 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2013",

note = "14th IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, PACRIM 2013 ; Conference date: 27-08-2013 Through 29-08-2013",

}

Ibrahim, A, Gebali, F & Elsimary, H 2013, New and improved word-based unified and scalable architecture for radix 2 Montgomery modular multiplication algorithm. in 2013 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2013., 6625466, IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings, pp. 153-158, 14th IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, PACRIM 2013, Vancouver, BC, Canada, 27/08/13. https://doi.org/10.1109/PACRIM.2013.6625466

New and improved word-based unified and scalable architecture for radix 2 Montgomery modular multiplication algorithm. / Ibrahim, Atef; Gebali, Fayez; Elsimary, Hamed.
2013 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2013. 2013. p. 153-158 6625466 (IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings).

TY - GEN

T1 - New and improved word-based unified and scalable architecture for radix 2 Montgomery modular multiplication algorithm

AU - Ibrahim, Atef

AU - Gebali, Fayez

AU - Elsimary, Hamed

PY - 2013

Y1 - 2013

N2 - This paper presents a new and improved word-based processor array architecture for the unified and scalable radix-2 Montgomery modular multiplication algorithm. In this architecture, the multiplicand and modulus words are allocated to each processing element rather than pipelined between the processing elements, as in the earlier architecture of Ç. Koç, and the multiplier bits are fed serially to the first processing element of the processor array every odd clock cycle. Moreover, the architecture was modified to reduce the critical path delay and area by replacing the two levels of carry-save adder (CSA) logic with a modified 4-to-2 CSA that uses only one level of dual-field adder (DFA) logic, taking advantage of the fact that the same processing element (PE) of the processor array processes two operand words. An ASIC implementation of the proposed architecture shows that it can perform a 1024-bit modular multiplication (for word size w = 32) in about 17.07 μs. The results also show that its area × time values are smaller than those of all existing designs by 11.6 % to 47.8 %, which makes it suitable for implementations where both area and performance are of concern. Moreover, its throughput is 1.8 % to 39.5 % higher than that of most published unified and scalable architectures; the exception is the architecture of Harris, whose throughput is slightly higher (4.5 %) than that of the proposed one.

UR - https://www.scopus.com/pages/publications/84889017576

U2 - 10.1109/PACRIM.2013.6625466

DO - 10.1109/PACRIM.2013.6625466

M3 - Conference contribution

AN - SCOPUS:84889017576

SN - 9781479915019

T3 - IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings

SP - 153

EP - 158

BT - 2013 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2013

T2 - 14th IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, PACRIM 2013

Y2 - 27 August 2013 through 29 August 2013

ER -

Ibrahim A, Gebali F, Elsimary H. New and improved word-based unified and scalable architecture for radix 2 Montgomery modular multiplication algorithm. In 2013 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2013. 2013. p. 153-158. 6625466. (IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings). doi: 10.1109/PACRIM.2013.6625466
