

# Low Power Asynchronous MLST Pipeline Template

Sisira S Nair PG Student, Karpagam College of Engineering.

Abstract- Because asynchronous circuits have no clock circuitry, which often consumes a large portion of dynamic power, they are expected to consume less power than synchronous circuits. They also offer a new design style for high-performance, low-power applications. To automate the design of asynchronous circuits and thereby reduce design effort, such as the implementation of handshake protocols, asynchronous templates have been widely used. This paper presents a new low-power asynchronous multi-level single-track (MLST) pipeline template in which components communicate through a handshaking protocol. Compared with other templates, the proposed template can achieve higher throughput and reduced area, as well as high performance and optimized power.

*Keywords*- handshake protocol, low power, MLST pipeline.

#### I. INTRODUCTION

Synchronous design uses a clock signal to synchronise state updates throughout the system. In asynchronous design there is no global clock to synchronise the system, and all blocks are driven by data [9]. In a synchronous circuit, the clock signal controls the exact moment at which the latches sample the input data. To guarantee that the data is stable during sampling, the clock period must account for the worst-case delay, including skew and all physical clock variations. Asynchronous circuits are those in which circuit elements communicate through handshaking instead of a global clock. The activity of each stage is driven by data, which brings advantages such as reduced power consumption, absence of clock distribution and therefore no clock skew, average-case rather than worst-case performance, reduced timing issues, natural adaptation to several properties, and improved EMI.

Ms. Deepika R, M.E., Assistant Professor, Karpagam College of Engineering.

Among the many asynchronous design styles, template-based pipeline designs have demonstrated very high performance. Template-based design also avoids the need to create, optimize, and then verify the specifications of complex distributed controllers, a task that is difficult and error-prone.

Asynchronous designs can be grouped by the handshaking protocol they use to communicate. Two distinct protocols are commonly used for asynchronous circuits: the 2-phase and the 4-phase protocol. Compared with 2-phase handshaking, 4-phase handshaking is slower and consumes more power, especially on long wires, but it generally results in simpler and less expensive circuits. Single-track handshaking combines the advantages of both the 2-phase and 4-phase protocols by using a single wire to both send and acknowledge the data [6].
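The difference between the three protocols can be made concrete by counting handshake-wire events per transferred token. The following is a minimal sketch, assuming an idealized request/acknowledge channel; the wire names `req`, `ack`, and `data` are illustrative labels and are not taken from any of the cited templates.

```python
# Idealized handshake-wire events needed to transfer ONE token.
# Each entry is (wire, action); the wire names are illustrative assumptions.
PROTOCOLS = {
    "4-phase": [("req", "rise"), ("ack", "rise"), ("req", "fall"), ("ack", "fall")],
    "2-phase": [("req", "toggle"), ("ack", "toggle")],
    # Single-track: the sender raises the wire, the receiver lowers the same wire.
    "single-track": [("data", "rise"), ("data", "fall")],
}

def summarize(events):
    """Return (number of distinct wires, transitions per token)."""
    wires = {wire for wire, _ in events}
    return len(wires), len(events)

for name, events in PROTOCOLS.items():
    wires, transitions = summarize(events)
    print(f"{name:13s} wires={wires} transitions/token={transitions}")
```

Under these assumptions, single-track handshaking needs as few transitions as the 2-phase protocol while, like the 4-phase protocol, returning the wire to its idle level after every token.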

This paper evaluates various templates used for asynchronous communication. The evaluation is carried out by implementing logic in the evaluation block of the multi-level single-track pipeline template. The multiple levels of logic per pipeline stage in the multi-level single-track template offer a new trade-off between performance and area. The design can achieve high throughput and low power consumption.

#### II. PIPELINE TEMPLATES

Several templates exist, based on different handshaking protocols. Handshaking can be single-track or double-track [2], as shown in Fig. 1. Single-track handshaking requires only one wire to send and acknowledge the data.





Fig.1 Types of handshaking

This section gives a brief introduction to a few asynchronous templates that are based on these protocols.

**WCHB-** Fig. 2 shows a dual-rail buffer implementation called the weak-condition half buffer (WCHB) [3]. The left and right channels are denoted L and R respectively, 0 and 1 denote the zero and one rails, and e is the enable signal (enable high means data is ready and low means acknowledge). After reset, L0, L1, R0, and R1 are low, and Le and Re are high. Data arrives when one of the left inputs (Lx) rises; this sets Sx low, which in turn drives the corresponding output Rx high and the left enable Le low. The left channel then lowers Lx, and the right channel receives the data on Rx and lowers Re. The buffer then raises Le and lowers Rx. The cycle completes only when the right channel reasserts Re.



Fig.2 WCHB Template

WCHB requires many stacked PMOS transistors. This drawback makes WCHB slower than other templates, so it is not recommended.
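The WCHB handshake cycle can be traced with a small behavioural model. This is only a sketch under stated assumptions: each output rail is modelled as a C-element of Lx and Re, Le is modelled as the NOR of the two output rails, and the internal node Sx is abstracted away. The class and function names are hypothetical.

```python
def c_element(a, b, prev):
    """Muller C-element: output follows the inputs when they agree, else holds."""
    return a if a == b else prev

class WCHB:
    """Behavioural dual-rail WCHB sketch (internal node Sx is abstracted away)."""

    def __init__(self):
        self.R = [0, 0]  # R0 and R1 are low after reset

    def update(self, L, Re):
        # A rail is set when Lx and Re are both high, cleared when both are low.
        self.R = [c_element(L[x], Re, self.R[x]) for x in (0, 1)]
        Le = int(not any(self.R))  # Le is low while an output rail is high
        return tuple(self.R), Le

buf = WCHB()
trace = [
    buf.update(L=(0, 1), Re=1),  # L1 rises: R1 rises and Le falls
    buf.update(L=(0, 0), Re=1),  # left channel lowers L1: buffer holds R1
    buf.update(L=(0, 0), Re=0),  # right channel lowers Re: R1 falls, Le rises
    buf.update(L=(0, 0), Re=1),  # right channel reasserts Re: reset state again
]
print(trace)
```

Each entry of `trace` is `((R0, R1), Le)` after the corresponding environment action, so the final entry matches the reset condition described in the text.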

**PCHB-** The pre-charged half buffer (PCHB) uses a four-phase 1-of-*N* handshaking protocol, in which the data value is encoded by which data rail is driven high, and a separate acknowledgment tells the sender when the data rail can be reset. The block-level template of the PCHB is shown in Fig. 3 (a). Each pipeline stage has both an input and an output completion detection unit, denoted LCD and RCD respectively, to indicate the validity and neutrality of the signals. The evaluation block (F) contains a single level of domino logic with two control inputs, 'en' and 'pc'.



Fig.3 (a) PCHB template

To analyse the operation of a PCHB-based 4-bit full adder circuit, a full adder unit is implemented in each functional block using Tanner EDA, as shown in Fig. 3 (b). The input bits A and B are given to the LCD, and the sum and carry outputs are given to the RCD. Both the LCD and the RCD are connected to an inverting C-element, which produces the Ack signal. The Ack signal goes low when both LCD and RCD are valid, indicating that the output data has been consumed by the next stage.
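The way the Ack signal follows the two completion detectors can be sketched in a few lines. This is an illustrative model, not the Tanner EDA circuit: it assumes dual-rail `(rail0, rail1)` encoding, models each completion detector as "all channels valid", and assumes Ack is high after reset. All names are hypothetical.

```python
def bit_valid(rails):
    """A 1-of-N code word is valid when exactly one rail is high."""
    return sum(rails) == 1

def completion(channels):
    """Completion detector sketch: 1 when every channel carries valid data."""
    return int(all(bit_valid(r) for r in channels))

class InvertingCElement:
    def __init__(self):
        self.out = 1  # assumed reset value: Ack high

    def update(self, a, b):
        if a == b:            # inputs agree: output is their inverse
            self.out = 1 - a
        return self.out       # inputs differ: previous value is held

NEUTRAL, ZERO, ONE = (0, 0), (1, 0), (0, 1)  # dual-rail (rail0, rail1)
ack = InvertingCElement()
steps = [
    ([ONE, ZERO], [NEUTRAL, NEUTRAL]),         # A, B valid; sum/carry not yet
    ([ONE, ZERO], [ONE, ZERO]),                # outputs valid: Ack falls
    ([NEUTRAL, NEUTRAL], [ONE, ZERO]),         # inputs reset: Ack is held
    ([NEUTRAL, NEUTRAL], [NEUTRAL, NEUTRAL]),  # outputs pre-charged: Ack rises
]
acks = [ack.update(completion(ins), completion(outs)) for ins, outs in steps]
print(acks)
```

The held value in the third step is what makes the C-element suitable here: Ack changes only once both the input side and the output side agree.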





Fig. 3 (c) Output waveform- Full adder

Fig.4 PCFB

**PCFB-** The pre-charged full buffer (PCFB) is more concurrent than the PCHB because its L and R handshakes reset in parallel, at the cost of an additional state variable. Fig. 4 shows the PCFB template.

**MLD-** The multilevel domino (MLD) template in Fig. 5 also uses the four-phase handshaking protocol, but it is not quasi-delay-insensitive (quasi-delay-insensitive circuits are a class of almost delay-insensitive asynchronous circuits whose operation is invariant to the delays of the circuit's wires and elements). In the basic multilevel domino template, the circuit is divided into pipeline stages. Each stage consists of potentially multiple levels of domino logic controlled by a single controller that communicates with other controllers through handshaking. The first K-1 levels of domino logic are controlled by an 'en' signal, while the last level has its pre-charge controlled by the 'pc' signal and its evaluation controlled by a separate 'go' signal. This structure allows the first K-1 levels of domino logic to evaluate earlier than the last level. A completion detection unit exists for each output, and all the validity signals are combined by an AND-gate tree, which indicates to the next pipeline stage that all input data are valid. When the next pipeline stage also receives the output valid signal from its pre-charged AND gate, it acknowledges the sender. This template targets medium-to-high performance applications and trades throughput and timing robustness for lower area.



Fig. 5 MLD template

**SSTFB-** The single-track full buffer (STFB) uses single-track handshaking, in which the sender sends data by driving the data wire high and the receiver acknowledges by driving the same wire low [3]. An STFB buffer is shown in Fig. 6 (a). When one of the N inputs (Li) is driven high, the corresponding NAND gate drives Si low, which drives both the respective Ri and A high. A going high resets Li low, enabling the left channel to send new data. At the same time, Ri going high drives B low, which restores Si high and prevents the NAND gates from re-evaluating even if a new token arrives.
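The token flow through an STFB buffer can be illustrated with a single-rail behavioural sketch. It is an abstraction under stated assumptions: one rail stands in for the 1-of-N channel, and the internal signals S, A, and B appear only as comments. The names `SingleTrackWire` and `stfb_step` are hypothetical.

```python
class SingleTrackWire:
    """One single-track wire: the sender only raises it, the receiver only lowers it."""

    def __init__(self):
        self.level = 0

    def send(self):
        if self.level:
            raise RuntimeError("previous token not yet acknowledged")
        self.level = 1

    def acknowledge(self):
        if not self.level:
            raise RuntimeError("no token to acknowledge")
        self.level = 0

def stfb_step(left, right):
    """Forward one token if the input is high and the output wire is free (B high)."""
    if left.level == 1 and right.level == 0:  # NAND fires: S goes low
        right.send()                          # R is driven high, so B goes low
        left.acknowledge()                    # A high resets L low
        return True
    return False                              # B low blocks re-evaluation

L, R = SingleTrackWire(), SingleTrackWire()
L.send()
moved_first = stfb_step(L, R)    # token moves from L to R
L.send()                         # the left channel may already send again
moved_second = stfb_step(L, R)   # blocked: R still holds the first token
R.acknowledge()                  # the downstream receiver consumes the token
moved_third = stfb_step(L, R)    # now the second token moves
print(moved_first, moved_second, moved_third)
```

The blocked second step mirrors the role of B in the text: while the output wire is still high, a new token on the input cannot trigger another evaluation.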



Fig. 6(a) 1-of-N STFB Buffer

When not actively driving the wire, the STFB sender and receiver tri-state it, which makes it susceptible to crosstalk noise. Static single-track handshaking (SSTFB, Fig. 6 (b)) addresses this: the receiver is responsible for holding the data wire high until it sends the acknowledgment, and the sender is responsible for holding the data wire low until it is ready to drive it high again to send new data [4]. In the static single-track handshake protocol the data wire is therefore never tri-stated. The buffer uses a 2-input NOR gate for right evaluation and a NAND gate for resetting the data inputs.



Fig. 6 (b) SSTFB dual rail Buffer

In medium- to low-performance applications, the relatively high pipeline control overhead makes SSTFB designs area- and power-inefficient.



**MLST-** The multi-level single-track (MLST) template uses multiple levels of logic per pipeline stage and was first introduced in 2011 [8]. Another single-track template, which supports four levels of logic, was introduced subsequently in [7]. While that template improves area efficiency compared with earlier single-track templates, it is still homogeneous in nature, which limits its flexibility. MLST follows the 2-phase static single-track protocol, which results in less switching and lower power than the other templates. It gives flexibility in trading performance against area overhead, and the reduced number of pipeline stages improves area efficiency. The MLST block diagram is shown in Fig. 7 (a). Its main parts are the logic blocks, the pre-charge completion detector (PCCD), the data path, and the controller.

[Block diagram labels: K-1 levels of domino logic, last level of domino logic, PCCD, controller, delay line; signals Lbit, Rbit, valid, go.]

Fig. 7 (a) MLST Block diagram

The PCCD [1] detects the validity of the outputs. It consists of an eight-input dynamic AND gate that detects the validity of eight outputs and generates the valid wire V\_R. When all outputs are valid, it drives V\_R high using the SNOR gate inside the PCCD. V\_R in turn generates the go signal, which controls the last level of logic as well as the en/pc of the PCCD. The controller generates the acknowledgement signal, which controls the evaluation and pre-charge of the intermediate stages.
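The validity detection performed by the PCCD can be sketched as a logical AND over eight outputs. This is a purely functional model, assuming dual-rail `(rail0, rail1)` outputs in which at most one rail is ever high; the dynamic gate, the SNOR gate, and the generation of go from V\_R are not modelled. The function names are hypothetical.

```python
def rail_valid(rails):
    """An output is valid once any of its rails is high (at most one is assumed high)."""
    return any(rails)

def pccd_valid(outputs):
    """Functional view of V_R: high only when all eight outputs are valid."""
    if len(outputs) != 8:
        raise ValueError("this sketch models an eight-input detector")
    return int(all(rail_valid(r) for r in outputs))

# Outputs become valid one at a time; V_R rises only after the last one.
outputs = [(0, 0)] * 8
history = []
for i in range(8):
    outputs[i] = (0, 1) if i % 2 else (1, 0)
    history.append(pccd_valid(outputs))
print(history)
```

In this model V\_R depends on the slowest output, which is why the arrival order of valid signals shapes the observed validity waveform.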

Here, MLST is evaluated by implementing a full adder unit in its evaluation block. An MLST-based 8-bit full adder and its output waveforms are shown in Fig. 7 (b) and Fig. 7 (c).
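The logic function placed in the evaluation block can be illustrated with a dual-rail full adder model. This is a functional sketch only: it assumes `(rail0, rail1)` encoding with `(0, 0)` as the neutral (pre-charged) state and a ripple-carry arrangement for the 8-bit case, neither of which is specified in the text. All function names are hypothetical.

```python
NEUTRAL = (0, 0)

def encode(bit):
    """Dual-rail encoding as (rail0, rail1)."""
    return (0, 1) if bit else (1, 0)

def decode(rails):
    if rails == (1, 0):
        return 0
    if rails == (0, 1):
        return 1
    raise ValueError("rails are neutral or invalid")

def full_adder(a, b, cin):
    """Dual-rail full adder: outputs stay neutral until every input is valid."""
    if NEUTRAL in (a, b, cin):
        return NEUTRAL, NEUTRAL
    total = decode(a) + decode(b) + decode(cin)
    return encode(total & 1), encode(total >> 1)

def add8(x, y):
    """8-bit ripple-carry addition built from dual-rail full adders."""
    carry = encode(0)
    result = 0
    for i in range(8):
        s, carry = full_adder(encode((x >> i) & 1), encode((y >> i) & 1), carry)
        result |= decode(s) << i
    return result, decode(carry)

print(add8(5, 3))
print(add8(200, 100))
```

Keeping the outputs neutral until all inputs are valid is what lets a completion detector downstream distinguish "not yet computed" from a valid 0 or 1.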







Fig. 7(b) MLST based 8 bit full adder

The output waveform shows that the validity signal decreases gradually from high to low, depending on the switching of the transistors in the PCCD, which in turn depends on when valid signals arrive at the corresponding transistors.



Fig. 7 (c) Output waveform- Full adder

#### IV. CONCLUSION

This paper discussed the merits and demerits of several pipeline templates used for asynchronous communication. The templates differ in the way handshaking is performed between circuit components. The MLST template presented here targets medium- to high-performance applications. The analysis also shows that MLST pipeline templates have less switching, and thus low power consumption, together with high throughput. Improving the existing design can also yield a new MLST template that consumes low power and minimum area.

#### V. REFERENCES

[1] P. Golani and P. A. Beerel, "Area-efficient asynchronous multilevel single-track pipeline template," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 22, no. 4, Apr. 2014.

[2] K. van Berkel and A. Bink, "Single-track handshake signalling with application to micropipelines and handshake circuits," in *Proc. IEEE Int. Symp. Asynchron. Circuits Syst.*, Mar. 1996, pp. 122–133.

[3] M. Ferretti, "Single-track asynchronous pipeline template," Ph.D. dissertation, Univ. Southern California, Los Angeles, CA, USA, 2004.

[4] P. Golani and P. A. Beerel, "High performance noise robust asynchronous circuits," in *Proc. IEEE Comput. Soc. Annu. Symp. Emerging VLSI Technol. Archit.*, Mar. 2006, pp. 173–178.

[5] A. M. Lines, "Pipelined asynchronous circuits," M.S. thesis, Dept. Comput. Sci., California Inst. Technol., Pasadena, CA, USA, 1995.

[6] K. S. Stevens, P. Golani, and P. A. Beerel, "Energy and performance models for synchronous and asynchronous communication," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 3, pp. 369–382, Mar. 2011.

[7] B. R. Sheikh and R. Manohar, "Energy efficient pipeline templates for high performance asynchronous circuits," *ACM J. Emerging Technol. Comput. Syst.*, vol. 7, no. 4, pp. 1–26, Dec. 2011.

[8] P. Golani and P. A. Beerel, "Area efficient multi level single track pipeline template," in *Proc. Design Autom. Test Eur. Conf. Exhibit.*, Mar. 2011, pp. 1509–1512.

[9] Kuan-Hsein Ho and Yao-Wen Chang, "A new asynchronous pipeline template for power and performance optimization," *ACM* 978-1-4503-2730-5, *Dec 14, San Francisco, CA, USA*.

[10] M. Ferretti and P. A. Beerel, "High performance asynchronous design using single track full buffer standard cells," *IEEE J. Solid-State Circuits*, vol. 41, no. 6, Jun. 2006.

[11] M. Prakash, "Library characterisation and static timing analysis of asynchronous circuits," Dept. Comput. Eng., Viterbi School of Eng., Univ. Southern California.

[12] C. Mannakkara, "Asynchronous pipeline controllers based on early acknowledgement protocol," Dept. Informatics, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies (SOKENDAI), Sep. 2010.