# Recent Advances in the Design of Efficient-yet-Linear Watt-class mmWave CMOS PAs

(Invited Paper)

Ritesh Bhat, Anandaroop Chakrabarti and Harish Krishnaswamy

Department of Electrical Engineering, Columbia University, New York, NY 10027

Abstract—Technology scaling, which has enabled operation at mmWave frequencies in CMOS, has also brought with it a steady decrease in the breakdown voltages of the transistors. The resulting fall in the output power of power amplifiers (PAs), combined with the low efficiency due to poor device gain and lossy passive components, has led to research efforts towards efficient-yet-linear, high-output-power PAs at mmWave frequencies. This paper discusses key advances toward this goal, namely device stacking, mmWave switch-mode operation, large-scale power combining and linearization techniques capable of supporting multi-Gbps complex modulations efficiently.

## I. INTRODUCTION

Aggressive scaling of CMOS processes has enabled integrated circuits to operate in the mmWave regime (30-300 GHz). For a power amplifier (PA) designer, while the shrinking device size and resulting increase in transistor speed (related to their  $f_T$  which is scaling rapidly) are highly desirable, the decreasing breakdown voltages of the devices severely limit the maximum possible output power of PAs in these technologies. Another challenge is to overcome the lower PA efficiencies at these frequencies due to low available gain of the transistors (related to their  $f_{max}$  which is scaling slowly) and the poor quality of on-chip passive components.

This paper reviews device stacking, an active voltage-summing technique in which a series stack of devices operates from a proportionally larger supply voltage by distributing the enhanced output swing equally across the devices in the stack [1]–[4]. Recent trends toward inherently high-efficiency switching PA classes such as Class-E and Class-D$^{-1}$ at RF frequencies have improved efficiencies, since these classes eliminate the coexistence of voltage across and current through the devices. We describe a loss-aware Class-E design methodology [3], [5] that enables the implementation of stacked Class-E-like PAs in bulk and SOI CMOS processes at mmWave frequencies with output powers in the 17-20 dBm range and efficiencies in the 20-35% range. A large-scale low-loss power combiner is used to combine eight 4-stacked PAs on-chip to produce 26-27 dBm (0.4-0.5 W) across 33-46 GHz [6].

Switching PA classes are inherently nonlinear and hence cannot by themselves support complex modulation schemes. Moreover, complex modulation schemes are often characterized by high peak-to-average power ratios (PAPRs). It is therefore important for a PA or transmitter architecture to maintain high efficiency under back-off in order to handle the modulated waveform efficiently. For instance, Class A amplifiers typically exhibit $PAE_{-6dB}/PAE_{peak} < 25\%$, while Class B PAs can only reach up to 50%. To achieve better performance, architectural solutions such as outphasing [7] and Doherty [8] have been proposed with varying degrees of success. We describe a recent linearizing architecture for direct digital-to-mmWave conversion which simultaneously achieves high output power, high efficiency under back-off and linearity [6].

Fig. 1. Maximum possible communication range for 45 GHz and 60 GHz links for the peak output power ($P_{out}$) permitted by the FCC [9] as a function of antenna gain. The EIRP limitations on the emerging 45 GHz band are assumed to be equal to the 60 GHz band limitations.
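The Class A and Class B back-off figures quoted in the introduction can be related to the textbook efficiency-versus-power behavior of ideal linear amplifiers. The following is a minimal sketch, assuming ideal Class A drain efficiency proportional to output power and ideal Class B drain efficiency proportional to output amplitude. The function name `backoff_efficiency_ratio` is hypothetical, and the result is a drain-efficiency ratio; PAE also depends on gain and is typically somewhat lower.

```python
import math


def backoff_efficiency_ratio(backoff_db: float, pa_class: str) -> float:
    """Ratio of ideal drain efficiency at back-off to ideal peak drain efficiency.

    Assumes idealized amplifiers (an assumption, not a measured result):
      Class A: efficiency scales linearly with output power.
      Class B: efficiency scales linearly with output amplitude.
    """
    relative_power = 10.0 ** (-backoff_db / 10.0)
    if pa_class == "A":
        return relative_power
    if pa_class == "B":
        return math.sqrt(relative_power)
    raise ValueError("pa_class must be 'A' or 'B'")


ratio_a = backoff_efficiency_ratio(6.0, "A")
ratio_b = backoff_efficiency_ratio(6.0, "B")
print(f"Class A: {100 * ratio_a:.1f}% of peak, Class B: {100 * ratio_b:.1f}% of peak")
```

Under these idealizations, 6 dB of back-off leaves Class A at about a quarter and Class B at about half of the peak efficiency, which is the motivation for the architectural solutions discussed in this paper.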

## II. Q-BAND HIGH DATA-RATE BACKHAUL APPLICATIONS

The 60 GHz band consists of 7 GHz of spectrum allocated for license-free operation. Furthermore, the equivalent isotropic radiated power (EIRP) limit in this band was recently expanded by the FCC to $82\,\text{dBm} - 2\times(51\,\text{dBi} - G_{ant})$, where $G_{ant}$ is the antenna gain in dBi. According to the FCC report and order [9], RF exposure levels in the near field and on the antenna surface may increase as the size of the antenna decreases. Hence, the maximum EIRP is reduced as the antenna gain falls below 51 dBi. Despite this expansion, the high specific atmospheric absorption due to oxygen at 60 GHz (10-15 dB/km), in addition to the Friis transmission loss, makes long-range communication challenging (the loss over 100 m at 60 GHz is 108 dB).
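The 108 dB figure quoted for a 100 m link at 60 GHz can be checked with the Friis free-space path-loss expression. The sketch below is illustrative only; the function names are hypothetical, and the 15 dB/km absorption value is simply the upper end of the range quoted in the text.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def free_space_path_loss_db(freq_hz: float, dist_m: float) -> float:
    """Friis free-space path loss between isotropic antennas, in dB."""
    return 20.0 * math.log10(4.0 * math.pi * dist_m * freq_hz / SPEED_OF_LIGHT)


def total_path_loss_db(freq_hz: float, dist_m: float, absorption_db_per_km: float) -> float:
    """Free-space loss plus a distance-proportional atmospheric absorption term."""
    return free_space_path_loss_db(freq_hz, dist_m) + absorption_db_per_km * dist_m / 1000.0


fspl_60 = free_space_path_loss_db(60e9, 100.0)
total_60 = total_path_loss_db(60e9, 100.0, 15.0)  # 15 dB/km: upper end of the quoted range
print(f"FSPL at 60 GHz over 100 m: {fspl_60:.1f} dB")
print(f"Including absorption:      {total_60:.1f} dB")
```

Over 100 m the absorption term adds only 1.5 dB, but it grows linearly with distance and therefore dominates over kilometer-scale backhaul links.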

The 45 GHz band is another candidate that merits evaluation. Working groups are already investigating standardization in this band [10], [11]. The main advantage of the 45 GHz band is that it does not suffer from the severe atmospheric attenuation of the 60 GHz band: its specific attenuation is only 0.3 dB/km. A link-budget calculation is performed to compare the performance of the 60 GHz and 45 GHz bands for backhaul applications. The EIRP limitations on the 45 GHz band are assumed to be the same as those of the 60 GHz band. Due to the FCC EIRP rules, as the antenna gain increases, the *maximum* transmitter $P_{out}$ that satisfies the *maximum* EIRP rule also rises. Fig. 1 shows the maximum allowable $P_{out}$ and the corresponding maximum range of communication at 45 GHz and 60 GHz as a function of antenna gain. The 45 GHz band achieves ranges that are around one order of magnitude higher than its 60 GHz counterpart across the board. This analysis further indicates that efficient-yet-linear, watt-class PAs are necessary for long-distance mmWave backhaul.

Fig. 2. Stacked CMOS Class-E-like PA with annotated voltage swings and the circuit model used in the loss-aware Class-E design methodology.

Fig. 3. Schematics and chip photographs of (a) the 2-stacked and (b) the 4-stacked Q-band Class-E-like 45nm SOI CMOS PAs.
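The link-budget comparison of this section can be sketched numerically. The example below is a rough illustration and is not the calculation used to generate Fig. 1: the receiver sensitivity of -60 dBm, the use of the same antenna gain at both ends of the link, the clamping of the EIRP limit at 82 dBm above 51 dBi, and all function names are assumptions introduced here. Only the EIRP rule and the specific attenuation values come from the text.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def max_eirp_dbm(ant_gain_dbi: float) -> float:
    """EIRP limit from the text: 82 dBm, reduced by 2 dB per dB of gain below 51 dBi.
    Clamping at 82 dBm above 51 dBi is an assumption."""
    return 82.0 - 2.0 * max(0.0, 51.0 - ant_gain_dbi)


def max_pout_dbm(ant_gain_dbi: float) -> float:
    """Largest transmitter output power that still meets the EIRP limit."""
    return max_eirp_dbm(ant_gain_dbi) - ant_gain_dbi


def path_loss_db(freq_hz: float, dist_m: float, absorption_db_per_km: float) -> float:
    """Friis free-space loss plus atmospheric absorption."""
    fspl = 20.0 * math.log10(4.0 * math.pi * dist_m * freq_hz / SPEED_OF_LIGHT)
    return fspl + absorption_db_per_km * dist_m / 1000.0


def max_range_m(freq_hz: float, ant_gain_dbi: float, absorption_db_per_km: float,
                rx_sensitivity_dbm: float = -60.0) -> float:
    """Distance at which the received power falls to the (assumed) sensitivity.
    Solved by bisection on a logarithmic distance grid between 1 m and 10,000 km."""
    budget_db = max_eirp_dbm(ant_gain_dbi) + ant_gain_dbi - rx_sensitivity_dbm
    lo, hi = 1.0, 1.0e7
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        if path_loss_db(freq_hz, mid, absorption_db_per_km) < budget_db:
            lo = mid
        else:
            hi = mid
    return lo


gain_dbi = 40.0
range_45 = max_range_m(45e9, gain_dbi, 0.3)   # 0.3 dB/km at 45 GHz (from the text)
range_60 = max_range_m(60e9, gain_dbi, 15.0)  # upper end of 10-15 dB/km at 60 GHz
print(f"Max Pout at {gain_dbi:.0f} dBi: {max_pout_dbm(gain_dbi):.1f} dBm")
print(f"Range at 45 GHz: {range_45 / 1000:.1f} km, at 60 GHz: {range_60 / 1000:.1f} km")
```

Note that the EIRP rule reduces to a maximum $P_{out}$ of (antenna gain - 20) dBm below 51 dBi, so a 51 dBi antenna permits 31 dBm, which is in the watt class. With the assumed receiver parameters the 45 GHz range comes out roughly an order of magnitude larger than the 60 GHz range, in line with the trend described for Fig. 1.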

## III. DEVICE STACKING AND SWITCH-MODE OPERATION

Series stacking of devices in PAs enables the use of higher supply voltages by distributing the voltage stress equally amongst the stacked devices [12], [13]. The key aspect of stacking is that all devices in the stack must have the same voltage swing between their corresponding junctions so that they operate identically. Assuming the load is kept constant, the higher voltage swing results in a quadratic increase in output power with the number of stacked devices. The concept has been demonstrated for quasi-linear mmWave PAs [2].
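The quadratic scaling of output power with stack height can be illustrated with a short calculation. This is a simplified sketch assuming a sinusoidal output whose peak amplitude equals the per-device swing times the number of stacked devices, delivered into a fixed resistive load. The function name and the example values (1 V per device, 50 Ω) are hypothetical.

```python
def stacked_pa_scaling(n_stack: int, v_swing_per_device: float, r_load: float):
    """Return (output power in W, peak load current in A) for an idealized stack.

    Assumes the total peak output swing is n_stack * v_swing_per_device and a
    sinusoidal waveform into a fixed load resistance.
    """
    v_out_peak = n_stack * v_swing_per_device
    p_out = v_out_peak ** 2 / (2.0 * r_load)
    i_peak = v_out_peak / r_load
    return p_out, i_peak


p1, i1 = stacked_pa_scaling(1, 1.0, 50.0)
p4, i4 = stacked_pa_scaling(4, 1.0, 50.0)
print(f"Power ratio (4-stack vs single device):   {p4 / p1:.1f}")
print(f"Current ratio (4-stack vs single device): {i4 / i1:.1f}")
```

The same calculation shows that the current through the stack grows linearly with the number of devices when the load is fixed, which is one reason device size and current stress grow with stack height.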

Series stacking in the context of switching-class PAs is explored in [3], [4], where moderate output powers (17-20 dBm) with high efficiency (20-35%) have been demonstrated in scaled CMOS technologies. The concept of stacking in switching PAs is shown in Fig. 2. The devices higher up in the stack turn on and off due to the swing of the intermediary nodes. The topmost drain is loaded with an output network that is designed based on Class-E principles, and consequently sustains a Class-E-like voltage waveform. The intermediary drain nodes must also sustain Class-E-like voltage swings with appropriately scaled amplitudes so that the voltage stress is shared equally among all devices. Appropriate voltage swing may be induced at the intermediary nodes through techniques such as inductive tuning [14], capacitive charging acceleration [15], and placement of Class-E load networks at intermediary nodes [4]. The swing at each gate is induced through capacitive coupling from the corresponding source and drain nodes via $C_{gs}$ and $C_{gd}$, respectively, and is controlled through the gate capacitor. The DC biases of all gates are applied through large resistors. A comprehensive "loss-aware Class-E design methodology" [5] has been developed to aid in the design of stacked Class-E-like PAs at mmWave frequencies. The methodology uses an analysis that includes loss in the device stack, the input power required to drive the bottom-most device, passive losses in the output tuning network, and the finite choke inductance to find the load condition for optimal PAE for a given Class-E tuning. This serves as a starting point from which further simulation-based optimization can be performed. The number of devices that can be stacked is limited by secondary breakdown mechanisms, such as drain-bulk junction breakdown in the top device in bulk CMOS processes and buried-oxide breakdown in SOI CMOS processes, as well as by practical considerations such as the increase in device size and device current stress with the number of stacked devices if the load is kept constant [3].

Fig. 4. Large-scale quarter-wave lumped power combining architecture.

Fig. 5. Chip photograph of the 45nm SOI CMOS watt-class PA array prototype.

Two prototype stacked Class-E PAs are demonstrated in [3]: a 2-stacked PA (Fig. 3(a)), which delivers a saturated output power of 17.6 dBm at 35% peak PAE at 47 GHz, and a 4-stacked PA (Fig. 3(b)), which delivers a saturated output power of 20.3 dBm at 20% peak PAE at 47.5 GHz.

## IV. LARGE-SCALE LOW-LOSS POWER COMBINING

Device stacking has enabled power levels of up to 20 dBm, which were formerly only reached at mmWave frequencies in CMOS through passive power combining. Once the limits of stacking are reached, as described in the previous section, passive power combining can be exploited to approach 1 watt.

Fig. 6. Comparison of PAs described in this work to other state-of-the-art CMOS mmWave PAs.

Large-scale low-loss power combining on-chip is challenging in multiple respects. The combiner structure must be compact while allowing a large number of elements (8 or more to approach 1 watt of output power) to be combined. It must also be inherently symmetric in order to ensure constructive combination of the PA unit-cell output powers. Transformer-based series power combining is limited by the asymmetry due to parasitic interwinding capacitances, which cause the output voltages to combine non-constructively and can also cause instability [16]. An *n*-way combiner formed by cascading several 2:1 Wilkinson combiners has poor efficiency due to its cascaded structure. The zero-degree combiner [17], [18] is a cascaded structure whose efficiency is a function of the impedance transformation performed by each stage.

An *n*-way lumped quarter-wave combiner structure, shown in Fig. 4, which is essentially an *n*-way Wilkinson power combiner without the isolation resistors, is proposed in [6]. One-step *n*-way Wilkinson combiners are challenging because low-loss high-$Z_0$ transmission lines that satisfy electromigration constraints are difficult to achieve in a CMOS BEOL. The quarter-wavelength transmission lines in a Wilkinson combiner are replaced by lumped spiral-inductor equivalents, since any two-port passive network can be approximated with a lumped $\pi$ network at a single frequency. In order to achieve quarter-wavelength behavior with characteristic impedance $Z_0 = 50\sqrt{n}$ at a frequency $\omega_0$, the spiral must have an inductance of $L = 50\sqrt{n}/\omega_0$ and equal parasitic capacitances on either side of $C = 1/(50\omega_0\sqrt{n})$. For instance, an 8-way combiner at 45 GHz, which requires quarter-wave transmission lines with $Z_0 \approx 141.4\,\Omega$, requires each spiral to have an inductance of 500 pH and parasitic capacitances of 25 fF. These values are easily achievable in a low-loss manner on-chip, whereas a low-loss transmission line with the same characteristic impedance which satisfies electromigration constraints is impossible to realize. The key insight is that spirals can attain higher $Z_0$ values than transmission lines due to their magnetic self-coupling. Loss is also reduced, as spirals are able to use wider line widths than narrow high-$Z_0$ transmission lines. The maximum number of elements that can be combined is only a function of the self-resonance frequency (SRF) of the spirals (in turn linked to electromigration constraints and loss of the spirals) and layout floor-planning considerations. Up to 16 elements may be combined based on the achievable SRF in the 45nm SOI CMOS BEOL. An 8-way quarter-wave lumped combiner was implemented in [6] and has a simulated and measured efficiency of 75% at 45 GHz. In comparison, an 8-way 3-stage cascade of 2:1 Wilkinson combiners shows a simulated efficiency of only 63%.

Fig. 7. Chip block diagram of the 45nm SOI CMOS 3-bit digital to mmWave PA array prototype at 42.5 GHz.

Fig. 8. Simulated power spectral density of the 3-bit digital to mmWave PA array with an ideal phase modulator supporting 802.11ad OFDM waveforms.
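The element values quoted for the 8-way combiner follow directly from the design equations for the lumped quarter-wave $\pi$ section. The sketch below evaluates $Z_0 = 50\sqrt{n}$, $L = Z_0/\omega_0$ and $C = 1/(Z_0\omega_0)$; the function name is hypothetical and ideal, lossless elements are assumed.

```python
import math


def lumped_quarter_wave_elements(n_ways: int, freq_hz: float, r_load: float = 50.0):
    """Return (Z0 in ohms, L in henries, C in farads) for one branch of an
    n-way lumped quarter-wave combiner driving a load of r_load ohms.

    Uses Z0 = r_load * sqrt(n), L = Z0 / w0, and C = 1 / (Z0 * w0) for the
    shunt capacitance on each side of the series inductor.
    """
    z0 = r_load * math.sqrt(n_ways)
    w0 = 2.0 * math.pi * freq_hz
    return z0, z0 / w0, 1.0 / (z0 * w0)


z0, inductance, capacitance = lumped_quarter_wave_elements(8, 45e9)
print(f"Z0 = {z0:.1f} ohm, L = {inductance * 1e12:.0f} pH, C = {capacitance * 1e15:.0f} fF")
```

For eight branches at 45 GHz this reproduces the values given in the text, approximately 141.4 Ω, 500 pH and 25 fF.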

Eight PA unit-cells that exploit aggressive device stacking are combined using the low-loss combiner described above (Fig. 5). Each PA unit-cell is a cascade of the 2-stacked Class-E-like PA followed by the 4-stacked Class-E-like PA described in the previous section. The result is a record-breaking mmWave CMOS PA which delivers 0.5 W (27 dBm) of output power and maintains 1 dB flatness in saturated output power (26-27 dBm) from 33 to 46 GHz [6]. The measured PAE varies between 8.8% and 10.7% in this range.

Fig. 6 compares this work with existing state-of-the-art CMOS mmWave PAs. It can be seen that the prototypes described in this work outperform the rest in terms of output power as well as PAE.

## V. LINEARIZED POWER DACS

Implementing PAs in scaled CMOS technologies has the benefit of being able to integrate high-speed digital processing on-chip to enhance their operation. Recent trends toward direct digital-to-RF conversion are motivated by the reconfigurability offered by their highly digital nature, which enables handling different modulation formats and bandwidths of operation. Digital linearizing architectures also allow the use of nonlinear but efficient switching PAs while still supporting complex modulation formats in a linear manner. However, direct digital-to-mmWave conversion remains largely unexplored.

A mmWave digitally-controlled load-modulated linearizing architecture (Fig. 7) for direct digital-to-mmWave conversion, which uses large-scale power combining for high output power, supply switching for high back-off efficiency, and load modulation for linearity and enhanced back-off efficiency, was introduced in [6]. Several ($n$) Class-E-like mmWave PAs are combined using a non-isolating combiner. Each Class-E-like PA can be turned off using a unique digital control through a supply switch to save DC power under back-off. The $n$-way combiner is the lumped quarter-wave combiner described before, which exhibits an interesting load-modulation behavior. Assume that $n - m$ PAs are off and $m$ PAs are on. Each PA is designed to present a short-circuit output impedance to the combiner when off. The $\lambda/4$-equivalent branch transforms this short circuit to an open circuit at the combining point. Consequently, the impedance seen by each of the $m$ on PAs is $Z_0^2/(m \times 50)$ ($= 200/m$ in the implementation shown in Fig. 7). Switching-class PAs are essentially voltage-source-like PAs whose output power is inversely proportional to load resistance. Consequently, the output power of each PA is given by $P_{unit} \propto V_{DD}^2/(200/m) \propto m$ and the total output power is given by $P_{out} \propto m^2$, making the output amplitude linear with $m$. A 3-bit direct digital-to-mmWave DAC prototype using this architecture, shown in Fig. 7, is fabricated in 45 nm SOI CMOS. It has a measured saturated output power of 23.4 dBm, $PAE_{-6dB}/PAE_{peak} = 67.7\%$, as well as a DNL < 0.5 LSB and INL < 1 LSB using an end-point fit. Modulated measurements of the supply-switched unit cell reveal turn-on/off times of 200-250 ps while supporting 1 Gbps OOK modulation [19] (modulation rate limited by measurement equipment), enabling support for Gbps modulation rates in the 3-bit direct digital-to-mmWave DAC.
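The load-modulation argument can be checked with a small idealized model. In the sketch below, each on PA is treated as an ideal voltage source whose output power is $V_{DD}^2/R$, off PAs are ideal short circuits, and the combiner is lossless. The value $Z_0 = 100\,\Omega$ is inferred here from the $200/m$ figure quoted above rather than stated in the text, $V_{DD} = 1$ V is arbitrary, and the function names are hypothetical.

```python
import math


def load_seen_by_on_pa(m_on: int, z0: float, r_load: float = 50.0) -> float:
    """Resistance presented to each on PA when m_on branches are active.
    The m_on branches share the load, so each sees m_on * r_load at the
    combining point, which the quarter-wave branch inverts to z0^2 / (m_on * r_load)."""
    return z0 ** 2 / (m_on * r_load)


def relative_output_amplitude(m_on: int, z0: float = 100.0,
                              v_dd: float = 1.0, r_load: float = 50.0) -> float:
    """Output voltage amplitude (up to a constant factor) with m_on PAs on."""
    if m_on == 0:
        return 0.0
    p_unit = v_dd ** 2 / load_seen_by_on_pa(m_on, z0, r_load)  # per-PA power, grows with m_on
    p_total = m_on * p_unit                                   # total power, grows with m_on^2
    return math.sqrt(p_total)


amplitudes = [relative_output_amplitude(m) for m in range(8)]  # 3-bit control: m = 0..7
for m, amp in enumerate(amplitudes):
    print(f"m = {m}: load per PA = "
          f"{'open' if m == 0 else f'{load_seen_by_on_pa(m, 100.0):.1f} ohm'}, "
          f"amplitude / amplitude(m=1) = {amp / amplitudes[1]:.2f}")
```

In this idealized model the normalized amplitude equals $m$ exactly; the measured DNL and INL quantify how closely the fabricated prototype follows this behavior.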

The measured nonlinearity across digital control word settings and the measured rise/fall times of the unit-cell PAs were incorporated in a study to determine the feasibility of supporting 802.11ad OFDM waveforms with this prototype. Although 802.11ad targets the 60 GHz band, this study assumes that similar restrictions apply to the 45 GHz band as well. An ideal phase modulator is also assumed. The simulated power spectral density across phase-modulator resolution is shown in Fig. 8. The sampling frequency used is 5.28 GHz. It is observed that a phase-modulator resolution greater than 4 bits is sufficient to satisfy the spectral mask. This is easily achievable in scaled CMOS technology nodes and demonstrates the utility of the 3-bit direct digital-to-mmWave DAC in supporting multi-Gbps modulation standards. The spectral images in the PSD, which are spaced at multiples of the sampling frequency from the carrier, can also violate the spectral mask if they are not sufficiently filtered by the output network of the PA. However, techniques such as on-chip FIR filtering using an early/late PA combination as in [20] can be used to suppress these images.

## VI. CONCLUSION AND ACKNOWLEDGMENTS

While significant advances have been made towards efficient-yet-linear watt-class mmWave PAs in CMOS, several challenges remain. Power DACs that simultaneously achieve high resolution and high back-off efficiency are of interest. The use of power DAC transmitters for emerging applications such as mmWave MIMO is also an exciting research avenue.

This work was funded by the DARPA ELASTx program. The authors thank Drs. Sanjay Raman, Dev Palmer, Paul Watson and Rick Worley for helpful discussions.

## REFERENCES

- [1] A. Balteanu *et al.*, "A 2-bit, 24 dBm, millimeter-wave SOI CMOS power-DAC cell for watt-level high-efficiency, fully digital m-ary QAM transmitters," *IEEE JSSC*, vol. 48, no. 5, pp. 1126–1137, May 2013.
- [2] H. Dabag et al., "Analysis and design of stacked-FET millimeter-wave power amplifiers," *IEEE T-MTT*, vol. 61, no. 4, pp. 1543–1556, April 2013.
- [3] A. Chakrabarti *et al.*, "High-Power High-Efficiency Class-E-Like Stacked mmWave PAs in SOI and Bulk CMOS: Theory and Implementation," *IEEE T-MTT*, vol. 62, no. 8, pp. 1686–1704, Aug 2014.
- [4] A. Chakrabarti *et al.*, "Dual-output stacked class-EE power amplifiers in 45nm SOI CMOS for Q-band applications," in 2012 IEEE CSICS, Oct 2012, pp. 1–4.
- [5] A. Chakrabarti *et al.*, "An improved analysis and design methodology for RF class-E power amplifiers with finite DC-feed inductance and switch on-resistance," in 2012 IEEE ISCAS, May 2012, pp. 1763–1766.
- [6] R. Bhat *et al.*, "Large-scale power-combining and linearization in watt-class mmWave CMOS power amplifiers," in 2013 IEEE RFIC Symp., June 2013, pp. 283–286.
- [7] D. Zhao et al., "A 60-GHz Outphasing Transmitter in 40-nm CMOS," IEEE JSSC, vol. 47, no. 12, pp. 3172–3183, Dec 2012.
- [8] A. Agah *et al.*, "A 45GHz Doherty power amplifier with 23% PAE and 18dBm output power, in 45nm SOI CMOS," in 2012 IEEE IMS, June 2012, pp. 1–3.
- [9] FCC. (2013, Aug.) Revision of Part 15 of the Commission's rules regarding operation in the 57-64 GHz band. FCC-13-112A1.pdf. [Online]. Available: http://hraunfoss.fcc.gov/edocs\_public/attachmatch/
- [10] W. Xing et al. (2012, Nov.) 11aj 45GHz link budget for use cases discussion. 11-12-1320-00-00aj-45ghz-link-budget-for-use-cases-discussion.pptx. [Online]. Available: https://mentor.ieee.org/802.11/dcn/12/
- [11] H. Wang et al. (2012, Sep.) Data rate and spectrum requirements for IEEE 802.11aj (45 GHz). 11-12-1196-01-00aj-data-rate-and-spectrum-requirements-for-ieee-802-11aj-45-ghz.pptx. [Online]. Available: https://mentor.ieee.org/802.11/dcn/12/
- [12] A. Ezzeddine *et al.*, "The high voltage/high power FET (HiVP)," in 2003 IEEE RFIC Symp., June 2003, pp. 215–218.
- [13] J. McRory et al., "Transformer coupled stacked FET power amplifiers," *IEEE JSSC*, vol. 34, no. 2, pp. 157–161, Feb 1999.
- [14] A. Mazzanti *et al.*, "Analysis of reliability and power efficiency in cascode class-E PAs," *IEEE JSSC*, vol. 41, no. 5, pp. 1222–1229, May 2006.
- [15] O. Lee *et al.*, "A Charging Acceleration Technique for Highly Efficient Cascode Class-E CMOS Power Amplifiers," *IEEE JSSC*, vol. 45, no. 10, pp. 2184–2197, Oct 2010.
- [16] J.-W. Lai et al., "A 1V 17.9dBm 60GHz power amplifier in standard 65nm CMOS," in 2010 IEEE ISSCC, Feb 2010, pp. 424–425.
- [17] W. Tai *et al.*, "A 0.7W fully integrated 42GHz power amplifier with 10% PAE in 0.13  $\mu$ m SiGe BiCMOS," in 2013 IEEE ISSCC, 2013, pp. 142–143.
- [18] B. Martineau *et al.*, "A 53-to-68GHz 18dBm power amplifier with an 8-way combiner in standard 65nm CMOS," in 2010 IEEE ISSCC, Feb 2010, pp. 428–429.
- [19] A. Chakrabarti *et al.*, "Design considerations for stacked Class-E-like mmWave high-speed power DACs in CMOS," in 2013 IEEE IMS, June 2013, pp. 1–4.
- [20] J. Chen *et al.*, "A digitally modulated mm-Wave cartesian beamforming transmitter with quadrature spatial combining," in 2013 IEEE ISSCC, Feb 2013, pp. 232–233.