Lecture Notes Digital CMOS IC


EEC 118 Lecture #1:

MOSFET Overview

Rajeevan Amirtharajah
University of California, Davis
Jeff Parkhurst
Intel Corporation
Permissions to Use Conditions & Acknowledgment
• Permission is granted to copy and distribute this slide
set for educational purposes only, provided that the
complete bibliographic citation and following credit line
is included: "Copyright 2002 J. Rabaey et al."
Permission is granted to alter and distribute this
material provided that the following credit line is
included: "Adapted from (complete bibliographic
citation). Copyright 2002 J. Rabaey et al."
This material may not be copied or distributed for
commercial purposes without express written
permission of the copyright holders.
• Slides 13-17 Adapted from CSE477 VLSI Digital Circuits
Lecture Slides by Vijay Narayanan and Mary Jane Irwin,
Penn State University

Amirtharajah/Parkhurst, EEC 118 Spring 2011 2


Outline
• Administrative Details
• Survey of Digital IC Technology
• MOS Fabrication
• MOSFET Overview



Personnel
• Prof. Raj Amirtharajah (Instructor)
Office: 3173 Kemper Hall
Email: [email protected]
Please put EEC 118 in email subject line.
Office Hours: F 2 - 3 PM or by appointment.
• Travis Kleeburg
Email: [email protected]
Office Hours: (TBD)
• Erin Fong
Email: [email protected]
Office Hours: (TBD)

• Labs
Tuesdays 6 PM – 9 PM 2157/2161 Kemper
Wednesdays 5 PM – 8 PM 2157/2161 Kemper



Course Materials
• Textbook
Digital Integrated Circuits (2nd ed.)
by J. Rabaey, A. Chandrakasan, and B. Nikolic
• Suggested References
CMOS Digital Integrated Circuits (3rd ed.) Kang and Leblebici
CMOS VLSI Design (3rd ed.) Weste, Harris
• Handouts
Labs, lab report cover sheets, slides, and lecture notes available
on course web page in PDF format.
• Web Page
http://www.ece.ucdavis.edu/~ramirtha/EEC118/S11/S11.html
Linked from SmartSite



Grading

• Letter
• A: 100 - 90%
• B: 90 - 80%
• C: 80 - 70%
• D: 70 - 60%
• F: below 60%

• Expect class average to be around B- / C+


• Curving will only help you



Weighting
• Labs 15%
• Design Project 15%
• Weekly Homework 5%
Scale for each problem: 0 = poor effort, 1 = close, but
fundamental problem, 2 = correct
• Quizzes 10%
Four throughout the quarter (approx. every other week),
lowest score dropped (April 11, April 25, May 18, May 25)
• Midterm 20%
Monday, May 2, in class
• Final 35%
Monday, June 6, 8:00 - 10:00 AM
Cumulative, but emphasizes material after midterm

Labs and CAD Software Usage
• Need to know/learn Cadence/Spectre – Circuit
Simulation
• Use same breadboard as EEC 180A
• No unsupervised lab hours!
– TA or instructor must be present for your safety
and security of the lab equipment
– Extra lab hours will be added only in unusual
circumstances



Education Demand for Circuit Design
• Industry needs circuit designers
– Not just logic designers
• Must understand operation at transistor level
– Not just digital designers
• Must understand analog effects
– Not just analog designers
• Must be able to comprehend Deep Sub-Micron
(DSM) effects (<0.13um)
• Fundamental circuit knowledge critical
– Similar techniques for bipolar transistors, NMOS (even
relays and vacuum tubes!)
– Must be able to exploit nanoscale devices in future

Education Demand for System Design
• Industry needs system designers
– Need to understand system implications of your
design
• Power Delivery, Clock Loading – What do you need
– Need to design from the system point of view
• Communication protocol – how to effectively talk
with other blocks
• What should be added into your block to meet
system design requirements(i.e. comprehend soft
block methodology for optimization of area,
interconnect, etc.)

You must operate at both levels!


Historical Background

[Figure: graph showing the growing complexity of designing integrated circuits]

Memory, Processors and Graphics
• Used to be that memory and processors were the
two main design drivers.

Source: http://turquoise.wpi.edu/webcourse/ch01/ch01.html


Memory, Processors and Graphics
• We now have graphics also driving integration

[Figure: log-scale plot (1 to 1000) over 1H96 to 2H00 with two trend lines: graphics at 8x per 18 months, CPU at 2x per 18 months]

From ISPD 1999 Keynote Speech by Chris Malachowsky of NVIDIA

Hybrid to Monolithic Trend
• We continue to integrate multiple functions on a
single chip
– Mixture of Analog, Radio Frequency (RF), Digital
– Graphics/Motherboard chipset an example of this
• Cost and Performance driving market
– Higher performance achieved on chip than off chip
– Lower cost due to a single die versus multi-chip
design
– Saves on packaging, total area by eliminating
redundant functions
• System-on-a-Chip (SOC) concept



What are the issues facing the industry ?
• Growth of transistors is exponential
• Growth of operating frequency is (was?) exponential
– Reaching a limit due to power dissipation (see current
generation Pentiums and Itaniums)
• Complexity continues to grow
– Trend is toward multiple cores on one chip
– Design teams cannot keep up with trend
• Power dissipation a concern
– Power delivery, thermal issues, long term reliability
• Manufacturing providing us with lots of transistors
– How do we use them effectively (besides large caches)?

Why worry about power? Power Dissipation
Lead microprocessors' power continues to increase

[Figure: log-scale plot of power (watts, 0.1 to 100) vs. year (1971 to 2000) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium®, and P6]

Power delivery and dissipation will be prohibitive

Source: Borkar, De Intel®

Why worry about power? Chip Power Density
[Figure: log-scale plot of power density (W/cm2, 1 to 10000) vs. year (1970 to 2010) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium®, and P6, with reference levels marked for a hot plate, a nuclear reactor, a rocket nozzle, and the Sun’s surface. Annotation: "…chips might become hot…"]

Source: Borkar, De Intel®

Chip Power Density Distribution
[Figure: two die maps. Power Map: heat flux (W/cm2, 0 to 250). On-Die Temperature: temperature (C, 40 to 110)]

• Power density is not uniformly distributed across the chip
• Silicon not the best thermal conductor (isotopically pure
diamond is)
• Max junction temperature is determined by hot-spots
– Impact on packaging, cooling

Recent Battery Scaling and Future Trends

[Figure: battery (40+ lbs)]

• Battery energy density increasing 8% per year, demand
increasing 24% per year (Economist, January 6, 2005)

Why worry about power? Standby Power
Year                   2002   2005   2008   2011   2014
Power supply Vdd (V)   1.5    1.2    0.9    0.7    0.6
Threshold VT (V)       0.4    0.4    0.35   0.3    0.25

• Drain leakage will increase as VT decreases to maintain noise
margins and meet frequency demands, leading to excessive
battery draining standby power consumption.

[Figure: standby power as a percentage (0% to 50%) for 2000 to 2008, with data labels 12 W, 88 W, 400 W, 1.7 kW, and 8 kW. Annotation: "…and phones leaky!"]

Source: Borkar, De Intel®

Emerging Microsensor Applications
[Figure: example applications]
• Industrial Plants and Power Line Monitoring (courtesy ABB)
• Operating Room of the Future (courtesy John Guttag)
• Target Tracking & Detection (Courtesy of ARL)
• NASA/JPL sensorwebs
• Location Awareness: Websign (Courtesy of Mark Smith, HP)


Chip Design Styles
• Field-Programmable Gate Array (FPGA)
– Regular structure. Not all transistors are usable.
– Programmed via software (configurable wiring)
• Gate Array
– Regular structure. Higher usage of transistors than FPGA
– Two step manufacturing process.
• Diffusion and poly initially. Design must be fairly stable
• Metal layers fabricated once design is finalized
• Cell based design
– All transistors used (may have spares to fill in area)
– Each cell is fixed height so that they can be placed in rows
• Full Custom
– Highest level of compactness and performance
– Manually intensive. Not conducive to revision (ECO)

Logic Design Families
• Static CMOS Logic
– Good power delay product (energy)
– Good noise margin
– Not as fast as dynamic
• Dynamic Logic
– Very fast but inefficient in use of power
– Domino, CPL, OPL
• Pass Transistor Logic
– Poor noise margin
– Sometimes static power dissipation
– Less area than static CMOS

Design Parameters
• Reliability (Not dealt with when relating to layout)
– Factors that dictate reliable operation of the circuit
• Electromigration, thermal issues, hot electrons,
noise margins
• Performance (Dealt with in this class)
– Not just measured in clock speed. Power-Delay
Product (PDP, equivalent to energy) is a better
measure
• Area (Not dealt with when relating to layout)
– Directly affects cost



Current State of the Art
• Intel Core® @ 4 GHz (1 or 2 cores/chip going to 4+)
– 800 - 1066 MHz system bus
– AGP 8x graphics (533 MHz bus)
– Memory bus at 533 MHz (DDR)
• Complex Designs demand resources
– Design teams resource limited due to logistics and cost
– Cannot afford to miss issues due to cost of product
recall
– Emphasis on pre-silicon verification as opposed to
post-silicon testing

Modern Microprocessor
(> 100,000,000 transistors)
2003
[Figure: die photo with a one-centimeter scale bar]

Modern Multicore Microprocessor
(790,000,000 transistors)
2007
IBM POWER6
[Figure: die photo. Source: Reick et al., Hot Chips 19, 2007]

Moore’s Law



Expectations
• You should already know
– Solid State – (i.e. PN junctions, semiconductor
physics, ..)
• What we will cover
– MOS Transistors Fabrication and Equations
– CMOS logic at the transistor level
– Sequential logic
– Memory
– Arithmetic Circuits
– Interconnect
• Framework
– Course to use PowerPoint for the most part
– Bring PowerPoint slides to class and write notes on
them



MOS Transistor Types
• Rabaey Ch. 3 (Kang & Leblebici Ch. 3)
• Two transistor types (analogous to bipolar NPN, PNP)
– NMOS: p-type substrate, n+ source/drain, electrons are
charge carriers
– PMOS: n-type substrate, p+ source/drain, holes are
charge carriers
[Figure: NMOS and PMOS cross-sections. NMOS: gate above a P-substrate (bulk), between N+ source and N+ drain regions. PMOS: gate above an N-substrate (bulk), between P+ source and P+ drain regions]

MOS Transistor Symbols
[Figure: three alternative schematic symbols each for NMOS and PMOS, with terminals labeled D (drain), G (gate), S (source), and B (bulk)]

Note on MOS Transistor Symbols
• All symbols appear in literature
– Symbols with arrows are conventional in analog papers
– PMOS with a bubble on the gate is conventional in digital
circuits papers
• Sometimes bulk terminal is ignored – implicitly
connected to supply:

[Figure: NMOS and PMOS symbols drawn without the bulk terminal]

• Unlike physical bipolar devices, source and drain are
usually symmetric


MOS Transistor Structure
• Important transistor physical characteristics
– Channel length L = LD – 2xd (K&L L = Lgate – 2LD)
– Channel width W
– Thickness of oxide tox

[Figure: MOS transistor structure with W, L, tox, and xd labeled]

MOS Transistor Regions of Operation
• Three main regions of operation
• Cutoff: VGS < VT
No inversion layer formed, drain and source are
isolated by depleted channel. IDS ≈ 0
• Linear (Triode, Ohmic): VGS > VT, VDS < VGS-VT
Inversion layer connects drain and source.
Current is almost linear with VDS (like a resistor)
• Saturation: VGS > VT, VDS ≥ VGS-VT
Channel is “pinched-off”. Current saturates
(becomes independent of VDS, to first order).
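
The three conditions above can be sketched as a small classifier. This is an illustrative helper, not part of the course material: the function name `nmos_region` and the example voltages are assumptions, and it treats an NMOS device with the source as the reference terminal.

```python
def nmos_region(vgs: float, vds: float, vt: float) -> str:
    """Classify the operating region of an NMOS device (source as reference)."""
    if vgs < vt:
        return "cutoff"        # no inversion layer, ID ~ 0
    if vds < vgs - vt:
        return "linear"        # inversion layer spans source to drain
    return "saturation"        # channel pinched off at the drain end


# Hypothetical bias points with VT = 0.5 V
for vgs, vds in [(0.2, 1.0), (1.5, 0.3), (1.5, 2.0)]:
    print(f"VGS={vgs} V, VDS={vds} V -> {nmos_region(vgs, vds, vt=0.5)}")
```

The boundary VDS = VGS - VT is assigned to saturation, matching the "≥" in the saturation condition.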



Fabrication Process
• Substrate is grown and then cut
– Round silicon wafers are used
– Purity emphasized to prevent impurities from
affecting operation (99.9999% pure)
• Each layer deposited separately
• Some layers used as masks for later layers
• Planar process is important
– Requires minimum percent usage of metal to
ensure flatness



Silicon Substrate Manufacturing



Building a Golf Course with Similar Process

• Plane drops materials from the air


– Sand, then dirt, then grass seeds, then trees
– Certain masks applied during process to prevent material
from hitting particular areas
– For instance: After Sand, mask placed over areas where
sand trap will exist. Mask later taken off at end of process
to reveal sand trap.

Fabrication: Patterning of SiO2 Step I

• Grow SiO2 on Si by exposing to O2


– High temperature accelerates this process
• Cover surface with photoresist (PR)
– Sensitive to UV light (wavelength determines feature size)
– Positive PR becomes soluble after exposure
– Negative PR becomes insoluble after exposure

Fabrication: Patterning of SiO2 Step II

• Exposed PR removed with a solvent


• SiO2 removed by etching (HF – hydrofluoric acid)
• Remaining PR removed with another solvent



NMOS Transistor Fabrication

• Thick field oxide grown


• Field oxide etched to create area for transistor
• Gate oxide (high quality) grown



NMOS Transistor Fabrication

• Polysilicon deposited (doped to reduce resistance R)


• Polysilicon etched to form gate
• Gate oxide etched from source and drain
– Self-aligned process because source/drain aligned by
gate
• Si doped with donors to create n+ regions

NMOS Transistor Fabrication

• Insulating SiO2 grown to cover surface/gate


• Source/Drain regions opened
• Aluminum evaporated to cover surface
• Aluminum etched to form metal1 interconnects

Inverter Fabrication: Layout

• Inverter
– Logic symbol
– CMOS inverter circuit
– CMOS inverter layout (top view of lithographic
masks)

Inverter Fabrication: NWELL and Oxides

• N-wells created
• Thick field oxide grown surrounding active
regions
• Thin gate oxide grown over active regions



Inverter Fabrication: Polysilicon

• Polysilicon deposited
– Chemical vapor deposition (Places the Poly)
– Dry plasma etch (Removes unwanted Poly)



Inverter Fabrication: Diffusions

• N+ and P+ regions created using two masks


– Source/Drain regions
– Self-aligned process since gate is already fabricated
– Substrate contacts



Inverter Fabrication

• Insulating SiO2 deposited using chemical vapor
deposition (CVD)
• Source/Drain/Substrate contacts exposed


Inverter Fabrication

• Metal (Al, Cu) deposited using evaporation


• Metal patterned by etching
• Copper is current metal of choice due to low resistivity



NWELL MOS Process
• MOS transistors use PN junctions to isolate different
regions and prevent current flow.
• NWELL is used in P-substrate so that PMOS transistors are
isolated and don’t share currents.


More Complex Processes
• Twin Well CMOS Process
– Can help to avoid body effect
– Allows for Vt and channel transconductance tuning
– Requires extra processing steps (more costly)



Silicon-On-Insulator (SOI) Process
• Both transistors built on insulating substrate
– Allows for tight compaction of design area
– Some of the parasitic capacitances seen in bulk CMOS
disappear
– Wafer cost is high (IBM produces SOI, Intel doesn’t)



Accounting for VDSM Effects
• VDSM = Very Deep Sub Micron
– Effects significant below 0.25 μm (0.18 μm, 130 nm, 90
nm, 65 nm, 45 nm)
• Compensation made at the mask level
– OPC – Optical Proximity Correction
– Occurs when different mask layers don’t align properly
– Test structures are used to characterize the process
– Ability to adapt depends on the consistency of the error
from process run to process run



Accounting for VDSM Effects: OPC



Accounting for VDSM Effects: Example
• Example of 2D OPC effects: rounded edges,
narrowed lines

Uncorrected Corrected



Compensating for VDSM Effects: Masks

Layout Mask Silicon



Compensating for VDSM Effects: CAD
• Flow to compensate is transparent to layout designer
• Layout design proceeds as normal

Mentor Graphics Flow


http://www.mentor.com/calibre/datasheets/opc/html/



References
• “Design of VLSI Systems”. A web based course
located at: http://turquoise.wpi.edu/webcourse/
• “Simplified Rule Generation for Automated Rules-
Based Optical Enhancement”, Otto et al. On web
at:
http://www.jetlink.net/~ootto/bacus95/BACUS95Index.html
• Mark Anders and Jim Schantz of Intel Corporation
• Jan Rabaey, Lecture notes from his book “Digital
Integrated Circuits, A Design Perspective”



MOSFET Drain Current Overview

Saturation:
$$I_D = \frac{\mu C_{ox}}{2} \frac{W}{L} (V_{GS} - V_T)^2 (1 + \lambda V_{DS})$$

Linear (Triode, Ohmic):
$$I_D = \mu C_{ox} \frac{W}{L} \left( (V_{GS} - V_T) V_{DS} - \frac{V_{DS}^2}{2} \right)$$

Cutoff: $I_D \approx 0$

“Classical” MOSFET model, will discuss deep submicron
modifications as necessary (Rabaey, Eqs. 3.25, 3.29)
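
As a minimal sketch, the classical model can be evaluated numerically as follows. The function name `nmos_drain_current`, the parameter names, and the example values (k' = 100 µA/V², W/L = 10, VT = 0.5 V) are illustrative assumptions, not values from the lecture.

```python
def nmos_drain_current(vgs, vds, vt, k_prime, w_over_l, lam=0.0):
    """Classical long-channel NMOS drain current in amperes.

    k_prime is the product mu*Cox in A/V^2; lam is the channel-length
    modulation coefficient lambda in 1/V.
    """
    vov = vgs - vt                      # gate overdrive VGS - VT
    if vov <= 0:
        return 0.0                      # cutoff: ID ~ 0
    if vds < vov:                       # linear (triode, ohmic)
        return k_prime * w_over_l * (vov * vds - vds ** 2 / 2)
    # saturation, including the (1 + lambda*VDS) factor
    return 0.5 * k_prime * w_over_l * vov ** 2 * (1 + lam * vds)


# Hypothetical device: k' = 100 uA/V^2, W/L = 10, VT = 0.5 V
for vds in (0.25, 0.5, 1.0, 2.0):
    i_d = nmos_drain_current(1.5, vds, 0.5, 100e-6, 10)
    print(f"VDS = {vds:4.2f} V  ID = {i_d * 1e6:7.1f} uA")
```

With lam = 0 the linear and saturation expressions meet at VDS = VGS - VT, so the curve is continuous there; a nonzero lam introduces a small step at that boundary because the linear expression carries no (1 + λVDS) factor.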

A Fourth Region: Subthreshold
Subthreshold:
$$I_D = I_S \, e^{\frac{V_{GS}}{n \, kT/q}} \left( 1 - e^{-\frac{V_{DS}}{kT/q}} \right)$$
• Sometimes called “weak inversion” region
• When VGS near VT, drain current has an exponential
dependence on gate to source voltage
– Similar to a bipolar device
• Not typically used in digital circuits
– Sometimes used in very low power digital applications
– Often used in low power analog circuits, e.g. quartz
watches
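
The exponential dependence can be illustrated with a short sketch. The slope factor n = 1.5, the temperature, and the value of IS are placeholder assumptions chosen only for illustration; the swing expression n·(kT/q)·ln 10 follows directly from the exponential term in the subthreshold equation.

```python
import math

K_B_OVER_Q = 8.617333e-5   # Boltzmann constant divided by electron charge, V/K


def subthreshold_current(vgs, vds, i_s, n=1.5, temp_k=300.0):
    """ID = IS * exp(VGS / (n*kT/q)) * (1 - exp(-VDS / (kT/q)))."""
    v_th = K_B_OVER_Q * temp_k          # thermal voltage kT/q
    return i_s * math.exp(vgs / (n * v_th)) * (1.0 - math.exp(-vds / v_th))


def subthreshold_swing(n=1.5, temp_k=300.0):
    """Gate voltage change (V) needed for a 10x change in ID."""
    return n * K_B_OVER_Q * temp_k * math.log(10)


swing = subthreshold_swing()
print(f"swing = {swing * 1e3:.1f} mV/decade")

# Raising VGS by one swing multiplies the current by 10 (IS is hypothetical)
i1 = subthreshold_current(0.10, 1.0, i_s=1e-12)
i2 = subthreshold_current(0.10 + swing, 1.0, i_s=1e-12)
print(f"current ratio = {i2 / i1:.2f}")
```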

Next Topic: MOSFET Details

• MOS Structure

– Derivation of threshold voltage, drain current equations

• MOSFET Scaling

• MOSFET Capacitances



EEC 118 Lecture #2:
MOSFET Structure and Basic
Operation

Rajeevan Amirtharajah
University of California, Davis
Jeff Parkhurst
Intel Corporation
Outline
• Finish Lecture 1 Slides
• Switch Example
• MOSFET Structure
• MOSFET Regimes of Operation
• Scaling
• Parasitic Capacitances



NMOS Transistor I-V Characteristics I

• I-V curve vaguely resembles bipolar transistor curves


– Quantitatively very different
– Turn-on voltage called Threshold Voltage VT

NMOS Transistor I-V Characteristics II

• Drain current varies quadratically with gate-source
voltage VGS (in Saturation)


MOS Transistor Operation: Cutoff
• Simple case: VD = VS = VB = 0
– Operates as MOS capacitor (Cg = gate to channel)
– Transistor in cutoff region
• When VGS < VT0, depletion region forms
– No carriers in channel to connect S and D (Cutoff)
[Figure: NMOS cross-section in cutoff, with Vg < VT0 and Vs = Vd = VB = 0. A depletion region forms under the gate in the P-substrate between source and drain]

MOS Transistor Operation: Inversion
• When VGS > VT0, inversion layer forms
• Source and drain connected by conducting n-
type layer (for NMOS)
– Conducting p-type layer in PMOS

[Figure: NMOS cross-section in inversion, with Vg > VT0 and Vs = Vd = VB = 0. An inversion layer under the gate connects source and drain, with the depletion region in the P-substrate]

Threshold Voltage Components
• Four physical components of the threshold voltage
1. Work function difference between gate and channel
(depends on metal or polysilicon gate): ΦGC
2. Gate voltage to invert surface potential: -2ΦF
3. Gate voltage to offset depletion region charge:
QB/Cox
4. Gate voltage to offset fixed charges in the gate oxide
and oxide-channel interface: Qox/Cox

$C_{ox} = \frac{\varepsilon_{ox}}{t_{ox}}$ : gate oxide capacitance per unit area

Threshold Voltage Summary
• If VSB = 0 (no substrate bias):
$$V_{T0} = \Phi_{GC} - 2\phi_F - \frac{Q_{B0}}{C_{ox}} - \frac{Q_{ox}}{C_{ox}}$$ (K&L 3.20)

• If VSB ≠ 0 (non-zero substrate bias):
$$V_T = V_{T0} + \gamma \left( \sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|} \right)$$ (3.19)

• Body effect (substrate-bias) coefficient:
$$\gamma = \frac{\sqrt{2 q N_A \varepsilon_{Si}}}{C_{ox}}$$ (K&L 3.24)

• Threshold voltage increases as VSB increases!
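
A rough numerical sketch of the body effect for an NMOS device follows. The oxide thickness (10 nm), substrate doping (1e16 cm^-3), Fermi potential (-0.35 V), and VT0 (0.5 V) are assumed example values rather than numbers from the lecture; the permittivities are the usual textbook constants.

```python
import math

Q = 1.602e-19            # electron charge, C
EPS_0 = 8.854e-14        # vacuum permittivity, F/cm
EPS_SI = 11.7 * EPS_0    # silicon permittivity, F/cm
EPS_OX = 3.9 * EPS_0     # SiO2 permittivity, F/cm


def oxide_capacitance(t_ox_cm):
    """Cox = eps_ox / tox, in F/cm^2."""
    return EPS_OX / t_ox_cm


def body_effect_coefficient(n_a, c_ox):
    """gamma = sqrt(2*q*NA*eps_Si) / Cox, in V^0.5."""
    return math.sqrt(2 * Q * n_a * EPS_SI) / c_ox


def threshold_voltage(vt0, gamma, phi_f, vsb):
    """VT = VT0 + gamma*(sqrt(|-2*phiF + VSB|) - sqrt(|-2*phiF|))."""
    return vt0 + gamma * (
        math.sqrt(abs(-2 * phi_f + vsb)) - math.sqrt(abs(-2 * phi_f))
    )


# Assumed example: tox = 10 nm (1e-6 cm), NA = 1e16 cm^-3, phiF = -0.35 V
c_ox = oxide_capacitance(1e-6)
gamma = body_effect_coefficient(1e16, c_ox)
for vsb in (0.0, 0.5, 1.0):
    vt = threshold_voltage(0.5, gamma, -0.35, vsb)
    print(f"VSB = {vsb:.1f} V -> VT = {vt:.3f} V")
```

In a series stack, the upper device sees VSB > 0 at its source node, so its threshold rises above VT0 as the printed values suggest.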

Threshold Voltage (NMOS vs. PMOS)

                              NMOS       PMOS
Substrate Fermi potential     φF < 0     φF > 0
Depletion charge density      QB < 0     QB > 0
Substrate bias coefficient    γ > 0      γ < 0
Substrate bias voltage        VSB > 0    VSB < 0


Body Effect
• Body effect: Source-bulk voltage VSB affects threshold
voltage of transistor
– Body normally connected to ground for NMOS, Vdd
(Vcc) for PMOS
– Raising source voltage increases VT of transistor
– Implications on circuit design: series stacks of devices

[Figure: series stack of two NMOS transistors, A above B, with intermediate node Vx. If Vx > 0, then VSB(A) > 0 and VT(A) > VT0, while B keeps VT0]


Cutoff Region
[Figure: MOSFET cross-section with terminals VG, VS, VD, and VB. Only a depletion region lies under the gate between source and drain]

• For NMOS: VGS < VTN
• For PMOS: VGS > VTP
• Depletion region – no inversion
• Current between drain and source is 0
– Actually there is always some leakage (subthreshold)
current

Linear Region
• When VGS>VT, an inversion layer forms between drain and
source
• Current IDS flows from drain to source (electrons travel
from source to drain)
• Depth of channel depends on V between gate and channel
– Drain end narrower due to larger drain voltage
– Drain end depth reduces as VDS is increased
[Figure: NMOS cross-section in the linear region, with Vg > VT0, Vs = 0, Vd < VGS-VT0, and VB = 0. The channel (inversion layer) spans source to drain, and the depletion region is larger at the drain end]

Linear Region I/V Equation Derivation

• Gradual Channel Approximation:


– Assume dominant electric field in y-direction
– Current is constant along channel
• Integrate differential voltage drop dVc = IDdR along y

Linear Region I/V Equation

• Valid for continuous channel from Source to Drain

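$$I_D = \mu_n C_{ox} \frac{W}{L} \left[ (V_{GS} - V_T) V_{DS} - \tfrac{1}{2} V_{DS}^2 \right]$$

Device transconductance: $k_n = \mu_n C_{ox} \frac{W}{L}$
Process transconductance: $k'_n = \mu_n C_{ox}$

$$I_D = k'_n \frac{W}{L} \left[ (V_{GS} - V_T) V_{DS} - \tfrac{1}{2} V_{DS}^2 \right]$$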
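
The integration step of the gradual channel approximation can be sketched as follows. This assumes (the slides do not state it explicitly) that the mobile charge per unit area at position y is $|Q_n(y)| = C_{ox}(V_{GS} - V_c(y) - V_T)$, with the channel voltage $V_c$ running from 0 at the source (y = 0) to $V_{DS}$ at the drain (y = L).

```latex
% Resistance of a channel slice of length dy
dR = \frac{dy}{W \mu_n |Q_n(y)|}
   = \frac{dy}{W \mu_n C_{ox} \left( V_{GS} - V_c(y) - V_T \right)}

% Voltage drop across the slice
dV_c = I_D \, dR
\;\Rightarrow\;
I_D \, dy = W \mu_n C_{ox} \left( V_{GS} - V_c - V_T \right) dV_c

% Integrate along the channel; I_D is constant in y
\int_0^L I_D \, dy = W \mu_n C_{ox} \int_0^{V_{DS}} \left( V_{GS} - V_c - V_T \right) dV_c

I_D = \mu_n C_{ox} \frac{W}{L} \left[ (V_{GS} - V_T) V_{DS} - \tfrac{1}{2} V_{DS}^2 \right]
```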

Saturation Region
• When VDS = VGS - VT:
– No longer voltage drop of VT from gate to substrate at drain
– Channel is “pinched off”
• If VDS is further increased, no increase in current IDS
– As VDS increased, pinch-off point moves closer to source
– Channel between that point and drain is depleted
– High electric field in depleted region accelerates electrons
towards drain

[Figure: NMOS cross-section in saturation, with Vg > VT0, Vs = 0, Vd > VGS-VT0, and VB = 0. The channel ends at the pinch-off point, and the region between that point and the drain is depleted]

Saturation I/V Equation
• As drain voltage increases, channel remains
pinched off
– Channel voltage remains constant
– Current saturates (no increase with increasing VDS)
• To get saturation current, use linear equation with
VDS = VGS - VT

$$I_D = \frac{1}{2} \mu_n C_{ox} \frac{W}{L} (V_{GS} - V_{TN})^2$$
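
Written out, the substitution $V_{DS} = V_{GS} - V_{TN}$ into the linear-region equation goes as follows (a short worked step, using the same symbols as the equations in this section):

```latex
I_D = \mu_n C_{ox} \frac{W}{L}
      \left[ (V_{GS} - V_{TN})(V_{GS} - V_{TN}) - \tfrac{1}{2} (V_{GS} - V_{TN})^2 \right]
    = \tfrac{1}{2} \mu_n C_{ox} \frac{W}{L} (V_{GS} - V_{TN})^2
```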



MOS I/V Characteristics
• I/V curve for ideal MOS device
• VGS3 > VGS2 > VGS1

[Figure: drain current IDS vs. drain voltage VDS for VGS1, VGS2, and VGS3, with the linear and saturation regions marked]


Channel Length Modulation
• In saturation, pinch-off point moves
– As VDS is increased, pinch-off point moves closer to source
– Effective channel length becomes shorter
– Current increases due to shorter channel

$$L' = L - \Delta L$$

$$I_D = \frac{1}{2} \mu_n C_{ox} \frac{W}{L} (V_{GS} - V_{TN})^2 (1 + \lambda V_{DS})$$

λ = channel length modulation coefficient



MOS I/V Curve Summary
I/V curve for non-ideal NMOS device:

[Figure: drain current IDS vs. drain voltage VDS for VGS1, VGS2, and VGS3. The curve VDS = VGS-VT separates the linear and saturation regions; the saturation curves are drawn with channel-length modulation and without it (λ=0)]


MOS I/V Equations Summary
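Cutoff
  NMOS: VGS < VTN
  PMOS: VGS > VTP
  ⇒ $I_D = 0$

Linear
  NMOS: VGS ≥ VTN, VDS < VGS − VTN
  PMOS: VGS ≤ VTP, VDS > VGS − VTP
  ⇒ $I_D = \mu C_{ox} \frac{W}{L} \left[ (V_{GS} - V_T) V_{DS} - \tfrac{1}{2} V_{DS}^2 \right]$

Saturation
  NMOS: VGS ≥ VTN, VDS ≥ VGS − VTN
  PMOS: VGS ≤ VTP, VDS ≤ VGS − VTP
  ⇒ $I_D = \tfrac{1}{2} \mu C_{ox} \frac{W}{L} (V_{GS} - V_T)^2 (1 + \lambda V_{DS})$

Note: if VSB ≠ 0, need to recalculate VT from VT0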

MOSFET Scaling Effects

• Rabaey Section 3.5 (Kang & Leblebici Section 3.5)


• Scaling provides enormous advantages
– Scale linear dimension (channel length) by factor S > 1
– Better area density, yield, performance
• Two types of scaling
– Constant field scaling (full scaling)
• A’ = A/S^2; L’ = L/S; W’ = W/S; ID’ = ID/S; P’ = P/S^2;
Vdd’ = Vdd/S
• Power Density P’/A’ = stays the same
– Constant voltage scaling
• A’ = A/S^2; L’ = L/S; W’ = W/S; ID’ = ID*S; P’ = P*S;
Vdd’ = Vdd
• Power Density P’/A’ = S^3 * (P/A) (Reliability issue)
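
The two sets of scaling rules can be compared with a short sketch. The helper name `scale_device` and the normalized starting values are assumptions for illustration; the scaling factors are taken directly from the bullets above.

```python
def scale_device(params, s, mode):
    """Apply the first-order scaling rules to a dict of device values.

    params needs keys 'area', 'length', 'width', 'current', 'power', 'vdd'.
    mode is 'constant_field' or 'constant_voltage'; s > 1 is the factor S.
    """
    scaled = {
        "area": params["area"] / s ** 2,
        "length": params["length"] / s,
        "width": params["width"] / s,
    }
    if mode == "constant_field":
        scaled["current"] = params["current"] / s
        scaled["power"] = params["power"] / s ** 2
        scaled["vdd"] = params["vdd"] / s
    elif mode == "constant_voltage":
        scaled["current"] = params["current"] * s
        scaled["power"] = params["power"] * s
        scaled["vdd"] = params["vdd"]
    else:
        raise ValueError(f"unknown scaling mode: {mode}")
    scaled["power_density"] = scaled["power"] / scaled["area"]
    return scaled


# Normalized starting point (all quantities 1.0), scaled by S = 2
base = dict(area=1.0, length=1.0, width=1.0, current=1.0, power=1.0, vdd=1.0)
for mode in ("constant_field", "constant_voltage"):
    result = scale_device(base, 2.0, mode)
    print(mode, "power density x", result["power_density"])
```

For S = 2 the power density is unchanged under constant field scaling and grows by S^3 = 8 under constant voltage scaling.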

Short Channel Effects
• As geometries are scaled down
– VT (effective) goes lower
– Effective channel length decreases
– Sub-threshold Ids occurs
• Current goes from drain to source while Vgs < Vt
– Tox is scaled which can cause reliability problems
• Can’t handle large Vg without hot electron effects
– Changes the Vt when carriers imbed themselves
in the oxide
– Interconnects scale
• Electromigration and ESD become issues



MOSFET Capacitances
• Rabaey Section 3.3 (Kang & Leblebici Section 3.6)
• Oxide Capacitance
– Gate to Source overlap
– Gate to Drain overlap
– Gate to Channel
• Junction Capacitance
– Source to Bulk junction
– Drain to Bulk junction



Oxide Capacitances: Overlap

[Figure: MOSFET cross-section showing the drawn gate length Ldrawn between source and drain and the overlap xd]

• Overlap capacitances
– Gate electrode overlaps source and drain regions

– xd is overlap length on each side of channel


– Leff = Ldrawn – 2xd (effective channel length)
– Overlap capacitance:
CGSO = CGDO = CoxWxd (assume xd equal on both sides)

Total Oxide Capacitance
• Total capacitance consists of 2 components
– Overlap capacitance
– Channel capacitance
[Figure: MOSFET cross-section with capacitances Cgs, Cgd, and Cgb drawn from the gate to source, drain, and bulk]
• Cutoff:
– No channel connecting to source or drain

– CGS = CGD = CoxWxd

– CGB = CoxWLeff

– Total Gate Capacitance = CG = CoxWL



Oxide Capacitances: Channel
• Linear mode
– Channel spans from source to drain
– Channel Capacitance split equally between S and D
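$$C_{GS} = \tfrac{1}{2} C_{ox} W L_{eff} \qquad C_{GD} = \tfrac{1}{2} C_{ox} W L_{eff} \qquad C_{GB} = 0$$
– Total Gate capacitance CG = CoxWL
• Saturation regime
– Channel is pinched off. Channel capacitance:
$$C_{GD} = C_{ox} W x_d \qquad C_{GS} = \tfrac{2}{3} C_{ox} W L_{eff} + C_{ox} W x_d \qquad C_{GB} = 0$$
– Total Gate capacitance:
$$C_G = \tfrac{2}{3} C_{ox} W L_{eff} + 2 C_{ox} W x_d$$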
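
The region-dependent split of the oxide capacitance can be tabulated with a small sketch. The function name `gate_capacitances` and the normalized example values are assumptions; the expressions follow the slide formulas as written, including the fact that the linear-mode expressions list only the channel share.

```python
def gate_capacitances(region, c_ox, w, l_eff, x_d):
    """Return (CGS, CGD, CGB) following the slide formulas.

    c_ox is capacitance per unit area; w, l_eff, and x_d share one length unit.
    """
    overlap = c_ox * w * x_d        # CGSO = CGDO = Cox*W*xd
    channel = c_ox * w * l_eff      # full gate-to-channel capacitance
    if region == "cutoff":
        return overlap, overlap, channel
    if region == "linear":
        # Slide expressions list only the channel share in this region.
        return channel / 2, channel / 2, 0.0
    if region == "saturation":
        return 2 * channel / 3 + overlap, overlap, 0.0
    raise ValueError(f"unknown region: {region}")


# Normalized example: Cox = 1, W = 1, Leff = 0.8, xd = 0.1 (Ldrawn = 1.0)
for region in ("cutoff", "linear", "saturation"):
    cgs, cgd, cgb = gate_capacitances(region, 1.0, 1.0, 0.8, 0.1)
    print(f"{region:10s} CGS={cgs:.3f} CGD={cgd:.3f} CGB={cgb:.3f}")
```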

Oxide Capacitances: Channel

[Figure: plot of total gate capacitance Cg,total with no overlap (xd = 0)]


Junction Capacitance

Reverse-biased P-N junctions!


Capacitance depends on reverse-bias voltage.



Junction Capacitance

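For a P-N junction:
$$C_j = \frac{A}{2} \sqrt{\frac{2 q \varepsilon}{V_0 - V} \, \frac{N_d N_a}{N_d + N_a}}$$

If V=0, cap/area =
$$C_{j0} = \sqrt{\frac{q \varepsilon_{Si}}{2 V_0} \, \frac{N_d N_a}{N_d + N_a}}$$

General form:
$$C_j = \frac{A C_{j0}}{\left( 1 - \frac{V}{V_0} \right)^m}$$

m = grading coefficient (0.5 for abrupt junctions)
(0.3 for graded junctions)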
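
A minimal sketch of the general form, showing how the capacitance falls as the reverse bias grows. The function name and the normalized example values (A = 1, Cj0 = 1, V0 = 0.9 V) are illustrative assumptions.

```python
def junction_capacitance(area, c_j0, v, v0, m=0.5):
    """Cj = A*Cj0 / (1 - V/V0)**m, with V negative for reverse bias."""
    if v >= v0:
        raise ValueError("formula only holds for V < V0")
    return area * c_j0 / (1.0 - v / v0) ** m


# Normalized example: A = 1, Cj0 = 1, V0 = 0.9 V
for v in (0.0, -1.0, -2.7):
    abrupt = junction_capacitance(1.0, 1.0, v, 0.9, m=0.5)
    graded = junction_capacitance(1.0, 1.0, v, 0.9, m=0.3)
    print(f"V = {v:5.1f} V  abrupt = {abrupt:.3f}  graded = {graded:.3f}")
```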

Junction Capacitance
• Junction with substrate
– Bottom area = W * LS (length of drain/source)
– Total cap = Cj
• Junction with sidewalls
– “Channel-stop implant”
– Perimeter = 2LS + W
– Area = P * Xj
– Total cap = Cjsw
• Total junction cap C = Cj + Cjsw



Junction Capacitance
• Voltage Equivalence Factor
– Creates an average capacitance value for a
voltage transition, defined as ΔQ/ΔV

$C_{eq} = \frac{-A C_{j0} V_0}{(V_2 - V_1)(1 - m)}\left[\left(1 - \frac{V_2}{V_0}\right)^{1-m} - \left(1 - \frac{V_1}{V_0}\right)^{1-m}\right] = A K_{eq} C_{j0}$

$K_{eq} = \frac{-2\sqrt{V_0}}{V_2 - V_1}\left(\sqrt{V_0 - V_2} - \sqrt{V_0 - V_1}\right)$ (abrupt junction only)

$C_{db} = A K_{eq} C_{j0} + P X_j K_{eq,sw} C_{jsw0}$
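
As a quick check that the abrupt-junction expression is the m = 0.5 case of the general one, here is a hedged sketch. The function names are hypothetical; V1 and V2 are junction voltages, negative for reverse bias.

```python
import math

def k_eq(v1, v2, v0, m=0.5):
    """Voltage equivalence factor from the general C_eq expression."""
    scale = -v0 / ((v2 - v1) * (1.0 - m))
    return scale * ((1.0 - v2 / v0) ** (1.0 - m) - (1.0 - v1 / v0) ** (1.0 - m))

def k_eq_abrupt(v1, v2, v0):
    """Closed form for an abrupt junction (m = 0.5)."""
    return -2.0 * math.sqrt(v0) / (v2 - v1) * (math.sqrt(v0 - v2) - math.sqrt(v0 - v1))
```

For a reverse-biased transition both functions return a value between 0 and 1, reflecting that the average capacitance is below the zero-bias value.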

Amirtharajah/Parkhurst, EEC 118 Spring 2011 39


Example: Junction Cap
• Consider the following NMOS device
– Substrate doping: NA = 10^15 cm^-3
– Source/drain doping: ND = 2 x 10^20 cm^-3
– Channel-stop doping: 10X substrate doping
– Drain length LD = 1um
– Transistor W = 10um
– Junction depth Xj = 0.5um, abrupt junction

• Find capacitance of drain-bulk junction when drain voltage = 3V
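
One way to set up this example numerically is sketched below. Several values are assumptions not given on the slide: room-temperature kT/q = 0.0259 V, intrinsic concentration ni = 1.45e10 cm^-3, silicon permittivity 11.7 times the vacuum value, and a grounded bulk so the junction sees 3 V of reverse bias. The sidewall perimeter follows the slide's 2L + W convention.

```python
import math

Q = 1.602e-19              # electron charge, C
EPS_SI = 11.7 * 8.854e-14  # silicon permittivity, F/cm (assumed)
KT_Q = 0.0259              # thermal voltage at about 300 K, V (assumed)
NI = 1.45e10               # intrinsic carrier concentration, cm^-3 (assumed)

def built_in_voltage(na, nd):
    return KT_Q * math.log(na * nd / NI**2)

def cj0_per_area(na, nd):
    v0 = built_in_voltage(na, nd)
    return math.sqrt(EPS_SI * Q / 2.0 * (na * nd / (na + nd)) / v0)

NA, ND = 1e15, 2e20        # substrate and source/drain doping, cm^-3
NA_SW = 10 * NA            # channel-stop doping
W, LD, XJ = 10e-4, 1e-4, 0.5e-4  # dimensions in cm
V = -3.0                   # junction voltage (reverse bias, bulk assumed at 0 V)

# Bottom junction: area W * LD against the substrate
v0_bottom = built_in_voltage(NA, ND)
c_bottom = W * LD * cj0_per_area(NA, ND) / math.sqrt(1.0 - V / v0_bottom)

# Sidewall junction: perimeter (2*LD + W) times depth XJ against channel-stop
v0_side = built_in_voltage(NA_SW, ND)
c_side = (2 * LD + W) * XJ * cj0_per_area(NA_SW, ND) / math.sqrt(1.0 - V / v0_side)

c_total = c_bottom + c_side
print(f"Cdb at 3 V reverse bias: {c_total * 1e15:.2f} fF")
```

The result is the small-signal capacitance at a fixed 3 V bias; a large-signal average over a voltage swing would use the Keq factor instead.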

Amirtharajah/Parkhurst, EEC 118 Spring 2011 40


Next Topic: Inverters

• Inverter Characteristics

– Transfer functions, noise margins, resistive and


nonlinear loads

• CMOS Inverters

Amirtharajah/Parkhurst, EEC 118 Spring 2011 41


EEC 116 Lecture #3:
CMOS Inverters
MOS Scaling

Rajeevan Amirtharajah
University of California, Davis
Jeff Parkhurst
Intel Corporation
Outline
• Review: Inverter Transfer Characteristics
• Lecture 3: Noise Margins, Rise & Fall Times,
Inverter Delay
• CMOS Inverters: Rabaey 1.3.2, 5 (Kang &
Leblebici, 5.1-5.3 and 6.1-6.2)

Amirtharajah, EEC 116 Fall 2011 3


Review: Inverter Voltage Transfer Curve
Voltage transfer curve (VTC): plot of output voltage
Vout vs. input voltage Vin

Ideal digital inverter:
– When Vin=0, Vout=Vdd
– When Vin=Vdd, Vout=0
– Sharp transition region

[Figure: inverter block (Vin, Inverter, Vout) and VTC plot of Vout vs. Vin from 0V to Vdd, comparing ideal and actual curves]
Amirtharajah, EEC 116 Fall 2011 4
Review: Actual Inverter Output Levels
• VOH and VOL represent the “high” and “low” output voltages of the inverter
• VOH = output voltage when Vin = ‘0’ (V Output High)
• VOL = output voltage when Vin = ‘1’ (V Output Low)
• Ideally,
– VOH = Vdd
– VOL = 0 V

[Figure: VTC (Vout vs. Vin, 0V to Vdd) with VOH and VOL marked on the Vout axis]

Amirtharajah, EEC 116 Fall 2011 5


Review: VOL and VOH

• In transfer function terms:
– VOL = f(VOH)
– VOH = f(VOL)
– f = inverter transfer function
• Difference (VOH-VOL) is the voltage swing of the gate
– Full-swing logic swings from ground to Vdd
– Other families with smaller swings

[Figure: VTC with VOL and VOH marked on both the Vin and Vout axes]

Amirtharajah, EEC 116 Fall 2011 6


Review: Inverter Switching Threshold
Inverter switching threshold:
– Point where voltage transfer curve intersects line Vout=Vin
– Represents the point at which the inverter switches state
– Normally, VM ≈ Vdd/2
– Sometimes other thresholds desirable

[Figure: VTC with the line Vout=Vin; the intersection marks VM]

Amirtharajah, EEC 116 Fall 2011 7


VTC Mathematical Definitions
• VOH is the output high level of an inverter
VOH = VTC(VOL)
• VOL is the output low level of an inverter
VOL = VTC(VOH)
• VM is the switching threshold
VM = VIN = VOUT
• VIH is the lowest input voltage for which the output
will be ≥ the input (worst case ‘1’)
dVTC(VIH)/dVIH = -1
• VIL is the highest input voltage for which the output
will be ≤ the input (worst case ‘0’)
dVTC(VIL)/dVIL = -1

Amirtharajah, EEC 116 Fall 2011 8


Noise Margin and Delay Definitions
• NML is the difference between the highest
acceptable ‘0’ and the lowest possible ‘0’
NML = VIL – VOL
• NMH is the difference between the lowest acceptable
‘1’ and the highest possible ‘1’
NMH = VOH – VIH
• tPHL is the propagation delay from the 50% point of
the input to the output when the output goes from
high to low
• tPLH is the propagation delay from the 50% point of
the input to the output when the output goes from
low to high
• tP is the average propagation delay
• tR is the rise time (usually 10% to 90%)
• tF is the fall time (usually 90% to 10%)
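
The noise margin and average delay definitions above reduce to a few subtractions. The sketch below uses hypothetical function names, and the numbers in the usage line are illustrative only; the averaging formula tP = (tPHL + tPLH)/2 is the usual reading of "average propagation delay".

```python
def noise_margins(v_ol, v_oh, v_il, v_ih):
    """Return (NM_L, NM_H) from the four characteristic VTC voltages."""
    nm_l = v_il - v_ol
    nm_h = v_oh - v_ih
    return nm_l, nm_h

def average_delay(t_phl, t_plh):
    """Average propagation delay t_P of the two output transitions."""
    return (t_phl + t_plh) / 2.0

# Illustrative values for a 3.3 V gate (not from the slides)
nm_l, nm_h = noise_margins(v_ol=0.0, v_oh=3.3, v_il=1.4, v_ih=1.9)
```

A gate with positive NM_L and NM_H tolerates that much noise on a low or high input, respectively, without leaving its valid logic region.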
Amirtharajah, EEC 116 Fall 2011 9
CMOS Inverter
• Complementary NMOS and PMOS devices
• In steady-state, only one device is on (no static power consumption)
• Vin=1: NMOS on, PMOS off
– Vout = VOL = 0
• Vin=0: PMOS on, NMOS off
– Vout = VOH = Vdd
• Ideal VOL and VOH!
• Ratioless logic: output is independent of transistor sizes in steady-state

[Figure: CMOS inverter schematic with Vdd, Vin, Vout, and Gnd labeled]
Amirtharajah, EEC 116 Fall 2011 10
CMOS Inverter: VTC

[Figure: left, PMOS and NMOS drain current IDS vs. Vout = VDS for Vin = 1V, 2V, 3V, 4V; right, resulting VTC (Vout vs. Vin, 0 to 4 V)]

• Output goes completely to Vdd and Gnd


• Sharp transition region
Amirtharajah, EEC 116 Fall 2011 11
CMOS Inverter Operation
• NMOS transistor:
– Cutoff if Vin < VTN
– Linear if Vout < Vin – VTN
– Saturated if Vout > Vin – VTN
• PMOS transistor
– Cutoff if (Vin-VDD) > VTP → Vin > VDD+VTP
– Linear if (Vout-VDD)>Vin-VDD-VTP → Vout>Vin - VTP
– Sat. if (Vout-VDD)<Vin-VDD-VTP → Vout < Vin-VTP

Amirtharajah, EEC 116 Fall 2011 12


CMOS Inverter VTC: Device Operation

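[Figure: VTC annotated with device operating regions. As Vin increases from 0 to VDD: (P linear, N cutoff), (P linear, N sat), (P sat, N sat), (P sat, N linear), (P cutoff, N linear)]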

Amirtharajah, EEC 116 Fall 2011 13


CMOS Inverter VTC: Device Sizing

• Increase W of PMOS
– kp increases
– VTC moves to right
• Increase W of NMOS
– kn increases
– VTC moves to left
• For VM = VDD/2
– kn = kp
– 2Wn ≈ Wp

[Figure: VTC (Vout vs. Vin, 0 to VDD) for kp=5kn, kp=kn, and kp=0.2kn]

Amirtharajah, EEC 116 Fall 2011 14


Effects of VM adjustment
• Result from changing kp/kn ratio:
– Inverter threshold VM ≠ VDD/2
– Rise and fall delays unequal
– Noise margins not equal
• Reasons for changing inverter threshold
– Want a faster delay for one type of transition
(rise/fall)
– Remove noise from input signal: increase one
noise margin at expense of the other
– Interfacing other types of logic (with different
swings)

Amirtharajah, EEC 116 Fall 2011 15


CMOS Inverter: VIL Calculation
• KCL (NMOS saturation, PMOS linear):
$\frac{k_n}{2}(V_{GS,n} - V_{T0,n})^2 = \frac{k_p}{2}\left[2(V_{GS,p} - V_{T0,p})V_{DS,p} - V_{DS,p}^2\right]$
$\frac{k_n}{2}(V_{in} - V_{T0,n})^2 = \frac{k_p}{2}\left[2(V_{in} - V_{DD} - V_{T0,p})(V_{out} - V_{DD}) - (V_{out} - V_{DD})^2\right]$
• Differentiate and set dVout/dVin to –1
$k_n(V_{in} - V_{T0,n}) = k_p\left[(V_{in} - V_{DD} - V_{T0,p})\frac{dV_{out}}{dV_{in}} + (V_{out} - V_{DD}) - (V_{out} - V_{DD})\frac{dV_{out}}{dV_{in}}\right]$
$k_n(V_{IL} - V_{T0,n}) = k_p(2V_{out} - V_{IL} + V_{T0,p} - V_{DD})$
$V_{IL} = \frac{2V_{out} + V_{T0,p} - V_{DD} + k_R V_{T0,n}}{1 + k_R}, \quad k_R = \frac{k_n}{k_p}$
• Solve simultaneously with KCL to find VIL
Amirtharajah, EEC 116 Fall 2011 16
CMOS Inverter: VIH Calculation
• KCL (NMOS linear, PMOS saturation):
$\frac{k_n}{2}\left[2(V_{GS,n} - V_{T0,n})V_{DS,n} - V_{DS,n}^2\right] = \frac{k_p}{2}(V_{GS,p} - V_{T0,p})^2$
$\frac{k_n}{2}\left[2(V_{in} - V_{T0,n})V_{out} - V_{out}^2\right] = \frac{k_p}{2}(V_{in} - V_{DD} - V_{T0,p})^2$
• Differentiate and set dVout/dVin to –1
$k_n\left[(V_{in} - V_{T0,n})\frac{dV_{out}}{dV_{in}} + V_{out} - V_{out}\frac{dV_{out}}{dV_{in}}\right] = k_p(V_{in} - V_{DD} - V_{T0,p})$
$k_n(2V_{out} - V_{IH} + V_{T0,n}) = k_p(V_{IH} - V_{DD} - V_{T0,p})$
$V_{IH} = \frac{V_{DD} + V_{T0,p} + k_R(2V_{out} + V_{T0,n})}{1 + k_R}, \quad k_R = \frac{k_n}{k_p}$
• Solve simultaneously with KCL to find VIH
Amirtharajah, EEC 116 Fall 2011 17
CMOS Inverter: VM Calculation
• KCL (NMOS & PMOS saturated):
$\frac{k_n}{2}(V_{GS,n} - V_{T0,n})^2 = \frac{k_p}{2}(V_{GS,p} - V_{T0,p})^2$
$\frac{k_n}{2}(V_{in} - V_{T0,n})^2 = \frac{k_p}{2}(V_{in} - V_{DD} - V_{T0,p})^2$
• Solve for VM = Vin = Vout
$V_M = \frac{V_{T0,n} + \sqrt{1/k_R}\,(V_{DD} + V_{T0,p})}{1 + \sqrt{1/k_R}}, \quad k_R = \frac{k_n}{k_p}$
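
The switching-threshold expression is easy to evaluate numerically. This sketch assumes the square-law form with both devices saturated, a negative PMOS threshold VT0,p, and a hypothetical function name.

```python
import math

def switching_threshold(vdd, vt0n, vt0p, k_r):
    """V_M of a CMOS inverter; vt0p is negative, k_r = k_n / k_p."""
    r = math.sqrt(1.0 / k_r)
    return (vt0n + r * (vdd + vt0p)) / (1.0 + r)

# Symmetric thresholds and matched gain factors put V_M at mid-supply
vm_symmetric = switching_threshold(vdd=3.3, vt0n=0.7, vt0p=-0.7, k_r=1.0)
# A stronger NMOS (k_r = 4) pulls the threshold lower
vm_strong_n = switching_threshold(vdd=3.3, vt0n=0.7, vt0p=-0.7, k_r=4.0)
```

The second call illustrates the sizing trend stated earlier: increasing kn relative to kp shifts the VTC to the left.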
Amirtharajah, EEC 116 Fall 2011 18
CMOS Inverter: Achieving Ideal VM

$V_M = \frac{V_{T0,n} + \sqrt{1/k_R}\,(V_{DD} + V_{T0,p})}{1 + \sqrt{1/k_R}}, \quad k_R = \frac{k_n}{k_p}$

• Ideally, VM = VDD/2: $k_{R,ideal} = \left(\frac{V_{DD}/2 + V_{T0,p}}{V_{DD}/2 - V_{T0,n}}\right)^2$

• Assuming VT0,n = -VT0,p, $k_{R,ideal} = 1$, which requires

$\frac{(W/L)_p}{(W/L)_n} = \frac{\mu_n}{\mu_p} \approx 2.5$
Amirtharajah, EEC 116 Fall 2011 19
CMOS Inverter: VIL and VIH for Ideal VM

• Assuming VT0,n=-VT0,p, and kR = 1,

$V_{IL} = \tfrac{1}{8}(3V_{DD} + 2V_{T0})$

$V_{IH} = \tfrac{1}{8}(5V_{DD} - 2V_{T0})$

$V_{IL} + V_{IH} = V_{DD}$

$NM_L = V_{IL} - V_{OL} = V_{IL}$

$NM_H = V_{OH} - V_{IH} = V_{DD} - V_{IH} = V_{IL}$
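
For the symmetric case the input thresholds follow directly from VDD and VT0. A small sketch, with a hypothetical function name and illustrative supply and threshold values:

```python
def symmetric_inverter_levels(vdd, vt0):
    """V_IL and V_IH for k_R = 1 and V_T0,n = -V_T0,p = vt0."""
    v_il = (3.0 * vdd + 2.0 * vt0) / 8.0
    v_ih = (5.0 * vdd - 2.0 * vt0) / 8.0
    return v_il, v_ih

v_il, v_ih = symmetric_inverter_levels(vdd=3.3, vt0=0.7)
nm_l = v_il - 0.0   # V_OL = 0 for ideal CMOS
nm_h = 3.3 - v_ih   # V_OH = V_DD for ideal CMOS
```

With these inputs both noise margins come out equal, as the slide's last two lines state.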


Amirtharajah, EEC 116 Fall 2011 20
MOSFET Scaling Effects

• Rabaey Section 3.5 (Kang & Leblebici Section 3.5)


• Scaling provides enormous advantages
– Scale linear dimension (channel length) by factor S > 1
– Better area density, yield, performance
• Two types of scaling
– Constant field scaling (full scaling)
• A’ = A/S^2; L’ = L/S; W’ = W/S; ID’ = ID/S; P’ = P/S^2; Vdd’ = Vdd/S
• Power density P’/A’ stays the same
– Constant voltage scaling
• A’ = A/S^2; L’ = L/S; W’ = W/S; ID’ = ID*S; P’ = P*S; Vdd’ = Vdd
• Power density P’/A’ = S^3 * (P/A) (reliability issue)
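
The two scaling rules can be compared side by side. The following sketch simply applies the factors listed above; the dictionary keys and function names are hypothetical.

```python
def scale_device(p, s, mode="constant_field"):
    """Scale a dict with keys A, L, W, ID, P, Vdd by factor s > 1."""
    out = {"A": p["A"] / s**2, "L": p["L"] / s, "W": p["W"] / s}
    if mode == "constant_field":
        out.update(ID=p["ID"] / s, P=p["P"] / s**2, Vdd=p["Vdd"] / s)
    elif mode == "constant_voltage":
        out.update(ID=p["ID"] * s, P=p["P"] * s, Vdd=p["Vdd"])
    else:
        raise ValueError("mode must be 'constant_field' or 'constant_voltage'")
    return out

def power_density(p):
    return p["P"] / p["A"]
```

Running both modes on the same starting device shows the power density unchanged under constant field scaling and multiplied by S^3 under constant voltage scaling.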
Amirtharajah, EEC 116 Fall 2011 21
Short Channel Effects
• As geometries are scaled down
– VT (effective) goes lower
– Effective channel length decreases
– Sub-threshold Ids occurs
• Current goes from drain to source while Vgs < Vt
– Tox is scaled which can cause reliability problems
• Can’t handle large Vg without hot electron effects
– Changes the Vt when carriers imbed themselves
in the oxide
– Interconnects scale
• Electromigration and ESD become issues

Amirtharajah, EEC 116 Fall 2011 22


MOSFET Capacitances
• Rabaey Section 3.3 (Kang & Leblebici Section 3.6)
• Oxide Capacitance
– Gate to Source overlap
– Gate to Drain overlap
– Gate to Channel
• Junction Capacitance
– Source to Bulk junction
– Drain to Bulk junction

Amirtharajah, EEC 116 Fall 2011 23


Oxide Capacitances: Overlap

[Figure: MOSFET cross-section with source, drain, drawn gate length Ldrawn, and overlap length xd]

• Overlap capacitances
– Gate electrode overlaps source and drain regions

– xd is overlap length on each side of channel


– Leff = Ldrawn – 2xd (effective channel length)
– Overlap capacitance:
CGSO = CGDO = CoxWxd (assume xd equal on both sides)
Amirtharajah, EEC 116 Fall 2011 24
Total Oxide Capacitance
• Total capacitance consists of 2 components
– Overlap capacitance
– Channel capacitance Cgs Cgd
source drain
Cgb
• Cutoff:
– No channel connecting to source or drain

– CGS = CGD = CoxWxd

– CGB = CoxWLeff

– Total Gate Capacitance = CG = CoxWL

Amirtharajah, EEC 116 Fall 2011 25


Oxide Capacitances: Channel
• Linear mode
– Channel spans from source to drain
– Channel Capacitance split equally between S and D
$C_{GS} = \tfrac{1}{2} C_{ox} W L_{eff}$, $C_{GD} = \tfrac{1}{2} C_{ox} W L_{eff}$, $C_{GB} = 0$
– Total Gate capacitance CG = CoxWL
• Saturation regime
– Channel is pinched off: the channel capacitance couples to the source only
$C_{GD} = C_{ox} W x_d$, $C_{GS} = \tfrac{2}{3} C_{ox} W L_{eff} + C_{ox} W x_d$, $C_{GB} = 0$
– Total Gate capacitance:
$C_G = \tfrac{2}{3} C_{ox} W L_{eff} + 2 C_{ox} W x_d$
Amirtharajah, EEC 116 Fall 2011 26
Oxide Capacitances: Channel

[Figure: total gate capacitance Cg,total, neglecting overlap (xd = 0)]

Amirtharajah, EEC 116 Fall 2011 27


Junction Capacitance

Reverse-biased P-N junctions!


Capacitance depends on reverse-bias voltage.

Amirtharajah, EEC 116 Fall 2011 28


Junction Capacitance

For a P-N junction: $C_j = \frac{A}{2}\sqrt{\frac{2 q \varepsilon_{Si}}{V_0 - V}\,\frac{N_d N_a}{N_d + N_a}}$

If V=0, cap/area: $C_{j0} = \sqrt{\frac{q \varepsilon_{Si}}{2 V_0}\,\frac{N_d N_a}{N_d + N_a}}$

General form: $C_j = \frac{A C_{j0}}{\left(1 - V/V_0\right)^m}$

m = grading coefficient (0.5 for abrupt junctions, 0.3 for graded junctions)
Amirtharajah, EEC 116 Fall 2011 29
Junction Capacitance
• Junction with substrate
– Bottom area = W * LS (length of drain/source)
– Total cap = Cj
• Junction with sidewalls
– “Channel-stop implant”
– Perimeter = 2LS + W
– Area = P * Xj
– Total cap = Cjsw
• Total junction cap C = Cj + Cjsw

Amirtharajah, EEC 116 Fall 2011 30


Junction Capacitance
• Voltage Equivalence Factor
– Creates an average capacitance value for a
voltage transition, defined as ΔQ/ΔV

$C_{eq} = \frac{-A C_{j0} V_0}{(V_2 - V_1)(1 - m)}\left[\left(1 - \frac{V_2}{V_0}\right)^{1-m} - \left(1 - \frac{V_1}{V_0}\right)^{1-m}\right] = A K_{eq} C_{j0}$

$K_{eq} = \frac{-2\sqrt{V_0}}{V_2 - V_1}\left(\sqrt{V_0 - V_2} - \sqrt{V_0 - V_1}\right)$ (abrupt junction only)

$C_{db} = A K_{eq} C_{j0} + P X_j K_{eq,sw} C_{jsw0}$

Amirtharajah, EEC 116 Fall 2011 31


Example: Junction Cap
• Consider the following NMOS device
– Substrate doping: NA = 10^15 cm^-3
– Source/drain doping: ND = 2 x 10^20 cm^-3
– Channel-stop doping: 10X substrate doping
– Drain length LD = 1um
– Transistor W = 10um
– Junction depth Xj = 0.5um, abrupt junction

• Find capacitance of drain-bulk junction when drain voltage = 3V

Amirtharajah, EEC 116 Fall 2011 32


Next Time: AC Characteristics

• CMOS Inverters

– AC Characteristics: Designing for speed

Amirtharajah, EEC 116 Fall 2011 33


EEC 118 Lecture #4:
CMOS Inverters

Rajeevan Amirtharajah
University of California, Davis
Jeff Parkhurst
Intel Corporation
Announcements
• Lab 2 this week, report due next week
• Lab 1 reports due this week at lab section
• HW 2 due this Friday at 4 PM in box, Kemper
2131

Amirtharajah/Parkhurst, EEC 118 Spring 2011 2


Outline
• Review: Inverter Transfer Characteristics
• Lecture 3: Noise Margins, Rise & Fall Times,
Inverter Delay
• CMOS Inverters: Rabaey 1.3.2, 5 (Kang &
Leblebici, 5.1-5.3 and 6.1-6.2)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 3


Review: Inverter Voltage Transfer Curve
Voltage transfer curve (VTC): plot of output voltage
Vout vs. input voltage Vin

Ideal digital inverter:
– When Vin=0, Vout=Vdd
– When Vin=Vdd, Vout=0
– Sharp transition region

[Figure: inverter block (Vin, Inverter, Vout) and VTC plot of Vout vs. Vin from 0V to Vdd, comparing ideal and actual curves]
Amirtharajah/Parkhurst, EEC 118 Spring 2011 4
Review: Actual Inverter Output Levels
• VOH and VOL represent the “high” and “low” output voltages of the inverter
• VOH = output voltage when Vin = ‘0’ (V Output High)
• VOL = output voltage when Vin = ‘1’ (V Output Low)
• Ideally,
– VOH = Vdd
– VOL = 0 V

[Figure: VTC (Vout vs. Vin, 0V to Vdd) with VOH and VOL marked on the Vout axis]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 5


Review: VOL and VOH

• In transfer function terms:
– VOL = f(VOH)
– VOH = f(VOL)
– f = inverter transfer function
• Difference (VOH-VOL) is the voltage swing of the gate
– Full-swing logic swings from ground to Vdd
– Other families with smaller swings

[Figure: VTC with VOL and VOH marked on both the Vin and Vout axes]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 6


Review: Inverter Switching Threshold
Inverter switching threshold:
– Point where voltage transfer curve intersects line Vout=Vin
– Represents the point at which the inverter switches state
– Normally, VM ≈ Vdd/2
– Sometimes other thresholds desirable

[Figure: VTC with the line Vout=Vin; the intersection marks VM]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 7


VTC Mathematical Definitions
• VOH is the output high level of an inverter
VOH = VTC(VOL)
• VOL is the output low level of an inverter
VOL = VTC(VOH)
• VM is the switching threshold
VM = VIN = VOUT
• VIH is the lowest input voltage for which the output
will be ≥ the input (worst case ‘1’)
dVTC(VIH)/dVIH = -1
• VIL is the highest input voltage for which the output
will be ≤ the input (worst case ‘0’)
dVTC(VIL)/dVIL = -1

Amirtharajah/Parkhurst, EEC 118 Spring 2011 8


Noise Margin and Delay Definitions
• NML is the difference between the highest
acceptable ‘0’ and the lowest possible ‘0’
NML = VIL – VOL
• NMH is the difference between the lowest acceptable
‘1’ and the highest possible ‘1’
NMH = VOH – VIH
• tPHL is the propagation delay from the 50% point of
the input to the output when the output goes from
high to low
• tPLH is the propagation delay from the 50% point of
the input to the output when the output goes from
low to high
• tP is the average propagation delay
• tR is the rise time (usually 10% to 90%)
• tF is the fall time (usually 90% to 10%)
Amirtharajah/Parkhurst, EEC 118 Spring 2011 9
CMOS Inverter
• Complementary NMOS and PMOS devices
• In steady-state, only one device is on (no static power consumption)
• Vin=1: NMOS on, PMOS off
– Vout = VOL = 0
• Vin=0: PMOS on, NMOS off
– Vout = VOH = Vdd
• Ideal VOL and VOH!
• Ratioless logic: output is independent of transistor sizes in steady-state

[Figure: CMOS inverter schematic with Vdd, Vin, Vout, and Gnd labeled]
Amirtharajah/Parkhurst, EEC 118 Spring 2011 10
CMOS Inverter: VTC

[Figure: left, PMOS and NMOS drain current IDS vs. Vout = VDS for Vin = 1V, 2V, 3V, 4V; right, resulting VTC (Vout vs. Vin, 0 to 4 V)]

• Output goes completely to Vdd and Gnd


• Sharp transition region
Amirtharajah/Parkhurst, EEC 118 Spring 2011 11
CMOS Inverter Operation
• NMOS transistor:
– Cutoff if Vin < VTN
– Linear if Vout < Vin – VTN
– Saturated if Vout > Vin – VTN
• PMOS transistor
– Cutoff if (Vin-VDD) > VTP → Vin > VDD+VTP
– Linear if (Vout-VDD)>Vin-VDD-VTP → Vout>Vin - VTP
– Sat. if (Vout-VDD)<Vin-VDD-VTP → Vout < Vin-VTP

Amirtharajah/Parkhurst, EEC 118 Spring 2011 12


CMOS Inverter VTC: Device Operation

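[Figure: VTC annotated with device operating regions. As Vin increases from 0 to VDD: (P linear, N cutoff), (P linear, N sat), (P sat, N sat), (P sat, N linear), (P cutoff, N linear)]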

Amirtharajah/Parkhurst, EEC 118 Spring 2011 13


CMOS Inverter VTC: Device Sizing

• Increase W of PMOS
– kp increases
– VTC moves to right
• Increase W of NMOS
– kn increases
– VTC moves to left
• For VM = VDD/2
– kn = kp
– 2Wn ≈ Wp

[Figure: VTC (Vout vs. Vin, 0 to VDD) for kp=5kn, kp=kn, and kp=0.2kn]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 14


Effects of VM adjustment
• Result from changing kp/kn ratio:
– Inverter threshold VM ≠ VDD/2
– Rise and fall delays unequal
– Noise margins not equal
• Reasons for changing inverter threshold
– Want a faster delay for one type of transition
(rise/fall)
– Remove noise from input signal: increase one
noise margin at expense of the other
– Interfacing other types of logic (with different
swings)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 15


CMOS Inverter: VIL Calculation
• KCL (NMOS saturation, PMOS linear):
$\frac{k_n}{2}(V_{GS,n} - V_{T0,n})^2 = \frac{k_p}{2}\left[2(V_{GS,p} - V_{T0,p})V_{DS,p} - V_{DS,p}^2\right]$
$\frac{k_n}{2}(V_{in} - V_{T0,n})^2 = \frac{k_p}{2}\left[2(V_{in} - V_{DD} - V_{T0,p})(V_{out} - V_{DD}) - (V_{out} - V_{DD})^2\right]$
• Differentiate and set dVout/dVin to –1
$k_n(V_{in} - V_{T0,n}) = k_p\left[(V_{in} - V_{DD} - V_{T0,p})\frac{dV_{out}}{dV_{in}} + (V_{out} - V_{DD}) - (V_{out} - V_{DD})\frac{dV_{out}}{dV_{in}}\right]$
$k_n(V_{IL} - V_{T0,n}) = k_p(2V_{out} - V_{IL} + V_{T0,p} - V_{DD})$
$V_{IL} = \frac{2V_{out} + V_{T0,p} - V_{DD} + k_R V_{T0,n}}{1 + k_R}, \quad k_R = \frac{k_n}{k_p}$
• Solve simultaneously with KCL to find VIL
Amirtharajah/Parkhurst, EEC 118 Spring 2011 16
CMOS Inverter: VIH Calculation
• KCL (NMOS linear, PMOS saturation):
$\frac{k_n}{2}\left[2(V_{GS,n} - V_{T0,n})V_{DS,n} - V_{DS,n}^2\right] = \frac{k_p}{2}(V_{GS,p} - V_{T0,p})^2$
$\frac{k_n}{2}\left[2(V_{in} - V_{T0,n})V_{out} - V_{out}^2\right] = \frac{k_p}{2}(V_{in} - V_{DD} - V_{T0,p})^2$
• Differentiate and set dVout/dVin to –1
$k_n\left[(V_{in} - V_{T0,n})\frac{dV_{out}}{dV_{in}} + V_{out} - V_{out}\frac{dV_{out}}{dV_{in}}\right] = k_p(V_{in} - V_{DD} - V_{T0,p})$
$k_n(2V_{out} - V_{IH} + V_{T0,n}) = k_p(V_{IH} - V_{DD} - V_{T0,p})$
$V_{IH} = \frac{V_{DD} + V_{T0,p} + k_R(2V_{out} + V_{T0,n})}{1 + k_R}, \quad k_R = \frac{k_n}{k_p}$
• Solve simultaneously with KCL to find VIH
Amirtharajah/Parkhurst, EEC 118 Spring 2011 17
CMOS Inverter: VM Calculation
• KCL (NMOS & PMOS saturated):
$\frac{k_n}{2}(V_{GS,n} - V_{T0,n})^2 = \frac{k_p}{2}(V_{GS,p} - V_{T0,p})^2$
$\frac{k_n}{2}(V_{in} - V_{T0,n})^2 = \frac{k_p}{2}(V_{in} - V_{DD} - V_{T0,p})^2$
• Solve for VM = Vin = Vout
$V_M = \frac{V_{T0,n} + \sqrt{1/k_R}\,(V_{DD} + V_{T0,p})}{1 + \sqrt{1/k_R}}, \quad k_R = \frac{k_n}{k_p}$
Amirtharajah/Parkhurst, EEC 118 Spring 2011 18
CMOS Inverter: Achieving Ideal VM

$V_M = \frac{V_{T0,n} + \sqrt{1/k_R}\,(V_{DD} + V_{T0,p})}{1 + \sqrt{1/k_R}}, \quad k_R = \frac{k_n}{k_p}$

• Ideally, VM = VDD/2: $k_{R,ideal} = \left(\frac{V_{DD}/2 + V_{T0,p}}{V_{DD}/2 - V_{T0,n}}\right)^2$

• Assuming VT0,n = -VT0,p, $k_{R,ideal} = 1$, which requires

$\frac{(W/L)_p}{(W/L)_n} = \frac{\mu_n}{\mu_p} \approx 2.5$
Amirtharajah/Parkhurst, EEC 118 Spring 2011 19
CMOS Inverter: VIL and VIH for Ideal VM

• Assuming VT0,n=-VT0,p, and kR = 1,

$V_{IL} = \tfrac{1}{8}(3V_{DD} + 2V_{T0})$

$V_{IH} = \tfrac{1}{8}(5V_{DD} - 2V_{T0})$

$V_{IL} + V_{IH} = V_{DD}$

$NM_L = V_{IL} - V_{OL} = V_{IL}$

$NM_H = V_{OH} - V_{IH} = V_{DD} - V_{IH} = V_{IL}$


Amirtharajah/Parkhurst, EEC 118 Spring 2011 20
Next Time: AC Characteristics & Fabrication

• CMOS Inverters

– AC Characteristics: Designing for speed

Amirtharajah/Parkhurst, EEC 118 Spring 2011 21


EEC 118 Lecture #5:
CMOS Inverter AC
Characteristics

Rajeevan Amirtharajah
University of California, Davis
Jeff Parkhurst
Intel Corporation
Acknowledgments
• Slides due to Rajit Manohar from ECE 547
Advanced VLSI Design at Cornell University

Amirtharajah/Parkhurst, EEC 118 Spring 2011 2


Outline
• Review: CMOS Inverter Transfer Characteristics
• CMOS Inverters: Rabaey 5.4-5.5 (Kang &
Leblebici, 6.1-6.4, 6.7)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 4


CMOS Inverter VTC: Device Operation

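[Figure: VTC annotated with device operating regions. As Vin increases from 0 to VDD: (P linear, N cutoff), (P linear, N sat), (P sat, N sat), (P sat, N linear), (P cutoff, N linear)]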

Amirtharajah/Parkhurst, EEC 118 Spring 2011 5


Logic Circuit Delay
• For CMOS (or almost all logic circuit families), only
one fundamental equation necessary to determine
delay:
$I = C\,\frac{dV}{dt}$
• Consider the discretized version: $I = C\,\frac{\Delta V}{\Delta t}$
• Rewrite to solve for delay: $\Delta t = C\,\frac{\Delta V}{I}$
• Only three ways to make faster logic: decrease C, decrease ΔV, or increase I
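
As a rough illustration of the discretized delay equation, the sketch below uses made-up component values (10 fF load, 1.65 V swing, 100 µA drive); the function name is hypothetical.

```python
def delay(c, delta_v, i):
    """Delay from the discretized charge equation: dt = C * dV / I."""
    return c * delta_v / i

t_base = delay(c=10e-15, delta_v=1.65, i=100e-6)         # baseline
t_half_c = delay(c=5e-15, delta_v=1.65, i=100e-6)        # halve the load
t_double_i = delay(c=10e-15, delta_v=1.65, i=200e-6)     # double the drive
```

Halving C or doubling I each halve the delay, which is the point of the "three ways" bullet.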

Amirtharajah/Parkhurst, EEC 118 Spring 2011 6


CMOS Inverter Capacitances
• Assume input transition is fixed, then delay determined by output

Capacitance on node f (output):
• Junction cap Cdb,p and Cdb,n
• Gate capacitance Cgd,p and Cgd,n
• Interconnect cap Cint
• Receiver gate cap Cg

[Figure: CMOS inverter between Vdd and Gnd with input Vin and output node f, showing Cgs,p, Csb,p, Cgd,p, Cdb,p, Cgd,n, Cdb,n, Cgs,n, Csb,n, Cint, and Cg]
Amirtharajah/Parkhurst, EEC 118 Spring 2011 7
CMOS Inverter Junction Capacitances
• Junction capacitances Cdb,p and Cdb,n:
– Equation for junction cap:
$C_j(V) = \frac{A C_{j0}}{\left(1 - V/\phi_0\right)^m}, \quad C_{j0} = \left(\frac{\varepsilon q}{2}\,\frac{N_a N_d}{N_a + N_d}\,\frac{1}{\phi_0}\right)^m$
– Non-linear, depends on voltage across junction
– Use Keq factor to get equivalent capacitance for a voltage transition
$C_{db} = A K_{eq} C_j + P K_{eq,sw} C_{jsw}$

Amirtharajah/Parkhurst, EEC 118 Spring 2011 8


CMOS Inverter Gate Capacitances
• Gate capacitances CGD,p and CGD,n:
– Just after the input switches(t = 0+), what regions
are transistors in?
– One is in cutoff: CGD = Overlap Cap
– One is in Saturation: CGD = Overlap Cap
– Therefore, gate-to-drain capacitance is due to
overlap capacitance :

C gd , p = C gd ,n = CoxWLD
However, also need to consider Miller effect ...
Amirtharajah/Parkhurst, EEC 118 Spring 2011 9
CMOS Inverter Capacitances: Miller Effect
[Figure: inverter with Cgd1 connected between Vin and Vout, and the equivalent circuit with 2Cgd1 from Vout to ground]
• When input rises by ΔV, output falls by ΔV


– Change in stored charge: ΔQ = Cgd1ΔV – (-Cgd1ΔV)
– Effective voltage change across Cgd1 is 2ΔV
– Effective capacitance to ground is twice Cgd1
• Including Miller effect:
C gd , p = C gd ,n = 2CoxWLD (For transistor in Cutoff)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 10


CMOS Inverter Capacitances: Receiver

• Receiver gate capacitance


– Includes all capacitances of gate(s) connected to
output node
– Unknown region of operation for receiver
transistor: total gate cap varies from (2/3)WLCox to
WLCox
– Ignore Miller effect (taken into account on output)
– Assume worst-case value, include overlap
C g = WLeff Cox + 2WLD Cox
Cg = WL Cox
Amirtharajah/Parkhurst, EEC 118 Spring 2011 11
Inverter Capacitances: Analysis
• Simplify the circuit: combine all capacitances at
output into one lumped linear capacitance:

Cload = 2*Cgd,n + 2*Cgd,p + Cdb,n + Cdb,p + Cint + Cg
(the factor of 2 on Cgd,n and Cgd,p is the Miller effect)
• Csb,n = Csb,p = 0

• Cgs,n and Cgs,p are not connected to the load.


These are part of the gate capacitance Cg

Amirtharajah/Parkhurst, EEC 118 Spring 2011 12


First-Order Inverter Delay
• Suppose ideal voltage step at input
• Assume: Current charging or discharging capacitance Cload is nearly constant (Iavg)
• tPHL = Cload (Vdd - Vdd/2) / Iavg
• tPLH = Cload (Vdd/2 - Vss) / Iavg

[Figure: inverter with input Vin driving Cload at Vout with current Iavg]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 13


Inverter Delay: Falling

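[Figure: NMOS with input Vin discharging Cload with current ID,n]

• Assume PMOS fully off (ideal step input, ID,p = 0)

$I = C\,\frac{dV}{dt}$

$I_{D,n} = -C_{load}\,\frac{dV_{out}}{dt}$ (Vout is falling, so the discharge current is positive). Need to determine ID,n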

Amirtharajah/Parkhurst, EEC 118 Spring 2011 14


Inverter Delay: Falling
[Figure: Vout vs. time. From t0 to t1 Vout falls from Vdd to Vdd - Vtn (NMOS in saturation); from t1 to t2 it falls to Vdd/2 (NMOS in linear region)]

• From t0 to t1: NMOS in saturation


• From t1 to t2: NMOS in linear region
• Find ID in each region

Amirtharajah/Parkhurst, EEC 118 Spring 2011 15


Inverter Delay: Falling t1-t0
• Assumption: Input fast enough to go through
transition before output voltage changes
• Vout drops from VOH to VDD-VTN (NMOS saturated)

$I_{DS} = \frac{k_n}{2}(V_{in} - V_{T0,n})^2 = \frac{k_n}{2}(V_{OH} - V_{T0,n})^2$

$\int_{t_0}^{t_1} dt = \frac{-2C_L}{k_n(V_{OH} - V_{T0,n})^2}\int_{V_{OH}}^{V_{OH} - V_{T0,n}} dV_{out}$

$t_1 - t_0 = \frac{2C_L V_{T0,n}}{k_n(V_{OH} - V_{T0,n})^2}$

Amirtharajah/Parkhurst, EEC 118 Spring 2011 16


Inverter Delay: Falling t2-t1
• Vout drops from (VOH-VT0,n) to VDD/2
• NMOS in linear region

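$I_{DS} = k_n\left[(V_{OH} - V_{T0,n})V_{out} - \tfrac{1}{2}V_{out}^2\right]$

$t_2 - t_1 = -C_L\int_{V_{OH} - V_{T0,n}}^{(V_{OH} + V_{OL})/2}\frac{dV_{out}}{k_n\left[(V_{OH} - V_{T0,n})V_{out} - \tfrac{1}{2}V_{out}^2\right]}$

$t_2 - t_1 = \frac{C_L}{k_n(V_{OH} - V_{T0,n})}\ln\left[\frac{2(V_{OH} - V_{T0,n}) - (V_{OH} + V_{OL})/2}{(V_{OH} + V_{OL})/2}\right]$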

Amirtharajah/Parkhurst, EEC 118 Spring 2011 17


Inverter Delay: Falling, Total
• Total fall delay = (t1-t0) + (t2-t1)

$t_{PHL} = \frac{C_L}{k_n(V_{OH} - V_{T0,n})}\left[\frac{2V_{T0,n}}{V_{OH} - V_{T0,n}} + \ln\left(\frac{4(V_{OH} - V_{T0,n})}{V_{OH} + V_{OL}} - 1\right)\right]$
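
To see that the total is the sum of the saturation and linear-region intervals, here is a hedged numerical sketch. Function and argument names are hypothetical, and the component values in the usage lines are illustrative only.

```python
import math

def t_saturation(c_load, k_n, v_t0n, v_oh):
    """t1 - t0: NMOS saturated while Vout falls from V_OH to V_OH - V_T0,n."""
    return 2.0 * c_load * v_t0n / (k_n * (v_oh - v_t0n) ** 2)

def t_linear(c_load, k_n, v_t0n, v_oh, v_ol=0.0):
    """t2 - t1: NMOS linear while Vout falls to the 50% point."""
    vov = v_oh - v_t0n
    v50 = (v_oh + v_ol) / 2.0
    return c_load / (k_n * vov) * math.log((2.0 * vov - v50) / v50)

def t_phl(c_load, k_n, v_t0n, v_oh, v_ol=0.0):
    """Closed-form high-to-low propagation delay."""
    vov = v_oh - v_t0n
    return c_load / (k_n * vov) * (
        2.0 * v_t0n / vov + math.log(4.0 * vov / (v_oh + v_ol) - 1.0)
    )

total = t_phl(c_load=10e-15, k_n=100e-6, v_t0n=0.7, v_oh=3.3)
parts = t_saturation(10e-15, 100e-6, 0.7, 3.3) + t_linear(10e-15, 100e-6, 0.7, 3.3)
```

Both routes give the same delay, since the logarithm argument in the closed form is just the linear-region ratio rewritten.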

Amirtharajah/Parkhurst, EEC 118 Spring 2011 18


Inverter Delay: Rising
• Similar calculation as for falling delay
• Separate into regions where PMOS is in linear,
saturation
$t_{PLH} = \frac{C_L}{k_p(V_{OH} - V_{OL} - |V_{T0,p}|)}\left[\frac{2|V_{T0,p}|}{V_{OH} - V_{OL} - |V_{T0,p}|} + \ln\left(\frac{4(V_{OH} - V_{OL} - |V_{T0,p}|)}{V_{OH} + V_{OL}} - 1\right)\right]$

• Note: to balance rise and fall delays (assuming VOH = VDD, VOL = 0V, and VT0,n = |VT0,p|) requires

$\frac{k_p}{k_n} = 1, \quad \frac{(W/L)_p}{(W/L)_n} = \frac{\mu_n}{\mu_p} \approx 2.5$
Amirtharajah/Parkhurst, EEC 118 Spring 2011 19
Inverter Rise, Fall Times
• Summary -- Exact method: separate into two regions
– t1
• Vout drops from 0.9VDD to VDD-VT,n (NMOS in
saturation)
• Vout rises from 0.1VDD to |VT,p| (PMOS in saturation)
– t2
• Vout drops from VDD-VT,n to 0.1VDD (NMOS in linear
region)
• Vout rises from |VT,p| to 0.9 VDD (PMOS in linear
region)
– tf,r = t1 + t2

Amirtharajah/Parkhurst, EEC 118 Spring 2011 20


CMOS Inverter Delay
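• Review of approximate method
– Assume a constant average current for the transition
– Iavg = average of drain current at beginning and end of transition: Iavg = ½(I1+I2)

$t_{PHL} = \frac{C_{load}}{I_{avg}}\left(V_{DD} - \tfrac{1}{2}V_{DD}\right)$

$t_{PLH} = \frac{C_{load}}{I_{avg}}\left(\tfrac{1}{2}V_{DD} - V_{SS}\right)$

[Figure: Vout falling from V1=Vdd (current I1) at t1 to V2=½Vdd (current I2) at t2]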
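
The average-current method can be sketched with the square-law device equations used earlier in these notes. The helper names and the device values below are hypothetical and chosen only for illustration.

```python
def nmos_current(k_n, v_gs, v_ds, v_t):
    """Square-law NMOS drain current (no channel-length modulation)."""
    vov = v_gs - v_t
    if vov <= 0.0:
        return 0.0                                    # cutoff
    if v_ds >= vov:
        return 0.5 * k_n * vov**2                     # saturation
    return k_n * (vov * v_ds - 0.5 * v_ds**2)         # linear

def t_phl_average(c_load, k_n, vdd, v_t):
    """t_PHL using I_avg = (I1 + I2) / 2 with a step input at Vdd."""
    i1 = nmos_current(k_n, vdd, vdd, v_t)        # start: Vout = Vdd
    i2 = nmos_current(k_n, vdd, vdd / 2.0, v_t)  # end: Vout = Vdd / 2
    i_avg = 0.5 * (i1 + i2)
    return c_load * (vdd - vdd / 2.0) / i_avg

t_est = t_phl_average(c_load=10e-15, k_n=100e-6, vdd=3.3, v_t=0.7)
```

Here I1 is a saturation current and I2 a linear-region current, because Vdd/2 is below Vdd - VT for these values.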
Amirtharajah/Parkhurst, EEC 118 Spring 2011 21
CMOS Inverter Delay: 2nd Approximation
• Another approximate method:
– Again assume constant Iavg
– Iavg = current I1 at start of transition

$t_{PHL} = \frac{C_{load}V_{DD}}{k_n(V_{DD} - V_{Tn})^2}$

$t_{PLH} = \frac{C_{load}V_{DD}}{k_p(V_{DD} - |V_{Tp}|)^2}$

– Why is this a good approximation (esp. for deep submicron)?

[Figure: Vout falling from V1=Vdd at t1 to V2=½Vdd at t2, with Iavg = I1]
Amirtharajah/Parkhurst, EEC 118 Spring 2011 22
CMOS Inverter Delay: Finite Input Transitions
• What if input has finite rise/fall time?
– Both transistors are on for some amount of time
– Capacitor charge/discharge current is reduced

Empirical equations:

$t_{PHL}(actual) = \sqrt{t_{PHL}^2(step) + \left(\frac{t_r}{2}\right)^2}$

$t_{PLH}(actual) = \sqrt{t_{PLH}^2(step) + \left(\frac{t_f}{2}\right)^2}$

[Figure: tPHL (ns) vs. input rise time trise (ns)]
Amirtharajah/Parkhurst, EEC 118 Spring 2011 23
How to Improve Delay?
• Minimize load capacitances
– Small interconnect capacitance
– Small Cg of next stage
• Raise supply voltage
– Increases current faster than increased swing ΔV
• Increase transistor gain factor
– Increase transistor drive current for
charging/discharging output capacitance
• Use low threshold voltage devices
– More subthreshold leakage power dissipation
Amirtharajah/Parkhurst, EEC 118 Spring 2011 24
Inverter Power Consumption
• Static power consumption (ideal) = 0
– Actually DIBL (Drain-Induced Barrier Lowering),
gate leakage, junction leakage are still present
• Dynamic power consumption
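$P_{avg} = \frac{1}{T}\int_0^T v(t)\,i(t)\,dt$

$P_{avg} = \frac{1}{T}\left[\int_0^{T/2} V_{out}\left(-C_{load}\frac{dV_{out}}{dt}\right)dt + \int_{T/2}^{T}(V_{DD} - V_{out})\left(C_{load}\frac{dV_{out}}{dt}\right)dt\right]$

$P_{avg} = \frac{1}{T}\left[\left(-C_{load}\frac{V_{out}^2}{2}\right)\Big|_0^{T/2} + \left(V_{DD}V_{out}C_{load} - \tfrac{1}{2}C_{load}V_{out}^2\right)\Big|_{T/2}^{T}\right]$

$P_{avg} = \frac{1}{T}C_{load}V_{DD}^2 = C_{load}V_{DD}^2 f$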
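
The final result is simple enough to check by hand or in a few lines. The sketch below uses illustrative numbers (10 fF, 3.3 V, 100 MHz) and a hypothetical function name.

```python
def dynamic_power(c_load, vdd, freq):
    """Average switching power: P = C_load * Vdd^2 * f."""
    return c_load * vdd**2 * freq

p_nominal = dynamic_power(c_load=10e-15, vdd=3.3, freq=100e6)
p_half_vdd = dynamic_power(c_load=10e-15, vdd=1.65, freq=100e6)
```

Because power depends on the square of the supply, halving Vdd at fixed load and frequency cuts dynamic power to one quarter.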
Amirtharajah/Parkhurst, EEC 118 Spring 2011 25
Next Time: Combinational Logic

• Combinational MOS Logic

– DC Characteristics, Equivalent Inverter method

– AC Characteristics, Switch Model

Amirtharajah/Parkhurst, EEC 118 Spring 2011 26


EEC 118 Lecture #6:
CMOS Logic

Rajeevan Amirtharajah
University of California, Davis
Jeff Parkhurst
Intel Corporation
Announcements
• Quiz 1 today!
• Lab 2 reports due this week
• Lab 3 this week
• HW 3 due this Friday at 4 PM in box, Kemper
2131

Amirtharajah/Parkhurst, EEC 118 Spring 2011 2


Outline
• Review: CMOS Inverter Transient Characteristics
• Review: Inverter Power Consumption
• Combinational MOS Logic Circuits: Rabaey 6.1-
6.2 (Kang & Leblebici, 7.1-7.4)
• Combinational MOS Logic Transient Response

– AC Characteristics, Switch Model

Amirtharajah/Parkhurst, EEC 118 Spring 2011 3


Review: Logic Circuit Delay
• For CMOS (or almost all logic circuit families), only
one fundamental equation necessary to determine
delay:
$I = C\,\frac{dV}{dt}$
• Consider the discretized version: $I = C\,\frac{\Delta V}{\Delta t}$
• Rewrite to solve for delay: $\Delta t = C\,\frac{\Delta V}{I}$
• Only three ways to make faster logic: C, ΔV, I

Amirtharajah/Parkhurst, EEC 118 Spring 2011 4


Review: Inverter Delays
• High-to-low and low-to-high transitions (exact):
$t_{PHL} = \frac{C_L}{k_n(V_{OH} - V_{T0,n})}\left[\frac{2V_{T0,n}}{V_{OH} - V_{T0,n}} + \ln\left(\frac{4(V_{OH} - V_{T0,n})}{V_{OH} + V_{OL}} - 1\right)\right]$

$t_{PLH} = \frac{C_L}{k_p(V_{OH} - V_{OL} - |V_{T0,p}|)}\left[\frac{2|V_{T0,p}|}{V_{OH} - V_{OL} - |V_{T0,p}|} + \ln\left(\frac{4(V_{OH} - V_{OL} - |V_{T0,p}|)}{V_{OH} + V_{OL}} - 1\right)\right]$

• Similar exact method to find rise and fall times
• Note: to balance rise and fall delays (assuming VOH = VDD, VOL = 0V, and VT0,n = |VT0,p|) requires

$\frac{k_p}{k_n} = 1, \quad \frac{(W/L)_p}{(W/L)_n} = \frac{\mu_n}{\mu_p} \approx 2.5$
Amirtharajah/Parkhurst, EEC 118 Spring 2011 5
Review: Inverter Power Consumption
• Static power consumption (ideal) = 0
– Actually DIBL (Drain-Induced Barrier Lowering),
gate leakage, junction leakage are still present
• Dynamic power consumption
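$P_{avg} = \frac{1}{T}\int_0^T v(t)\,i(t)\,dt$

$P_{avg} = \frac{1}{T}\left[\int_0^{T/2} V_{out}\left(-C_{load}\frac{dV_{out}}{dt}\right)dt + \int_{T/2}^{T}(V_{DD} - V_{out})\left(C_{load}\frac{dV_{out}}{dt}\right)dt\right]$

$P_{avg} = \frac{1}{T}\left[\left(-C_{load}\frac{V_{out}^2}{2}\right)\Big|_0^{T/2} + \left(V_{DD}V_{out}C_{load} - \tfrac{1}{2}C_{load}V_{out}^2\right)\Big|_{T/2}^{T}\right]$

$P_{avg} = \frac{1}{T}C_{load}V_{DD}^2 = C_{load}V_{DD}^2 f$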
Amirtharajah/Parkhurst, EEC 118 Spring 2011 6
Static CMOS
• Complementary pullup network (PUN) and pulldown network (PDN)
• Only one network is on at a time
• PUN: PMOS devices
– Why?
• PDN: NMOS devices
– Why?
• PUN and PDN are dual networks

[Figure: inputs A, B, C drive both the PUN and the PDN; output F is taken between the two networks]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 7


Dual Networks
• Dual networks: parallel connection in PDN = series connection in PUN, vice-versa
• If CMOS gate implements logic function F:
– PUN implements function F
– PDN implements function $G = \overline{F}$

[Figure: example NAND gate with inputs A, B and output F; one network connected in parallel, the other in series]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 8


NAND Gate

• NAND function: $F = \overline{A \cdot B}$

• PUN function: $F = \overline{A \cdot B} = \overline{A} + \overline{B}$
– “Or” function (+) → parallel connection
– Inverted inputs A, B → PMOS transistors

• PDN function: $G = \overline{F} = A \cdot B$
– “And” function (•) → series connection
– Non-inverted inputs → NMOS transistors

Amirtharajah/Parkhurst, EEC 118 Spring 2011 9


NOR Gate

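• NOR gate operation: $F = \overline{A + B}$

• PUN: $F = \overline{A + B} = \overline{A} \cdot \overline{B}$

• PDN: $G = \overline{F} = A + B$

[Figure: NOR gate schematic with inputs A and B]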

Amirtharajah/Parkhurst, EEC 118 Spring 2011 10


Analysis of CMOS Gates
• Represent “on” transistors as resistors

[Figure: two series transistors of width W with gate inputs at 1, each modeled as a resistance R]

• Transistors in series → resistances in series


• Effective resistance = 2R
• Effective length = 2L

Amirtharajah/Parkhurst, EEC 118 Spring 2011 11


Analysis of CMOS Gates (cont.)
• Represent “on” transistors as resistors

[Figure: two parallel transistors of width W with gate inputs at 0, each modeled as a resistance R]

• Transistors in parallel → resistances in parallel


• Effective resistance = ½ R
• Effective width = 2W

Amirtharajah/Parkhurst, EEC 118 Spring 2011 12


CMOS Gates: Equivalent Inverter
• Represent complex gate as inverter for delay
estimation
• Typically use worst-case delays
• Example: NAND gate
– Worst-case (slowest) pull-up: only 1 PMOS “on”
– Pull-down: both NMOS “on”
[Figure: NAND gate with PMOS widths WP and NMOS widths WN; equivalent inverter with PMOS width WP and NMOS width ½ WN]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 13


Example: Complex Gate
Design CMOS gate for this truth table:
A B C F
0 0 0 1

0 0 1 1

0 1 0 1

0 1 1 1

1 0 0 1

1 0 1 0

1 1 0 0

1 1 1 0

$F = \overline{A \cdot (B + C)}$
Amirtharajah/Parkhurst, EEC 118 Spring 2011 14
Example: Complex Gate
Design CMOS gate for this logic function:
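$F = \overline{A \cdot (B + C)} = \overline{A} + \overline{B} \cdot \overline{C}$

1. Find NMOS pulldown network diagram:

$G = \overline{F} = A \cdot (B + C)$

[Figure: NMOS A in series with the parallel pair B, C]

Not a unique solution: can exchange order of series connection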
Amirtharajah/Parkhurst, EEC 118 Spring 2011 15
Example: Complex Gate
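2. Find PMOS pullup network diagram: $F = \overline{A} + (\overline{B} \cdot \overline{C})$

[Figure: PMOS A in parallel with the series pair B, C, driving output F]

Not a unique solution: can exchange order of series connection (B and C inputs)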
Amirtharajah/Parkhurst, EEC 118 Spring 2011 16
Example: Complex Gate
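Completed gate:
• What is worst-case pullup delay?
• What is worst-case pulldown delay?
• Effective inverter for delay calculation: PMOS width ½ WP, NMOS width ½ WN

[Figure: completed gate with output F. Pullup: PMOS A (WP) in parallel with series PMOS B, C (WP each). Pulldown: NMOS A (WN) in series with parallel NMOS B, C (WN each)]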

Amirtharajah/Parkhurst, EEC 118 Spring 2011 17


CMOS Gate Design
• Designing a CMOS gate:
– Find pulldown NMOS network from logic function
or by inspection
– Find pullup PMOS network
• By inspection
• Using logic function
• Using dual network approach
– Size transistors using equivalent inverter
• Find worst-case pullup and pulldown paths
• Size to meet rise/fall or threshold requirements

Amirtharajah/Parkhurst, EEC 118 Spring 2011 18


Analysis of CMOS gates
• Represent “on” transistors as resistors

[Figure: two series transistors of width W with gate inputs at 1, each modeled as a resistance R]

• Transistors in series → resistances in series


• Effective resistance = 2R
• Effective width = ½ W (equivalent to 2L)
• Typically use minimum length devices (L = Lmin)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 19


Analysis of CMOS Gates (cont.)
• Represent “on” transistors as resistors

[Figure: two parallel transistors of width W with gate inputs at 0, each modeled as a resistance R]

• Transistors in parallel → resistances in parallel


• Effective resistance = ½ R
• Effective width = 2W
• Typically use minimum length devices (L = Lmin)
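The series and parallel rules can be sketched as two small helpers. This assumes on-resistance is inversely proportional to width at fixed length, which is the model the slides use; the helper names are illustrative:

```python
def series(*widths: float) -> float:
    """Equivalent width of devices in series: resistances (~1/W) add."""
    return 1.0 / sum(1.0 / w for w in widths)

def parallel(*widths: float) -> float:
    """Equivalent width of devices in parallel, all on: conductances add."""
    return float(sum(widths))

W = 1.0
print(series(W, W))    # two in series behave like one device of width W/2
print(parallel(W, W))  # two in parallel behave like one device of width 2W
```

The same helpers nest, so a pulldown of A in series with parallel B, C is `series(W, parallel(W, W))` when all three devices are on.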



Equivalent Inverter
• CMOS gates: many paths to Vdd and Gnd
– Multiple values for VM, VIL, VIH, etc
– Different delays for each input combination
• Equivalent inverter
– Represent each gate as an inverter with
appropriate device width
– Include only transistors which are on or switching
– Calculate VM, delays, etc using inverter equations



Static CMOS Logic Characteristics
• For VM, the VM of the equivalent inverter is used
(assumes all inputs are tied together)
– For specific input patterns, VM will be different
• For VIL and VIH, only the worst case is interesting
since circuits must be designed for worst-case
noise margin
• For delays, both the maximum and minimum
must be accounted for in race analysis



Equivalent Inverter: VM
• Example: NAND gate threshold VM
Three possibilities:
– A & B switch together
– A switches alone
– B switches alone

• What is equivalent inverter for each case?



Equivalent Inverter: Delay
• Represent complex gate as inverter for delay
estimation
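• Use worst-case delays
• Example: NAND gate
– Worst-case (slowest) pull-up: only 1 PMOS “on”
– Pull-down: both NMOS “on”
[Figure: 2-input NAND with PMOS widths WP and NMOS widths WN, and its equivalent inverter with pull-up width WP and pull-down width ½ WN]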



Example: NOR gate
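• Find threshold voltage VTH when both inputs switch simultaneously
• Two methods:
– Transistor equations (complex)
– Equivalent inverter
– Should get same answer
[Figure: 2-input NOR gate with PMOS A and B (width WP each) in series, NMOS A and B (width WN each) in parallel, output F]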
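A hedged sketch of the equivalent-inverter method for this NOR gate. The switching-threshold expression below is the standard long-channel (square-law) inverter result; it is not stated on these slides and is an assumption here, as are the parameter names. Body effect on the series PMOS devices is ignored.

```python
import math

def switching_threshold(kn: float, kp: float, vdd: float,
                        vtn: float, vtp_abs: float) -> float:
    """Long-channel inverter threshold with both devices saturated (assumed model)."""
    r = math.sqrt(kp / kn)
    return (vtn + r * (vdd - vtp_abs)) / (1.0 + r)

def nor2_threshold(kn: float, kp: float, vdd: float,
                   vtn: float, vtp_abs: float) -> float:
    """Both inputs switch together: NMOS in parallel (2*kn), PMOS in series (kp/2)."""
    return switching_threshold(2.0 * kn, kp / 2.0, vdd, vtn, vtp_abs)

# Hypothetical matched devices: kn = kp, VDD = 5 V, VTn = |VTp| = 1 V.
print(switching_threshold(1.0, 1.0, 5.0, 1.0, 1.0))
print(nor2_threshold(1.0, 1.0, 5.0, 1.0, 1.0))
```

With these illustrative numbers the NOR threshold falls below that of the matched inverter, since the equivalent pull-up is weaker and the equivalent pull-down stronger.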





Transistor Sizing
• Sizing for switching threshold
– All inputs switch together
• Sizing for delay
– Find worst-case input combination
• Find equivalent inverter, use inverter analysis to set device sizes



Common CMOS Gate Topologies

• And-Or-Invert (AOI)
– Sum of products boolean function
– Parallel branches of series connected NMOS
• Or-And-Invert (OAI)
– Product of sums boolean function
– Series connection of sets of parallel NMOS



Graph-Based Dual Network
• Use graph theory to help design gates
– Mostly implemented in CAD tools
• Draw network for PUN or PDN
– Circuit nodes are vertices
– Transistors are edges
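[Figure: example network between F and gnd with transistors A and B, shown as a circuit and as the corresponding graph]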



Graph-Based Dual Network (2)

• To derive dual network:
– Create new node in each enclosed region of graph
– Draw new edge intersecting each original edge
– Edge is controlled by inverted input
– Convert to layout using consistent Euler paths
[Figure: original graph (nodes F, n1, gnd; edges A, B) overlaid with its dual graph (nodes vdd, F; edges A, B)]
Propagation Delay Analysis - The Switch Model
[Figure: each “on” transistor is replaced by a switch with series resistance RON. (a) Inverter, (b) 2-input NAND, (c) 2-input NOR, each drawn with resistors Rp and Rn driving load CL at output F]

tp = 0.69 Ron CL

(assuming that CL dominates!)


Switch Level Model
• Model transistors as switches with series resistance
• Resistance Ron = average resistance for a transition
• Capacitance CL = average load capacitance for a transition (same as we analyzed for transient inverter delays)
[Figure: inverter with input A modeled as switches with resistances RP and RN driving CL]



What is the Value of Ron?



Switch Level Model Delays
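Delay estimation using switch-level model (for general RC circuit):

$I = C \frac{dV}{dt} \;\rightarrow\; dt = \frac{C}{I}\, dV$

$I = \frac{V}{R} \;\rightarrow\; dt = \frac{RC}{V}\, dV$

$t_1 - t_0 = t_p = \int_{V_0}^{V_1} \frac{RC}{V}\, dV$

$t_p = RC\,[\ln(V_1) - \ln(V_0)] = RC \ln\left(\frac{V_1}{V_0}\right)$

[Figure: resistor RN discharging capacitor CL]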
Switch Level Model RC Delays
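• For fall delay tphl, V0 = VDD, V1 = VDD/2

$t_p = RC \ln\left(\frac{V_1}{V_0}\right) = RC \ln\left(\frac{\tfrac{1}{2} V_{DD}}{V_{DD}}\right)$

$t_p = RC \ln(0.5)$

Taking the magnitude (during discharge the capacitor current flows out of the node, so ln(0.5) = −0.69 carries a sign):

$t_{phl} = 0.69\, R_n C_L$

$t_{plh} = 0.69\, R_p C_L$

Standard RC-delay equations from literature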
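As a numerical sanity check on the 0.69 factor, the sketch below steps an RC discharge forward in time and measures when the voltage crosses VDD/2. The component values are hypothetical and the forward-Euler integrator is only an illustration, not a method from the slides:

```python
import math

def rc_fall_delay(r: float, c: float, steps: int = 200_000) -> float:
    """Time for an RC discharge to fall from VDD to VDD/2 (forward Euler)."""
    vdd, v, t = 1.0, 1.0, 0.0
    dt = 5.0 * r * c / steps          # small step relative to RC
    while v > vdd / 2:
        v -= v / (r * c) * dt         # C dV/dt = -V/R during discharge
        t += dt
    return t

R, C = 10e3, 10e-15                   # hypothetical 10 kOhm and 10 fF
print(rc_fall_delay(R, C) / (R * C))  # close to ln 2, about 0.693
```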



Numerical Examples
• Example resistances for 1.2 μm CMOS



Analysis of Propagation Delay
1. Assume Rn = Rp = resistance of minimum sized NMOS inverter
2. Determine “Worst Case Input” transition (Delay depends on input values)
3. Example: tpLH for 2-input NAND
- Worst case when only ONE PMOS pulls up the output node
- For 2 PMOS devices in parallel, the resistance is lower
tpLH = 0.69 Rp CL
4. Example: tpHL for 2-input NAND
- Worst case: TWO NMOS in series
tpHL = 0.69 (2Rn) CL
[Figure: 2-input NAND switch model with resistors Rp for PMOS A and B in parallel to VDD, resistors Rn for NMOS A and B in series to ground, and load CL at output F]
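The two worst cases generalize to an n-input NAND in a straightforward way. A minimal sketch, ignoring internal node capacitances as the slide does; the function name and example values are illustrative:

```python
def nand_worst_case_delays(n_inputs: int, rn: float, rp: float, cl: float):
    """Worst-case switch-model delays of an n-input NAND.

    Pull-up: only one PMOS conducts. Pull-down: all n NMOS are in series.
    """
    t_plh = 0.69 * rp * cl
    t_phl = 0.69 * (n_inputs * rn) * cl
    return t_plh, t_phl

# Hypothetical values: Rn = Rp = 1 kOhm, CL = 1 pF, 2-input NAND.
t_plh, t_phl = nand_worst_case_delays(2, 1e3, 1e3, 1e-12)
print(t_plh, t_phl)
```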
Design for Worst Case
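[Figure: device sizing for the worst case. NAND gate: PMOS A and B of width 1 in parallel, NMOS A and B of width 2 in series, load CL. Complex gate: PMOS widths A = 2, B = 4, C = 4, D = 2; NMOS widths A = 2, B = 2, C = 2, D = 1]

Here it is assumed that Rp = Rn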


Fan-In and Fan-Out
Fan-Out: number of logic gates connected to output (2 FET gate capacitances per fan-out)
Fan-In: number of logical inputs
Quadratic delay term due to:
1. Resistance increasing
2. Capacitance increasing
for tpHL (series NMOS)

tp proportional to a1 FI + a2 FI² + a3 FO

[Figure: 4-input NAND with inputs A, B, C, D]


Fast Complex Gates - Design Techniques
• Increase Transistor Sizing:
Works as long as fan-out capacitance dominates self capacitance (S/D cap increases with increased width)
• Progressive Sizing:
M1 > M2 > M3 > MN
Can reduce delay by more than 30%!
[Figure: series NMOS stack viewed as a distributed RC line: M1 (input In1, node capacitance C1) at the bottom, then M2 (In2, C2), M3 (In3, C3), up to MN (InN) at the output Out with load CL]
Fast Complex Gates - Design Techniques (2)
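• Transistor Ordering
Place last arriving input closest to output node
[Figure: two orderings of a three-transistor series stack with internal node capacitances and load CL, critical path marked. (a) M3 (In3) next to the output, M1 (In1) at the bottom. (b) M1 (In1) next to the output, M3 (In3) at the bottom]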
Fast Complex Gates - Design Techniques (3)
• Improved Logic Design
Note fan-out capacitance is the same, but fan-in resistance is lower for input gates (fewer series FETs)



Fast Complex Gates - Design Techniques (4)
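• Buffering: Isolate Fan-in from Fan-out
[Figure: a high fan-in gate driving CL directly, compared with the same gate followed by a buffer that drives CL]
Keeps high fan-in resistance isolated from large capacitive load CL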
4 Input NAND Gate

[Figure: 4-input NAND gate with inputs In1 to In4, output Out, and supply rails VDD and GND]



Capacitances in a 4 input NAND Gate
[Figure: 4-input NAND gate annotated with device capacitances. PMOS devices 5 to 8 (inputs In1 to In4) each have Cgs, Cgd, Csb, Cdb; NMOS devices 1 to 4 (inputs In1 to In4) are stacked from Vout to ground with internal nodes 2, 3, 4 and capacitances Cgs, Cgd, Csb, Cdb]

Note that the value of Cload for calculating propagation delay depends on which capacitances need to be discharged or charged when the critical signal arrives.

Example: In1 = In3 = In4 = 1, In2 = 0. In2 switches from low to high. Hence, nodes 3 and 4 are already discharged to ground. In order for Vout to go from high to low, the Vout node and node 2 must be discharged.

CL = Cgd5 + Cgd7 + Cgd8 + 2Cgd6 (Miller) + Cdb5 + Cdb6 + Cdb7 + Cdb8 + Cgd1 + Cdb1 + Cgs1 + Csb1 + 2Cgd2 + Cdb2 + Cw



Next Topic: Sequential Logic

• Basic sequential circuits in CMOS

– RS latches, transparent latches, flip-flops

– Alternative sequential element topologies

– Pipelining



EEC 118 Lecture #7:
Designing with Logical Effort

Rajeevan Amirtharajah
University of California, Davis
Jeff Parkhurst
Intel Corporation
Announcements
• Lab 3 this week at lab section
• HW 3 due this Friday at 4 PM in box, Kemper
2131
• Quizzes will be handed back in lab section



Outline
• Review: CMOS Combinational Gate Design
• Finish Lecture 6 slides
• Logical Effort
• Combinational MOS Logic Circuits: Rabaey 6.1-6.2 (Kang & Leblebici, 7.1-7.4)



Acknowledgments
• Slides due to David Money Harris from E158:
Introduction to CMOS VLSI Design at Harvey Mudd
College



Review: Static CMOS
• Complementary pullup network (PUN) and pulldown network (PDN)
• Only one network is on at a time
• PUN: PMOS devices
– Why? Pulls up to VDD.
• PDN: NMOS devices
– Why? Pulls down to ground.
• PUN and PDN are dual networks
[Figure: inputs A, B, C feed both the PUN and the PDN; the two networks meet at output F]



Review: Dual Networks
• Dual networks: parallel connection in PDN = series connection in PUN, and vice versa
• If CMOS gate implements logic function F:
– PUN implements function F
– PDN implements function G = NOT(F)
Example: NAND gate
[Figure: 2-input NAND with inputs A, B and output F; PMOS devices in parallel, NMOS devices in series]
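For series-parallel networks, the dual can be obtained by swapping series and parallel connections at every level. The sketch below uses a nested-tuple representation, which is an illustrative choice and not the graph-based procedure from the earlier slides. It builds the PUN from the PDN of the complex-gate example, G = A•(B+C), and checks that exactly one network conducts for every input combination.

```python
from itertools import product

# A network is an input name (str) or a tuple ("s" | "p", child, child, ...).
def dual(net):
    """Swap series ("s") and parallel ("p") connections at every level."""
    if isinstance(net, str):
        return net
    kind, *children = net
    return ("p" if kind == "s" else "s", *map(dual, children))

def conducts(net, on):
    """True if the network conducts, given the set of transistors that are on."""
    if isinstance(net, str):
        return net in on
    kind, *children = net
    results = [conducts(child, on) for child in children]
    return all(results) if kind == "s" else any(results)

def exactly_one_network_on(pdn, pun, names):
    """NMOS turn on for inputs at 1, PMOS for inputs at 0."""
    for bits in product((0, 1), repeat=len(names)):
        ones = {n for n, b in zip(names, bits) if b}
        zeros = set(names) - ones
        if conducts(pdn, ones) == conducts(pun, zeros):
            return False
    return True

pdn = ("s", "A", ("p", "B", "C"))   # pulldown for G = A•(B+C)
pun = dual(pdn)
print(pun)
print(exactly_one_network_on(pdn, pun, ("A", "B", "C")))
```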



Lecture 6: Logical Effort
Outline
• Logical Effort
• Delay in a Logic Gate
• Multistage Logic Networks
• Choosing the Best Number of Stages
• Example
• Summary

Amirtharajah/Parkhurst, EEC 118 Spring 2011 CMOS VLSI Design 4th Ed. 8
Introduction
• Chip designers face a bewildering array of choices
– What is the best circuit topology for a function?
– How many stages of logic give least delay?
– How wide should the transistors be?
• Logical effort is a method to make these decisions
– Uses a simple model of delay
– Allows back-of-the-envelope calculations
– Helps make rapid comparisons between alternatives
– Emphasizes remarkable symmetries

Example
• Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded automotive processor. Help Ben design the decoder for a register file.
• Decoder specifications:
– 16 word register file
– Each word is 32 bits wide
– Each bit presents load of 3 unit-sized transistors
– True and complementary address inputs A[3:0]
– Each input may drive 10 unit-sized transistors
• Ben needs to decide:
– How many stages to use?
– How large should each gate be?
– How fast can decoder operate?
[Figure: 4:16 decoder driven by A[3:0] and its complement, with 16 outputs selecting the words of a 16-word by 32-bit register file]
Delay in a Logic Gate
• Express delays in process-independent unit: $d = d_{abs} / \tau$
– $\tau = 3RC$ ≈ 3 ps in 65 nm process, 60 ps in 0.6 μm process
• Delay has two components: $d = f + p$
• f: effort delay = gh (a.k.a. stage effort)
– Again has two components
• g: logical effort
– Measures relative ability of gate to deliver current
– g ≡ 1 for inverter
• h: electrical effort = Cout / Cin
– Ratio of output to input capacitance
– Sometimes called fanout
• p: parasitic (intrinsic) delay
– Represents delay of gate driving no load
– Set by internal parasitic capacitance

Delay Plots
d = f + p = gh + p

[Plot: normalized delay d versus electrical effort h = Cout / Cin, for h from 0 to 5. Inverter: g = 1, p = 1, d = h + 1. 2-input NAND: g = 4/3, p = 2, d = (4/3)h + 2. For each line, the intercept is the parasitic delay p and the rise above it is the effort delay f]

• What about NOR2?

Computing Logical Effort
• DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current.
• Measure from delay vs. fanout plots
• Or estimate by counting transistor widths
[Figure: transistor widths per input. Inverter: PMOS 2, NMOS 1, Cin = 3, g = 3/3. 2-input NAND: PMOS 2, NMOS 2, Cin = 4, g = 4/3. 2-input NOR: PMOS 4, NMOS 1, Cin = 5, g = 5/3]

Catalog of Gates
• Logical effort of common gates

Gate type        Number of inputs
                 1     2       3          4              n
Inverter         1
NAND                   4/3     5/3        6/3            (n+2)/3
NOR                    5/3     7/3        9/3            (2n+1)/3
Tristate / mux   2     2       2          2              2
XOR, XNOR              4, 4    6, 12, 6   8, 16, 16, 8

Catalog of Gates
• Parasitic delay of common gates
– In multiples of pinv (≈1)

Gate type        Number of inputs
                 1     2     3     4     n
Inverter         1
NAND                   2     3     4     n
NOR                    2     3     4     n
Tristate / mux   2     4     6     8     2n
XOR, XNOR              4     6     8
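The NAND and NOR rows of the two catalogs can be combined with the stage delay model d = gh + p. A minimal sketch using exact fractions; the function names are illustrative:

```python
from fractions import Fraction

def nand(n: int):
    """(logical effort, parasitic delay) of an n-input NAND."""
    return Fraction(n + 2, 3), n

def nor(n: int):
    """(logical effort, parasitic delay) of an n-input NOR."""
    return Fraction(2 * n + 1, 3), n

def stage_delay(g, h, p):
    """Normalized stage delay d = g*h + p, in units of tau."""
    return g * h + p

g_nor2, p_nor2 = nor(2)
print(stage_delay(g_nor2, 3, p_nor2))   # NOR2 at h = 3, i.e. d = (5/3)h + 2
```

Under these catalog values a NOR2 has d = (5/3)h + 2, a steeper line than the NAND2 with the same intercept.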

Example: Ring Oscillator
• Estimate the frequency of an N-stage ring oscillator
Logical Effort: g = 1
Electrical Effort: h = 1
Parasitic Delay: p = 1
Stage Delay: d = 2
Frequency: fosc = 1/(2*N*d) = 1/(4N)
31 stage ring oscillator in 0.6 μm process has frequency of ~ 200 MHz

Example: FO4 Inverter
• Estimate the delay of a fanout-of-4 (FO4) inverter
Logical Effort: g = 1
Electrical Effort: h = 4
Parasitic Delay: p = 1
Stage Delay: d = 5
The FO4 delay is about 300 ps in 0.6 μm process, 15 ps in a 65 nm process
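Converting the normalized FO4 delay to absolute time only needs the value of τ for the process. A short sketch using the τ figures quoted earlier in this lecture (60 ps at 0.6 μm, 3 ps at 65 nm); the dictionary and function names are illustrative:

```python
TAU_PS = {"0.6 um": 60.0, "65 nm": 3.0}   # tau per process, in picoseconds

def fo4_delay_ps(tau_ps: float, g: float = 1.0, p: float = 1.0) -> float:
    """Absolute FO4 inverter delay: (g*4 + p) * tau."""
    return (g * 4.0 + p) * tau_ps

for process, tau in TAU_PS.items():
    print(process, fo4_delay_ps(tau), "ps")
```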

Multistage Logic Networks
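• Logical effort generalizes to multistage networks
• Path Logical Effort: $G = \prod g_i$
• Path Electrical Effort: $H = \frac{C_{out\text{-}path}}{C_{in\text{-}path}}$
• Path Effort: $F = \prod f_i = \prod g_i h_i$
[Figure: four-stage path with input capacitance 10, stage input capacitances x, y, z, and load 20]
g1 = 1, g2 = 5/3, g3 = 4/3, g4 = 1
h1 = x/10, h2 = y/x, h3 = z/y, h4 = 20/z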

Multistage Logic Networks
• Logical effort generalizes to multistage networks
• Path Logical Effort: $G = \prod g_i$
• Path Electrical Effort: $H = \frac{C_{out\text{-}path}}{C_{in\text{-}path}}$
• Path Effort: $F = \prod f_i = \prod g_i h_i$
• Can we write F = GH?

Paths that Branch
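• No! Consider paths that branch:
[Figure: an inverter with input capacitance 5 drives two inverters with input capacitance 15 each; each of those drives a load of 90]
G = 1
H = 90 / 5 = 18
GH = 18
h1 = (15 + 15) / 5 = 6
h2 = 90 / 15 = 6
F = g1 g2 h1 h2 = 36 = 2GH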

Branching Effort
• Introduce branching effort
– Accounts for branching between stages in path
$b = \frac{C_{on\ path} + C_{off\ path}}{C_{on\ path}}$
$B = \prod b_i$
Note: $\prod h_i = BH$
• Now we compute the path effort
– F = GBH
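The branching example (input capacitance 5, two branches of 15, loads of 90) can be used to check F = GBH numerically. A minimal sketch with exact fractions; the helper name is illustrative:

```python
from fractions import Fraction
from math import prod

def branching_effort(c_on, c_off):
    """b = (C_on_path + C_off_path) / C_on_path for one stage."""
    return Fraction(c_on + c_off, c_on)

g = [1, 1]                                        # two inverters on the path
h = [Fraction(15 + 15, 5), Fraction(90, 15)]      # per-stage electrical efforts

G = prod(g)
H = Fraction(90, 5)
B = branching_effort(15, 15) * branching_effort(90, 0)
F = prod(gi * hi for gi, hi in zip(g, h))

print(G, H, B, F, G * B * H)
```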

Multistage Delays
• Path Effort Delay: $D_F = \sum f_i$
• Path Parasitic Delay: $P = \sum p_i$
• Path Delay: $D = \sum d_i = D_F + P$

Designing Fast Circuits
$D = \sum d_i = D_F + P$
• Delay is smallest when each stage bears same effort
$\hat{f} = g_i h_i = F^{1/N}$
• Thus minimum delay of N stage path is
$D = N F^{1/N} + P$
• This is a key result of logical effort
– Find fastest possible delay
– Doesn’t require calculating gate sizes

Gate Sizes
• How wide should the gates be for least delay?
$\hat{f} = gh = g \frac{C_{out}}{C_{in}} \;\Rightarrow\; C_{in_i} = \frac{g_i C_{out_i}}{\hat{f}}$
• Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives.
• Check work by verifying input cap spec is met.

Example: 3-stage path
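• Select gate sizes x and y for least delay from A to B
[Figure: three-stage path from A to B. A NAND2 with input capacitance 8 drives three NAND3 gates of input capacitance x; each NAND3 drives two NOR2 gates of input capacitance y; each NOR2 drives a load of 45]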

Example: 3-stage path
[Figure: the three-stage path from A (input capacitance 8) through gates of size x and y to B (load 45)]
Logical Effort: G = (4/3)*(5/3)*(5/3) = 100/27
Electrical Effort: H = 45/8
Branching Effort: B = 3*2 = 6
Path Effort: F = GBH = 125
Best Stage Effort: $\hat{f} = \sqrt[3]{F} = 5$
Parasitic Delay: P = 2 + 3 + 2 = 7
Delay: D = 3*5 + 7 = 22 = 4.4 FO4

Example: 3-stage path
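• Work backward for sizes
y = 45 * (5/3) / 5 = 15
x = (15*2) * (5/3) / 5 = 10
[Figure: the sized path. First stage (input capacitance 8): P = 4, N = 4. Second stage (x = 10): P = 4, N = 6. Third stage (y = 15): P = 12, N = 3. Load 45]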
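The whole 3-stage example can be replayed in a few lines. This sketch assumes the path is NAND2, NAND3, NOR2 with fanouts of 3 and 2 to identical gates, which is what the logical, branching, and parasitic values in the example imply; the function and variable names are illustrative.

```python
from fractions import Fraction as Fr

g = [Fr(4, 3), Fr(5, 3), Fr(5, 3)]   # logical effort per stage
b = [3, 2, 1]                        # copies of the next stage driven by each stage
p = [2, 3, 2]                        # parasitic delay per stage
c_in, c_load = 8, 45

G = g[0] * g[1] * g[2]
B = b[0] * b[1] * b[2]
H = Fr(c_load, c_in)
F = G * B * H
f_hat = Fr(5)                        # cube root of F, verified below
D = 3 * f_hat + sum(p)

def size_path(g, b, c_load, f_hat):
    """Work backward from the load: C_in = g * (b * C_out) / f_hat per stage."""
    caps, c_out = [], Fr(c_load)
    for gi, bi in zip(reversed(g), reversed(b)):
        c_out = gi * bi * c_out / f_hat
        caps.append(c_out)
    return caps[::-1]

print(F, D, size_path(g, b, c_load, f_hat))
```

The first entry of the returned list is the input capacitance of the first stage, so it doubles as the check that the input capacitance spec of 8 is met.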

Best Number of Stages
• How many stages should a path use?
– Minimizing number of stages is not always fastest
• Example: drive 64-bit datapath with unit inverter
$D = N F^{1/N} + P = N (64)^{1/N} + N$

N:   1     2     3     4
f:   64    8     4     2.8
D:   65    18    15    15.3
Fastest: N = 3

[Figure: inverter chains from an initial unit driver to a datapath load of 64. N = 1: 1. N = 2: 1, 8. N = 3: 1, 4, 16. N = 4: 1, 2.8, 8, 23]
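The table can be regenerated by evaluating D = N·64^(1/N) + N for a range of N. A minimal sketch, assuming every stage is an inverter with parasitic delay 1; the names are illustrative:

```python
def path_delay(n: int, path_effort: float = 64.0, p_inv: float = 1.0) -> float:
    """Delay of an n-inverter chain with equal stage efforts."""
    return n * path_effort ** (1.0 / n) + n * p_inv

delays = {n: path_delay(n) for n in range(1, 7)}
best = min(delays, key=delays.get)

for n, d in delays.items():
    print(n, round(d, 1))
print("fastest N:", best)
```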

Derivation
• Consider adding inverters to end of path
– How many give least delay?
[Figure: logic block of n1 stages with path effort F, followed by N − n1 extra inverters]
$D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1)\, p_{inv}$
$\frac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$
• Define best stage effort $\rho = F^{1/N}$
$p_{inv} + \rho\,(1 - \ln \rho) = 0$

Best Stage Effort
• $p_{inv} + \rho\,(1 - \ln \rho) = 0$ has no closed-form solution
• Neglecting parasitics (pinv = 0), we find ρ = 2.718 (e)
• For pinv = 1, solve numerically for ρ = 3.59
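One simple way to solve the equation numerically is bisection, since the left-hand side is positive at ρ = 1 and decreases for ρ > 1. The bracket [1, 20] and the tolerance are arbitrary choices for this sketch:

```python
import math

def best_stage_effort(p_inv: float, lo: float = 1.0, hi: float = 20.0,
                      tol: float = 1e-9) -> float:
    """Solve p_inv + rho*(1 - ln rho) = 0 for rho by bisection."""
    def f(rho: float) -> float:
        return p_inv + rho * (1.0 - math.log(rho))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:      # still below the root
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(best_stage_effort(0.0))   # approaches e
print(best_stage_effort(1.0))   # approaches 3.59
```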

Sensitivity Analysis
• How sensitive is delay to using exactly the best number of stages?
[Plot: D(N)/D(N̂) versus N/N̂, for N/N̂ from 0.5 to 2.0. The curve has its minimum of 1.0 at N/N̂ = 1; labeled values are 1.51, 1.26, and 1.15, with the points ρ = 6 and ρ = 2.4 marked]
• 2.4 < ρ < 6 gives delay within 15% of optimal
– We can be sloppy!
– I like ρ = 4

Example, Revisited
• Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded automotive processor. Help Ben design the decoder for a register file.
• Decoder specifications:
– 16 word register file
– Each word is 32 bits wide
– Each bit presents load of 3 unit-sized transistors
– True and complementary address inputs A[3:0]
– Each input may drive 10 unit-sized transistors
• Ben needs to decide:
– How many stages to use?
– How large should each gate be?
– How fast can decoder operate?
[Figure: 4:16 decoder driven by A[3:0] and its complement, with 16 outputs selecting the words of a 16-word by 32-bit register file]
Number of Stages
• Decoder effort is mainly electrical and branching
Electrical Effort: H = (32*3) / 10 = 9.6
Branching Effort: B = 8
• If we neglect logical effort (assume G = 1)
Path Effort: F = GBH = 76.8
Number of Stages: N = log4 F = 3.1
• Try a 3-stage design

Gate Sizes & Delay
Logical Effort: G = 1 * 6/3 * 1 = 2
Path Effort: F = GBH = 154
Stage Effort: $\hat{f} = F^{1/3} = 5.36$
Path Delay: $D = 3\hat{f} + 1 + 4 + 1 = 22.1$
Gate sizes: z = 96*1/5.36 = 18, y = 18*2/5.36 = 6.7
[Figure: 3-stage decoder. Each true and complementary address input A[3:0] presents 10 units of capacitance; gates of size y and z drive each word line, word[0] through word[15], which has 96 units of wordline capacitance]

Comparison
• Compare many alternatives with a spreadsheet
• $D = N (76.8\, G)^{1/N} + P$

Design                         N   G      P   D
NOR4                           1   3      4   234
NAND4-INV                      2   2      5   29.8
NAND2-NOR2                     2   20/9   4   30.1
INV-NAND4-INV                  3   2      6   22.1
NAND4-INV-INV-INV              4   2      7   21.1
NAND2-NOR2-INV-INV             4   20/9   6   20.5
NAND2-INV-NAND2-INV            4   16/9   6   19.7
INV-NAND2-INV-NAND2-INV        5   16/9   7   20.4
NAND2-INV-NAND2-INV-INV-INV    6   16/9   8   21.6
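The spreadsheet comparison is easy to reproduce in code. The sketch below copies N, G, and P from the table and evaluates D = N(76.8 G)^(1/N) + P for each design; the variable names are illustrative:

```python
from fractions import Fraction as Fr

BH = 76.8   # branching effort 8 times electrical effort 9.6

# (design, N, G, P) copied from the comparison table
DESIGNS = [
    ("NOR4", 1, Fr(3), 4),
    ("NAND4-INV", 2, Fr(2), 5),
    ("NAND2-NOR2", 2, Fr(20, 9), 4),
    ("INV-NAND4-INV", 3, Fr(2), 6),
    ("NAND4-INV-INV-INV", 4, Fr(2), 7),
    ("NAND2-NOR2-INV-INV", 4, Fr(20, 9), 6),
    ("NAND2-INV-NAND2-INV", 4, Fr(16, 9), 6),
    ("INV-NAND2-INV-NAND2-INV", 5, Fr(16, 9), 7),
    ("NAND2-INV-NAND2-INV-INV-INV", 6, Fr(16, 9), 8),
]

def path_delay(n: int, g, p: int, bh: float = BH) -> float:
    """Minimum path delay for n stages with path effort bh * g."""
    return n * (bh * float(g)) ** (1.0 / n) + p

results = {name: path_delay(n, g, p) for name, n, g, p in DESIGNS}
best = min(results, key=results.get)

for name, d in results.items():
    print(f"{name:30s} {d:6.1f}")
print("fastest:", best)
```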

Review of Definitions
Term               Stage                                                                         Path
number of stages   1                                                                             N
logical effort     g                                                                             $G = \prod g_i$
electrical effort  $h = \frac{C_{out}}{C_{in}}$                                                  $H = \frac{C_{out\text{-}path}}{C_{in\text{-}path}}$
branching effort   $b = \frac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}$     $B = \prod b_i$
effort             f = gh                                                                        F = GBH
effort delay       f                                                                             $D_F = \sum f_i$
parasitic delay    p                                                                             $P = \sum p_i$
delay              d = f + p                                                                     $D = \sum d_i = D_F + P$

Method of Logical Effort
1) Compute path effort: F = GBH
2) Estimate best number of stages: $N = \log_4 F$
3) Sketch path with N stages
4) Estimate least delay: $D = N F^{1/N} + P$
5) Determine best stage effort: $\hat{f} = F^{1/N}$
6) Find gate sizes: $C_{in_i} = \frac{g_i C_{out_i}}{\hat{f}}$

Limits of Logical Effort
• Chicken and egg problem
– Need path to compute G
– But don’t know number of stages without G
• Simplistic delay model
– Neglects input rise time effects
• Interconnect
– Iteration required in designs with wire
• Maximum speed only
– Not minimum area/power for constrained delay

Summary
• Logical effort is useful for thinking of delay in circuits
– Numeric logical effort characterizes gates
– NANDs are faster than NORs in CMOS
– Paths are fastest when effort delays are ~4
– Path delay is weakly sensitive to stages, sizes
– But using fewer stages doesn’t mean faster paths
– Delay of path is about log4 F FO4 inverter delays
– Inverters and NAND2 best for driving large caps
• Provides language for discussing fast circuits
– But requires practice to master

Next Topic: Sequential Logic

• Basic sequential circuits in CMOS

– RS latches, transparent latches, flip-flops

– Alternative sequential element topologies

– Pipelining



EEC 118 Lecture #8:
CMOS Logic Transient
Characteristics

Rajeevan Amirtharajah
University of California, Davis
Jeff Parkhurst
Intel Corporation
Announcements
• Quiz 2 on Monday, April 26
• Midterm on Monday, May 3
– Covers material through Lecture (Monday 4/26)
• HW4 due Friday, 4PM in box, Kemper 2131
• Lab 3, Part 2 report due next week

Amirtharajah/Parkhurst, EEC 118 Spring 2010 2


Outline
• Review: Static CMOS Logic
• Finish equivalent inverter discussion
• Combinational MOS Logic Circuits: Rabaey 6.1-
6.2, 7.1-7.3 (Kang & Leblebici, 7.1-7.4)



Review: Static CMOS
• Complementary pullup network (PUN) and pulldown network (PDN)
• Only one network is on at a time
• PUN: PMOS devices
– Why? VOH = VDD
• PDN: NMOS devices
– Why? VOL = 0 V
• PUN and PDN are dual networks
[Figure: inputs A, B, C feed both the PUN and the PDN; the two networks meet at output F]





Review: Equivalent Inverter
• Represent complex gate as inverter for delay
estimation, VTC analysis
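• Use worst-case conditions for delays
• Example: NAND gate
– Worst-case (slowest) pull-up: only 1 PMOS “on”
– Pull-down: both NMOS “on”
[Figure: 2-input NAND with PMOS widths WP and NMOS widths WN, and its equivalent inverter with pull-up width WP and pull-down width ½ WN]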





Next Topic: Sequential Logic

• Basic sequential circuits in CMOS

– RS latches, transparent latches, flip-flops

– Alternative sequential element topologies

– Pipelining



EEC 118 Lecture #9:
Sequential Logic

Rajeevan Amirtharajah
University of California, Davis
Jeff Parkhurst
Intel Corporation
Outline
• Review: Static CMOS Logic
• Finish Static CMOS transient analysis
• Sequential MOS Logic Circuits: Rabaey, 7.1-7.3
(Kang & Leblebici, 8.1-8.5)



Sequential Logic Basic Definition
• Combinational circuits’ output is a function of the
circuit inputs and a delay time
– Examples: NAND, NOR, XOR, adder, multiplier
• Sequential circuits’ output is a function of the
circuit inputs, previous circuit state, and a delay
time
– Examples: Latches, flip-flops, FSMs, pipelined
adders and multipliers, microprocessors
– Sequential elements are critical to implementing
techniques such as feedback or blocks such as
memory



Sequential Logic Example: Mealy FSM

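[Figure: combinational LOGIC block (delay tp,comb) between In and Out, with state fed back through storage elements clocked by Φ]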
• Two information storage mechanisms
– Positive feedback-based (static) circuits
– Charge storage-based (dynamic) circuits
• Clock signal Φ controls timing of state (memory)
updates
Positive Feedback: Bistability

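[Figure: two cascaded inverters with Vo1 = Vi2. Overlaying the two transfer curves with Vi1 = Vo2 gives three operating points: stable points A and B, and metastable point C]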
Metastability

[Figure: two plots of Vo1 = Vi2 versus Vi1 = Vo2 with operating points A, B, and C, each showing how a small deviation δ evolves]

Gain should be larger than 1 in the transition region



Bistable Elements
• Bistable elements have two stable states or
operation modes
• Cross-coupled inverters are the most basic bistable
element
– Circuit forms the basis of latches and SRAM memory
– Stable points on the VTC are those with the lowest
energy
– Points with high energy are unstable, perturbations
are amplified



Set-Reset (SR) Latch
• Change inverters to NAND or NOR gates, with
second inputs = S(set) and R(reset)
[Figure: two SR latches built from cross-coupled gates, each with inputs S and R and outputs Q and Q̄]
• Allows control of the state of the bistable element
• One input state is not allowed
• Gating S and R with the clock prevents the latch
from responding except during one phase of the
clock cycle



SR Latch
• Sequential circuits: circuits which “store state”:
circuits with memory elements
• Latches: store previous output value for certain
input combinations
• SR latch (NAND-based):

S R | Qnext  Q̄next
0 0 |   1      1     (not allowed)
0 1 |   1      0
1 0 |   0      1
1 1 |   Q      Q̄     (memory)

[Figure: NAND-based SR latch with inputs S, R and outputs Q, Q̄]
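The truth table can be reproduced by iterating the two cross-coupled NAND equations until the outputs settle. This is a purely logical sketch with no timing, and the function names are illustrative:

```python
def nand(a: int, b: int) -> int:
    return int(not (a and b))

def sr_nand_latch(s: int, r: int, q: int, q_bar: int, max_iters: int = 10):
    """Settle a cross-coupled NAND latch: Q = NAND(S, Qbar), Qbar = NAND(R, Q)."""
    for _ in range(max_iters):
        new_q = nand(s, q_bar)
        new_q_bar = nand(r, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break                      # outputs are stable
        q, q_bar = new_q, new_q_bar
    return q, q_bar

print(sr_nand_latch(0, 1, q=0, q_bar=1))   # S low sets Q
print(sr_nand_latch(1, 0, q=1, q_bar=0))   # R low resets Q
print(sr_nand_latch(1, 1, q=1, q_bar=0))   # both high: state is held
print(sr_nand_latch(0, 0, q=0, q_bar=1))   # both low: both outputs forced to 1
```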
Other Latches
• Clocked SR latch
– Adds clock input. Latch output can only be
set/reset when clk=1 (or clk=0)
• Other latch types:
– JK latch: Removes “not allowed” state – e.g.,
toggles when inputs are both 1
– T latch: Toggles when T input = 1
– D latch: Output = D input



Latch Circuits
• Many methods for implementing latches
– Standard CMOS gates (cross-coupled NAND, etc)
– Transmission gates
– Tri-state inverters
[Figure: tri-state inverter with input A, output F, and enable inputs en and its complement]
When en = 0, F is “floating”, i.e. high impedance
Positive Dynamic Transmission Gate Latch

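[Figure: a transmission gate clocked by Clk and its complement passes D onto capacitance C0 at the input of inverter I0, which drives Q]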
• No feedback devices
• Data stored on input capacitance of inverter I0
• Dynamic logic issues apply: leakage, capacitive
coupling, charge sharing
Transmission Gate Positive Static Latch

[Figure: transmission gate positive static latch with input D, output Q, and clock Clk]
NMOS Pass Gate Positive Static Latch

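[Figure: NMOS pass gate static latch with input D, clock Clk, and outputs Q and Q̄; the internal node driven through the NMOS device reaches only VDD − VTn]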
• Fewer devices, less area, lower clock load
• Threshold drop on internal nodes implies more static
power, less noise margin
Master-Slave Flip-Flop
• By cascading two level-sensitive latches, one
type of edge triggered flip-flop is created
• JK latch can be used for first stage so that no
input combinations are invalid
• SR latch is then used for the second stage
because inputs cannot be invalid
• Rather than using logic gate-based latches, can
cascade latches such as above (e.g.,
transmission gate dynamic or static latches)



Edge-Triggered Flip-Flops
• Types of latches/flip-flops:
– Level-sensitive: output is set when clock is a
certain level (0 or 1)
– Edge-triggered: output can only be set on a clock
edge (rising or falling)
• Advantages of edge-triggered flip-flops:
– Data only needs to be stable at clock edge
– Reduces race conditions: potential errors where
an input data change travels through multiple
latches during their “transparent” phase

Amirtharajah/Parkhurst, EEC 118 Spring 2010 17


Dynamic Positive Edge-Triggered FF
Clk Clk
Q
D I0 I1

C0 C1
Clk Clk
• No feedback devices
• Data stored on input capacitances of inverters I0 and I1
• Dynamic logic issues apply: leakage, capacitive
coupling, charge sharing
Amirtharajah/Parkhurst, EEC 118 Spring 2010 18
Clocked Circuit Timing
• Timing definitions:
– Clock-to-Q or Propagation Delay (tclkQ): delay of
flip-flop from clock edge to output Q
– Setup Time (tsetup): amount of time before clock
edge that data has to be stable. If data arrives
after this time, it will not be latched correctly.
– Hold Time (thold): amount of time after clock edge
that data has to be stable.
• It is possible to trade off setup and hold time with
flip-flop circuit design
– Modify data and clock timing relationship by
delaying one of the two signals

Amirtharajah/Parkhurst, EEC 118 Spring 2010 19


Flip-Flop: Timing Definitions
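[Figure: timing diagram of clock φ, input In, and output Out versus time t. In is marked DATA STABLE from tsetup before the clock edge until thold after it; Out is marked DATA STABLE tpFF after the clock edge.]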

From Digital Integrated Circuits – Jan Rabaey Notes


Amirtharajah/Parkhurst, EEC 118 Spring 2010 20
Maximum Clock Frequency

[Figure: register clocked by Φ around a combinational LOGIC block with delay tp,comb, from In to Out]
$t_{pFF} + t_{p,comb} + t_{setup} < T = \frac{1}{f}$
• Signals must propagate out of flip-flop, through
combinational logic, and be stable before next
clock edge (clock period = T, clock frequency = f)
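The timing constraint above can be turned into a quick frequency estimate. This is a minimal sketch; the function name and the delay values are hypothetical and chosen only for illustration.

```python
def max_clock_frequency(t_pff, t_p_comb, t_setup):
    """Upper bound on clock frequency (Hz) for delays given in seconds."""
    # The clock period T must exceed t_pFF + t_p,comb + t_setup.
    t_min = t_pff + t_p_comb + t_setup
    return 1.0 / t_min

# Hypothetical delays: 100 ps clock-to-Q, 700 ps logic, 200 ps setup
f_max = max_clock_frequency(100e-12, 700e-12, 200e-12)
print(f"f_max is about {f_max / 1e9:.2f} GHz")
```

Any margin for clock skew or jitter would have to be added to the period separately; it is not part of the inequality on this slide.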
Amirtharajah/Parkhurst, EEC 118 Spring 2010 21
Staticized Dynamic Positive Edge-Triggered FF

Clk I1 Clk I3

Q
D I0 I2

C0 C1
Clk Clk

• Use weak feedback inverters to enhance robustness


• Returns to reduced clock load static flip-flop with same
sizing issues
Amirtharajah/Parkhurst, EEC 118 Spring 2010 22
Clock Overlap Failures
B
Clk

Clk Clk
D
A Clk

Clk Q

Clk
1. Clk and ~Clk both high simultaneously: race condition from D to Q
2. Node A can be driven simultaneously by D and B
Amirtharajah/Parkhurst, EEC 118 Spring 2010 23
Race Through and Feedback Paths
B
Clk

Clk Clk
D
A Clk

Clk Q

Clk
1. Clk and ~Clk both high simultaneously: race condition from D to Q
2. Node A can be driven simultaneously by D and B
Amirtharajah/Parkhurst, EEC 118 Spring 2010 24
Nonoverlapping Clocks Methodology
B
PHI 0
PHI 1 PHI 1
D
A PHI 0

PHI 0 Q

PHI1
• Guarantee nonoverlap period long enough
• Note: internal nodes left high Z during nonoverlap
Amirtharajah/Parkhurst, EEC 118 Spring 2010 25
C2MOS Edge Triggered Flip-Flop

Clk Clk
D Q
Clk C0 Clk C1

• Tristate inverters eliminate clock overlap race condition


Amirtharajah/Parkhurst, EEC 118 Spring 2010 26
Zero-Zero Overlap Condition

Gnd Gnd
D Q

C0 C1

• Both phases low simultaneously enables opposite nets


Amirtharajah/Parkhurst, EEC 118 Spring 2010 27
High-High Overlap Condition

D Q
VDD VDD
C0 C1

• Both phases high simultaneously enables opposite nets


Amirtharajah/Parkhurst, EEC 118 Spring 2010 28
C2MOS Design
• Clock overlap problems eliminated as long as rise and
fall times remain fast
– Slow rise / fall times imply pullup and pulldown nets on
simultaneously resulting in potential errors, static power
• Dynamic flip-flop style leaves output high Z
– Must take care when using since output wire could be
exposed to many more noise sources than internal nodes
• Mix and match styles by using C2MOS as master and
other types of latch as slave
• Clock load small, but potentially larger than
transmission gate dynamic latches due to PMOS sizing

Amirtharajah/Parkhurst, EEC 118 Spring 2010 29


Pipelining
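[Figure: non-pipelined version vs. pipelined version of a datapath with inputs a and b, a log block, and output Out. Registers (REG) are clocked by φ; the pipelined version has additional REG stages between the logic blocks.]
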

From Digital Integrated Circuits – Jan Rabaey Notes


Amirtharajah/Parkhurst, EEC 118 Spring 2010 30
Next Topic: Arithmetic Circuits

• Computing arithmetic functions with CMOS logic

– Half adder and full adder circuits

– Circuit architectures for addition

– Array multipliers

Amirtharajah/Parkhurst, EEC 118 Spring 2010 31


EEC 118 Lecture #10:
Dynamic Logic

Rajeevan Amirtharajah
University of California, Davis

Jeff Parkhurst
Intel Corporation
Announcements
• Complete Lab 4 this week

Amirtharajah/Parkhurst, EEC 118 Spring 2011 2


Outline
• Today: Alternative MOS Logic Styles
• Dynamic MOS Logic Circuits: Rabaey 6.3 (Kang &
Leblebici, 9.4-9.6)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 3


Review: Transmission Gate XOR

S
S

A F = A⊕ S

S
S
• If S = 0, F = A and when S = 1, F = ~A
Amirtharajah/Parkhurst, EEC 118 Spring 2011 4
Review: Transmission Gate Multiplexer

F = A·S + B·~S
A

S
Amirtharajah/Parkhurst, EEC 118 Spring 2011 5
Dynamic CMOS
• Operation
– Clk low during precharge
• Mp is on while Mn is off
• Output charged to Vdd
– Clk high during evaluate
• Mn is on while Mp is off
• Output pulled down according to PDN function
• PDN design same as static CMOS
[Figure: dynamic gate with clocked PMOS Mp at the top, an NMOS network in the middle, and clocked NMOS Mn connected to Gnd]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 6


Dynamic CMOS Tradeoffs
• Advantages:
– Faster – why?
• Reduced input load
• No switching contention
– Less layout area
• Disadvantages:
– Multiple stage issues
– Charge leakage
– Charge sharing
– Capacitive coupling
– Cannot be cascaded
– Complicated timing/clocking
– Higher power
– Lower noise margins
– Does not scale well with process
Amirtharajah/Parkhurst, EEC 118 Spring 2011 7
Multiple Stage Issue: Output Discharge Race

[Figure: two cascaded dynamic gates, each with clocked Mp and Mn; Out 1 of the first gate (NMOS network) drives the second gate, whose output is Out 2]
• During pre-charge stage, inputs to second gate are all
high: Out 2 could discharge before Out 1 discharges
Amirtharajah/Parkhurst, EEC 118 Spring 2011 8
Cascading Multiple Stages
• During pre-charge stage, inputs to second gate are
all high
– At the beginning of evaluate stage, Out 2 is
discharged.
– Out 1 goes through its evaluation stage concurrently
and goes low
• Hence Out 2 was supposed to stay high, but it has already been discharged.
• Dynamic logic driven by the same clock cannot be
cascaded directly

Amirtharajah/Parkhurst, EEC 118 Spring 2011 9


Charge Sharing
• Output is floating after clk = ‘1’ if inputs are ‘0’
• If upper transistors in a stack switch, the
intermediate and output node voltages will be
equalized, possibly leading to a drop in the
output voltage = noise
• Final output (initial charge distributed over both
capacitors):
V=(C1V1+C2V2)/(C1+C2)
C1 C2
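The charge-sharing formula above is easy to check numerically. A minimal sketch, assuming hypothetical capacitor values: a precharged output node shares charge with a discharged internal stack node.

```python
def charge_share(c1, v1, c2, v2):
    """Final voltage after capacitors c1 (at v1) and c2 (at v2) are connected."""
    # Total charge is conserved and redistributed over both capacitors.
    return (c1 * v1 + c2 * v2) / (c1 + c2)

# Hypothetical values: 20 fF output node precharged to 2.5 V,
# 5 fF internal stack node initially at 0 V.
v_final = charge_share(20e-15, 2.5, 5e-15, 0.0)
v_drop = 2.5 - v_final
print(f"output settles at {v_final:.2f} V, a drop of {v_drop:.2f} V")
```

The larger the output capacitance relative to the internal node, the smaller the drop.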

Amirtharajah/Parkhurst, EEC 118 Spring 2011 10


Charge Leakage & Capacitive Coupling
• Output is floating after clk = ‘1’ if inputs are ‘0’
• Since the current is not 0 when transistors are in
cutoff, current can leak charge away from the
output when all inputs are ‘0’
• Changes in input signals couple to the output and
intermediate nodes, also resulting in voltage
drops

Amirtharajah/Parkhurst, EEC 118 Spring 2011 11


Noise Solutions
• Charge sharing:
– Ensure the output capacitance is large enough
such that the voltage drop is minimal
– Precharge internal stack nodes to VDD
– Pre-discharging internal stack nodes can increase
performance, but worsens noise
• Charge leakage/sharing and capacitive coupling:
– Add a keeper PMOS (weak P pullup) – increased
evaluation contention

Amirtharajah/Parkhurst, EEC 118 Spring 2011 12


Domino Logic

clk Mp
clk Mp

Out 2
NMOS 1
network Out 1
clk Mn

clk Mn
Gnd

Gnd
Amirtharajah/Parkhurst, EEC 118 Spring 2011 13
Domino Logic
• Add an inverter between dynamic gates
– Inverter drives the gate’s fanout – increased
performance
• Sometimes the inverter is replaced with a more
complex static CMOS gate
– Incorporates more logic per stage to improve
speed
• Static CMOS gate improves overall circuit
dynamic noise margins

Amirtharajah/Parkhurst, EEC 118 Spring 2011 14


Cascading Domino
• For gates with all inputs coming from other
domino gates, the bottom NMOS transistor can
be eliminated
– Why? All inputs will be ‘0’ during precharge and
can only transition from ‘0’ to ‘1’ during evaluate
due to inverter between stages…
– Results in increased performance due to
decreased stack height
– Precharge now depends on input precharge time

Amirtharajah/Parkhurst, EEC 118 Spring 2011 15


Dynamic Logic Power
• Power depends upon switching activity
– Switching activity depends upon the probability of a ‘1’
input
$P_{avg} = \frac{1}{T} C_{load} V_{DD}^2 = C_{load} V_{DD}^2 f$
– Effective capacitance Cload is doubled when the gate
evaluates because the gate must later precharge
– Frequency must be multiplied by the probability that an
evaluation will occur
• Power is usually higher for domino logic except when it
replaces prior logic with very high activity factors
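As a rough illustration of the power expression and the activity scaling described above, the sketch below multiplies the clock frequency by an evaluation probability. The function name, parameter names, and numbers are assumptions for illustration, not values from the course.

```python
def dynamic_gate_power(c_load, v_dd, f_clk, p_evaluate):
    """Average switching power (W) of a dynamic gate.

    p_evaluate is the probability that the output discharges in a cycle,
    so the effective switching frequency is p_evaluate * f_clk.
    """
    return c_load * v_dd ** 2 * f_clk * p_evaluate

# Hypothetical gate: 50 fF load, 1.2 V supply, 1 GHz clock,
# output discharges in half of the cycles.
p_avg = dynamic_gate_power(50e-15, 1.2, 1e9, 0.5)
print(f"P_avg is about {p_avg * 1e6:.1f} uW")
```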

Amirtharajah/Parkhurst, EEC 118 Spring 2011 16


Bottom Line
• Tradeoff between performance and power exists
• Many things can go wrong from a design
standpoint (high risk)
– Charge sharing, noise, leakage currents, race
conditions
• Debugging challenge, especially in deep
submicron
– Leakage currents put lower bound on clock
frequency for testing
• Best to use dynamic logic only when necessary
– High performance circuits such as microprocessor
critical paths
Amirtharajah/Parkhurst, EEC 118 Spring 2011 17
NORA Logic Gate

[Figure: NORA logic; a dynamic stage built around an NMOS network followed by a dynamic stage built around a PMOS network, each with clocked Mp and Mn devices]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 18


NORA Logic
• Solves problem of cascading dynamic gates, but
is vulnerable to noise
– Alternate P-dynamic and N-dynamic stages
– Both clk and ~clk required
– Lose some of the speed benefits due to added
PUNs

Zipper Logic
• Like NORA logic…but,
– PMOS precharge and NMOS pre-discharge
weakly on during evaluation stage…

Amirtharajah/Parkhurst, EEC 118 Spring 2011 19


Variations on the Domino Theme
• Multiple-Output Domino
– Exploit situation when certain outputs are subsets of
other outputs to reduce area
– Precharge intermediate nodes in PDN and follow with
inverters to drive other N-block dynamic gates
• Compound Domino
– Use complex static CMOS gates (NANDs, NORs) on
outputs of multiple dynamic gates in parallel
– Replaces large fanin domino gates with lower fanin
gates
– Capacitive coupling from static gate outputs to dynamic
gate outputs an issue
Amirtharajah/Parkhurst, EEC 118 Spring 2011 20
Multiple Output Domino CMOS Logic

Clk Clk
Out0
In0
In1 PDN Out1
In4 PDN
In5
Clk

Amirtharajah/Parkhurst, EEC 118 Spring 2011 21


Compound Domino CMOS Logic

Clk Clk Out


In0
In1 PDN In4
PDN
In2 In5
Clk Clk

Amirtharajah/Parkhurst, EEC 118 Spring 2011 22


Next Topic: Arithmetic

• Computing arithmetic functions with CMOS logic

– Half adder and full adder circuits

– Circuit architectures for addition

– Array multipliers

Amirtharajah/Parkhurst, EEC 118 Spring 2011 23


EEC 118 Lecture #11:
CMOS Design Guidelines
Alternative Static Logic
Families
Rajeevan Amirtharajah
University of California, Davis

Jeff Parkhurst
Intel Corporation
Announcements
• Homework 5 this week
• Lab 4 Parts 1 + 2 – keep working!
• Midterm next Monday, May 2 (in class)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 2


Outline
• Finish Logical Effort Discussion
• Review: Static CMOS Sizing
• Design Guidelines for CMOS
• Pseudo-NMOS Logic: Rabaey 6.2
• Pass Transistor Circuits: Rabaey 6.2 (Kang &
Leblebici 9.1-9.2)
• Midterm Overview

Amirtharajah/Parkhurst, EEC 118 Spring 2011 3


Review: CMOS Sizing
• Equivalent inverter approach: replace transistors
which are “on” with equivalent transistor
• Use equivalent inverter to find VM, delays, etc.

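[Figure: gate with inputs A and B (PMOS widths Wpa, Wpb; NMOS widths Wna, Wnb) and output F, shown next to its equivalent inverter (Wpeff, Wneff) driven by B]
If A = 0 and B switches:
$\frac{1}{W_{peff}} = \frac{1}{W_{pa}} + \frac{1}{W_{pb}}$
$W_{neff} = W_{nb}$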
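The equivalent-width rules can be sketched in a few lines: series devices combine like parallel resistors (reciprocals add), and parallel conducting devices simply add. Function names and widths here are hypothetical, and equal channel lengths are assumed.

```python
def series_width(*widths):
    """Effective width of transistors in series (equal L assumed)."""
    return 1.0 / sum(1.0 / w for w in widths)

def parallel_width(*widths):
    """Effective width of conducting transistors in parallel."""
    return sum(widths)

# Hypothetical sizes: two series PMOS of width 4, NMOS B of width 1.
w_peff = series_width(4.0, 4.0)   # both series PMOS devices conduct
w_neff = parallel_width(1.0)      # A = 0, so only the B NMOS switches
print(w_peff, w_neff)
```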
Amirtharajah/Parkhurst, EEC 118 Spring 2011 4
Review of Sizing
• Gate delays depend on which inputs switch
– Normally sized for worst-case delay
– Best-case (fastest) delay also important due to
race conditions in a pipelined datapath
• Switching threshold VM normally considers all
inputs switching
• Delay estimation
– Combine switching transistors into equivalent
inverter

Amirtharajah/Parkhurst, EEC 118 Spring 2011 5


Example: NAND gate
• Circuit: 3-input NAND, inputs A, B, C (PMOS width Wp, NMOS width Wn), output F
– Load cap CL = 400 fF
– PMOS W/L = 2
– NMOS W/L = 1
– kn’ = 200 µA/V²
– kp’ = 80 µA/V²
– VT = 0.5 V
• 1st: Find delay of inverter
• 2nd: Find delay of NAND

Amirtharajah/Parkhurst, EEC 118 Spring 2011 6


Equivalent Inverter
• Problems with equivalent inverter method:
– Need to take into account load capacitance CL
• Depends on number of transistors connected to
output (junction capacitances)
• Even transistors which are off (not included in
equivalent inverter) contribute to capacitance (i.e.
PMOS Drain Capacitance)
– Need to include capacitance in intermediate stack
nodes (NMOS caps). Worst-case: need to
charge/discharge all nodes
– Body effect of stacked transistors

Amirtharajah/Parkhurst, EEC 118 Spring 2011 7


Load Capacitance
• Output capacitance includes junction caps of all
transistors on output
• Reducing load capacitance
– Minimize number of transistors on output node
– Tapering transistor stacks:
• Wider transistors closest to power and ground nodes,
narrower at output
• Transistors closest to power nodes carry more current

Amirtharajah/Parkhurst, EEC 118 Spring 2011 8


Intermediate Node Capacitances
• Internal capacitances in CMOS gates are charged
and discharged
– Depends on input pattern
– Increases delay of gate
• Simple analysis
– Combine internal capacitances into output load
– Assumes all capacitances charged and
discharged fully
• Effect on delay analysis
– Gate delay depends on timing of inputs!

Amirtharajah/Parkhurst, EEC 118 Spring 2011 9


CMOS Design Guidelines I
• Transistor sizing
– Size for worst-case delay, threshold, etc
– Tapering: transistors near power supply are larger
than transistors near output
• Transistor ordering
– Critical signal is defined as the latest-arriving
signal to input of gate of interest.
– Put critical signals closest to output
• Stack nodes are discharged by early signals
• Reduced body effect on top transistor

Amirtharajah/Parkhurst, EEC 118 Spring 2011 10


CMOS Design Guidelines II
• Limit fan-in of gate
– Fan-in: number of gate inputs
– Affects size of transistor stacks
– Normally fan-in limit is 3-4
• Convert large multi-input gates into smaller chain
of gates
• Limit fanout of gate
– Fanout: number of gates connected to output
– Capacitive load: affects gate delay
• NANDs are better than NORs
– Series NMOS devices less area, capacitance than
equivalent series PMOS devices

Amirtharajah/Parkhurst, EEC 118 Spring 2011 11


CMOS Disadvantages
• For N-input CMOS gate, 2N transistors required
– Each input connects to an NMOS and PMOS
transistor
– Large input capacitance: limits fanout
• Large fan-in gates: always have long transistor
stack in PUN or PDN
– Limits pullup or pulldown delay
– Requires very large transistors
• Single-stage gates are inverting

Amirtharajah/Parkhurst, EEC 118 Spring 2011 12


Pseudo-NMOS Logic
• Pseudo-NMOS: replace PMOS PUN with single
“always-on” PMOS device (grounded gate)
• Same problems as true NMOS inverter:
– VOL larger than 0 V
– Static power dissipation when PDN is on
• Advantages
– Replace large PMOS stacks with single device
– Reduces overall gate size, input capacitance
– Especially useful for wide-NOR structures

Amirtharajah/Parkhurst, EEC 118 Spring 2011 13


Pseudo-NMOS Inverter Circuit
• Replace PUN or resistor with “always-on” PMOS transistor (VGS,P = -VDD)
• Easier to implement in standard process than large resistance value
• PMOS load transistor (remember: VT (PMOS) < 0):
– On when VGS < VTP → VGS = -VDD: transistor always on
– Linear when VDS > VGS-VTP → Vout-VDD > -VDD-VTP → Vout > -VTP
– Saturated when VDS < VGS-VTP → Vout-VDD < -VDD-VTP → Vout < -VTP
[Figure: pseudo-NMOS inverter; PMOS load (terminals S, G, D) between VDD and Vout, NMOS driven by Vin between Vout and Gnd]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 14


Pseudo-NMOS Inverter: VOH
• VOH for pseudo-NMOS inverter:
– Vin = 0
– NMOS in cutoff: no drain current
• Result: VOH is VDD (as in resistive-load inverter or CMOS inverter case)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 15


Pseudo-NMOS Inverter: VOL
• Find VOL of pseudo-NMOS inverter:
– Vin = VDD: NMOS on in linear mode (assume VOL < VDD-VT,n)
$I_{Dn} = k_n\left[(V_{DD} - V_{Tn})V_{OL} - \tfrac{1}{2}V_{OL}^2\right]$
– PMOS on in saturation mode (assume), neglecting λ
$I_{Dp} = \tfrac{1}{2}k_p(-V_{DD} - V_{Tp})^2$
– Setting IDn = IDp:
$\tfrac{1}{2}k_n V_{OL}^2 - k_n(V_{DD} - V_{Tn})V_{OL} + \tfrac{1}{2}k_p(-V_{DD} - V_{Tp})^2 = 0$


• Key point: VOL is not zero


– Depends on thresholds, sizes of N and P transistors
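The quadratic for VOL can be solved directly. The sketch below takes the smaller root, which is the one consistent with the assumption that the NMOS is in its linear region; all device values are hypothetical.

```python
import math

def pseudo_nmos_vol(k_n, k_p, v_dd, v_tn, v_tp):
    """Solve 0.5*k_n*VOL^2 - k_n*(VDD - VTn)*VOL + 0.5*k_p*(-VDD - VTp)^2 = 0."""
    a = 0.5 * k_n
    b = -k_n * (v_dd - v_tn)
    c = 0.5 * k_p * (-v_dd - v_tp) ** 2
    disc = b * b - 4.0 * a * c
    if disc < 0:
        raise ValueError("no real solution: PMOS too strong for this NMOS")
    # Smaller root: the low output level with the NMOS in the linear region.
    return (-b - math.sqrt(disc)) / (2.0 * a)

# Hypothetical devices: k_n = 400 uA/V^2, k_p = 100 uA/V^2,
# VDD = 2.5 V, VTn = 0.5 V, VTp = -0.5 V.
v_ol = pseudo_nmos_vol(400e-6, 100e-6, 2.5, 0.5, -0.5)
print(f"VOL is about {v_ol:.3f} V")
```

With these numbers VOL stays below -VTp = 0.5 V, so the assumption that the PMOS is saturated holds; a weaker NMOS would require rechecking it.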

Amirtharajah/Parkhurst, EEC 118 Spring 2011 16


Pseudo NMOS Inverter: I/V Curves
[Figure: I/V curve for NMOS: drain current IDS vs. VDS = Vout, one curve each for Vin = 1 V, 2 V, 3 V, 4 V]
[Figure: I/V curve for PMOS: -IDS vs. -VDS = -(Vout - VDD), single curve for VGS = -VDD]
• Plot of -IDS vs. -VDS since current is from source to drain
• Only one curve since VGS fixed
Amirtharajah/Parkhurst, EEC 118 Spring 2011 17
Pseudo NMOS Inverter: VTC
[Figure: left, drain current IDS vs. Vout = VDS for Vin = 1 V, 2 V, 3 V, 4 V; right, resulting VTC with Vout (up to VDD) vs. Vin (0 to 4 V)]

• Similar VTC to resistive-load inverter


– Sharper transition region, smaller area
• VOL worse than CMOS inverter
Amirtharajah/Parkhurst, EEC 118 Spring 2011 18
Transmission Gate Logic

[Figure: equivalent transmission gate symbols]

• NMOS and PMOS connected in parallel


• Allows full rail transition – ratioless logic
• Equivalent resistance relatively constant during
transition
• Complementary signals required for gates
• Some gates can be efficiently implemented using
transmission gate logic (XOR in particular)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 19


Equivalent Transmission Gate Resistance

[Figure: transmission gate between Vin and Vout with control voltages 0 V and VDD; Vout = 0 V at t = 0]

• For a rising transition at the output (step input)


– NMOS sat, PMOS sat until output reaches |VTP|
– NMOS sat, PMOS lin until output reaches VDD-VTN
– NMOS off, PMOS lin for the final VDD – VTN to VDD
voltage swing
Amirtharajah/Parkhurst, EEC 118 Spring 2011 20
Equivalent Resistance

• Equivalent resistance Req is parallel combination of Req,n and Req,p
• Req is relatively constant
[Figure: R vs. Vout showing Req,n, Req,p, and Req, with VTp, VDD-VTn, and VDD marked on the Vout axis]

Amirtharajah/Parkhurst, EEC 118 Spring 2011 21


Resistance Approximations
• To estimate equivalent resistance:
– Assume both transistors in linear region
– Ignore body effect
– Assume voltage difference (VDS) is small

$R_{eq,n} \approx \frac{1}{k_n(V_{DD} - V_{tn})} \qquad R_{eq,p} \approx \frac{1}{k_p(V_{DD} - |V_{tp}|)}$
$R_{eq} \approx \frac{1}{k_n(V_{DD} - V_{tn}) + k_p(V_{DD} - |V_{tp}|)}$
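A small numeric sketch of these approximations follows. It assumes the PMOS threshold enters through its magnitude and uses hypothetical device parameters.

```python
def tg_resistance(k_n, k_p, v_dd, v_tn, v_tp):
    """Approximate transmission gate resistances (ohms), both devices linear."""
    g_n = k_n * (v_dd - v_tn)        # NMOS conductance
    g_p = k_p * (v_dd - abs(v_tp))   # PMOS conductance, |Vtp| assumed
    # The two devices conduct in parallel, so conductances add.
    return 1.0 / g_n, 1.0 / g_p, 1.0 / (g_n + g_p)

# Hypothetical: k_n = 200 uA/V^2, k_p = 100 uA/V^2, VDD = 2.5 V, |VT| = 0.5 V
r_n, r_p, r_eq = tg_resistance(200e-6, 100e-6, 2.5, 0.5, -0.5)
print(f"Req,n = {r_n:.0f}, Req,p = {r_p:.0f}, Req = {r_eq:.0f} ohms")
```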
Amirtharajah/Parkhurst, EEC 118 Spring 2011 22
Equivalent Resistance – Region 1
• NMOS saturation:
$R_{eq,n} = \frac{V_{DD} - V_{out}}{\tfrac{1}{2}k_n(V_{DD} - V_{out} - V_{tn})^2}$

• PMOS saturation:
$R_{eq,p} = \frac{V_{DD} - V_{out}}{\tfrac{1}{2}k_p(-V_{DD} - V_{tp})^2}$

Amirtharajah/Parkhurst, EEC 118 Spring 2011 23


Equivalent Resistance – Region 2
• NMOS saturation:
$R_{eq,n} = \frac{V_{DD} - V_{out}}{\tfrac{1}{2}k_n(V_{DD} - V_{out} - V_{tn})^2}$

• PMOS linear:
$R_{eq,p} = \frac{2(V_{DD} - V_{out})}{k_p\left[2(V_{DD} - |V_{TP}|)(V_{DD} - V_{out}) - (V_{DD} - V_{out})^2\right]} = \frac{2}{k_p\left[2(V_{DD} - |V_{TP}|) - (V_{DD} - V_{out})\right]}$

Amirtharajah/Parkhurst, EEC 118 Spring 2011 24


Equivalent Resistance – Region 3
• NMOS cut off:
$R_{eq,n} = \infty$

• PMOS linear:
$R_{eq,p} = \frac{2}{k_p\left[2(V_{DD} - |V_{TP}|) - (V_{DD} - V_{out})\right]}$

Amirtharajah/Parkhurst, EEC 118 Spring 2011 25


Transmission Gate Logic
• Useful for multiplexers (select between multiple
inputs) and XORs
• Transmission gate implements logic function F =
A if S
– If S is 0, output is floating, which should be
avoided
– Always make sure one path is conducting from
input to output
• Only two transmission gates needed to implement A·~S + ~A·S
– Transmission Gate 1: A if ~S
– Transmission Gate 2: ~A if S
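The two-gate idea can be modeled behaviorally. In this sketch a non-conducting gate returns None to stand for a floating (high-Z) output; the function names are hypothetical and the model ignores all electrical effects.

```python
def transmission_gate(data, enable):
    """Pass data when enabled; None models a floating (high-Z) output."""
    return data if enable else None

def tg_xor(a, s):
    """XOR built from two transmission gates: A if ~S, ~A if S."""
    paths = [
        transmission_gate(a, not s),    # gate 1 conducts when S = 0
        transmission_gate(not a, s),    # gate 2 conducts when S = 1
    ]
    driven = [p for p in paths if p is not None]
    # Exactly one path must conduct: no floating output, no contention.
    assert len(driven) == 1
    return bool(driven[0])

for a in (0, 1):
    for s in (0, 1):
        print(a, s, int(tg_xor(a, s)))
```

Because the two gates are enabled by complementary select values, the output is always driven by exactly one path.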
Amirtharajah/Parkhurst, EEC 118 Spring 2011 26
Transmission Gate XOR

S
S

A F = A⊕ S

S
S
• If S = 0, F = A and when S = 1, F = ~A
Amirtharajah/Parkhurst, EEC 118 Spring 2011 27
Transmission Gate Multiplexer

F = A·S + B·~S
A

S
Amirtharajah/Parkhurst, EEC 118 Spring 2011 28
Full Transmission Gate Logic
B C
F = A BC
A

B C
• PMOS devices in parallel with NMOS transistors pass
full VDD (only one logic path shown above)
• Requires more devices, but each can be sized smaller
than static CMOS
• Output inverter reduces impact of fanout
Amirtharajah/Parkhurst, EEC 118 Spring 2011 29
Next Topic: Dynamic Circuits

• Extend dynamic sequential circuit idea to logic circuits

– Improved speed

– Reduced area

– Challenging to design: timing and noise issues, charge sharing, leakage

– Preferred design style for high performance circuits

Amirtharajah/Parkhurst, EEC 118 Spring 2011 30


Midterm Overview
• Closed book, closed notes
– Formula sheet provided (see last year’s exam)
– Need to know IDS equations, capacitor delay
equation, dynamic power equation, timing
parameter definitions
• Transistor Operation
• Inverters
• Static CMOS Combinational Logic
• Sequential Logic
• Labs
• (Logical Effort)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 31


EEC 118 Lecture #12:
Dynamic Logic

Rajeevan Amirtharajah
University of California, Davis

Jeff Parkhurst
Intel Corporation
Outline
• Today: Alternative MOS Logic Styles
• Dynamic MOS Logic Circuits: Rabaey 6.3 (Kang &
Leblebici, 9.4-9.6)

Amirtharajah/Parkhurst, EEC 118 Spring 2010 3


Next Topic: Memories

• Memory principles and circuits

– ROM: Read Only Memory

– RWM (Read/Write Memory) or RAM (Random Access Memory)

• DRAM, SRAM

– Nonvolatile memories (Flash, PROM, EEPROM)

Amirtharajah/Parkhurst, EEC 118 Spring 2010 23


EEC 118 Lecture #13:
Memories

Rajeevan Amirtharajah
University of California, Davis

Jeff Parkhurst
Intel Corporation
Announcements
• Finish Lab 5 this week
• Quiz 3 Wednesday
• Homework 7 issued later this week, due next
week
• Lab 6 next week, report due June 2

Amirtharajah/Parkhurst, EEC 118 Spring 2011 2


Outline
• Review: Adders (Rabaey 11.3)
• Multipliers: Rabaey 11.4
• Memories: Rabaey 12.1-12.2 (Kang & Leblebici,
10.1-10.6)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 3


Review: Ripple Carry Adder
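[Figure: four full adders (FA) in a chain with inputs A0 B0 … A3 B3 and sums S0 … S3; carry-in Ci,0 enters the first FA and the carry-outs Co,0 (= Ci,1), Co,1, Co,2, Co,3 ripple through the stages]

Worst case delay linear with the number of bits

$t_d = O(N)$

$t_{adder} \approx (N - 1)t_{carry} + t_{sum}$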

Goal: Make the fastest possible carry path circuit
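A bit-level sketch of the ripple carry adder and its delay estimate is given below. Bits are listed LSB first; the function names and the delay units are hypothetical.

```python
def full_adder(a, b, c_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (c_in & (a ^ b))
    return s, c_out

def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists (LSB first); the carry ripples upward."""
    sums = []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry

def ripple_carry_delay(n_bits, t_carry, t_sum):
    """Worst-case delay: (N - 1) carry stages plus one sum stage."""
    return (n_bits - 1) * t_carry + t_sum

# 11 + 6 on four bits, LSB first
sum_bits, carry_out = ripple_carry_add([1, 1, 0, 1], [0, 1, 1, 0])
print(sum_bits, carry_out)
print(ripple_carry_delay(4, 1.0, 1.5))
```

The loop makes the linear dependence visible: each stage has to wait for the carry from the stage before it.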

Amirtharajah/Parkhurst, EEC 118 Spring 2011 4


Review: Carry Bypass Adder

[Figure: 16-bit carry bypass adder in four blocks (Bit 0-3, Bit 4-7, Bit 8-11, Bit 12-15); each block has Setup, Carry Propagation, and Sum stages, with Ci,0 entering the first block]

• Note that this is done at the expense of a MUX in the carry delay path !!

• Delay increases linearly but with smaller slope than RCA.
Amirtharajah/Parkhurst, EEC 118 Spring 2011 5
Review: Carry Select Adder
[Figure: carry select adder with blocks of increasing size (Bit 0-1, Bit 2-4, Bit 5-8, Bit 9-13, Bit 14-19). Each block has a Setup stage, a "0" Carry chain and a "1" Carry chain, a Multiplexer selected by the incoming carry (starting from Ci,0), and Sum Generation producing S0-1, S2-4, S5-8, S9-13, S14-19. Parenthesized numbers (1) through (9) annotate the delay along each path.]


Amirtharajah/Parkhurst, EEC 118 Spring 2011 6


Memory and Performance Trend I
• Memory is becoming a key factor in the performance of
a computer
• 1st generation computers had just system memory
– Very slow DRAM
– As microprocessors got faster, the bottleneck in
performance was data access
• 2nd generation computers added cache memory
– This provided faster access to small localized memory
that was being read or written
• Memory placed on front side bus (off-chip)
• Performance increased
– As processors got faster memory access again became
the bottleneck
Amirtharajah/Parkhurst, EEC 118 Spring 2011 7
Memory and Performance Trend II
• 3rd generation computers added on chip cache
– These caches started out 16K and were termed level 1
cache
– Soon we had level 1 and level 2 cache
– Currently we have three levels of cache on chip that run
at processor frequency, then access main memory if
data can’t be found in any of these caches
– Moving to stacking memory die on top of processor

Amirtharajah/Parkhurst, EEC 118 Spring 2011 8


Types of Memory I
• ROM: read-only memory
– Non-volatile – mask programmed
• RWM: read-write memory (RAM, random access
memory)
– SRAM: static memory
• Data is stored as the state of a bistable circuit
• State is retained without refresh as long as power is
supplied
– DRAM: dynamic memory
• Data is stored as a charge on a capacitor
• State leaks away, refresh is required

Amirtharajah/Parkhurst, EEC 118 Spring 2011 9


Types of Memory II
• NVRWM: non-volatile read-write memory (also called
NVRAM, non-volatile random access memory)
– Flash (EEPROM): ROM at low voltages, writable at high
voltages (Electrically Erasable Programmable Read-
Only Memory)
– EPROM: ROM, but erasable with UV light (falling out of
common usage)

Amirtharajah/Parkhurst, EEC 118 Spring 2011 10


Memory Usage in Computers
• DRAM Memory
– Main memory storage. Used for data and programs
• SRAM Memory
– Faster than DRAM, however, uses more transistors
• Used to be used for external cache
• Variant used in internal cache (on chip cache)
• FLASH Memory and ROM
– Used for BIOS data storage in PCs
– Also used to store pictures, MP3 files for digital
cameras and MP3 players – eventually for hard disk

Amirtharajah/Parkhurst, EEC 118 Spring 2011 11


Basic Memory Array Structure
• Memory cells arranged in a rectangular array
• Rows correspond to data words
– Accessed through a row decoder
• Columns correspond to individual bits
– Selected through a column mux
• Bit voltage amplified by sense amplifier
[Figure: memory array of 2^N bits; address bits A0 … Ak−1 select one of rows 0 … 2^k − 1, and the (N − k) address bits Ak … AN−1 select one of columns 0 … 2^(N−k) − 1; data outputs D0, D1, … Dm]
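The row/column split of the address can be sketched as follows, assuming the low k bits (A0 … Ak−1) feed the row decoder and the remaining N − k bits feed the column mux, as in the figure. The function name and the example sizes are hypothetical.

```python
def split_address(addr, n_bits, k_bits):
    """Split an N-bit address into (row, column) indices."""
    if not 0 <= addr < (1 << n_bits):
        raise ValueError("address out of range")
    row = addr & ((1 << k_bits) - 1)   # A0 .. A(k-1): row decoder input
    col = addr >> k_bits               # Ak .. A(N-1): column mux select
    return row, col

# Hypothetical array: N = 10 address bits, k = 6 row bits
# (64 rows, 16 columns)
row, col = split_address(709, 10, 6)
print(row, col)
```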
Amirtharajah/Parkhurst, EEC 118 Spring 2011 12
Memory Circuit Operation
• Wordlines (WL) control row (word) access
– Usually control gates of pass transistors
• Bitlines (BL) route column data (individual bits)
– Bitlines usually precharged high (like dynamic logic)
– Memory cells discharge bitline depending on stored
data (bitline left high if cell stores 1, bitline discharged
if cell stores 0)
– Bitline swings usually small (10s – 100s of mV) and
must be amplified by sense amplifiers
– Synchronous or asynchronous timing can be used
• Memory cells store data value
– Static vs. dynamic, single or multiple bits, etc.
Amirtharajah/Parkhurst, EEC 118 Spring 2011 13
DRAM
[Figure: 1T DRAM cell connected to bitline BL and wordline WL]

• Smaller cell size (1 transistor or 1T cell)
– Reason for inexpensive memory in computers
– Tradeoff of area (memory density) vs. speed and
complexity (refreshing)
Amirtharajah/Parkhurst, EEC 118 Spring 2011 14
DRAM Issues
• Must be periodically refreshed
– Reads are destructive (modify voltage stored on
capacitor)
– Every read followed by a refresh of the bit (write back of
read value)
• No static power dissipation
• Output voltage is charge sharing result of storage
capacitor and bitline capacitance
– More complex sense amplifiers
– Higher noise susceptibility
• Requires different CMOS process than high performance
logic
– Not compatible with cache in microprocessors
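The charge-sharing read described above can be illustrated numerically. This is a minimal sketch: the capacitor values and precharge voltage below are assumptions chosen for illustration, not values from the lecture.

```python
def bitline_voltage_after_read(v_cell, v_precharge, c_storage, c_bitline):
    """Final voltage once the storage capacitor shares charge with the bitline.

    Charge conservation: Cs*Vcell + Cbl*Vpre = (Cs + Cbl)*Vfinal.
    """
    return (c_storage * v_cell + c_bitline * v_precharge) / (c_storage + c_bitline)


if __name__ == "__main__":
    # Assumed values: 30 fF cell, 300 fF bitline, 1.8 V precharge, cell stores 0.
    v_final = bitline_voltage_after_read(0.0, 1.8, 30e-15, 300e-15)
    print(f"bitline swing = {v_final - 1.8:.3f} V")
```

Because the bitline capacitance is much larger than the cell capacitance, the swing is only a fraction of the supply, which is why the sense amplifier is needed and why the read disturbs the stored value.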
3T DRAM Cell

[Figure: 3T DRAM cell; write transistor M2 and read transistor M3 connect the cell to the bit line, M1 is the storage transistor]
• Early DRAM technology
• Gate cap of M1 stores bit
• Nondestructive reads
• Storage node voltage < VDD
– Compensate with boosted wordline
Advanced DRAM Process

• Vertical transistor, trench capacitor (Beintner, JSSC 04)


ROM
[Figure: ROM cell; transistor gated by WL with dotted-line connections (labeled "Q Either") to BL]
• Dotted lines indicate connections that set the stored value to ‘1’ or ‘0’
– PROM: Replace dotted lines with fuses
• Small cell size (1T cell)
• Not necessary to refresh
• No static power dissipation
• Output voltage is set by WL duration
SRAM Cell
[Figure: SRAM cell; cross-coupled inverters storing Q and Q-bar, access transistors gated by WL connecting to BL and BL-bar]
• Cross-coupled inverters: bistable element
• Density is important in memories
– Single NMOS pass transistor used for reading/writing
– Transistor sizes should take up minimum area
• Faster than DRAM since typically fewer cells
• No refresh required (nondestructive reads)
SRAM Design: Read “0”
[Figure: 6T SRAM cell; pull-up PMOS M5, M6, pull-down NMOS M1, M2, access NMOS M3, M4 gated by WL, storage nodes Q and Q-bar, bitline capacitances Cc on BL and BL-bar precharged to Vdd]
• Prior to read operation, voltage at node Q = 0 V and Q-bar = Vdd;
bit lines precharged to Vdd
• Transistors M3 and M4 are turned on by word line
(WL) select circuitry


SRAM Design: Read “0”
• Transistors M3 and M4 are turned on by word line (WL) select
circuitry
– Bitline capacitance Cc starts at Vdd and discharges through M3 and M1.
– Need to make sure the ratio between M1 and M3 does not
allow Q to go above Vtn.
• Otherwise M2 turns on and node Q-bar is accidentally discharged
• Conservative since there will also be charge sharing at
that node between the small internal node
capacitance and the large bitline capacitance
– Sense amp detects that node Q stored a 0 from the
minor drop of voltage on the bitline


SRAM Design: Read “0”
• Data must not be destroyed when bitline voltage differs from
storage node voltage
– VQ must not exceed the threshold of the inverter (assumed
to be VDD/2); more conservative to keep it below VTN
• Assume VBL initially remains at VDD: M3 in saturation, M1
in linear

$$\frac{k_{n,3}}{2}\left(V_{DD} - V_Q - V_{TN}\right)^2 = \frac{k_{n,1}}{2}\left(2\left(V_{DD} - V_{TN}\right)V_Q - V_Q^2\right)$$

• Guarantee VQ < VTN; plug into ID equations: ID3 < ID1 at VQ = VTN

$$\frac{k_{n,3}}{k_{n,1}} = \frac{(W/L)_3}{(W/L)_1} < \frac{2\left(V_{DD} - 1.5V_{TN}\right)V_{TN}}{\left(V_{DD} - 2V_{TN}\right)^2}$$
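The read-stability sizing bound can be checked numerically. The sketch below assumes the square-law current equations used on this slide; the supply and threshold values are illustrative assumptions, and the function names are hypothetical.

```python
def read_ratio_limit(vdd, vtn):
    """Upper bound on (W/L)_3 / (W/L)_1 that keeps V_Q below V_TN during a read."""
    return 2.0 * (vdd - 1.5 * vtn) * vtn / (vdd - 2.0 * vtn) ** 2


def access_current(k3, vdd, vq, vtn):
    """M3 in saturation: gate and bitline at VDD, source at V_Q."""
    return 0.5 * k3 * (vdd - vq - vtn) ** 2


def pulldown_current(k1, vdd, vq, vtn):
    """M1 in the linear region: gate at VDD, drain at V_Q."""
    return 0.5 * k1 * (2.0 * (vdd - vtn) * vq - vq ** 2)


if __name__ == "__main__":
    vdd, vtn = 2.5, 0.5            # assumed values for illustration
    limit = read_ratio_limit(vdd, vtn)
    print(f"(W/L)3 / (W/L)1 must stay below {limit:.3f}")
    # At the limit, the two currents balance exactly at V_Q = V_TN.
    print(access_current(limit, vdd, vtn, vtn), pulldown_current(1.0, vdd, vtn, vtn))
```

Sizing the access transistor M3 weaker than this limit relative to M1 keeps the internal node below VTN while the bitline discharges.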
SRAM Design: Write “0” (1st Analysis)
[Figure: 6T SRAM cell during a write with BL = 0 V and BL-bar = Vdd; access NMOS M3, M4 gated by WL, pull-down NMOS M1, M2, pull-up PMOS M5, M6, storage nodes Q and Q-bar, bitline capacitances Cc, bitline voltage VBL]
• Assume VQ = Vdd and VQ-bar = 0 V
• Data must be forced into the cell
– VQ must fall below the threshold of the inverter to turn M2 off.
– This allows VQ-bar to go high enough to go above the Vt of M1
• This discharges node Q and stores a 0
• Assume VBL remains at 0 V: M3 linear, M5 linear (VQ = VDD/2)
SRAM Design: Write “0” (1st Analysis)
[Figure: same 6T SRAM cell, BL = 0 V, BL-bar = Vdd]
• Conditions for this to happen: requires M5 to M3 ratio to
be relatively small (VDS = VQ = VDD/2)

$$\frac{k_{p,5}}{2}\left(\left(V_{DD} - |V_{TP}|\right)V_{DD} - \frac{V_{DD}^2}{4}\right) = \frac{k_{n,3}}{2}\left(\left(V_{DD} - V_{TN}\right)V_{DD} - \frac{V_{DD}^2}{4}\right)$$
SRAM Design: Write “0” (1st Analysis)
[Figure: same 6T SRAM cell, BL = 0 V, BL-bar = Vdd]
• Required sizing (size M5 below mobility ratio):

$$\frac{(W/L)_5}{(W/L)_3} < \frac{k_n'}{k_p'}\cdot\frac{\left(V_{DD} - V_{TN}\right)V_{DD} - V_{DD}^2/4}{\left(V_{DD} - |V_{TP}|\right)V_{DD} - V_{DD}^2/4} = \frac{k_n'}{k_p'} \quad \text{if } V_{TN} = |V_{TP}|$$
SRAM Design: Write “0” (2nd Analysis)
[Figure: same 6T SRAM cell, BL = 0 V, BL-bar = Vdd]
• Assume VQ = Vdd and VQ-bar = 0 V
• Data must be forced into the cell
– VQ must fall below the threshold of the NMOS (turns M2 off).
– This allows VQ-bar to go high enough to go above the Vt of M1
• This discharges node Q and stores a 0
• Assume VBL remains at 0 V: M5 sat., M3 linear (VQ = VTN)
SRAM Design: Write “0” (2nd Analysis)
[Figure: same 6T SRAM cell, BL = 0 V, BL-bar = Vdd]
• Desired current conditions, want VQ < VTN (M3 lin., M5 sat.):

$$\frac{k_{n,3}}{2}\left(2\left(V_{DD} - V_{TN}\right)V_{TN} - V_{TN}^2\right) = \frac{k_{p,5}}{2}\left(0 - V_{DD} - V_{TP}\right)^2$$


SRAM Design: Write “0” (2nd Analysis)
[Figure: same 6T SRAM cell, BL = 0 V, BL-bar = Vdd]
• Required sizing (make M5 relatively weaker than 1st case):

$$\frac{(W/L)_5}{(W/L)_3} < \frac{k_n'}{k_p'}\cdot\frac{2\left(V_{DD} - 1.5V_{TN}\right)V_{TN}}{\left(V_{DD} + V_{TP}\right)^2}$$
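To compare the two write analyses, the sketch below evaluates both sizing limits for one set of assumed device parameters (VDD, thresholds, and the kn'/kp' ratio are illustrative, not from the lecture). The PMOS threshold is treated as a signed, negative quantity, and the first analysis uses its magnitude.

```python
def write_limit_first(vdd, vtn, vtp, kn_over_kp):
    """1st analysis: V_Q = VDD/2, M3 and M5 both linear. vtp is negative."""
    num = (vdd - vtn) * vdd - vdd ** 2 / 4.0
    den = (vdd - abs(vtp)) * vdd - vdd ** 2 / 4.0
    return kn_over_kp * num / den


def write_limit_second(vdd, vtn, vtp, kn_over_kp):
    """2nd analysis: V_Q = V_TN, M3 linear, M5 saturated. vtp is negative."""
    return kn_over_kp * 2.0 * (vdd - 1.5 * vtn) * vtn / (vdd + vtp) ** 2


if __name__ == "__main__":
    params = dict(vdd=2.5, vtn=0.5, vtp=-0.5, kn_over_kp=2.5)  # assumed values
    print("1st analysis: (W/L)5/(W/L)3 <", write_limit_first(**params))
    print("2nd analysis: (W/L)5/(W/L)3 <", write_limit_second(**params))
```

For these assumed numbers the second analysis gives the tighter bound, consistent with the note that M5 must be made relatively weaker than in the first case.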
SRAM Static Noise Margins
• SRAM cell stability and writability quantitatively
specified by static noise margin (SNM)
– Dependent on mode of operation (Hold, Read, or Write)
[Figure: cross-coupled inverter pair with noise voltage sources VN1 and VN2 in series with the inverter inputs]
• Hypothetical noise sources added to inverter inputs
• SNM corresponds to largest noise disturbances which
won’t disrupt cell operation
• SNM can be determined graphically by butterfly plot
Hold Static Noise Margin

[Figure: Hold SNM Half Circuit (inverter between Vdd and Gnd with input Vin and output Vout) and Hold SNM Butterfly Plot with SNM marked]
• Plot two mirrored Vin/Vout curves
• SNM = side of largest inscribed
square
Read Static Noise Margin

[Figure: Read SNM Half Circuit (inverter between Vdd and Gnd with input Vin and output Vout) and Read SNM Butterfly Plot with SNM marked]
• Diode-connected access NMOS
simulates precharged bitline


Write Static Noise Margin

[Figure: Write 0 SNM Half Circuit (inverter between Vdd and Gnd with input Vin and output Vout) and Write SNM Butterfly Plot with SNM marked]
• Always-on ground-connected
access NMOS simulates bitline
driven low; use the read half circuit for write 1
Memory Peripherals
• Memory core (memory cells) largely determined by
technological considerations
– Emphasizes reduced area, sacrifices speed, reliability
– Peripheral circuits can recover some of the lost
performance
• Address Decoders
– Row Decoders: one-hot decoding for word lines
– Column Decoders: 2^L-to-1 multiplexers for bit lines
• I/O Buffers and Drivers
• Sense Amplifiers
• Memory Timing and Control



Flash Memory Transistor With Floating Gate
[Figure: floating-gate transistor cross-section; control gate above floating gate, n+ source (S) and drain (D) regions in p substrate]
• Threshold voltage of device adjusted by placing
charge on floating gate
Flash Memory Operation
• Two threshold voltages correspond to two states (1 bit)
– Bitline is precharged high before a read
– Low VT state, when wordline (control gate) is high the bitline
is discharged and a “0” is read
– High VT state, the wordline (control gate) can’t go high
enough to turn on transistor, bitline stays high and a “1” is
read
• Writing a “1”: electrons are accelerated by a high field
until they accumulate on the floating gate, raising VT
• Writing a “0”: electrons driven off floating gate by a
reverse gate-source bias through Fowler-Nordheim
tunneling, lowering VT



Next Topics: Low Power Circuits and Wires

• Low power design principles and circuit techniques
– Voltage scaling, activity factor reduction, clock gating,
leakage reduction
• On-chip resistance and capacitance
– Delay estimation
– Buffering and repeater insertion


EEC 118 Lecture #14:
Low Power Circuits

Rajeevan Amirtharajah
University of California, Davis

Jeff Parkhurst
Intel Corporation
Announcements
• HW7: Optional
– Issued later today
• Lab 6: Memories
– Issued this evening, due last day of class

• Quiz 4 Wednesday



Outline
• Review: Memory Basics
• Finish Memories: Rabaey 12.1-12.2 (Kang &
Leblebici, 10.1-10.6)
• Low Power Circuits: Rabaey 5.5 (Kang &
Leblebici, 11.1-11.3)



Why Power Matters
• Packaging costs: many pins to get 10s of Amps into chip
• Power supply rail design: must get 10s of Amps through
1-10 μm² of on-chip wire area
• Chip and system cooling costs: large server farms might
consume 1 to 10s of MW
• Noise immunity and system reliability: high temperature
bad for noise, devices degrade faster
• Battery life and weight (in portable systems)
• Environmental concerns
– Office equipment accounted for 5% of total US commercial
energy usage in 1993
State-of-the-Art Processor Power
• Reported at ISSCC 2004
– IBM POWER5: 130 nm SOI, 1.5 GHz at 1.3 V,
incorporates 24 digital temperature sensors distributed
over die for hot-spot throttling
– Sun UltraSPARC: 130 nm CMOS, 1.2 GHz at 1.3 V, 23
W typical dissipation
– IBM PowerPC 970: 130 nm SOI, 1.8 GHz at 1.45 V, 57
W typical dissipation
– IBM PowerPC 970+: 90 nm SOI, 2.5 GHz at 1.3 V, 49
W typical dissipation
• Careful design still keeping power below 100 W
– Montecito ISSCC 2005 (dual-core Itanium): 300 W
down to 100 W



Recent Battery Scaling and Future Trends

• Battery energy density increasing 8% per year, demand
increasing 24% per year (The Economist, January 6, 2005)
Overview of Dynamic Power Consumption
• Dynamic (Switching) Power Dissipation
– Due to charging output node capacitance
• Output node capacitance of driver
• Total interconnect capacitance
• Input node capacitance of receivers
• Pavg = CLoad × VDD² × Fclk
– Note power is a function of
• Supply voltage
• Switching frequency
• CLoad (transistor sizing, interconnect width)
• NOT dependent on rise/fall times
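A quick numerical check of the switching power expression is sketched below. The load, supply, and frequency values are assumptions for illustration, and the optional activity factor (probability that the node switches each cycle) is an added parameter that defaults to 1.

```python
def dynamic_power(c_load, vdd, f_clk, activity=1.0):
    """Average switching power: P = activity * C * VDD^2 * f."""
    return activity * c_load * vdd ** 2 * f_clk


if __name__ == "__main__":
    # Assumed values: 50 fF node, 1.3 V supply, 1.5 GHz clock.
    p_ref = dynamic_power(50e-15, 1.3, 1.5e9)
    p_half = dynamic_power(50e-15, 0.65, 1.5e9)
    print(f"P = {p_ref * 1e6:.2f} uW, at half VDD = {p_half * 1e6:.2f} uW")
```

Halving the supply voltage cuts switching power by a factor of four, while halving the load or the frequency only halves it.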


Circuit Capacitances



Capacitance Analysis
1st sentence after equation (K&L
11.3) is wrong. It ignores Miller
effects from driver as well as
driver drain capacitances.



Reducing Switching Power Consumption

Pavg = CLoad × VDD² × Fclk


• Reduce Power Supply voltage
– Process scaling accomplishes this due to reliability
issues, but trend is slowing down
• Reduce load capacitance
– Process scaling helps with this (approximately halves
capacitance every node)
– Proper sizing of transistors
• Reduce activity factor (probability that capacitance is
charged)
– Refer to Rabaey 6.2 (K&L 11.4)
Delay and Power versus Supply Voltage



CMOS Inverter Short Circuit Current
[Figure: CMOS inverter driven by an input signal vin with finite rise and fall time; dynamic current Idyn charges load CL at vout, short circuit current Isc flows through both transistors]
• As input switches, both transistors are on for a finite
amount of time: current travels from Vdd directly to Gnd
Short Circuit Power Dissipation



Short Circuit Current Triangle Approx.

[Figure: triangle approximation of short circuit current; vin(t) crosses VT and VDD − VT, and Isc(t) forms triangles of duration tsc peaking at Ipeak]


Short Circuit Current With Large Load

[Figure: inverter with large load capacitance CL; VDS ≈ 0 and Isc ≈ 0 during the input transition]
• If inputs switch fast and output switches slowly, very little short
circuit current results
– Translates to slower propagation delays which might not be tolerable
Short Circuit Power Dissipation



Short Circuit Current With Small Load

[Figure: inverter with small load capacitance CL; VDS ≈ VDD and Isc ≈ IMAX during the input transition]
• If inputs switch slowly and output switches fast, short circuit
current maximized since VDS = VDD for most of input transition
Minimizing Short Circuit Power

• Peak current determined by MOSFET saturation current,
so directly proportional to device sizes
• Peak current also strong function of ratio between input
and output slopes as shown in previous 2 slides
• For individual gate, minimize short circuit current by
making output rise/fall time much bigger than input
rise/fall time
– Slows down circuit
– Increases short circuit current in fanout gates
• Compromise: match input and output rise/fall times
Some Final Words on Short Circuit Power

• When input and output rise/fall times are equalized,
most power is associated with dynamic power
– <10% devoted to short circuit currents
• Can eliminate short circuit dissipation entirely by very
aggressive voltage scaling
– Need VDD < VTn + |VTp|
– Both devices can’t be on simultaneously
• Short circuit power becoming less important in deep
submicron
– Threshold voltages not scaling as fast as supply voltages
Leakage Currents in Deep Submicron
[Figure: MOSFET cross-section with terminals G, S, D, B, annotated with leakage current components I1 through I8]
• Many physical mechanisms produce static currents
in deep submicron
Transistor Leakage Mechanisms
1. pn Reverse Bias Current (I1)
2. Subthreshold (Weak Inversion) (I2)
3. Drain Induced Barrier Lowering (I3)
4. Gate Induced Drain Leakage (I4)
5. Punchthrough (I5)
6. Narrow Width Effect (I6)
7. Gate Oxide Tunneling (I7)
8. Hot Carrier Injection (I8)



Reverse Diode Leakage Current

Reverse leakage current paths in a CMOS inverter


Subthreshold Leakage Current

Subthreshold leakage current path in CMOS inverter


Leakage Power
• Reverse bias diode leakage current
– Diode between well and substrate reverse biased
– Reverse saturation current Is drains power from Vdd
• Sub-threshold leakage current
– Due to channel being in weak inversion instead of being
completely off
– Noise on ground line can contribute to sub-threshold
leakage (negative noise voltage yields positive VGS)
– Avoid low Vt transistors to minimize leakage (limit to
<10% of total transistor count)
– Will dominate total power consumption if scaling trend
continues



Reducing Power by Voltage Scaling

• Plot of normalized delay vs. power supply for different Vt
– Increasing power supply voltage decreases delay
– Decreasing Vt for a given Vdd also decreases delay (up to a point)
• Note it is important to linearly scale Vt with Vdd when process scaling
to meet delay specs, but subthreshold leakage increases as we scale
– Use multiple threshold transistor solution in your design (if allowed)
Figure of Merit: Power Delay Product

[Figure: delay vs. power curve, traced in the direction of decreasing W/L]
• Power-Delay Product (PDP) tries to balance power
and delay tradeoff
Power Delay Product Optimum
• Just like Vt scaling vs. power supply, there are diminishing
returns for sizing
– Preceding curve shows delay vs. power
• Obtained by modifying the size of the gate to analyze
delay and power
• By decreasing W/L, delay goes up but power goes down
– After a while, decreasing W/L increases delay
tremendously without lowering power
• By increasing W/L, delay goes down but power goes up
– After a while, increasing W/L costs you tremendously
in power without lowering delay
• Optimal point where slope of curve is -1
Pipeline Approach to Voltage Scaling
• Start with a single design with two registers
– Assume the logic in between allows freq = fmax
• Now break the logic into N separate parts with equal delay
– Separate each part by a register
– Logic will be several times faster (New fmax = N x Old fmax)
• Vdd can be lowered in order to slow down logic to fit original
fmax freq
– However, additional capacitance of each register has been
added.
• Power savings could be as much as 80% once all things
are considered
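The pipelining argument can be made concrete with a small calculation. The sketch below is not from the lecture: it assumes an alpha-power-law delay model (delay proportional to VDD / (VDD − VT)^alpha), assumes a fixed fractional capacitance overhead for the added registers, and uses illustrative voltages. It finds the reduced supply at which an N-stage pipeline just meets the original clock frequency and reports the resulting power ratio.

```python
def gate_delay(vdd, vt, alpha=2.0):
    """Relative gate delay under an assumed alpha-power-law model."""
    return vdd / (vdd - vt) ** alpha


def scaled_vdd(vdd_ref, vt, n_stages, alpha=2.0):
    """Lowest VDD at which each stage may be n_stages times slower than at vdd_ref."""
    target = n_stages * gate_delay(vdd_ref, vt, alpha)
    lo, hi = vt + 1e-6, vdd_ref          # delay falls monotonically with VDD here
    for _ in range(100):                 # bisection on the supply voltage
        mid = 0.5 * (lo + hi)
        if gate_delay(mid, vt, alpha) > target:
            lo = mid                     # too slow: need a higher supply
        else:
            hi = mid
    return hi


def pipelined_power_ratio(vdd_ref, vt, n_stages, register_overhead=0.15, alpha=2.0):
    """P_pipelined / P_reference at the same clock frequency."""
    v_new = scaled_vdd(vdd_ref, vt, n_stages, alpha)
    return (1.0 + register_overhead) * (v_new / vdd_ref) ** 2


if __name__ == "__main__":
    print(scaled_vdd(2.5, 0.5, 2), pipelined_power_ratio(2.5, 0.5, 2))
```

With these assumed numbers a two-stage pipeline roughly halves the power; deeper pipelines save more until the supply approaches the threshold voltage and register overhead dominates.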



Pipeline Approach

[Figure: single-register datapath compared with pipelined datapath using multiple registers]
• Trade a little more area and more latency for lower power
by reducing voltage while meeting fixed throughput
Hardware Replication (Parallelism)

• Create N redundant paths for data/logic
• Input data sent to all path inputs
– Outputs from the multiple paths arrive at same time
• Have clock to each input register at Fclk/N
• Use mux to select from all outputs
• To reduce power
– Reduce power supply voltage for each path
• You can afford the slower speed since replication
speeds up total circuit performance
– Gate clocks (turn them off) for unused paths
Parallelization Driven Voltage Scaling

[Figure: datapath with input D feeding logic blocks A and B, duplicated into two parallel paths each clocked at f/2]
• Parallelize computation up to N times
• Reduce clock frequency by factor N
• Reduce voltage to meet relaxed frequency constraint
Tradeoffs of Parallelization
• Amount of parallelism in application may be limited
• Extra capacitance overhead of multiple datapaths
– N times higher input loading
– N-to-1 selector on output
– Lower clock frequency somewhat offset by higher clock
load
• Consumes more area, devices, more leakage power
especially in deep submicron
• Voltage reduction typically results in dramatic power
gains
– ~3X power reduction
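The tradeoff above can be expressed as a product of three ratios. The numbers in this sketch are illustrative assumptions (a 2-way parallel datapath with 2.15x total capacitance including mux and routing overhead, supply reduced from 5 V to 2.9 V, each path clocked at half rate); they are not taken from the slides.

```python
def power_ratio(cap_ratio, vdd_ratio, freq_ratio):
    """P_new / P_ref for dynamic power P = C * VDD^2 * f."""
    return cap_ratio * vdd_ratio ** 2 * freq_ratio


if __name__ == "__main__":
    # Assumed: total switched capacitance grows 2.15x, VDD 5 V -> 2.9 V,
    # each parallel path runs at f/2.
    r = power_ratio(cap_ratio=2.15, vdd_ratio=2.9 / 5.0, freq_ratio=0.5)
    print(f"parallel / reference power = {r:.3f}")
```

Under these assumptions the quadratic voltage term outweighs the capacitance overhead, giving a reduction of the same order as the roughly 3X figure quoted above.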



Summary

• Various causes of power dissipation
– Switching, short circuit, leakage current
• Reducing power dissipation
– Voltage scaling – Decreases dynamic power
quadratically, other power linearly
– Technology Scaling – Reduces capacitances
– Transistor Sizing – Make sure you are on the correct part
of the power delay tradeoff curve
– Pipeline approach
– Hardware replication (parallelism) approach



Next Topic: Wires

• On-chip resistance and capacitance
– Delay estimation
– Buffering and repeater insertion


EEC 118 Lecture #15:
Interconnect

Rajeevan Amirtharajah
University of California, Davis
Outline
• Review and Finish: Low Power Design
• Interconnect Effects: Rabaey Ch. 4 and Ch. 9
(Kang & Leblebici, 6.5-6.6)

Amirtharajah, EEC 118 Spring 2011 3


Interconnect Modeling
• In the early days of CMOS, wires could be treated as ideal for
most digital applications; not so anymore!
• On-chip wires have resistance, capacitance, and
inductance
– Similar to MOSFET charging, energy depends solely on
capacitance
– Resistance might impact low power adiabatic charging,
static current dissipation, speed
– Ignore inductance for all but highest speed designs
• Interconnect modeling is a whole field of research itself!


Interconnect Models: Regions of Applicability
• For highest speed applications, wire must be treated as
a transmission line
– Includes distributed series resistance, inductance,
capacitance, and shunt conductance (RLGC)
• For many applications it is sufficient to use a lumped
capacitance (C) or distributed series
resistance-capacitance model (RC)
• Valid model depends on ratio of rise/fall times to
time-of-flight along wire
– l: wire length
– v: propagation velocity (speed of light in the medium)
– l/v: time-of-flight on wire


Interconnect Models: Regions of Applicability

• Transmission line modeling (inductance significant):
trise (tfall) < 2.5 x (l / v)
• Either transmission line or lumped modeling:
2.5 x (l / v) < trise (tfall) < 5 x (l / v)
• Lumped modeling:
trise (tfall) > 5 x (l / v)
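The three regions can be turned into a small classifier. This is a sketch only: the function name is hypothetical, the propagation velocity of 1.5e8 m/s (about half the free-space speed of light, roughly what an oxide dielectric gives) is an assumption, and the boundary cases are assigned to the middle region.

```python
def interconnect_model(t_edge, length, velocity=1.5e8):
    """Pick a wire model from the edge time and the time-of-flight l/v."""
    time_of_flight = length / velocity
    if t_edge < 2.5 * time_of_flight:
        return "transmission line"
    if t_edge <= 5.0 * time_of_flight:
        return "either"
    return "lumped"


if __name__ == "__main__":
    for length_mm in (1, 5, 10):
        model = interconnect_model(100e-12, length_mm * 1e-3)
        print(f"{length_mm} mm wire, 100 ps edges: {model}")
```

For 100 ps edges, a 1 mm wire is comfortably in the lumped region, while a 10 mm wire already calls for transmission line treatment under these assumptions.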
Resistance
• Resistance proportional to length and inversely
proportional to cross section
• Depends on material constant resistivity ρ (Ω-m)
[Figure: rectangular wire of length L, width W, and thickness t]

$$R = \frac{\rho L}{A} = \frac{\rho L}{tW} = R_{sq}\frac{L}{W}, \qquad R_{sq} = \frac{\rho}{t}$$
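As a worked example of the sheet resistance formula, the sketch below counts squares along a wire. The resistivity (about 2.7e-8 Ω-m, typical of aluminum) and the wire dimensions are assumptions chosen for illustration.

```python
def sheet_resistance(rho, thickness):
    """R_sq = rho / t, in ohms per square."""
    return rho / thickness


def wire_resistance(rho, thickness, width, length):
    """R = R_sq * (L / W): sheet resistance times the number of squares."""
    return sheet_resistance(rho, thickness) * length / width


if __name__ == "__main__":
    # Assumed: aluminum-like resistivity, 0.5 um thick, 1 um wide, 1 mm long.
    r_sq = sheet_resistance(2.7e-8, 0.5e-6)
    r = wire_resistance(2.7e-8, 0.5e-6, 1e-6, 1e-3)
    print(f"R_sq = {r_sq:.3f} ohm/sq, R = {r:.1f} ohm")
```

The wire is 1000 squares long, so the total resistance is simply 1000 times the sheet resistance.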
Parallel-Plate Capacitance
• Width large compared to dielectric thickness, height
small compared to width: E field lines orthogonal to
substrate
[Figure: wire of width W, thickness t, and length L on a dielectric of height h above the substrate]

$$C = \frac{\varepsilon_r}{h} W L$$
Fringing Field Capacitance
• When height comparable to width, must account for
fringing field component as well
[Figure: wire of width W and length L on a dielectric of height h above the substrate, with fringing field lines from the sidewalls]


Total Capacitance Model
• When height comparable to width, must account for
fringing field component as well
• Model as a cylindrical conductor above substrate
[Figure: wire of width W and thickness t in a dielectric at height h above the substrate, with the sidewalls modeled as a cylindrical conductor]


Total Capacitance Model
• Total capacitance per unit length is parallel-plate (area)
term plus fringing-field term:

$$c = c_{pp} + c_{fringe} = \frac{\varepsilon_r}{h}\left(W - \frac{t}{2}\right) + \frac{2\pi\varepsilon_r}{\log\left(2h/t + 1\right)}$$

• Model is simple and works fairly well (Rabaey, 2nd ed.)
– More sophisticated numerical models also available
• Process models often give both area and fringing (also
known as sidewall) capacitance numbers per unit
length of wire for each interconnect layer
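The two-term capacitance model can be evaluated directly. The sketch below makes two assumptions that the slide does not state explicitly: the permittivity in the formula is taken as the absolute permittivity (relative permittivity times ε0, with 3.9 for silicon dioxide), and the logarithm is taken as the natural log. The wire dimensions are illustrative.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, F/m


def cap_per_length(width, thickness, height, eps_rel=3.9):
    """Return (c_pp, c_fringe) in F/m for a wire above a ground plane."""
    eps = eps_rel * EPS0
    c_pp = eps * (width - thickness / 2.0) / height
    c_fringe = 2.0 * math.pi * eps / math.log(2.0 * height / thickness + 1.0)
    return c_pp, c_fringe


if __name__ == "__main__":
    # Assumed geometry: W = 1 um, t = 0.5 um, h = 1 um.
    c_pp, c_fringe = cap_per_length(1e-6, 0.5e-6, 1e-6)
    total_fF_per_um = (c_pp + c_fringe) * 1e15 * 1e-6
    print(f"c_pp = {c_pp:.3e} F/m, c_fringe = {c_fringe:.3e} F/m, "
          f"total = {total_fF_per_um:.3f} fF/um")
```

For this geometry the fringing term is several times larger than the parallel-plate term, which illustrates why it cannot be ignored when wire height is comparable to width.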
Alternative Total Capacitance Models
• For wide lines (w ≥ t/2) Kang & Leblebici Eq. 6.53:

$$C = \frac{\varepsilon_r}{h}\left(W - \frac{t}{2}\right) + \frac{2\pi\varepsilon_r}{\ln\left(1 + \dfrac{2h}{t} + \sqrt{\dfrac{2h}{t}\left(\dfrac{2h}{t} + 2\right)}\right)}$$

• For narrow lines (w ≤ t/2) Kang & Leblebici Eq. 6.54:

$$C = \frac{\varepsilon_r W}{h} + \frac{\pi\varepsilon_r\left(1 - 0.0543\,\dfrac{t}{2h}\right)}{\ln\left(1 + \dfrac{2h}{t} + \sqrt{\dfrac{2h}{t}\left(\dfrac{2h}{t} + 2\right)}\right)} + 1.47\,\varepsilon_r$$
Capacitive Coupling
• Fringing fields can terminate on adjacent conductors
as well as substrate
• Mutual capacitance between wires implies crosstalk,
affects data dependency of power
[Figure: adjacent wires in a dielectric above the substrate, with field lines terminating on neighboring wires]


Miller Capacitance
• Amount of charge moved onto mutual capacitance
depends on switching of surrounding wires
• When adjacent wires move in opposite direction,
capacitance is effectively doubled (Miller effect)
[Figure: wires A, B, C with mutual capacitance Cm between A and B and between B and C]

$$\Delta Q = C_m\left(V_f - V_i\right) = C_m\left(V_{DD} - \left(-V_{DD}\right)\right) = 2 C_m V_{DD}$$


Data Dependent Switched Capacitance 1
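• When adjacent wires move in same direction, mutual
capacitance is effectively eliminated
[Figure: switching patterns for wires A, B, C with the effective capacitance seen by B; transition arrows not recovered]
– Both neighbors (A, C) switch in the same direction as B: Ceff = 0
– Both neighbors switch opposite to B: Ceff = 4Cm
– One neighbor switches with B, the other opposite to B: Ceff = 2Cm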
Data Dependent Switched Capacitance 2
• When adjacent wires are static, mutual capacitance is
effectively to ground
– Neighbors A and C static at any combination of values
(0 0, 1 1, 1 0, 0 1) while B switches in either direction: Ceff = 2Cm
• Remember: it is the charging of capacitance where we
account for energy from supply, not discharging
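The data dependence of the coupling capacitance can be captured with one rule: each neighbor contributes Cm scaled by how far the voltage across the coupling capacitor moves relative to the swing on B. The sketch below encodes transitions as +1 (rising), −1 (falling), and 0 (static); this encoding and the function name are assumptions for illustration.

```python
def effective_coupling(cm, b, a, c):
    """Effective coupling capacitance seen by wire B with neighbors A and C.

    b must be +1 or -1 (B is switching); a and c are +1, -1, or 0.
    Per neighbor: same direction -> 0, static -> Cm, opposite -> 2*Cm.
    """
    if b not in (1, -1):
        raise ValueError("wire B must be switching")
    return sum(cm * (1 - n / b) for n in (a, c))


if __name__ == "__main__":
    cm = 1.0
    for a, c in [(1, 1), (-1, -1), (1, -1), (0, 0)]:
        print(f"A={a:+d}, B=+1, C={c:+d}: Ceff = {effective_coupling(cm, 1, a, c)} Cm")
```

The four printed cases reproduce the values 0, 4Cm, 2Cm, and 2Cm discussed above for same-direction, opposite-direction, mixed, and static neighbors.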


Lumped RC Model
[Figure: lumped RC model; a single series resistance R driving a single capacitance C]
• Simplest model used to represent the resistive and
capacitive interconnect parasitics
• Propagation delay (same as FET switch model):
t_PLH ≈ 0.69 RC
RC T-Model

[Figure: RC T-model; two series resistances R/2 with the capacitance C connected at the midpoint]
• Significantly improves accuracy of transient
behavior over the lumped RC model
• Useful if simulation time is a bottleneck, much
simpler than fully distributed model
Distributed RC Model

[Figure: RC ladder of N sections, each with series resistance R/N and shunt capacitance C/N]
• Elmore delay approximation for RC ladder network:

$$t_{DN} = \frac{RC}{2} \quad \text{as } N \to \infty$$
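The limit above can be checked by summing the Elmore terms for a finite ladder. In the sketch below each of the N capacitors C/N is charged through all the resistors between it and the driver; the function name and the 1 kΩ / 1 pF wire are illustrative assumptions.

```python
def elmore_delay_ladder(r_total, c_total, n_sections):
    """Elmore delay at the far end of an N-section uniform RC ladder."""
    r_seg = r_total / n_sections
    c_seg = c_total / n_sections
    # Capacitor i (counted from the driver) sees i upstream resistors.
    return sum(i * r_seg * c_seg for i in range(1, n_sections + 1))


if __name__ == "__main__":
    r, c = 1e3, 1e-12  # assumed 1 kohm, 1 pF wire, so RC = 1 ns
    for n in (1, 2, 10, 1000):
        print(n, elmore_delay_ladder(r, c, n) / (r * c))
```

The sum equals RC(N + 1)/(2N), so a single lumped section gives RC and the result approaches RC/2 as N grows, which is why the lumped model is pessimistic.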
Repeater Insertion to Reduce Wire Delay

[Figure: long wire divided into N segments (1, 2, …, N), each with capacitance C/N and driven by an inverter]
• Insert inverters along long wires at regular intervals
• Breaks up resistance and capacitance, reducing delay
dramatically
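A rough estimate shows why repeaters help. The sketch below assumes each of the N segments has distributed delay (R/N)(C/N)/2 and that each repeater adds a fixed delay; repeater input capacitance and output resistance are ignored, and all values are illustrative.

```python
def repeated_wire_delay(r_total, c_total, n_segments, t_repeater):
    """Total delay of a wire split into n_segments, each driven by a repeater."""
    segment_delay = (r_total / n_segments) * (c_total / n_segments) / 2.0
    return n_segments * (segment_delay + t_repeater)


def best_segment_count(r_total, c_total, t_repeater, max_segments=100):
    """Brute-force search for the segment count with the lowest total delay."""
    return min(range(1, max_segments + 1),
               key=lambda n: repeated_wire_delay(r_total, c_total, n, t_repeater))


if __name__ == "__main__":
    r, c, t_rep = 1e3, 1e-12, 20e-12  # assumed 1 kohm, 1 pF wire, 20 ps repeaters
    n = best_segment_count(r, c, t_rep)
    print(n, repeated_wire_delay(r, c, 1, t_rep), repeated_wire_delay(r, c, n, t_rep))
```

The wire term falls as 1/N while the repeater term grows as N, so under this model the optimum sits near the square root of RC / (2 × repeater delay), and total delay becomes linear rather than quadratic in wire length.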


Inductance
• Inductance can be determined by direct application of
definition:

$$\Delta V = L\frac{di}{dt}$$

• Can compute inductance directly from wire geometry
and surrounding environment using field solver
• Simpler approach relates capacitance per length c with
inductance per length l:

$$c\,l = \varepsilon\mu$$

– Assumes uniform or “average” dielectric
Summary
• Many important effects to consider in interconnect design
– Resistance, capacitance, inductance can all affect signal
performance
– For signals with long rise/fall times, only resistance and
capacitance need to be considered
• Several models useful for RC interconnect delay analysis
– Simple lumped (1 R, 1 C) model: easy to analyze and/or
simulate, will be pessimistic
– T-model (2 Req = R/2, 1 C): more accurate than lumped
– Distributed model (N Req = R/N, N Ceq = C/N): most accurate,
use Elmore delay approximation for hand analysis
Next Topic: Design for Manufacturability

• Parameter variations in CMOS digital circuits
• Yield maximization and worst-case design


EEC 118 Lecture #16:
Manufacturability

Rajeevan Amirtharajah
University of California, Davis
Outline
• Finish interconnect discussion
• Manufacturability: Rabaey G, H (Kang & Leblebici,
14)



Design for Manufacturability
• For class projects or university research, goal is a
single working circuit or small number of prototypes
– Similar scale for industrial research projects
• Production goal is usually thousands to 100s of
millions of working (or at least marketable) parts
– Must evaluate circuit designs over a range of parameter
variations to ensure correct functionality, performance
• Design for Manufacturability or Statistical Circuit
Design encompasses a variety of techniques
– Yield estimation and maximization, worst-case analysis,
etc.
Circuit Parameter Variations
• All circuit parameters vary some amount due to
variations in process, lithography, or environment
– Geometric parameters: transistor W and L
– Device parameters: VT, tox, μ
– Interconnect parameters: R, C
– Operating conditions: VDD, T
• Variations occur both spatially and temporally
(variation increases down the list)
– Circuit-to-circuit on same die (spatial)
– Die-to-die on same wafer (spatial)
– Wafer-to-wafer in same fab (temporal)
• Example: transistor width W = W0 + ΔW, where the designer
controls W0 and ΔW is random
CMOS Inverter Example
[Figure: CMOS inverter with input Vin, output Vout, and load capacitor Cload]
• For both NMOS and PMOS:
– W = W0 + ΔW
– L = L0 + ΔL
– VT = VT0 + ΔVT
– k’ = k’0 + Δk’
• For capacitor:
– Cload = C0 + ΔC
• For entire circuit:
– T = T0 + ΔT
– VDD = VDD0 + ΔVDD
• All these parameters affect circuit performance!
Performance Variation Example
[Figure: Inverter Delay Histogram; x-axis Propagation Delay (ps) in 10 ps bins from 0 to 100, y-axis Count from 0 to 25]
• Delay variations with parameters, loading, VDD, and T
Yield Estimation and Maximization
• Parametric Yield: ratio of total acceptable circuits to total
manufactured circuits
– Design for manufacturability aims to maximize yield (and $$)
• Yield statistics are usually complicated since circuit
performance is complex function of parameters
• Numerous methods for estimating and maximizing yield
– Response surface models (RSM): compact analytical model
fit to circuit simulations using Design of Experiments
– Direct Monte Carlo circuit simulations or the RSM can be
used to estimate yields
– Designer controlled parameters then adjusted to maximize
yield estimates
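A direct Monte Carlo yield estimate can be sketched as follows. Everything numeric here is an assumption for illustration: the first-order delay model (load charge divided by square-law saturation current), the nominal values, and the Gaussian spreads are hypothetical stand-ins for what a circuit simulator and fab statistics would provide.

```python
import random


def inverter_delay(w, l, vt, c_load, vdd, k_prime=100e-6):
    """Hypothetical first-order delay model: t = C * VDD / I_sat."""
    i_sat = 0.5 * k_prime * (w / l) * (vdd - vt) ** 2
    return c_load * vdd / i_sat


def estimate_yield(n_samples, t_spec, seed=0):
    """Fraction of sampled circuits whose delay meets the spec t_spec."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    passed = 0
    for _ in range(n_samples):
        # Each parameter is its nominal value plus a random deviation.
        w = rng.gauss(1.0e-6, 0.05e-6)
        l = rng.gauss(0.18e-6, 0.01e-6)
        vt = rng.gauss(0.5, 0.03)
        c_load = rng.gauss(10e-15, 1e-15)
        vdd = rng.gauss(1.8, 0.06)
        if inverter_delay(w, l, vt, c_load, vdd) <= t_spec:
            passed += 1
    return passed / n_samples


if __name__ == "__main__":
    for spec_ps in (35, 40, 45, 50):
        print(spec_ps, "ps spec -> yield", estimate_yield(10_000, spec_ps * 1e-12))
```

A designer would then adjust the controllable nominal values (for example W0) and rerun the estimate, trading area and power against the fraction of parts that meet the delay spec.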
Worst-Case Design 1
• Given range of variations for process, voltage, and
temperature, identify worst (best) cases for performance
parameter of interest
– Process corner models from fab define limits of device
performance
– Labeled by NMOS-PMOS pairs, e.g. Typical NMOS-Typical
PMOS (TT)
– Usual additional corners: Fast NMOS-Fast PMOS (FF),
Slow NMOS-Slow PMOS (SS), Fast NMOS-Slow PMOS
(FS), Slow NMOS-Fast PMOS (SF)
– Usual voltage corners: Nominal VDD +/- 10%
– Temperature range: 0 – 100 °C


Worst-Case Design 2
• Identify worst (best) cases for performance
parameter of interest
• Typical Speed Corner
– Typical NMOS-Typical PMOS (TT), nominal VDD,
room temperature 27 °C
• Slow Speed Corner
– Slow NMOS-Slow PMOS (SS), 0.9 x VDD, maximum
temperature 100 °C
• Fast Speed Corner
– Fast NMOS-Fast PMOS (FF), 1.1 x VDD, minimum
temperature 0 °C
Summary
• Design for manufacturability converts a prototype into a
“real” design for large-scale production
– Statistical models of process, device and interconnect
parameters, and operating conditions used to estimate and
maximize yield (and profits)
– Analysis is difficult because of complexity, usually
numerical models and many simulations required
• Variability trend is worsening as processes shrink
– For example, locations of individual dopant atoms can
affect transistor performance
• Statistical circuit design is becoming as important as
performance and power!



Next Topic: Future Directions & Final Review

• Future directions in CMOS digital circuits
• Alternative logic technologies to CMOS
• Final exam review


EEC 118 Lecture #17:
Future Directions
Alternatives to CMOS
Final Review
Rajeevan Amirtharajah
University of California, Davis
Announcements
• Lab 6 final report due Thursday, June 2, 5 PM in
homework box
• Final Exam: Monday, June 6, 8-10 AM
• Regular Office Hours: Wednesday 2-3 PM
• TA Office Hours Final Review: TBD



Digital Circuits Beyond CMOS
• Past digital technologies
– Electromagnetic Relay: mechanical switch controlled by a
current; ideal switch but high power and slow
– Vacuum Tube: three (usually) terminal device, similar to a
MOSFET but very high power
– Bipolar Transistor: high transconductance, but high power,
finite input impedance limits fanout
• State-of-the-art digital technology is bulk silicon CMOS
– Low power, little static power (compared to other
technologies)
– Infinite input impedance allows large fanouts
– Vast development investment has led to extremely high
reliability, high yields, and high performance
Digital Circuits Beyond CMOS
• Modifications on CMOS can extend Moore’s Law
– Strained silicon: mechanical stress on bulk crystal improves
mobility
– Silicon-on-Insulator: insulating bulk decreases parasitic
capacitors, allows tighter integration
– New materials: low-k interconnect dielectrics, high-k gate
dielectrics, metal gates
– New structures: 3D gates (finFET)
• But what happens when scaling stops?
– International Technology Roadmap for Semiconductors
extends to about 20 nm gate lengths in 2015-2020 time
frame
– Research devices demonstrated down to 5 nm gate lengths
Nanometer Gate Length Bulk FETs

[Figure: 10 nm FET compared with 180 nm FET (Doyle, Intel 2002)]
• Still more room at the bottom!
• In 10 nm CMOS, Intel 386 occupies 25 μm x 25 μm
• Fabricated using a combination of lithography and
epitaxial growth to define feature sizes
Three Dimensional Transistors: FinFETs

[Figure: FinFET structure (Hisamoto, JSSC 2000)]
• Channel forms in thin silicon fin
• Gate wraps around fin to control channel formation
• Allows very small channel lengths
Digital Circuits Beyond CMOS
• New devices based on new materials or quantum effects
– Gallium Arsenide (GaAs) MESFET: high electron mobility
but no complementary device and poor oxide isolation
– Josephson Junctions: superconducting logic and
interconnect promises very high speed (THz), but only two
terminals inconvenient for circuit design
– Carbon Nanotube Transistors: carbon nanotube devices
demonstrate high carrier mobilities and promise high speed
– Molecular Electronics: single organic molecules shown to
switch states in the lab, but also only two terminals
– Organic Electronics: semiconducting plastic substrate,
enables flexible displays and low cost but offers poor
performance



Alternatives to Electronics
• A number of completely different digital technologies
– Optical Computing: uses photons instead of charge carriers
to carry information, but there is no good three-terminal
nonlinear optical element and integration is difficult
– Quantum Computing: uses atomic-scale structures to store
multiple bits simultaneously and operates on them using the
laws of quantum mechanics, allowing massive parallelism,
but is very sensitive to noise
– Biological Computing: use DNA gene coding and promoter
and repressor sites to control synthesis of proteins, which
form the digital “signals”
• But can anything replace CMOS?
– Maybe, but not for a long time
– Some parts of a system (memory) before others (logic)
Final Exam Review List
Closed book, closed notes. One 8.5 in. x 11 in. formula sheet
allowed (both sides). You may bring a calculator.
• MOS Fabrication (very basics)
• MOS Structure
• Inverter Operation
• CMOS Inverter (Static Characteristics, e.g. VTC)
• CMOS Inverter (Dynamic Characteristics, e.g. tpd)
• Complex Gates, Pseudo NMOS
– DC and Transient characteristics
• Sequential Logic
• Logical Effort
Final Exam Review List
• Pass Transistor Logic, Pseudo NMOS
• Dynamic Logic
• Memory (DRAM, SRAM, Flash, ROM)
• Low Power Circuits
• Arithmetic Circuits
• Interconnect
• Study all homework and lecture materials
• Study labs



Key Learnings
• Should know what region of operation the transistor is in
given the bias voltages at its terminals
• Should know what the PMOS and NMOS Id vs Vds curves
look like
• Be able to identify major points of the VTC for a CMOS
Device [Voh, Vol, Vih, Vil, Vth]
• Need to know how to calculate the total capacitance at
the output node (including Miller effect)
• Know the relevant capacitances of a transistor used in
transient analysis (i.e. Cgs, Cgd, …).
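
The first point can be practiced with a small sketch like the one below. It assumes the long-channel (square-law) model with source-referenced voltages and Vds >= 0, and it ignores subthreshold conduction and velocity saturation. The function names and the 0.5 V threshold are hypothetical, chosen only for illustration.

```python
def nmos_region(vgs: float, vds: float, vt: float) -> str:
    """Classify the operating region of a long-channel NMOS (assumes vds >= 0)."""
    if vgs <= vt:
        return "cutoff"          # no inversion channel
    if vds < vgs - vt:
        return "linear"          # channel extends from source to drain
    return "saturation"          # channel pinched off at the drain end

def pmos_region(vsg: float, vsd: float, vt_abs: float) -> str:
    """Same test for a PMOS using source-gate and source-drain voltages and |Vt|."""
    return nmos_region(vsg, vsd, vt_abs)

# Hypothetical bias points with Vt = 0.5 V
for vgs, vds in [(0.3, 1.0), (1.5, 0.2), (1.0, 1.0)]:
    print(vgs, vds, nmos_region(vgs, vds, 0.5))
```

Writing the PMOS case in terms of Vsg, Vsd, and |Vt| keeps all quantities positive, so the same comparisons apply to both device types.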



Key Learnings
• Know how to calculate propagation delay using Req and
capacitance load. Be able to derive Req and Cload.
• Know Pseudo NMOS and Pass Transistor Logic pros and
cons
• Know Different Adders and Multipliers
– Know concepts of how they speed up these arithmetic units
• Dynamic Logic concepts
– Pros and cons, techniques used to cascade, avoid noise,
etc.
• Memory (different types, SRAM operation)
• Low Power Design techniques (voltage scaling, pipelining,
etc.)
• Interconnect basics (resistance, capacitance)
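
For the propagation delay item, a minimal sketch is given below. It assumes the first-order RC model in which the output crosses 50% after ln(2) ≈ 0.69 time constants, so tp ≈ 0.69 · Req · Cload. The resistance and capacitance values are hypothetical and are not taken from the course.

```python
import math

def propagation_delay(r_eq_n: float, r_eq_p: float, c_load: float):
    """Return (tpHL, tpLH, tp) in seconds for a first-order RC inverter model."""
    k = math.log(2)                  # ~0.69: time for an RC step to reach 50%
    t_phl = k * r_eq_n * c_load      # output falls through the NMOS pull-down
    t_plh = k * r_eq_p * c_load      # output rises through the PMOS pull-up
    return t_phl, t_plh, 0.5 * (t_phl + t_plh)

# Hypothetical values: Reqn = 10 kOhm, Reqp = 15 kOhm, Cload = 10 fF
t_phl, t_plh, t_p = propagation_delay(10e3, 15e3, 10e-15)
print(f"tpHL = {t_phl * 1e12:.1f} ps, tpLH = {t_plh * 1e12:.1f} ps, tp = {t_p * 1e12:.1f} ps")
```

Cload here stands for the total capacitance at the output node, so any Miller-multiplied gate-drain capacitance and the fanout gate capacitance should already be summed into it before calling the function.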
Key Learnings

• Know how to find the Boolean function from a schematic
• Know how to properly size transistors to get the
equivalent resistance of a basic inverter
• Know how to size devices and choose logic depth
based on logical effort
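
The logical effort item can be illustrated with the sketch below. It assumes the standard path formulation: path effort F = G·B·H, equal stage effort f = F^(1/N), and minimum delay D = N·f + P in units of the technology time constant. The gate values used (inverter g = 1, p = 1; NAND2 g = 4/3, p = 2) are the usual textbook numbers for a 2:1 PMOS/NMOS ratio, and the path itself is hypothetical.

```python
import math

def path_delay(logical_efforts, parasitics, branching, c_out, c_in):
    """Return (path effort F, optimal stage effort f, minimum delay D in tau)."""
    n = len(logical_efforts)
    g = math.prod(logical_efforts)       # path logical effort G
    b = math.prod(branching)             # path branching effort B
    h = c_out / c_in                     # path electrical effort H
    f_path = g * b * h
    f_stage = f_path ** (1.0 / n)        # delay is minimized when stage efforts are equal
    delay = n * f_stage + sum(parasitics)
    return f_path, f_stage, delay

# Hypothetical path: inverter -> NAND2 -> inverter, no branching, Cout/Cin = 48
F, f, D = path_delay([1.0, 4.0 / 3.0, 1.0], [1.0, 2.0, 1.0], [1.0, 1.0, 1.0], 48.0, 1.0)
print(f"F = {F:.1f}, stage effort = {f:.2f}, D = {D:.1f} tau")
```

With these numbers the stage effort comes out near 4, which is close to the value usually quoted as the best stage effort, so three stages is a reasonable logic depth for this load.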

These are here to help focus your study but are not an
exhaustive list of what you are responsible for

