L04 Arithmetic I

The document outlines Lecture 4 of COE 485, focusing on basic arithmetic operations and the Arithmetic Logic Unit (ALU) in computer architecture. It covers number representations, addition and subtraction methods, and the design of full adders and ripple-carry adders. Additionally, it discusses logical operations, overflow, and the implementation of various arithmetic functions using selection codes.

Uploaded by

Isaac
Copyright
© © All Rights Reserved

COE 485:

Advanced Computer Architecture


Lecture 4: Arithmetic I

Instructor: K. O. Boateng
TA:

Semester 1, 2022/23

Dept. of Comp. Engg., KNUST, Kumasi


Basic Arithmetic and the ALU
• Number representations: 2’s complement, unsigned
• Addition/Subtraction
• Add/Sub ALU
– Full adder, ripple carry, subtraction
• Carry-lookahead addition
• Logical operations
– and, or, xor, nor, shifts
• Overflow
• Covered later:
– Integer multiplication, division
– Floating point arithmetic
Background
• Recall
– n bits enable 2^n unique combinations
• Notation: b31 b30 … b3 b2 b1 b0
• No inherent meaning
– f(b31…b0) => integer value
– f(b31…b0) => control signals
Background
• 32-bit types include
– Unsigned integers
– Signed integers
– Single-precision floating point
– MIPS instructions (book inside cover)
Unsigned Integers
• f(b31…b0) = b31 x 2^31 + … + b1 x 2^1 + b0 x 2^0
• Treat as normal binary number
E.g. 011010101
= 1 x 2^7 + 1 x 2^6 + 0 x 2^5 + 1 x 2^4 + 0 x 2^3 + 1 x 2^2 + 0 x 2^1 + 1 x 2^0
= 128 + 64 + 16 + 4 + 1 = 213
• Max f(111…11) = 2^32 - 1 = 4,294,967,295
• Min f(000…00) = 0
• Range [0, 2^32 - 1] => # values = (2^32 - 1) - 0 + 1 = 2^32
Signed Integers
• 2’s complement
f(b31…b0) = -b31 x 2^31 + … + b1 x 2^1 + b0 x 2^0
• Max f(0111…11) = 2^31 - 1 = 2,147,483,647
• Min f(100…00) = -2^31 = -2,147,483,648 (asymmetric)
• Range [-2^31, 2^31 - 1] => # values = (2^31 - 1) - (-2^31) + 1 = 2^32
• To negate, invert bits and add one: e.g. -6
– 000…0110 => 111…1001 + 1 => 111…1010
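The two interpretations above can be checked with a small sketch. The function names below are illustrative, not from the lecture; bit strings are written MSB first.

```python
def from_unsigned(bits: str) -> int:
    """Interpret a bit string (MSB first) as an unsigned integer."""
    return int(bits, 2)

def from_twos_complement(bits: str) -> int:
    """Interpret a bit string as 2's complement: the MSB has weight -2^(n-1)."""
    n = len(bits)
    value = int(bits, 2)
    return value - (1 << n) if bits[0] == "1" else value

def negate(bits: str) -> str:
    """Negate by inverting every bit and adding one, keeping n bits."""
    n = len(bits)
    inverted = int(bits, 2) ^ ((1 << n) - 1)
    return format((inverted + 1) % (1 << n), f"0{n}b")
```

For example, `negate("0110")` gives `"1010"`, which `from_twos_complement` reads back as -6.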
Why 2’s Complement
• Why not use sign-magnitude?
• 2’s complement makes hardware simpler
• Just like humans don’t work with Roman numerals
• Representation affects ease of calculation, not correctness of answer
[Figure: two 3-bit number wheels. 2’s complement: 000 = 0, 001 = 1, 010 = 2, 011 = 3, 100 = -4, 101 = -3, 110 = -2, 111 = -1. Sign-magnitude: 000 = 0, 001 = 1, 010 = 2, 011 = 3, 100 = -0, 101 = -1, 110 = -2, 111 = -3.]
Addition and Subtraction
• 4-bit unsigned example
  0 0 1 1     3
+ 1 0 1 0    10
= 1 1 0 1    13
• 4-bit 2’s complement, ignoring overflow
  0 0 1 1     3
+ 1 0 1 0    -6
= 1 1 0 1    -3
Subtraction
A - B = A + 2’s complement of B
• E.g. 3 - 2
  0 0 1 1     3
+ 1 1 1 0    -2
= 0 0 0 1     1
Full Adder
• Full adder (a, b, cin) => (cout, s)
• cout = 1 when two or more of (a, b, cin) are 1s
• s = 1 when exactly one or three of (a, b, cin) are 1s

a b cin cout s
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 1 1
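
A minimal sketch of the truth table above as code. The name `full_adder` and the (cout, s) return order are choices made here for illustration; the sum is the XOR of the three inputs and the carry is the majority function.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """One-bit full adder. Returns (cout, s)."""
    s = a ^ b ^ cin                          # 1 when one or three inputs are 1
    cout = (a & b) | (a & cin) | (b & cin)   # 1 when two or more inputs are 1
    return cout, s
```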
The Full Adder
Arithmetic-logic units (ALUs)

• The ALU performs many different arithmetic and logic operations
• The “heart” of a processor
– everything else in the CPU is there to support the ALU
• Here’s the plan:
– An arithmetic unit first
• based on the adder-subtractor circuit
– Then a bit about logic operations
• and build a logic unit
– Finally, putting the two together using multiplexers
4-bit Ripple-carry Adder

• Just concatenate the full adders

At the ith adder:

ci+1 = ai * bi + ai * ci + bi * ci
Ripple-carry Subtractor

• B – A = B + (-A) => invert A and set c0 to 1


Combined Ripple-carry Adder/Subtractor

• Control = 1 => subtract


• XOR Y with control and set cin0 to control
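
A sketch of the combined ripple-carry adder/subtractor, assuming bit lists stored LSB first. The helper names `to_bits`, `to_int` and `add_sub` are hypothetical; the carry ripples through one full adder per bit, and `control` is both XORed into Y and fed in as the first carry.

```python
def full_adder(a, b, cin):
    """One-bit full adder. Returns (cout, s)."""
    return (a & b) | (a & cin) | (b & cin), a ^ b ^ cin

def to_bits(value, n):
    """n-bit list of 0/1 values, index 0 is the LSB."""
    return [(value >> i) & 1 for i in range(n)]

def to_int(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def add_sub(x_bits, y_bits, control):
    """control = 0 computes X + Y, control = 1 computes X - Y.
    Returns (sum_bits, carry_out)."""
    carry = control                     # cin0 = control
    out = []
    for x, y in zip(x_bits, y_bits):
        carry, s = full_adder(x, y ^ control, carry)   # XOR Y with control
        out.append(s)
    return out, carry
```

With 4-bit inputs, `add_sub(to_bits(3, 4), to_bits(2, 4), 1)` yields the bits of 1, matching the 3 - 2 example earlier in the lecture.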
The four-bit adder
• Always computes S = A + B + CI

• Changing what goes into inputs A, B and CI can change the adder output S
It’s the adder-subtractor again!
• Here the signal Sub and some XOR gates alter the adder
inputs
– When Sub = 0, the adder inputs A, B, CI are Y, X, 0, so the
adder produces G = X + Y + 0, or just X + Y.
– When Sub = 1, the adder inputs are Y’, X and 1, so the adder
output is G = X + Y’ + 1, or the two’s complement operation X
- Y.

The multi-purpose adder
• One adder performing two separate functions
• “Sub” acts like a function select input
– determines whether the circuit performs addition or subtraction
• Circuit-wise, all “Sub” does is modify the adder’s inputs A
and CI

Modifying the adder inputs
• By the same approach, we can use an adder to compute
other functions as well
• Figure out which functions are needed, and then put the
right circuitry into the “Input Logic” box

Some more possible functions
• Adder inputs A, B and CI can be set to compute either X + Y
or X - Y
• How can we produce the increment function G = X + 1?
One way: Set A = 0000, B = X, and CI = 1
• How about decrement: G = X - 1?
A = 1111 (-1), B = X, CI = 0

• How about transfer: G = X? (This can be useful.)
A = 0000, B = X, CI = 0
Almost the same as the increment function!
The role of CI

• The transfer and increment operations have the same A and B inputs, and differ only in the CI input
• In general we can get additional functions (not all of them useful) by using both CI = 0 and CI = 1.
• Another example:
– Two’s-complement subtraction is obtained by setting A =
Y’, B = X, and CI = 1, so G = X + Y’ + 1
– If we keep A = Y’ and B = X, but set CI to 0, we get G =
X + Y’
• This turns out to be a ones’ complement subtraction operation

Table of arithmetic functions
• Here are some of the different possible arithmetic
operations
• We’ll need some way to specify which function we’re interested in, so we’ve arbitrarily assigned a selection code to each operation

S2 S1 S0   Arithmetic operation
0  0  0    X (transfer)
0  0  1    X + 1 (increment)
0  1  0    X + Y (add)
0  1  1    X + Y + 1
1  0  0    X + Y’ (1C subtraction)
1  0  1    X + Y’ + 1 (2C subtraction)
1  1  0    X - 1 (decrement)
1  1  1    X (transfer)
Mapping the table to an adder
• This second table shows what the adder’s inputs should be
for each of our eight desired arithmetic operations
– Adder input CI is always the same as selection code bit S0.
– B is always set to X.
– A depends only on S2 and S1.
• These equations depend on both the desired operations and the assignment of selection codes

Selection code   Desired arithmetic operation    Required adder inputs
S2 S1 S0         G (A + B + CI)                  A      B   CI
0  0  0          X (transfer)                    0000   X   0
0  0  1          X + 1 (increment)               0000   X   1
0  1  0          X + Y (add)                     Y      X   0
0  1  1          X + Y + 1                       Y      X   1
1  0  0          X + Y’ (1C subtraction)         Y’     X   0
1  0  1          X + Y’ + 1 (2C subtraction)     Y’     X   1
1  1  0          X - 1 (decrement)               1111   X   0
1  1  1          X (transfer)                    1111   X   1
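
The table can be read as a recipe: pick A from (S2, S1), set B = X and CI = S0, then add. A sketch under the assumption of a 4-bit datapath; `arithmetic_unit` and `MASK` are names chosen here.

```python
MASK = 0b1111  # 4-bit words, as in the lecture

def arithmetic_unit(s: int, x: int, y: int) -> int:
    """Compute G = A + B + CI for a 3-bit selection code s = S2 S1 S0."""
    s2, s1, s0 = (s >> 2) & 1, (s >> 1) & 1, s & 1
    a = [0b0000, y, ~y & MASK, 0b1111][2 * s2 + s1]   # A depends only on S2, S1
    b = x                                             # B is always X
    ci = s0                                           # CI is always S0
    return (a + b + ci) & MASK                        # carry-out is dropped
```

With X = 5 and Y = 3, code 101 gives 2 (that is, 5 - 3), and codes 000 and 111 both return 5.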

Building the input logic
• Compute adder input A, given input Y and the function select code S (actually just S2 and S1; why?)
• Here is an abbreviated truth table:

S2 S1 A
0 0 0000
0 1 Y
1 0 Y’
1 1 1111

• We want to pick one of these four possible values for A, depending on S2 and S1.
Primitive gate-based input logic

• We could build this circuit using primitive gates
• If we want to use K-maps for simplification, then we should first expand out the abbreviated truth table
– The Y that appears in the output column (A) is actually an input
– We make that explicit in the expanded table below.
• Remember A and Y are each 4 bits long!

Abbreviated table:
S2 S1 A
0  0  0000
0  1  Y
1  0  Y’
1  1  1111

Expanded table (one bit position i):
S2 S1 Yi Ai
0  0  0  0
0  0  1  0
0  1  0  0
0  1  1  1
1  0  0  1
1  0  1  0
1  1  0  1
1  1  1  1
Primitive gate implementation
• From the truth table, we can find an MSP (minimal SoP):

K-map for Ai (columns are S1 Yi):
          S1 Yi = 00  01  11  10
S2 = 0            0   0   1   0
S2 = 1            1   0   1   1

Ai = S2 Yi’ + S1 Yi

• Again, we have to repeat this once for each bit Y3-Y0, connecting to the adder inputs A3-A0
• This completes our arithmetic unit
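
As a quick check, the minimal sum of products Ai = S2 Yi’ + S1 Yi can be compared against the abbreviated truth table for every input combination. Both function names are made up for this sketch.

```python
def a_bit_from_equation(s2: int, s1: int, yi: int) -> int:
    """Ai = S2 * Yi' + S1 * Yi, with Yi' written as yi ^ 1."""
    return (s2 & (yi ^ 1)) | (s1 & yi)

def a_bit_from_table(s2: int, s1: int, yi: int) -> int:
    """Abbreviated table: 00 -> 0, 01 -> Yi, 10 -> Yi', 11 -> 1."""
    return [0, yi, yi ^ 1, 1][2 * s2 + s1]

# True when the equation and the table agree on all eight rows.
equation_matches_table = all(
    a_bit_from_equation(s2, s1, yi) == a_bit_from_table(s2, s1, yi)
    for s2 in (0, 1) for s1 in (0, 1) for yi in (0, 1)
)
```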

Bitwise operations
• Most computers also support logical operations like AND,
OR and NOT, but extended to multi-bit words
• To apply a logical operation to two words X and Y, apply the
operation on each pair of bits Xi and Yi:
      1011          1011           1011
AND   1110     OR   1110     XOR   1110
      1010          1111           0101
Bitwise operations in programming
• Languages like C, C++ and Java provide bitwise logical
operations:

& (AND) | (OR) ^ (XOR) ~ (NOT)

• Each integer is treated as a bunch of individual bits:


13 & 25 = 9 because 01101 & 11001 = 01001

• Not the same as the operators &&, || and !, which treat each
integer as a single logical value (0 is false, everything else is
true):

13 && 25 = 1 because true && true = true

• Bitwise operators are often used in programs to set a bunch of Boolean options, or flags, with one argument
• Easy to represent sets of fixed universe size with bits:
– 1: is member, 0 not a member. Unions: OR, Intersections: AND
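
A sketch of both uses in Python, whose `&`, `|`, `^` and `~` operators behave like the C ones on non-negative integers. The flag names and helper functions are hypothetical. Note that Python's `and`/`or` return one of their operands rather than 1 or 0, so they are not a drop-in match for C's `&&` and `||`.

```python
# Hypothetical option flags, one bit each.
READ, WRITE, EXECUTE = 0b001, 0b010, 0b100

def has_flag(flags: int, flag: int) -> bool:
    """True when the given flag bit is set."""
    return (flags & flag) != 0

def to_mask(members) -> int:
    """Represent a set of small non-negative integers as a bit mask."""
    mask = 0
    for m in members:
        mask |= 1 << m
    return mask

def members(mask: int, universe_size: int = 8) -> list[int]:
    """List the elements whose bit is 1."""
    return [i for i in range(universe_size) if (mask >> i) & 1]
```

Union is then `to_mask(a) | to_mask(b)` and intersection is `to_mask(a) & to_mask(b)`.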

Bitwise operations in networking
• IP addresses are actually 32-bit binary numbers, and
bitwise operations can be used to find network information
• For example, you can bitwise-AND an address
192.168.10.43 with a “subnet mask” to find the “network
address,” or which network the machine is connected to
192.168. 10. 43 = 11000000.10101000.00001010.00101011
& 255.255.255.224 = 11111111.11111111.11111111.11100000
192.168. 10. 32 = 11000000.10101000.00001010.00100000

• You can use bitwise-OR to generate a “broadcast address,” for sending data to all machines on the local network
  192.168. 10. 43 = 11000000.10101000.00001010.00101011
|   0.  0.  0. 31 = 00000000.00000000.00000000.00011111
  192.168. 10. 63 = 11000000.10101000.00001010.00111111
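
The two calculations above can be reproduced with Python's standard `ipaddress` module, used here only to convert between dotted-quad strings and 32-bit integers. The function names are chosen for this sketch.

```python
import ipaddress

def network_address(addr: str, mask: str) -> str:
    """Bitwise-AND the address with the subnet mask."""
    a = int(ipaddress.IPv4Address(addr))
    m = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(a & m))

def broadcast_address(addr: str, mask: str) -> str:
    """Bitwise-OR the address with the inverted mask (the host bits)."""
    a = int(ipaddress.IPv4Address(addr))
    m = int(ipaddress.IPv4Address(mask))
    host_bits = ~m & 0xFFFFFFFF          # 0.0.0.31 for mask 255.255.255.224
    return str(ipaddress.IPv4Address(a | host_bits))
```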

Defining a logic unit
• A logic unit supports different logical
functions on two multi-bit inputs X and
Y, producing an output G
• This abbreviated table shows four
possible functions and assigns a
selection code S to each
S1 S0 Output
0  0  Gi = Xi Yi
0  1  Gi = Xi + Yi
1  0  Gi = Xi ⊕ Yi
1  1  Gi = Xi’

• Using multiplexers and some primitive gates to implement this
• Again, need one multiplexer for each bit of X and Y
Our simple logic unit

• Inputs:
– X (4 bits)
– Y (4 bits)
– S (2 bits)
• Outputs:
– G (4 bits)

Combining the arithmetic and logic units

• Now we have two pieces of the puzzle:
– An arithmetic unit that can compute eight functions on 4-bit inputs
– A logic unit that can perform four functions on 4-bit inputs
• We can combine these together into a single circuit, an arithmetic-logic unit (ALU)
Our ALU function table
• This table shows a sample function table for an ALU
• All of the arithmetic operations have S3=0, and all of the logical operations have S3=1
• These are the same functions we saw when we built our arithmetic and logic units a few minutes ago
• Since our ALU only has 4 logical operations, we don’t need S2. The operation done by the logic unit depends only on S1 and S0

S3 S2 S1 S0   Operation
0  0  0  0    G = X
0  0  0  1    G = X + 1
0  0  1  0    G = X + Y
0  0  1  1    G = X + Y + 1
0  1  0  0    G = X + Y’
0  1  0  1    G = X + Y’ + 1
0  1  1  0    G = X - 1
0  1  1  1    G = X
1  x  0  0    G = X and Y
1  x  0  1    G = X or Y
1  x  1  0    G = X ⊕ Y
1  x  1  1    G = X’
A complete ALU circuit

[Figure: the complete ALU circuit]
• The / and 4 on a line indicate that it’s actually four lines
• Cout should be ignored when logic operations are performed (when S3=1).
• G is the final ALU output
– When S3 = 0, the final output comes from the arithmetic unit
– When S3 = 1, the output comes from the logic unit
• The arithmetic and logic units share the select inputs S1 and S0, but only the arithmetic unit uses S2
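
A behavioural sketch of this ALU, assuming 4-bit words. Both units always compute a result and S3 plays the role of the output multiplexer. The function names are illustrative, not from the lecture.

```python
MASK = 0b1111  # 4-bit datapath

def arithmetic_unit(s2, s1, s0, x, y):
    """Returns (G, Cout) for G = A + X + S0, with A chosen by S2 S1."""
    a = [0, y, ~y & MASK, MASK][2 * s2 + s1]
    total = a + x + s0
    return total & MASK, total >> 4

def logic_unit(s1, s0, x, y):
    """00: AND, 01: OR, 10: XOR, 11: NOT X."""
    return [x & y, x | y, x ^ y, ~x & MASK][2 * s1 + s0]

def alu(s, x, y):
    """s is the 4-bit selection code S3 S2 S1 S0."""
    s3, s2, s1, s0 = (s >> 3) & 1, (s >> 2) & 1, (s >> 1) & 1, s & 1
    arith, _cout = arithmetic_unit(s2, s1, s0, x, y)   # Cout ignored here
    logic = logic_unit(s1, s0, x, y)                   # S2 is a don't-care
    return logic if s3 else arith                      # the output mux
```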

Comments on the multiplexer

• Both the arithmetic unit and the logic unit are “active” and produce outputs
– The mux determines whether the final result comes from the
arithmetic or logic unit
– The output of the other one is effectively ignored
• Our hardware scheme may seem like wasted
effort, but it’s not really
– “Deactivating” one or the other wouldn’t save that much time
– We have to build hardware for both units anyway, so we might
as well run them together
• This is a very common use of multiplexers in logic
design

The completed ALU

• This ALU is a good example of hierarchical design
– With the 12 inputs, the truth table would have had 2^12 = 4096 lines. That’s an awful lot of paper
– Instead, we are able to use components that we’ve seen before to construct the entire circuit from a couple of easy-to-understand components
• As always, we encapsulate the complete circuit in a “black box” so we can reuse it in fancier circuits
Arithmetic summary

• In the last few lectures we looked at:
– Building adders hierarchically, starting with one-bit full adders
– Representations of negative numbers to simplify
subtraction
– Using adders to implement a variety of arithmetic
functions
– Logic functions applied to multi-bit quantities
– Combining all of these operations into one unit, the ALU
• Where are we now?
– We started at the very bottom, with primitive gates, and
now we can understand a small but critical part of a CPU
– This all built upon our knowledge of Boolean algebra,
Karnaugh maps, multiplexers, circuit analysis and design,
and data representations
Overflow

• With n bits, only 2^n combinations
• Unsigned [0, 2^n - 1], 2’s complement [-2^(n-1), 2^(n-1) - 1]
• Unsigned Add
5 + 6 > 7: 101 + 110 => 1011
f(3:0) = a(2:0) + b(2:0) => overflow = f(3)
Carryout from MSB
Overflow

• More involved for 2’s complement
-1 + -1 = -2:
111 + 111 = 1110
110 = -2 is correct
• Can’t just use carry-out to signal
overflow
Addition Overflow

• When is overflow NOT possible?
Let p1, p2 > 0 and n1, n2 < 0
p1 + p2: overflow possible
p1 + n1: overflow not possible
n1 + p2: overflow not possible
n1 + n2: overflow possible
• Just checking signs of inputs is not sufficient
Addition Overflow

• 2 + 3 = 5 > 3 (the largest 3-bit value): 010 + 011 = 101 =? -3 < 0
– Sum of two positive numbers should not be negative
• Conclude: overflow
• -1 + -4: 111 + 100 = 011 > 0
– Sum of two negative numbers should not be positive
• Conclude: overflow
Overflow = f(2) * ~a(2) * ~b(2) + ~f(2) * a(2) * b(2)
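
A sketch of this overflow equation for 3-bit 2’s complement addition. The function name `add3` is made up here, and `v ^ 1` is used for the complement of a single bit.

```python
def add3(a: int, b: int) -> tuple[int, int]:
    """Add two 3-bit patterns. Returns (f, overflow), where f keeps 3 bits."""
    f = (a + b) & 0b111
    a2, b2, f2 = (a >> 2) & 1, (b >> 2) & 1, (f >> 2) & 1   # sign bits
    # Overflow = f(2) * ~a(2) * ~b(2) + ~f(2) * a(2) * b(2)
    overflow = (f2 & (a2 ^ 1) & (b2 ^ 1)) | ((f2 ^ 1) & a2 & b2)
    return f, overflow
```

Note that -1 + -1 (111 + 111) produces a carry-out but no overflow, which is why the carry-out alone cannot be used as the signal.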
Subtraction Overflow

• No overflow on a - b if signs are the same
• Neg - pos => neg; overflow otherwise
• Pos - neg => pos; overflow otherwise
Overflow = f(2) * ~a(2) * b(2) + ~f(2) * a(2) * ~b(2)
What to do on Overflow?

• Ignore! (C language semantics)
– What about Java? (try/catch?)
• Flag – condition code
• Sticky flag – e.g. for floating point
– Otherwise gets in the way of fast
hardware
• Trap – possibly maskable
– MIPS has e.g. add that traps, addu that
does not
Zero and Negative

• Zero = ~[f(2) + f(1) + f(0)]
• Negative = f(2) (sign bit)
Zero and Negative

• May also want correct answer even on overflow
• Negative = (a < b) = (a - b) < 0 even if overflow
• E.g. is -4 < 2?
100 - 010 = 1010 (-4 - 2 = -6): Overflow!
• Work it out: negative = f(2) XOR overflow
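
A sketch of the comparison rule for 3-bit values, using the subtraction overflow equation from the earlier slide. `less_than` and `to_signed` are names introduced here; `to_signed` exists only to check the result against ordinary integer comparison.

```python
def to_signed(v: int, n: int = 3) -> int:
    """Read an n-bit pattern as a 2's complement integer."""
    return v - (1 << n) if (v >> (n - 1)) & 1 else v

def less_than(a: int, b: int, n: int = 3) -> int:
    """Return 1 when a < b (as signed n-bit values), else 0."""
    mask = (1 << n) - 1
    f = (a + (~b & mask) + 1) & mask        # a - b = a + b' + 1
    a2, b2, f2 = (a >> (n - 1)) & 1, (b >> (n - 1)) & 1, (f >> (n - 1)) & 1
    # Subtraction overflow = f(2) * ~a(2) * b(2) + ~f(2) * a(2) * ~b(2)
    overflow = (f2 & (a2 ^ 1) & b2) | ((f2 ^ 1) & a2 & (b2 ^ 1))
    return f2 ^ overflow                    # negative = f(2) XOR overflow
```

For the slide's example, `less_than(0b100, 0b010)` returns 1 even though the subtraction overflows.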
Summary

• Binary representations,
signed/unsigned
• Arithmetic
– Full adder, ripple-carry, carry lookahead
– Carry-select, Carry-save
– Overflow, negative
– More (multiply/divide/FP) later
• Logical
– Shift, and, or
Carry Lookahead
• The above ALU is too slow
– Gate delays for add = 32 x FA + XOR ~= 64
• Theoretically, in parallel
– Sum0 = f(cin, a0, b0)
– Sumi = f(cin, ai…a0, bi…b0)
– Sum31 = f(cin, a31…a0, b31…b0)
• Any boolean function in two levels, right?
– Wrong! Too much fan-in!
Carry Lookahead
• Need compromise
– Build tree so delay is O(log_2 n) for n bits
– E.g. 2 x 5 gate delays for 32 bits
• We will consider basic concept with
– 4 bits
– 16 bits
Carry Lookahead
0101 0100
0011 0110
Recall: ci+1 = ai * bi + ai * ci + bi * ci
             = ai * bi + (ai + bi) * ci
             = gi + pi * ci
Need both 1s to generate carry and at least one 1 to propagate carry
Define: gi = ai * bi   (carry generate)
        pi = ai + bi   (carry propagate)
Carry Lookahead
• Therefore
c1 = g0 + p0 * c0
c2 = g1 + p1 * c1 = g1 + p1 * (g0 + p0 * c0)
   = g1 + p1 * g0 + p1 * p0 * c0
c3 = g2 + p2 * g1 + p2 * p1 * g0 + p2 * p1 * p0 * c0
c4 = g3 + p3 * g2 + p3 * p2 * g1 + p3 * p2 * p1 * g0 + p3 * p2 * p1 * p0 * c0
• Uses one level to form pi and gi, two levels for carry
• But, this needs n+1 fanin at the OR and the rightmost AND
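
The carry equations can be written out directly. This is a behavioural sketch, not a gate-level model; `cla_add4` is a name chosen here, and each carry is computed from g, p and c0 only, never from the previous carry.

```python
def cla_add4(a: int, b: int, c0: int = 0) -> tuple[int, int]:
    """4-bit carry-lookahead addition. Returns (sum, c4)."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]   # generate
    p = [((a >> i) & 1) | ((b >> i) & 1) for i in range(4)]   # propagate
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    s = 0
    for i in range(4):
        # Sum bit i is ai XOR bi XOR ci.
        s |= (((a >> i) & 1) ^ ((b >> i) & 1) ^ carries[i]) << i
    return s, c4
```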
4-bit Carry Lookahead Adder
[Figure: four 1-bit adder cells for bits 3 to 0, each taking ai, bi and ci and producing si, gi and pi; a Carry Lookahead Block takes c0 and every gi, pi and produces c1, c2, c3 and c4.]
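Hierarchical Carry Lookahead for 16 bits
[Figure: four 4-bit adder blocks for bits 12-15, 8-11, 4-7 and 0-3, each taking a, b and a carry-in (c12, c8, c4, c0) and producing sum bits plus group signals G and P; a Carry Lookahead Block takes c0 and the G, P pairs and produces c4, c8, c12 and c16.]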


Hierarchical CLA for 16 bits
Build 16-bit adder from four 4-bit adders
Figure out G and P for 4 bits together
c4 = g3 + p3*g2 + p3*p2*g1 + p3*p2*p1*g0 + p3*p2*p1*p0*c0
   = G0,3 + P0,3*c0
(Notation a little different from the book)
G0,3 = g3 + p3*g2 + p3*p2*g1 + p3*p2*p1*g0
P0,3 = p3*p2*p1*p0
G4,7 = g7 + p7*g6 + p7*p6*g5 + p7*p6*p5*g4
P4,7 = p7*p6*p5*p4
G8,11 = g11 + p11*g10 + p11*p10*g9 + p11*p10*p9*g8
P8,11 = p11*p10*p9*p8
G12,15 = g15 + p15*g14 + p15*p14*g13 + p15*p14*p13*g12
P12,15 = p15*p14*p13*p12


Hierarchical CLA for 16 bits
Build 16-bit adder from four 4-bit adders
Figure out G and P for 4 bits together
c4 = g3 + p3*g2 + p3*p2*g1 + p3*p2*p1*g0 + p3*p2*p1*p0*c0
   = G0,3 + P0,3*c0
c8 = g7 + p7*g6 + p7*p6*g5 + p7*p6*p5*g4 + p7*p6*p5*p4*c4
   = G4,7 + P4,7*c4 = G4,7 + P4,7*(G0,3 + P0,3*c0)
   = G4,7 + P4,7*G0,3 + P4,7*P0,3*c0 = G0,7 + P0,7*c0
Thus G0,7 = G4,7 + P4,7*G0,3 and P0,7 = P4,7*P0,3


Carry Lookahead Basics
Generalizing expressions for G’s and P’s
Gi,k = Gj,k + Pj,k * Gi,j-1
Pi,k = Pj,k * Pi,j-1
where j = (i+k+1)/2
Thus
G8,15 = G12,15 + P12,15 * G8,11    P8,15 = P12,15 * P8,11
G0,15 = G8,15 + P8,15 * G0,7       P0,15 = P8,15 * P0,7
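
The general rule lends itself to a recursive sketch: split the bit range [i, k] at j = (i+k+1)/2, compute (G, P) for each half, and combine. The names `combine`, `group_gp` and `carry_out_16` are hypothetical, and operands are passed as plain integers.

```python
def combine(low, high):
    """Merge (G, P) of a lower block with (G, P) of the block above it."""
    g_lo, p_lo = low
    g_hi, p_hi = high
    return g_hi | (p_hi & g_lo), p_hi & p_lo

def group_gp(a: int, b: int, i: int, k: int):
    """(G, P) for bits i..k of a + b, following Gi,k and Pi,k above."""
    if i == k:
        ai, bi = (a >> i) & 1, (b >> i) & 1
        return ai & bi, ai | bi             # gi, pi for a single bit
    j = (i + k + 1) // 2
    return combine(group_gp(a, b, i, j - 1), group_gp(a, b, j, k))

def carry_out_16(a: int, b: int, c0: int) -> int:
    """c16 = G0,15 + P0,15 * c0."""
    g, p = group_gp(a, b, 0, 15)
    return g | (p & c0)
```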


Other Adders: Carry-Select
• Two adds in parallel: for cin = 0 and for cin = 1
– When cin is actually generated, select the correct result
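
A behavioural sketch of the idea, assuming an 8-bit adder split into two 4-bit halves (the split and the name `carry_select_add8` are assumptions for illustration). The upper half is computed for both possible carry-ins, and the lower half's carry-out picks one.

```python
def carry_select_add8(a: int, b: int, cin: int = 0) -> tuple[int, int]:
    """8-bit carry-select addition. Returns (sum, carry_out)."""
    lo = (a & 0xF) + (b & 0xF) + cin
    lo_sum, lo_carry = lo & 0xF, lo >> 4

    # In hardware these two upper-half additions run in parallel with the
    # lower half; here they are simply both evaluated.
    hi_if_0 = (a >> 4) + (b >> 4) + 0
    hi_if_1 = (a >> 4) + (b >> 4) + 1

    hi = hi_if_1 if lo_carry else hi_if_0   # the select step (a mux)
    return ((hi & 0xF) << 4) | lo_sum, hi >> 4
```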

© 2000 Mark Hill
Carry-Select Adder

Other Adders: Carry Save
Logical Operations
• Bitwise AND, OR, XOR, NOR
– Implement w/ 32 gates in parallel
• Shifts and rotates
– sll => shift left logical (0 -> LSB)
– srl => shift right logical (0 -> MSB)
– sra => shift right arithmetic (old MSB -> new MSB)
– rol => rotate left (MSB -> LSB)
– ror => rotate right (LSB -> MSB)
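
A sketch of the five operations on 8-bit words. The width `W` is an assumption made to keep the examples short; the mnemonics follow the list above.

```python
W = 8                 # word width assumed for this sketch
MASK = (1 << W) - 1

def sll(x: int, n: int) -> int:
    """Shift left logical: zeros enter at the LSB."""
    return (x << n) & MASK

def srl(x: int, n: int) -> int:
    """Shift right logical: zeros enter at the MSB."""
    return (x & MASK) >> n

def sra(x: int, n: int) -> int:
    """Shift right arithmetic: copies of the old MSB enter at the top."""
    result = srl(x, n)
    if (x >> (W - 1)) & 1:
        result |= (MASK << (W - n)) & MASK   # fill the top n bits with 1s
    return result

def rol(x: int, n: int) -> int:
    """Rotate left: bits leaving the MSB re-enter at the LSB."""
    n %= W
    return ((x << n) | (x >> (W - n))) & MASK

def ror(x: int, n: int) -> int:
    """Rotate right: bits leaving the LSB re-enter at the MSB."""
    n %= W
    return ((x >> n) | (x << (W - n))) & MASK
```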
Shifter

Using 2-1 MUXs, each represented as Mux(S, y, x): the output is y when S = 0 and x when S = 1


8-bit Combinational Shifter: Stage 0
[Figure: eight 2-1 muxes, one per output bit, all controlled by S0]
S0 = 0: D7 D6 D5 D4 D3 D2 D1 D0
S0 = 1: D6 D5 D4 D3 D2 D1 D0 0
stage0<7:0> = Mux(S0, D<7:0>, D<6:0> || 0)

8-bit Combinational Shifter: Stage 1
[Figure: eight 2-1 muxes controlled by S1; the two low-order shifted inputs are tied to 0]
S1 = 0: D7 D6 D5 D4 D3 D2 D1 D0
S1 = 1: D5 D4 D3 D2 D1 D0 0 0
stage1<7:0> = Mux(S1, D<7:0>, D<5:0> || 00)

8-bit Combinational Shifter: Stage 2
[Figure: eight 2-1 muxes controlled by S2; the four low-order shifted inputs are tied to 0]
S2 = 0: D7 D6 D5 D4 D3 D2 D1 D0
S2 = 1: D3 D2 D1 D0 0 0 0 0
stage2<7:0> = Mux(S2, D<7:0>, D<3:0> || 0000)

The 8-bit Combinational Shifter
[Figure: the three mux stages cascaded from inputs D7…D0 to outputs Dout7…Dout0, controlled by Shamt0, Shamt1 and Shamt2]


Shifter
stage0<7:0> = Mux(shamt0, D<7:0>, D<6:0> || 0)

stage1<7:0> = Mux(shamt1, stage0<7:0>, stage0<5:0> || 00)

Dout<7:0> = Mux(shamt2, stage1<7:0>, stage1<3:0> || 0000)
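
The three stage equations can be mirrored one for one in code. This sketch models only the left shifter described on these slides, not the barrel shifter requested in the assignment; `mux` and `shift_left_8` are names chosen here.

```python
def mux(s: int, y: int, x: int) -> int:
    """2-1 mux in the lecture's Mux(S, y, x) form: y when S = 0, x when S = 1."""
    return x if s else y

def shift_left_8(d: int, shamt: int) -> int:
    """Shift an 8-bit value left by shamt (0 to 7) using three mux stages."""
    stage0 = mux(shamt & 1, d, (d << 1) & 0xFF)                    # by 0 or 1
    stage1 = mux((shamt >> 1) & 1, stage0, (stage0 << 2) & 0xFF)   # by 0 or 2
    dout = mux((shamt >> 2) & 1, stage1, (stage1 << 4) & 0xFF)     # by 0 or 4
    return dout
```

Each stage shifts by a power of two, so three stages cover every shift amount from 0 to 7.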

ASSIGNMENT:

Implement an 8-bit barrel shifter using MUXs.

