CA L7 Unit3 Slides
1
Outline: Unit 3
2
Integer Addition
3
Integer Addition: Overflow
4
Integer Subtraction
5
Integer Subtraction: Overflow
6
Review of Logic Design
[Figure: a system maps inputs (Input 1, Input 2) to outputs (Output 1, Output 2); in a digital logic system every input and output is 0 or 1]
7
Review of Logic Design
[Figure: a combinational logic system and a sequential logic system, each with inputs and outputs that are 0 or 1]
8
Review of Logic Design
• Truth Table
– Because a combinational logic block contains no memory, it can be completely specified
by defining the values of the outputs for each possible set of input values.
– Such a description is normally given as a truth table.
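A minimal sketch of the idea in Python: since a combinational block has no memory, listing its output for every input combination specifies it completely. The helper `truth_table` and the example function `xor` are illustrative names, not part of the slides.

```python
from itertools import product

def truth_table(fn, n_inputs):
    """Return one (inputs, output) row for every possible input combination."""
    return [(bits, fn(*bits)) for bits in product((0, 1), repeat=n_inputs)]

# Hypothetical example block: a 2-input XOR.
def xor(a, b):
    return a ^ b

for bits, out in truth_table(xor, 2):
    print(*bits, "|", out)
```

A block with n inputs always has 2^n rows, which is why truth tables are practical only for small blocks.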
9
Review of Logic Design
• Logic Gates
– Logic blocks/systems are built from Logic Gates that implement basic logic functions
– AND: A.B   OR: A + B   NOT: Ā
10
Review of Logic Design
• Decoder
– The decoder has an n-bit input and 2^n outputs, where only one output is asserted for each
input combination
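A small behavioral sketch of a decoder, assuming outputs are numbered 0 to 2^n - 1 and output i is asserted when the input value equals i (the function name `decoder` is illustrative):

```python
def decoder(value, n):
    """n-bit input -> list of 2**n outputs with exactly one output asserted."""
    if not 0 <= value < 2 ** n:
        raise ValueError("input does not fit in n bits")
    return [1 if i == value else 0 for i in range(2 ** n)]

print(decoder(5, 3))  # → [0, 0, 0, 0, 0, 1, 0, 0]
```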
11
Review of Logic Design
• Multiplexor (Selector)
– Consider the two-input multiplexor.
– This multiplexor has three inputs: two data values and a selector (or control) value.
– The selector value determines which of the inputs becomes the output.
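A one-bit sketch of the two-input multiplexor. The convention that selector 0 picks the first data input is an assumption; the slide's figure may label it differently.

```python
def mux2(a, b, s):
    """Two-input multiplexor on single bits: output a when s == 0, b when s == 1."""
    # Gate-level form: (a AND NOT s) OR (b AND s)
    return (a & (1 - s)) | (b & s)

print(mux2(1, 0, 0))  # selector 0 passes the first input
print(mux2(1, 0, 1))  # selector 1 passes the second input
```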
12
Arithmetic Logic Unit (ALU)
• The arithmetic logic unit (ALU) is the brain of the computer, the device that
performs the arithmetic operations like addition and subtraction or logical
operations like AND and OR.
• Let’s start with a 1-bit ALU and progressively build an ALU with more
functionality
15
Arithmetic Logic Unit (ALU)
16
1-bit Adder
17
Arithmetic Logic Unit (ALU)
The 1-bit logical unit for AND and OR.
The 1-bit ALU that performs AND, OR, and addition.
18
Arithmetic Logic Unit (ALU)
The 1-bit ALU that performs AND, OR, addition, and subtraction (through addition).
Support for NOR operation (through AND).
• De Morgan’s theorem: NOT(A + B) = NOT(A) . NOT(B), so NOR is an AND of the inverted inputs.
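A behavioral sketch of such a 1-bit ALU. The signal names (`ainvert`, `binvert`, `carry_in`, `operation`) and the operation encoding (0 = AND, 1 = OR, 2 = add) follow a common textbook layout and are assumptions here, since the figure itself did not survive extraction.

```python
def alu_1bit(a, b, carry_in, ainvert, binvert, operation):
    """One ALU bit slice. Returns (result, carry_out)."""
    a_eff = a ^ ainvert          # optionally invert a
    b_eff = b ^ binvert          # optionally invert b
    and_out = a_eff & b_eff
    or_out = a_eff | b_eff
    sum_out = a_eff ^ b_eff ^ carry_in
    carry_out = (a_eff & b_eff) | (a_eff & carry_in) | (b_eff & carry_in)
    result = {0: and_out, 1: or_out, 2: sum_out}[operation]
    return result, carry_out

# Subtraction through addition: a - b = a + NOT(b) + 1 (binvert = 1, carry_in = 1).
print(alu_1bit(1, 0, carry_in=1, ainvert=0, binvert=1, operation=2))
# NOR through AND: invert both inputs and select the AND output.
print(alu_1bit(0, 0, carry_in=0, ainvert=1, binvert=1, operation=0))
```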
21
Arithmetic Logic Unit (ALU)
A 64-bit ALU constructed from 64 1-bit ALUs (notice the ALU for the most significant bit)
24
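A sketch of how 64 one-bit slices can be chained through their carries. It assumes the most significant slice is special because signed overflow is detected there (carry into the MSB differs from carry out of it); that reading of the figure is an assumption, and the function names are illustrative.

```python
def full_adder(a, b, carry_in):
    """Single-bit full adder. Returns (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out

def ripple_add(x, y, subtract=False, width=64):
    """Add or subtract two width-bit values one bit slice at a time.

    Returns (result as an unsigned width-bit integer, signed-overflow flag).
    """
    carry = 1 if subtract else 0          # carry-in of slice 0 is 1 for subtraction
    result = 0
    carry_into_msb = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        if subtract:
            b ^= 1                        # invert b for subtraction
        if i == width - 1:
            carry_into_msb = carry
        s, carry = full_adder(a, b, carry)
        result |= s << i
    overflow = carry_into_msb ^ carry     # checked only at the MSB slice
    return result, bool(overflow)

print(ripple_add(5, 3))
print(ripple_add(5, 3, subtract=True))
print(ripple_add(2**63 - 1, 1))           # largest positive value plus 1 overflows
```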
Arithmetic Logic Unit (ALU)
27
Arithmetic Logic Unit (ALU)
Simplifying Control
28
Arithmetic Logic Unit (ALU)
ALU Control
30
Arithmetic Logic Unit (ALU)
31
Processor Implementation (Chapter 4)
• Instruction Count
– Determined by Compiler and Instruction Set Architecture
• CPI and Clock cycle time
– Determined by Processor Implementation
– We will examine two RISC-V processor implementations
• A simplified version
• A more realistic pipelined version
33
RISC-V Processor Implementation: Simplified Version
34
RISC-V Processor Implementation: Overview
35
RISC-V Processor Implementation: Instruction Execution
36
RISC-V Processor Implementation: Instruction Execution
5. PC target address or PC + 4
44
RISC-V Processor Implementation: Overview
45
RISC-V Processor Implementation: Multiplexers
46
RISC-V Processor Implementation: Control
47
RISC-V Processor Implementation: Overview
• Datapath Design
59
Building a Datapath
• Datapath
– Elements that process data and addresses in the CPU
– Examples: registers, ALUs, multiplexers, memories, …
60
Building a Datapath: Instruction Fetch
61
RISC-V R-format Instructions
• Instruction fields
– opcode: operation code
– rd: destination register number
– funct3: 3-bit function code (additional opcode)
– rs1: the first source register number
– rs2: the second source register number
– funct7: 7-bit function code (additional opcode)
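A sketch of extracting these fields from a 32-bit instruction word. The bit positions are those of the standard RISC-V base encoding (they are not stated on the slide), and `decode_r_format` is an illustrative name.

```python
def decode_r_format(word):
    """Split a 32-bit R-format instruction word into its fields."""
    return {
        "opcode": word & 0x7F,           # bits 6..0
        "rd":     (word >> 7) & 0x1F,    # bits 11..7
        "funct3": (word >> 12) & 0x7,    # bits 14..12
        "rs1":    (word >> 15) & 0x1F,   # bits 19..15
        "rs2":    (word >> 20) & 0x1F,   # bits 24..20
        "funct7": (word >> 25) & 0x7F,   # bits 31..25
    }

# 0x003100B3 encodes add x1, x2, x3
print(decode_r_format(0x003100B3))
```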
62
Building a Datapath: R-Format Instructions
63
RISC-V I-format Load Instruction
• Instruction fields
– opcode: operation code
– rd: destination register number
– funct3: 3-bit function code (additional opcode)
– rs1: source or base address register number
– Immediate: constant operand, or offset added to base address (2’s
complement, sign extended)
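A sketch of decoding an I-format word, including sign extension of the 12-bit immediate. Bit positions follow the standard RISC-V base encoding; the helper names are illustrative.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit

def decode_i_format(word):
    """Split a 32-bit I-format instruction word into its fields."""
    return {
        "opcode": word & 0x7F,
        "rd":     (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1":    (word >> 15) & 0x1F,
        "imm":    sign_extend((word >> 20) & 0xFFF, 12),  # bits 31..20
    }

# ld x1, 100(x4): imm = 100, rs1 = 4, funct3 = 0b011, rd = 1, opcode = 0b0000011
word = (100 << 20) | (4 << 15) | (0b011 << 12) | (1 << 7) | 0b0000011
print(decode_i_format(word))
```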
64
RISC-V S-format Store Instructions
• Instruction fields
– opcode: operation code
– funct3: 3-bit function code (additional opcode)
– rs1: base address register number
– rs2: source operand register number
– immediate: offset added to base address (split so that the rs1 and rs2 fields always
stay in the same place across the various instruction formats)
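A sketch of reassembling the split S-format immediate. In the standard RISC-V base encoding the upper seven immediate bits sit where funct7 is in R-format and the lower five where rd is; the helper names are illustrative.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit

def decode_s_format(word):
    """Split a 32-bit S-format instruction word into its fields."""
    imm_hi = (word >> 25) & 0x7F   # imm[11:5], bits 31..25
    imm_lo = (word >> 7) & 0x1F    # imm[4:0],  bits 11..7
    return {
        "opcode": word & 0x7F,
        "funct3": (word >> 12) & 0x7,
        "rs1":    (word >> 15) & 0x1F,   # same position as in R- and I-format
        "rs2":    (word >> 20) & 0x1F,   # same position as in R-format
        "imm":    sign_extend((imm_hi << 5) | imm_lo, 12),
    }

# sd x5, 40(x6): imm = 40 is split into imm[11:5] = 1 and imm[4:0] = 8
word = (1 << 25) | (5 << 20) | (6 << 15) | (0b011 << 12) | (8 << 7) | 0b0100011
print(decode_s_format(word))
```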
65
Building a Datapath: Load-Store Instructions
66
Building a Datapath: Composing the Elements
67
Conditional Branches
68
Branch If Equal (beq)
– To calculate the branch target address, the PC acts as the base address and the
instruction supplies a 12-bit offset field.
– The offset field is shifted left by 1 bit before being added to the PC to calculate the
branch target address.
– Why? The offset counts instruction-sized units rather than bytes. With only
32-bit instructions (as in this course), a shift by 2 would be enough, because every
instruction address is a multiple of 4. RISC-V shifts by 1 instead, so the offset
counts 2-byte units, to support the extension to the RISC-V ISA that contains
16-bit compressed instructions.
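A sketch of the target calculation described above, treating the offset as the 12-bit field named on the slide. The function names and example addresses are illustrative.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit

def branch_target(pc, offset_field):
    """PC-relative target: sign-extend the 12-bit field, shift left by 1, add to PC."""
    offset = sign_extend(offset_field & 0xFFF, 12)
    return pc + (offset << 1)

def next_pc(pc, offset_field, taken):
    """Select between the branch target and the sequential address PC + 4."""
    return branch_target(pc, offset_field) if taken else pc + 4

print(hex(branch_target(0x1000, 4)))      # forward by 4 * 2 = 8 bytes
print(hex(branch_target(0x1000, 0xFFE)))  # field encodes -2, so back by 4 bytes
```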
69
Branch If Equal (beq)
• Compare operands
– Use ALU, subtract and check Zero output
• Sign-extend displacement
• Add to PC value
70
Building a Datapath: Composing the Elements
• Full Datapath
71
ALU Usage for Different Instructions
• Load-Store Instructions
– ALU used for addition
• R-type Instruction
– ALU operation depends on the function code fields (funct7 and funct3), which act as additional opcode
76
Generating the ALU Control Input
77
Datapath with Control
78
Generating the ALU Control Input
Truth Table
• We can generate the 4-bit ALU
control input using a small
control unit (ALUControl)
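The slide's truth table did not survive extraction, so the sketch below uses a common textbook encoding as an assumption: a 2-bit ALUOp from the main control (00 for load/store, 01 for beq, 10 for R-type) and 4-bit ALU control values 0000 (AND), 0001 (OR), 0010 (add), 0110 (subtract). All names are illustrative.

```python
# Assumed 4-bit ALU control encodings (textbook-style, not taken from the slide).
ALU_AND, ALU_OR, ALU_ADD, ALU_SUB = 0b0000, 0b0001, 0b0010, 0b0110

# R-type operations selected by (funct7, funct3).
R_TYPE_TABLE = {
    (0b0000000, 0b000): ALU_ADD,   # add
    (0b0100000, 0b000): ALU_SUB,   # sub
    (0b0000000, 0b111): ALU_AND,   # and
    (0b0000000, 0b110): ALU_OR,    # or
}

def alu_control(alu_op, funct7=0, funct3=0):
    """Generate the 4-bit ALU control input from ALUOp and the function fields."""
    if alu_op == 0b00:             # load/store: address = base + offset
        return ALU_ADD
    if alu_op == 0b01:             # beq: subtract and test the Zero output
        return ALU_SUB
    return R_TYPE_TABLE[(funct7, funct3)]   # alu_op == 0b10: R-type

print(format(alu_control(0b10, 0b0100000, 0b000), "04b"))  # sub
```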
80
Control Signals Other than ALU
81
Main Control Unit
82
Datapath with Control: R-Type Instruction
83
Datapath with Control: Load Instruction
84
Datapath with Control: Beq Instruction
85
Implementation of Main Control Unit
Truth Table
• Control signals are derived
from binary encoded
instructions.
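A sketch of a main control unit as a lookup on the 7-bit opcode. The signal names and values follow a common textbook table and are assumptions, since the slide's own table is not in the extracted text; don't-care entries are written as 0.

```python
from typing import NamedTuple

class Control(NamedTuple):
    alu_src: int      # 0: second ALU operand from register, 1: from immediate
    mem_to_reg: int   # 1: register write data comes from data memory
    reg_write: int
    mem_read: int
    mem_write: int
    branch: int
    alu_op: int       # 2-bit hint passed on to the ALU control unit

# Assumed control values per opcode (don't-cares shown as 0).
CONTROL_TABLE = {
    0b0110011: Control(0, 0, 1, 0, 0, 0, 0b10),  # R-format
    0b0000011: Control(1, 1, 1, 1, 0, 0, 0b00),  # ld
    0b0100011: Control(1, 0, 0, 0, 1, 0, 0b00),  # sd
    0b1100011: Control(0, 0, 0, 0, 0, 1, 0b01),  # beq
}

def main_control(instruction):
    """Derive the control signals from the opcode field (bits 6..0)."""
    return CONTROL_TABLE[instruction & 0x7F]

print(main_control(0x003100B3))  # add x1, x2, x3
```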
87
Datapath with Control
88
RISC-V Instruction Execution Steps
89
Steps in RISC-V Instruction Execution
90
Review of Logic Design
[Figure: a combinational logic system and a sequential logic system, each with inputs and outputs that are 0 or 1]
91
Combinational Logic Elements: Example
[Figure: combinational elements: AND gate (A, B → Y), adder (A, B → Y), multiplexer (I0, I1, select S → Y), ALU (A, B, function F → Y)]
92
Sequential Logic (State) Elements: Example
[Figure: register with data input D, clock Clk, and output Q, with a timing diagram of Clk, D, and Q]
93
Sequential Logic (State) Elements: Example
[Figure: register with write control: inputs D, Clk, and Write, output Q, with a timing diagram of Clk, Write, D, and Q]
94
Combination of Combinational and Sequential Logic
Elements
• Because only state elements can store a data value, any collection of
combinational logic must have its inputs come from a set of state
elements and its outputs written into a set of state elements
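A minimal behavioral sketch of this pattern, using the PC as the state element and "add 4" as the combinational logic. The `Register` class and its `clock_edge` method are illustrative names, and the write-enable input is assumed to default to asserted.

```python
class Register:
    """State element: holds its value until a clock edge with Write asserted."""

    def __init__(self, value=0):
        self.q = value

    def clock_edge(self, d, write=1):
        if write:
            self.q = d

pc = Register(0)
for cycle in range(3):
    next_pc = pc.q + 4          # combinational logic reads the current state
    pc.clock_edge(next_pc)      # result is written back at the clock edge
    print("after cycle", cycle + 1, "PC =", pc.q)
```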
95
Datapath with Control
96
Steps in RISC-V Instruction Execution
97
Steps in RISC-V Instruction Execution
ld x1, 100(x4)
ld x2, 200(x4)
ld x3, 300(x4)
98
Steps in RISC-V Instruction Execution
ld x1, 100(x4)
ld x2, 200(x4)
ld x3, 300(x4)
Clock Cycle: 1
99
Processor Implementation (Chapter 4)
• Instruction Count
– Determined by Compiler and Instruction Set Architecture
• CPI and Clock cycle time
– Determined by Processor Implementation
102
Performance Issues of Single Cycle Processor Design
• CPI is 1.
103
Steps in RISC-V Instruction Execution
104
RISC-V Performance: Single Cycle
• Assume that the operation times for the major functional units of the
datapath are:
– 100 ps for register read or write
– 200 ps for memory access (instructions or data)
– 200 ps for ALU operation
Single-Cycle Implementation
• Clock cycle time must accommodate the
longest instruction (i.e., ld)
– ld: instruction fetch 200 + register read 100 + ALU 200 + data access 200 + register write 100
• Clock cycle time: 800 ps
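A worked sketch of the calculation, using the unit delays given above. Which units each instruction class uses is an assumption based on the usual single-cycle datapath, and all names are illustrative.

```python
# Unit delays in picoseconds, as stated on the slide.
DELAY_PS = {"IF": 200, "REG_READ": 100, "ALU": 200, "MEM": 200, "REG_WRITE": 100}

# Assumed functional units used by each instruction class.
UNITS_USED = {
    "ld":       ["IF", "REG_READ", "ALU", "MEM", "REG_WRITE"],
    "sd":       ["IF", "REG_READ", "ALU", "MEM"],
    "R-format": ["IF", "REG_READ", "ALU", "REG_WRITE"],
    "beq":      ["IF", "REG_READ", "ALU"],
}

def instruction_time_ps(name):
    """Total delay along the units an instruction class uses."""
    return sum(DELAY_PS[unit] for unit in UNITS_USED[name])

# The single-cycle clock must fit the slowest instruction.
clock_cycle_ps = max(instruction_time_ps(name) for name in UNITS_USED)

def cpu_time_ps(instruction_count, cpi, cycle_ps):
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_ps

for name in UNITS_USED:
    print(name, instruction_time_ps(name), "ps")
print("clock cycle:", clock_cycle_ps, "ps")
print("three instructions at CPI 1:", cpu_time_ps(3, 1, clock_cycle_ps), "ps")
```

Every instruction pays the full 800 ps cycle in this design, even a beq that would need only 500 ps under these assumptions.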
105
Steps in RISC-V Instruction Execution
106
Clock Cycle = 800 ps
RISC-V Performance: Single Cycle
Single-cycle
107
Performance Issues of Single Cycle Processor Design
• CPI is 1, but the clock cycle time (800 ps in the example) is set by the longest instruction (ld).
108