2. Computer System Architecture
Digital logic circuits are the basic building blocks of digital systems (digital computers). These circuits can be classified into two categories: combinational logic circuits and sequential logic circuits. Before studying the difference between combinational and sequential logic circuits, we must first know what each of them is.
Digital Computers
A digital computer can be considered a digital system that performs various computational tasks. The first electronic digital computers were developed in the late 1940s and were used primarily for numerical computations.
By convention, digital computers use the binary number system, which has two digits: 0 and 1. A binary digit is called a bit. A computer system is subdivided into two functional entities: hardware and software. The hardware consists of all the electronic components and electromechanical devices that comprise the physical entity of the device. The software consists of the instructions and data that the computer manipulates to perform various data-processing tasks.
The central processing unit (CPU) contains an arithmetic and logic unit for manipulating data, several registers for storing data, and a control circuit for fetching and executing instructions.
The memory unit of a digital computer contains storage for instructions and data.
The random-access memory (RAM) is used for real-time processing of the data.
The input-output devices accept inputs from the user and display the final results to the user. The input-output devices connected to the computer include the keyboard, mouse, terminals, magnetic disk drives, and other communication devices.
Logic Gates
Logic gates are the main structural parts of a digital system.
A logic gate is a block of hardware that produces an output signal of binary 1 or 0 when the input logic requirements are satisfied.
Each gate has a distinct graphic symbol, and its operation can be described by means of an algebraic expression. The seven basic logic gates are: AND, OR, XOR, NOT, NAND, NOR, and XNOR.
The relationship between the input and output binary variables for each gate can be represented in tabular form by a truth table.
Each gate has one or two binary input variables, designated by A and B, and one binary output variable, designated by x.
AND Gate:
The AND gate is an electronic circuit which gives a high output only if all its inputs are high. The AND operation is represented by a dot (.) sign.
OR Gate:
The OR gate is an electronic circuit which gives a high output if one or more of its inputs are high. The OR operation is represented by a plus (+) sign.
NAND Gate:
The NOT-AND (NAND) gate is equal to an AND gate followed by a NOT gate. The NAND gate gives a high output if any of its inputs are low. The NAND gate is represented by an AND gate with a small circle on the output; the small circle represents inversion.
NOR Gate:
The NOT-OR (NOR) gate is equal to an OR gate followed by a NOT gate. The NOR gate gives a low output if any of its inputs are high. The NOR gate is represented by an OR gate with a small circle on the output; the small circle represents inversion.
Exclusive-Or Gate:
The exclusive-OR (XOR) gate is a circuit which gives a high output if one of its inputs is high, but not both of them. The XOR operation is represented by an encircled plus sign.
Exclusive-Nor/Equivalence Gate:
The exclusive-NOR gate is a circuit that performs the inverse operation of the XOR gate. It gives a low output if one of its inputs is high, but not both of them. The small circle represents inversion.
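The behaviour of these gates can be sketched in a few lines of Python; this is an illustrative model (the function names and the use of 0/1 integers are our own convention, not part of the text), but it reproduces the truth tables described above.

```python
# Minimal sketch of the seven basic gates as Python functions on bits 0/1.

def AND(a, b):  return a & b
def OR(a, b):   return a | b
def NOT(a):     return 1 - a
def NAND(a, b): return NOT(AND(a, b))   # AND followed by inversion
def NOR(a, b):  return NOT(OR(a, b))    # OR followed by inversion
def XOR(a, b):  return a ^ b
def XNOR(a, b): return NOT(XOR(a, b))   # inverse of XOR

# Print the truth table of each two-input gate.
for gate in (AND, OR, NAND, NOR, XOR, XNOR):
    rows = [(a, b, gate(a, b)) for a in (0, 1) for b in (0, 1)]
    print(gate.__name__, rows)
```

Running the loop prints four (A, B, output) rows per gate, which can be compared directly against the tables in the text.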
Boolean Algebra
Boolean algebra can be considered an algebra that deals with binary variables and logic operations. Boolean algebraic variables are designated by letters such as A, B, x, and y. The basic operations performed are AND, OR, and complement.
Boolean algebraic functions are mostly expressed with binary variables, logic operation symbols, parentheses, and an equal sign. For a given value of the variables, a Boolean function can be either 1 or 0. For instance, consider the Boolean function:
F = x + y'z
The logic diagram for the Boolean function F = x + y'z can be represented as:
The Boolean function F = x + y'z is transformed from an algebraic expression into a logic diagram composed of AND, OR, and inverter gates. An inverter at input y generates its complement, y'.
There is an AND gate for the term y'z, and an OR gate is used to combine the two terms (x and y'z).
The variables of the function are taken as the inputs of the circuit, and the variable symbol of the function is taken as the output of the circuit.
The truth table for the Boolean function F = x + y'z can be represented as:
Note: A truth table can represent the relationship between a function and its binary variables. To represent a function of n binary variables in a truth table, we need a list of all 2^n combinations of those variables.
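The truth table of F = x + y'z can be generated mechanically. The sketch below assumes the text's notation: the prime (') denotes NOT, + denotes OR, and juxtaposition denotes AND.

```python
# Enumerate the truth table of F = x + y'z over all 2^3 input combinations.
from itertools import product

def F(x, y, z):
    return x | ((1 - y) & z)   # x OR (NOT y AND z)

print(" x y z | F")
for x, y, z in product((0, 1), repeat=3):
    print(f" {x} {y} {z} | {F(x, y, z)}")
```

The printed rows match the truth table referred to above: F is 1 whenever x = 1, or whenever y = 0 and z = 1.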
Map Simplification
the map method involves a simple, straightforward procedure for simplifyingboolean expressions.
map simplification may be regarded as a pictorial arrangement of the truth tablewhich allows an easy
interpretation for choosing the minimum number of terms needed to express the function
algebraically. the map method is also known as karnaugh map or k-map.
each combination of the variables in a truth table is called a mid-term.
note: when expressed in a truth table a function of n variables will have 2^n min-terms, equivalent to
the 2^n binary numbers obtained from n bits.
there are four min-terms in a two-variable map. therefore, the map consists offour squares, one for
each min-term. the 0's and 1's marked for each row, and each column designates the values of
variable x and y, respectively.
Three-Variable Map:
The map drawn in part (b) of the above image is marked with numbers in each row and each column to show the relationship between the squares and the three variables.
Any two adjacent squares in the map differ by only one variable, which is primed in one square and unprimed in the other. For example, m5 and m7 lie in two adjacent squares. Variable y is primed in m5 and unprimed in m7, whereas the other two variables are the same in both squares.
From the postulates of Boolean algebra, it follows that the sum of two min-terms in adjacent squares can be simplified to a single AND term consisting of only two literals. For example, consider the sum of the two adjacent squares m5 and m7:
m5 + m7 = xy'z + xyz = xz(y' + y) = xz.
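The simplification above can be checked by brute force: evaluate both sides for every combination of x, y, z and confirm they agree. A short sketch:

```python
# Verify the K-map simplification m5 + m7 = xy'z + xyz = xz by enumeration.
from itertools import product

def minterm_sum(x, y, z):        # xy'z + xyz
    return (x & (1 - y) & z) | (x & y & z)

def simplified(x, y, z):         # xz
    return x & z

assert all(minterm_sum(x, y, z) == simplified(x, y, z)
           for x, y, z in product((0, 1), repeat=3))
print("xy'z + xyz equals xz for all eight input combinations")
```

Exhaustive checking is feasible here because a function of n variables has only 2^n input combinations, exactly the min-terms counted in the note above.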
Simple, time-independent logic circuits implemented using Boolean circuits, whose output logic values depend only on the input logic values, are called combinational logic circuits.
A combinational logic function can be represented as a truth table, a Boolean algebra expression, or a logic diagram. The graphical representation of combinational logic functions using logic gates is called a logic diagram. The logic diagram for the logic function discussed above, given its truth table and Boolean expression, can be realized as shown in the above figure.
Combinational logic circuits can also be called decision-making circuits, as they are designed using individual logic gates. Combinational logic is the process of combining logic gates to process two or more given inputs so as to generate at least one output signal based on the logic function of each gate.
As shown in the figure, there are two types of input to the combinational logic:
External inputs, which are not controlled by the circuit.
Internal inputs, which are a function of a previous output state.
Secondary inputs are state variables produced by the storage elements, whereas secondary outputs are excitations for the storage elements.
Types of sequential circuits – there are two types of sequential circuit:
Asynchronous sequential circuit – These circuits do not use a clock signal but use the pulses of the inputs. These circuits are faster than synchronous sequential circuits because there is no clock pulse, and they change their state immediately when there is a change in the input signal. We use asynchronous sequential circuits when the speed of operation is important and independent of an internal clock pulse. However, these circuits are more difficult to design, and their output can be uncertain.
Synchronous sequential circuit – These circuits use a clock signal and level inputs (or pulsed inputs, with restrictions on pulse width and circuit propagation). The output pulse is the same duration as the clock pulse for clocked sequential circuits. Since they wait for the next clock pulse to arrive before performing the next operation, these circuits are a bit slower than asynchronous ones. A level output changes state at the start of an input pulse and remains in that state until the next input or clock pulse.
We use synchronous sequential circuits in synchronous counters, flip-flops, and in the design of Moore and Mealy state machines.
Sequential Logic Circuit
The figure represents the block diagram of the sequential logic circuit.
Difference Between Synchronous and Asynchronous Sequential Circuits
As the name suggests, both synchronous and asynchronous sequential circuits are types of sequential circuits which use feedback for the next output generation; however, the two can be differentiated on the basis of the type of this feedback.
Following are the important differences between synchronous and asynchronous sequential circuits:
1. Clock signal – Synchronous sequential circuits are triggered by clock signals, whereas asynchronous sequential circuits change state as soon as the input signals change, without a clock.
2. Memory unit – In synchronous sequential circuits, the memory unit used is the clocked flip-flop. On the other hand, an unclocked flip-flop or time-delay element is used as the memory element in asynchronous sequential circuits.
3. Complexity – It is easy to design synchronous sequential circuits. However, in asynchronous sequential circuits the presence of feedback among logic gates causes instability issues, making the design difficult.
4. Performance – Due to the propagation delay of the clock signal in reaching all elements of the circuit, synchronous sequential circuits are slower in operation. Since there is no clock-signal delay, asynchronous circuits are fast compared to synchronous sequential circuits.
5. Example – Synchronous circuits are used in counters, shift registers, and memory units. On the other hand, asynchronous circuits are used in low-power and high-speed operations such as simple microprocessors, digital signal processing units, and communication systems.
Sequential Logic Circuits
Digital logic circuits whose outputs are determined by the logic function of both the current state inputs and the past state inputs are called sequential logic circuits.
These sequential digital logic circuits are capable of retaining the earlier state of the system based on the current inputs and the earlier state.
Hence, unlike combinational logic circuits, sequential digital logic circuits are capable of storing data in a digital circuit.
Sequential logic circuits contain memory elements.
The latch is considered the simplest element used to retain the earlier memory or state in sequential digital logic.
Latches can also be called flip-flops; but, if we consider the true structural form, a latch can be considered a combinational circuit with one or more outputs fed back as inputs.
These sequential digital logic circuits are used in most types of memory elements and also in finite state machines, which are digital circuit models with finitely many possible states. Most sequential logic circuits use a clock for triggering the flip-flop operation. If the flip-flops in the digital logic circuit are triggered by a clock, the circuit is called a synchronous sequential circuit, and the other circuits (which are not triggered simultaneously) are called asynchronous sequential circuits.
Sequential digital logic circuits utilize feedback from outputs to inputs.
A sequential logic circuit's behavior can be defined by using a set of output functions and a set of next-state (memory) functions.
In practical digital logic circuits, both combinational and sequential digital logic circuits are used.
Flip-Flop Conversion
SR-FF to JK-FF Conversion
There are eight possible combinations of the two inputs J and K together with Qp. For every combination of J, K and Qp, the corresponding Qp+1 state is found; Qp+1 simply indicates the future value to be attained by the JK flip-flop after the present state Qp. The table is then completed by writing the values of S and R required to obtain each Qp+1 from the corresponding Qp. That is, the S and R values required to change the state of the flip-flop from Qp to Qp+1 are written down.
For every combination, the corresponding Qp+1 outputs are found. The outputs for the combination S = R = 1 are not allowed for an SR-FF. Therefore those outputs are considered invalid, and the corresponding J and K values are taken as "don't care".
SR-FF to D-FF Conversion
JK Flip-Flop to D Flip-Flop Conversion
In this type of flip-flop conversion, J and K are the actual inputs, and D is the external input of the flip-flop. The four combinations of the flip-flop are formed using D and Qp, and J and K are expressed in terms of these two. The conversion table with the four combinations, the JK-FF to D-FF conversion logic diagram, and the Karnaugh maps for J and K in terms of D and Qp are shown below.
Eight combinations can be made using J, K and Qp, as shown in the conversion table below. D is stated in terms of J, K and Qp. The Karnaugh map for D in terms of J, K and Qp, the conversion table, and the logic diagram of the D-FF to JK-FF conversion are shown below.
Thus, this covers the different types of flip-flop conversions, which include SR-FF to JK-FF, JK-FF to SR-FF, SR-FF to D-FF, D-FF to SR-FF, JK-FF to T-FF, JK-FF to D-FF, and D-FF to JK-FF.
Integrated Circuit
A microprocessor is a digital circuit built using combinational logic functions. The microprocessor package contains an integrated circuit.
An integrated circuit is an electronic circuit or device that has electronic components on a small semiconductor chip, providing the functionality of logic and/or amplification of a signal. There are mainly two types of circuits: digital and analog. Analog ICs handle continuous signals such as audio signals, and digital ICs handle discrete signals such as binary values. An integrated circuit, or IC, is a small chip that can function as an amplifier, oscillator, timer, microprocessor, or even computer memory. An IC is a small wafer, usually made of silicon, that can hold anywhere from hundreds to millions of transistors, resistors, and capacitors. These extremely small electronics can perform calculations and store data using either digital or analog technology.
Digital ICs use logic gates, which work only with values of ones and zeros. A low signal sent to a component on a digital IC results in a value of 0, while a high signal creates a value of 1. Digital ICs are the kind you will usually find in computers, networking equipment, and most consumer electronics.
Analog, or linear, ICs work with continuous values. This means a component on a linear IC can take a value of any kind and output another value. The term "linear" is used since the output value is a linear function of the input. For example, a component on a linear IC may multiply an incoming value by a factor of 2.5 and output the result. Linear ICs are typically used in audio and radio frequency amplification.
There might be ten billion or more transistors in a modern digital circuit. So, we need integrated circuits (ICs) that combine a small or large number of these transistors to achieve particular functionality. These circuits provide very low cost and a higher level of reliability. Examples of integrated circuit technologies are MOS, CMOS, TTL, etc. CMOS ICs are sensitive to static electricity, so anti-static foam is used for the storage and transport of these ICs to reduce the risk of chip failure. TTL technology requires a regulated power supply of 5 volts.
Decoders
A decoder is a combinational circuit that has n input lines and a maximum of 2^n output lines. One of these outputs will be active high based on the combination of inputs present when the decoder is enabled. That means the decoder detects a particular code. The outputs of the decoder are nothing but the min-terms of the n input variables, when it is enabled.
2 to 4 Decoder
Let the 2 to 4 decoder have two inputs A1 and A0 and four outputs Y3, Y2, Y1 and Y0. The block diagram of the 2 to 4 decoder is shown in the following figure.
One of these four outputs will be '1' for each combination of inputs when the enable, E, is '1'. The truth table of the 2 to 4 decoder is shown below.
e a1 a0 y3 y2 y1 y0
0 x x 0 0 0 0
1 0 0 0 0 0 1
1 0 1 0 0 1 0
1 1 0 0 1 0 0
1 1 1 1 0 0 0
From the truth table, we can write the Boolean functions for each output as:
Y3 = E.A1.A0
Y2 = E.A1.A0'
Y1 = E.A1'.A0
Y0 = E.A1'.A0'
Each output has one product term, so there are four product terms in total. We can implement these four product terms using four AND gates with three inputs each, and two inverters. The circuit diagram of the 2 to 4 decoder is shown in the following figure.
Therefore, the outputs of the 2 to 4 decoder are nothing but the min-terms of the two input variables A1 and A0 when the enable, E, is equal to one. If the enable, E, is zero, then all the outputs of the decoder will be equal to zero.
Similarly, a 3 to 8 decoder produces the eight min-terms of three input variables A2, A1 and A0, and a 4 to 16 decoder produces the sixteen min-terms of four input variables A3, A2, A1 and A0.
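The four output functions of the 2 to 4 decoder can be sketched directly in Python; the function name and tuple ordering are our own conventions for illustration.

```python
# Sketch of a 2 to 4 decoder with enable, following the Boolean functions
# Y3 = E.A1.A0, Y2 = E.A1.A0', Y1 = E.A1'.A0, Y0 = E.A1'.A0'.
def decoder_2to4(e, a1, a0):
    """Return the outputs as a tuple (y3, y2, y1, y0)."""
    y3 = e & a1 & a0
    y2 = e & a1 & (1 - a0)
    y1 = e & (1 - a1) & a0
    y0 = e & (1 - a1) & (1 - a0)
    return (y3, y2, y1, y0)

print(decoder_2to4(1, 1, 0))  # enabled, A1A0 = 10: Y2 goes high
print(decoder_2to4(0, 1, 0))  # disabled: all outputs are zero
```

Each enabled input combination drives exactly one output high, which is what "the outputs are the min-terms of A1 and A0" means in practice.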
Implementation of Higher-Order Decoders
Now, let us implement the following two higher-order decoders using lower-order decoders:
3 to 8 decoder
4 to 16 decoder
3 to 8 Decoder
In this section, let us implement a 3 to 8 decoder using 2 to 4 decoders. We know that a 2 to 4 decoder has two inputs, A1 and A0, and four outputs, Y3 to Y0, whereas a 3 to 8 decoder has three inputs, A2, A1 and A0, and eight outputs, Y7 to Y0.
We can find the number of lower-order decoders required for implementing a higher-order decoder using the following formula:
Required number of lower-order decoders = m2/m1
where m1 is the number of outputs of the lower-order decoder and m2 is the number of outputs of the higher-order decoder.
Here, m1 = 4 and m2 = 8. Substituting these two values into the above formula:
Required number of 2 to 4 decoders = 8/4 = 2
Therefore, we require two 2 to 4 decoders for implementing one 3 to 8 decoder. The block diagram of the 3 to 8 decoder using 2 to 4 decoders is shown in the following figure.
The parallel inputs A1 and A0 are applied to each 2 to 4 decoder. The complement of input A2 is connected to the enable, E, of the lower 2 to 4 decoder in order to get the outputs Y3 to Y0; these are the lower four min-terms. The input A2 is directly connected to the enable, E, of the upper 2 to 4 decoder in order to get the outputs Y7 to Y4; these are the higher four min-terms.
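The wiring described above can be sketched as follows: A2 enables the upper decoder, its complement enables the lower one, and A1, A0 go to both (the function names are illustrative).

```python
# Sketch: a 3 to 8 decoder built from two 2 to 4 decoders, as described above.
def decoder_2to4(e, a1, a0):
    return (e & a1 & a0, e & a1 & (1 - a0),
            e & (1 - a1) & a0, e & (1 - a1) & (1 - a0))

def decoder_3to8(a2, a1, a0):
    upper = decoder_2to4(a2, a1, a0)        # enabled when A2 = 1: Y7..Y4
    lower = decoder_2to4(1 - a2, a1, a0)    # enabled when A2 = 0: Y3..Y0
    return upper + lower                    # tuple (y7, y6, ..., y0)

print(decoder_3to8(1, 0, 1))  # input 101: min-term 5, so Y5 goes high
```

Exactly one of the two 2 to 4 decoders is enabled at a time, so exactly one of the eight outputs is high for each input combination.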
4 to 16 Decoder
In this section, let us implement a 4 to 16 decoder using 3 to 8 decoders. We know that a 3 to 8 decoder has three inputs, A2, A1 and A0, and eight outputs, Y7 to Y0, whereas a 4 to 16 decoder has four inputs, A3, A2, A1 and A0, and sixteen outputs, Y15 to Y0. We know the following formula for finding the number of lower-order decoders required:
Required number of lower-order decoders = m2/m1
Substituting m1 = 8 and m2 = 16 into the above formula:
Required number of 3 to 8 decoders = 16/8 = 2
Therefore, we require two 3 to 8 decoders for implementing one 4 to 16 decoder. The block diagram of the 4 to 16 decoder using 3 to 8 decoders is shown in the following figure.
The parallel inputs A2, A1 and A0 are applied to each 3 to 8 decoder. The complement of input A3 is connected to the enable, E, of the lower 3 to 8 decoder in order to get the outputs Y7 to Y0; these are the lower eight min-terms. The input A3 is directly connected to the enable, E, of the upper 3 to 8 decoder in order to get the outputs Y15 to Y8; these are the higher eight min-terms.
Multiplexers
A multiplexer is a combinational circuit that has a maximum of 2^n data inputs, n selection lines, and a single output line. One of these data inputs is connected to the output based on the values of the selection lines.
Since there are n selection lines, there are 2^n possible combinations of zeros and ones, and each combination selects only one data input. A multiplexer is also called a mux.
4x1 Multiplexer
A 4x1 multiplexer has four data inputs I3, I2, I1 and I0, two selection lines S1 and S0, and one output Y. The block diagram of the 4x1 multiplexer is shown in the following figure.
One of these 4 inputs is connected to the output based on the combination of values present at the two selection lines. The truth table of the 4x1 multiplexer is shown below.
s1 s0 y
0 0 i0
0 1 i1
1 0 i2
1 1 i3
From the truth table, we can directly write the Boolean function for the output Y as:
Y = S1'.S0'.I0 + S1'.S0.I1 + S1.S0'.I2 + S1.S0.I3
We can implement this Boolean function using inverters, AND gates, and an OR gate. The circuit diagram of the 4x1 multiplexer is shown in the following figure. We can easily understand the operation of the above circuit. Similarly, you can implement an 8x1 multiplexer and a 16x1 multiplexer by following the same procedure.
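The sum-of-products expression for Y can be sketched as a Python function; passing the data inputs as a tuple is our own convention for illustration.

```python
# Sketch of a 4x1 multiplexer implementing
# Y = S1'.S0'.I0 + S1'.S0.I1 + S1.S0'.I2 + S1.S0.I3.
def mux_4x1(i, s1, s0):
    """i is a sequence (i0, i1, i2, i3); returns the selected input."""
    return ((1 - s1) & (1 - s0) & i[0]) | ((1 - s1) & s0 & i[1]) \
         | (s1 & (1 - s0) & i[2]) | (s1 & s0 & i[3])

data = (0, 1, 1, 0)
for s1 in (0, 1):
    for s0 in (0, 1):
        print(f"s1={s1} s0={s0} -> y={mux_4x1(data, s1, s0)}")
```

For each selection-line combination exactly one product term is active, so the output simply equals the selected data input, as the truth table states.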
Implementation of Higher-Order Multiplexers
Now, let us implement the following two higher-order multiplexers using lower-order multiplexers:
8x1 multiplexer
16x1 multiplexer
8x1 Multiplexer
In this section, let us implement an 8x1 multiplexer using 4x1 multiplexers and a 2x1 multiplexer. We know that a 4x1 multiplexer has 4 data inputs, 2 selection lines, and one output, whereas an 8x1 multiplexer has 8 data inputs, 3 selection lines, and one output.
So, we require two 4x1 multiplexers in the first stage in order to get the 8 data inputs. Since each 4x1 multiplexer produces one output, we require a 2x1 multiplexer in the second stage, taking the outputs of the first stage as inputs, to produce the final output.
Let the 8x1 multiplexer have eight data inputs I7 to I0, three selection lines S2, S1 and S0, and one output Y. The truth table of the 8x1 multiplexer is shown below.
s2 s1 s0 y
0 0 0 i0
0 0 1 i1
0 1 0 i2
0 1 1 i3
1 0 0 i4
1 0 1 i5
1 1 0 i6
1 1 1 i7
We can implement the 8x1 multiplexer using lower-order multiplexers easily by considering the above truth table. The block diagram of the 8x1 multiplexer is shown in the following figure.
The same selection lines, S1 and S0, are applied to both 4x1 multiplexers. The data inputs of the upper 4x1 multiplexer are I7 to I4, and the data inputs of the lower 4x1 multiplexer are I3 to I0. Therefore, each 4x1 multiplexer produces an output based on the values of the selection lines S1 and S0.
The outputs of the first-stage 4x1 multiplexers are applied as inputs of the 2x1 multiplexer that is present in the second stage. The other selection line, S2, is applied to the 2x1 multiplexer.
If S2 is zero, then the output of the 2x1 multiplexer will be one of the 4 inputs I3 to I0, based on the values of the selection lines S1 and S0.
If S2 is one, then the output of the 2x1 multiplexer will be one of the 4 inputs I7 to I4, based on the values of the selection lines S1 and S0.
Therefore, the overall combination of two 4x1 multiplexers and one 2x1 multiplexer performs as one 8x1 multiplexer.
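The two-stage structure described above can be sketched in Python (function names and tuple ordering are illustrative conventions):

```python
# Sketch: an 8x1 multiplexer built from two 4x1 muxes and one 2x1 mux.
def mux_4x1(i, s1, s0):
    return ((1 - s1) & (1 - s0) & i[0]) | ((1 - s1) & s0 & i[1]) \
         | (s1 & (1 - s0) & i[2]) | (s1 & s0 & i[3])

def mux_2x1(a, b, s):
    return ((1 - s) & a) | (s & b)       # selects a when s = 0, b when s = 1

def mux_8x1(i, s2, s1, s0):
    """i is (i0, i1, ..., i7)."""
    lower = mux_4x1(i[0:4], s1, s0)      # first stage: chooses from I3..I0
    upper = mux_4x1(i[4:8], s1, s0)      # first stage: chooses from I7..I4
    return mux_2x1(lower, upper, s2)     # second stage: S2 picks the half

data = (0, 0, 0, 0, 0, 1, 0, 0)          # only I5 = 1
print(mux_8x1(data, 1, 0, 1))            # S2S1S0 = 101 selects I5
```

Both 4x1 muxes share S1 and S0; S2 only decides which first-stage output reaches Y, exactly as in the block diagram.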
16x1 Multiplexer
In this section, let us implement a 16x1 multiplexer using 8x1 multiplexers and a 2x1 multiplexer. We know that an 8x1 multiplexer has 8 data inputs, 3 selection lines, and one output, whereas a 16x1 multiplexer has 16 data inputs, 4 selection lines, and one output.
So, we require two 8x1 multiplexers in the first stage in order to get the 16 data inputs. Since each 8x1 multiplexer produces one output, we require a 2x1 multiplexer in the second stage, taking the outputs of the first stage as inputs, to produce the final output.
Let the 16x1 multiplexer have sixteen data inputs I15 to I0, four selection lines S3 to S0, and one output Y. The truth table of the 16x1 multiplexer is shown below.
s3 s2 s1 s0 y
0 0 0 0 i0
0 0 0 1 i1
0 0 1 0 i2
0 0 1 1 i3
0 1 0 0 i4
0 1 0 1 i5
0 1 1 0 i6
0 1 1 1 i7
1 0 0 0 i8
1 0 0 1 i9
1 0 1 0 i10
1 0 1 1 i11
1 1 0 0 i12
1 1 0 1 i13
1 1 1 0 i14
1 1 1 1 i15
We can implement the 16x1 multiplexer using lower-order multiplexers easily by considering the above truth table. The block diagram of the 16x1 multiplexer is shown in the following figure.
The same selection lines, S2, S1 and S0, are applied to both 8x1 multiplexers. The data inputs of the upper 8x1 multiplexer are I15 to I8, and the data inputs of the lower 8x1 multiplexer are I7 to I0. Therefore, each 8x1 multiplexer produces an output based on the values of the selection lines S2, S1 and S0.
The outputs of the first-stage 8x1 multiplexers are applied as inputs of the 2x1 multiplexer that is present in the second stage. The other selection line, S3, is applied to the 2x1 multiplexer.
If S3 is zero, then the output of the 2x1 multiplexer will be one of the 8 inputs I7 to I0, based on the values of the selection lines S2, S1 and S0.
If S3 is one, then the output of the 2x1 multiplexer will be one of the 8 inputs I15 to I8, based on the values of the selection lines S2, S1 and S0.
Therefore, the overall combination of two 8x1 multiplexers and one 2x1 multiplexer performs as one 16x1 multiplexer.
Digital Registers
A flip-flop is a 1-bit memory cell which can be used for storing digital data. To increase the storage capacity in terms of the number of bits, we have to use a group of flip-flops. Such a group of flip-flops is known as a register. An n-bit register consists of n flip-flops and is capable of storing an n-bit word.
The binary data in a register can be moved within the register from one flip-flop to another. Registers that allow such data transfers are called shift registers. There are four modes of operation of a shift register:
Serial input, serial output
Serial input, parallel output
Parallel input, serial output
Parallel input, parallel output
Block Diagram
Operation
Before the application of the clock signal, let q3 q2 q1 q0 = 0000, and apply the LSB of the number to be entered to Din. So Din = D3 = 1. Apply the clock. On the first falling edge of the clock, FF-3 is set, and the word stored in the register is q3 q2 q1 q0 = 1000.
Apply the next bit to Din, so Din = 1. As soon as the next negative edge of the clock hits, FF-2 is set, and the stored word changes to q3 q2 q1 q0 = 1100.
Apply the next bit to be stored, i.e. 1, to Din and apply the clock pulse. As soon as the third negative clock edge hits, FF-1 is set, and the output is modified to q3 q2 q1 q0 = 1110.
Similarly, with Din = 1 and the fourth negative clock edge arriving, the word stored in the register is q3 q2 q1 q0 = 1111.
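The loading sequence above can be sketched as a simple simulation: on each clock edge the new bit enters at q3 and everything else shifts one place toward q0 (the list-based model is our own convention).

```python
# Sketch of the serial-in shift sequence described above: the register
# (q3 q2 q1 q0) starts at 0000 and Din = 1 is applied on four clock edges.
def shift_in(register, din):
    """One clock edge: din enters at the q3 end, all bits shift one place."""
    return [din] + register[:-1]

reg = [0, 0, 0, 0]                    # q3 q2 q1 q0
for step in range(4):
    reg = shift_in(reg, 1)
    print(f"after clock {step + 1}: q3 q2 q1 q0 = {''.join(map(str, reg))}")
```

The printed states are 1000, 1100, 1110, 1111, matching the four clock edges traced in the text.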
Truth Table
Waveforms
Block Diagram
Parallel Input Serial Output (PISO)
Data bits are entered in parallel fashion.
The circuit shown below is a four-bit parallel-input serial-output register.
The output of each flip-flop is connected to the input of the next one via a combinational circuit, and the binary input word B0, B1, B2, B3 is applied through the same combinational circuit.
There are two modes in which this circuit can work: shift mode or load mode.
Load Mode
When the shift/load bar line is low (0), AND gates 2, 4 and 6 become active, and they pass the B1, B2, B3 bits to the corresponding flip-flops. On the low-going edge of the clock, the binary input B0, B1, B2, B3 gets loaded into the corresponding flip-flops. Thus parallel loading takes place.
Shift Mode
When the shift/load bar line is high (1), AND gates 2, 4 and 6 become inactive, so parallel loading of the data becomes impossible. But AND gates 1, 3 and 5 become active; therefore the data is shifted from left to right, bit by bit, on the application of clock pulses. Thus the parallel-in serial-out operation takes place.
Block Diagram
Shifting a binary number one bit to the left multiplies it by 2, whereas shifting it one bit to the right divides the original number by 2.
Hence, if we want to use the shift register to multiply and divide a given binary number, we should be able to move the data in either the left or the right direction. Such a register is called a bi-directional register. A four-bit bi-directional shift register is shown in the figure.
There are two serial inputs, namely the serial right-shift data input DR and the serial left-shift data input DL, along with a mode select input (M).
Block Diagram
Operation
The figure shows a bi-directional register. For the shift-left operation, the serial input is applied to the input that goes to AND gate 1, shown in the figure, whereas for the shift-right operation, the serial input is applied to the D input.
Block Diagram
Counter
In digital logic and computing, a counter is a device which stores (and sometimes displays) the number of times a particular event or process has occurred, often in relationship to a clock signal. Counters are used in digital electronics for counting purposes; they can count specific events happening in the circuit. For example, an up counter increases its count on every rising edge of the clock. Beyond simple counting, a counter can follow a certain sequence based on our design, such as an arbitrary sequence 0, 1, 3, 2, …. Counters can be designed with the help of flip-flops.
Counter Classification
Counters are broadly divided into two categories:
Asynchronous counters
Synchronous counters
Asynchronous Counter
In an asynchronous counter, we do not use a universal clock; only the first flip-flop is driven by the main clock, and the clock input of each of the following flip-flops is driven by the output of the previous flip-flop. We can understand this from the following diagram.
It is evident from the timing diagram that Q0 changes as soon as the rising edge of the clock pulse is encountered, Q1 changes when the rising edge of Q0 is encountered (because Q0 acts as the clock pulse for the second flip-flop), and so on. In this way, changes ripple through Q0, Q1, Q2, Q3; hence it is also called a ripple counter.
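The rippling behaviour can be sketched as a behavioural simulation: each stage toggles, and the toggle propagates to the next stage only when the current stage produces a falling edge (goes from 1 to 0). This is a behavioural model of the ripple effect, not a gate-level one.

```python
# Sketch of a 4-bit ripple (asynchronous) counter: each flip-flop toggles on
# the falling edge of the previous stage, so the count ripples through
# q0, q1, q2, q3.
def ripple_tick(q):
    """Advance the counter by one clock pulse. q = [q0, q1, q2, q3]."""
    for i in range(len(q)):
        q[i] ^= 1            # this stage toggles
        if q[i] == 1:        # 0 -> 1 is a rising edge: the ripple stops here
            break
    return q

q = [0, 0, 0, 0]
for pulse in range(1, 6):
    ripple_tick(q)
    count = sum(bit << i for i, bit in enumerate(q))
    print(f"pulse {pulse}: count = {count}")
```

The simulated count goes 1, 2, 3, 4, 5, …, exactly the binary up-count that the cascaded flip-flops produce.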
Synchronous Counter
Unlike the asynchronous counter, a synchronous counter has one global clock which drives every flip-flop, so the outputs change in parallel. The one advantage of a synchronous counter over an asynchronous counter is that it can operate at a higher frequency, since it has no cumulative delay: the same clock is given to each flip-flop.
Synchronous Counter Circuit
Decade Counter
A decade counter counts through ten different states and then resets to its initial state. A simple decade counter counts from 0 to 9, but we can also make decade counters which go through any ten states between 0 and 15 (for a 4-bit counter).
clock pulse q3 q2 q1 q0
0 0 0 0 0
1 0 0 0 1
2 0 0 1 0
3 0 0 1 1
4 0 1 0 0
5 0 1 0 1
6 0 1 1 0
7 0 1 1 1
8 1 0 0 0
9 1 0 0 1
10 0 0 0 0
Important point: the number of flip-flops used in a counter is always greater than or equal to ⌈log2 n⌉, where n is the number of states in the counter.
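The decade counter's state sequence and the flip-flop bound can both be checked with a short sketch (the function name is our own):

```python
import math

# Sketch of the decade counter behaviour tabulated above: count 0..9, then
# reset to 0 on the tenth pulse.
def decade_tick(count):
    return (count + 1) % 10

count, sequence = 0, []
for pulse in range(11):
    sequence.append(count)
    count = decade_tick(count)
print(sequence)                      # 0, 1, ..., 9, then back to 0

# The important point above: n = 10 states need ceil(log2 10) = 4 flip-flops.
print(math.ceil(math.log2(10)))
```

Four bits (q3 q2 q1 q0) suffice because 2^4 = 16 ≥ 10, while 2^3 = 8 < 10, which is why the table uses four output columns.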
Representation Of Data
Data and instructions cannot be entered into and processed by computers directly in human language. Any type of data, be it numbers, letters, special symbols, sound or pictures, must first be converted into machine-readable form, i.e. binary form. For this reason, it is important to understand how a computer, together with its peripheral devices, handles data in its electronic circuits, on magnetic media, and in optical devices.
Data Representation In Digital Circuits
Electronic components, such as the microprocessor, are made up of millions of electronic circuits. The availability of a high voltage (on) in these circuits is interpreted as '1', while a low voltage (off) is interpreted as '0'. This concept can be compared to switching an electric circuit on and off: when the switch is closed, the high voltage in the circuit causes the bulb to light (the '1' state); when the switch is open, the bulb goes off (the '0' state). This forms the basis for describing data representation in digital computers using the binary number system.
On magnetic media, a read/write head acts as a detector that transforms magnetic patterns into digital form. The presence of a magnetic field in one direction on magnetic media is interpreted as '1', while a field in the opposite direction is interpreted as '0'. Magnetic technology is mostly used on storage devices that are coated with special magnetic materials such as iron oxide. Data is written on the media by arranging the magnetic dipoles of some iron oxide particles to face in one direction and some others in the opposite direction.
Number System
We are introduced to the concept of numbers from a very early age. To a computer, everything is a number: alphabets, pictures, sounds, etc., are all numbers.
Number systems are categorized into four types:
The binary number system uses only two digits, 0 and 1.
The octal number system uses 8 digits, 0 to 7.
The decimal number system uses 10 digits, 0 to 9.
The hexadecimal number system uses 16 digits, 0 to 9 and A to F.
Number system
System        Base   Digits
Binary        2      0 1
Octal         8      0 1 2 3 4 5 6 7
Decimal       10     0 1 2 3 4 5 6 7 8 9
Hexadecimal   16     0 1 2 3 4 5 6 7 8 9 A B C D E F
Text Code
Text code is a format commonly used to represent alphabets, punctuation marks and other symbols. The four most popular text code systems are EBCDIC, ASCII, Extended ASCII and Unicode.
EBCDIC
Extended Binary Coded Decimal Interchange Code is an 8-bit code that defines 256 symbols.
ASCII
American Standard Code for Information Interchange is a 7-bit code that specifies character values from 0 to 127.
Extended ASCII
Extended American Standard Code for Information Interchange is an 8-bit code that specifies character values from 128 to 255.
Unicode
The Unicode worldwide character standard uses 8 to 32 bits to represent letters, numbers and symbols.
Data Types
A very simple but very important concept, available in almost all programming languages, is the data type. As its name indicates, a data type represents the type of data which you can process using your computer program. It can be numeric, alphanumeric, decimal, etc.
Keep computer programming aside for a while and take an easy example of adding two whole numbers, 10 and 20, which can be done simply as 10 + 20. Another problem is adding two decimal numbers, 10.50 and 20.50, which is written as 10.50 + 20.50.
The two examples are straightforward. Now consider another example where we want to record student information in a notebook. Here we would like to record the following information:
Name:
Class:
Section:
Age:
Sex:
Now, put down one student record as per the given format:
Name: Zara Ali
Class: 6th
Section: J
Age: 13
Sex: F
The first example dealt with whole numbers, the second example added two decimal numbers, whereas the third example deals with a mix of different data.
Let us put it as follows:
The student name "Zara Ali" is a sequence of characters, which is also called a string.
The student class "6th" has been represented by a mix of a whole number and a string of two characters. Such a mix is called alphanumeric.
The student section has been represented by a single character, 'J'.
The student age has been represented by a whole number, 13.
The student sex has been represented by a single character, 'F'.
This way, we realize that in our day-to-day life we deal with different types of data such as strings, characters, whole numbers (integers), and decimal numbers (floating point numbers).
Similarly, when we write a computer program to process different types of data, we need to specify the type clearly; otherwise the computer does not understand how different operations can be performed on that data. Different programming languages use different keywords to specify different data types. For example, the C and Java programming languages use int to specify integer data, whereas char specifies a character data type.
Subsequent chapters will show you how to use different data types in different situations. For now, check the important data types available in C, Java, and Python and the keywords we will use to specify those data types.
These data types are called primitive data types, and you can use them to build more complex data types, which are called user-defined data types; for example, a string is a sequence of characters.
Number System
The technique used to represent and work with numbers is called a number system. The decimal number system is the most common number system. Other popular number systems include the binary, octal, and hexadecimal number systems.
Decimal Number System
The decimal number system is a base-10 number system having 10 digits, 0 to 9. This means that any numerical quantity can be represented using these 10 digits. The decimal number system is also a positional value system: the value of a digit depends on its position. Let us take an example to understand this.
Say we have three numbers: 734, 971 and 207. The value of 7 in all three numbers is different:
In 734, the value of 7 is 7 hundreds, i.e. 700 or 7 × 10².
In 971, the value of 7 is 7 tens, i.e. 70 or 7 × 10¹.
In 207, the value of 7 is 7 units, i.e. 7 or 7 × 10⁰.
The weightage of each position can thus be represented as successive powers of 10.
In digital systems, instructions are given through electric signals, and variation is achieved by varying the voltage of the signal. Implementing the decimal number system in digital equipment would require 10 distinguishable voltage levels, which is difficult. So, many number systems that are easier to implement digitally have been developed. Let us look at them in detail.
In any binary number, the rightmost digit is called the least significant bit (LSB) and the leftmost digit is called the most significant bit (MSB).
The decimal equivalent of a binary number is the sum of the product of each digit with its positional value.
11010₂ = 1×2⁴ + 1×2³ + 0×2² + 1×2¹ + 0×2⁰
       = 16 + 8 + 0 + 2 + 0
       = 26₁₀
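The positional-value sum above can be sketched in Python (the built-in int with an explicit base gives the same answer):

```python
def binary_to_decimal(bits: str) -> int:
    """Sum each digit times its positional value (a power of 2)."""
    value = 0
    for position, digit in enumerate(reversed(bits)):
        value += int(digit) * 2 ** position
    return value

print(binary_to_decimal("11010"))  # 26, matching the worked example
print(int("11010", 2))             # Python's built-in agrees: 26
```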
Computer memory is measured in terms of how many bits it can store. Here is a chart for memory capacity conversion.
1 byte (B) = 8 bits
1 kilobyte (KB) = 1024 bytes
1 megabyte (MB) = 1024 KB
1 gigabyte (GB) = 1024 MB
1 terabyte (TB) = 1024 GB
1 petabyte (PB) = 1024 TB
1 exabyte (EB) = 1024 PB
1 zettabyte (ZB) = 1024 EB
1 yottabyte (YB) = 1024 ZB
The decimal equivalent of an octal number is the sum of the product of each digit with its positional value.
726₈ = 7×8² + 2×8¹ + 6×8⁰
     = 448 + 16 + 6
     = 470₁₀
The decimal equivalent of a hexadecimal number is the sum of the product of each digit with its positional value.
27FB₁₆ = 2×16³ + 7×16² + 15×16¹ + 11×16⁰
       = 8192 + 1792 + 240 + 11
       = 10235₁₀
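The same positional-value method works for any base; a small Python sketch covering both worked examples:

```python
def to_decimal(digits: str, base: int) -> int:
    """Horner's form of the positional-value sum, for bases up to 16."""
    value = 0
    for ch in digits:
        value = value * base + int(ch, 16)  # int(ch, 16) maps 0-9 and a-f
    return value

print(to_decimal("726", 8))    # 470
print(to_decimal("27fb", 16))  # 10235
```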
Hexadecimal   Decimal   Octal   Binary
0             0         0       0000
1             1         1       0001
2             2         2       0010
3             3         3       0011
4             4         4       0100
5             5         5       0101
6             6         6       0110
7             7         7       0111
8             8         10      1000
9             9         11      1001
A             10        12      1010
B             11        13      1011
C             12        14      1100
D             13        15      1101
E             14        16      1110
F             15        17      1111
ASCII
Besides numerical data, a computer must be able to handle alphabets, punctuation marks, mathematical operators, special symbols, etc. that form the complete character set of the English language. The complete set of characters or symbols is called an alphanumeric code. The complete alphanumeric code typically includes:
26 upper case letters
26 lower case letters
10 digits
7 punctuation marks
20 to 40 special characters
Now, a computer understands only numeric values, whatever the number system used. So all characters must have a numeric equivalent, called the alphanumeric code. The most widely used alphanumeric code is the American Standard Code for Information Interchange (ASCII). ASCII is a 7-bit code that has 128 (2⁷) possible codes.
ISCII
ISCII stands for Indian Script Code for Information Interchange. ISCII was developed to support Indian languages on the computer. Languages supported by ISCII include Devanagari, Tamil, Bangla, Gujarati, Gurmukhi, Telugu, etc. ISCII was mostly used by government departments, and before it could catch on, a new universal encoding standard called Unicode was introduced.
Unicode
Unicode is an international coding system designed to be used with different language scripts. Each character or symbol is assigned a unique numeric value, largely within the framework of ASCII. Earlier, each script had its own encoding system, which could conflict with the others.
This is what Unicode officially aims to do: Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.
As mentioned in steps 2 and 4, the remainders have to be arranged in reverse order, so that the first remainder becomes the least significant digit (LSD) and the last remainder becomes the most significant digit (MSD).
Decimal number 29₁₀ = binary number 11101₂.
Step     Operation   Quotient   Remainder
Step 1   21 / 2      10         1
Step 2   10 / 2      5          0
Step 3   5 / 2       2          1
Step 4   2 / 2       1          0
Step 5   1 / 2       0          1
Decimal number 21₁₀ = binary number 10101₂.
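The repeated-division method can be sketched in Python; the remainders are collected and then read in reverse, exactly as the steps describe:

```python
def decimal_to_binary(n: int) -> str:
    """Repeatedly divide by 2; the remainders, read in reverse, give the bits."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        remainders.append(str(n % 2))  # remainder of the division
        n //= 2                        # quotient carries to the next step
    return "".join(reversed(remainders))

print(decimal_to_binary(21))  # 10101
print(decimal_to_binary(29))  # 11101
```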
Octal number 25₈ = binary number 10101₂.
Step 2: grouping 10101₂ from the right into threes gives 10 101, which converts to 2₈ 5₈.
Hence octal number 25₈ = binary number 10101₂.
Shortcut Method − Binary to Hexadecimal
Step 1 − Divide the binary digits into groups of four (starting from the right).
Step 2 − Convert each group of four binary digits to one hexadecimal symbol.
Example
Binary number 10101₂
Calculating the hexadecimal equivalent:
Step     Binary number   Hexadecimal number
Step 1   10101₂          0001 0101
Step 2   0001 0101       1₁₆ 5₁₆
Binary number 10101₂ = hexadecimal number 15₁₆.
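The grouping shortcut is easy to express in Python (a small sketch: pad to a multiple of four, then map each group):

```python
def binary_to_hex(bits: str) -> str:
    """Pad to a multiple of 4 bits, then convert each 4-bit group to one hex symbol."""
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(format(int(g, 2), "x") for g in groups)

print(binary_to_hex("10101"))  # 15  (groups: 0001 0101)
```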
Complement Arithmetic
Complements are used in digital computers in order to simplify the subtraction operation and for logical manipulation. For each radix-r system (radix r represents the base of the number system) there are two types of complements.
1. The radix complement, referred to as the r's complement.
2. The diminished radix complement, referred to as the (r−1)'s complement.
1's complement
The 1's complement of a number is found by changing all 1's to 0's and all 0's to 1's. This is called taking the complement or 1's complement. For example, the 1's complement of 1011 is 0100.
2's complement
The 2's complement of a binary number is obtained by adding 1 to the least significant bit (LSB) of the 1's complement of the number.
2's complement = 1's complement + 1
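Both complements can be sketched in a few lines of Python, working on bit strings of a fixed width:

```python
def ones_complement(bits: str) -> str:
    """Flip every bit: 1's to 0's and 0's to 1's."""
    return "".join("1" if b == "0" else "0" for b in bits)

def twos_complement(bits: str) -> str:
    """2's complement = 1's complement + 1, kept to the same register width."""
    n = len(bits)
    value = (int(ones_complement(bits), 2) + 1) % (2 ** n)
    return format(value, f"0{n}b")

print(ones_complement("1011"))  # 0100
print(twos_complement("1011"))  # 0101
```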
There are two major approaches to storing real numbers (i.e., numbers with a fractional component) in modern computing: (i) fixed point notation and (ii) floating point notation. In fixed point notation there is a fixed number of digits after the radix point, whereas floating point notation allows for a varying number of digits after the radix point.
Fixed-Point Representation −
This representation has a fixed number of digits for the integer part and for the fractional part. For example, if the given fixed-point representation is IIII.FFFF, then the minimum value you can store is 0000.0001 and the maximum value is 9999.9999. There are three parts of a fixed-point number representation: the sign field, the integer field, and the fractional field.
Here, 0 is used to represent + and 1 is used to represent −. 000000000101011 is the 15-bit binary value for the decimal integer 43, and 1010000000000000 is the 16-bit binary value for the fraction 0.625.
The advantage of fixed-point representation is performance; the disadvantage is the relatively limited range of values it can represent. It is therefore usually inadequate for numerical analysis, as it does not allow enough numbers and accuracy. A number whose representation exceeds 32 bits would have to be stored inexactly.
With the 32-bit format given above, the smallest positive number that can be stored is 2⁻¹⁶ ≈ 0.000015 and the largest positive number is (2¹⁵ − 1) + (1 − 2⁻¹⁶) = 2¹⁵ − 2⁻¹⁶ ≈ 32768, and the gap between consecutive numbers is 2⁻¹⁶.
In normalized floating-point form the integer field is always 1, and the radix point can be moved left or right by adjusting the exponent.
So, the actual number is (−1)ˢ (1 + m) × 2^(e − bias), where s is the sign bit, m is the mantissa, e is the exponent value, and bias is the bias number.
Note that signed integers and exponents are represented by either sign-magnitude representation, one's complement representation, or two's complement representation.
The floating point representation is more flexible. Any non-zero number can be represented in the normalized form ±(1.b₁b₂b₃…)₂ × 2ⁿ; this is the normalized form of a number x.
The following description explains the terminology and primary details of IEEE 754 binary floating point representation. The discussion is confined to the single and double precision formats.
Usually, a real number in binary is represented in the following format:
ImIm−1…I2I1I0.F1F2…Fn−1Fn
where Im and Fn are the bits (0 or 1) of the integer and fraction parts respectively.
A finite number can also be represented by four integer components: a sign (s), a base (b), a significand (m), and an exponent (e). Then the numerical value of the number is evaluated as
(−1)ˢ × m × bᵉ, where m < |b|
Depending on the base and the number of bits used to encode the various components, the IEEE 754 standard defines five basic formats. Among the five formats, the binary32 and binary64 formats are the single precision and double precision formats respectively, in which the base is 2.
Table 1 − Precision representation
In a normalized number, the implied most significant bit can be used to represent an even more accurate significand (23 + 1 = 24 bits); this is called the hidden-bit (implicit) representation. Floating point numbers are to be represented in normalized form.
The subnormal numbers fall into the category of denormalized numbers. The subnormal representation slightly reduces the exponent range, and such numbers cannot be normalized, since that would result in an exponent which does not fit in the field. Subnormal numbers are less accurate than normalized numbers, i.e. they have less room for nonzero bits in the fraction field. Indeed, the accuracy drops as the size of the subnormal number decreases. However, the subnormal representation is useful in filling gaps of the floating point scale near zero.
In other words, the above result can be written as (−1)⁰ × 1.001₂ × 2², which yields the integer components s = 0, b = 2, significand (m) = 1.001, mantissa = 001 and e = 2. The corresponding single precision floating point number can be represented in binary as shown below, where the exponent field is 2, yet encoded as 129 (127 + 2), called the biased exponent.
The exponent field is in plain binary format, which also represents negative exponents with an encoding (like sign magnitude, 1's complement, 2's complement, etc.). The biased exponent is used for the representation of negative exponents. The biased exponent has an advantage over the other negative representations in that it allows bitwise comparison of two floating point numbers for equality.
A bias of (2ⁿ⁻¹ − 1), where n is the number of bits used in the exponent, is added to the exponent e to get the biased exponent E. So, the biased exponent of a single precision number can be obtained as
E = e + 127
The range of the exponent in single precision format is −126 to +127. The remaining encodings are used for special symbols.
Note: when we unpack a floating point number, the exponent obtained is the biased exponent. Subtracting 127 from the biased exponent, we can extract the unbiased exponent.
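The unpacking described in the note can be sketched with Python's standard struct module: re-interpret the 32 bits of a single precision value, then split off the sign, the biased exponent, and the 23-bit mantissa field:

```python
import struct

def unpack_float32(x: float):
    """Return (sign, unbiased exponent, mantissa field) of an IEEE 754 single."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    return sign, biased_exponent - 127, mantissa  # subtract the bias of 127

# 4.5 = 1.001 x 2^2, matching the worked example: s = 0, e = 2, mantissa bits 001
print(unpack_float32(4.5))
```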
Precision:
The smallest change that can be represented in floating point representation is called the precision. The fractional part of a single precision normalized number has exactly 23 bits of resolution (24 bits with the implied bit). This corresponds to log₁₀(2²³) = 6.924 ≈ 7 decimal digits of accuracy. Similarly, for double precision numbers the precision is log₁₀(2⁵²) = 15.654 ≈ 16 decimal digits.
Accuracy:
Accuracy in floating point representation is governed by the number of significand bits, whereas the range is limited by the exponent. Not all real numbers can be exactly represented in floating point format. For any number x which is not a floating point number, there are two options for a floating point approximation: the closest floating point number less than x, written x₋, and the closest floating point number greater than x, written x₊. A rounding operation is performed on the number of significant bits in the mantissa field based on the selected mode. The round down mode sets x to x₋; the round up mode sets x to x₊; the round towards zero mode sets x to either x₋ or x₊, whichever is between zero and x. The round to nearest mode sets x to x₋ or x₊, whichever is nearest to x. Round to nearest is the most commonly used mode. The closeness of the floating point representation to the actual value is called the accuracy.
Overflow is said to occur when the true result of an arithmetic operation is finite but larger in magnitude than the largest floating point number which can be stored using the given precision. Underflow is said to occur when the true result of an arithmetic operation is smaller in magnitude (infinitesimal) than the smallest normalized floating point number which can be stored. Overflow cannot be ignored in calculations, whereas underflow can effectively be replaced by zero.
Endianness:
The IEEE 754 standard defines a binary floating point format; the architectural details are left to the hardware manufacturers. The storage order of the individual bytes in a binary floating point number varies from architecture to architecture.
In the odd parity system, a 1 is appended to the binary string if there is an even number of 1's, to make the total number of 1's odd. The receiver knows whether the sender is an odd parity generator or an even parity generator. Suppose the sender is an odd parity generator: then there must be an odd number of 1's in the received binary string. If an error occurs in a single bit, that is, a bit is changed from 1 to 0 or from 0 to 1, the received binary string will have an even number of 1's, which indicates an error.
The limitation of this method is that only an error in a single bit can be identified.
Message   Odd parity bit   Even parity bit
000       1                0
001       0                1
010       0                1
011       1                0
100       0                1
101       1                0
110       1                0
111       0                1
Figure − Error detection with odd parity bit
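The odd parity scheme above can be sketched in Python: the sender appends the parity bit, and the receiver flags any word whose total count of 1's is not odd:

```python
def odd_parity_bit(data: str) -> str:
    """Append a 1 when the count of 1's is even, so the total becomes odd."""
    return "1" if data.count("1") % 2 == 0 else "0"

def check_odd_parity(word: str) -> bool:
    """Receiver check: the total number of 1's must be odd."""
    return word.count("1") % 2 == 1

sent = "110" + odd_parity_bit("110")   # "1101"
print(check_odd_parity(sent))          # True: no error detected
print(check_odd_parity("1001"))        # False: even number of 1's, error detected
```

Flipping any single bit of a valid word makes the count of 1's even, which the check catches; flipping two bits restores odd parity, which illustrates the single-bit limitation stated above.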
Points To Remember:
in 1’s complement of signed number +0 and -0 has two differentrepresentation.
the range of signed magnitude representation of an 8-bit number in which1-bit is used as a signed bit
as follows -27 to +27.
floating point number is said to be normalized if most significant digit ofmantissa is one. for example,
6-bit binary number 001101 is normalizedbecause of two leading 0’s.
booth algorithm that uses two n bit numbers for multiplication gives resultsin 2n bits.
the booth algorithm uses 2’s complement representation of numbers andwork for both positive and
negative numbers.
if k-bits are used to represent exponent then bits number = (2k-1) and rangeof exponent = – (2k-1 -1) to
(2k-1).
Computer Arithmetic
Computer arithmetic is a field of computer science that investigates how computers should represent numbers and perform operations on them. It includes integer arithmetic, fixed-point arithmetic, and floating-point arithmetic. This book focuses on floating-point (FP) arithmetic, which will be more thoroughly described in Chapter 1. For now, let us say that it is the common way computers approximate real numbers and that it is described in the IEEE 754 standard [IEE08]. As in scientific notation, numbers are represented using an exponent and a significand, except that this significand has to fit in a certain number of bits. As this number of bits (called the precision) is limited, each operation may be inexact due to rounding. This makes computer arithmetic sometimes inaccurate: the result of a long computation may be far from the mathematical result that would have been obtained if all the computations were exact. This also makes computer arithmetic unintuitive: for instance, FP addition is not always associative.
Register Transfer
The term register transfer refers to the availability of hardware logic circuits that can perform a given micro-operation and transfer the result of the operation to the same or another register.
Most of the standard notations used for specifying operations on various registers are stated below:
The memory address register is designated by MAR.
The program counter PC holds the address of the next instruction.
The instruction register IR holds the instruction being executed.
R1 is a processor register.
We can also indicate individual bits by placing them in parentheses, for instance PC(8-15), R2(5), etc.
Data transfer from one register to another is represented in symbolic form by means of the replacement operator. For instance, the following statement denotes a transfer of the data of register R1 into register R2.
R2 ← R1
Typically, we want the transfer to occur only under a predetermined control condition. This can be shown by the following if-then statement:
If (P = 1) then (R2 ← R1); here P is a control signal generated in the control section.
It is more convenient to specify a control function P by separating the control variables from the register transfer operation. For instance, the following statement defines the data transfer operation under a specific control function P:
P: R2 ← R1
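The conditional transfer P: R2 ← R1 can be mimicked with a toy Python simulation (the dictionary model and register values here are illustrative, not part of the original hardware notation):

```python
# Toy register file; contents are illustrative.
registers = {"R1": 0b1010, "R2": 0b0000}

def transfer(dest: str, src: str, control: int) -> None:
    """Copy src into dest only when the control function P is 1."""
    if control == 1:
        registers[dest] = registers[src]

transfer("R2", "R1", control=0)
print(bin(registers["R2"]))  # R2 unchanged while P = 0
transfer("R2", "R1", control=1)
print(bin(registers["R2"]))  # R2 now holds the contents of R1
```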
The following image shows the block diagram that depicts the transfer of data from R1 to R2.
Here, the letter 'n' indicates the number of bits in the register. The n outputs of register R1 are connected to the n inputs of register R2.
A load input, activated by the control variable P, causes the data to be transferred into register R2.
The two selection lines S1 and S0 are connected to the selection inputs of all four multiplexers. The selection lines choose the four bits of one register and transfer them onto the four-line common bus.
When both select lines are at low logic, i.e. S1S0 = 00, the 0 data inputs of all four multiplexers are selected and applied to the outputs that form the bus. This, in turn, causes the bus lines to receive the content of register A, since the outputs of this register are connected to the 0 data inputs of the multiplexers.
Similarly, when S1S0 = 01, register B is selected, and the bus lines will receive the content provided by register B. The following function table shows the register that is selected by the bus for each of the four possible binary values of the selection lines.
Note: the number of multiplexers needed to construct the bus is equal to the number of bits in each register. The size of each multiplexer must be 'k × 1', since it multiplexes k data lines. For instance, a common bus for eight registers of 16 bits each requires 16 multiplexers, one for each line in the bus. Each multiplexer must have eight data input lines and three selection lines to multiplex one significant bit of the eight registers.
A bus system can also be constructed using three-state gates instead of multiplexers.
A three-state gate is a digital circuit that exhibits three states; two of the states are signals equivalent to logic 1 and 0, as in a conventional gate, while the third is a high-impedance state.
The most commonly used three-state gate in bus systems is the buffer gate.
The graphical symbol of a three-state buffer gate can be represented as:
The following diagram demonstrates the construction of a bus system with three-state buffers.
The outputs generated by the four buffers are connected to form a single bus line.
Only one buffer can be in the active state at a given point of time.
The control inputs to the buffers determine which of the four normal inputs will communicate with the bus line. A 2 × 4 decoder ensures that no more than one control input is active at any given point of time.
Memory Transfer
Most of the standard notations used for specifying memory transfer operations are stated below:
The transfer of information from a memory unit to the user end is called a read operation.
The transfer of new information to be stored in the memory is called a write operation.
A memory word is designated by the letter M.
We must specify the address of the memory word when writing memory transfer operations.
The address register is designated by AR and the data register by DR.
Thus, a read operation can be stated as:
Read: DR ← M[AR]
The read statement causes a transfer of information into the data register (DR) from the memory word (M) selected by the address register (AR).
And the corresponding write operation can be stated as:
Write: M[AR] ← R1
The write statement causes a transfer of information from register R1 into the memory word (M) selected by the address register (AR).
Micro-Operations
the operations executed on data stored in registers are called micro-operations.a micro-operation is
an elementary operation performed on the information stored in one or more registers.
example: shift, count, clear and load.
Types Of Micro-Operations
The micro-operations in digital computers are of four types:
Register transfer micro-operations transfer binary information from one register to another.
Arithmetic micro-operations perform arithmetic operations on numeric data stored in registers.
Logic micro-operations perform bit-manipulation operations on non-numeric data stored in registers.
Shift micro-operations perform shift operations on data stored in registers.
Arithmetic Micro-Operations
In general, the arithmetic micro-operations deal with the operations performed on numeric data stored in registers.
Note: the increment and decrement micro-operations are symbolized by '+ 1' and '− 1' respectively. Arithmetic operations like multiply and divide are not included in the basic set of micro-operations.
Logic Micro-Operations
These are binary micro-operations performed on the bits stored in registers. These operations consider each bit separately and treat it as a binary variable.
Let us consider the XOR micro-operation on the contents of two registers R1 and R2:
P: R1 ← R1 XOR R2
In the above statement we have also included a control function.
Assume that each register has 3 bits. Let the content of R1 be 010 and R2 be 100. The XOR micro-operation will be:
010 XOR 100 = 110
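The 3-bit XOR example can be checked directly in Python, with register contents modeled as integers:

```python
R1, R2 = 0b010, 0b100

# P: R1 <- R1 XOR R2, performed bitwise on 3-bit registers
P = 1
if P:
    R1 = (R1 ^ R2) & 0b111  # the mask keeps the result to 3 bits

print(format(R1, "03b"))  # 110
```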
Shift Micro-Operations
These are used for serial transfer of data; that is, we can shift the contents of the register to the left or right. In the shift left operation the serial input transfers a bit into the rightmost position, and in the shift right operation the serial input transfers a bit into the leftmost position.
Logical Shift
It transfers 0 through the serial input. The symbol "shl" is used for logical shift left and "shr" for logical shift right.
R1 ← shl R1
R1 ← shr R1
The register symbol must be the same on both sides of the arrow.
Circular Shift
This circulates or rotates the bits of the register around the two ends without any loss of data or contents. In this, the serial output of the shift register is connected to its serial input. "cil" and "cir" are used for circular shift left and right respectively.
Arithmetic Shift
This shifts a signed binary number to the left or right. An arithmetic shift left multiplies a signed binary number by 2, and an arithmetic shift right divides the number by 2. An arithmetic shift micro-operation leaves the sign bit unchanged, because the sign of the number remains the same when it is multiplied or divided by 2.
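The three shift flavors can be sketched in Python for an 8-bit register (the width parameter is an assumption for illustration):

```python
def shl(x: int, width: int = 8) -> int:
    """Logical shift left: a 0 enters on the right; result kept to the register width."""
    return (x << 1) & ((1 << width) - 1)

def shr(x: int, width: int = 8) -> int:
    """Logical shift right: a 0 enters on the left."""
    return x >> 1

def ashr(x: int, width: int = 8) -> int:
    """Arithmetic shift right: the sign bit is copied into the vacated position."""
    sign = x & (1 << (width - 1))
    return (x >> 1) | sign

print(format(shl(0b00010110), "08b"))   # 00101100 (multiplied by 2)
print(format(shr(0b00010110), "08b"))   # 00001011 (divided by 2)
print(format(ashr(0b10010110), "08b"))  # 11001011 (sign bit preserved)
```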
Registers
A register is a very fast computer memory, used to store data and instructions in execution.
A register is a group of flip-flops, with each flip-flop capable of storing one bit of information. An n-bit register has a group of n flip-flops and is capable of storing n bits of binary information.
A register consists of a group of flip-flops and gates. The flip-flops hold the binary information and the gates control when and how new information is transferred into the register. Various types of registers are available commercially. The simplest register is one that consists of only flip-flops, with no external gates.
These days registers are also implemented as a register file.
Instruction Codes
A program, as we all know, is a set of instructions that specify the operations, the operands, and the sequence by which processing has to occur. An instruction code is a group of bits that tells the computer to perform a specific operation.
In computers with a single processor register, that register is known as the accumulator (AC). An operation is performed with the memory operand and the content of the AC.
Load (LD)
The lines from the common bus are connected to the inputs of each register and to the data inputs of the memory. The particular register whose LD input is enabled receives the data from the bus during the next clock pulse transition.
Before studying instruction formats, let us first study the operand address part. When the second part of an instruction code specifies an operand, the instruction is said to have an immediate operand. When the second part of the instruction code specifies the address of an operand, the instruction is said to have a direct address. And in indirect addressing, the second part of the instruction code specifies the address of a memory word in which the address of the operand is found.
Computer Instructions
The basic computer has three instruction code formats. The operation code (opcode) part of the instruction contains 3 bits, and the meaning of the remaining 13 bits depends upon the operation code encountered.
Register Reference Instruction
These instructions are recognized by the operation code 111 with a 0 in the leftmost bit of the instruction. The other 12 bits specify the operation to be executed.
Input-Output Instruction
These instructions are recognized by the operation code 111 with a 1 in the leftmost bit of the instruction. The remaining 12 bits are used to specify the input-output operation.
Format Of Instruction
The format of an instruction is depicted in a rectangular box symbolizing the bits of the instruction. The basic fields of an instruction format are given below:
An operation code field that specifies the operation to be performed.
An address field that designates the memory address or register.
A mode field that specifies the way the operand or effective address is determined.
Computers may have instructions of different lengths containing varying numbers of addresses. The number of address fields in the instruction format depends upon the internal organization of its registers.
Immediate Mode
In this mode, the operand is specified in the instruction itself. An immediate mode instruction has an operand field rather than an address field.
For example: ADD 7, which says add 7 to the contents of the accumulator. 7 is the operand here.
Register Mode
In this mode the operand is stored in a register, and this register is present in the CPU. The instruction has the address of the register where the operand is stored.
Advantages
Shorter instructions and faster instruction fetch.
Faster memory access to the operand(s).
Disadvantages
Very limited address space. Using multiple registers helps performance but complicates the instructions.
In auto increment/decrement mode, the register is incremented or decremented after or before its value is used.
For example: ADD R1, 4000 − in this, 4000 is the effective address of the operand.
Displacement Addressing Mode
In this mode the contents of the index register are added to the address part of the instruction to obtain the effective address of the operand.
EA = A + (R). In this, the address field holds two values: A (the base value) and R (the register that holds the displacement), or vice versa.
Instruction Cycle
The instruction cycle, also known as the fetch-decode-execute cycle, is the basic operational process of a computer. This process is repeated continuously by the CPU from boot up to shut down of the computer.
Read The Effective Address
If the instruction has an indirect address, the effective address is read from memory. Otherwise, in the case of an immediate operand instruction, the operand is read directly.
Memory-Reference Instructions
The basic computer has a 16-bit instruction register (IR) which can denote a memory reference, register reference, or input-output instruction.
Memory Reference – these instructions refer to memory address as an operand. the other operand
is always accumulator. specifies 12-bit address, 3-bit opcode (other than 111) and 1-bit addressing
mode for directand indirect addressing.
Example –
IR register contains 0001xxxxxxxxxxxx, i.e. ADD. After fetching and decoding the instruction, we find that it is a memory-reference instruction for the ADD operation.
Hence: DR ← M[AR], AC ← AC + DR, SC ← 0
Input-Output Instructions
Input/Output – these instructions are for communication between the computer and the outside environment. IR(14–12) is 111 (which differentiates them from memory-reference instructions) and IR(15) is 1 (which differentiates them from register-reference instructions).
The remaining 12 bits specify the I/O operation.
Example –
IR register contains 1111100000000000, i.e. INP. After the fetch and decode cycle, we find that it is an input/output instruction for inputting a character. Hence, a character is input from the peripheral device.
The set of instructions incorporated in the 16-bit IR register are:
Arithmetic, logical, and shift instructions (AND, ADD, complement, circulate left/right, etc.)
Instructions to move information to and from memory (store the accumulator, load the accumulator)
Program control instructions with status conditions (branch, skip)
Input-output instructions (input character, output character)
Machine Language
Machine language, or machine code, is a low-level language comprised of binary digits (ones and zeros). High-level languages, such as Swift and C++, must be compiled into machine language before the code is run on a computer.
Since computers are digital devices, they only recognize binary data. Every program, video, image, and character of text is represented in binary. This binary data, or machine code, is processed as input by the CPU. The resulting output is sent to the operating system or an application, which displays the data visually. For example, the ASCII value for the letter "A" is 01000001 in machine code, but this data is displayed as "A" on the screen. An image may have thousands or even millions of binary values that determine the color of each pixel.
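The ASCII mapping described above can be checked with a few lines of Python:

```python
# The letter "A" is stored as the binary value 01000001 (decimal 65)
# and displayed as "A" on the screen.
code = ord("A")             # character -> numeric code
bits = format(code, "08b")  # numeric code -> 8-bit binary string
print(code, bits)           # 65 01000001
assert chr(0b01000001) == "A"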
While machine code is comprised of 1s and 0s, different processor architectures use different machine code. For example, a PowerPC processor, which has a RISC architecture, requires different code than an Intel x86 processor, which has a CISC architecture. A compiler must compile high-level source code for the correct processor architecture in order for a program to run correctly.
The exact machine language for a program or action can differ by operating system. The specific operating system dictates how a compiler writes a program or action into machine language.
Computer programs are written in one or more programming languages, like C++, Java, or Visual Basic. A computer cannot directly understand the programming languages used to create computer programs, so the program code must be compiled. Once a program's code is compiled, the computer can understand it because the program's code is turned into machine language.
Below is an example of machine language (binary) for the text "Hello World":
01001000 01100101 01101100 01101100 01101111 00100000 01010111
01101111 01110010 01101100 01100100
Below is another example of machine language (non-binary), which prints the letter "A" 1000 times to the computer screen.
169 1 160 0 153 0 128 153 0 129 153 130 153 0 131 200 208 241 96
Assembly Language
Sometimes referred to as assembly or ASM, an assembly language is a low-level programming language.
Programs written in assembly language are translated by an assembler. Every assembler has its own assembly language, which is designed for one specific computer architecture.
Is Asm Portable?
No. Because assembly languages are tied to one specific computer architecture, they are not portable. A program written in one assembly language would need to be completely rewritten for it to run on another type of machine.
Portability is one of the main advantages of higher-level languages. The C programming language is often called "portable assembly" because C compilers exist for nearly every modern system architecture. A program written in C may require some changes before it will compile on another computer, but the core language is portable.
Generally speaking, the higher-level a language is, the fewer changes need to be made for it to run on another architecture. The lowest-level languages – machine language and assembly language – are not portable.
Assembler
An assembler is a program used to convert or translate programs written in assembly code to machine code. Some users may also refer to assembly language or assembler language as "assembler."
An assembler is a program that converts assembly language into machine code. It takes the basic commands and operations from assembly code and converts them into binary code that can be recognized by a specific type of processor.
Assemblers are similar to compilers in that they produce executable code. However, assemblers are more simplistic since they only convert low-level code (assembly language) to machine code. Since each assembly language is designed for a specific processor, assembling a program is performed using a simple one-to-one mapping from assembly code to machine code. Compilers, on the other hand, must convert generic high-level source code into machine code for a specific processor.
Most programs are written in high-level programming languages and are compiled directly to machine code using a compiler. However, in some cases, assembly code may be used to customize functions and ensure they perform in a specific way. Therefore, IDEs often include assemblers so they can build programs from both high- and low-level languages.
How It Works:
Most computers come with a specified set of very basic instructions that correspond to the basic machine operations that the computer can perform. For example, a "load" instruction causes the processor to move a string of bits from a location in the processor's memory to a special holding place called a register. Assuming the processor has at least eight registers, each numbered, the following instruction would move the value (a string of bits of a certain length) at memory location 3000 into the holding place called register 8:
L 8,3000
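As an illustrative sketch only (the opcode values and field widths below are invented; real instruction sets define their own encodings), the assembler's one-to-one translation of such a statement into a bit pattern might look like:

```python
# Hypothetical sketch of the one-to-one mapping an assembler performs.
# The mnemonics and the 4/4/12-bit layout are assumptions made up for
# illustration, not the encoding of any real machine.
OPCODES = {"L": 0b0001, "ST": 0b0010, "ADD": 0b0011}

def assemble(mnemonic, reg, addr):
    """Pack a 'mnemonic reg,addr' statement into a 20-bit word:
    4-bit opcode | 4-bit register number | 12-bit address."""
    return (OPCODES[mnemonic] << 16) | (reg << 12) | addr

word = assemble("L", 8, 3000)   # the "L 8,3000" statement above
print(format(word, "020b"))
```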
The assembler program takes each program statement in the source program and generates a corresponding bit stream or pattern (a series of 0's and 1's of a given length).
The output of the assembler program is called the object code or object program, relative to the input source program. The sequence of 0's and 1's that constitutes the object program is sometimes called machine code.
The object program can then be run (or executed) whenever desired.
In the earliest computers, programmers actually wrote programs in machine code, but assembler languages or instruction sets were soon developed to speed up programming. Today, assembler programming is used only where very efficient control over processor operations is needed. It requires knowledge of a particular computer's instruction set, however. Historically, most programs have been written in "higher-level" languages such as COBOL, FORTRAN, PL/I, and C. These languages are easier to learn and faster to write programs with than assembler language. The program that processes the source code written in these languages is called a compiler. Like the assembler, a compiler takes higher-level language statements and reduces them to machine code.
Program Loops
Loops are among the most basic and powerful of programming concepts. A loop in a computer program is an instruction that repeats until a specified condition is reached. In a loop structure, the loop asks a question. If the answer requires action, it is executed. The same question is asked again and again until no further action is required. Each time the question is asked is called an iteration.
A computer programmer who needs to use the same lines of code many times in a program can use a loop to save time.
Just about every programming language includes the concept of a loop. High-level languages accommodate several types of loops. C, C++, and C# are all high-level programming languages and have the capacity to use several types of loops.
Types Of Loops
A for loop is a loop that runs for a preset number of times.
A while loop is a loop that is repeated as long as an expression is true. An expression is a statement that has a value.
A do while loop, or repeat until loop, repeats until an expression becomes false.
An infinite or endless loop is a loop that repeats indefinitely because it has no terminating condition, the exit condition is never met, or the loop is instructed to start over from the beginning. Although it is possible for a programmer to intentionally use an infinite loop, they are often mistakes made by new programmers.
A nested loop appears inside any other for, while, or do while loop.
A goto statement can create a loop by jumping backward to a label, although this is generally discouraged as a bad programming practice. For some complex code, it allows a jump to a common exit point that simplifies the code.
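The loop types above can be sketched in Python (which has no built-in do while; the usual idiom is a `while True` loop with a `break`):

```python
total = 0
for i in range(3):      # for loop: runs a preset number of times
    total += i

n = total               # while loop: repeats while the expression is true
while n > 0:
    n -= 1

count = 0
while True:             # do-while idiom: the body runs at least once,
    count += 1          # repeating until the condition becomes false
    if count >= 3:
        break

for i in range(2):      # nested loop: one loop inside another
    for j in range(2):
        pass

print(total, n, count)  # 3 0 3
```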
Subroutine
A set of instructions which is used repeatedly in a program can be referred to as a subroutine. Only one copy of these instructions is stored in memory. When a subroutine is required, it can be called many times during the execution of a particular program. A call subroutine instruction calls the subroutine. Care should be taken while returning from a subroutine, as a subroutine can be called from different places in memory.
The content of the PC must be saved by the call subroutine instruction to make a correct return to the calling program.
Figure – Process of a subroutine in a program
Subroutine linkage is the method by which the computer calls and returns from a subroutine. The simplest way of subroutine linkage is saving the return address in a specific location, such as a register, which can be called a link register.
Subroutine Nesting –
Subroutine nesting is a common programming practice in which one subroutine calls another subroutine.
Figure – A subroutine calling another subroutine
From the above figure, assume that when subroutine 1 calls subroutine 2, the return address of subroutine 2 should be saved somewhere. If the link register stores the return address of subroutine 1, it will be destroyed/overwritten by the return address of subroutine 2. Since the last subroutine called is the first one to be returned from (last-in, first-out), a stack data structure is the most efficient way to store the return addresses of the subroutines.
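A small sketch (not from the text; the addresses are invented) of why a stack suits nested calls: return addresses are pushed on call and popped on return, so the last subroutine called is the first one returned from.

```python
# Return addresses for nested subroutine calls, kept on a stack (LIFO).
return_stack = []

def call(return_address):
    return_stack.append(return_address)   # push the return address

def ret():
    return return_stack.pop()             # pop the most recent one

call(100)   # main calls subroutine 1; return to address 100
call(200)   # subroutine 1 calls subroutine 2; return to address 200
first = ret()
second = ret()
print(first, second)  # 200 100: subroutine 2 returns before subroutine 1
```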
Figure – Return address of subroutine is stored in stack memory
Stack Memory –
A stack is a basic data structure which can be implemented anywhere in memory. It can be used to store variables which may be required later in the program's execution. In a stack, the first data item put in will be the last to get out, and the last data item added will be the first one to come out (last in, first out).
Figure – Stack memory holding data A, B, and C
So, from the diagram above, first A is added, then B and C. While removing, first C is removed, then B and A.
Design Of Control Unit
The control unit is classified into two major categories:
Hardwired control
Microprogrammed control
A hardwired control consists of two decoders, a sequence counter, and a number of logic gates.
An instruction fetched from the memory unit is placed in the instruction register (IR).
The components of the instruction register include the I bit, the operation code, and bits 0 through 11.
The operation code in bits 12 through 14 is decoded with a 3 x 8 decoder.
The outputs of the decoder are designated by the symbols D0 through D7.
Bit 15 of the instruction is transferred to a flip-flop designated by the symbol I.
Bits 0 through 11 are applied to the control logic gates.
The sequence counter (SC) can count in binary from 0 through 15.
Micro-Programmed Control
The microprogrammed control organization is implemented by using a programming approach.
In microprogrammed control, the micro-operations are performed by executing a program consisting of micro-instructions.
The following image shows the block diagram of a microprogrammed control organization.
The control memory address register specifies the address of the micro-instruction.
The control memory is assumed to be a ROM, within which all control information is permanently stored.
The control register holds the microinstruction fetched from memory.
The micro-instruction contains a control word that specifies one or more micro-operations for the data processor. While the micro-operations are being executed, the next address is computed in the next address generator circuit and then transferred into the control address register to read the next microinstruction.
The next address generator is often referred to as a microprogram sequencer, as it determines the address sequence that is read from control memory.
Dynamic Microprogramming:
A more advanced development known as dynamic microprogramming permits a microprogram to be loaded initially from an auxiliary memory such as a magnetic disk. Control units that use dynamic microprogramming employ a writable control memory; this type of memory can be written to as well as read.
Control Memory:
Control memory is the storage in the microprogrammed control unit that stores the microprogram.
Control Word:
The control variables at any given time can be represented by a string of 1's and 0's called a control word.
Micro Instruction:
A symbolic microprogram can be translated into its binary equivalent by means of an assembler.
Each line of the assembly language microprogram defines a symbolic microinstruction.
Each symbolic microinstruction is divided into five fields: label, microoperations, CD, BR, and AD.
Micro Program:
A sequence of microinstructions constitutes a microprogram.
Since alterations of the microprogram are not needed once the control unit is in operation, the control memory can be a read-only memory (ROM).
ROM words are made permanent during the hardware production of the unit.
The use of a microprogram involves placing all control variables in words of ROM, for use by the control unit through successive read operations.
The content of the word in ROM at a given address specifies a microinstruction.
Microcode:
Microinstructions can be saved by employing subroutines that use common sections of microcode.
For example, the sequence of micro-operations needed to generate the effective address of the operand for an instruction is common to all memory-reference instructions.
This sequence could be a subroutine that is called from within many other routines to execute the effective address computation.
Address Sequencing
Microinstructions are stored in control memory in groups, with each group specifying a routine.
To appreciate the address sequencing in a microprogram control unit, consider the steps that the control must undergo during the execution of a single computer instruction.
Step 1:
An initial address is loaded into the control address register when power is turned on in the computer.
This address is usually the address of the first microinstruction that activates the instruction fetch routine.
The fetch routine may be sequenced by incrementing the control address register through the rest of its microinstructions.
At the end of the fetch routine, the instruction is in the instruction register of the computer.
Step 2:
The control memory next must go through the routine that determines the effective address of the operand.
A machine instruction may have bits that specify various addressing modes, such as indirect address and index registers.
The effective address computation routine in control memory can be reached through a branch microinstruction, which is conditioned on the status of the mode bits of the instruction.
When the effective address computation routine is completed, the address of the operand is available in the memory address register.
Step 3:
The next step is to generate the microoperations that execute the instruction fetched from memory.
The microoperation steps to be generated in the processor registers depend on the operation code part of the instruction.
Each instruction has its own microprogram routine stored in a given location of control memory.
The transformation from the instruction code bits to an address in control memory where the routine is located is referred to as a mapping process.
A mapping procedure is a rule that transforms the instruction code into a control memory address.
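One simple mapping rule can be sketched as follows (modeled on Mano's basic computer, where a 4-bit opcode is placed into a 7-bit control memory address as 0 xxxx 00; treat the exact widths as an assumption):

```python
# Sketch of a mapping process: the opcode selects the starting address
# of its routine in control memory. Here a 4-bit opcode xxxx maps to the
# 7-bit address 0 xxxx 00, leaving four words per routine.
def map_opcode(opcode):
    """Transform a 4-bit opcode into a 7-bit control memory address."""
    return (opcode & 0b1111) << 2   # 0 | opcode | 00

print(format(map_opcode(0b0001), "07b"))  # 0000100
```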
Step 4:
Once the required routine is reached, the microinstructions that execute the instruction may be sequenced by incrementing the control address register.
Microprograms that employ subroutines will require an external register for storing the return address.
Return addresses cannot be stored in ROM because the unit has no writing capability.
When the execution of the instruction is completed, control must return to the fetch routine.
This is accomplished by executing an unconditional branch microinstruction to the first address of the fetch routine.
Control Unit
The control unit extracts instructions from memory and decodes and executes them.
The control unit acts as an intermediary that decodes the instructions sent to the processor, tells the other units, such as the arithmetic logic unit, what to do by providing control signals, and then sends the processed data back to memory.
To function properly, the CPU relies on the system clock, memory, secondary storage, and data and address buses.
Smaller devices like mobile phones, calculators, handheld gaming systems, and tablets use smaller-sized processors known as ARM CPUs to accommodate their reduced size and space.
The CPU is the heart and brain of a computer. It receives data input, executes instructions, and processes information. It communicates with input/output (I/O) devices, which send and receive data to and from the CPU.
Additionally, the microprocessor has an internal bus for communication with the internal cache memory, called the backside bus. The main bus for data transfer to and from the CPU, memory, chipset, and AGP socket is called the front-side bus.
The CPU contains internal memory units, which are called registers. These registers contain data, instructions, counters, and addresses used in the ALU's information processing.
Some computers utilize two or more processors. These consist of separate physical microprocessors located side by side on the same board or on separate boards. Each CPU has an independent interface, separate cache, and individual paths to the system front-side bus.
Multiple processors are ideal for intensive parallel tasks requiring multitasking. Multicore CPUs are also common, in which a single chip contains multiple CPUs.
Since the first microprocessor was released by Intel in November 1971, CPUs have increased their computing power severalfold.
The oldest Intel 4004 processor performed only 60,000 operations per second, while a modern Intel Pentium processor can perform about 188,000,000 instructions per second.
Types Of CPU:
CPUs are mostly manufactured by Intel and AMD, each of which manufactures its own types of CPUs. In modern times, there are many CPU types on the market.
Some of the basic types of CPUs are described below:
Single-core CPU: single core is the oldest type of computer CPU, which was used in the 1970s. It has only one core to process different operations. It can start only one operation at a time; the CPU switches back and forth between different sets of data streams when more than one program runs. So, it is not suitable for multitasking, as performance will be reduced if more than one application runs. The performance of these CPUs is mainly dependent on the clock speed. They are still used in various devices, such as smartphones.
Dual-core CPU: as the name suggests, a dual-core CPU contains two cores in a single integrated circuit (IC). Although each core has its own controller and cache, they are linked together to work as a single unit, and thus can perform faster than single-core processors and can handle multitasking more efficiently.
Quad-core CPU: this type of CPU comes with two dual-core processors in one integrated circuit (IC) or chip. So, a quad-core processor is a chip that contains four independent units called cores. These cores read and execute the CPU's instructions. The cores can run multiple instructions simultaneously, thereby increasing the overall speed for programs that are compatible with parallel processing.
A quad-core CPU uses a technology that allows four independent processing units (cores) to run in parallel on a single chip. Thus, by integrating multiple cores in a single CPU, higher performance can be achieved without boosting the clock speed. However, performance increases only when the computer's software supports multiprocessing. Software which supports multiprocessing divides the processing load between multiple processors instead of using one processor at a time.
History Of CPU:
Some of the important events in the development of the CPU, from its invention till date, are as follows:
In 1823, Baron Jöns Jacob Berzelius discovered silicon, which is the main component of the CPU to date.
In 1903, Nikola Tesla patented gates or switches, which are electrical logic circuits.
In December 1947, John Bardeen, William Shockley, and Walter Brattain invented the first transistor at Bell Laboratories and got it patented in 1948.
In 1958, the first working integrated circuit was developed by Robert Noyce and Jack Kilby.
In 1960, IBM established the first mass-production facility for transistors in New York.
In 1968, Robert Noyce and Gordon Moore founded Intel Corporation.
AMD (Advanced Micro Devices) was founded in May 1969.
In 1971, Intel introduced the first microprocessor, the Intel 4004, with the help of Ted Hoff.
In 1972, Intel introduced the 8008 processor; in 1978, the Intel 8086 was introduced, and in June 1979, the Intel 8088 was released.
In 1979, a 16/32-bit processor, the Motorola 68000, was released. Later, it was used as the processor for the Apple Macintosh and Amiga computers.
In 1987, Sun introduced the SPARC processor.
In March 1991, AMD introduced the Am386 microprocessor family.
In March 1993, Intel released the Pentium processor. In 1995, Cyrix introduced the Cx5x86 processor to give competition to Intel Pentium processors.
In January 1999, Intel introduced the Celeron 366 MHz and 400 MHz processors.
In April 2005, AMD introduced its first dual-core processor.
In 2006, Intel introduced the Core 2 Duo processor.
In 2007, Intel introduced different types of Core 2 Quad processors.
In April 2008, Intel introduced the first series of Intel Atom processors, the Z5xx series. They were single-core processors with a 200 MHz GPU.
In September 2009, Intel released the first Core i5 desktop processor with four cores.
In January 2010, Intel released many processors, such as the Core 2 Quad processor Q9500, the first Core i3 and i5 mobile processors, and the first Core i3 and i5 desktop processors. In July of the same year, it released the first Core i7 desktop processor with six cores.
In June 2017, Intel introduced the first Core i9 desktop processor.
In April 2018, Intel released the first Core i9 mobile processor.
For Example:
MULT R1, R2, R3
This is an instruction for an arithmetic multiplication written in assembly language. It uses three address fields, R1, R2, and R3. The meaning of this instruction is:
R1 ← R2 * R3
This instruction can also be written using only two address fields as:
MULT R1, R2
In this instruction, the destination register is the same as one of the source registers. This means the operation is:
R1 ← R1 * R2
The use of a large number of registers results in short programs with limited instructions.
Some examples of general-register-based CPU organization are the IBM 360 and PDP-11.
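The register transfers above can be traced with a dictionary standing in for the CPU's general registers (a sketch, with made-up initial values):

```python
# Toy register file with invented initial values.
regs = {"R1": 2, "R2": 3, "R3": 4}

# Three-address form: MULT R1, R2, R3  means  R1 <- R2 * R3
regs["R1"] = regs["R2"] * regs["R3"]
print(regs["R1"])  # 12

# Two-address form: MULT R1, R2  means  R1 <- R1 * R2
# (the destination register doubles as one of the sources)
regs["R1"] = regs["R1"] * regs["R2"]
print(regs["R1"])  # 36
```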
PUSH
This operation inserts one operand at the top of the stack and decrements the stack pointer register. The format of the push instruction is:
It inserts the data word at the specified address onto the top of the stack. It can be implemented as:
// decrement SP by 1
SP ← SP - 1
POP –
This operation deletes one operand from the top of the stack and increments the stack pointer register. The format of the pop instruction is:
POP
It deletes the data word at the top of the stack to the specified address. It can be implemented as:
// increment SP by 1
SP ← SP + 1
An operation-type instruction does not need an address field in this CPU organization. This is because the operation is performed on the two operands that are on the top of the stack. For example:
SUB
This instruction contains the opcode only, with no address field. It pops the two top data items from the stack, subtracts them, and pushes the result onto the top of the stack.
The PDP-11, Intel's 8085, and the HP 3000 are some examples of stack-organized computers.
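The PUSH, POP, and zero-address SUB behavior can be sketched with a Python list as the stack (a simplification: no stack pointer register or memory array is modeled):

```python
# Zero-address (stack-organized) machine sketch: SUB carries no address
# field and operates on the two items at the top of the stack.
stack = []

def push(value):
    stack.append(value)

def sub():
    top = stack.pop()          # topmost operand
    below = stack.pop()        # next operand down
    stack.append(below - top)  # push the result back on top

push(9)
push(4)
sub()
print(stack[-1])  # 5
```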
Execution of instructions is fast because the operand data are stored in consecutive memory locations.
The length of instructions is short, as they do not have an address field.
Computer Instructions
The basic computer has three instruction code formats. The operation code (opcode) part of the instruction contains 3 bits, and the meaning of the remaining 13 bits depends upon the operation code encountered.
Input-Output Instruction
These instructions are recognized by the operation code 111 with a 1 in the leftmost bit of the instruction. The remaining 12 bits are used to specify the input-output operation.
Format Of Instruction
The format of an instruction is depicted in a rectangular box symbolizing the bits of the instruction.
The basic fields of an instruction format are given below:
An operation code field that specifies the operation to be performed.
An address field that designates the memory address or register.
A mode field that specifies the way the operand or effective address is determined.
Computers may have instructions of different lengths containing varying numbers of addresses. The number of address fields in the instruction format depends upon the internal organization of its registers.
Addressing Modes
The operation field of an instruction specifies the operation to be performed. This operation will be executed on some data which is stored in computer registers or in main memory. The way any operand is selected during program execution depends on the addressing mode of the instruction. The purposes of using addressing modes are as follows:
To give programming versatility to the user.
To reduce the number of bits in the addressing field of the instruction.
Immediate Mode
In this mode, the operand is specified in the instruction itself. An immediate mode instruction has an operand field rather than an address field.
For example: ADD 7, which says add 7 to the contents of the accumulator. 7 is the operand here.
Register Mode
In this mode the operand is stored in a register, and this register is present in the CPU. The instruction has the address of the register where the operand is stored.
Advantages
Shorter instructions and faster instruction fetch.
Faster memory access to the operand(s).
Disadvantages
Very limited address space.
Using multiple registers helps performance but complicates the instructions.
Auto Increment/Decrement Mode
In this mode the register is incremented or decremented after or before its value is used.
For example: ADD R1, 4000 – in this, 4000 is the effective address of the operand.
Indirect Addressing Mode
In this mode, the address field of the instruction gives the address where the effective address is stored in memory. This slows down execution, as it requires multiple memory lookups to find the operand.
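A sketch contrasting direct and indirect addressing on a toy memory array (the addresses and values are invented for illustration):

```python
# Toy memory for comparing addressing modes.
memory = [0] * 16
memory[5] = 70   # the operand itself
memory[3] = 5    # for indirect addressing: the address of the operand

def direct(addr):
    return memory[addr]          # one memory lookup

def indirect(addr):
    return memory[memory[addr]]  # two lookups: fetch the address, then the operand

print(direct(5), indirect(3))  # 70 70
```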
RISC Processor
RISC stands for reduced instruction set computer. It is a type of microprocessor that has a limited number of instructions. RISC processors can execute their instructions very fast because the instructions are very small and simple.
RISC chips require fewer transistors, which makes them cheaper to design and produce. In RISC, the instruction set contains simple and basic instructions from which more complex instructions can be composed. Most instructions complete in one cycle, which allows the processor to handle many instructions at the same time.
Instructions are register-based, and data transfer takes place from register to register.
CISC Processor
CISC stands for complex instruction set computer.
It was first developed by Intel.
It contains a large number of complex instructions.
Instructions are not register-based.
Instructions cannot be completed in one machine cycle.
Data transfer is from memory to memory.
A microprogrammed control unit is found in CISC.
CISC processors also have variable instruction formats.
Instruction size and format – CISC: a large set of instructions with variable formats (16–64 bits per instruction). RISC: a small set of instructions with a fixed format (32 bits).
CPU control – CISC: mostly microcoded using control memory (ROM), though modern CISC processors use hardwired control. RISC: mostly hardwired, without control memory.
Parallel Processing
Parallel processing can be described as a class of techniques which enables the system to achieve simultaneous data-processing tasks to increase the computational speed of a computer system.
A parallel processing system can carry out simultaneous data-processing to achieve faster execution time. For instance, while an instruction is being processed in the ALU component of the CPU, the next instruction can be read from memory.
The primary purpose of parallel processing is to enhance the computer's processing capability and increase its throughput, i.e., the amount of processing that can be accomplished during a given interval of time. A parallel processing system can be achieved by having a multiplicity of functional units that perform identical or different operations simultaneously. The data can be distributed among the various functional units.
The following diagram shows one possible way of separating the execution unit into eight functional units operating in parallel.
The operation performed in each functional unit is indicated in each block of the diagram:
The adder and integer multiplier perform arithmetic operations on integer numbers.
The floating-point operations are separated into three circuits operating in parallel.
The logic, shift, and increment operations can be performed concurrently on different data. All units are independent of each other, so one number can be shifted while another number is being incremented.
Pipelining
The term pipelining refers to a technique of decomposing a sequential process into sub-operations, with each sub-operation being executed in a dedicated segment that operates concurrently with all other segments.
The most important characteristic of a pipeline technique is that several computations can be in progress in distinct segments at the same time. The overlapping of computation is made possible by associating a register with each segment in the pipeline. The registers provide isolation between segments so that each can operate on distinct data simultaneously.
The structure of a pipeline organization can be represented simply by including an input register for each segment followed by a combinational circuit.
Consider an example of a combined multiplication and addition operation to get a better understanding of the pipeline organization.
The combined multiplication and addition operation is done with a stream of numbers such as:
Ai * Bi + Ci, for i = 1, 2, 3, ..., 7
The operation to be performed on the numbers is decomposed into sub-operations, with each sub-operation implemented in a segment within a pipeline.
The sub-operations performed in each segment of the pipeline are defined as:
R1 ← Ai, R2 ← Bi (input Ai and Bi)
R3 ← R1 * R2, R4 ← Ci (multiply and input Ci)
R5 ← R3 + R4 (add Ci to the product)
The following block diagram represents the combined operation as well as the sub-operations performed in each segment of the pipeline.
Registers R1, R2, R3, and R4 hold the data, and the combinational circuits operate in a particular segment.
The output generated by the combinational circuit in a given segment is applied as input to the register of the next segment. For instance, from the block diagram, we can see that the register R3 is used as one of the input registers for the combinational adder circuit.
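The register transfers above can be sketched in software. The following Python fragment is an illustrative simulation, not a hardware description: the function name and the two "drain" clocks at the end are assumptions, and a loop iteration stands in for one clock period in which all segments act at once.

```python
def pipeline_multiply_add(a, b, c):
    """Simulate the three-segment pipeline computing Ai * Bi + Ci."""
    results = []
    seg1 = None          # holds (R1, R2, and Ci latched for the next segment)
    seg2 = None          # holds (R3, R4)
    # Feed the stream, then two empty clocks to drain the pipeline.
    for item in list(zip(a, b, c)) + [None, None]:
        if seg2 is not None:                 # Segment 3: R5 <- R3 + R4
            r3, r4 = seg2
            results.append(r3 + r4)
        if seg1 is not None:                 # Segment 2: R3 <- R1*R2, R4 <- Ci
            r1, r2, ci = seg1
            seg2 = (r1 * r2, ci)
        else:
            seg2 = None
        seg1 = item                          # Segment 1: R1 <- Ai, R2 <- Bi
    return results
```

After the pipeline fills, one result emerges per clock even though each individual result takes three clocks to compute.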
In general, pipeline organization is applicable to two areas of computer design:
Arithmetic pipeline
Instruction pipeline
Arithmetic Pipeline
Arithmetic pipelines are mostly used in high-speed computers. They are used to implement floating-point operations, multiplication of fixed-point numbers, and similar computations encountered in scientific problems.
To understand the concepts of the arithmetic pipeline more conveniently, consider an example of a pipeline unit for floating-point addition and subtraction.
The inputs to the floating-point adder pipeline are two normalized floating-point binary numbers defined as:
X = A * 2^a = 0.9504 * 10^3
Y = B * 2^b = 0.8200 * 10^2
where A and B are two fractions that represent the mantissas, and a and b are the exponents.
The combined operation of floating-point addition and subtraction is divided into four segments. Each segment contains the corresponding suboperation to be performed in the given pipeline. The suboperations performed in the four segments are:
Compare the exponents by subtraction.
Align the mantissas.
Add or subtract the mantissas.
Normalize the result.
The following block diagram represents the suboperations performed in each segment of the pipeline.
Note: Registers are placed after each suboperation to store the intermediate results.
After comparing the exponents (3 - 2 = 1) and aligning the mantissa of Y by shifting it one position to the right, the two numbers become:
X = 0.9504 * 10^3
Y = 0.08200 * 10^3
Add Mantissas:
The two mantissas are added in segment three.
Z = X + Y = 1.0324 * 10^3
After normalizing the result in segment four:
Z = 0.10324 * 10^4
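The four suboperations can be walked through in code. This is a minimal sketch using decimal (mantissa, exponent) pairs rather than binary hardware; the function name and the loop-based normalization are assumptions for illustration only.

```python
def float_add_pipeline(xm, xe, ym, ye):
    """Four-segment floating-point addition on decimal (mantissa, exponent) pairs.

    Mantissas are normalized fractions in the range [0.1, 1); exponents are
    powers of 10, matching the worked example above.
    """
    # Segment 1: compare the exponents by subtraction.
    diff = xe - ye
    # Segment 2: align the mantissa of the number with the smaller exponent.
    if diff > 0:
        ym, ye = ym / 10 ** diff, ye + diff
    elif diff < 0:
        xm, xe = xm / 10 ** (-diff), xe - diff
    # Segment 3: add the mantissas.
    zm, ze = xm + ym, xe
    # Segment 4: normalize the result back into [0.1, 1).
    while zm >= 1.0:
        zm, ze = zm / 10, ze + 1
    while 0 < zm < 0.1:
        zm, ze = zm * 10, ze - 1
    return zm, ze

m, e = float_add_pipeline(0.9504, 3, 0.8200, 2)
# m is approximately 0.10324 and e == 4, i.e. Z = 0.10324 * 10^4
```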
Instruction Pipeline
Pipeline processing can occur not only in the data stream but in the instruction stream as well. Most digital computers with complex instructions require an instruction pipeline to carry out operations like fetching, decoding, and executing instructions.
In general, the computer needs to process each instruction with the following sequence of steps:
Fetch the instruction from memory.
Decode the instruction.
Calculate the effective address.
Fetch the operands from memory.
Execute the instruction.
Store the result in the proper place.
Each step is executed in a particular segment, and there are times when different segments may take different times to operate on the incoming information. Moreover, there are times when two or more segments may require memory access at the same time, causing one segment to wait until another is finished with the memory.
The organization of an instruction pipeline will be more efficient if the instruction cycle is divided into segments of equal duration. One of the most common examples of this type of organization is a four-segment instruction pipeline.
A four-segment instruction pipeline combines two or more of the different steps into a single segment. For instance, the decoding of the instruction can be combined with the calculation of the effective address into one segment. The following block diagram shows a typical example of a four-segment instruction pipeline. The instruction cycle is completed in four segments.
Segment 1:
The instruction fetch segment can be implemented using a first-in, first-out (FIFO) buffer.
Segment 2:
The instruction fetched from memory is decoded in the second segment, and eventually, the effective address is calculated in a separate arithmetic circuit.
Segment 3:
An operand from memory is fetched in the third segment.
Segment 4:
The instructions are finally executed in the last segment of the pipeline organization.
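The overlap of the four segments can be visualized with a small space-time table. This is a hypothetical sketch (the segment abbreviations FI, DA, FO, EX and the function name are assumptions); it shows that n instructions complete in k + (n - 1) clocks on a k-segment pipeline instead of k * n clocks.

```python
def pipeline_timetable(n_instructions, segments=("FI", "DA", "FO", "EX")):
    """Build a space-time diagram: one row per clock, mapping each
    in-flight instruction to the segment it occupies."""
    k = len(segments)
    total_clocks = k + n_instructions - 1
    table = []
    for clock in range(1, total_clocks + 1):
        row = {}
        for i in range(n_instructions):
            seg_index = clock - 1 - i        # instruction i enters at clock i+1
            if 0 <= seg_index < k:
                row[f"I{i + 1}"] = segments[seg_index]
        table.append((clock, row))
    return table

for clock, row in pipeline_timetable(3):
    print(clock, row)
# 3 instructions finish in 6 clocks (4 + 2), versus 12 without pipelining.
```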
Advantages Of Pipelining
The cycle time of the processor is reduced.
It increases the throughput of the system.
It makes the system reliable.
Disadvantages Of Pipelining
The design of a pipelined processor is complex and costly to manufacture.
Instruction latency is higher.
Vector (Array) Processing
There is a class of computational problems that are beyond the capabilities of a conventional computer. These problems require a vast number of computations on multiple data items that would take a conventional computer (with a scalar processor) days or even weeks to complete.
Such complex instructions, which operate on multiple data items at the same time, require a better way of instruction execution, which was achieved by vector processors.
Scalar CPUs can manipulate one or two data items at a time, which is not very efficient. Also, simple instructions like add A to B and store into C are not practically efficient.
Addresses are used to point to the memory location where the data to be operated on will be found, which leads to the added overhead of data lookup. So, until the data is found, the CPU would be sitting idle, which is a big performance issue.
Hence, the concept of the instruction pipeline comes into the picture, in which the instruction passes through several sub-units in turn. These sub-units perform various independent functions: for example, the first one decodes the instruction, the second sub-unit fetches the data, and the third sub-unit performs the math itself. Therefore, while the data is being fetched for one instruction, the CPU does not sit idle; it works on decoding the next instruction, ending up working like an assembly line.
A vector processor not only uses an instruction pipeline but also pipelines the data, working on multiple data items at the same time. A normal scalar processor instruction would be add A, B, which leads to the addition of two operands. But what if we could instruct the processor to add a group of numbers (from memory location 0 to n) to another group of numbers (say, memory location n to k)? This can be achieved by vector processors.
In a vector processor, a single instruction can ask for multiple data operations, which saves time, as the instruction is decoded once and then keeps operating on different data items.
Applications Of Vector Processors
Computers with vector processing capabilities are in demand in specialized applications. The following are some areas where vector processing is used:
Petroleum exploration.
Medical diagnosis.
Data analysis.
Weather forecasting.
Aerodynamics and space flight simulations.
Image processing.
Artificial intelligence.
Why Use The Array Processor
Array processors increase the overall instruction processing speed.
As most array processors operate asynchronously from the host CPU, they improve the overall capacity of the system.
Array processors have their own local memory, hence providing extra memory for systems with low memory.
Peripheral Devices
Input or output devices that are connected to a computer are called peripheral devices. These devices are designed to read information into or out of the memory unit upon command from the CPU and are considered to be part of the computer system. These devices are also called peripherals.
For example: keyboards, display units, and printers are common peripheral devices.
An interface is a shared boundary between two separate components of the computer system which can be used to attach two or more components to the system for communication purposes.
There are two types of interfaces:
CPU interface
I/O interface
Input-Output Interface
Peripherals connected to a computer need special communication links for interfacing with the CPU. In a computer system, there are special hardware components between the CPU and peripherals to control or manage the input-output transfers. These components are called input-output interface units because they provide communication links between the processor bus and peripherals. They provide a method for transferring information between the internal system and input-output devices.
Modes Of I/O Data Transfer
Data transfer between the central unit and I/O devices can generally be handled in three modes, which are given below:
Programmed I/O
Interrupt-initiated I/O
Direct memory access
Programmed I/O
Programmed I/O transfers are the result of I/O instructions written in a computer program. Each data item transfer is initiated by an instruction in the program.
Usually, the program controls data transfer to and from the CPU and peripheral. Transferring data under programmed I/O requires constant monitoring of the peripherals by the CPU.
Interrupt Initiated I/O
In the programmed I/O method, the CPU stays in a program loop until the I/O unit indicates that it is ready for data transfer. This is a time-consuming process because it keeps the processor busy needlessly.
This problem can be overcome by using interrupt-initiated I/O. In this method, when the interface determines that the peripheral is ready for data transfer, it generates an interrupt. After receiving the interrupt signal, the CPU stops the task it is processing, services the I/O transfer, and then returns to its previous processing task.
Direct Memory Access
Removing the CPU from the path and letting the peripheral device manage the memory buses directly would improve the speed of transfer. This technique is known as DMA.
In this, the interface transfers data to and from the memory through the memory bus. A DMA controller manages the transfer of data between peripherals and the memory unit.
Many hardware systems use DMA, such as disk drive controllers, graphics cards, network cards, and sound cards. It is also used for intra-chip data transfer in multicore processors. In DMA, the CPU initiates the transfer, performs other operations while the transfer is in progress, and receives an interrupt from the DMA controller when the transfer has been completed.
Input/Output Processor
An input-output processor (IOP) is a processor with direct memory access capability. In this organization, the computer system is divided into a memory unit and a number of processors.
Each IOP controls and manages the input-output tasks. The IOP is similar to a CPU except that it handles only the details of I/O processing. The IOP can fetch and execute its own instructions. These IOP instructions are designed to manage I/O transfers only.
Block Diagram Of I/O Processor
Below is a block diagram of a computer along with various I/O processors. The memory unit occupies the central position and can communicate with each processor.
The CPU processes the data required for solving computational tasks. The IOP provides a path for the transfer of data between peripherals and memory. The CPU assigns the task of initiating the I/O program.
The IOP operates independently of the CPU and transfers data between peripherals and memory.
The communication between the IOP and the devices is similar to the program control method of transfer, and the communication with the memory is similar to the direct memory access method.
In large-scale computers, each processor is independent of the other processors, and any processor can initiate an operation.
The CPU can act as the master and the IOP as a slave processor. The CPU assigns the task of initiating operations, but it is the IOP, not the CPU, that executes the instructions. CPU instructions provide operations to start an I/O transfer. The IOP requests the CPU's attention through an interrupt.
Instructions that are read from memory by an IOP are also called commands, to distinguish them from instructions that are read by the CPU. Commands are prepared by programmers and are stored in memory. Command words make up the program for the IOP. The CPU informs the IOP where to find the commands in memory.
Interrupts
Data transfer between the CPU and the peripherals is initiated by the CPU. But the CPU cannot start the transfer unless the peripheral is ready to communicate with the CPU. When a device is ready to communicate with the CPU, it generates an interrupt signal. A number of input-output devices are attached to the computer, and each device is able to generate an interrupt request.
The main job of the interrupt system is to identify the source of the interrupt. There is also a possibility that several devices will request CPU communication simultaneously. Then, the interrupt system has to decide which device is to be serviced first.
Priority Interrupt
A priority interrupt is a system that decides the priority at which various devices, which generate interrupt signals at the same time, will be serviced by the CPU. The system has the authority to decide which conditions are allowed to interrupt the CPU while some other interrupt is being serviced. Generally, devices with high-speed transfer, such as magnetic disks, are given high priority, and slow devices, such as keyboards, are given low priority.
When two or more devices interrupt the computer simultaneously, the computer services the device with the higher priority first.
Types Of Interrupts
Following are some different types of interrupts:
Hardware Interrupts
When the signal for the processor comes from an external device or hardware, the interrupt is known as a hardware interrupt. An example: when we press any key on the keyboard to perform some action, this keypress generates an interrupt signal for the processor to perform a certain action.
Such an interrupt can be of two types:
Maskable Interrupt
Hardware interrupts that can be delayed when a much higher-priority interrupt has occurred at the same time.
Normal Interrupt
The interrupts that are caused by software instructions are called normal software interrupts.
Exception
Unplanned interrupts that are produced during the execution of a program are called exceptions, such as division by zero.
Memory Organization In Computer Architecture
A memory unit is a collection of storage units or devices. The memory unit stores binary information in the form of bits. Generally, memory/storage is classified into two categories:
Volatile memory: This loses its data when power is switched off.
Non-volatile memory: This is permanent storage and does not lose any data when power is switched off.
Memory Hierarchy
A memory unit is an essential component in any digital computer since it is needed for storing programs and data.
Typically, a memory unit can be classified into two categories:
The memory unit that establishes direct communication with the CPU is called main memory. The main memory is often referred to as RAM (random access memory).
The memory units that provide backup storage are called auxiliary memory. For instance, magnetic disks and magnetic tapes are the most used auxiliary memories.
Auxiliary memory access time is generally 1000 times that of main memory; hence it is at the bottom of the hierarchy.
The main memory occupies the central position because it is equipped to communicate directly with the CPU and with auxiliary memory devices through the input/output processor (I/O).
When a program not residing in main memory is needed by the CPU, it is brought in from auxiliary memory. Programs not currently needed in main memory are transferred into auxiliary memory to provide space in main memory for other programs that are currently in use.
The cache memory is used to store program data which is currently being executed by the CPU. The approximate access time ratio between cache memory and main memory is about 1 to 7-10.
Apart from the basic classifications of a memory unit, the memory hierarchy consists of all the storage devices available in a computer system, ranging from the slow but high-capacity auxiliary memory to the relatively faster main memory. The total memory capacity of a computer can be visualized as a hierarchy of components, from the slow auxiliary memory to the fast main memory and to the smaller cache memory.
Main Memory
The memory unit that communicates directly with the CPU, auxiliary memory, and cache memory is called main memory. It is the central storage unit of the computer system. It is a large and fast memory used to store data during computer operations. Main memory is made up of RAM and ROM, with RAM integrated circuit chips holding the major share.
RAM: Random access memory.
DRAM: Dynamic RAM is made of capacitors and transistors and must be refreshed every 10-100 ms. It is slower and cheaper than SRAM.
SRAM: Static RAM has a six-transistor circuit in each cell and retains data until powered off.
NVRAM: Non-volatile RAM retains its data even when turned off. Example: flash memory.
ROM: Read-only memory is non-volatile and is more like permanent storage for information. It also stores the bootstrap loader program, used to load and start the operating system when the computer is turned on. PROM (programmable ROM), EPROM (erasable PROM), and EEPROM (electrically erasable PROM) are some commonly used ROMs.
Auxiliary Memory
Devices that provide backup storage are called auxiliary memory. For example, magnetic disks and tapes are commonly used auxiliary devices. Other devices used as auxiliary memory are magnetic drums, magnetic bubble memory, and optical disks.
It is not directly accessible to the CPU and is accessed using the input/output channels.
Auxiliary memory is known as the lowest-cost, highest-capacity, and slowest-access storage in a computer system. It is where programs and data are kept for long-term storage or when not in immediate use. The most common examples of auxiliary memories are magnetic tapes and magnetic disks.
Magnetic Disks
A magnetic disk is a type of memory constructed using a circular plate of metal or plastic coated with magnetized material. Usually, both sides of the disk are used to carry out read/write operations. Several disks may be stacked on one spindle, with a read/write head available on each surface. The following image shows the structural representation of a magnetic disk.
The memory bits are stored on the magnetized surface in spots along concentric circles called tracks. The tracks are commonly divided into sections called sectors.
Magnetic Tape
Magnetic tape is a storage medium that allows data archiving, collection, and backup for different kinds of data. The magnetic tape is constructed using a plastic strip coated with a magnetic recording medium. The bits are recorded as magnetic spots on the tape along several tracks. Usually, seven or nine bits are recorded simultaneously to form a character together with a parity bit.
Magnetic tape units can be halted, started to move forward or in reverse, or rewound. However, they cannot be started or stopped fast enough between individual characters. For this reason, information is recorded in blocks referred to as records.
Cache Memory
The data or contents of the main memory that are used frequently by the CPU are stored in the cache memory so that the processor can access that data in a shorter time.
Whenever the CPU needs to access memory, it first checks the cache memory. If the data is not found in cache memory, the CPU moves on to the main memory. It also transfers a block of recent data into the cache and keeps deleting old data in the cache to accommodate the new data.
Cache memory is placed between the CPU and the main memory. The block diagram for a cache memory can be represented as:
The cache is the fastest component in the memory hierarchy and approaches the speed of the CPU components.
The basic operation of a cache memory is as follows: when the CPU needs to access memory, the cache is examined. If the word is found in the cache, it is read from the fast memory.
If the word addressed by the CPU is not found in the cache, the main memory is accessed to read the word. A block of words containing the one just accessed is then transferred from main memory to cache memory. The block size may vary from one word (the one just accessed) to about 16 words adjacent to the one just accessed.
Hit Ratio
The performance of cache memory is frequently measured in terms of a quantity called hit ratio. When the CPU refers to memory and finds the word in the cache, it is said to produce a hit. If the word is not found in the cache, it is in main memory, and it counts as a miss.
The ratio of the number of hits divided by the total CPU references to memory (hits plus misses) is the hit ratio.
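The definition above translates directly into a one-line computation; the function name and the sample counts below are illustrative assumptions.

```python
def hit_ratio(hits, misses):
    """Hit ratio = hits / (hits + misses), i.e. hits over total CPU references."""
    return hits / (hits + misses)

# E.g. 970 cache hits out of 1000 memory references gives a hit ratio of 0.97,
# so the effective access time approaches that of the cache.
ratio = hit_ratio(970, 30)
```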
Associative Memory
It is also known as content addressable memory (CAM). It is a memory chip in which each bit position can be compared. The content is compared in each bit cell, which allows very fast table lookup. Since the entire chip can be compared, contents are stored without regard to an addressing scheme. These chips have less storage capacity than regular memory chips.
An associative memory can be considered a memory unit whose stored data can be identified for access by the content of the data itself rather than by an address or memory location.
When a write operation is performed on associative memory, no address or memory location is given for the word. The memory itself is capable of finding an empty unused location to store the word.
On the other hand, when a word is to be read from an associative memory, the content of the word, or part of the word, is specified. The words which match the specified content are located by the memory and are marked for reading.
The following diagram shows the block representation of an associative memory.
From the block diagram, we can say that an associative memory consists of a memory array and logic for m words with n bits per word.
The functional registers, like the argument register A and the key register K, each have n bits, one for each bit of a word. The match register M consists of m bits, one for each memory word.
The words which are kept in the memory are compared in parallel with the content of the argument register. The key register (K) provides a mask for choosing a particular field or key in the argument word. If the key register contains a binary value of all 1's, then the entire argument is compared with each memory word. Otherwise, only those bits in the argument that have 1's in their corresponding position of the key register are compared. Thus, the key provides a mask for identifying a piece of information which specifies how the reference to memory is made.
The following diagram represents the relation between the memory array and the external registers in an associative memory.
The cells present inside the memory array are marked by the letter C with two subscripts. The first subscript gives the word number and the second specifies the bit position in the word. For instance, the cell Cij is the cell for bit j in word i.
A bit Aj in the argument register is compared with all the bits in column j of the array, provided that Kj = 1. This process is done for all columns j = 1, 2, 3, ..., n. If a match occurs between all the unmasked bits of the argument and the bits in word i, the corresponding bit Mi in the match register is set to 1. If one or more unmasked bits of the argument and the word do not match, Mi is cleared to 0.
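The masked match can be sketched with integer bit operations: word i matches when every unmasked bit agrees with the argument, i.e. when (word XOR argument) AND key is zero. This is an illustrative model (the function name and the sample 9-bit words are assumptions), with the loop standing in for the parallel compare the hardware performs in one step.

```python
def cam_match(words, argument, key):
    """Return the match register M: bit i is 1 when word i agrees with
    the argument register A in every position where the key K holds a 1."""
    match = []
    for word in words:
        # Compare only the bit positions selected by the key register.
        match.append(int((word ^ argument) & key == 0))
    return match

words = [0b101111000, 0b100110100, 0b101000001]
a     = 0b101111100                     # argument register
k     = 0b111000000                     # mask: compare only the leftmost 3 bits
print(cam_match(words, a, k))           # [1, 0, 1]
```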
Memory Mapping And Concept Of Virtual Memory
The transformation of data from main memory to cache memory is called mapping. There are three main types of mapping:
Associative mapping
Direct mapping
Set-associative mapping
Associative Mapping
The associative memory stores both the address and the data. The 15-bit address value is shown as a 5-digit octal number, and the 12-bit data word is shown as a 4-digit octal number. A CPU address of 15 bits is placed in the argument register, and the associative memory is searched for a matching address.
Direct Mapping
The CPU address of 15 bits is divided into two fields. The 9 least significant bits constitute the index field, and the remaining 6 bits constitute the tag field. The number of bits in the index field is equal to the number of address bits required to access the cache memory.
The disadvantage of direct mapping is that two words with the same index address cannot reside in cache memory at the same time. This problem can be overcome by set-associative mapping.
In set-associative mapping, we can store two or more words of memory under the same index address. Each data word is stored together with its tag, and this forms a set.
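The direct-mapping address split described above can be sketched as follows; the constant and function names are assumptions, and the sample address is written Mano-style as a 5-digit octal number.

```python
# A 15-bit CPU address is divided into a 9-bit index (which cache word it
# maps to) and a 6-bit tag (stored alongside the data to detect conflicts).
INDEX_BITS = 9
TAG_BITS = 6

def split_address(address):
    """Return (tag, index) for a 15-bit CPU address."""
    index = address & ((1 << INDEX_BITS) - 1)      # low 9 bits
    tag = address >> INDEX_BITS                    # remaining 6 bits
    return tag, index

# Octal address 02777 -> tag 02, index 0777.
tag, index = split_address(0o02777)
print(oct(tag), oct(index))
```

Two addresses with equal index but different tags collide in the same cache word, which is exactly the conflict that set-associative mapping relaxes.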
Replacement Algorithms
Data is continuously replaced with new data in the cache memory using replacement algorithms. Following are the two replacement algorithms used:
FIFO (first in, first out): the oldest item is replaced with the newly fetched one.
LRU (least recently used): the item which has been least recently used by the CPU is removed.
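Both policies can be sketched on a tiny cache driven by a sequence of block numbers. This is a minimal illustration (the function names and the miss-counting interface are assumptions), not a hardware implementation.

```python
from collections import OrderedDict

def fifo_misses(refs, capacity):
    """Count misses under FIFO: evict the block that entered earliest."""
    queue, misses = [], 0
    for block in refs:
        if block not in queue:
            misses += 1
            if len(queue) == capacity:
                queue.pop(0)                # evict the oldest block
            queue.append(block)
    return misses

def lru_misses(refs, capacity):
    """Count misses under LRU: evict the block unused for the longest time."""
    cache, misses = OrderedDict(), 0
    for block in refs:
        if block in cache:
            cache.move_to_end(block)        # mark as most recently used
        else:
            misses += 1
            if len(cache) == capacity:
                cache.popitem(last=False)   # evict the least recently used
            cache[block] = True
    return misses
```

On the reference string 1, 2, 3, 1, 4, 1 with a three-block cache, FIFO evicts block 1 when 4 arrives and later misses on it again, while LRU keeps block 1 because it was recently reused.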
Virtual Memory
Virtual memory is the separation of logical memory from physical memory. This separation provides a large virtual memory for programmers when only a small physical memory is available.
Virtual memory is used to give programmers the illusion that they have a very large memory even though the computer has a small main memory. It makes the task of programming easier because the programmer no longer needs to worry about the amount of physical memory available.
The Memory Management Unit Performs Three Major Functions:
Hardware memory management
Operating system (OS) memory management
Application memory management
Hardware memory management deals with a system's RAM and cache memory, OS memory management regulates resources among objects and data structures, and application memory management allocates and optimizes memory among programs.
The MMU also includes a section of memory that holds a table matching virtual addresses to physical addresses, called the translation lookaside buffer (TLB).
Multiprocessor
A multiprocessor is a computer system in which two or more central processing units (CPUs) share full access to a common RAM. The main objective of using a multiprocessor is to boost the system's execution speed, with other objectives being fault tolerance and application matching.
There are two types of multiprocessors: shared memory multiprocessors and distributed memory multiprocessors. In shared memory multiprocessors, all the CPUs share the common memory, but in a distributed memory multiprocessor, every CPU has its own private memory.
Applications Of Multiprocessor
As a uniprocessor, such as single instruction, single data stream (SISD).
As a multiprocessor, such as single instruction, multiple data stream (SIMD), which is usually used for vector processing.
Multiple series of instructions in a single perspective, such as multiple instruction, single data stream (MISD), which is used for describing hyper-threading or pipelined processors.
Inside a single system for executing multiple, individual series of instructions in multiple perspectives, such as multiple instruction, multiple data stream (MIMD).
Benefits Of Using A Multiprocessor
Enhanced performance.
Multiple applications.
Multi-tasking inside an application.
High throughput and responsiveness.
Hardware sharing among CPUs.
Characteristics Of Multiprocessor
A multiprocessor system has two or more CPUs. It is an interconnection of two or more CPUs with memory and input-output equipment. The term "processor" in multiprocessor can mean either a central processing unit (CPU) or an input-output processor (IOP). However, a system with a single CPU and one or more IOPs is usually not included in the definition of a multiprocessor system unless the IOP has computational facilities comparable to a CPU. As it is most commonly defined, a multiprocessor system implies the existence of multiple CPUs, although usually there will be one or more IOPs as well. As mentioned earlier, multiprocessors are classified as multiple instruction stream, multiple data stream (MIMD) systems.
There are some similarities between multiprocessor and multicomputer systems since both support concurrent operations. However, there exists an important distinction between a system with multiple computers and a system with multiple processors. Computers are interconnected with each other by means of communication lines to form a computer network. The network consists of several autonomous computers that may or may not communicate with each other. A multiprocessor system is controlled by one operating system that provides interaction between processors, and all the components of the system cooperate in the solution of a problem.
Multiprocessing improves the reliability of the system so that a failure or error in one part has a limited effect on the rest of the system. If a fault causes one processor to fail, a second processor can be assigned to perform the functions of the disabled processor. The system as a whole can continue to function correctly, with perhaps some loss in efficiency.
The benefit derived from a multiprocessor organization is improved system performance. The system derives its high performance from the fact that computations can proceed in parallel in one of two ways:
Multiple independent jobs can be made to operate in parallel.
A single job can be partitioned into multiple parallel tasks.
An overall function can be partitioned into a number of tasks that each processor can handle individually. System tasks may be allocated to special-purpose processors whose design is optimized to perform certain types of processing efficiently. An example is a computer system where one processor performs the computations for an industrial process control while others monitor and control various parameters, such as temperature and flow rate.
Multiprocessors are classified by the way their memory is organized. A multiprocessor system with common shared memory is classified as a shared-memory or tightly coupled multiprocessor. This does not preclude each processor from having its own local memory. In fact, most commercial tightly coupled multiprocessors provide a cache memory with each CPU. In addition, there is a global common memory that all CPUs can access. Information can therefore be shared among the CPUs by placing it in the common global memory.
Multiport Memory
A multiport memory system employs separate buses between each memory module and each CPU. This is shown in the figure below for four CPUs and four memory modules (MMs). Each processor bus is connected to each memory module. A processor bus consists of the address, data, and control lines required to communicate with memory. The memory module is said to have four ports, and each port accommodates one of the buses. The module must have internal control logic to determine which port will have access to memory at any given time. Memory access conflicts are resolved by assigning fixed priorities to each memory port. The priority for memory access associated with each processor may be established by the physical port position that its bus occupies in each module. Thus CPU1 has priority over CPU2, CPU2 over CPU3, CPU3 over CPU4, and CPU4 has the lowest priority. The advantage of the multiport memory organization is the high transfer rate that can be achieved because of the multiple paths between processors and memory. The disadvantage is that it requires expensive memory control logic and a large number of cables and connectors. As a consequence, this interconnection structure is usually appropriate for systems with a small number of processors.
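The fixed-priority resolution described above can be sketched in a few lines. This is an illustrative sketch only; the function name and the convention that a lower port number means higher priority (CPU1 highest) are assumptions for the example, not part of any real memory controller:

```python
def grant_port(requests):
    """Grant memory access to the highest-priority requesting port.

    `requests` is a list of port numbers requesting the module in this
    cycle (1 = highest priority, e.g. CPU1's port); returns the winning
    port, or None when no port is requesting.
    """
    return min(requests) if requests else None

# CPU2 and CPU4 request the same module simultaneously; CPU2 wins.
print(grant_port([2, 4]))  # -> 2
```

Because the priorities are fixed by physical port position, the resolution needs no state: the lowest-numbered requester always wins.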
Crossbar Switch
The crossbar switch organization consists of a number of crosspoints that are placed at intersections between processor buses and memory module paths. The figure below shows a crossbar switch interconnection between four CPUs and four memory modules. The small square in each crosspoint is a switch that determines the path from a processor to a memory module. Each switch point has control logic to set up the transfer path between a processor and memory. It examines the address that is placed on the bus to determine whether its particular module is being addressed. It also resolves multiple requests for access to the same memory module on a predetermined priority basis.
Control signals (not shown) associated with the switch establish the interconnection between the input and output terminals. The switch has the capability of connecting input A to either of the outputs. Terminal B of the switch behaves in a similar fashion. The switch also has the capability to arbitrate between conflicting requests. If inputs A and B both request the same output terminal, only one of them will be connected; the other will be blocked.
Hypercube Interconnection
The hypercube or binary n-cube multiprocessor structure is a loosely coupled system composed of N = 2^n processors interconnected in an n-dimensional binary cube. Each processor forms a node of the cube. Although it is customary to refer to each node as having a processor, in effect it contains not only a CPU but also local memory and an I/O interface. Each processor has direct communication paths to n other neighbor processors. These paths correspond to the edges of the cube. There are 2^n distinct n-bit binary addresses that can be assigned to the processors. Each processor address differs from that of each of its n neighbors by exactly one bit position.
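Because each neighbor's address differs in exactly one bit, the neighbor set of a node is obtained by XOR-ing its address with each single-bit mask. A minimal sketch (the function name is illustrative):

```python
def hypercube_neighbors(node, n):
    """Return the n neighbors of `node` in an n-dimensional hypercube.

    Flipping each of the n address bits in turn yields the addresses
    that differ from `node` in exactly one bit position -- the edges
    of the binary n-cube.
    """
    return [node ^ (1 << bit) for bit in range(n)]

# In a 3-cube (8 processors), node 0b000 connects to 0b001, 0b010, 0b100.
print(hypercube_neighbors(0b000, 3))  # -> [1, 2, 4]
```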
Interconnection Structures
System Bus
The processor uses a multidrop, shared system bus to provide glueless four-way multiprocessor system support. No additional bridges are needed for building up a four-way system. Systems with eight or more processors are designed through clusters of these nodes using high-speed interconnects. Note that multidrop buses are a cost-effective way to build high-performance four-way systems for commercial transaction processing and e-business workloads. These workloads often have highly shared writeable data and demand high throughput and low latency on transfers of modified data between caches of multiple processors. In a four-processor system, the transaction-based bus protocol allows up to 56 pending bus transactions (including 32 read transactions) on the bus at any given time. An advanced MESI coherence protocol helps in reducing bus invalidation transactions and in providing faster access to writeable data. The cache-to-cache transfer latency is further improved by an enhanced defer mechanism, which permits efficient out-of-order data transfers and out-of-order transaction completion on the bus. A deferred transaction on the bus can be completed without reusing the address bus. This reduces data return latency for deferred transactions and uses the address bus efficiently. This feature is critical for scalability beyond four-processor systems. The 64-bit system bus uses source-synchronous data transfer to achieve 266 megatransfers/s, which enables a bandwidth of 2.1 GB/s. The combination of these features makes the Itanium processor system a scalable building block for large multiprocessor systems.
Interprocessor Arbitration
Computer systems contain a number of buses at various levels to facilitate the transfer of information between components. The CPU contains a number of internal buses for transferring information between processor registers and the ALU. A memory bus consists of lines for transferring data, address, and read/write information. An I/O bus is used to transfer information to and from input and output devices. A bus that connects major components in a multiprocessor system, such as CPUs, IOPs, and memory, is called a system bus. The physical circuits of a system bus are contained in a number of identical printed circuit boards. Each board in the system belongs to a particular module. The board consists of circuits connected in parallel through connectors. Each pin of each circuit connector is connected by a wire to the corresponding pin of all other connectors in other boards. Thus any board can be plugged into a slot in the backplane that forms the system bus.
The processors in a shared-memory multiprocessor system request access to common memory or other common resources through the system bus. If no other processor is currently utilizing the bus, the requesting processor may be granted access immediately. However, the requesting processor must wait if another processor is currently utilizing the system bus. Furthermore, other processors may request the system bus at the same time. Arbitration must then be performed to resolve this multiple contention for the shared resources. The arbitration logic would be part of the system bus controller placed between the local bus and the system bus.
System Bus
A typical system bus consists of approximately 100 signal lines. These lines are divided into three functional groups: data, address, and control. In addition, there are power distribution lines that supply power to the components. For example, the IEEE standard 796 multibus system has 16 data lines, 24 address lines, 26 control lines, and 20 power lines, for a total of 86 lines.
The data lines provide a path for the transfer of data between processors and common memory. The number of data lines is usually a multiple of 8, with 16 and 32 being most common. The address lines are used to identify a memory address or any other source or destination, such as input or output ports. The number of address lines determines the maximum possible memory capacity in the system. For example, an address of 24 lines can access up to 2^24 (16 mega) words of memory. The data and address lines are terminated with three-state buffers. The address buffers are unidirectional from processor to memory. The data lines are bidirectional, allowing the transfer of data in either direction.
Data transfers over the system bus may be synchronous or asynchronous. In a synchronous bus, each data item is transferred during a time slice known in advance to both source and destination units; synchronization is achieved by driving both units from a common clock source. An alternative procedure is to have separate clocks of approximately the same frequency in each unit. Synchronization signals are transmitted periodically in order to keep all clocks in the system in step with each other. In an asynchronous bus, each data item being transferred is accompanied by handshaking control signals to indicate when the data are transferred from the source and received by the destination.
The control lines provide signals for controlling the information transfer between units. Timing signals indicate the validity of data and address information. Command signals specify operations to be performed. Typical control lines include transfer signals such as memory read and write, acknowledge of a transfer, interrupt requests, bus control signals such as bus request and bus grant, and signals for arbitration procedures.
In the first-come, first-served scheme, requests are served in the order received. To implement this algorithm, the bus controller establishes a queue arranged according to the time that the bus requests arrive. Each processor must wait for its turn to use the bus on a first-in, first-out (FIFO) basis. The rotating daisy-chain procedure is a dynamic extension of the daisy-chain algorithm. In this scheme there is no central bus controller, and the priority line is connected from the priority-out of the last device back to the priority-in of the first device in a closed loop. This is similar to the connections shown in the figure for serial arbitration, except that the PO output of arbiter 4 is connected to the PI input of arbiter 1.
Whichever device has access to the bus serves as a bus controller for the following arbitration. Each arbiter's priority for a given bus cycle is determined by its position along the bus priority line from the arbiter whose processor is currently controlling the bus. Once an arbiter releases the bus, it has the lowest priority.
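The first-come, first-served scheme amounts to a simple FIFO queue of requests. A minimal sketch (the class and method names are illustrative, not taken from any real bus controller):

```python
from collections import deque

class FcfsBusController:
    """First-come, first-served bus arbitration: requests are queued
    in arrival order and one is granted per bus cycle."""

    def __init__(self):
        self.queue = deque()

    def request(self, cpu):
        # Record the request at the tail of the queue (arrival order).
        self.queue.append(cpu)

    def grant(self):
        # Grant the bus to the earliest outstanding requester, if any.
        return self.queue.popleft() if self.queue else None

bus = FcfsBusController()
for cpu in ("CPU3", "CPU1", "CPU2"):
    bus.request(cpu)
print(bus.grant())  # -> CPU3, the earliest requester, regardless of its number
```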
Interprocessor Synchronization
The instruction set of a multiprocessor contains basic instructions that are used to implement communication and synchronization between cooperating processes. Communication refers to the exchange of data between different processes. For example, parameters passed to a procedure in a different processor constitute interprocessor communication. Synchronization refers to the special case where the data used to communicate between processors is control information. Synchronization is needed to enforce the correct sequence of processes and to ensure mutually exclusive access to shared writable data.
Multiprocessor systems usually include various mechanisms to deal with the synchronization of resources. Low-level primitives are implemented directly by the hardware. These primitives are the basic mechanisms that enforce mutual exclusion for more complex mechanisms implemented in software. A number of hardware mechanisms for mutual exclusion have been developed. One of the most popular methods is through the use of a binary semaphore.
The semaphore is tested by transferring its value to a processor register R and then setting it to 1. The value in R determines what to do next. If the processor finds that R = 1, it knows that the semaphore was originally set (the fact that it is set again does not change the semaphore value). That means another processor is executing a critical section, so the processor that checked the semaphore does not access the shared memory. R = 0 means that the common memory (or the shared resource that the semaphore represents) is available. The semaphore is set to 1 to prevent other processors from accessing memory. The processor can now execute the critical section. The last instruction in the program must clear location SEM to zero to release the shared resource to other processors.
Cache Coherence
We know that the primary advantage of cache is its ability to reduce the average access time in uniprocessor systems. When the processor finds a word in cache during a read operation, the main memory is not involved in the transfer. If the operation is a write, there are two commonly used procedures to update memory. In the write-through policy, both cache and main memory are updated with every write operation. In the write-back policy, only the cache is updated and the location is marked so that it can be copied later into main memory. In a shared-memory multiprocessor system, all the processors share a common memory. In addition, each processor may have a local memory, part or all of which may be a cache. The compelling reason for having separate caches for each processor is to reduce the average access time in each processor. The same information may reside in a number of copies in some caches and main memory. To ensure the ability of the system to execute memory operations correctly, the multiple copies must be kept identical. This requirement imposes a cache coherence problem. A memory scheme is coherent if the value returned on a load instruction is always the value given by the latest store instruction with the same address. Without a proper solution to the cache coherence problem, caching cannot be used in bus-oriented multiprocessors with two or more processors.
In a shared-memory multiprocessor, caches and main memory may hold copies of the same object. As multiple processors operate in parallel and independently, multiple caches may possess different copies of the same memory block; this creates the cache coherence problem. Cache coherence schemes help to avoid this problem by maintaining a uniform state for each cached block of data. Let X be an element of shared data which has been referenced by two processors, P1 and P2. In the beginning, the three copies of X are consistent. If processor P1 writes new data X1 into its cache, then under the write-through policy the same copy will be written immediately into the shared memory. In this case, inconsistency occurs between the copy of X still held in P2's cache and the main memory. When a write-back policy is used, the main memory will be updated only when the modified data in the cache is replaced or invalidated.
In this case, we have three processors P1, P2, and P3 having a consistent copy of data element X in their local cache memory and in the shared memory (figure-a). Processor P1 writes X1 into its cache memory using the write-invalidate protocol, so all other copies are invalidated via the bus; an invalidated copy is denoted by 'I' (figure-b). Invalidated blocks are also known as dirty, i.e., they should not be used. The write-update protocol instead updates all the cache copies via the bus; with a write-back cache, the memory copy is also updated (figure-c).
Cache Events And Actions
The following events and actions occur on the execution of memory-access and invalidation commands:
Read-miss − When a processor wants to read a block and it is not in the cache, a read-miss occurs. This initiates a bus-read operation. If no dirty copy exists, then the main memory, which has a consistent copy, supplies a copy to the requesting cache memory. If a dirty copy exists in a remote cache memory, that cache will restrain the main memory and send a copy to the requesting cache memory. In both cases, the cache copy enters the valid state after a read-miss.
Write-hit − If the copy is in the dirty or reserved state, the write is done locally and the new state is dirty. If the state is valid, a write-invalidate command is broadcast to all the caches, invalidating their copies. When the shared memory is written through, the resulting state is reserved after this first write.
Write-miss − If a processor fails to write in the local cache memory, the copy must come either from the main memory or from a remote cache memory with a dirty block. This is done by sending a read-invalidate command, which invalidates all cache copies. The local copy is then updated with the dirty state.
Read-hit − A read-hit is always performed in local cache memory without causing a transition of state or using the snoopy bus for invalidation.
Block replacement − When a copy is dirty, it is to be written back to the main memory by the block replacement method. However, when the copy is in the valid, reserved, or invalid state, no replacement will take place.
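The read-miss and write-invalidate behaviors above can be illustrated with a deliberately simplified simulation. This sketch models only valid ('V') and invalid ('I') states with a write-through memory; the dirty and reserved states and the bus signaling are omitted, and all class and method names are assumptions for the example:

```python
class SnoopyBus:
    """Minimal write-invalidate coherence sketch: a write by one CPU
    invalidates the block in every other cache; a later read by those
    CPUs misses and refetches the up-to-date value from memory."""

    def __init__(self, n_caches):
        self.memory = {}                       # shared main memory
        self.caches = [dict() for _ in range(n_caches)]  # addr -> (state, value)

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if cache.get(addr, ('I', None))[0] != 'V':   # read-miss
            cache[addr] = ('V', self.memory.get(addr, 0))
        return cache[addr][1]                        # read-hit

    def write(self, cpu, addr, value):
        for i, cache in enumerate(self.caches):      # invalidate other copies
            if i != cpu and addr in cache:
                cache[addr] = ('I', cache[addr][1])
        self.caches[cpu][addr] = ('V', value)
        self.memory[addr] = value                    # write-through

bus = SnoopyBus(2)
bus.read(0, 'x')
bus.read(1, 'x')            # both caches now hold a valid copy of x
bus.write(0, 'x', 42)       # P0 writes; P1's copy becomes invalid ('I')
print(bus.read(1, 'x'))     # P1 read-misses and refetches -> 42
```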
Directory-Based Protocols
When a multistage network is used to build a large multiprocessor with hundreds of processors, the snoopy cache protocols need to be modified to suit the network capabilities. Broadcasting being very expensive to perform in a multistage network, the consistency commands are sent only to those caches that keep a copy of the block. This is the reason for the development of directory-based protocols for network-connected multiprocessors.
In a directory-based protocol system, data to be shared are placed in a common directory that maintains the coherence among the caches. Here, the directory acts as a filter where the processors ask permission to load an entry from the primary memory into their cache memory. If an entry is changed, the directory either updates it or invalidates the other caches holding that entry.
The major concern areas are: sharing of writable data, process migration, and I/O activity.
Sharing Of Writable Data
When two processors (P1 and P2) have the same data element (X) in their local caches and one processor (P1) writes to the data element, then because the caches are write-through, the local cache of P1 and the main memory are both updated. Now when P2 tries to read data element X, it gets an outdated copy, because the data element in the cache of P2 has become stale.
Process Migration
In the first stage, the cache of P1 has data element X, whereas P2 does not have anything. A process on P2 first writes to X and then migrates to P1. Now the process starts reading data element X, but as processor P1 has outdated data, the process cannot read it correctly. So a process on P1 writes to the data element X and then migrates to P2. After migration, a process on P2 starts reading the data element X but finds an outdated version of X in the main memory.
I/O Activity
As illustrated in the figure, an I/O device is added to the bus in a two-processor multiprocessor architecture. In the beginning, both caches contain the data element X. When the I/O device receives a new element X, it stores the new element directly in the main memory. Now, when either P1 or P2 (assume P1) tries to read element X, it gets an outdated copy. So P1 writes to element X. Now, if the I/O device tries to transmit X, it gets an outdated copy.
Uniform Memory Access (UMA)
In a uniform memory access (UMA) architecture, all processors share the physical memory uniformly and take the same time to access any memory word in the system. Popular classes of UMA machines, which are commonly used for (file-) servers, are the so-called symmetric multiprocessors (SMPs). In an SMP, all system resources like memory, disks, and other I/O devices are accessible by the processors in a uniform manner.
Non-Uniform Memory Access (Numa)
In the NUMA architecture, there are multiple SMP clusters, each with an internal indirect/shared network, connected by a scalable message-passing network. Thus the NUMA architecture is a logically shared, physically distributed memory architecture. In a NUMA machine, the cache controller of a processor determines whether a memory reference is local to the SMP's memory or remote. To reduce the number of remote memory accesses, NUMA architectures usually equip the processors with caches that can hold remote data. But when caches are involved, cache coherency needs to be maintained, so these systems are also known as CC-NUMA (cache-coherent NUMA).
Multicore Processor
A multicore processor is a single computing component composed of two or more CPUs (cores) that read and execute program instructions. The individual cores can execute multiple instructions in parallel, increasing the performance of software which is written to take advantage of the architecture. The first multicore processors were produced by Intel and AMD in the early 2000s. Today, processors are created with two cores ("dual core"), four cores ("quad core"), six cores ("hexa core"), and eight cores ("octa core"). Processors are made with as many as 100 physical cores, as well as 1000 effective independent cores by using FPGAs (field-programmable gate arrays).
A multicore processor is a single integrated circuit (also known as a chip multiprocessor, or CMP) that contains multiple core processing units, more commonly known as cores. There are many different multicore processor architectures, which vary in the number of cores; for example, a quad-core processor has four cores. The number of cores is usually a power of two.
Number of core types. Homogeneous (symmetric) cores: all of the cores in a homogeneous multicore processor are of the same type; typically, the core processing units are general-purpose central processing units that run a single multicore operating system. Heterogeneous (asymmetric) cores: heterogeneous multicore processors have a mix of core types that often run different operating systems and include graphics processing units.
Number and level of caches. Multicore processors vary in terms of their instruction and data caches, which are relatively small and fast pools of local memory.
How cores are interconnected. Multicore processors also vary in terms of their bus architectures.
Isolation. The amount, typically minimal, of in-chip support for the spatial and temporal isolation of cores: physical isolation ensures that different cores cannot access the same physical hardware (e.g., memory locations such as caches and RAM); temporal isolation ensures that the execution of software on one core does not impact the temporal behavior of software running on another core.
Reliability and robustness. Allocating software to multiple cores increases reliability and robustness (i.e., fault and failure tolerance) by limiting fault and/or failure propagation from software on one core to software on another. The allocation of software to multiple cores also supports failure tolerance by supporting failover from one core to another (and subsequent recovery).
Obsolescence avoidance. The use of multicore processors enables architects to avoid technological obsolescence and improve maintainability. Chip manufacturers are applying the latest technical advances to their multicore chips. As the number of cores continues to increase, it becomes increasingly hard to obtain single-core chips.
Hardware costs. By using multicore processors, architects can produce systems with fewer computers and processors.
Interference. Interference occurs when software executing on one core impacts the behavior of software executing on other cores in the same processor. This interference includes failures of both spatial isolation (due to shared memory access) and temporal isolation (due to interference delays and/or penalties). Temporal isolation is a bigger problem than spatial isolation, since multicore processors may have special hardware that can be used to enforce spatial isolation (to prevent software running on different cores from accessing the same processor-internal memory). The number of interference paths increases rapidly with the number of cores, and the exhaustive analysis of all interference paths is often impossible. The impracticality of exhaustive analysis necessitates the selection of representative interference paths when analyzing isolation. The following diagram uses the color red to illustrate three possible interference paths between pairs of applications involving six shared resources.
Concurrency defects. Cores execute concurrently, creating the potential for concurrency defects including deadlock, livelock, starvation, suspension, (data) race conditions, priority inversion, order violations, and atomicity violations. Note that these are essentially the same types of concurrency defects that can occur when software is allocated to multiple threads on a single core.
Non-determinism. Multicore processing increases non-determinism. For example, I/O interrupts have top-level hardware priority (also a problem with single-core processors). Multicore processing is also subject to lock thrashing, which stems from excessive lock conflicts due to simultaneous access of kernel services by different cores (resulting in decreased concurrency and performance). The resulting non-deterministic behavior can be unpredictable, can cause related faults and failures, and can make testing more difficult (e.g., running the same test multiple times may not yield the same result).
Analysis difficulty. The real concurrency due to multicore processing requires different memory consistency models than virtual interleaved concurrency. It also breaks traditional analysis approaches for work on single-core processors. The analysis of maximum time limits is harder and may be overly conservative. Although interference analysis becomes more complex as the number of cores per processor increases, overly restricting the core number may not provide adequate performance.
Accreditation and certification. Interference between cores can cause missed deadlines and excessive jitter, which in turn can cause faults (hazards) and failures (accidents). Verifying a multicore system requires proper real-time scheduling and timing analysis and/or specialized performance testing. Moving from a single-core to a multicore architecture may require recertification. Unfortunately, current safety policy guidelines are based on single-core architectures and must be updated based on the recommendations that will be listed in the final blog entry in this series.
example
use the division algorithm to find the quotient and remainder when a = 158 and b = 17
a = qb + r
158 = 9 ⋅ 17 + 5
so q = 9 and r = 5
(by convention, a dot ⋅ is used to show multiplication instead of ×; this is fine, since we are dealing purely with integers and no decimals are involved)
when the remainder is zero, b divides a exactly: a = qb + 0
the division algorithm and the gcd can be used to:
reduce a fraction to its simplest form (just divide top and bottom by the gcd)
find relatively prime (coprime) integers; these occur when gcd(a, b) = 1
solve equations of the form gcd(a, b) = ax + by
If ax + by = c where gcd(a, b) = d divides c, then
xₙ = x₀ + (b/d)m
yₙ = y₀ − (a/d)m
describes the general solution xₙ, yₙ for any integer m, when a particular solution x₀, y₀ is known.
example
find the gcd of 135 and 1780
a = qb + r
1780 = 13 ⋅ 135 + 25
now continue, replacing a with b and b with r:
135 = 5 ⋅ 25 + 10
25 = 2 ⋅ 10 + 5
10 = 2 ⋅ 5 + 0
gcd(135, 1780) = 5
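The repeated replacement of a with b and b with r is exactly the Euclidean algorithm, which is compact in code (a minimal sketch):

```python
def gcd(a, b):
    """Euclidean algorithm: repeatedly replace (a, b) with (b, r),
    where r is the remainder of a divided by b, until r is zero."""
    while b:
        a, b = b, a % b
    return a

print(gcd(1780, 135))  # -> 5, matching the worked example
```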
example
find the lcm of 135 and 1780
lcm(a, b) = ab / gcd(a, b)
= (135 ⋅ 1780) / 5
= 240300 / 5
= 48060
lcm(135, 1780) = 48060
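The same formula in code, using the standard library's gcd (a minimal sketch):

```python
from math import gcd

def lcm(a, b):
    """lcm(a, b) = ab / gcd(a, b); integer division is exact here
    because gcd(a, b) divides the product ab."""
    return a * b // gcd(a, b)

print(lcm(135, 1780))  # -> 48060
```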
example
reduce the fraction 1480/128600 to its simplest form
a = qb + r
128600 = 86 ⋅ 1480 + 1320
1480 = 1 ⋅ 1320 + 160
1320 = 8 ⋅ 160 + 40
160 = 4 ⋅ 40 + 0
gcd(128600, 1480) = 40
dividing top and bottom by 40: 1480/128600 = 37/3215
example
find integers x and y such that 34x + 111y = 1
a = qb + r
111 = 3 ⋅ 34 + 9
34 = 3 ⋅ 9 + 7
9 = 1 ⋅ 7 + 2
7 = 3 ⋅ 2 + 1
2 = 2 ⋅ 1 + 0
so gcd(34, 111) = 1
change the subject to r in each equation and substitute, starting with the second-last equation and working backwards:
1 = 7 − 3 ⋅ 2
= 7 − 3 ⋅ (9 − 1 ⋅ 7)
= 7 − 3 ⋅ 9 + 3 ⋅ 7 = 4 ⋅ 7 − 3 ⋅ 9
= 4 ⋅ (34 − 3 ⋅ 9) − 3 ⋅ 9 = 4 ⋅ 34 − 15 ⋅ 9
= 4 ⋅ 34 − 15 ⋅ (111 − 3 ⋅ 34)
= 49 ⋅ 34 − 15 ⋅ 111
compare to the original 34x + 111y = 1
⇒ x = 49, y = −15
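The back-substitution above is the extended Euclidean algorithm. It can equivalently be computed forwards, carrying the coefficients along with each division step (a minimal sketch):

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b).

    Runs the Euclidean algorithm forward while updating the coefficient
    pairs (x0, x1) and (y0, y1), which is equivalent to the manual
    back-substitution shown above.
    """
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

print(extended_gcd(34, 111))  # -> (1, 49, -15)
```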
2. Diophantine equations
These are of the form axⁿ + byⁿ = cⁿ, where all numbers are integers. The linear Diophantine equation ax + by = c has a solution only if the gcd of a and b is a factor of c.
To solve:
1. Find gcd(a, b) = d; then d ∣ c, so c = dn for some integer n. Express c in terms of d.
2. Express d in the form d = as + bt for some integers s and t.
3. Multiply through by n to get x = sn, y = tn.
example
solve the linear Diophantine equation 69x + 27y = 1332, if a solution exists
find the gcd of 69 and 27:
69 = 2 ⋅ 27 + 15
27 = 1 ⋅ 15 + 12
15 = 1 ⋅ 12 + 3
12 = 4 ⋅ 3 + 0
gcd(69, 27) = 3
since 3 ∣ 1332, a solution exists
c = dn ⇒ 1332 = 3n ⇒ n = 444
3 = 15 − 1 ⋅ 12
= 15 − 1 ⋅ (27 − 1 ⋅ 15) = 15 − 1 ⋅ 27 + 1 ⋅ 15 = 2 ⋅ 15 − 1 ⋅ 27
= 2 ⋅ (69 − 2 ⋅ 27) − 1 ⋅ 27
= 2 ⋅ 69 − 4 ⋅ 27 − 1 ⋅ 27
= 2 ⋅ 69 − 5 ⋅ 27
d = 69s + 27t ⇒ s = 2, t = −5
multiply through by n:
x = 2n = 2 ⋅ 444 = 888 and y = −5n = −5 ⋅ 444 = −2220
one solution is x = 888, y = −2220
the general solution, for any integer m, is
xₙ = x₀ + (b/d)m = 888 + (27/3)m = 888 + 9m
yₙ = y₀ − (a/d)m = −2220 − (69/3)m = −2220 − 23m
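The three solution steps can be combined into one routine (an illustrative sketch; the function names are assumptions for the example):

```python
def solve_diophantine(a, b, c):
    """Solve a*x + b*y = c following the three steps above; returns one
    particular solution (x, y), or None when gcd(a, b) does not divide c."""
    def ext_gcd(a, b):
        # Step 2: express gcd(a, b) as a*s + b*t recursively.
        if b == 0:
            return a, 1, 0
        g, s, t = ext_gcd(b, a % b)
        return g, t, s - (a // b) * t

    g, s, t = ext_gcd(a, b)
    if c % g:
        return None          # no solution: the gcd is not a factor of c
    n = c // g               # step 1: c = dn
    return s * n, t * n      # step 3: multiply through by n

print(solve_diophantine(69, 27, 1332))  # -> (888, -2220)
```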
example
find the positive integer values of 𝑥 and 𝑦 that satisfy 69𝑥 + 27𝑦 = 1332 From above, a solution
exists
69𝑥 + 27𝑦 = 1332
gcd (69,27) = 3
23𝑥 + 9𝑦 = 444
solving for 𝒙
444 − 9𝑦
𝑥 =
23
7 9𝑦
= 19 −
23 23
9𝑦 7
= 19 − +
23 23
9𝑦 − 7
= 19 −
23
9𝑦 − 7
∴ ≤ 18
23
23 ∙ 18 + 7
⇒𝑦≤
9
421 7
⇒𝑦≤ ≤ 46
9 9
0 < 𝑦 ≤ 46, Δ𝑦 = 9
Lowest possible is
𝑦 = 11
444 − 99
thus 𝑥 = = 15
23
alternatively, solving for y:
y = (444 − 23x) / 9
= 49 − (23x − 3) / 9
for y to be a positive integer, (23x − 3) / 9 must be an integer with
(23x − 3) / 9 ≤ 48
⇒ x ≤ (48 ⋅ 9 + 3) / 23
⇒ x ≤ 435/23 = 18 21/23
so 0 < x ≤ 18, and successive solutions differ by Δx = 9
taking x = 15:
y = 49 − (23 ⋅ 15 − 3) / 9
= 49 − 38
y = 11
check
69x + 27y = 69 ⋅ 15 + 27 ⋅ 11 = 1035 + 297 = 1332
3. Pythagorean triples
These satisfy a² + b² = c². To find one, pick an odd positive number and divide its square into two integers which are as close to being equal as possible.
e.g., 7² = 49 = 24 + 25
gives the triple 7, 24, 25
7² + 24² = 25²
alternatively, pick any even integer n; the triple is 2n, n² − 1 and n² + 1
e.g., picking n = 8 gives 16, 63 and 65; indeed 16² + 63² = 65²
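Both constructions can be sketched directly (function names are illustrative):

```python
def triple_from_odd(m):
    """Given an odd m, split m*m into two nearly equal parts, giving
    the triple (m, (m*m - 1)//2, (m*m + 1)//2)."""
    return m, (m * m - 1) // 2, (m * m + 1) // 2

def triple_from_even(n):
    """Given an even n, the triple is (2n, n*n - 1, n*n + 1)."""
    return 2 * n, n * n - 1, n * n + 1

print(triple_from_odd(7))   # -> (7, 24, 25)
print(triple_from_even(8))  # -> (16, 63, 65)
```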
4. Fermat's last theorem
xⁿ + yⁿ = zⁿ with n > 2 cannot be solved with x, y, z all positive integers.
number bases
to convert a number into a different base, use the division algorithm repeatedly, taking b as the required base.
example
convert 36 into binary
a = qb + r
36 = 18 ⋅ 2 + 0
now continue, replacing a with q:
18 = 9 ⋅ 2 + 0
9 = 4 ⋅ 2 + 1
4 = 2 ⋅ 2 + 0
2 = 1 ⋅ 2 + 0
1 = 0 ⋅ 2 + 1
reading the remainders from bottom to top: 36 = 100100 in base 2
(similarly, 36 = 2 ⋅ 16 + 4 and 2 = 0 ⋅ 16 + 2, so 36 = 24 in base 16)
example
convert 503793 into hexadecimal (remember that hexadecimal uses the letters A–F for the digits 10–15)
a = qb + r
503793 = 31487 ⋅ 16 + 1
31487 = 1967 ⋅ 16 + 15
1967 = 122 ⋅ 16 + 15
122 = 7 ⋅ 16 + 10
7 = 0 ⋅ 16 + 7
503793 = 7AFF1 in base 16
503793dec = 7AFF1hex
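The repeated-division procedure works for any base up to 16 with the usual digit letters. A minimal sketch (the function name is illustrative):

```python
def to_base(n, b):
    """Convert non-negative n to base b (2 <= b <= 16) by repeated
    application of the division algorithm, collecting the remainders
    from last to first."""
    digits = "0123456789ABCDEF"
    out = ""
    while n:
        n, r = divmod(n, b)   # one step of a = qb + r
        out = digits[r] + out  # remainders read from bottom to top
    return out or "0"

print(to_base(36, 2))       # -> 100100
print(to_base(503793, 16))  # -> 7AFF1
```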
Answer: a
Explanation: Any negative number is recognized by its MSB (most significant bit): if it is 1, the number is negative; if it is 0, the number is positive.
Answer: b
Explanation: For a decimal fraction, the binary equivalent is obtained by multiplying the number continuously by 2 and collecting the integer parts. However, if it is an integer, then its binary equivalent is determined by dividing the number by 2 and collecting the remainders.
Answer: c
Explanation: A nibble is a combination of four bits (and so can represent up to 16 different values), and a byte is a combination of 8 bits. It is a "word" which is said to be a collection of 16 bits on most systems.