Module2

The document provides an overview of fixed-point representation in computer arithmetic, detailing its components, types, advantages, and disadvantages. It explains arithmetic operations such as addition and subtraction using fixed-point numbers, including algorithms for these operations and examples. Additionally, it covers multiplication techniques, particularly Booth's Algorithm for both unsigned and signed fixed numbers.

Uploaded by

pratyush.pgat2nd
Copyright
© © All Rights Reserved

Data Representation and Computer Arithmetic

Fixed point representation of numbers


Fixed-point representation is a method of storing real numbers in computers where the
position of the binary (or decimal) point is fixed. It is commonly used when floating-point
representation is not required or is too expensive in terms of computational resources.
Components of Fixed-Point Representation
1. Sign Bit: Determines if the number is positive (0) or negative (1).
2. Integer Part: Stores the whole number portion of the value.
3. Fractional Part: Stores the fractional portion if necessary.
4. Fixed Position of the Point: Unlike floating-point representation, the decimal/binary
point remains in a fixed position.
Types of Fixed-Point Representation
1. Signed Magnitude Representation
o The most significant bit (MSB) represents the sign (0 for positive, 1 for
negative).
o The remaining bits store the magnitude of the number.
2. One’s Complement Representation
o Negative numbers are represented by inverting all bits of their positive
counterpart.
o There are two representations for zero: +0 and -0.
3. Two’s Complement Representation (Most Common)
o Negative numbers are represented by inverting all bits of the positive number
and adding 1.
o Provides a unique representation for zero and simplifies arithmetic operations.
4. Fractional Fixed-Point Representation
o Used to represent fractions by allocating a fixed number of bits for the
fractional part.
o Example: In a 16-bit system, if 8 bits are for the integer part and 8 bits for the
fraction, the point is assumed to be after the first 8 bits.
Advantages of Fixed-Point Representation
• Faster arithmetic operations compared to floating-point.
• Simpler hardware implementation.
• Efficient memory usage in applications where precision is not a major concern.
Disadvantages of Fixed-Point Representation
• Limited range of values.
• Precision issues in representing very small or very large numbers.
• Overflow and underflow problems when exceeding the allocated bit size.
Example of Fixed-Point Representation
Consider an 8-bit system using two’s complement representation:
• +5 (Decimal) → 00000101 (Binary)
• -5 (Decimal) → 11111011 (Binary) (Two’s complement)

Example: Assume a 32-bit format that reserves 1 bit for the sign, 15 bits for the integer
part and 16 bits for the fractional part. Then -43.625 is represented as follows.

Convert the integer part, 43₁₀, to binary by repeatedly dividing by the base of the new
number system (2) and recording the remainders:

43 ÷ 2 = 21 remainder 1
21 ÷ 2 = 10 remainder 1
10 ÷ 2 = 5 remainder 0
5 ÷ 2 = 2 remainder 1
2 ÷ 2 = 1 remainder 0
1 ÷ 2 = 0 remainder 1

Reading the remainders from last to first gives 43₁₀ = 101011₂.

Convert the fractional part, 0.625₁₀, to binary by repeatedly multiplying by the base of the
new number system (2) and recording the integer digit produced at each step:

0.625 × 2 = 1.25 → 1
0.25 × 2 = 0.5 → 0
0.5 × 2 = 1.0 → 1

Reading the digits from first to last gives 0.625₁₀ = 0.101₂.

So 43.625₁₀ = 101011.101₂. With the sign bit set to 1 for a negative number, the 32-bit
pattern is:

1 000000000101011 1010000000000000
(sign, 15-bit integer part, 16-bit fractional part)
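
The conversion of -43.625 into the 1 + 15 + 16 bit format can be sketched in Python. This is an illustrative sketch only: the function name and the space-separated output are assumptions made here, and the sign is handled as a separate sign bit, as the example describes.

```python
def to_fixed_sign_magnitude(value, int_bits=15, frac_bits=16):
    """Encode value as: sign bit, int_bits integer bits, frac_bits fraction bits."""
    sign = "1" if value < 0 else "0"
    magnitude = abs(value)
    integer = int(magnitude)
    fraction = magnitude - integer
    if integer >= 1 << int_bits:
        raise OverflowError("integer part does not fit in the integer field")
    # Integer part: format() yields the same digits as repeated division by 2.
    int_field = format(integer, "0{}b".format(int_bits))
    # Fractional part: repeated multiplication by 2, keeping each integer digit.
    frac_field = ""
    for _ in range(frac_bits):
        fraction *= 2
        bit = int(fraction)
        frac_field += str(bit)
        fraction -= bit
    return "{} {} {}".format(sign, int_field, frac_field)


print(to_fixed_sign_magnitude(-43.625))
```

Fractions that need more than 16 binary digits are truncated by this sketch, which is one source of the precision issues listed earlier.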

Algorithms for arithmetic operations: addition and subtraction


Addition
Addition combines two numbers, signed or unsigned, into their sum. When we add two
single-digit decimal numbers, say 8 and 5, the result (13) may need two digits. The same
possibility exists in the binary system, where it appears as a carry into the next bit position.
The basic rules of binary addition are:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 10 (sum 0, carry 1)
Examples (a-e) of unsigned binary addition are given in the figure.

Adder
The hardware circuit that performs this addition is called an adder. There are two types of
adder: the half adder and the full adder. The basic adder circuit performs 1-bit addition and is
extended for n-bit addition. An adder is specified by a circuit, a truth table, a formula and a
block symbol. Adder circuits are built from logic gates that implement the formula given by
the truth table. Such circuits are called combinational logic: the output depends only on the
current inputs and needs no clock.
Figure Half adder
The Half Adder (HA) has two inputs (A, B) and two outputs (Sum and Carry). The Sum is the
XOR of the inputs, while the Carry is the AND of the inputs. The Half Adder is detailed in the figure.
A Full Adder (FA) also performs 1-bit addition, but it takes three inputs (A, B and Ci) and
produces two outputs, Sum (S) and Carry out (Cout). When full adders are cascaded to add
multi-bit words, the Cout of one stage is used as Ci+1, the carry-in of the next stage. The Full
Adder is detailed in figure 8.3. A full adder can also be constructed from half adder blocks, as
shown in the figure.

Figure Full adder

Figure Full Adder using Half Adder blocks
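
The half adder and full adder described above can be modelled with bitwise operators. The sketch below is only an illustration: the function names are chosen here, bits are plain Python integers 0 or 1, and the n-bit extension is shown as a simple ripple-carry chain.

```python
def half_adder(a, b):
    """Return (sum, carry) for two 1-bit inputs: Sum = A XOR B, Carry = A AND B."""
    return a ^ b, a & b


def full_adder(a, b, c_in):
    """Full adder built from two half adders plus an OR gate."""
    partial_sum, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial_sum, c_in)
    return total, carry1 | carry2


def ripple_carry_add(a_bits, b_bits):
    """Add two equal-length bit lists given LSB first; return (sum_bits, carry_out)."""
    carry = 0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)  # Cout of this stage feeds the next stage
        sum_bits.append(s)
    return sum_bits, carry


# 8 + 5 = 13, i.e. 1000 + 0101 = 1101 (the lists below are LSB first)
print(ripple_carry_add([0, 0, 0, 1], [1, 0, 1, 0]))
```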

Subtraction
Subtraction finds the difference A - B, i.e. B is subtracted from A. The basic rules of binary
subtraction are:
0 - 0 = 0
0 - 1 = 1 (with a borrow of 1 from the next higher bit)
1 - 0 = 1
1 - 1 = 0
The usual borrow logic from the adjacent digit applies, as in the case of decimal
numbers. Examples of signed binary subtraction are as below:

2's Complement for Subtraction


"1's complement + 1 = 2's complement"
Generating the 2's complement in hardware is simple. Each bit of B is passed through an XOR
gate whose other input is a control signal called SUBTRACT; when SUBTRACT is 1, the XOR
gates output the 1's complement of B. The same SUBTRACT signal is fed to the adder as its
carry-in, which supplies the added 1. This way, an adder executes subtraction. See the example
below, where case (b), case (c) and case (e) are worked out in 2's complement representation,
and A - B becomes A + (2's complement of B). The result is obtained in 2's complement form
after discarding the carry. Observe that this method works for all kinds of data.
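
The way one adder serves both operations can be sketched as follows. This is a minimal model, assuming a 4-bit word by default; the function name `add_sub` and the integer bit manipulation are illustrative choices, not a description of any particular circuit.

```python
def add_sub(a, b, subtract, n=4):
    """n-bit adder/subtractor: returns (result, carry_out).

    subtract = 0 computes A + B; subtract = 1 computes A + (2's complement of B).
    """
    carry = subtract                       # SUBTRACT doubles as the carry-in (+1)
    result = 0
    for i in range(n):
        a_bit = (a >> i) & 1
        b_bit = ((b >> i) & 1) ^ subtract  # XOR gate: inverts B when SUBTRACT = 1
        total = a_bit + b_bit + carry
        result |= (total & 1) << i
        carry = total >> 1
    return result, carry                   # the carry out is discarded by the caller


result, _ = add_sub(0b0111, 0b0011, subtract=1)
print(format(result, "04b"))  # 7 - 3
```

With `subtract=1` and operands 0011 and 0101 the result field holds 1110, which is -2 in 2's complement form.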

Interpreting 2's complement numbers


• Observe the sign bit (MSB).
• If it is '0', the number is positive; the remaining (n-1) bits give the absolute value of the
number in binary.
• If it is '1', the number is negative; the remaining (n-1) bits hold the value in 2’s complement
form. Invert the (n-1) bits and add 1 to get the absolute value of this negative number.
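
These interpretation rules can be written as a short function. The sketch assumes the number is supplied as a string of '0' and '1' characters with the MSB first; the function name is hypothetical.

```python
def from_twos_complement(bits):
    """Interpret an MSB-first bit string as a 2's complement integer."""
    n = len(bits)
    low_mask = (1 << (n - 1)) - 1
    low = int(bits, 2) & low_mask      # the (n-1) bits after the sign bit
    if bits[0] == "0":
        return low                     # positive: the bits are the absolute value
    magnitude = (low ^ low_mask) + 1   # negative: invert the (n-1) bits and add 1
    return -magnitude


print(from_twos_complement("00000101"))
print(from_twos_complement("11111011"))
```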

Example of Fixed-Point Addition


1. Steps for Fixed-Point Addition
Given: Two fixed-point numbers with the same format (e.g., 8-bit, 4 integer + 4 fractional
bits).
Algorithm:
1. Align the numbers: Ensure both numbers have the same binary point position.
2. Convert to binary: Represent numbers in fixed-point binary.
3. Perform binary addition: Add the two binary numbers just like integer addition.
4. Handle carry: If there is a carry beyond the allocated bit length, it may cause
overflow.
5. Normalize the result: Ensure the binary point remains in the fixed position.

Example: Adding 3.625 and 1.875


• Assume 4 integer bits + 4 fractional bits.
• Convert to fixed-point binary:
o 3.625₁₀ = 0011.1010₂
o 1.875₁₀ = 0001.1110₂
• Perform binary addition:
  0011.1010 (3.625)
+ 0001.1110 (1.875)
--------------
  0101.1000 (5.5 in decimal)
The result is 5.5 (0101.1000 in binary), which fits within the fixed-point format.
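
The addition steps can be sketched by storing each value as an integer scaled by 2⁴, so that ordinary integer addition keeps the binary point in place. The helper names are assumptions made for this sketch, and the values are treated as unsigned, as in the worked example.

```python
FRAC_BITS = 4   # bits after the binary point
WIDTH = 8       # total bits: 4 integer + 4 fractional


def to_fixed(x):
    """Scale a non-negative value by 2**FRAC_BITS and round to an integer."""
    return int(round(x * (1 << FRAC_BITS)))


def fixed_add(a, b):
    """Add two raw fixed-point values; a carry beyond WIDTH bits is an overflow."""
    total = a + b
    if total >= 1 << WIDTH:
        raise OverflowError("result does not fit in the fixed-point format")
    return total


def show(raw):
    """Format a raw value as iiii.ffff."""
    s = format(raw, "0{}b".format(WIDTH))
    return s[:WIDTH - FRAC_BITS] + "." + s[WIDTH - FRAC_BITS:]


total = fixed_add(to_fixed(3.625), to_fixed(1.875))
print(show(total), total / (1 << FRAC_BITS))
```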
Example of Fixed-Point Subtraction
Steps for Fixed-Point Subtraction
Given: Two fixed-point numbers with the same format (e.g., 8-bit, 4 integer + 4 fractional
bits).
Algorithm:
1. Align the numbers: Ensure both numbers have the same binary point position.
2. Convert to binary: Represent numbers in fixed-point binary.
3. Take Two’s Complement of the Subtrahend (B):
o Invert all bits.
o Add 1 to get the negative representation.
4. Perform Binary Addition: Add the minuend (A) and the two’s complement of the
subtrahend (B).
5. Handle Carry: If there is a carry beyond the allocated bit length, discard it.
6. Normalize the result: Ensure the binary point remains in the fixed position.
Example: Subtract 1.875 from 3.625
• Assume 4 integer bits + 4 fractional bits.
• Convert to fixed-point binary:
o 3.625₁₀ = 0011.1010₂
o 1.875₁₀ = 0001.1110₂
• Take Two’s Complement of 1.875:
1. Invert 0001.1110 → 1110.0001
2. Add 1: 1110.0010 (Two’s complement of 1.875)
• Perform Binary Addition:
  0011.1010 (3.625)
+ 1110.0010 (-1.875)
--------------
  0001.1100 (1.750 in decimal; the carry out of the MSB is discarded)
The result is 1.75 (0001.1100 in binary).
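
The subtraction steps map onto a few integer operations when each value is held as a raw 8-bit pattern (value × 2⁴). The names below are illustrative, and the sketch assumes the same 4 + 4 bit format as the example.

```python
WIDTH = 8
FRAC_BITS = 4
MASK = (1 << WIDTH) - 1


def twos_complement(raw):
    """Invert all bits and add 1, staying within WIDTH bits."""
    return ((raw ^ MASK) + 1) & MASK


def fixed_sub(a, b):
    """Compute A - B as A + (two's complement of B), discarding the carry."""
    return (a + twos_complement(b)) & MASK


a = 0b00111010   # 3.625 as 0011.1010
b = 0b00011110   # 1.875 as 0001.1110
diff = fixed_sub(a, b)
print(format(twos_complement(b), "08b"), format(diff, "08b"), diff / (1 << FRAC_BITS))
```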

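Examples of 4-bit two's complement addition:

(a) (-7) + (+5): 1001 + 0101 = 1110 = -2
(b) (-4) + (+4): 1100 + 0100 = 10000; discarding the carry gives 0000 = 0
(c) (+3) + (+4): 0011 + 0100 = 0111 = +7
(d) (-4) + (-1): 1100 + 1111 = 11011; discarding the carry gives 1011 = -5
(e) (+5) + (+4): 0101 + 0100 = 1001 = overflow (+9 does not fit in 4 bits; the pattern reads as -7)
(f) (-7) + (-6): 1001 + 1010 = 10011 = overflow (-13 does not fit in 4 bits)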
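
Cases (a) to (f) can be checked with a small sketch that adds two bit strings, discards the carry, and flags overflow when both operands share a sign that the result does not. The function name and return shape are assumptions made here.

```python
def add_signed(a_bits, b_bits):
    """Add two equal-length 2's complement bit strings.

    Returns (result_bits, overflow). The carry out of the MSB is discarded.
    """
    n = len(a_bits)
    mask = (1 << n) - 1
    a, b = int(a_bits, 2), int(b_bits, 2)
    total = (a + b) & mask
    sign_a, sign_b, sign_r = a >> (n - 1), b >> (n - 1), total >> (n - 1)
    # Overflow: operands have the same sign but the result's sign differs.
    overflow = sign_a == sign_b and sign_r != sign_a
    return format(total, "0{}b".format(n)), overflow


for a, b in [("1001", "0101"), ("1100", "0100"), ("0101", "0100"), ("1001", "1010")]:
    print(a, "+", b, "->", add_signed(a, b))
```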
Multiplication
Booth's Algorithm (unsigned fixed numbers)

Flowchart of Multiplication:

1. Initially the multiplicand is stored in register B and the multiplier in register Q.


2. The signs of registers B (Bs) and Q (Qs) are compared using XOR (the output is 0 if
both signs are alike, otherwise 1) and the output is stored in As (the sign of register A).
Note: Initially 0 is assigned to register A and to the E flip-flop. The sequence counter (SC) is
initialized with the value n, the number of bits in the multiplier.
3. The least significant bit of the multiplier (Qn) is checked. If it is 1, the multiplicand
(register B) is added to register A; the result is placed in A with the carry bit in
flip-flop E. The content of E A Q is then shifted right by one position: E moves into the
most significant bit (MSB) of A, and the least significant bit of A moves into the
MSB of Q.
4. If Qn = 0, only the shift-right operation on E A Q is performed, in the same
fashion.
5. The sequence counter is decremented by 1.
6. The sequence counter (SC) is checked. If it is 0, the process ends and the final
product is in registers A and Q; otherwise the process repeats from step 3.
Example:
Multiplicand = 10111₂ = 23
Multiplier = 10011₂ = 19
23 × 19 = 437 = 0110110101₂ (A = 01101, Q = 10101)
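
The register steps above amount to a shift-and-add procedure, which can be sketched as below. The variable names follow the registers in the text (B, Q, A, E); the function names and the separate sign handling are assumptions of this sketch.

```python
def shift_add_multiply(multiplicand, multiplier, n):
    """Multiply two n-bit unsigned magnitudes; the 2n-bit product ends up in A,Q."""
    B, Q, A, E = multiplicand, multiplier, 0, 0
    mask = (1 << n) - 1
    for _ in range(n):                       # the sequence counter runs from n to 0
        if Q & 1:                            # LSB of the multiplier is 1: A = A + B
            total = A + B
            E = total >> n                   # carry goes to flip-flop E
            A = total & mask
        # Shift E A Q right by one position.
        Q = (Q >> 1) | ((A & 1) << (n - 1))  # LSB of A moves into the MSB of Q
        A = (A >> 1) | (E << (n - 1))        # E moves into the MSB of A
        E = 0
    return (A << n) | Q


def sign_magnitude_multiply(bs, b_mag, qs, q_mag, n):
    """Return (sign, magnitude); the sign is the XOR of the operand signs."""
    return bs ^ qs, shift_add_multiply(b_mag, q_mag, n)


print(shift_add_multiply(0b10111, 0b10011, 5))  # 23 * 19
```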

Convert 1101001₂ into an equivalent decimal number.


Solution: Using the binary to decimal conversion method, we get:
(1101001)₂ = (1 × 2⁶) + (1 × 2⁵) + (0 × 2⁴) + (1 × 2³) + (0 × 2²) + (0 × 2¹) + (1 × 2⁰)
= 64 + 32 + 0 + 8 + 0 + 0 + 1 = (105)₁₀
Convert the fractional binary number 0.1101₂ to decimal form:
Decimal equivalent of the first bit "1" = 1 × 2^-1 = 0.5
Decimal equivalent of the second bit "1" = 1 × 2^-2 = 0.25
Decimal equivalent of the third bit "0" = 0 × 2^-3 = 0
Decimal equivalent of the fourth bit "1" = 1 × 2^-4 = 0.0625
Decimal equivalent of "0.1101" = 0.5 + 0.25 + 0 + 0.0625 = 0.8125
The binary number 0.1101 converted to decimal is therefore:
0.1101₂ = 0₁₀ + 0.8125₁₀ = 0.8125₁₀
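
Both conversions follow the same positional rule, which a short sketch makes explicit. The function name is hypothetical and the input is assumed to be an unsigned binary string with an optional point.

```python
def binary_to_decimal(text):
    """Sum bit * 2**position over the integer and fractional parts."""
    whole, _, frac = text.partition(".")
    value = 0
    for i, bit in enumerate(reversed(whole)):   # positions 0, 1, 2, ... leftwards
        value += int(bit) * 2 ** i
    for i, bit in enumerate(frac, start=1):     # positions -1, -2, ... rightwards
        value += int(bit) * 2 ** -i
    return value


print(binary_to_decimal("1101001"))
print(binary_to_decimal("0.1101"))
```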
Booth's Algorithm (signed fixed numbers)
Booth observed that multiplication can also be done with a mix of additions and subtractions
instead of additions only, and that this approach handles signed multiplication as well.
The motivation for Booth's Algorithm is that an ALU with add and subtract can reach the same
result in more than one way, i.e. the multiplier 6 can be treated as:
6 = -2 + 8
Booth's Algorithm treats the multiplier as runs of 1's and classifies each bit position as the
beginning, middle or end of a run. The run is identified as below for the number 01110.

Run of 1's
Based on the run status, the operation to be performed in the multiplication process is defined
as in table 9.2. The values of the current bit (Q0) and the outgoing bit (Qe) of the multiplier
decide the operation to be performed. In this way the multiplication needs fewer add/subtract
cycles, depending on the multiplier. A multiplier may contain many runs, depending on its
value, so the algorithm is sensitive to the bit pattern of the multiplier. A pattern like 01010101
is the worst case, since every bit begins or ends a run and requires an addition or subtraction,
so no cycles are saved. By and large, however, Booth’s algorithm saves cycles.
Table 9.2 Booth Encoding for Multiplication: operation regarding the run

Current bit (Q0)   Bit to the right (Qe)   Explanation           Example      Operation
1                  0                       Begins run of 1s      0001111000   Subtract multiplicand from partial product, shift right
1                  1                       Middle of run of 1s   0001111000   No arithmetic operation, shift right
0                  1                       End of run of 1s      0001111000   Add multiplicand to partial product, shift right
0                  0                       Middle of run of 0s   0001111000   No arithmetic operation, shift right
Booth's algorithm uses Arithmetic Shift Right to accumulate the partial product. Arithmetic
shift right is a sign-extended shift: if the sign bit is 0, a 0 is shifted in; if the sign bit is 1,
a 1 is shifted in. For this reason, n+1 is the register size. You may observe this in our work
out in the table. The work out is for (-12 × -11). This example is taken to demonstrate the
outcome of signed multiplication with Booth's algorithm. Both the multiplicand (M) and the
multiplier (Q) use 5 bits rather than 4, to make room for the sign bit.
Note (Qn and Qn+1 here correspond to Q0 and Qe in table 9.2):
i. When the two bits Qn and Qn+1 are 00 or 11, we simply perform the arithmetic shift right
operation (ashr) on the partial product AC together with QR, which moves the next pair
of bits into the Qn and Qn+1 positions.
ii. If the bits Qn and Qn+1 are 01, the multiplicand (M) is added to AC (the accumulator
register). After that, we shift AC and QR right by 1 bit.
iii. If the bits Qn and Qn+1 are 10, the multiplicand (M) is subtracted from AC (the
accumulator register). After that, we shift AC and QR right by 1 bit.
12 = 1100 (binary)
     0011 (1’s complement)
   +    1
     0100 (2’s complement)
Finally, with the sign bit: -12 = 10100
Worked example: M = 010111 (23), -M = 101001, Q = 110111 (-9), using 6-bit registers.

SC   Operation          P        Q        Qe
6    Initial values     000000   110111   0
     Q0=1, Qe=0: P-M    101001   110111   0
     Shift P,Q,Qe       110100   111011   1
5    Shift P,Q,Qe       111010   011101   1
4    Shift P,Q,Qe       111101   001110   1
3    Q0=0, Qe=1: P+M    010100   001110   1
     Shift P,Q,Qe       001010   000111   0
2    Q0=1, Qe=0: P-M    110011   000111   0
     Shift P,Q,Qe       111001   100011   1
1    Shift P,Q,Qe       111100   110001   1

The result in P,Q is 111100110001. Its sign bit is 1, so taking the 2’s complement gives
000011001111 = 207; the product is therefore -207.
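
The trace above can be reproduced with a compact sketch of Booth's procedure. The names P, Q and Qe follow the table; the function name and the use of Python integers masked to n bits are assumptions of this sketch, and it assumes the multiplicand is not the most negative n-bit value.

```python
def booth_multiply(m, q, n):
    """Multiply two signed integers that each fit in n-bit 2's complement."""
    mask = (1 << n) - 1
    M = m & mask                  # multiplicand as an n-bit pattern
    neg_M = (-m) & mask           # its 2's complement
    P, Q, Qe = 0, q & mask, 0
    for _ in range(n):
        pair = (Q & 1, Qe)
        if pair == (1, 0):        # beginning of a run of 1s: P = P - M
            P = (P + neg_M) & mask
        elif pair == (0, 1):      # end of a run of 1s: P = P + M
            P = (P + M) & mask
        # Arithmetic shift right of P,Q,Qe as one unit (sign bit of P is kept).
        Qe = Q & 1
        Q = (Q >> 1) | ((P & 1) << (n - 1))
        P = (P >> 1) | (P & (1 << (n - 1)))
    product = (P << n) | Q        # 2n-bit pattern
    if product >> (2 * n - 1):    # sign bit set: convert the pattern to a negative value
        product -= 1 << (2 * n)
    return product


print(booth_multiply(23, -9, 6))
```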

Floating-Point Representation
This representation does not reserve a specific number of bits for the integer part or the
fractional part. Instead, it reserves a certain number of bits for the number itself (called the
mantissa or significand) and a certain number of bits to say where within that number the
binary point sits (called the exponent).

Example: Suppose a number uses a 32-bit format: 1 sign bit, 8 bits for a signed
exponent, and 23 bits for the fractional part. The leading 1 bit is not stored (as it is always 1
for a normalized number) and is referred to as a “hidden bit”.
Then -53.5 is normalized as -53.5 = (-110101.1)₂ = (-1.101011)₂ × 2^5, which is represented as
follows:

1 00000101 10101100000000000000000
(sign, 8-bit exponent, 23-bit fraction)

where 00000101 is the 8-bit binary value of the exponent +5.
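
The -53.5 example can be reproduced with Python's `math.frexp`, which splits a float into a fraction and a power of two. This is a sketch of the simplified format described here, not of IEEE 754: storing a negative exponent in 2's complement is an assumption, as is the function name.

```python
import math


def encode_simple_float(value, exp_bits=8, frac_bits=23):
    """Encode value as sign, signed exponent, and fraction with a hidden leading 1."""
    if value == 0:
        raise ValueError("zero has no normalized form in this sketch")
    sign = "1" if value < 0 else "0"
    m, e = math.frexp(abs(value))   # abs(value) = m * 2**e with 0.5 <= m < 1
    exponent = e - 1                # renormalize to 1.f * 2**exponent
    fraction = int((m * 2 - 1) * (1 << frac_bits))  # drop the hidden 1, truncate
    exp_field = format(exponent & ((1 << exp_bits) - 1), "0{}b".format(exp_bits))
    frac_field = format(fraction, "0{}b".format(frac_bits))
    return "{} {} {}".format(sign, exp_field, frac_field)


print(encode_simple_float(-53.5))
```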

Real numbers
Real numbers are numbers that include fractions/values after the decimal point.
For example, 123.75 is a real number.
Floating point representation
Real numbers are stored in a computer as floating point numbers using a mantissa (m),
a base (b) and an exponent (e) in this format:
m × b^e
Example (in decimal)
This example gives a general idea of the role of the mantissa, base and exponent. It does not
fully reflect the computer's method for storing real numbers.
The number 123.75 can be represented as a floating point number. To do this, move all the
digits so that the most significant digit is to the right of the decimal point:
123.75 → 0.12375
The number after the decimal point is the mantissa (m).
As this number is written in decimal (denary), the base (b) is 10 .
To work out the exponent (e) count how many decimal places you have moved the decimal
point by (in this case three). So we can represent 123.75 in floating point representation as
this:
0.12375 × 10^3
Example (in binary)
In the Higher course, all floating point representation is in binary.
First convert 123.75 to binary:

Place value   64   32   16   8   4   2   1   0.5   0.25
Bit            1    1    1   1   0   1   1    1     1
64 + 32 + 16 + 8 + 2 + 1 + 0.5 + 0.25 = 123.75
123.75 in binary is 1111011.11
Key fact
The computer will not store the actual binary point as part of the floating point
number; it is shown here for illustrative purposes.
To find the mantissa, move the binary point so that the most significant bit of the
mantissa is immediately to its right:
1111011.11 → 0.111101111
To calculate the exponent, count how many places the point moved to give the
mantissa. In this case the point moved seven places to the left,
so the exponent for our number is 7.
Place value   4   2   1
Bit           1   1   1
In binary, the number 7 is 111, as 4 + 2 + 1 = 7.
In order to represent 123.75, the mantissa would be 111101111 and the exponent would be
111. This can be thought of as:
0.111101111 × 2^111 (exponent written in binary)
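
The mantissa and exponent for 123.75 can be derived mechanically. The sketch below relies on `math.frexp`, which already returns a fraction in the 0.1xxx form used here; the function name and the 9-bit mantissa width are assumptions chosen to match the example, and only positive values are handled.

```python
import math


def mantissa_exponent(value, mantissa_bits=9):
    """Return (mantissa bits after the point, exponent in binary) for value > 0."""
    m, e = math.frexp(value)        # value = m * 2**e with 0.5 <= m < 1
    bits = ""
    for _ in range(mantissa_bits):  # repeated multiplication by 2
        m *= 2
        bit = int(m)
        bits += str(bit)
        m -= bit
    return bits, format(e, "b")


print(mantissa_exponent(123.75))
```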
Sign bit
As well as the mantissa, base and exponent, we have a digit before the decimal point. This is
used as a sign bit and is represented in binary as a 0 for positive and a 1 for negative.
How many bits?
There will always be a trade-off between accuracy and range when using floating point
notation, as there will always be a set number of bits allocated to storing real numbers:
• increasing the number of bits devoted to the mantissa will improve the accuracy of a
floating point number
• increasing the number of bits devoted to the exponent will increase the range of
numbers that can be held
In the Higher course, floating point numbers are represented as follows:
• 1 bit for the sign
• 15 bits for the mantissa
• 8 bits for the exponent

Restoring Division Algorithm


Let us see the flowchart for the restoring division algorithm in computer architecture.
Let us understand the flowchart:
1. Initialize the variables. Register A is set to 0, register Q holds the dividend, and
register M holds the divisor. N is the counter and is set to the number of bits in the
dividend.
2. Shift AQ left by one position (A and Q are treated as a single unit in this step).
3. Subtract M from A and store the result in A.
4. Check the most significant bit of A. If it is 1, set the least significant bit of Q to 0
and restore A to the value it had before the subtraction in step 3. If it is 0, set the
least significant bit of Q to 1.
5. Decrease N by 1.
6. Check the value of N. If N is 0, leave the loop; otherwise go back to step 2.
7. The quotient is now in Q and the remainder in A.

Let us see an example of the restoring division algorithm in computer architecture:


Let the dividend be 0111 (7) and the divisor 0011 (3).

N   M      A      Q      Operation
4   0011   0000   0111   Initialization
           0000   111_   SHL AQ
           1101          A = A-M
           0000   1110   Q[0]=0 and restore A
3   0011   0001   110_   SHL AQ
           1110          A = A-M
           0001   1100   Q[0]=0 and restore A
2   0011   0011   100_   SHL AQ
           0000          A = A-M
           0000   1001   Q[0]=1
1   0011   0001   001_   SHL AQ
           1110          A = A-M
           0001   0010   Q[0]=0 and restore A

We got the quotient as 0010, which is 2, and the remainder as 0001, which is 1.
Let the dividend be 1011 (11) = Q and the divisor 0011 (3) = M, so -M = 1101.

N   M      A      Q      Operation
4   0011   0000   1011   Initialization
           0001   011_   SHL AQ
           1110          A = A-M
           0001   0110   Q[0]=0 and restore A
3   0011   0010   110_   SHL AQ
           1111          A = A-M
           0010   1100   Q[0]=0 and restore A
2   0011   0101   100_   SHL AQ
           0010          A = A-M
           0010   1001   Q[0]=1
1   0011   0101   001_   SHL AQ
           0010          A = A-M
           0010   0011   Q[0]=1

We got the quotient as 0011, which is 3, and the remainder as 0010, which is 2.
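
Both worked examples can be checked with a short sketch of the restoring procedure. The register names follow the text; the function name is hypothetical, and a negative Python integer stands in for "MSB of A is 1", which avoids modelling the extra sign bit that register A would need in hardware.

```python
def restoring_divide(dividend, divisor, n):
    """Divide two n-bit unsigned numbers; return (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    A, Q, M = 0, dividend, divisor
    mask = (1 << n) - 1
    for _ in range(n):                   # counter N runs from n down to 0
        A = (A << 1) | (Q >> (n - 1))    # shift AQ left: MSB of Q moves into A
        Q = (Q << 1) & mask
        A -= M                           # trial subtraction
        if A < 0:
            A += M                       # restore A; Q[0] stays 0
        else:
            Q |= 1                       # subtraction succeeded: Q[0] = 1
    return Q, A


print(restoring_divide(0b0111, 0b0011, 4))
print(restoring_divide(0b1011, 0b0011, 4))
```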
