Lecture 3 - Bits, Bytes, and Integers
198:331 Introduction to Computer Organization
Instructor:
Michael A. Palis
Carnegie Mellon
198:331 Intro to Computer Organization Lecture 3
Lecture Outline
Representing information as bits
Bit-level manipulations
Integers
● Representation: unsigned and signed
● Conversion, casting
● Expanding, truncating
● Addition, subtraction, multiplication, division
Character Data
Everything is Bits
Each bit is 0 or 1
By encoding/interpreting sets of bits in various ways
● Computers determine what to do (instructions)
● … and represent and manipulate numbers, sets, strings, etc. …
Why bits? Electronic Implementation
● Easy to store with bistable elements
● Reliably transmitted on noisy and inaccurate wires
[Figure: voltage waveform for the bit sequence 0 1 0, with level markers at 0.0V, 0.2V, 0.9V, and 1.1V]
Data Representation
Data are represented as sequences of bits, but different data types require different sizes
Sizes are multiples of 1 byte = 8 bits
Size of representation (in bytes) depends on machine and compiler
Number Systems
Positional base b number system
● A base (or radix) b number is written as $(d_{n-1} d_{n-2} \ldots d_1 d_0)_b$
● Each digit $d_i \in \{0, 1, \ldots, b-1\}$
● The decimal value of $(d_{n-1} d_{n-2} \ldots d_1 d_0)_b$ is
$$\sum_{i=0}^{n-1} d_i \times b^i$$
Base 8 or octal
● Each octal digit $d_i \in \{0, 1, 2, 3, 4, 5, 6, 7\}$
● Example: $7216_8 = (7 \times 8^3) + (2 \times 8^2) + (1 \times 8^1) + (6 \times 8^0) = 3726_{10}$
● C declaration: unsigned int y = 07216; // prepend a zero
Base 16 or hexadecimal
● Each hex digit $d_i \in \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F\}$
where: A = 10, B = 11, C = 12, D = 13, E = 14, F = 15
● Example: $2F3A_{16} = (2 \times 16^3) + (15 \times 16^2) + (3 \times 16^1) + (10 \times 16^0) = 12{,}090_{10}$
● C declaration: unsigned int z = 0x2F3A; // prepend “zero-x”
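The positional formula can be checked with a short C sketch. The helper name digits_to_value is hypothetical (it is not part of the lecture); it evaluates the sum of d_i × b^i using Horner's rule, with digits given most significant first.

```c
#include <stddef.h>

/* Hypothetical helper: value of the digit string digits[0..n-1]
 * (most significant digit first) interpreted in the given base.
 * Horner's rule computes the same result as sum(d_i * b^i). */
unsigned long digits_to_value(const unsigned *digits, size_t n, unsigned base)
{
    unsigned long value = 0;
    for (size_t i = 0; i < n; i++)
        value = value * base + digits[i];
    return value;
}
```

For example, the digits {7, 2, 1, 6} in base 8 evaluate to 3726, the same value as the C constant 07216.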
Binary Numbers
Base 2 or binary
● Example: $10110_2 = (1 \times 2^4) + (0 \times 2^3) + (1 \times 2^2) + (1 \times 2^1) + (0 \times 2^0) = 22_{10}$
● Regardless of the base in which a number is expressed at the
program level, it is represented at the machine level in base 2 or
binary – as a sequence of bits (0’s and 1’s)
● Before C23, standard C has no binary integer constants (C23 adds the 0b prefix, which GCC and Clang also accept as an extension)
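Because the machine-level representation is always binary, the bits of a value can be read back with shifts and masks. The function below is a minimal sketch; the name to_binary and its parameters are illustrative assumptions, and width is assumed to be between 1 and the number of bits in an unsigned int.

```c
/* Hypothetical helper: writes the low `width` bits of `value` into `out`
 * as a string of '0'/'1' characters, most significant bit first.
 * `out` must have room for width + 1 characters. */
void to_binary(unsigned value, int width, char *out)
{
    for (int i = 0; i < width; i++) {
        unsigned bit = (value >> (width - 1 - i)) & 1u;
        out[i] = bit ? '1' : '0';
    }
    out[width] = '\0';
}
```

Calling to_binary(22, 5, buf) fills buf with "10110", matching the example above.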
Boolean Algebra
Developed by George Boole in the 19th century
● Algebraic representation of logic
► Encode “TRUE” as 1 and “FALSE” as 0
Boolean Operations
● NOT, AND, OR, Exclusive-OR (in C, ~, &, | and ^, respectively)
► 01101001 represents { 0, 3, 5, 6 }
  76543210 (bit positions)
► 01010101 represents { 0, 2, 4, 6 }
  76543210 (bit positions)
Operations
  Operator   Set operation          Result     Set
  &          Intersection           01000001   { 0, 6 }
  |          Union                  01111101   { 0, 2, 3, 4, 5, 6 }
  ^          Symmetric difference   00111100   { 2, 3, 4, 5 }
  ~          Complement             10101010   { 1, 3, 5, 7 }
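The set operations above map directly onto C's bitwise operators. The sketch below assumes 8-element sets stored in a uint8_t; the type name bitset8 and the function names are illustrative, not from the lecture.

```c
#include <stdint.h>

typedef uint8_t bitset8;   /* bit j set means element j is in the set */

bitset8 set_intersection(bitset8 a, bitset8 b)  { return (bitset8)(a & b); }
bitset8 set_union(bitset8 a, bitset8 b)         { return (bitset8)(a | b); }
bitset8 set_symmetric_diff(bitset8 a, bitset8 b){ return (bitset8)(a ^ b); }
/* ~ promotes to int, so the cast keeps only the low 8 bits. */
bitset8 set_complement(bitset8 a)               { return (bitset8)~a; }

/* Returns 1 if element j (0..7) is a member of s, else 0. */
int set_contains(bitset8 s, int j)              { return (s >> j) & 1; }
```

With a = 0x69 (01101001) and b = 0x55 (01010101), these functions reproduce the four results in the table.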
Bit-Level Operations in C
Operations &, |, ~, ^ are available in C
● Apply to any “integral” data type
► long, int, short, char, unsigned
Shift Operations
Left Shift: x << y
● Shift bit-vector x left y positions
► Throw away extra bits on left
► Fill with 0’s on right
Right Shift: x >> y
● Shift bit-vector x right y positions
► Throw away extra bits on right
● Logical shift
► Fill with 0’s on left
● Arithmetic shift
► Replicate most significant bit on left
Examples
  Argument x     01100010    10100010
  << 3           00010000    00010000
  Log. >> 2      00011000    00101000
  Arith. >> 2    00011000    11101000
Undefined Behavior
● Shift amount < 0 or ≥ word size
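The three shifts can be sketched on 8-bit values as follows. In C, >> on a negative signed value is implementation-defined, so this sketch works on uint8_t and builds the arithmetic shift by hand. The function names are illustrative, and k is assumed to be in the range 0 to 7.

```c
#include <stdint.h>

/* Left shift: bits pushed past bit 7 are discarded by the cast. */
uint8_t shift_left(uint8_t x, unsigned k)
{
    return (uint8_t)(x << k);
}

/* Logical right shift: vacated bits on the left are filled with 0. */
uint8_t shift_right_logical(uint8_t x, unsigned k)
{
    return (uint8_t)(x >> k);
}

/* Arithmetic right shift: vacated bits are filled with the original MSB. */
uint8_t shift_right_arithmetic(uint8_t x, unsigned k)
{
    uint8_t shifted = (uint8_t)(x >> k);
    if (x & 0x80u)
        shifted |= (uint8_t)(0xFFu << (8 - k));   /* set the top k bits */
    return shifted;
}
```

For x = 0xA2 (10100010), the logical shift by 2 gives 0x28 and the arithmetic shift by 2 gives 0xE8, as in the examples above.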
using w bits
● Example: unsigned int x = 43;
► $43_{10} = 101011_2$
► In C, unsigned int is typically 32 bits (4 bytes)
► 32-bit representation
00000000000000000000000000101011 = 0x0000002B
Binary representation
● Sign-magnitude
● Ones’ complement
● Two’s complement – dominantly used today
x 00000000000000000000000000101011 = 0x0000002B
y 10000000000000000000000000101011 = 0x8000002B
(leftmost bit: sign bit; remaining bits: magnitude, left-filled with zeros if needed)
positive integer
Example
● int x = 43, y = -43;
● 32-bit ones’ complement representation
x 00000000000000000000000000101011 = 0x0000002B
y 11111111111111111111111111010100 = 0xFFFFFFD4
positive integer
Example
● int x = 43, y = -43;
● 32-bit two’s complement machine representation
x 00000000000000000000000000101011 = 0x0000002B
y 11111111111111111111111111010101 = 0xFFFFFFD5
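The three signed encodings of 43 and -43 shown on these slides can be reproduced with a short sketch. The function names are illustrative assumptions; each returns the 32-bit pattern as a uint32_t, and the input is assumed to have magnitude below 2^31 so that it is representable in all three encodings.

```c
#include <stdint.h>

/* Magnitude of v as an unsigned value (unsigned negation is well defined). */
static uint32_t magnitude(int32_t v)
{
    return v < 0 ? (uint32_t)0 - (uint32_t)v : (uint32_t)v;
}

/* Sign-magnitude: sign bit in bit 31, magnitude in the remaining bits. */
uint32_t sign_magnitude_bits(int32_t v)
{
    return v < 0 ? (0x80000000u | magnitude(v)) : magnitude(v);
}

/* Ones' complement: a negative value is the bitwise NOT of its magnitude. */
uint32_t ones_complement_bits(int32_t v)
{
    return v < 0 ? ~magnitude(v) : magnitude(v);
}

/* Two's complement: NOT of the magnitude, plus one. */
uint32_t twos_complement_bits(int32_t v)
{
    return v < 0 ? ~magnitude(v) + 1u : magnitude(v);
}
```

For -43 these give 0x8000002B, 0xFFFFFFD4, and 0xFFFFFFD5 respectively; for +43 all three give 0x0000002B.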
Bits   Unsigned   Sign-magnitude   Ones’ complement   Two’s complement
0000   0          0                0                  0
0001   1          1                1                  1
0010   2          2                2                  2
0011   3          3                3                  3
0100   4          4                4                  4
0101   5          5                5                  5
0110   6          6                6                  6
0111   7          7                7                  7
1000   8          −0               −7                 −8
1001   9          −1               −6                 −7
1010   10         −2               −5                 −6
1011   11         −3               −4                 −5
1100   12         −4               −3                 −4
1101   13         −5               −2                 −3
1110   14         −6               −1                 −2
1111   15         −7               −0                 −1
Sign-magnitude and ones’ complement each have two distinct representations for zero
In two’s complement, every binary pattern represents a unique integer value, a property it shares with unsigned representation
Unsigned and two’s complement arithmetic can use the same hardware
Weights in signed representation: $-2^3,\ 2^2,\ 2^1,\ 2^0$
$B2T(1011) = 1 \cdot (-2^3) + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = -5_{10}$
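The weighted sum behind B2T can be sketched in C as follows. The function name b2t and the string-based interface are illustrative assumptions; the input is a string of '0' and '1' characters of length 1 to 31, most significant bit first.

```c
#include <string.h>

/* Two's complement value of a bit string: the leftmost bit has weight
 * -2^(w-1), and every other bit i has weight +2^i. */
long b2t(const char *bits)
{
    size_t w = strlen(bits);
    long value = 0;
    for (size_t i = 0; i < w; i++) {
        if (bits[i] != '1')
            continue;
        long weight = 1L << (w - 1 - i);
        value += (i == 0) ? -weight : weight;
    }
    return value;
}
```

b2t("1011") evaluates to -8 + 0 + 2 + 1 = -5, matching the worked example.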
1101 13 −3
1110 14 −2
1111 15 −1
integer
● Discussed earlier (decimal to binary conversion)
Numeric Ranges
Unsigned
● UMin = 0
► 000…0
● UMax = $2^w - 1$
► 111…1
Signed 2’s Complement
● TMin = $-2^{w-1}$
► 100…0
● TMax = $2^{w-1} - 1$
► 011…1
● Other Values
► Minus 1 = 111…1
Values for w = 16
        Decimal   Hex     Binary
UMax    65535     FF FF   11111111 11111111
TMax    32767     7F FF   01111111 11111111
TMin    -32768    80 00   10000000 00000000
-1      -1        FF FF   11111111 11111111
0       0         00 00   00000000 00000000
Observations
● |TMin| = TMax + 1
► Asymmetric range
● UMax = 2∙TMax + 1
C Programming
● #include <limits.h>
● Declares constants, e.g.,
► ULONG_MAX
► LONG_MAX
► LONG_MIN
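Both observations can be checked against the limits.h constants. This is a small sketch with hypothetical function names; it assumes a two's complement platform, which covers essentially all current machines.

```c
#include <limits.h>

/* |TMin| = TMax + 1, written so that no intermediate value overflows. */
int long_range_is_asymmetric(void)
{
    return LONG_MIN == -LONG_MAX - 1;
}

/* UMax = 2 * TMax + 1, computed in unsigned long arithmetic. */
int ulong_max_matches_long_max(void)
{
    return ULONG_MAX == 2UL * (unsigned long)LONG_MAX + 1UL;
}
```

Both functions return 1 on a two's complement platform where unsigned long has no padding bits.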
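Bits   Signed   Unsigned
0000   0        0
0001   1        1
0010   2        2
0011   3        3
0100   4        4
0101   5        5
0110   6        6
0111   7        7
1000   −8       8
1001   −7       9
1010   −6       10
1011   −5       11
1100   −4       12
1101   −3       13
1110   −2       14
1111   −1       15
T2U maps each signed value to the unsigned value with the same bits; U2T maps in the other direction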
For bit patterns 0000 to 0111, the signed and unsigned values are equal (0 to 7)
For bit patterns 1000 to 1111, the signed and unsigned values differ by 16 (+/- 16); e.g., 1010 is −6 signed and 10 unsigned
$$T2U(x) = \begin{cases} x + 2^w, & x < 0 \\ x, & x \ge 0 \end{cases}$$
$$U2T(u) = \begin{cases} u, & u < 2^{w-1} \\ u - 2^w, & u \ge 2^{w-1} \end{cases}$$
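The two piecewise definitions translate directly into C. The sketch below uses 64-bit intermediates so that any word size w from 1 to 32 can be tried; the function names t2u and u2t mirror the slide notation, but the signatures are assumptions.

```c
#include <stdint.h>

/* T2U: add 2^w to negative values, leave non-negative values unchanged.
 * x is assumed to lie in the w-bit signed range. */
uint64_t t2u(int64_t x, unsigned w)
{
    return x < 0 ? (uint64_t)(x + ((int64_t)1 << w)) : (uint64_t)x;
}

/* U2T: subtract 2^w from values at or above 2^(w-1).
 * u is assumed to lie in the w-bit unsigned range. */
int64_t u2t(uint64_t u, unsigned w)
{
    uint64_t half = (uint64_t)1 << (w - 1);
    return u < half ? (int64_t)u : (int64_t)u - ((int64_t)1 << w);
}
```

With w = 4, t2u(-6, 4) is 10 and u2t(10, 4) is -6, matching the 4-bit table.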
Casting Example
C Program
int main(void)
{
int tx, ty = -16;
unsigned int ux = 2, uy;
Output
tx = fffffff2 (hex) = -14 (dec)
ty = fffffff0 (hex) = -16 (dec)
ux = 2 (hex) = 2 (dec)
uy = fffffff2 (hex) = 4294967282 (dec)
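The statements between the declarations and the output are not preserved in the text above. The sketch below is one possible reconstruction that yields the same output, not necessarily the original program: it assumes uy is computed as the mixed sum ux + ty and tx as uy converted back to int. It also assumes 32-bit int, and the unsigned-to-int conversion of an out-of-range value is implementation-defined before C23 (mainstream compilers keep the bit pattern).

```c
#include <stdio.h>

/* In a mixed expression, the int operand is converted to unsigned first,
 * so -16 becomes 4294967280 before the addition. */
unsigned mixed_sum(unsigned ux, int ty)
{
    return ux + ty;
}

/* Converts an unsigned value back to int (same bits on common platforms). */
int as_signed(unsigned u)
{
    return (int)u;
}

/* Assumed reconstruction of the example; prints the four lines of output. */
void print_casting_example(void)
{
    int tx, ty = -16;
    unsigned int ux = 2, uy;

    uy = mixed_sum(ux, ty);
    tx = as_signed(uy);

    printf("tx = %x (hex) = %d (dec)\n", (unsigned)tx, tx);
    printf("ty = %x (hex) = %d (dec)\n", (unsigned)ty, ty);
    printf("ux = %x (hex) = %u (dec)\n", ux, ux);
    printf("uy = %x (hex) = %u (dec)\n", uy, uy);
}
```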
Expanding Integers
Given a w-bit integer x, convert it to a w’-bit integer with the
same value, where w’ > w
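The expansion rules themselves are not preserved in the text above. The standard approach, assumed here, is zero extension for unsigned values and sign extension (replicating the most significant bit) for two's complement values. The sketch expands 8-bit patterns to 16 bits; the function names are illustrative.

```c
#include <stdint.h>

/* Zero extension: new high-order bits are 0 (preserves unsigned value). */
uint16_t zero_extend_8_to_16(uint8_t bits)
{
    return (uint16_t)bits;
}

/* Sign extension: new high-order bits copy the old sign bit
 * (preserves two's complement value). Returns the 16-bit pattern. */
uint16_t sign_extend_8_to_16(uint8_t bits)
{
    uint16_t wide = bits;
    if (bits & 0x80u)
        wide |= 0xFF00u;
    return wide;
}
```

The 8-bit pattern 0xD5 (-43 in two's complement) sign-extends to 0xFFD5, which is -43 in 16 bits, while zero extension gives 0x00D5 (213).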
Truncating Integers
Given w-bit integer x, convert it to a k-bit integer, k < w
● Retain k rightmost bits and delete the rest
● Does not guarantee that value is preserved
Unsigned Integer
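● Example (w = 10, k = 5): 10010 01011 = $587_{10}$ truncates to 01011 = $11_{10}$
● Truncation could result in overflow
► 587 not representable with 5 bits
Signed Integer
● Example (w = 10, k = 5): 10010 01011 = $-437_{10}$ truncates to 01011 = $11_{10}$
● Truncation could result in overflow or a change in sign
● Result is actually U2T(x mod $2^k$)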
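Truncation keeps the k low-order bits, which is the same as taking x mod 2^k; for signed values the result is then reinterpreted with U2T. A minimal sketch, with hypothetical function names and k assumed to be between 1 and 31:

```c
#include <stdint.h>

/* Unsigned truncation: keep the k rightmost bits (x mod 2^k). */
uint32_t truncate_unsigned(uint32_t x, unsigned k)
{
    return x & ((1u << k) - 1u);
}

/* Signed truncation: keep the k rightmost bits, then apply U2T so that
 * a set bit k-1 makes the result negative. */
int32_t truncate_signed(int32_t x, unsigned k)
{
    uint32_t low = (uint32_t)x & ((1u << k) - 1u);
    if (low & (1u << (k - 1)))
        return (int32_t)((int64_t)low - ((int64_t)1 << k));
    return (int32_t)low;
}
```

Both truncate_unsigned(587, 5) and truncate_signed(-437, 5) return 11, as in the slide examples; truncate_signed(43, 4) returns -5, showing a change in sign.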
Unsigned Addition
Recall addition in decimal
  carry digits    1 1 1
                    4 9 7
                +   9 2 8
  sum digits      1 4 2 5
Similarly in binary
  carry bits      1 1 1 1 1            (decimal)
                  0 1 1 0 1 1    =    27
              +   0 0 1 1 0 1    =    13
  sum bits        1 0 1 0 0 0    =    40
Unsigned Addition
Unsigned Overflow
● Adding two w-bit unsigned integers may produce a result that
requires more than w bits. This is called overflow
● For unsigned addition, overflow occurs when there is a carry out of
the most significant bit position
                          1 1 1
                            0 1 1 0 1 1    =    27
                        +   1 0 1 0 0 0    =    40
  True sum: w+1 bits      1 0 0 0 0 1 1    =    67
  Sum stored: w bits        0 0 0 0 1 1    =     3
Unsigned Addition
  Operands: w bits             u
                             + v
  True Sum: w+1 bits           u + v
  Discard Carry: w bits        $UAdd_w(u, v)$
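Discarding the carry is equivalent to reducing the true sum modulo 2^w. The sketch below models w-bit unsigned addition for any w from 1 to 32 and reports overflow; the function name uadd and its out-parameter are illustrative assumptions, and both operands are assumed to fit in w bits.

```c
#include <stdint.h>

/* w-bit unsigned addition: returns the w low-order bits of u + v and
 * sets *overflow to 1 when the true sum needs more than w bits. */
uint32_t uadd(uint32_t u, uint32_t v, unsigned w, int *overflow)
{
    uint64_t mask = ((uint64_t)1 << w) - 1;   /* w one-bits */
    uint64_t true_sum = (uint64_t)u + v;      /* exact sum, up to w+1 bits */
    *overflow = true_sum > mask;
    return (uint32_t)(true_sum & mask);
}
```

With w = 6, uadd(27, 13, 6, &of) returns 40 with no overflow, while uadd(27, 40, 6, &of) returns 3 with overflow set, matching the worked examples.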
number in between
Subtraction
Signed Subtraction
● Reduces to signed addition since A – B ≡ A + (–B)
● Therefore, to compute A – B:
► Compute –B = two’s complement(B)
► Add –B to A
Unsigned Subtraction
● Uses same algorithm except first check that A ≥ B
► If A < B result will be negative but will be interpreted as a (large)
unsigned number
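The reduction of subtraction to addition can be sketched with 32-bit patterns. The function names are hypothetical; the two's complement of B is computed as ~B + 1, and the unsigned variant performs the A ≥ B check first.

```c
#include <stdint.h>

/* A - B computed as A + (two's complement of B), modulo 2^32. */
uint32_t sub_via_twos_complement(uint32_t a, uint32_t b)
{
    return a + (~b + 1u);
}

/* Unsigned subtraction with the A >= B check. Returns 1 and stores the
 * difference on success; returns 0 (leaving *diff untouched) if A < B. */
int unsigned_sub_checked(uint32_t a, uint32_t b, uint32_t *diff)
{
    if (a < b)
        return 0;
    *diff = sub_via_twos_complement(a, b);
    return 1;
}
```

sub_via_twos_complement(5, 7) yields 4294967294, the large unsigned number that the bit pattern for -2 represents, which is why the checked variant rejects that case.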
Unsigned Multiplication
Recall multiplication in decimal
          1 3 5    multiplicand
      ×   4 2 3    multiplier
          4 0 5
        2 7 0      partial products
      5 4 0
      5 7 1 0 5    product
Unsigned Multiplication
Similarly in binary
            1 1 0 1    = $13_{10}$
        ×   0 1 0 1    = $5_{10}$
            1 1 0 1
          0 0 0 0
        1 1 0 1
      0 0 0 0
      1 0 0 0 0 0 1    = $65_{10}$
Unsigned Multiplication
Observe that each partial product is simply either:
● multiplicand shifted left by i bit positions (if i-th multiplier bit = 1), or
● zero (if i-th multiplier bit = 0)
            1 1 0 1
        ×   0 1 0 1
            1 1 0 1    (multiplicand × 1) shift left 0
          0 0 0 0      (multiplicand × 0) shift left 1
        1 1 0 1        (multiplicand × 1) shift left 2
      0 0 0 0          (multiplicand × 0) shift left 3
      1 0 0 0 0 0 1
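The observation above leads to the usual shift-and-add loop. This is a minimal sketch for 16-bit operands with a 32-bit product; the function name is illustrative, and the loop mirrors the partial products shown rather than any particular hardware design.

```c
#include <stdint.h>

/* Shift-and-add multiplication of two w = 16 bit unsigned operands,
 * producing the full 2w = 32 bit product. */
uint32_t shift_add_multiply(uint16_t multiplicand, uint16_t multiplier)
{
    uint32_t product = 0;
    uint32_t shifted = multiplicand;      /* multiplicand << i */

    for (int i = 0; i < 16; i++) {
        if ((multiplier >> i) & 1u)       /* i-th multiplier bit is 1 */
            product += shifted;           /* add this partial product */
        shifted <<= 1;
    }
    return product;
}
```

shift_add_multiply(13, 5) adds the partial products 13 and 13 << 2 = 52, giving 65.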
Unsigned Multiplication
Unsigned Multiplication Algorithm
Unsigned Multiplication in C
  Operands: w bits               u
                               * v
  True Product: 2*w bits         u · v
  Discard w bits: w bits         $UMult_w(u, v)$
Signed Multiplication
If the binary numbers represent 2’s complement signed integers, the
algorithm doesn’t work
            1 1 0 1    = $-3_{10}$
        ×   0 1 0 1    = $5_{10}$
            1 1 0 1
          0 0 0 0
        1 1 0 1
      0 0 0 0
      1 0 0 0 0 0 1    ≠ $-15_{10}$
Signed Multiplication
Why doesn’t the algorithm work?
                1 1 0 1    = $-3_{10}$
            ×   0 1 0 1    = $5_{10}$
          0 0 0 1 1 0 1    ← This is a positive number ($+13_{10}$)!
          0 0 0 0 0 0
          0 1 1 0 1
          0 0 0 0
          1 0 0 0 0 0 1    ≠ $-15_{10}$
Signed Multiplication
To correct the problem, the multiplicand should be sign-extended
to 2w bits before forming partial products.
  Sign-extend     1 1 1 1 1 1 0 1    = $-3_{10}$
              ×           0 1 0 1    = $5_{10}$
                  1 1 1 1 1 1 0 1
                  0 0 0 0 0 0 0
                  1 1 1 1 0 1
                  0 0 0 0 0
                  1 1 1 1 0 0 0 1    = $-15_{10}$
● Carry out of the (2w)-th bit is discarded in each addition because we’re
performing two’s complement addition. (That is, we keep only the 2w least
significant bits of the product.)
Signed Multiplication
Unfortunately, the tweak does not work if the multiplier is
negative:
                  1 1 1 1 1 1 0 1    = $-3_{10}$
              ×           1 0 1 1    = $-5_{10}$
                  1 1 1 1 1 1 0 1
                  1 1 1 1 1 0 1
                  0 0 0 0 0 0
                  1 1 1 0 1
                  1 1 0 1 1 1 1 1    ≠ $15_{10}$
Signed Multiplication
Dealing with a negative multiplier
● Let A = multiplicand and B = multiplier
● Rewrite the multiplier as B = ±B* where B* is the magnitude of B
● If B = +B* (i.e., B is positive) then the multiplication algorithm gives the
correct product because A × B = A × B*
● If B = –B* (i.e., B is negative) the multiplication algorithm gives the
incorrect product because it actually computes:
A × two’s complement(B*) = A × ($2^w$ – B*) = (A × $2^w$) – (A × B*)
● If B = –B* the product should be A × B = A × (– B*) = – (A × B*)
Key Observations
● Subtracting (A × $2^w$) from (A × $2^w$) – (A × B*) yields – (A × B*), which is
the correct product
● (A × $2^w$) is equal to A shifted left by w bits
● Subtracting (A × $2^w$) is the same as adding the two’s complement of (A × $2^w$)
Signed Multiplication
Signed Two’s Complement Multiplication Algorithm
1. Set-up:
a. Initialize 2w-bit product to 0
b. Sign-extend multiplicand to 2w bits
2. For i = 0 to w-1 do the following:
a. If i-th bit of multiplier = 1, add multiplicand to product
b. Shift multiplicand left by 1 bit
3. If MSB of multiplier = 1, add two’s complement of multiplicand to product
4. Product holds multiplicand × multiplier
Note: Step 3 is the only step added to the unsigned multiplication algorithm. ➞ Can
merge the two algorithms into a single one that works for both signed and unsigned.
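The four steps can be sketched in C on raw bit patterns. The function name and signature are assumptions for illustration: operands are w-bit two's complement patterns (w from 1 to 16) passed in the low bits of a uint32_t, and the result is the 2w-bit product pattern.

```c
#include <stdint.h>

/* Signed two's complement multiplication by shift-and-add, following the
 * steps above. All arithmetic is kept to 2w bits by masking. */
uint32_t signed_multiply_bits(uint32_t multiplicand, uint32_t multiplier,
                              unsigned w)
{
    uint32_t mask = (w == 16) ? 0xFFFFFFFFu : ((1u << (2 * w)) - 1u);
    uint32_t sign_bit = 1u << (w - 1);

    /* Step 1: product = 0; sign-extend multiplicand to 2w bits. */
    uint32_t product = 0;
    uint32_t m = multiplicand;
    if (m & sign_bit)
        m |= mask & ~((1u << w) - 1u);

    /* Step 2: add shifted multiplicand for each 1 bit of the multiplier. */
    for (unsigned i = 0; i < w; i++) {
        if ((multiplier >> i) & 1u)
            product = (product + m) & mask;
        m = (m << 1) & mask;
    }

    /* Step 3: m now holds A * 2^w; if the multiplier is negative,
     * add its two's complement (that is, subtract A * 2^w). */
    if (multiplier & sign_bit)
        product = (product + (~m + 1u)) & mask;

    /* Step 4: product holds multiplicand * multiplier in 2w bits. */
    return product;
}
```

With w = 4, multiplying 1101 (-3) by 1011 (-5) returns 0x0F (00001111 = 15), and multiplying 1101 by 0101 (5) returns 0xF1 (11110001 = -15).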
Signed Multiplication
Algorithm Illustration:
                  1 1 1 1 1 1 0 1    = $-3_{10}$
              ×           1 0 1 1    = $-5_{10}$
                  1 1 1 1 1 1 0 1
                  1 1 1 1 1 0 1 0
                  0 0 0 0 0 0 0 0
                  1 1 1 0 1 0 0 0
                  0 0 1 1 0 0 0 0    ← two’s complement of shifted multiplicand
                  0 0 0 0 1 1 1 1    = $15_{10}$
After shifting left, multiplicand = 11010000.
Since MSB(multiplier) = 1:
- Take two’s complement of multiplicand: 11010000 → 00110000
- Add it to product
Signed Multiplication in C
  Operands: w bits               u
                               * v
  True Product: 2*w bits         u · v
  Discard w bits: w bits         $TMult_w(u, v)$
                 2 5      quotient
  divisor  1 4 ) 3 5 4    dividend
               - 2 8
                   7 4
                 - 7 0
                     4    remainder
                0 1 1 1    = $7_{10}$
6 = 1 1 0 ) 1 0 1 1 1 0    = $46_{10}$
          -   1 1 0
              1 0 1 1
            -   1 1 0
                1 0 1 0
              -   1 1 0
                  1 0 0    = $4_{10}$
Check: 46 = 7×6 + 4
  R              D              Test           Q (after step)
  1 0 1 1 1 0    0 1 1 0 0 0    R ≥ D          0 0 0 1
  0 1 0 1 1 0    0 0 1 1 0 0    R ≥ D          0 0 1 1
  0 0 1 0 1 0    0 0 0 1 1 0    R ≥ D          0 1 1 1
  0 0 0 1 0 0                   R < divisor    0 1 1 1
The final R is the remainder; the final Q is the quotient
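The trace above follows a shift-and-subtract pattern: compare R with a shifted copy of the divisor D, subtract when R ≥ D, record a quotient bit, and shift D right. The sketch below is one way to write that loop for 16-bit operands; the function name is hypothetical and the divisor is assumed to be nonzero.

```c
#include <stdint.h>

/* Shift-and-subtract unsigned division for 16-bit operands.
 * D starts as the divisor aligned to the top bit position and moves
 * right one bit per step; Q collects one quotient bit per step. */
void shift_subtract_divide(uint16_t dividend, uint16_t divisor,
                           uint16_t *quotient, uint16_t *remainder)
{
    uint32_t R = dividend;
    uint32_t D = (uint32_t)divisor << 15;
    uint16_t Q = 0;

    for (int i = 0; i < 16; i++) {
        Q = (uint16_t)(Q << 1);
        if (R >= D) {          /* divisor fits: subtract and record a 1 */
            R -= D;
            Q |= 1u;
        }
        D >>= 1;
    }
    *quotient = Q;
    *remainder = (uint16_t)R;
}
```

Dividing 46 by 6 passes through R = 46, 22, 10, 4 in its last three steps, the same values as the trace, and ends with quotient 7 and remainder 4.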
sign(remainder) = sign(dividend)
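C follows the same convention: since C99, integer division truncates toward zero, so the remainder from % takes the sign of the dividend. A small sketch with a hypothetical wrapper, assuming a nonzero divisor and no INT_MIN / -1 overflow:

```c
/* Signed division as C defines it: the quotient truncates toward zero
 * and quotient * divisor + remainder == dividend. */
void signed_divide(int dividend, int divisor, int *quotient, int *remainder)
{
    *quotient = dividend / divisor;
    *remainder = dividend % divisor;
}
```

For example, -7 divided by 2 gives quotient -3 and remainder -1, while 7 divided by -2 gives quotient -3 and remainder 1.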
Character Data
In C, the char data type is considered an integer data type
Character Data
To meet the demand for more characters and symbols used in other
languages, the ASCII character set was subsequently extended to encode
256 characters, each assigned an 8-bit binary value.
Character Data
Extended ASCII does not address the diversity of writing systems in the
world (e.g., Hebrew, Russian, Chinese, ...). To address this problem,
Unicode was developed.
Current Unicode standard supports almost 150,000 characters
covering over 160 modern and historic writing systems, as well as
symbols and emoji.
Unicode’s “Universal Character Set” encodes characters as 32-bit
binary numbers. However, alternate encodings are supported. E.g.,
UTF-8 uses a variable-length encoding of 1 to 4 bytes for each
character.
The first 128 characters of UTF-8 correspond one-to-one with ASCII,
making valid ASCII text valid UTF-8-encoded Unicode as well.
(However, this does not hold for extended ASCII.)
See https://home.unicode.org/ for further information.
Java uses Unicode for its representation of strings. C strings are typically
ASCII-encoded, but program libraries are available to support Unicode.
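The variable-length scheme of UTF-8 can be illustrated with a small encoder. This is a sketch with a hypothetical function name, not a library API: it maps one Unicode code point to 1 to 4 bytes and returns 0 for values that are not valid scalar values (surrogates or anything above U+10FFFF).

```c
#include <stdint.h>

/* Encodes code point cp as UTF-8 into out[0..3].
 * Returns the number of bytes written (1 to 4), or 0 if cp is invalid. */
int utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                         /* 7 bits: same as ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                        /* 11 bits: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                      /* 16 bits: 1110xxxx + 2 bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                        /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                    /* 21 bits: 11110xxx + 3 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

The letter A (U+0041) encodes as the single byte 0x41, identical to ASCII, while the euro sign (U+20AC) encodes as the three bytes E2 82 AC.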
The End