Cache Tutorial

This document provides information about direct mapped caches. It explains that in a direct mapped cache with 8 sets and a block size of 1 word, the bottom 2 address bits are the byte offset within the word (always 00 for word-aligned addresses), and the next 3 bits indicate the set. For example, addresses 0x00000014 and 0x00000034 both map to set 5. It also includes worked examples on pipelining speedup, interrupts versus polling, DMA, and disk access timing.

Uploaded by

rktiwary256034
Copyright © All Rights Reserved

CACHE TUTORIAL

rktiwary
Direct Mapped Cache

This mapping is illustrated in the figure for a direct mapped cache with a capacity of eight words and a block size of one word. The cache has eight sets, each of which contains a one-word block. The bottom two bits of the address are always 00, because addresses are word aligned. The next log₂ 8 = 3 bits indicate the set onto which the memory address maps. Thus, the data at addresses 0x00000004, 0x00000024, ..., 0xFFFFFFE4 all map to set 1, as shown in blue. Likewise, data at addresses 0x00000010, ..., 0xFFFFFFF0 all map to set 4, and so forth.
Example: CACHE FIELDS
To what cache set in the figure does the word at address 0x00000014 map? Name another address that maps to the same set.
Solution: The two least significant bits of the address are 00, because the address is word aligned. The next three bits are 101, so the word maps to set 5. Words at addresses 0x34, 0x54, 0x74, ..., 0xFFFFFFF4 all map to this same set.
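The set calculation in this example can be checked with a short sketch. The function name `cache_set` and its default parameters are illustrative choices, not part of the original slides; the defaults assume the eight-set, one-word-block cache described above with 4-byte words.

```python
def cache_set(address, num_sets=8, block_bytes=4):
    """Return the set index of a byte address in a direct mapped cache.

    Assumes one block per set and a power-of-two number of sets.
    """
    # Dividing by the block size drops the byte-offset bits (the low 2 bits
    # for 4-byte words); the modulo keeps the next log2(num_sets) bits.
    return (address // block_bytes) % num_sets


for addr in (0x00000004, 0x00000024, 0x00000014, 0x00000034, 0xFFFFFFF4):
    print(f"0x{addr:08X} -> set {cache_set(addr)}")
```

Addresses that differ only in the bits above bit 4 land in the same set, which is why they compete for the same one-word block.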
Eg. Consider an un-pipelined processor. Assume that it has a 1-ns clock cycle and that it uses 4 cycles for ALU operations, 5 cycles for branches, and 4 cycles for memory operations. Assume that the relative frequencies of these operations are 50%, 35%, and 15%, respectively. Suppose that, due to clock skew and setup, pipelining the processor adds 0.15 ns of overhead to the clock. Ignoring any latency impact, how much speedup in the instruction execution rate will we gain from a pipeline?
Solution: The average instruction execution time on the un-pipelined processor is
= clock cycle × average CPI
= 1 ns × ((0.5 × 4) + (0.35 × 5) + (0.15 × 4))
= 4.35 ns
On the pipelined processor, ideally one instruction completes every clock cycle, so the average instruction execution time is the clock cycle plus the overhead
= 1 ns + 0.15 ns
= 1.15 ns
So speedup = 4.35 / 1.15 ≈ 3.78
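The arithmetic can be reproduced with a small sketch. The helper names `average_cpi` and `pipeline_speedup` are illustrative, and the overhead is taken as the 0.15 ns given in the problem statement; the model assumes an ideal pipeline that completes one instruction per (lengthened) clock cycle.

```python
def average_cpi(mix):
    """Weighted average cycles per instruction for (frequency, cycles) pairs."""
    return sum(freq * cycles for freq, cycles in mix)


def pipeline_speedup(mix, clock_ns, overhead_ns):
    """Speedup of an ideal pipeline over the un-pipelined processor."""
    unpipelined_ns = clock_ns * average_cpi(mix)   # average time per instruction
    pipelined_ns = clock_ns + overhead_ns          # one instruction per clock
    return unpipelined_ns / pipelined_ns


# (frequency, cycles): ALU operations, branches, memory operations
mix = [(0.50, 4), (0.35, 5), (0.15, 4)]
print(f"average CPI = {average_cpi(mix):.2f}")
print(f"speedup     = {pipeline_speedup(mix, 1.0, 0.15):.2f}")
```

Passing a different `overhead_ns` shows how sensitive the speedup is to the clock overhead.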
Eg. A given processor requires 1000 cycles to perform a context switch and start an interrupt handler (and the same number of cycles to switch back to the program that was running when the interrupt occurred), or 500 cycles to poll an I/O device. An I/O device attached to that processor makes 150 requests per second, each of which takes 10,000 cycles to resolve once the handler has been started. By default, the processor polls every 0.5 ms if it is not using interrupts.
a. How many cycles per second does the processor spend handling I/O from the device if interrupts are used?
b. How many cycles per second are spent on I/O if polling is used (include all polling attempts)? Assume the processor only polls during time slices when user programs are not running, so do not include any context-switch time in your calculation.
c. How often would the processor have to poll for polling to take as many cycles per second as interrupts?
Solution:
a. The device makes 150 requests per second, each of which requires one interrupt. Each interrupt takes 12,000 cycles (1000 to start the handler, 10,000 for the handler, 1000 to switch back to the original program), for a total of 1,800,000 cycles spent handling this device each second.
b. The processor polls every 0.5 ms, or 2000 times/s. Each polling attempt takes 500 cycles, so it spends 1,000,000 cycles/s polling. In 150 of the polling attempts, a request is waiting from the I/O device, each of which takes 10,000 cycles to complete, for another 1,500,000 cycles. Therefore, the total time spent on I/O each second is 2,500,000 cycles with polling.
c. In the polling case, the 150 polling attempts that find a request waiting consume 1,500,000 cycles. Therefore, for polling to match the interrupt case, an additional 300,000 cycles must be consumed, which is 300,000 / 500 = 600 polls, for a rate of 600 polls/s (one poll about every 1.67 ms).
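The three answers follow from the same handful of constants, as the sketch below shows. The constant and function names are illustrative; the values are the ones stated in the problem, and integer arithmetic is used throughout.

```python
CONTEXT_SWITCH_CYCLES = 1000   # each way: into and out of the handler
POLL_CYCLES = 500              # cost of one polling attempt
SERVICE_CYCLES = 10_000        # work needed per request
REQUESTS_PER_SEC = 150


def interrupt_cycles_per_sec():
    """Cycles/s with interrupts: switch in, service, switch back."""
    per_request = 2 * CONTEXT_SWITCH_CYCLES + SERVICE_CYCLES
    return REQUESTS_PER_SEC * per_request


def polling_cycles_per_sec(polls_per_sec):
    """Cycles/s with polling: every poll costs cycles, plus the real work."""
    return polls_per_sec * POLL_CYCLES + REQUESTS_PER_SEC * SERVICE_CYCLES


def break_even_polls_per_sec():
    """Polling rate at which polling costs as much as interrupts."""
    spare = interrupt_cycles_per_sec() - REQUESTS_PER_SEC * SERVICE_CYCLES
    return spare // POLL_CYCLES


print(interrupt_cycles_per_sec())      # part a
print(polling_cycles_per_sec(2000))    # part b: one poll every 0.5 ms
print(break_even_polls_per_sec())      # part c
```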
Eg. An I/O device transfers 10 MB/s of data into the memory of a processor over the I/O bus, which has a total data transfer capacity of 100 MB/s. The 10 MB/s of data is transferred as 2500 independent pages of 4 KB each. If the processor operates at 200 MHz, it takes 1000 cycles to initiate a DMA transaction, and 1500 cycles to respond to the device's interrupt when the DMA transfer completes, what fraction of the processor's time is spent handling the data transfer with and without DMA?
Solution: Without DMA, the processor must copy the data into the memory as the I/O device sends it over the bus. Because the device sends 10 MB/s over the I/O bus, which has a capacity of 100 MB/s, 10% of each second is spent transferring data over the bus. Assuming the processor is busy handling data during the time that each page is being transferred over the bus (which is a reasonable assumption because the time between transfers is too short to be worth doing a context switch), then 10% of the processor's time is spent copying data into memory.
With DMA, the processor is free to work on other tasks, except when initiating each DMA and responding to the DMA interrupt at the end of each transfer. This takes 1000 + 1500 = 2500 cycles/transfer, or a total of 2500 × 2500 = 6,250,000 cycles spent handling DMA each second. Because the processor operates at 200 MHz, this means that 3.125% of each second, or 3.125% of the processor's time, is spent handling DMA.
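A brief sketch of the two fractions, using the numbers from the problem. The function names are illustrative, and the no-DMA case relies on the same assumption as the solution: the processor is busy whenever the bus is carrying the device's data.

```python
CLOCK_HZ = 200_000_000        # 200 MHz processor
BUS_MB_PER_SEC = 100          # total I/O bus capacity
DEVICE_MB_PER_SEC = 10        # data rate of the device
TRANSFERS_PER_SEC = 2500      # one DMA transaction per 4 KB page
DMA_SETUP_CYCLES = 1000
DMA_INTERRUPT_CYCLES = 1500


def cpu_fraction_without_dma():
    """Assumption: the CPU is tied up for as long as the bus is in use."""
    return DEVICE_MB_PER_SEC / BUS_MB_PER_SEC


def cpu_fraction_with_dma():
    """CPU only pays to start each transfer and to handle its interrupt."""
    cycles = TRANSFERS_PER_SEC * (DMA_SETUP_CYCLES + DMA_INTERRUPT_CYCLES)
    return cycles / CLOCK_HZ


print(f"without DMA: {cpu_fraction_without_dma():.3%}")
print(f"with DMA:    {cpu_fraction_with_dma():.3%}")
```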
Q 5.8 Design a 16-bit memory of total capacity 8192 bits using SRAM chips of size 64 × 1 bit. Give the array configuration of the chips on the memory board, showing all required input and output signals for assigning this memory to the lowest address space. The design should allow for both byte and 16-bit word accesses.
Disk Data Layout
Winchester Disk Format
Timing of a Disk I/O Transfer
Track selection involves moving the head in a movable-head system or electronically selecting one head on a fixed-head system. On a movable-head system, the time it takes to position the head at the track is known as seek time. In either case, once the track is selected, the disk controller waits until the appropriate sector rotates to line up with the head. The time it takes for the beginning of the sector to reach the head is known as rotational delay, or rotational latency.

The sum of the seek time, if any, and the rotational delay equals the access time, which is the time it takes to get into position to read or write. Once the head is in position, the data transfer can begin.
ROTATIONAL DELAY: Disks, other than floppy disks, rotate at speeds ranging from 3600 rpm (for handheld devices such as digital cameras) up to, as of this writing, 20,000 rpm; at this latter speed, there is one revolution per 3 ms. Thus, on average, the rotational delay will be 1.5 ms.

TRANSFER TIME: T = b / (rN)

where:
T : transfer time, in seconds
b : number of bytes to be transferred
N : number of bytes on a track
r : rotation speed, in revolutions per second
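The two formulas above translate directly into code. This is a minimal sketch: the function names are illustrative, results are returned in milliseconds, and the track size used in the sanity check is a hypothetical value chosen only to show that reading a whole track takes exactly one revolution.

```python
def rotational_delay_ms(rpm):
    """Average rotational delay: half of one revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm


def transfer_time_ms(b, rpm, bytes_per_track):
    """T = b / (r * N), with r converted from rpm to revolutions per second."""
    r = rpm / 60
    return 1000 * b / (r * bytes_per_track)


# At 20,000 rpm one revolution takes 3 ms, so the average delay is half of that.
print(rotational_delay_ms(20_000))

# Sanity check with a hypothetical 100,000-byte track: transferring the
# whole track (b == N) takes one full revolution, about 3 ms.
HYPOTHETICAL_TRACK_BYTES = 100_000
print(transfer_time_ms(HYPOTHETICAL_TRACK_BYTES, 20_000, HYPOTHETICAL_TRACK_BYTES))
```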
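Comparison between Sequential and Random Access

Consider a disk with an advertised average seek time of 4 ms, a rotation speed of 15,000 rpm, and 512-byte sectors with 500 sectors per track. Suppose that we wish to read a file consisting of 2500 sectors, for a total of 1.28 Mbytes, and want to estimate the total time for the transfer.

Sequential access: assume the file occupies all the sectors of 5 adjacent tracks (5 tracks × 500 sectors/track = 2500 sectors). The time to read the first track is:
Average seek              4 ms
Average rotational delay  2 ms
Read 500 sectors          4 ms
Total                     10 ms
The remaining four tracks can then be read with essentially no seek time, so each takes 2 + 4 = 6 ms, and the whole file takes 10 + (4 × 6) = 34 ms = 0.034 seconds.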

Random access: now assume the sectors are accessed randomly rather than sequentially. For each sector:

Average seek              4 ms
Average rotational delay  2 ms
Read 1 sector             0.008 ms
Total                     6.008 ms

Total time = 2500 × 6.008 ms = 15,020 ms = 15.02 seconds

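Transfer time for one sector: T = b / (rN) = 512 / (250 × 256,000) s = 0.008 ms

where r = 15,000 rpm = 250 revolutions per second,
N = number of bytes on a track = 500 × 512 = 256,000 bytes, and
b = number of bytes in a sector = 512 bytes.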
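The sequential and random-access estimates can be compared side by side. The sketch below uses illustrative names and makes one assumption explicit: in the sequential case the file sits on adjacent tracks, so there is a single seek and each track costs one average rotational delay plus one full revolution.

```python
SEEK_MS = 4.0
RPM = 15_000
SECTORS_PER_TRACK = 500
FILE_SECTORS = 2500

REV_MS = 60_000 / RPM                   # time for one revolution
ROT_DELAY_MS = REV_MS / 2               # average rotational delay
SECTOR_MS = REV_MS / SECTORS_PER_TRACK  # time to read one sector


def sequential_ms():
    """Assumption: file occupies adjacent tracks, so only one seek is needed."""
    tracks = FILE_SECTORS // SECTORS_PER_TRACK
    return SEEK_MS + tracks * (ROT_DELAY_MS + REV_MS)


def random_ms():
    """Every sector pays a full seek and rotational delay."""
    return FILE_SECTORS * (SEEK_MS + ROT_DELAY_MS + SECTOR_MS)


print(f"sequential: {sequential_ms():.0f} ms")
print(f"random:     {random_ms():.0f} ms")
print(f"ratio:      {random_ms() / sequential_ms():.0f}x")
```

The gap of more than two orders of magnitude comes almost entirely from repeating the seek and rotational delay for every sector.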
Q. A disk unit has 24 recording surfaces. It has a total of 14,000 cylinders (that is, 14,000 tracks per surface). There is an average of 400 sectors per track. Each sector contains 512 bytes of data.
a. What is the maximum number of bytes that can be stored in this unit?
Sol: (a) The maximum number of bytes that can be stored on this disk is 24 × 14,000 × 400 × 512 = 68.8 × 10^9 bytes.
b. What is the data transfer rate in bytes per second at a rotational speed of 7200 rpm?
Sol: (b) One track (400 × 512 bytes) passes under the head per revolution, so the data transfer rate is (400 × 512 × 7200) / 60 = 24.58 × 10^6 bytes/s.
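Both results are simple products, which a few lines of integer arithmetic confirm. The constant and function names below are illustrative; the values come from the question.

```python
SURFACES = 24
CYLINDERS = 14_000          # tracks per surface
SECTORS_PER_TRACK = 400     # average
SECTOR_BYTES = 512
RPM = 7200


def capacity_bytes():
    """Total capacity: surfaces x tracks x sectors x bytes per sector."""
    return SURFACES * CYLINDERS * SECTORS_PER_TRACK * SECTOR_BYTES


def transfer_rate_bytes_per_sec():
    """One full track passes under the head on every revolution."""
    track_bytes = SECTORS_PER_TRACK * SECTOR_BYTES
    return track_bytes * RPM // 60


print(f"capacity      = {capacity_bytes():,} bytes")
print(f"transfer rate = {transfer_rate_bytes_per_sec():,} bytes/s")
```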
c. Using a 32-bit word, suggest a suitable scheme for specifying the disk address, assuming that there are 512 bytes per sector.
Sol: (c) We need 9 bits to identify a sector (400 ≤ 2^9 = 512), 14 bits for a track (14,000 ≤ 2^14 = 16,384), and 5 bits for a surface (24 ≤ 2^5 = 32). Thus, a possible scheme is to use address bits A8-A0 for sector, A22-A9 for track, and A27-A23 for surface identification. Bits A31-A28 are not used.
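The bit layout suggested in the solution can be exercised with a pair of helper functions. `pack_disk_address` and `unpack_disk_address` are hypothetical names introduced here for illustration; the field widths and positions are the ones given above.

```python
SECTOR_BITS, TRACK_BITS, SURFACE_BITS = 9, 14, 5
TRACK_SHIFT = SECTOR_BITS                  # track field starts at bit A9
SURFACE_SHIFT = SECTOR_BITS + TRACK_BITS   # surface field starts at bit A23


def pack_disk_address(surface, track, sector):
    """Pack (surface, track, sector) into the low 28 bits of a 32-bit word."""
    if not (0 <= surface < 24 and 0 <= track < 14_000 and 0 <= sector < 400):
        raise ValueError("field out of range for this disk unit")
    return (surface << SURFACE_SHIFT) | (track << TRACK_SHIFT) | sector


def unpack_disk_address(address):
    """Recover (surface, track, sector) from a packed address."""
    sector = address & ((1 << SECTOR_BITS) - 1)
    track = (address >> TRACK_SHIFT) & ((1 << TRACK_BITS) - 1)
    surface = (address >> SURFACE_SHIFT) & ((1 << SURFACE_BITS) - 1)
    return surface, track, sector


addr = pack_disk_address(surface=23, track=13_999, sector=399)
print(f"0x{addr:08X}", unpack_disk_address(addr))
```

Even the largest valid address fits below bit A28, which is why the top four bits of the word stay unused.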
Consider a single-platter disk with the following
parameters: rotation speed: 7200 rpm;
number of tracks on one side of platter: 30,000;
number of sectors per track: 600;
seek time: one ms for every hundred tracks
traversed.
Let the disk receive a request to access a random
sector on a random track and assume the disk
head starts at track 0.
a. What is the average seek time?
If the target is track 0, there is no seek time; for any other track the head has to move. On average, 29,999/2 ≈ 15,000 tracks are traversed. At 1 ms per 100 tracks, the average seek time is about 15,000 / 100 = 150 ms.
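b. What is the average rotational latency?
At 7200 rpm, one rotation takes 60 / 7200 s = 8.33 ms. On average the required sector is half a rotation away, so the average rotational latency is 8.33 / 2 = 4.167 ms.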

c. What is the transfer time for a sector?
Time for one rotation = 60 / 7200 s = 8.33 ms (twice the 4.167 ms average rotational latency).
There are 600 sectors per track,
so the transfer time for one sector is 8.33 ms / 600 ≈ 0.0139 ms (about 13.9 µs).
d. What is the total average time to satisfy a request?
Add the average seek time, the average rotational latency, and the sector transfer time: 150 + 4.167 + 0.014 ≈ 154 ms.
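All four parts of this question can be computed from the stated parameters. The sketch below uses illustrative names and assumes the target track is uniformly distributed over tracks 0 to 29,999, with the head starting at track 0 as the question says.

```python
RPM = 7200
TRACKS = 30_000
SECTORS_PER_TRACK = 600
SEEK_MS_PER_TRACK = 1 / 100   # one ms for every hundred tracks traversed


def average_seek_ms():
    """Head starts at track 0; mean distance to a uniform random track."""
    return (TRACKS - 1) / 2 * SEEK_MS_PER_TRACK


def rotation_ms():
    """Time for one full rotation."""
    return 60_000 / RPM


def average_rotational_latency_ms():
    """On average the sector is half a rotation away."""
    return rotation_ms() / 2


def sector_transfer_ms():
    """One sector is 1/600 of a track, so 1/600 of a rotation."""
    return rotation_ms() / SECTORS_PER_TRACK


def total_ms():
    return (average_seek_ms() + average_rotational_latency_ms()
            + sector_transfer_ms())


print(f"a. average seek       = {average_seek_ms():.3f} ms")
print(f"b. rotational latency = {average_rotational_latency_ms():.3f} ms")
print(f"c. sector transfer    = {sector_transfer_ms():.4f} ms")
print(f"d. total              = {total_ms():.1f} ms")
```

The seek time dominates the total, so the result is insensitive to the exact sector transfer time.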
