This document provides information about direct mapped caches. It explains that in a direct mapped cache with 8 sets and a 1-word block size, the bottom 2 address bits are the byte offset within the word (always 00 for word-aligned addresses), and the next 3 bits indicate the set. For example, addresses 0x00000014 and 0x00000034 both map to set 5. It also includes examples of calculating pipelining speedup, I/O overhead, and disk access times.
Cache Tutorial
rktiwary Direct Mapped Cache
This mapping is illustrated in the figure for a direct mapped cache with a capacity of eight words and a block size of one word. The cache has eight sets, each of which contains a one-word block. The bottom two bits of the address are always 00, because addresses are word aligned. The next log2(8) = 3 bits indicate the set onto which the memory address maps. Thus, the data at addresses 0x00000004, 0x00000024, ..., 0xFFFFFFE4 all map to set 1, as shown in blue. Likewise, data at addresses 0x00000010, ..., 0xFFFFFFF0 all map to set 4, and so forth.

Example: CACHE FIELDS
To what cache set in the figure does the word at address 0x00000014 map? Name another address that maps to the same set.

Solution
The two least significant bits of the address are 00, because the address is word aligned. The next three bits are 101, so the word maps to set 5. Words at addresses 0x34, 0x54, 0x74, ..., 0xFFFFFFF4 all map to this same set.

Eg. Consider an un-pipelined processor. Assume that it has a 1-ns clock cycle and that it uses 4 cycles for ALU operations, 5 cycles for branches and 4 cycles for memory operations. Assume that the relative frequencies of these operations are 50%, 35% and 15% respectively. Suppose that due to clock skew and setup, pipelining the processor adds 0.2 ns of overhead to the clock. Ignoring any latency impact, how much speedup in the instruction execution rate will we gain from a pipeline?

The average instruction execution time on the un-pipelined processor is
clock cycle x average CPI = 1 ns x ((0.5 x 4) + (0.35 x 5) + (0.15 x 4)) = 4.35 ns
The average instruction execution time on the pipelined processor is
1 ns + 0.2 ns = 1.2 ns
So speedup = 4.35 / 1.2 = 3.625
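Both worked examples above come down to a few lines of arithmetic, which the following Python sketch reproduces. The function names cache_set and pipeline_speedup are hypothetical helpers introduced here for illustration. The sketch assumes byte addressing with 4-byte words, and it uses the 0.2 ns overhead that appears in the worked solution.

```python
def cache_set(address, num_sets=8, bytes_per_word=4):
    """Set index for a direct mapped cache with one-word blocks.

    Assumes byte addresses and 4-byte words, so the two low bits are the
    byte offset and the next log2(num_sets) bits select the set.
    """
    word_number = address // bytes_per_word   # drop the byte-offset bits
    return word_number % num_sets             # keep the set-index bits


def pipeline_speedup(clock_ns, overhead_ns, mix):
    """Speedup of a pipelined over an un-pipelined processor.

    mix is a list of (relative frequency, cycles) pairs. The pipelined
    machine is assumed to complete one instruction per lengthened cycle.
    """
    avg_cpi = sum(freq * cycles for freq, cycles in mix)
    unpipelined_ns = clock_ns * avg_cpi
    pipelined_ns = clock_ns + overhead_ns
    return unpipelined_ns / pipelined_ns


if __name__ == "__main__":
    for addr in (0x00000004, 0x00000014, 0x00000034, 0xFFFFFFF0):
        print(f"0x{addr:08X} -> set {cache_set(addr)}")
    mix = [(0.50, 4), (0.35, 5), (0.15, 4)]   # ALU, branch, memory
    print(f"speedup = {pipeline_speedup(1.0, 0.2, mix):.3f}")
```

Passing a different overhead_ns shows how sensitive the speedup is to clock skew and setup time.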
A given processor requires 1000 cycles to perform a context switch and start an interrupt handler (and the same number of cycles to switch back to the program that was running when the interrupt occurred), or 500 cycles to poll an I/O device. An I/O device attached to that processor makes 150 requests per second, each of which takes 10,000 cycles to resolve once the handler has been started. By default, the processor polls every 0.5 ms if it is not using interrupts.
a. How many cycles per second does the processor spend handling I/O from the device if interrupts are used?
b. How many cycles per second are spent on I/O if polling is used (include all polling attempts)? Assume the processor only polls during time slices when user programs are not running, so do not include any context-switch time in your calculation.
c. How often would the processor have to poll for polling to take as many cycles per second as interrupts?

a. The device makes 150 requests per second, each of which requires one interrupt. Each interrupt takes 12,000 cycles (1000 to start the handler, 10,000 for the handler, 1000 to switch back to the original program), for a total of 150 x 12,000 = 1,800,000 cycles spent handling this device each second.
b. The processor polls every 0.5 ms, or 2000 times/s. Each polling attempt takes 500 cycles, so it spends 1,000,000 cycles/s polling. In 150 of the polling attempts, a request is waiting from the I/O device, each of which takes 10,000 cycles to complete, for another 1,500,000 cycles. Therefore, the total time spent on I/O each second is 2,500,000 cycles with polling.
c. In the polling case, the 150 polling attempts that find a request waiting consume 1,500,000 cycles. Therefore, for polling to match the interrupt case, an additional 1,800,000 - 1,500,000 = 300,000 cycles must be consumed, or 300,000 / 500 = 600 polls, for a rate of 600 polls/s.

Eg. An I/O device transfers 10 MB/s of data into the memory of a processor over the I/O bus, which has a total data transfer capacity of 100 MB/s.
The 10 MB/s of data is transferred as 2500 independent pages of 4 KB each. If the processor operates at 200 MHz, and it takes 1000 cycles to initiate a DMA transaction and 1500 cycles to respond to the device's interrupt when the DMA transfer completes, what fraction of the processor's time is spent handling the data transfer with and without DMA?

Without DMA, the processor must copy the data into the memory as the I/O device sends it over the bus. Because the device sends 10 MB/s over the I/O bus, which has a capacity of 100 MB/s, 10% of each second is spent transferring data over the bus. Assuming the processor is busy handling data during the time that each page is being transferred over the bus (which is a reasonable assumption because the time between transfers is too short to be worth doing a context switch), 10% of the processor's time is spent copying data into memory.

With DMA, the processor is free to work on other tasks, except when initiating each DMA and responding to the DMA interrupt at the end of each transfer. This takes 1000 + 1500 = 2500 cycles/transfer, or a total of 2500 x 2500 = 6,250,000 cycles spent handling DMA each second. Because the processor operates at 200 MHz, this means that 3.125% of each second, or 3.125% of the processor's time, is spent handling DMA.

Q 5.8 Design a 16-bit memory of total capacity 8192 bits using SRAM chips of size 64 x 1 bit. Give the array configuration of the chips on the memory board showing all required input and output signals for assigning this memory to the lowest address space. The design should allow for both byte and 16-bit word accesses.

[Figure: Disk Data Layout]
[Figure: Winchester Disk Format]
[Figure: Timing of a Disk I/O Transfer]

Track selection involves moving the head in a movable-head system or electronically selecting one head on a fixed-head system. On a movable-head system, the time it takes to position the head at the track is known as seek time.
In either case, once the track is selected, the disk controller waits until the appropriate sector rotates to line up with the head. The time it takes for the beginning of the sector to reach the head is known as rotational delay, or rotational latency.
The sum of the seek time, if any, and the rotational delay equals the access time, which is the time it takes to get into position to read or write. Once the head is in position, the read or write operation is performed as the sector moves under the head; the time this takes is the transfer time.

ROTATIONAL DELAY
Disks, other than floppy disks, rotate at speeds ranging from 3600 rpm (for handheld devices such as digital cameras) up to, as of this writing, 20,000 rpm; at this latter speed, there is one revolution per 3 ms. Thus, on average, the rotational delay will be half a revolution, or 1.5 ms.
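The rotational delay figures above follow directly from the rotation speed. Here is a minimal Python sketch, using the hypothetical helper name avg_rotational_delay_ms and assuming the requested sector is equally likely to be anywhere on the track, so that the average wait is half a revolution:

```python
def avg_rotational_delay_ms(rpm):
    """Average rotational delay in milliseconds for a disk spinning at rpm.

    Assumption: the target sector is uniformly distributed around the
    track, so on average the disk turns half a revolution before it arrives.
    """
    ms_per_revolution = 60_000.0 / rpm   # 60,000 ms in one minute
    return ms_per_revolution / 2.0


if __name__ == "__main__":
    for rpm in (3600, 7200, 15000, 20000):
        print(f"{rpm:>6} rpm: {avg_rotational_delay_ms(rpm):.3f} ms")
```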
TRANSFER TIME: T = b/(rN)
where:
T = transfer time
b = number of bytes to be transferred
N = number of bytes on a track
r = rotation speed, in revolutions per second

Comparison between sequential and random access
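Consider a disk with an advertised average seek time of 4 ms, a rotation speed of 15,000 rpm, and 512-byte sectors with 500 sectors per track. Suppose that we wish to read a file consisting of 2500 sectors, for a total of 1.28 Mbytes. Estimate the total time for the transfer.

With sequential organization the file occupies 2500/500 = 5 adjacent tracks. The time to read the first track is:
Average seek: 4 ms
Average rotational delay: 2 ms
Read 500 sectors: 4 ms
Total: 10 ms
The remaining four tracks can be read with essentially no seek time thereafter, so each needs only 2 + 4 = 6 ms. Total time = 10 + (4 x 6) = 34 ms = 0.034 seconds.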
Using random access rather than sequential access, each sector requires:
Average seek: 4 ms
Average rotational delay: 2 ms
Read 1 sector: 0.008 ms
Total: 6.008 ms
Total time = 2500 x 6.008 = 15020 ms = 15.02 seconds
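The comparison above can be reproduced with a short Python sketch built on the transfer-time formula T = b/(rN). All function names are hypothetical, introduced only for illustration. The sequential estimate assumes the file fills whole adjacent tracks and that only the first track incurs a seek, which matches the "essentially no seek time" remark in the example.

```python
import math


def transfer_time_ms(b, track_bytes, rpm):
    """T = b / (r * N), converted to milliseconds.

    b: bytes to transfer, track_bytes: N (bytes on one track),
    rpm: rotation speed, converted below to r in revolutions per second.
    """
    r = rpm / 60.0
    return b / (r * track_bytes) * 1000.0


def sequential_ms(sectors, sectors_per_track, sector_bytes, seek_ms, rpm):
    """Assumes whole adjacent tracks; only the first track needs a seek."""
    track_bytes = sectors_per_track * sector_bytes
    tracks = math.ceil(sectors / sectors_per_track)
    rot_delay = 0.5 * 60_000.0 / rpm                       # half a revolution
    read_track = transfer_time_ms(track_bytes, track_bytes, rpm)
    return seek_ms + tracks * (rot_delay + read_track)


def random_ms(sectors, sectors_per_track, sector_bytes, seek_ms, rpm):
    """Every sector pays a full seek and an average rotational delay."""
    track_bytes = sectors_per_track * sector_bytes
    rot_delay = 0.5 * 60_000.0 / rpm
    read_sector = transfer_time_ms(sector_bytes, track_bytes, rpm)
    return sectors * (seek_ms + rot_delay + read_sector)


if __name__ == "__main__":
    args = (2500, 500, 512, 4.0, 15000)
    print(f"sequential: {sequential_ms(*args):.3f} ms")
    print(f"random:     {random_ms(*args):.3f} ms")
```

The gap between the two totals comes almost entirely from repeating the seek and rotational delay for every sector; the data transfer itself is a small part of either figure.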
Transfer time for one sector: T = b/(rN) = 512 / (250 x 256,000) s = 0.008 ms
where r = 15,000 rpm = 250 revolutions per second, N = number of bytes on a track = 500 x 512 = 256,000 bytes, and b = number of bytes in a sector = 512 bytes.

Q. A disk unit has 24 recording surfaces. It has a total of 14,000 cylinders. There is an average of 400 sectors per track. Each sector contains 512 bytes of data.
a. What is the maximum number of bytes that can be stored in this unit?
Sol: (a) The maximum number of bytes that can be stored on this disk is 24 x 14,000 x 400 x 512 = 68.8 x 10^9 bytes.
b. What is the data transfer rate in bytes per second at a rotational speed of 7200 rpm?
Sol: (b) One track passes under the head per revolution, so the data transfer rate is (400 x 512 x 7200)/60 = 24.58 x 10^6 bytes/s.
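The capacity and transfer-rate answers for this disk unit can be checked with a few lines of Python. This is a sketch with hypothetical function names. It assumes one track per surface per cylinder, and that a full track is read in one revolution.

```python
def disk_capacity_bytes(surfaces, cylinders, sectors_per_track, bytes_per_sector):
    """Maximum capacity, assuming one track per surface in every cylinder."""
    return surfaces * cylinders * sectors_per_track * bytes_per_sector


def transfer_rate_bytes_per_s(sectors_per_track, bytes_per_sector, rpm):
    """Sustained rate when one full track is read per revolution."""
    bytes_per_track = sectors_per_track * bytes_per_sector
    revolutions_per_second = rpm / 60.0
    return bytes_per_track * revolutions_per_second


if __name__ == "__main__":
    capacity = disk_capacity_bytes(24, 14_000, 400, 512)
    rate = transfer_rate_bytes_per_s(400, 512, 7200)
    print(f"capacity      = {capacity / 1e9:.1f} x 10^9 bytes")
    print(f"transfer rate = {rate / 1e6:.2f} x 10^6 bytes/s")
```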
c. Using a 32-bit word, suggest a suitable scheme for specifying the disk address, assuming that there are 512 bytes per sector.
Sol: (c) We need 9 bits to identify a sector (2^9 = 512 >= 400), 14 bits for a track (2^14 = 16,384 >= 14,000), and 5 bits for a surface (2^5 = 32 >= 24). Thus, a possible scheme is to use address bits A8-A0 for the sector, A22-A9 for the track, and A27-A23 for surface identification. Bits A31-A28 are not used.

Q. Consider a single-platter disk with the following parameters: rotation speed: 7200 rpm; number of tracks on one side of platter: 30,000; number of sectors per track: 600; seek time: one ms for every hundred tracks traversed. Let the disk receive a request to access a random sector on a random track and assume the disk head starts at track 0.
a. What is the average seek time?
Sol: (a) If the requested track is track 0, no seek is needed; for any other track the head must move. On average the head traverses 29,999/2, or about 15,000, tracks. At 1 ms per 100 tracks, the average seek time is about 150 ms.
b. What is the average rotational latency?
Sol: (b) At 7200 rpm one rotation takes 60/7200 s = 8.33 ms, so the average rotational latency is half a rotation, or about 4.167 ms.
c. What is the transfer time for a sector?
Sol: (c) Time for one rotation = 4.167 ms x 2 = 8.33 ms. There are 600 sectors per track, so the time for one sector is 8.33/600 = 0.01389 ms (about 13.89 microseconds).
d. What is the total average time to satisfy a request?