Understanding Caches in Computer Systems


Explore the importance of caches in computer systems, the impact of caching on performance, and the concept of memory hierarchy. Learn about CPU-memory gaps, caching strategies, and the memory hierarchy levels from registers to remote secondary storage.

  • Computer systems
  • Caches
  • Memory hierarchy
  • Performance optimization
  • CPU




Presentation Transcript


  1. Lecture 14: Caches CS 105 Spring 2021

  2. Life without caches. You decide that you want to learn more about computer systems than is covered in this course. The library contains all the books you could possibly want, but you don't like to study in libraries; you prefer to study at home. You have the following constraints: a desk (can hold one book) and the library (can hold many books).

  3. Life without caches. Average latency to access a book: 40 mins. Average throughput (incl. reading time): 1.2 books/hr.

  4. A Computer System. [diagram: the CPU (register file, PC, ALU) connects through its bus interface to the system bus; an I/O bridge links the system bus to the memory bus (main memory) and to the I/O bus, which serves a USB controller (mouse, keyboard), a graphics adapter (display), a disk controller (disk, where the hello executable is stored), and expansion slots for other devices such as network adapters.]

  5. The CPU-Memory Gap. [chart: access time in ns (log scale, 0.1 to 100,000,000) versus year (1985-2015) for Disk, SSD, DRAM, SRAM, and CPU; processor speeds have improved far faster than DRAM and disk access times, so the gap between them keeps widening.]

  6. Caching: The Very Idea. Keep some memory values nearby in fast memory. Modern systems have 3 or even 4 levels of caches. The cache idea is widely used: disk controllers, the web, and virtual memory (main memory is a cache for the disk).

  7. Memory Hierarchy. Smaller, faster, and costlier (per byte) storage devices sit at the top; larger, slower, and cheaper (per byte) storage devices sit at the bottom.
     L0: Registers. CPU registers hold words retrieved from the L1 cache.
     L1: L1 cache (SRAM). Holds cache lines retrieved from the L2 cache.
     L2: L2 cache (SRAM). Holds cache lines retrieved from the L3 cache.
     L3: L3 cache (SRAM). Holds cache lines retrieved from main memory.
     L4: Main memory (DRAM). Holds disk blocks retrieved from local disks.
     L5: Local secondary storage (local disks). Holds files retrieved from disks on remote servers.
     L6: Remote secondary storage (e.g., cloud, web servers).

  8. Latency numbers every programmer should know (2020)
     L1 cache reference:                  1 ns
     Branch mispredict:                   3 ns
     L2 cache reference:                  4 ns
     Main memory reference:               100 ns
     Memory 1MB sequential read:          3,000 ns       (3 µs)
     SSD random read:                     16,000 ns      (16 µs)
     SSD 1MB sequential read:             49,000 ns      (49 µs)
     Magnetic disk seek:                  2,000,000 ns   (2 ms)
     Magnetic disk 1MB sequential read:   825,000 ns     (825 µs)
     Round trip in datacenter:            500,000 ns     (500 µs)
     Round trip CA<->Europe:              150,000,000 ns (150 ms)

  9. Life with caching. Average latency to access a book: <20 mins. Average throughput (incl. reading time): ~2 books/hr.

  10. Caching: The Vocabulary.
     Size: the total number of bytes that can be stored in the cache.
     Cache Hit: the desired value is in the cache and is returned quickly.
     Cache Miss: the desired value is not in the cache and must be fetched from a more distant cache (or ultimately from main memory).

  11. Exercise 1: Caching Strategies. How should we decide which books to keep on the bookshelf?

  12. Example Access Patterns

     int sum = 0;
     for (int i = 0; i < n; i++) {
         sum += a[i];
     }
     return sum;

     Data references: reference array elements in succession; reference the variable sum on each iteration. Instruction references: reference instructions in sequence; cycle through the loop repeatedly.

  13. Example Access Patterns

  14. 15 Principle of Locality Programs tend to use data and instructions with addresses near or equal to those they have used recently Temporal locality: Recently referenced items are likely to be referenced again in the near future Spatial locality: Items with nearby addresses tend to be referenced close together in time
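Loop order is a classic consequence of spatial locality. A minimal sketch (the function names and the 4x4 size are our own, not from the slides): both functions compute the same sum, but the row-major walk touches consecutive addresses while the column-major walk strides through memory, touching a new cache block on nearly every access once rows exceed the block size.

```c
#include <stddef.h>

#define N 4  /* small illustrative size; real effects show up at larger N */

/* Row-major traversal: consecutive addresses, good spatial locality. */
int sum_rows(int a[N][N]) {
    int sum = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += a[i][j];   /* adjacent elements share a cache block */
    return sum;
}

/* Column-major traversal: stride of N ints, poor spatial locality. */
int sum_cols(int a[N][N]) {
    int sum = 0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            sum += a[i][j];   /* each access jumps N*sizeof(int) bytes */
    return sum;
}
```

Both loops also exhibit temporal locality on sum itself, which the compiler typically keeps in a register.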

  15. CACHE ORGANIZATION

  16. Cache Lines. [diagram: a cache line consists of a valid bit (v), a tag, and a data block of bytes 0-7.]
     data block: cached data (i.e., a copy of bytes from memory)
     tag: uniquely identifies which data is stored in the cache line
     valid bit: indicates whether or not the line contains meaningful information
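The three fields can be modeled as a C struct. This is an illustrative sketch (field names and widths are our own, not a hardware layout), assuming an 8-byte data block as in the diagram:

```c
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SIZE 8  /* bytes per data block, matching the 0..7 diagram */

/* One cache line: a valid bit, a tag, and a block of cached bytes. */
typedef struct {
    bool     valid;             /* does this line hold meaningful data? */
    uint64_t tag;               /* identifies which memory block is cached */
    uint8_t  data[BLOCK_SIZE];  /* copy of BLOCK_SIZE bytes from memory */
} cache_line;
```

A real cache stores the tag and valid bit in dedicated SRAM arrays alongside the data; the struct only captures the logical contents.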

  17. Direct-mapped Cache. An address is split into tag, index, and offset fields: the index finds the cache line, and the offset identifies the byte within the line. [diagram: four cache lines, each with a valid bit, a tag, and data bytes 0-7.]

  18. Example: Direct-mapped Cache. Assume a cache block size of 8 bytes and an 8-bit machine. Address of data: 0xB4 = 1011 0100, which splits into tag 101, index 10, and offset 100. The index (10) selects Line 2, whose tag (101) matches and whose valid bit is set, so this access hits; the offset (100) selects byte 4 of the block.
     Line 0: v=1, tag=110, data 0F 12 AB 34 FF FF EA 68
     Line 1: v=1, tag=001, data 00 00 00 00 00 40 06 1D
     Line 2: v=1, tag=101, data 0D 00 00 00 2F 00 00 00
     Line 3: v=0, tag=001, data 00 11 22 33 44 55 66 77

  19. Exercise 2: Interpreting Addresses Consider the hex address 0xA59. What would be the tag, index, and offset for this address with each of the following cache configurations? 1. A direct-mapped cache with 8 cache lines and 8-byte data blocks 2. A direct-mapped cache with 16 cache lines and 4-byte data blocks 3. A direct-mapped cache with 16 cache lines and 8-byte data blocks

  20. Exercise 2: Interpreting Addresses. Consider the hex address 0xA59 = 1010 0101 1001. What would be the tag, index, and offset for this address with each of the following cache configurations?
     1. A direct-mapped cache with 8 cache lines and 8-byte data blocks: tag 101001, index 011, offset 001
     2. A direct-mapped cache with 16 cache lines and 4-byte data blocks: tag 101001, index 0110, offset 01
     3. A direct-mapped cache with 16 cache lines and 8-byte data blocks: tag 10100, index 1011, offset 001
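The decomposition above is just shifting and masking. A sketch of our own (the helper name and struct are not from the slides), assuming the line count and block size are powers of two:

```c
#include <stdint.h>

typedef struct { uint64_t tag, index, offset; } addr_parts;

/* Split an address for a direct-mapped cache with 2^index_bits lines
 * and 2^offset_bits-byte data blocks. */
addr_parts split_address(uint64_t addr, unsigned offset_bits, unsigned index_bits) {
    addr_parts p;
    p.offset = addr & ((1ull << offset_bits) - 1);                  /* low bits: byte in block */
    p.index  = (addr >> offset_bits) & ((1ull << index_bits) - 1);  /* middle bits: line number */
    p.tag    = addr >> (offset_bits + index_bits);                  /* remaining high bits */
    return p;
}
```

For 0xA59 with 8 lines and 8-byte blocks (index_bits = 3, offset_bits = 3) this yields tag 0b101001, index 0b011, offset 0b001, matching configuration 1.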

  21. Exercise 3: Cache Indices. Assume you have an array of 6 integers a that begins at address 0x601940. Assume you are running on a machine that has a direct-mapped cache with 8 cache lines and 8-byte data blocks. Which cache line would each of the 6 integers be stored in when it is in cache?

  22. Exercise 3: Cache Indices. Assume you have an array of 6 integers a that begins at address 0x601940. Assume you are running on a machine that has a direct-mapped cache with 8 cache lines and 8-byte data blocks. Which cache line would each of the 6 integers be stored in when it is in cache?
     Element | Address  | Binary Address (low byte) | Index | Offset
     a[0]    | 0x601940 | 0100 0000                 | 000   | 000
     a[1]    | 0x601944 | 0100 0100                 | 000   | 100
     a[2]    | 0x601948 | 0100 1000                 | 001   | 000
     a[3]    | 0x60194c | 0100 1100                 | 001   | 100
     a[4]    | 0x601950 | 0101 0000                 | 010   | 000
     a[5]    | 0x601954 | 0101 0100                 | 010   | 100
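The index column can be computed mechanically. A small sketch of our own (the helper name is not from the slides), assuming 4-byte ints, 3 offset bits (8-byte blocks), and 3 index bits (8 lines):

```c
#include <stdint.h>

/* Cache line index for element a[i] of an int array starting at base,
 * on a direct-mapped cache with 8 lines and 8-byte blocks. */
unsigned line_of_element(uint64_t base, unsigned i) {
    uint64_t addr = base + 4u * i;          /* 4-byte ints: a[i] at base + 4*i */
    return (unsigned)((addr >> 3) & 0x7);   /* skip 3 offset bits, keep 3 index bits */
}
```

Each 8-byte block holds two consecutive ints, so elements pair up: a[0] and a[1] map to line 0, a[2] and a[3] to line 1, a[4] and a[5] to line 2.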

  23. Exercise 3: Direct-mapped Cache. Memory: 0x00: 13, 0x04: 14, 0x08: 15, 0x0c: 16, 0x10: 17, 0x14: 18. Assume 4-byte data blocks; the cache has 4 lines (Line 0 to Line 3), all initially invalid (v=0, stale data 47). The first access is worked for you:
     Access  | tag  | idx | off | h/m
     rd 0x00 | 0000 | 00  | 00  | m  (Line 0 becomes v=1, tag 0000, data 13)
     rd 0x04 |      |     |     |
     rd 0x14 |      |     |     |
     rd 0x00 |      |     |     |
     rd 0x04 |      |     |     |
     rd 0x14 |      |     |     |

  24. Exercise 3: Direct-mapped Cache (solution). Memory: 0x00: 13, 0x04: 14, 0x08: 15, 0x0c: 16, 0x10: 17, 0x14: 18. Assume 4-byte data blocks; the cache has 4 lines, all initially invalid.
     Access  | tag  | idx | off | h/m
     rd 0x00 | 0000 | 00  | 00  | m  (Line 0: v=1, tag 0000, data 13)
     rd 0x04 | 0000 | 01  | 00  | m  (Line 1: v=1, tag 0000, data 14)
     rd 0x14 | 0001 | 01  | 00  | m  (Line 1: v=1, tag 0001, data 18)
     rd 0x00 | 0000 | 00  | 00  | h
     rd 0x04 | 0000 | 01  | 00  | m  (Line 1: v=1, tag 0000, data 14)
     rd 0x14 | 0001 | 01  | 00  | m  (Line 1: v=1, tag 0001, data 18)
     How well does this take advantage of spatial locality? How well does this take advantage of temporal locality?

  25. Exercise 4: Direct-mapped Cache. Memory: 0x00: 13, 0x04: 14, 0x08: 15, 0x0c: 16, 0x10: 17, 0x14: 18. Assume 8-byte data blocks; the cache has 2 lines (Line 0 and Line 1), both initially invalid (v=0, stale data 47 48).
     Access  | tag  | idx | off | h/m
     rd 0x00 |      |     |     |
     rd 0x04 |      |     |     |
     rd 0x14 |      |     |     |
     rd 0x00 |      |     |     |
     rd 0x04 |      |     |     |
     rd 0x14 |      |     |     |

  26. Exercise 4: Direct-mapped Cache (solution). Memory: 0x00: 13, 0x04: 14, 0x08: 15, 0x0c: 16, 0x10: 17, 0x14: 18. Assume 8-byte data blocks; the cache has 2 lines, both initially invalid.
     Access  | tag  | idx | off | h/m
     rd 0x00 | 0000 | 0   | 000 | m  (Line 0: v=1, tag 0000, data 13 14)
     rd 0x04 | 0000 | 0   | 100 | h
     rd 0x14 | 0001 | 0   | 100 | m  (Line 0: v=1, tag 0001, data 17 18)
     rd 0x00 | 0000 | 0   | 000 | m  (Line 0: v=1, tag 0000, data 13 14)
     rd 0x04 | 0000 | 0   | 100 | h
     rd 0x14 | 0001 | 0   | 100 | m  (Line 0: v=1, tag 0001, data 17 18)
     How well does this take advantage of spatial locality? How well does this take advantage of temporal locality?
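Traces like these can be checked with a tiny direct-mapped cache simulator. This is a sketch of our own (not code from the lecture) that tracks only valid bits and tags, with block size and line count as parameters (both powers of two):

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_LINES 64

/* Replay a read trace through a direct-mapped cache and record, per
 * access, whether it hit ('h') or missed ('m'). out needs n+1 chars. */
void simulate(const uint64_t *trace, int n, unsigned block_size,
              unsigned num_lines, char *out) {
    bool valid[MAX_LINES] = { false };
    uint64_t tags[MAX_LINES] = { 0 };
    for (int i = 0; i < n; i++) {
        uint64_t block = trace[i] / block_size;       /* strip offset bits */
        unsigned idx = (unsigned)(block % num_lines); /* line number */
        uint64_t tag = block / num_lines;             /* remaining high bits */
        if (valid[idx] && tags[idx] == tag) {
            out[i] = 'h';
        } else {
            out[i] = 'm';       /* fetch the block, evicting the old line */
            valid[idx] = true;
            tags[idx] = tag;
        }
    }
    out[n] = '\0';
}
```

For the trace 0x00, 0x04, 0x14, 0x00, 0x04, 0x14, four lines with 4-byte blocks give m m m h m m (slide 24), while two lines with 8-byte blocks give m h m m h m (slide 26): the larger block exploits spatial locality (0x00 and 0x04 share a block), but the conflict between the blocks holding 0x00 and 0x14 still defeats temporal locality.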

  27. Exercise 5: Feedback
     1. Rate how well you think this recorded lecture worked: (1) better than an in-person class; (2) about as well as an in-person class; (3) less well than an in-person class, but you still learned something; (4) total waste of time, you didn't learn anything.
     2. How much time did you spend on this video lecture (including time spent on exercises)?
     3. Do you have any questions that you would like me to address in this week's problem session?
     4. Do you have any other comments or feedback?
