Set Associative Cache

Introduction

A set-associative scheme is a hybrid between a fully associative cache and a direct-mapped cache. It's considered a reasonable compromise between the complex hardware needed for fully associative caches (which require parallel searches of all slots) and the simplistic direct-mapped scheme, which may cause collisions of addresses to the same slot (similar to collisions in a hash table).

Let's assume, as we did for fully associative caches, that we have:

• 128 slots
• 32 bytes per slot

Furthermore, let's assume that we can group slots together into sets. In particular, we will assume that we have 8 slots per set.

Parking Lot Analogy

Suppose we have 1000 parking spots. This time, instead of using a 3 digit number for each parking spot, we use 2 digits. Thus, the parking spots are numbered 00 up to 99.

However, instead of one parking spot per number, we have 10 for each number. Thus, there are ten parking spots numbered 00, ten numbered 01, ..., and ten numbered 99.

Your parking spot is based on the first 2 digits of your student ID number, so you have up to 10 different parking spots you can park at. This gives you some flexibility about where to park.

In effect, the various parking permits on a large commuter campus work just like that. There are many lots, each with its own letter or number. You are given a permit for a particular lot, but you can park anywhere within that lot. The advantage is that you only have to search for a spot in one large lot, as opposed to searching for a parking spot across all of campus.

Set Associative Scheme

Like the direct mapped scheme, we still treat the slots like an array. The slots are still numbered 0000000 up to 1111111 (there are 128 slots).

However, we group the slots into sets, and the key is to keep track of the sets, instead of the slots.

How many sets do we have? 128 slots divided by 8 slots per set gives us 16 sets.

We need to specify the set number, instead of the slot number, and that takes lg 16 = 4 bits.
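The arithmetic above can be checked with a minimal sketch. The constant names and the helper `set_geometry` are illustrative choices, not part of the original notes, and the helper assumes the slot counts are powers of 2.

```python
NUM_SLOTS = 128      # from the text
SLOTS_PER_SET = 8    # from the text

def set_geometry(num_slots, slots_per_set):
    """Return (number of sets, bits needed to name a set).

    Hypothetical helper: assumes both arguments are powers of 2.
    """
    num_sets = num_slots // slots_per_set
    # For a power of 2, bit_length() - 1 equals lg(num_sets).
    set_bits = num_sets.bit_length() - 1
    return num_sets, set_bits

num_sets, set_bits = set_geometry(NUM_SLOTS, SLOTS_PER_SET)
print(num_sets, set_bits)  # 16 4
```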

Here's how the bits of the address break down. It's very similar to direct mapped, except we use 4 bits for the set, instead of the slot.

Bits A4-0 are still the offset. The set number is the next 4 bits, A8-5. The remaining bits, A31-9, are the tag.
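As a rough illustration of this breakdown, the sketch below splits a 32-bit address into tag, set number, and offset using shifts and masks. The function name `split_address` and the sample address are made up for the example.

```python
OFFSET_BITS = 5   # bits 4-0
SET_BITS = 4      # bits 8-5
TAG_BITS = 23     # bits 31-9

def split_address(addr):
    """Split a 32-bit address into (tag, set_number, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    set_number = (addr >> OFFSET_BITS) & ((1 << SET_BITS) - 1)
    tag = (addr >> (OFFSET_BITS + SET_BITS)) & ((1 << TAG_BITS) - 1)
    return tag, set_number, offset

# Arbitrary sample address, chosen only for illustration.
tag, set_number, offset = split_address(0x12345678)
print(hex(tag), set_number, offset)  # 0x91a2b 3 24
```

Putting the three fields back together with `(tag << 9) | (set_number << 5) | offset` recovers the original address, which is a quick way to confirm that the fields do not overlap.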

Finding the Slot

Finding a slot is more complex than in direct-mapped caches. Suppose you have address B31-0.
• Use bits B8-5 to find the set.
• This should specify 8 slots (since we said there were 8 slots per set). The slots should have the following slot indexes:
  • B8-5 000
  • B8-5 001
  • B8-5 010
  • B8-5 011
  • B8-5 100
  • B8-5 101
  • B8-5 110
  • B8-5 111
  In effect, the set number specifies the upper 4 bits of the index, and the bottom 3 bits take on all possible 3-bit values.
• Search all 8 slots to see if the tag B31-9 matches the tag in the slot.
• If it matches one of the slots, get the byte at offset B4-0.
• If not, decide which slot should be used (possibly evicting a slot), fetch the 32 bytes from memory into the slot, and update the valid bit, dirty bit, and tag as needed.
This is called an 8-way set-associative cache, since each set contains 8 slots. You can have N-way set-associative caches, where each set contains N slots (where N is a power of 2).
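The lookup steps above can be sketched in software as follows. This is only a model under stated assumptions: real hardware compares the 8 tags in parallel rather than in a loop, the notes do not say how the victim slot is chosen (the sketch prefers an invalid slot and otherwise uses round-robin), and only reads are modelled, so the dirty bit is never set. The names `Slot`, `SetAssociativeCache`, and `read_byte` are hypothetical.

```python
from dataclasses import dataclass, field

OFFSET_BITS, SET_BITS = 5, 4
SLOTS_PER_SET = 8
BYTES_PER_SLOT = 32
NUM_SETS = 1 << SET_BITS

@dataclass
class Slot:
    valid: bool = False
    dirty: bool = False
    tag: int = 0
    data: bytearray = field(default_factory=lambda: bytearray(BYTES_PER_SLOT))

class SetAssociativeCache:
    def __init__(self, memory):
        # memory: bytes-like backing store whose length is a multiple of 32
        self.memory = memory
        self.slots = [Slot() for _ in range(NUM_SETS * SLOTS_PER_SET)]  # 128
        self.next_victim = [0] * NUM_SETS  # round-robin pointer (assumed policy)

    def read_byte(self, addr):
        """Return (byte, hit) for the byte at addr."""
        offset = addr & (BYTES_PER_SLOT - 1)
        set_number = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
        tag = addr >> (OFFSET_BITS + SET_BITS)
        # The set number forms the upper 4 bits of the slot index.
        base = set_number * SLOTS_PER_SET
        candidates = range(base, base + SLOTS_PER_SET)
        # Hit check: hardware would do these 8 comparisons at once.
        for i in candidates:
            if self.slots[i].valid and self.slots[i].tag == tag:
                return self.slots[i].data[offset], True
        # Miss: use an invalid slot if there is one, else evict round-robin.
        victim = next((i for i in candidates if not self.slots[i].valid), None)
        if victim is None:
            victim = base + self.next_victim[set_number]
            self.next_victim[set_number] = (victim - base + 1) % SLOTS_PER_SET
        slot = self.slots[victim]
        # A cache that supported writes would write back a dirty victim here.
        start = addr & ~(BYTES_PER_SLOT - 1)
        slot.data[:] = self.memory[start:start + BYTES_PER_SLOT]
        slot.valid, slot.dirty, slot.tag = True, False, tag
        return slot.data[offset], False

memory = bytes(i % 256 for i in range(1 << 14))  # made-up 16 KB of memory
cache = SetAssociativeCache(memory)
print(cache.read_byte(0x1234))  # (52, False): first access misses
print(cache.read_byte(0x1235))  # (53, True): same 32-byte line, so a hit
```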

Compromises

This scheme is a compromise. You only have to use the complex comparison hardware (to find the correct slot) on a small set of slots, instead of on all the slots. Presumably, the cost of such comparison hardware grows more than linearly in the number of slots, so the fewer slots you need to search through, the less overall hardware is needed.

Yet, you gain the flexibility of allowing up to N cache lines per set for an N-way set-associative scheme.
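To make the trade-off concrete, the sketch below tabulates, for the same 128-slot cache with 32-byte slots and 32-bit addresses, how many tags must be compared per lookup as the associativity N changes. N = 1 corresponds to direct mapped and N = 128 to fully associative. It counts comparisons only and makes no claim about actual hardware cost; `lookup_cost` is a hypothetical helper.

```python
NUM_SLOTS = 128
OFFSET_BITS = 5
ADDRESS_BITS = 32

def lookup_cost(ways):
    """Return (sets, set_bits, tag_bits, tag_comparisons) for an N-way cache.

    Assumes ways is a power of 2 that divides NUM_SLOTS.
    """
    num_sets = NUM_SLOTS // ways
    set_bits = num_sets.bit_length() - 1          # lg(num_sets)
    tag_bits = ADDRESS_BITS - set_bits - OFFSET_BITS
    return num_sets, set_bits, tag_bits, ways     # one comparison per slot in the set

for ways in (1, 8, 128):
    print(ways, lookup_cost(ways))
# 1 (128, 7, 20, 1)
# 8 (16, 4, 23, 8)
# 128 (1, 0, 27, 128)
```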

Summary

A set-associative cache scheme is a combination of fully associative and direct mapped schemes. You group slots into sets. You find the appropriate set for a given address (which is like the direct mapped scheme), and within the set you find the appropriate slot (which is like the fully associative scheme).

This scheme has fewer collisions because you have more slots to pick from, even when cache lines map to the same set.
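A small simulation can illustrate the claim about collisions. The sketch below counts misses for the same 128 slots organised as direct mapped (1 way) and as 8-way set associative, on an access pattern that alternates between two addresses 4096 bytes apart. The least-recently-used replacement policy and the access pattern are assumptions made for the example; the notes do not specify either.

```python
NUM_SLOTS = 128
OFFSET_BITS = 5

def count_misses(addresses, ways):
    """Count misses for a 128-slot cache with the given associativity (LRU)."""
    num_sets = NUM_SLOTS // ways
    # Each set is a list of tags, least recently used first.
    sets = [[] for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        line = addr >> OFFSET_BITS       # drop the offset bits
        set_number = line % num_sets
        tag = line // num_sets
        tags = sets[set_number]
        if tag in tags:
            tags.remove(tag)             # hit: re-appended below as most recent
        else:
            misses += 1
            if len(tags) == ways:
                tags.pop(0)              # set is full: evict least recently used
        tags.append(tag)
    return misses

# Both addresses land in slot 0 when direct mapped, and in set 0 when 8-way.
pattern = [0x0000, 0x1000] * 5
print(count_misses(pattern, ways=1))  # 10: every access evicts the other line
print(count_misses(pattern, ways=8))  # 2: both lines fit in the same set
```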