Direct Mapped Cache

Introduction

Let's assume, as we did for fully associate caches that we have:

Parking Lot Analogy

Suppose we have 1000 parking spots. However, instead of being unnumbered, each parking spot is given a 3 digit number from 000 to 999.

Your parking spot is based on the first 3 digits of your student ID number.

What problem can occur? If you received your social security number (student ID) in Maryland, the first three digits of your student ID is likely to be in the low 200's.

Thus, there's a much higher chance someone will be in your parking spot. There's also a chance that parking spots numbered in the 900's might be mostly empty since few students may have student IDs with the first 3 digits in that range.

This is basically the direct-mapped scheme. The advantages of this scheme are that it's simple to find your parking spot. It can only be in one location.

Direct Mapped Scheme

In a direct mapped cache, we treat the 128 slots as if they were an array of slots. We can index into the array using binary numbers. For 128 slots, you need 7 bits. Thus, the index of the array is from 0000000two up to 1111111two.

How do we decide which slot the cache line associated with address A31-0 should go into?

Here's how we do it:

Since we have 128 slots, we need to specify which one slot we need the cache line to go in. This requires lg 128 = 7 bits. Where do we get the bits from? Directly from the address itself.

Bits A4-0 is still the offset. The slot number are the next 7 bits, Bits A11-5. The remaining bits, A31-12 is the tag.

Finding the Slot

Finding a slot is now easy. Suppose you have address B31-0.

Drawbacks

The drawback of this scheme is obvious. Effectively, we have a hash table with a very simple hash function (use bits B11-5). This can cause collisions at that particular slot.

Unlike a hash table, we do not resolve such collisions by finding the next free slot. Instead, we overwrite the value in the slot.

Just think of the parking lot analogy. A direct mapped scheme works poorly if many students have permits for the same spot, while other spots have very few permits.

The data you have might simply map to the same slot, and you could have cache lines going in and out all the time.

Advantages

If there's an advantage to the scheme, it's that it's very simple. You don't have to simulataneously match tags with all slots. You just have one slot to check.

If the data isn't in the slot, it's obvious that this is the slot to get evicted.

Why the Middle Bits?

We can pick any bits we want in the address, instead of the middle bits nearest the offset. Why did we pick those bits.

The problem with picking the high bits is that data and program tend to occupy a small section of memory. Because of that, they tend to share the same high bits.

This isn't so surprising. Consider a 10,000 page book, where each page has 5 digits (thus, there are pages 03401 and pages 00023). Grab any 100 consecutive pages, and the top two digits are highly like to be the same.

This can happen with data too. All the data will have very similar high bits. You want the bits as low as possible, so you get enough variation so that data can go into each slot with equal likelihood (so you hope).

You can't pick the low bits, because you need them to serve as the offset.

Summary

A direct-mapped cache scheme makes picking the slot very simple. It treats the slot as a large array, and the index of the array is picked from bits of the address (which is why we need the number of slots to be a power of 2---otherwise we can't select bits from the address)

The scheme can suffer from many addresses "colliding" to the same slot, thus causing the cache line to be repeatedly evicted, even though there may be empty slots that aren't being used, or being used with less frequency.