COMS W4118 Operating Systems I

Translation Lookaside Buffer (TLB)

Avoiding extra memory accesses

With 5-level paging, we effectively incur five additional memory accesses per pointer dereference: a single load becomes six memory accesses (five page-table reads plus the data access itself). This is incredibly expensive!

Observation: memory accesses exhibit locality, so the same pages (and hence the same translations) tend to be used over and over.

The MMU therefore employs a fast-lookup hardware cache, built from associative memory, called the translation lookaside buffer (TLB).

Paging with TLB

[Figure: paging hardware with TLB]

When a virtual address is dereferenced, the CPU looks up the VPN in the TLB. If a mapping is present (TLB hit), there is no need to walk the page tables: the TLB already supplies the PFN. Physical memory is then accessed using the PFN and the page offset.

If the VPN isn't in the TLB (TLB miss), the hardware performs a page table walk. Once the PFN is derived, the CPU installs the VPN->PFN mapping into the TLB and restarts the memory dereference so that it becomes a TLB hit.
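
A minimal sketch of this hit/miss logic in C, written as if the TLB were filled by software (on x86 the hardware performs the walk and the fill itself). The direct-mapped layout, the sizes, and the single-level walk_page_table stand-in are all simplifying assumptions:

    #include <stdint.h>
    #include <stdbool.h>

    #define TLB_SIZE   64           /* hypothetical number of TLB entries */
    #define PAGE_SHIFT 12           /* 4 KiB pages */

    struct tlb_entry {
        uint64_t vpn;
        uint64_t pfn;
        bool     valid;
    };

    static struct tlb_entry tlb[TLB_SIZE];

    /* Stand-in for the real multi-level walk: a single-level table
     * covering a small 4 GiB address space, for illustration only. */
    static uint64_t page_table[1 << 20];

    static uint64_t walk_page_table(uint64_t vpn)
    {
        return page_table[vpn];     /* extra memory access(es) on every miss */
    }

    uint64_t translate(uint64_t vaddr)
    {
        uint64_t vpn    = vaddr >> PAGE_SHIFT;
        uint64_t offset = vaddr & ((1ULL << PAGE_SHIFT) - 1);
        struct tlb_entry *e = &tlb[vpn % TLB_SIZE]; /* direct-mapped for simplicity */

        if (!e->valid || e->vpn != vpn) {           /* TLB miss: walk, then install */
            e->vpn   = vpn;
            e->pfn   = walk_page_table(vpn);
            e->valid = true;
        }
        /* TLB hit (possibly right after the fill): no page-table walk needed */
        return (e->pfn << PAGE_SHIFT) | offset;
    }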

Effective access time

Assume that:

- a memory cycle takes 1 time unit
- a TLB lookup takes e time units (e is small relative to a memory cycle)
- the TLB hit ratio is a
- paging is single-level for simplicity, so a TLB miss costs one extra memory
  access for the page-table walk

Compute effective access time (EAT) as follows:

EAT = (1 + e) a + (2 + e)(1 - a)
- If TLB hit, just incur the TLB lookup plus one memory cycle: 1 + e
- If TLB miss, also incur the page-table access on top of that: 2 + e

EAT = a + ea + 2 + e - ea - 2a
EAT = 2 + e - a
- Assuming a high TLB-hit ratio (a close to 1) and a low TLB lookup time (e close
  to 0), EAT approaches the cost of 1 memory cycle (worth it!); the quick
  calculation below plugs in sample numbers
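
As a quick sanity check of the formula, here is a tiny program with made-up numbers for e and a (real TLB lookups are a small fraction of a memory cycle, and hit ratios are typically very high):

    #include <stdio.h>

    /* EAT in memory cycles: a hit costs 1 + e, a miss costs 2 + e
     * (single-level paging, as assumed above). */
    static double eat(double e, double a)
    {
        return (1 + e) * a + (2 + e) * (1 - a);
    }

    int main(void)
    {
        /* Hypothetical values: TLB lookup = 2% of a memory cycle, 99% hit ratio */
        printf("EAT = %.2f cycles\n", eat(0.02, 0.99)); /* 2 + 0.02 - 0.99 = 1.03 */
        return 0;
    }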

TLB and context switches

What should we do with TLB contents on a context switch? The page-table base register (PTBR) is swapped during a context switch, so the VPN->PFN mappings cached in the TLB no longer apply to the new process's address space.

Option 1: flush the entire TLB on every context switch (simple, but the incoming process starts with a cold TLB and pays a burst of misses)

Option 2: attach an address-space identifier (ASID) to each TLB entry, so entries from different processes can coexist (see the sketch below)
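
A sketch of what a tagged entry and its hit check might look like, extending the earlier struct; the ASID width and the field names are assumptions:

    #include <stdint.h>
    #include <stdbool.h>

    struct tagged_tlb_entry {
        uint16_t asid;      /* which address space this entry belongs to */
        uint64_t vpn;
        uint64_t pfn;
        bool     valid;
    };

    /* A hit now also requires the ASID to match, so entries from different
     * processes can coexist in the TLB and no flush is needed on a switch. */
    static bool tlb_hit(const struct tagged_tlb_entry *e,
                        uint16_t current_asid, uint64_t vpn)
    {
        return e->valid && e->asid == current_asid && e->vpn == vpn;
    }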

x86 also has an INVLPG addr instruction, which invalidates the single TLB entry for the page containing addr.
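
In a kernel, INVLPG is typically issued via inline assembly; a common GCC/Clang-style wrapper (the helper name is ours) looks like:

    /* Invalidate the TLB entry for the page containing addr (privileged). */
    static inline void invlpg(const void *addr)
    {
        __asm__ volatile("invlpg (%0)" : : "r"(addr) : "memory");
    }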