Classical memory access optimization with the TLB
by @Jonathan Salwan - 2012-12-24
When the processor is in protected mode and paging is enabled, every virtual address (VA) is translated to a physical address (PA). This is the job of the MMU (Memory Management Unit).
This VA -> PA translation is part of the paging mechanism. There are several paging designs, such as PAE (Physical Address Extension) or paging with a 4 MB page size. For more information, see this post on paging modes. The diagram below shows how it works for a classical 32-bit virtual address (no PAE).
This design is used by most operating systems, but it has a performance cost, because translating a virtual address into a physical address requires several steps:
- Get the page directory entry.
- Get the page table entry.
- Finally, add the offset to obtain the physical address.
To compensate for this loss of time, processors provide what is called the TLB (Translation Lookaside Buffer).
The TLB can be thought of as a paging cache, and it therefore saves time when resolving a virtual address to a physical address. In specific cases, however, using the TLB can cost time. This is what we will see.
The TLB is a correspondence table between virtual page numbers (VPN) and physical page numbers (PPN). When a memory access is requested, the MMU begins by consulting the TLB to see whether the correspondence is already present. If it is found, we have what is called a TLB hit; otherwise it is a TLB miss.
When a TLB hit occurs, the MMU reads the PPN corresponding to the requested VPN directly from the TLB, and the offset of the VA is then combined with the PPN to form the physical address (Figure 2). In this case we save time, because the MMU does not have to look up the PTE (Page Table Entry) through the page directory. On a TLB miss, the MMU has to find the PTE through the page directory (Figure 1), and we lose time because the MMU has searched both the TLB and the page directory. If the address is not valid, or is not accessible for the requested read/write access, the CPU raises a PF (Page Fault) exception and the PF handler takes over. Otherwise the VPN-PPN pair is added to the TLB to save time if the same translation is requested again.