3.1. EVITCTION

page-life-cycle

The lifecycle of a page in WiredTiger involves several key steps:

  1. Reading from Disk: Pages are initially read from disk into memory.
  2. Modification in Memory: Once in memory, pages can be modified, creating “dirty” pages.
  3. Reconciliation: The modified (dirty) pages are reconciled in memory and discarded once reconciliation is complete.
  4. Eviction Queue: Pages are then selected and added to the eviction queue, where they await removal from memory by the eviction thread.
  5. Eviction: The eviction thread discards “clean” pages directly from memory if no changes were made compared to the disk pages. For dirty pages, the thread writes the reconciled disk image to disk before discarding them from memory.

Now, let’s delve into how WiredTiger handles cache eviction to ensure efficient memory management.

WiredTiger’s in-memory cache is organized using a B-tree structure, with hazard pointers to ensure safe access to memory pages. B-tree nodes are managed at the page level, and the cache eviction policy follows a Least Recently Used (LRU) approach. The block manager handles raw block I/O operations like reading, writing, syncing, compressing, and checkpointing data. These blocks are efficiently managed in a skip list for fast indexing and allocation.

Eviction is crucial for managing WiredTiger’s cache, as it ensures the cache size stays within user-defined limits. When the cache exceeds these limits, the eviction process is triggered to bring usage back under control. The upcoming sections will focus on how this eviction process works in detail.

Eviction Process Overview

  • Eviction Structures:
    Eviction is managed using WT_EVICT_QUEUE structures, which contain lists of WT_EVICT_ENTRY structures.

  • Eviction Components:
    The eviction process involves one eviction server, zero or more eviction worker threads, and three shared eviction queues (two ordinary queues and one urgent queue).

  • Hazard Pointers:
    Hazard pointers are used to track memory that is still in use. When a page (or piece of data) in the buffer pool is being accessed by a transaction, its address is registered with a hazard pointer. This prevents the page from being evicted while it is still in use, ensuring safe access to the data without corruption or loss. If a page has an associated hazard pointer, it will not be evicted.

Eviction Server

  • Eviction Server:
    The eviction server is responsible for finding pages that can be evicted. It walks through the trees, identifying evictable candidates, and sorts them based on their last access time. The oldest one-third of these candidates (simulating an approximate LRU algorithm) are then added to the eviction queues.

  • Worker Threads:
    Users can configure a minimum and maximum number of eviction worker threads. These threads pop pages from the queues and evict them. The number of worker threads can scale dynamically to optimize eviction performance while minimizing overhead.

  • Urgent Queue:
    Pages marked for forced eviction are placed on the urgent queue, which takes precedence over ordinary eviction queues.

  • Exclusive Access:
    If other threads are reading the content, the page cannot be evicted. The eviction process must lock and gain exclusive access to the page to prevent parallel access during eviction.

Clean vs. Dirty Data

  • Clean Data:
    Data in the cache that is identical to the version stored on disk.

  • Dirty Data:
    Data in the cache that has been modified and needs to be reconciled with the disk version.

Clean vs. Dirty Eviction

  • Clean Eviction:
    Involves removing a page from memory that has no dirty content, leaving it unchanged on disk.

  • Dirty Eviction:
    The dirty page undergoes reconciliation, where obsolete content is discarded, the latest values are written to the data store, and older values are moved to the history store. If a dirty page is selected for eviction, it is first written to a special file, such as the WiredTigerLAS.wt file, before being removed from the cache. This ensures that the changes are not lost and can be persisted to disk properly.