What is an “active snapshot”?

In InnoDB, a snapshot is not a cached result set. It’s a logical view of which row versions are visible to a statement or a transaction, defined by a read view (a set of transaction IDs and boundaries).
A snapshot is active while any statement/transaction holds a read view that could still need older row versions.

  • Every row version has hidden metadata:
    • DB_TRX_ID — the transaction id that last modified the row version.
    • DB_ROLL_PTR — a pointer into the undo log to reconstruct the previous version.
  • A read view decides visibility: “Is the version’s DB_TRX_ID committed and not too new relative to my view?”

TL;DR: An active snapshot means “there exists at least one read view for which certain older versions might still be needed.” As long as such snapshots exist, those versions cannot be purged.


How and when is a snapshot (read view) created?

Isolation level drives when InnoDB creates (and refreshes) read views:

Isolation levelWhen the read view is createdHow long it’s used
REPEATABLE READ (InnoDB default)At the first consistent read in a transaction (or START TRANSACTION WITH CONSISTENT SNAPSHOT)Reused for the entire transaction → all plain SELECTs see the same snapshot
READ COMMITTEDAt the start of each statement that does a consistent readNew snapshot per statement
READ UNCOMMITTEDNo consistent read (can see uncommitted)N/A
SERIALIZABLELike REPEATABLE READ but with extra lockingEntire transaction

Notes:

  • A consistent read is a SELECT without locking modifiers (no FOR UPDATE / FOR SHARE).
  • Locking reads (SELECT ... FOR UPDATE/SHARE, UPDATE, DELETE) don’t use the snapshot; they read the latest committed version (and may wait/lock).

So: No, a snapshot is not “putting the result in cache.” It’s a lightweight visibility rule encoded in a read view; on each row access, InnoDB checks visibility and, if needed, reconstructs an older version from the undo chain.


What’s a “read view”?

A read view is the tiny, in-memory snapshot of which transactions are considered visible to your SELECT.
It’s not a cached result set. It’s just a set of IDs and boundaries that tell InnoDB which row versions you’re allowed to see.

When is it created?

  • REPEATABLE READ (default): created at the first consistent read in the transaction (or via START TRANSACTION WITH CONSISTENT SNAPSHOT) and reused for the whole transaction.

  • READ COMMITTED: created per statement (a fresh view for each SELECT).

  • Locking reads (… FOR UPDATE/SHARE) don’t use the read view; they read the latest committed version with locks.

Where is it stored?

  • In memory, attached to the transaction/statement control block.

  • It is not persisted to disk and does not store rows.

  • InnoDB uses it on the fly: for each row, check visibility; if needed, reconstruct an older version via the undo chain.

What does it look like?

Think of a small struct with three pieces:

  • active_txn_ids: the set (sorted list) of transaction IDs that were active (not committed yet) when the view was created.

  • min_active_id: the smallest ID in that active set (the “low watermark”).

  • max_id_so_far: the next transaction ID that would be assigned at that moment (a “high watermark”; anything ≥ this didn’t even exist yet).

Example (mental model / pseudostructure):

ReadView {
  active_txn_ids = [101, 104, 108]
  min_active_id  = 101
  max_id_so_far  = 120   // next-to-assign ID at snapshot creation
}

How is it used to decide visibility?

Every row version has:

  • DB_TRX_ID = the transaction that wrote that version

  • DB_ROLL_PTR = pointer to the undo record for the previous version

Rule of thumb (visibility test):

  1. If DB_TRX_ID < min_active_id → that writer must have committed before the snapshot ⇒ visible.

  2. Else if DB_TRX_ID is in active_txn_ids → that writer was still running at snapshot time ⇒ not visible.

  3. Else (DB_TRX_ID not in active set, and < max_id_so_far) → it had already committed before the snapshot ⇒ visible.

  4. (Conceptually, DB_TRX_ID >= max_id_so_far can’t happen for a stored row; that would be “from the future”.)

If a version isn’t visible, InnoDB follows DB_ROLL_PTR to the previous version and repeats the check, walking back in time until it finds a visible one (or concludes the row didn’t exist for this snapshot).

Tiny worked example

Using the read view above:

  • Row version written by trx 9999 < 101visible.

  • Row version written by trx 105105 ∈ [101,104,108]? no, but it’s not in the active set and < 120 ⇒ must have committed already ⇒ visible.

  • Row version written by trx 104104 ∈ active_txn_idsnot visible (reader must use the previous version via undo).

That’s it: a read view is a lightweight visibility filter (IDs + boundaries), not a copy of data.