Goal

Deliver ultra-low latency: sub-millisecond pauses that do not scale with heap size, achieved by performing almost all GC work concurrently with the application.

  • Type: Concurrent, region-based, compacting (objects do move); STW only at short sync points
  • Focus: Latency ✅, Throughput △ (good, but not the primary goal)
  • Scales to: very large heaps (10s–100s of GB, up to TB) with stable pauses
  • Generational:
    • JDK 21–22: generational mode available as opt-in (requires -XX:+ZGenerational)
    • JDK 23+: generational ZGC is the default when ZGC is enabled

🧠 Mental model

[App threads keep running]
   ├─ ZGC concurrently: mark → relocate → remap
   └─ Very short STW sync points (phase transitions / bookkeeping)

  • Key idea: do the heavy work while the app runs; stop only briefly to coordinate phases.

🧩 How ZGC achieves low pauses

  • Colored pointers + load barriers
    • Whenever the app loads an object reference from the heap, a fast load barrier checks that the reference is valid:
      • if the object has moved → the barrier transparently fixes up (remaps/forwards) the reference
      • if the object is due to move and this thread reaches it first → the thread can relocate it itself
  • Concurrent marking: liveness discovered while the app runs.
  • Concurrent relocation (compaction): objects moved to reduce fragmentation, without long pauses.
  • Short STW points: lightweight synchronization between phases; sub-ms on healthy systems.
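
The load-barrier idea above can be sketched as a toy model. This is not HotSpot's implementation: the bit layout (48 address bits plus a single "remapped" color bit), the forwarding map, and every name below are assumptions made for illustration. Real ZGC uses several color bits whose meaning changes from phase to phase.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a colored pointer and a self-healing load barrier (illustrative only). */
final class ToyLoadBarrier {
    // Hypothetical layout: low 48 bits hold the address, bit 48 is the "remapped" color.
    static final long ADDRESS_MASK = (1L << 48) - 1;
    static final long REMAPPED_BIT = 1L << 48;

    // Forwarding table: old address -> new address for relocated objects.
    private final Map<Long, Long> forwarding = new HashMap<>();

    void recordRelocation(long oldAddress, long newAddress) {
        forwarding.put(oldAddress, newAddress);
    }

    /** Loads a reference from a field slot, healing the slot if its color is stale. */
    long loadField(long[] fields, int index) {
        long ref = fields[index];
        if ((ref & REMAPPED_BIT) != 0) {
            return ref; // fast path: the color says the pointer is already valid
        }
        // Slow path: look up the current address and mark the pointer as remapped.
        long address = ref & ADDRESS_MASK;
        long healed = forwarding.getOrDefault(address, address) | REMAPPED_BIT;
        fields[index] = healed; // self-heal so later loads take the fast path
        return healed;
    }

    static long addressOf(long coloredPointer) {
        return coloredPointer & ADDRESS_MASK;
    }
}
```

The point of the sketch: the check on the fast path is a single mask test, and the slow path runs at most once per stale slot because the slot is rewritten in place.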

🧵 Threads & scalability

  • ZGC uses concurrent GC worker threads alongside app threads.
  • Pause times are designed not to grow with heap size or live-set size.

📈 Throughput & trade-offs

  • Throughput is solid but may trail Parallel (and sometimes G1) because:
    • concurrent GC work competes for CPU,
    • load barriers add a small cost to every reference load (generational mode also adds store barriers),
    • generational metadata adds some native memory overhead.
  • If you need the highest raw throughput and can tolerate longer pauses, prefer Parallel.

🧪 Heap layout & generations

  • Region-based heap; regions can be collected independently.
  • Generational ZGC (JDK 21–22 opt-in; 23+ default) improves throughput by:
    • collecting young frequently, old less often (classic generational hypothesis),
    • reducing promotion/relocation pressure in old regions.

🚨 What to watch (symptoms & fixes)

  • Allocation stalls: app threads briefly blocked because ZGC can’t free fast enough.
    • Fixes: increase heap (-Xmx), allow more concurrent GC threads (-XX:ConcGCThreads=<n>), reduce allocation rate / GC pressure.
  • Native memory overhead: generational mode uses more metadata; size heap & system RAM accordingly.
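
A back-of-envelope way to reason about allocation stalls, as a minimal sketch. It assumes a constant allocation rate and a fixed GC cycle duration, which real workloads do not have, and it ignores that ZGC starts cycles before the heap is full. The class and method names are hypothetical; plug in numbers taken from your own -Xlog:gc* output.

```java
/** Rough allocation-stall estimate under a constant-rate assumption (illustrative only). */
final class StallHeadroom {
    /** Bytes the application allocates while one GC cycle is running. */
    static long bytesAllocatedDuringCycle(long allocBytesPerSec, double cycleSeconds) {
        return (long) (allocBytesPerSec * cycleSeconds);
    }

    /** True if the free heap at cycle start cannot absorb allocation during the cycle. */
    static boolean stallLikely(long heapBytes, long liveBytes,
                               long allocBytesPerSec, double cycleSeconds) {
        long freeBytes = heapBytes - liveBytes;
        return bytesAllocatedDuringCycle(allocBytesPerSec, cycleSeconds) > freeBytes;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        // 8 GiB heap, 5 GiB live, 2 GiB/s allocation, 2 s cycle: 4 GiB needed, 3 GiB free.
        System.out.println(stallLikely(8 * gib, 5 * gib, 2 * gib, 2.0));
        // Same workload on a 16 GiB heap: 4 GiB needed, 11 GiB free.
        System.out.println(stallLikely(16 * gib, 5 * gib, 2 * gib, 2.0));
    }
}
```

Each of the three fixes maps to one input: a larger heap raises free space, more concurrent GC threads shorten the cycle, and a lower allocation rate shrinks the bytes needed.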

🔧 Quick tuning cheatsheet

Start simple; ZGC is “auto-tuning friendly”. Use JFR/-Xlog:gc* to measure.

Enable ZGC:

-XX:+UseZGC

(JDK 21–22 only) turn on generational mode:

-XX:+UseZGC -XX:+ZGenerational

Concurrent GC threads (rarely needed; ergonomics usually fine):

-XX:ConcGCThreads=<n>

Class unloading (on by default; ZGC uses the standard HotSpot flag):

-XX:+ClassUnloading

Logging / observability:

-Xlog:gc*,safepoint
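
Alongside the logs, the same counters can be read in-process through the standard java.lang.management API. A minimal sketch; the class name GcBeanReport is hypothetical, and the bean names printed depend on the JDK version and on whether generational mode is active.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Prints one line per GC MXBean registered in the running JVM. */
final class GcBeanReport {
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime() is the accumulated time in milliseconds reported by this bean.
            lines.add(String.format("%s: count=%d, time=%d ms",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```

Run it with -XX:+UseZGC and compare the output against a run with another collector to see which beans your JDK exposes.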

Typical run:

java -XX:+UseZGC -Xms8g -Xmx8g -Xlog:gc*,safepoint -jar app.jar

✅ When to use ZGC

  • Latency-sensitive services (tight p99/p999 SLOs).
  • Large heaps (multi-GB to TB) where classic full GCs would be catastrophic.
  • Long-lived services where stable, tiny pauses matter more than peak throughput.

🗺️ ZGC vs Parallel vs G1 (1-liners)

  • ZGC: Ultra-low pauses, concurrent marking/relocation, great on big heaps; small throughput tax.
  • Parallel: Max throughput, but every collection is fully STW (old-gen pauses can be long).
  • G1: Balanced (good throughput + controllable pauses) via mixed collections.

🧰 Minimal flag sets

JDK 23+ (generational enabled by default with ZGC):

java -XX:+UseZGC -Xms8g -Xmx8g -jar app.jar

JDK 21–22 (opt-in generational):

java -XX:+UseZGC -XX:+ZGenerational -Xms8g -Xmx8g -jar app.jar
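
If a launcher script has to work across JDK versions, the choice between the two flag sets above can be made from the feature version. A minimal sketch; ZgcFlags and forFeatureVersion are hypothetical names, and it assumes the opt-in flag is needed on JDK 21 and 22 only (generational mode became the default in JDK 23).

```java
import java.util.List;

/** Picks ZGC flags for a given JDK feature version (illustrative helper). */
final class ZgcFlags {
    static List<String> forFeatureVersion(int feature) {
        if (feature == 21 || feature == 22) {
            // Generational mode exists but must be requested explicitly.
            return List.of("-XX:+UseZGC", "-XX:+ZGenerational");
        }
        // Before 21 there is no generational mode; from 23 on it is the default.
        return List.of("-XX:+UseZGC");
    }

    public static void main(String[] args) {
        int feature = Runtime.version().feature();
        System.out.println(String.join(" ", forFeatureVersion(feature)));
    }
}
```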

If you see allocation stalls under load:

java -XX:+UseZGC -Xms16g -Xmx16g -XX:ConcGCThreads=8 -Xlog:gc*,safepoint -jar app.jar