Goal
Deliver ultra-low latency: sub-millisecond pauses that do not scale with heap size, achieved by performing almost all GC work concurrently with the application.
- Type: Concurrent, compacting, mostly non-STW (short sync points only)
- Focus: Latency ✅, Throughput △ (good, but not the primary goal)
- Scales to: very large heaps (10s–100s of GB, up to TB) with stable pauses
- Generational:
  - JDK 21: generational mode available (opt-in via -XX:+ZGenerational)
  - JDK 23+: generational ZGC is the default when ZGC is enabled
🧠 Mental model
[App threads keep running]
├─ ZGC concurrently: mark → relocate → remap
└─ Very short STW sync points (phase transitions / bookkeeping)

- Key idea: do the heavy work while the app runs; stop only briefly to coordinate phases.
🧩 How ZGC achieves low pauses
- Region/colored pointers + load barriers
  - On every load of an object reference from the heap, a fast load barrier ensures the reference is valid:
    - if the object moved → transparently fix up (remap/forward) the reference
    - can even help evacuate (relocate) the object if this thread is first to see it
- Concurrent marking: liveness discovered while the app runs.
- Concurrent relocation (compaction): objects moved to reduce fragmentation, without long pauses.
- Short STW points: lightweight synchronization between phases; sub-ms on healthy systems.
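The load-barrier idea above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how HotSpot implements ZGC: the class name, the REMAPPED bit position, and the map-based forwarding table are all hypothetical. It shows the two paths a barrier takes: a fast path when the reference is already marked good, and a slow path that looks up the object's new address and "self-heals" the stored reference so later loads stay fast.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a self-healing load barrier (illustrative only, not HotSpot code). */
final class LoadBarrierSketch {
    // Hypothetical "color" bit: set when the stored reference is known to be current.
    static final long REMAPPED = 1L << 62;
    static final long ADDRESS_MASK = REMAPPED - 1;

    // Hypothetical forwarding table filled in by relocation: old address -> new address.
    final Map<Long, Long> forwarding = new HashMap<>();

    // One simulated reference field holding a colored pointer.
    long field;

    /** Simulates loading the reference stored in {@code field}. */
    long load() {
        long ref = field;
        if ((ref & REMAPPED) != 0) {
            return ref & ADDRESS_MASK;            // fast path: reference already good
        }
        long address = ref & ADDRESS_MASK;
        // Slow path: if the object moved, use its new address; otherwise keep the old one.
        long current = forwarding.getOrDefault(address, address);
        field = current | REMAPPED;               // self-heal so the next load is fast
        return current;
    }

    public static void main(String[] args) {
        LoadBarrierSketch s = new LoadBarrierSketch();
        s.field = 0x1000L;                        // stale reference, no color bit set
        s.forwarding.put(0x1000L, 0x2000L);       // the object was relocated
        System.out.println(Long.toHexString(s.load())); // prints 2000
    }
}
```

The point of the design is that the application thread repairs the reference it just touched, so no pause is needed to fix up every pointer at once.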
🧵 Threads & scalability
- ZGC uses concurrent GC worker threads alongside app threads.
- Pauses remain short regardless of heap size (design target: pauses don’t scale with heap).
📈 Throughput & trade-offs
- Throughput is solid but may trail Parallel (and sometimes G1) because:
  - concurrent GC work competes for CPU,
  - barrier cost on loads adds a tiny per-access overhead,
  - generational metadata adds some native memory overhead.
- If you need the highest raw throughput and can tolerate longer pauses, prefer Parallel.
🧪 Heap layout & generations
- Region-based heap; regions can be collected independently.
- Generational ZGC (21 opt-in; 23+ default) improves throughput by:
  - collecting young frequently, old less often (classic generational hypothesis),
  - reducing promotion/relocation pressure in old regions.
🚨 What to watch (symptoms & fixes)
- Allocation stalls: app threads briefly blocked because ZGC can’t free memory fast enough.
  - Fixes: increase heap (-Xmx), allow more concurrent GC threads (-XX:ConcGCThreads=<n>), reduce allocation rate / GC pressure.
- Native memory overhead: generational mode uses more metadata; size heap & system RAM accordingly.
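To judge whether allocation stalls are a real problem, it helps to total them up from a GC log. The sketch below assumes stall lines contain the text "Allocation Stall (<thread>) <millis>ms"; that line shape is an assumption and should be checked against the actual -Xlog:gc* output of your JDK. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sums stall durations from GC log lines (log line shape is an assumption). */
final class AllocationStallScan {
    // Assumed shape: "... Allocation Stall (<thread name>) <millis>ms"
    private static final Pattern STALL =
        Pattern.compile("Allocation Stall \\((.+)\\) (\\d+(?:\\.\\d+)?)ms");

    /** Returns the total stalled time in milliseconds across the given log lines. */
    static double totalStallMillis(List<String> lines) {
        double total = 0.0;
        for (String line : lines) {
            Matcher m = STALL.matcher(line);
            if (m.find()) {
                total += Double.parseDouble(m.group(2)); // group 2 is the duration
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "[1.234s][info][gc] Allocation Stall (main) 1.5ms",
            "[1.300s][info][gc] some unrelated line",
            "[2.000s][info][gc] Allocation Stall (worker-1) 2.5ms");
        System.out.println(totalStallMillis(sample)); // prints 4.0
    }
}
```

If the total is non-trivial under normal load, the fixes listed above (more heap, more concurrent GC threads, lower allocation rate) are the usual next steps.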
🔧 Quick tuning cheatsheet
Start simple; ZGC is “auto-tuning friendly”. Use JFR or -Xlog:gc* to measure.
Enable ZGC:
-XX:+UseZGC
(JDK 21 only) turn on generations:
-XX:+UseZGC -XX:+ZGenerational
Concurrent GC threads (rarely needed; ergonomics usually fine):
-XX:ConcGCThreads=<n>
Class unloading (on by default; controlled by the standard HotSpot flag):
-XX:+ClassUnloading
Logging / observability:
-Xlog:gc*,safepoint
Typical run:
java -XX:+UseZGC -Xms8g -Xmx8g -Xlog:gc*,safepoint -jar app.jar
✅ When to use ZGC
- Latency-sensitive services (tight p99/p999 SLOs).
- Large heaps (multi-GB to TB) where classic full GCs would be catastrophic.
- Long-lived services where stable, tiny pauses matter more than peak throughput.
🗺️ ZGC vs Parallel vs G1 (1-liners)
- ZGC: Ultra-low pauses, concurrent marking/relocation, great on big heaps; small throughput tax.
- Parallel: Max throughput, but STW pauses (can be long on Old Gen).
- G1: Balanced (good throughput + controllable pauses) via mixed collections.
🧰 Minimal flag sets
JDK 23+ (generational enabled by default with ZGC):
java -XX:+UseZGC -Xms8g -Xmx8g -jar app.jar
JDK 21 (opt-in generational):
java -XX:+UseZGC -XX:+ZGenerational -Xms8g -Xmx8g -jar app.jar
If you see allocation stalls under load:
java -XX:+UseZGC -Xms16g -Xmx16g -XX:ConcGCThreads=8 -Xlog:gc*,safepoint -jar app.jar
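To confirm from inside the application that a flag set took effect, one option is to list the collector beans the JVM exposes. This is a minimal sketch: ManagementFactory.getGarbageCollectorMXBeans() is a standard API, but the check that ZGC's bean names start with "ZGC" is an assumption about naming and may vary by JDK version. The class and method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Reports which collectors the running JVM exposes (name check is an assumption). */
final class ActiveCollector {
    /** Returns the names of all garbage collector beans in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    /** Assumption: ZGC registers beans whose names start with "ZGC". */
    static boolean looksLikeZgc(List<String> names) {
        return names.stream().anyMatch(n -> n.startsWith("ZGC"));
    }

    public static void main(String[] args) {
        List<String> names = collectorNames();
        System.out.println("Collectors: " + names);
        System.out.println("ZGC active: " + looksLikeZgc(names));
    }
}
```

Run it with the same flags as the service (for example java -XX:+UseZGC ActiveCollector.java) and compare the printed names with what you expect.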