Verification Matrix
Purpose
This page is the research-facing index of what SPECTRAX-GK treats as verified, validated, exploratory, or deferred. It is meant to answer four questions for each lane:
what physical model is being exercised,
what observable is compared,
what the reference is,
what acceptance gate applies.
Literature Baselines Reviewed
The current matrix is anchored on these published baselines:
Tronko et al., Verification of Gyrokinetic codes: theoretical background and applications: verification methodology, observed-order checks, and benchmark-observable framing.
Mandell et al., GX: a GPU-native gyrokinetic turbulence code for tokamak and stellarator design: CBC, W7-X, KBM, nonlinear transport, velocity-space convergence, and performance figure conventions.
González-Jerez et al., Electrostatic gyrokinetic simulations in W7-X geometry: W7-X ITG/TEM scans, zonal-flow response, and nonlinear ITG heat flux.
Nevins et al., Characterizing electron temperature gradient turbulence: ETG operating-point conventions.
Monreal et al., Residual zonal flows in tokamaks and stellarators at arbitrary wavelengths: residual-zonal-flow metrics and damping interpretation.
Merlo et al., Linear multispecies gyrokinetic flux tube benchmarks in shaped tokamak plasmas: shaping scans, ballooning-angle handling, Rosenbluth-Hinton residuals, and GAM damping.
González-Jerez et al., Electrostatic microturbulence in W7-X: comparison of local gyrokinetic simulations with Doppler reflectometry measurements: fluctuation amplitudes, frequency spectra, and zonal-flow spectral content.
Maurer et al., Global electromagnetic turbulence simulations of W7-X-like plasmas with GENE-3D: heavy-electron electromagnetic verification before realistic-mass stellarator production runs.
Status Legend
Closed: benchmark lane is accepted for research claims.
Open: lane is active and expected to close.
Exploratory: useful for development, not yet a paper claim.
Deferred: intentionally out of scope for the current paper/release.
Tokamak Linear
| Lane | Observable | Reference | Status | Baseline gate |
|---|---|---|---|---|
| Cyclone ITG | | GX + CBC literature | Closed | |
| ETG | | GX + ETG benchmark literature | Closed | |
| KBM | | GX | Closed | |
| KAW | branch-followed linear response | GX | Deferred | close branch identity before publication use |
| TEM | | GX / literature | Open | close branch-following and reference selection first |
| Shaped multispecies tokamak | | Sauter benchmark set | Open | literature-backed operating point and overlap gate required |
| Shaped tokamak zonal-flow / GAM | residual level, damping rate, GAM envelope | Merlo et al. + analytical Rosenbluth-Hinton estimates where applicable | Open | residual and damping must match literature/code-backed references before publication use; signed |
Frozen artifact paths for the currently closed tokamak linear lanes:
docs/_static/cyclone_comparison.png
docs/_static/etg_comparison.png
docs/_static/kbm_comparison.png
docs/_static/kbm_eigenfunction_overlap_summary.png
docs/_static/reference_modes/kbm_linear_gx_ky0p3000.npz
docs/_static/benchmark_core_linear_atlas.png
Closed raw-overlay diagnostic artifacts for the KBM lane:
docs/_static/reference_modes/kbm_linear_spectrax_ky0p3000.csv
docs/_static/kbm_eigenfunction_reference_overlay_ky0p3000.png
docs/_static/reference_modes/kbm_eigenfunction_reference_overlay_ky0p3000.json
tools/generate_kbm_reference_overlay.py
The refreshed bounded-cost extraction produces normalized overlap
0.999985 and relative L^2 mismatch 0.00721 against the frozen GX
raw mode at k_y ≈ 0.3 when run with the exact KBM grid contract, the
selected growth-fit window, and a late-time eigenfunction tail window. The
generator writes a machine-readable gate report with overlap >= 0.95 and
relative L^2 <= 0.25 as the acceptance policy, and this raw-overlay
artifact now passes.
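As a rough illustration of the two metrics quoted above, the sketch below computes a normalized overlap and a scale-aligned relative L^2 mismatch for two complex mode samples on a shared grid. The function names, the least-squares complex alignment, and the normalization are assumptions made for this example; the definitions used by tools/generate_kbm_reference_overlay.py may differ. Only the thresholds 0.95 and 0.25 come from the text, and the input modes are synthetic.

```python
import math


def inner(a, b):
    """Discrete complex inner product: sum(conj(a_i) * b_i)."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def overlap_and_mismatch(mode, ref):
    """Return (normalized overlap, relative L2 mismatch) for two complex modes.

    Assumed definitions (not taken from the repository):
      overlap  = |<mode, ref>| / (||mode|| * ||ref||)
      mismatch = ||c * mode - ref|| / ||ref||, where c is the complex scale
                 that minimizes the residual in the least-squares sense.
    """
    ip = inner(mode, ref)
    mode_sq = inner(mode, mode).real
    ref_sq = inner(ref, ref).real
    overlap = abs(ip) / math.sqrt(mode_sq * ref_sq)
    scale = ip / mode_sq  # removes arbitrary amplitude and phase of the mode
    resid_sq = sum(abs(scale * m - r) ** 2 for m, r in zip(mode, ref))
    return overlap, math.sqrt(resid_sq / ref_sq)


def gate_report(mode, ref, min_overlap=0.95, max_rel_l2=0.25):
    """Build a small machine-readable report using the documented thresholds."""
    overlap, rel_l2 = overlap_and_mismatch(mode, ref)
    return {
        "overlap": overlap,
        "rel_l2": rel_l2,
        "passed": overlap >= min_overlap and rel_l2 <= max_rel_l2,
    }


# Synthetic data: a Gaussian-like reference and a rescaled, slightly perturbed copy.
z = [-3.0 + 0.25 * i for i in range(25)]
ref = [complex(math.exp(-x * x), 0.3 * x * math.exp(-x * x)) for x in z]
mode = [(0.5 - 1.2j) * r + 0.01 * x * math.exp(-x * x) for r, x in zip(ref, z)]
report = gate_report(mode, ref)
print(report)
```

Aligning by a complex scale before taking the residual makes the mismatch insensitive to the arbitrary amplitude and phase of a linear eigenfunction, which is one reason such a gate can be applied to raw modes from two different codes.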
Branch-followed scan tables should use the same gate-report convention:
observed-order gates for resolution or velocity-space convergence, and
branch-continuity gates for adjacent gamma/omega jumps and successive
eigenfunction overlap when overlap data are available. The tracked KBM
candidate table now has a no-rerun summary path through
tools/generate_kbm_branch_gate_summary.py and
docs/_static/kbm_branch_gate_summary.json. That summary now uses the
continuity-first selected branch and passes the strict checks:
max_rel_gamma_jump ~= 0.388, max_rel_omega_jump ~= 0.320, and no
successive-overlap deficit.
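A branch-continuity gate of this kind could be sketched as follows. The jump normalization, the threshold values (`max_jump`, `min_overlap`), and the function names are hypothetical; only the report keys max_rel_gamma_jump and max_rel_omega_jump mirror the text, and the scan values are invented for illustration.

```python
def max_relative_jump(values):
    """Largest |v[i+1] - v[i]| / max(|v[i]|, |v[i+1]|) over adjacent scan points."""
    worst = 0.0
    for a, b in zip(values, values[1:]):
        scale = max(abs(a), abs(b))
        if scale > 0.0:
            worst = max(worst, abs(b - a) / scale)
    return worst


def branch_continuity_gate(gammas, omegas, overlaps=None,
                           max_jump=0.5, min_overlap=0.9):
    """Check adjacent gamma/omega jumps and, if given, successive overlaps.

    max_jump and min_overlap are placeholder thresholds for this sketch.
    """
    report = {
        "max_rel_gamma_jump": max_relative_jump(gammas),
        "max_rel_omega_jump": max_relative_jump(omegas),
    }
    passed = (report["max_rel_gamma_jump"] <= max_jump
              and report["max_rel_omega_jump"] <= max_jump)
    if overlaps is not None:  # overlap data are optional, as in the text
        report["min_successive_overlap"] = min(overlaps)
        passed = passed and report["min_successive_overlap"] >= min_overlap
    report["passed"] = passed
    return report


# A smooth scan passes; a scan with one branch hop in gamma fails.
smooth = branch_continuity_gate([0.10, 0.13, 0.16, 0.18], [0.50, 0.60, 0.72, 0.80])
hopped = branch_continuity_gate([0.10, 0.13, 0.40, 0.18], [0.50, 0.60, 0.72, 0.80])
print(smooth["passed"], hopped["passed"])
```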
Observed-order convergence tables should also gate both the asymptotic finest
refinement and the full set of pairwise refinement orders. The generic
tools/generate_observed_order_gate.py path now records this policy in JSON.
The tracked Cyclone velocity-space convergence artifact
docs/_static/cyclone_resolution_observed_order.json is closed on an
office/GPU ky=0.30 time-path sweep with all pairwise orders positive,
final-pair order above 4.8, and finest-grid relative growth-rate error
about 1.1e-3.
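One common way to form pairwise observed orders is to compare relative errors against a reference value on successive refinements, p_i = log(e_i / e_{i+1}) / log(n_{i+1} / n_i). The sketch below applies that formula to synthetic data with a known second-order error. The function name, report keys, and thresholds are assumptions for this example and are not the policy recorded by tools/generate_observed_order_gate.py.

```python
import math


def observed_order_gate(resolutions, values, reference,
                        min_final_order=1.0, max_finest_error=1e-2):
    """Gate both the finest refinement and every pairwise observed order.

    resolutions: increasing resolution counts n_0 < n_1 < ...
    values:      the observable (for example a growth rate) at each resolution
    reference:   converged or external reference value
    Thresholds are placeholders for illustration.
    """
    errors = [abs(v - reference) / abs(reference) for v in values]
    orders = [
        math.log(errors[i] / errors[i + 1])
        / math.log(resolutions[i + 1] / resolutions[i])
        for i in range(len(errors) - 1)
    ]
    return {
        "relative_errors": errors,
        "pairwise_orders": orders,
        "all_orders_positive": all(p > 0.0 for p in orders),
        "final_pair_order": orders[-1],
        "finest_relative_error": errors[-1],
        "passed": (all(p > 0.0 for p in orders)
                   and orders[-1] >= min_final_order
                   and errors[-1] <= max_finest_error),
    }


# Synthetic second-order convergence: gamma(n) = gamma_ref * (1 + 2 / n**2).
gamma_ref = 0.1
resolutions = [8, 16, 32, 64]
values = [gamma_ref * (1.0 + 2.0 / n**2) for n in resolutions]
order_report = observed_order_gate(resolutions, values, gamma_ref,
                                   min_final_order=1.5)
print(order_report["pairwise_orders"], order_report["passed"])
```

Checking every pairwise order, not only the final pair, catches non-monotone convergence that an asymptotic-only check would miss.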
The current materialized gate reports are indexed by
tools/make_validation_gate_index.py in
docs/_static/validation_gate_index.json and
docs/_static/validation_gate_index.png. Exploratory diagnostics can set
gate_index_include=false so they remain documented but do not count as
release blockers. The current release-gate index has 16/16 tracked reports
passing.
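The indexing rule described above can be sketched in a few lines. The report schema here (the `name` and `passed` keys and the report names) is hypothetical; only the gate_index_include flag and the passed/tracked summary convention come from the text, and tools/make_validation_gate_index.py may organize this differently.

```python
import json


def build_gate_index(reports):
    """Summarize gate reports, skipping those with gate_index_include=false.

    Each report is assumed to be a dict with hypothetical keys "name" and
    "passed"; a missing gate_index_include is treated as true.
    """
    tracked = [r for r in reports if r.get("gate_index_include", True)]
    excluded = [r["name"] for r in reports if not r.get("gate_index_include", True)]
    failing = [r["name"] for r in tracked if not r.get("passed", False)]
    n_passed = len(tracked) - len(failing)
    return {
        "n_tracked": len(tracked),
        "n_passed": n_passed,
        "summary": f"{n_passed}/{len(tracked)}",
        "failing": failing,
        "excluded": excluded,  # still documented, but not release blockers
        "release_ready": not failing,
    }


# Illustrative reports; the names are invented for this example.
reports = [
    {"name": "kbm_reference_overlay", "passed": True},
    {"name": "cyclone_observed_order", "passed": True},
    {"name": "cyclone_short_startup_audit", "passed": False,
     "gate_index_include": False},
]
index = build_gate_index(reports)
print(json.dumps(index, indent=2))
```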
Stellarator Linear
| Lane | Observable | Reference | Status | Baseline gate |
|---|---|---|---|---|
| W7-X ITG flux tube | | stella/GENE benchmark paper + GX | Closed | |
| W7-X TEM / kinetic-electron extension | | stella/GENE benchmark paper + W7-X TEM literature | Open | |
| W7-X zonal flow | residual level, damping envelope | stella/GENE benchmark paper + zonal-flow literature | Open; time coverage closed, residual and late-envelope gates open | a case-specific runtime/tool path exists through |
| W7-X fluctuation spectra | resolved | W7-X nonlinear gate plus Doppler-reflectometry comparison conventions | Initial simulation diagnostic closed; experimental transfer-function validation deferred | |
| HSX | | GX / internal frozen references | Closed | near-marginal deviations documented explicitly |
| Electromagnetic stellarator verification | heavy-electron linear/nonlinear EM response | GENE-3D verification conventions | Open | close heavy-electron EM lane before realistic-mass claims |
Frozen artifact paths for the currently closed stellarator linear lanes:
docs/_static/w7x_linear_t2_scan.csv
docs/_static/hsx_linear_t2_scan.csv
docs/_static/w7x_linear_t2_lastvalue.csv
docs/_static/hsx_linear_t2_lastvalue.csv
docs/_static/reference_modes/w7x_linear_gx_ky0p3000.npz
docs/_static/reference_modes/w7x_linear_spectrax_ky0p3000.csv
docs/_static/w7x_eigenfunction_reference_overlay_ky0p3000.png
docs/_static/reference_modes/w7x_eigenfunction_reference_overlay_ky0p3000.json
docs/_static/benchmark_core_linear_atlas.png
For W7-X, the whole-window scan and the late-time last-value reduction tell the
same story. For HSX, the whole-window mean_rel_gamma metric is kept as an
honest near-marginal stress signal, but the late-time closure should be read
from docs/_static/hsx_linear_t2_lastvalue.csv because the final
(gamma, omega) values are much tighter than the whole-window average.
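To see why a whole-window average and a last-value reduction can disagree for a near-marginal mode, consider the toy example below. It assumes that mean_rel_gamma is the time-averaged relative growth-rate error, which is a guess at the metric's definition, and it uses a synthetic decaying transient, not HSX data.

```python
import math


def mean_rel_error(values, refs):
    """Whole-window metric: average of pointwise relative errors."""
    return sum(abs(v - r) / abs(r) for v, r in zip(values, refs)) / len(values)


def last_value_rel_error(values, refs):
    """Late-time metric: relative error of the final sample only."""
    return abs(values[-1] - refs[-1]) / abs(refs[-1])


# Synthetic near-marginal case: a small reference growth rate plus an
# early transient in the fitted value that decays over the window.
times = [0.1 * k for k in range(1, 21)]  # t in (0, 2]
gamma_ref = [0.02 for _ in times]
gamma_fit = [0.02 + 0.01 * math.exp(-3.0 * t) for t in times]

whole_window = mean_rel_error(gamma_fit, gamma_ref)
last_value = last_value_rel_error(gamma_fit, gamma_ref)
print(f"whole-window: {whole_window:.4f}, last-value: {last_value:.5f}")
```

Because the reference growth rate is small, even a modest early transient dominates the averaged relative error, while the final value has already settled.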
The W7-X raw eigenfunction overlay is now closed at k_y rho_i = 0.3 using
tools/generate_w7x_reference_overlay.py. The frozen GX bundle was refreshed
from the finite t≈2 raw field history because the older bundle source
contained non-finite late-time fields. The matched imported-geometry
SPECTRAX-GK extraction uses the validated z_index diagnostic contract and
gives normalized overlap 0.9999999994 and relative L^2 mismatch
3.33e-5 against the frozen GX raw mode.
Nonlinear Validation
| Lane | Observable | Reference | Status | Baseline gate |
|---|---|---|---|---|
| Cyclone ITG | heat-flux window mean/std/RMS, | GX | Closed | current release gate |
| Cyclone Miller | same as above | GX | Closed | allow documented low-amplitude / overlap-only adjustments |
| KBM | | GX | Closed | mature lane |
| W7-X | heat-flux windows, saturation trend | GX + W7-X benchmark conventions | Closed | release gate |
| HSX | heat-flux windows, saturation trend | GX / internal frozen references | Closed | near-threshold behavior documented |
| ETG full-GK pilot | short-window nonlinear transport | GX + ETG operating-point convention | Exploratory | manuscript use only if the pilot is explicitly framed as such |
| kinetic-electron Cyclone | electromagnetic nonlinear transport | GX | Deferred | keep out of the paper until branch identity and runtime cost are closed |
Frozen artifact paths for the currently closed nonlinear lanes:
docs/_static/nonlinear_cyclone_diag_compare_t400.png
docs/_static/nonlinear_cyclone_miller_diag_compare_t122.png
docs/_static/nonlinear_kbm_diag_compare_t400_stats.png
docs/_static/nonlinear_w7x_diag_compare_t200.png
docs/_static/hsx_nonlinear_compare_t50_true.png
docs/_static/benchmark_core_nonlinear_atlas.png
Machine-readable nonlinear window gates are now tracked for the first refreshed subset:
The summary panel above is generated by
tools/plot_nonlinear_window_statistics.py from the frozen gate-summary JSON
files. It plots the gate statistic (windowed mean relative mismatch) and the
maximum relative mismatch for each diagnostic, excluding exploratory summaries
with gate_index_include=false.
docs/_static/nonlinear_cyclone_miller_gate_summary.json: passed at the tightened case gate 0.095.
docs/_static/nonlinear_kbm_gate_summary.json: passed at the tightened case gate 0.02.
docs/_static/nonlinear_hsx_gate_summary.json: passed at the tightened case gate 0.05.
docs/_static/nonlinear_w7x_gate_summary.json: passed at the current 0.10 mean-relative release gate after the corrected adaptive state continuation and GX-style Phi2 artifact refresh.
docs/_static/nonlinear_cyclone_gate_summary.json: passed on the mature Cyclone t=100..400 transport window at the current 0.10 mean-relative release gate.
docs/_static/nonlinear_cyclone_short_gate_summary.json: retained only as an exploratory t=5 startup/resolved-spectrum audit and excluded from the release-gate index.
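A window gate of this type might be evaluated as in the sketch below. It reads "windowed mean relative mismatch" as the relative difference of window means, averaged over diagnostics, which is one plausible interpretation and not necessarily the statistic stored in the gate-summary JSON files. The diagnostic names and time traces are synthetic; only the 0.10 gate value and the t=100..400 window are taken from the text.

```python
import math


def window_mean(times, values, t_min, t_max):
    """Average of the samples whose time lies inside [t_min, t_max]."""
    selected = [v for t, v in zip(times, values) if t_min <= t <= t_max]
    if not selected:
        raise ValueError("no samples inside the requested window")
    return sum(selected) / len(selected)


def nonlinear_window_gate(diagnostics, t_min, t_max, gate):
    """diagnostics maps a name to (times, candidate_trace, reference_trace)."""
    rel = {}
    for name, (times, cand, ref) in diagnostics.items():
        mean_cand = window_mean(times, cand, t_min, t_max)
        mean_ref = window_mean(times, ref, t_min, t_max)
        rel[name] = abs(mean_cand - mean_ref) / abs(mean_ref)
    mean_rel = sum(rel.values()) / len(rel)
    return {
        "per_diagnostic": rel,
        "mean_rel": mean_rel,          # gate statistic in this sketch
        "max_rel": max(rel.values()),  # reported alongside, not gated here
        "gate": gate,
        "passed": mean_rel <= gate,
    }


# Synthetic saturated traces with a few-percent offset between the two codes.
t = [float(k) for k in range(401)]
diagnostics = {
    "heat_flux": (t,
                  [10.4 + math.sin(0.1 * x + 0.5) for x in t],
                  [10.0 + math.sin(0.1 * x) for x in t]),
    "Phi2": (t,
             [2.1 + 0.1 * math.cos(0.1 * x + 0.5) for x in t],
             [2.0 + 0.1 * math.cos(0.1 * x) for x in t]),
}
result = nonlinear_window_gate(diagnostics, t_min=100.0, t_max=400.0, gate=0.10)
print(result["mean_rel"], result["max_rel"], result["passed"])
```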
Quasilinear Diagnostics and Model Selection
The quasilinear verification surface is deliberately split between validated linear-state diagnostics, rejected absolute-flux calibration attempts, and one scoped model-selection result. A closed model-selection status must not be read as a promoted runtime predictor.
| Lane | Observable | Reference or artifact | Status | Baseline gate |
|---|---|---|---|---|
| Electrostatic quasilinear weights and spectra | heat/particle weights, growth/frequency spectra, and channel metadata | | Closed as diagnostics | electrostatic channel validation and reproducible spectrum generation; this is not calibrated absolute-flux prediction |
| One-constant and simple saturation-rule absolute-flux models | train/holdout heat-flux prediction error | | Rejected / unpromoted | current one-constant and simple-rule reports fail the held-out absolute-flux gate and must not be exposed as a user-facing saturation law |
| | leave-one-geometry-out error and interval coverage | | Closed as scoped model-selection result | the accepted candidate is a manuscript model-selection result only; the status gate does not promote a runtime/TOML absolute-flux predictor, universal nonlinear transport model, or shipped saturation option |
| Future absolute-flux promotion | calibrated heat-flux prediction on nonlinear holdouts | future late-window convergence metadata and promotion JSON | Open | every holdout needs finite passed post-transient convergence metadata: transient cutoff, running-mean drift, block/bootstrap uncertainty, finite sample count, and source provenance |
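
The post-transient convergence metadata listed in the last row could be assembled along the following lines. This is a sketch under stated assumptions: the key names, the half-window drift definition, the block-averaged standard error, and the `max_drift` and `min_samples` thresholds are all hypothetical, and the heat-flux trace is synthetic.

```python
import math


def post_transient_metadata(times, flux, transient_cutoff, source,
                            n_blocks=8, max_drift=0.05, min_samples=64):
    """Collect convergence metadata for one holdout heat-flux trace."""
    samples = [q for t, q in zip(times, flux) if t >= transient_cutoff]
    n = len(samples)
    if n < 2 * n_blocks:
        raise ValueError("too few post-transient samples for block statistics")
    mean = sum(samples) / n
    # Running-mean drift: relative change between first- and second-half means.
    half = n // 2
    first = sum(samples[:half]) / half
    second = sum(samples[half:]) / (n - half)
    drift = abs(second - first) / abs(mean)
    # Block uncertainty: standard error of the mean over contiguous block means.
    size = n // n_blocks
    block_means = [sum(samples[i * size:(i + 1) * size]) / size
                   for i in range(n_blocks)]
    center = sum(block_means) / n_blocks
    variance = sum((b - center) ** 2 for b in block_means) / (n_blocks - 1)
    stderr = math.sqrt(variance / n_blocks)
    finite = all(math.isfinite(x) for x in (mean, drift, stderr))
    return {
        "transient_cutoff": transient_cutoff,
        "sample_count": n,
        "mean": mean,
        "running_mean_drift": drift,
        "block_stderr": stderr,
        "source": source,
        "passed": finite and drift <= max_drift and n >= min_samples,
    }


# Synthetic trace: decaying startup transient plus a saturated oscillation.
times = [0.5 * k for k in range(401)]
flux = [1.0 + 3.0 * math.exp(-t / 5.0) + 0.1 * math.sin(t) for t in times]
meta = post_transient_metadata(times, flux, transient_cutoff=50.0,
                               source="synthetic example")
print(meta)
```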
These gates do not change the deferred W7-X lanes: W7-X zonal long-window recurrence/damping and W7-X TEM / kinetic-electron validation remain outside the current manuscript/release scope. They also do not promote a universal absolute-flux model. Production nonlinear optimization is promoted only for the selected optimized-equilibrium audit now attached to the guard; nonlinear turbulence gradients and broad multi-surface claims remain separate gates.
Autodiff Validation
| Workflow | Observable | Validation type | Status |
|---|---|---|---|
| Sensitivity analysis | | finite-difference / complex-step / tangent consistency | Open |
| Two-mode inverse problem | planted parameter recovery | gradient check + covariance estimate | Closed |
| UQ / Laplace example | posterior covariance and propagated uncertainty | Hessian/Jacobian validation | Open |
| Stellarator optimization prototype | low-dimensional objective reduction | gradient consistency + constrained solve behavior | Closed for reduced objective plumbing; open for production nonlinear heat-flux optimization |
The single-mode inverse figure is intentionally a sensitivity and non-identifiability demonstration. The two-mode figure is the closed parameter-recovery validation. Both examples now write finite-difference Jacobian checks, Jacobian rank/condition number, covariance, standard deviations, correlations, and one-sigma UQ ellipse area into their summary JSON files. Those metadata are part of the validation gate: differentiated observables are not promoted to inverse-design or UQ claims unless the derivative check is conditioned and the inverse problem is identifiable.
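The metadata listed above can be illustrated on a toy two-parameter problem. The model below (two decaying exponentials), the noise level `sigma`, and the `max_condition` threshold are assumptions chosen for this sketch; it uses a hand-written analytic Jacobian in place of autodiff and is not the repository's inverse example. The covariance follows the standard Gauss-Newton estimate sigma^2 (J^T J)^{-1}.

```python
import math

TIMES = [0.5 * k for k in range(1, 9)]


def model(p):
    """Toy two-mode observable sampled at TIMES (stand-in for a solver output)."""
    return [math.exp(-p[0] * t) + 0.5 * math.exp(-p[1] * t) for t in TIMES]


def analytic_jacobian(p):
    """Rows are samples, columns are d/dp0 and d/dp1."""
    return [[-t * math.exp(-p[0] * t), -0.5 * t * math.exp(-p[1] * t)]
            for t in TIMES]


def fd_jacobian(p, h=1e-6):
    """Central finite-difference Jacobian used as the derivative check."""
    cols = []
    for j in range(len(p)):
        up, dn = list(p), list(p)
        up[j] += h
        dn[j] -= h
        cols.append([(a - b) / (2.0 * h) for a, b in zip(model(up), model(dn))])
    return [list(row) for row in zip(*cols)]


def validation_metadata(p, sigma=1e-3, max_condition=1e6):
    J, J_fd = analytic_jacobian(p), fd_jacobian(p)
    fd_err = max(abs(a - b) for ra, rb in zip(J, J_fd) for a, b in zip(ra, rb))
    # Entries of the symmetric 2x2 normal matrix J^T J.
    a = sum(r[0] * r[0] for r in J)
    b = sum(r[0] * r[1] for r in J)
    d = sum(r[1] * r[1] for r in J)
    det = a * d - b * b
    half_tr = 0.5 * (a + d)
    disc = math.sqrt(max(half_tr * half_tr - det, 0.0))
    cond = math.sqrt((half_tr + disc) / (half_tr - disc))  # cond(J) from eigenvalues
    s2 = sigma ** 2
    cov = [[s2 * d / det, -s2 * b / det], [-s2 * b / det, s2 * a / det]]
    std = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
    return {
        "fd_max_abs_error": fd_err,
        "jacobian_condition_number": cond,
        "covariance": cov,
        "std": std,
        "correlation": cov[0][1] / (std[0] * std[1]),
        "ellipse_area_1sigma": math.pi * math.sqrt(
            cov[0][0] * cov[1][1] - cov[0][1] ** 2),
        "identifiable": det > 0.0 and cond < max_condition,
    }


meta_inverse = validation_metadata([0.4, 1.5])
print(meta_inverse)
```

If the two columns of the Jacobian become nearly parallel, the condition number grows and the one-sigma ellipse stretches, which is the non-identifiable regime the single-mode figure is meant to demonstrate.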
Differentiable Geometry and Stellarator Objectives
The VMEC/Boozer objective lane is split into three claim levels. The first two are currently release/manuscript scoped; the third remains a future promotion gate.
| Workflow | Observable | Reference or artifact | Status | Baseline gate |
|---|---|---|---|---|
| VMEC/Boozer equal-arc geometry parity | | | Closed for artifact-passing rows | |
| Solver-ready objective gradients | linear eigenfrequency, growth, | | Closed for reduced QH/Li383 gates | implicit AD/finite-difference mismatch remains within the tracked gate; current combined maximum relative mismatch is about |
| VMEC/Boozer aggregate optimization promotion | aggregate objective decrease plus surface/field-line generalization | | Open for production transport claims | aggregate finite-difference and line-search artifacts must pass on the same training sample set, then an independent passed production-scope validation artifact must cover a held-out |
| Reduced stellarator ITG optimization and UQ | objective reduction history, AD/finite-difference derivative parity, local covariance, and projected uncertainty | | Closed as reduced optimization plumbing | objective/UQ metadata pass for the tracked QA control vector; the nonlinear entry is a smooth reduced window estimator |
| VMEC/Boozer nonlinear startup FD audit | compact startup-window heat-flux response to geometry perturbation | | Exploratory plumbing gate | finite-output and finite-difference-response checks pass, but |
| Selected optimized-equilibrium nonlinear transport audit | optimized-equilibrium post-transient heat-flux average with uncertainty and nonlinear audit bars | | Closed for selected optimized-equilibrium replicated transport audit | the selected QA optimized equilibrium passes the |
Use this section as the verification boundary for README figures: the VMEC/Boozer parity, gradient-holdout, and reduced optimization/UQ panels can be cited as reduced objective evidence. Startup-window finite-difference panels and reduced nonlinear-window estimators must not be cited as saturated transport-gradient validation. The optimized-equilibrium replicate panel may be cited as a post-transient transport-window audit for the selected candidate, not as a universal quasilinear absolute-flux model.
Parallelization Validation
Independent k_y and ensemble parallelization is accepted only when a
serial numerical-identity gate accompanies the timing data. The current closed
artifact is docs/_static/parallel_ky_scan_gate.png with metadata in
docs/_static/parallel_ky_scan_gate.json. It runs the real Cyclone linear
solver with ky_batch=1 and a fixed-shape batched scan, then requires
max_gamma_rel_error <= 1e-8 and max_omega_abs_error <= 1e-8. The
observed speedup is reported as an engineering metric, not as the gate itself.
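The identity gate described above reduces to two maxima over the scanned k_y values. The sketch below uses the thresholds from the text (1e-8 for both metrics); the function name, the input layout as (gamma, omega) pairs, and the numerical values are illustrative assumptions, not output from the Cyclone solver.

```python
def parallel_identity_gate(serial, batched, gamma_tol=1e-8, omega_tol=1e-8):
    """Compare serial and batched scans given as lists of (gamma, omega) per k_y.

    Assumes both lists follow the same k_y order and that serial growth
    rates are nonzero, so the relative gamma error is well defined.
    """
    if len(serial) != len(batched):
        raise ValueError("serial and batched scans must cover the same k_y set")
    max_gamma_rel = max(abs(gb - gs) / abs(gs)
                        for (gs, _), (gb, _) in zip(serial, batched))
    max_omega_abs = max(abs(wb - ws)
                        for (_, ws), (_, wb) in zip(serial, batched))
    return {
        "max_gamma_rel_error": max_gamma_rel,
        "max_omega_abs_error": max_omega_abs,
        "passed": max_gamma_rel <= gamma_tol and max_omega_abs <= omega_tol,
    }


# Illustrative numbers only: a batched scan that matches to round-off,
# and one with a visible discrepancy at the last k_y.
serial = [(0.091, 0.284), (0.145, 0.412), (0.168, 0.563)]
batched_ok = [(g * (1.0 + 1e-12), w + 1e-13) for g, w in serial]
batched_bad = serial[:2] + [(0.168 * 1.001, 0.563)]

ok_report = parallel_identity_gate(serial, batched_ok)
bad_report = parallel_identity_gate(serial, batched_bad)
print(ok_report["passed"], bad_report["passed"])
```

Keeping the timing out of the pass/fail decision, as the text does, means a faster but numerically different batched path cannot close the lane.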
The larger CPU/GPU strong-scaling artifacts
docs/_static/independent_ky_scan_scaling_large.json and
docs/_static/quasilinear_uq_ensemble_scaling_large.json are the current
release references for production independent-work scaling. Their split CPU
and GPU companions must keep per-row identity, timing, and worker/profile
metadata synchronized with the performance and validation manifests.
docs/_static/parallelization_completion_status.json is the release ledger
that turns those artifacts into a scoped production-closure claim while keeping
nonlinear domain-decomposition speedup out of scope.
Fixed-step nonlinear full-state sharding now has an engineering identity
artifact at docs/_static/nonlinear_sharding_profile.json generated by
tools/profile_nonlinear_sharding.py. That closes the control-flow and
final-state identity layer for the pjit state-sharded scan. The corresponding
two-GPU office artifact,
docs/_static/nonlinear_sharding_profile_office_gpu.json, confirms active
auto/kx sharding with zero final-state error on the bounded profiling
grid. The large combined sweep
docs/_static/nonlinear_sharding_strong_scaling_large.json is a
profiler/identity artifact, not a production nonlinear speedup claim. The same
diagnostic keeps z-axis FFT sharding out of the release claim until it has
a dedicated communication/layout design and a passing identity gate. True
nonlinear domain decomposition with halo/FFT communication, conservation
checks, and benchmark-size speedup remains outside the release claim.
Notes
A lane should not move from Open to Closed without an owning script, frozen artifact path, and literature/reference statement.
README figures should use only Closed lanes unless a panel is explicitly marked exploratory.
Raw eigenfunction overlays for manuscript use should be rendered only from frozen reference bundles checked into docs/_static/reference_modes/. Do not build publication figures from transient external files or ad hoc office-machine outputs.
Experimental-facing figures such as W7-X fluctuation spectra should remain scoped as simulation diagnostics unless the diagnostic transfer function and access model are encoded directly in the repo.
Electromagnetic stellarator claims should be split explicitly into heavy-electron verification and realistic-electron research runs.