Quasilinear Transport

SPECTRAX-GK can compute quasilinear transport diagnostics from a linear eigenstate or late-time linear state. The implementation deliberately separates the exact linear diagnostic from any saturation model:

  • linear weights are amplitude-normalized heat and particle fluxes computed with the same diagnostic kernels used by runtime simulations;

  • saturation rules are named, serialized model assumptions that convert a linear mode into a trend-level saturated estimate;

  • calibrated absolute flux claims require nonlinear training and holdout validation and should not be inferred from the uncalibrated rules alone.

Current validated scope

The current implementation supports electrostatic channels only:

[quasilinear]
enabled = true
mode = "weights"
saturation_rule = "none"
amplitude_normalization = "phi_rms"
kperp_average = "phi_weighted"
channels = ["es"]

The diagnostic writes:

  • *.quasilinear.summary.json with growth rate, frequency, normalization, kperp_eff2, species weights, and saturation metadata;

  • *.quasilinear_species.csv with species-resolved heat and particle flux weights and, when requested, saturated estimates.

  • *.quasilinear_spectrum.csv for serial scan-runtime-linear runs with quasilinear diagnostics enabled.

For linked-boundary or imported-geometry scans, *.quasilinear_spectrum.csv stores two perpendicular-mode coordinates: ky is the requested scan coordinate used for ordering and plotting, while mode_ky is the selected signed grid-mode coordinate used internally by the linear solve. This prevents negative-branch aliases from corrupting publication spectra while preserving the exact selected mode metadata.

Literature anchors and claim policy

The SPECTRAX-GK quasilinear layer follows the same separation used in modern reduced gyrokinetic transport workflows:

  • the linear gyrokinetic eigenproblem determines growth rates, frequencies, eigenfunctions, cross-phases, and species-resolved flux weights;

  • a separate saturation rule converts those linear quantities into fluctuation amplitudes;

  • nonlinear simulations or experimental transport databases are required before claiming calibrated absolute fluxes.

This separation is central to early nonlinear tests of quasilinear transport models [Waltz09], to the QuaLiKiz derivation [Stephens21], to profile- evolution use cases [Citrin17], and to broader quasilinear-model validation reviews [Staebler24]. Parker et al. [Parker23] show why saturation rules must be treated as model assumptions rather than consequences of the linear solve alone. SAT3 [Dudding22] and SAT3-NN [Sar26] are useful longer term targets because they use spectrum-aware, database-calibrated saturation information instead of a single uncalibrated mixing-length constant.

For stellarator optimization, SPECTRAX-GK currently treats quasilinear fluxes as research diagnostics and optimization proxies, following the microstability optimization motivation in [Jorge24]. The present release does not claim a validated absolute nonlinear flux predictor. The current eight-case electrostatic-compatible portfolio validates the input plumbing and rejects the legacy one-constant and simple saturation-rule families. The richer spectral_envelope_ridge candidate is accepted only as a scoped model-development result with uncertainty metadata; it is not exposed as a runtime saturation law or universal transport model.

Executable usage

spectraxgk run-runtime-linear \
  --config examples/linear/axisymmetric/runtime_cyclone_quasilinear.toml \
  --out tools_out/cyclone_quasilinear

or enable the diagnostic for another linear runtime TOML:

spectraxgk run-runtime-linear \
  --config examples/linear/axisymmetric/runtime_cyclone.toml \
  --quasilinear \
  --ql-mode saturated \
  --ql-saturation-rule mixing_length \
  --ql-normalization phi_rms \
  --ql-csat 1.0 \
  --out tools_out/cyclone_quasilinear

For a ky spectrum, use the independent-scan path. --workers can parallelize the per-ky linear solves and quasilinear state extraction while preserving the serial ordering of the output spectrum:

spectraxgk scan-runtime-linear \
  --config examples/linear/axisymmetric/runtime_cyclone_quasilinear.toml \
  --ky-values 0.1,0.2,0.3,0.4 \
  --quasilinear \
  --workers 2 \
  --out tools_out/cyclone_quasilinear_scan

Then render the spectrum:

python tools/plot_quasilinear_spectrum.py \
  --spectrum tools_out/cyclone_quasilinear_scan.quasilinear_spectrum.csv \
  --out docs/_static/quasilinear_cyclone_spectrum.png

The shipped worker-identity gate for this path is generated with:

JAX_ENABLE_X64=1 python tools/generate_quasilinear_runtime_parallel_gate.py \
  --workers 2 \
  --ky 0.1 0.2 \
  --out-prefix docs/_static/quasilinear_runtime_parallel_gate
Quasilinear runtime scan worker identity gate

Serial and worker-parallel scan-runtime-linear quasilinear spectra from the same runtime configuration. The gate checks ordered state-extraction identity for the linear heat-flux weight and the saturated heat-flux estimate; any timing metadata is reported for engineering tracking only, not as a production speedup claim.

The shaped-tokamak Miller companion uses the same pattern, with the positive ky range resolved by the nonlinear run’s Ny=64 grid:

spectraxgk scan-runtime-linear \
  --config examples/linear/axisymmetric/runtime_cyclone_miller_quasilinear.toml \
  --ky-values 0.1,0.2,0.3,0.4,0.5 \
  --quasilinear \
  --out docs/_static/quasilinear_cyclone_miller_spectrum_scan
Cyclone Miller quasilinear spectrum

Model details

Linear eigenproblem

For a fixed flux-tube geometry and perpendicular mode, the linear runtime solves the matrix-free system

\[\frac{\partial G}{\partial t} = \mathcal{L}(\mathbf{p}) G, \qquad \mathcal{L} v_j = \lambda_j v_j,\]

where G is the Hermite-Laguerre gyrocenter moment state, v_j is a right eigenvector, and

\[\lambda_j = \gamma_j - i\omega_j.\]

The sign convention above matches the runtime output: gamma is the growth rate and omega is the physical mode frequency reported by the executable. The operator is assembled by spectraxgk.linear, spectraxgk.terms.assembly, and the individual term modules under spectraxgk.terms.

Field solve and linear weights

Given a linear state G, SPECTRAX-GK first reconstructs fields with compute_fields_cached. In the currently validated electrostatic path the quasilinear diagnostic uses phi and sets A_parallel = B_parallel = 0. Electromagnetic quasilinear channels remain disabled until the field-channel normalization and nonlinear-calibration gates are complete.

The species heat and particle weights are computed with the same diagnostic kernels used for nonlinear runtime outputs. For the electrostatic heat-flux channel, the code contracts the radial E x B velocity factor with the Hermite-Laguerre pressure moment:

\[v_{E,x,k} = i k_y \phi_k,\]
\[\overline{p}_{s,k} = \sum_\ell \left(J_{\ell s}^{(\mathrm{fac})} G_{\ell,0,s,k} + \frac{1}{\sqrt{2}} J_{\ell s} G_{\ell,2,s,k}\right),\]
\[Q_{s,k}^{(\mathrm{ES})} = \Re\left[v_{E,x,k}^* \overline{p}_{s,k}\right] W_k.\]

Here W_k includes the positive-ky Hermitian factor, the dealias mask, the flux-surface Jacobian/grad rho weight, species density/temperature factors, and the selected diagnostic flux scale. The particle-flux channel uses the density moment

\[\overline{n}_{s,k} = \sum_\ell J_{\ell s} G_{\ell,0,s,k}, \qquad \Gamma_{s,k}^{(\mathrm{ES})} = \Re\left[v_{E,x,k}^* \overline{n}_{s,k}\right] W_{\Gamma,k},\]

but it is zero for the one-ion adiabatic-electron cases because there is no kinetic electron species carrying particle transport. The implemented formulas live in spectraxgk.diagnostics.gx_heat_flux_species(), spectraxgk.diagnostics.gx_particle_flux_species(), and _gx_heat_flux_channel_contrib_species in spectraxgk.diagnostics.

Amplitude normalization and effective scale

The implemented effective perpendicular scale is

\[k_{\perp,\mathrm{eff}}^2 = \frac{\langle k_\perp^2 |\phi|^2 \rangle} {\langle |\phi|^2 \rangle},\]

where the average uses the runtime spectral and flux-tube volume weights. Heat and particle flux weights are divided by the selected amplitude normalization, making them invariant under eigenfunction phase rotations and amplitude rescalings.

The normalized linear weights are therefore

\[\widehat{Q}_{s} = \frac{\sum_k Q_{s,k}^{(\mathrm{ES})}}{\mathcal{N}_\phi}, \qquad \widehat{\Gamma}_{s} = \frac{\sum_k \Gamma_{s,k}^{(\mathrm{ES})}}{\mathcal{N}_\phi}.\]

The default normalization is

\[\mathcal{N}_\phi = \sum_{k_x,k_y,z} w_{k_x,k_y,z} |\phi_{k_x,k_y}(z)|^2,\]

with the same Hermitian and flux-tube weights used by spectraxgk.quasilinear.spectral_phi_weights().

Supported amplitude normalizations are:

  • phi_rms: weighted |\phi|^2 average;

  • phi_midplane: maximum midplane |\phi|^2;

  • field_energy: electrostatic field-energy normalization.

Supported saturation rules are:

  • none: write linear weights only;

  • mixing_length: A^2 = C_sat max(gamma - gamma_floor, 0) / kperp_eff2;

  • lapillonne_2011: currently the same audited scaling contract as mixing_length until the broader model-specific validation suite is added.

  • linear_weight: A^2 = C_sat. This is a diagnostic intensity rule used to test whether the linear flux-weight spectrum alone transfers across geometries.

  • absolute_growth_mixing_length: A^2 = C_sat |gamma| / kperp_eff2. This gives stable short-window branches nonzero diagnostic intensity and is included only for saturation-model stress tests, not as a validated physical rule.

The current mixing-length output is

\[A_k^2 = C_{\mathrm{sat}}\, \frac{\max(\gamma_k-\gamma_{\mathrm{floor}},0)} {k_{\perp,\mathrm{eff},k}^2},\]
\[Q_{s,k}^{(\mathrm{sat})} = A_k^2 \widehat{Q}_{s,k}, \qquad \Gamma_{s,k}^{(\mathrm{sat})} = A_k^2 \widehat{\Gamma}_{s,k}.\]

This is intentionally the simplest possible baseline. It is useful for software validation and sensitivity studies, but Parker-style saturation-rule comparisons [Parker23], SAT3/SAT3-NN-style spectrum-aware rules [Dudding22] [Sar26], and nonlinear holdout tests are required before it can be used as a predictive absolute-flux model.

The reduced objective helper spectraxgk.quasilinear.quasilinear_feature_objective supports the same diagnostic rules for differentiability tests from feature vectors [gamma, kperp_eff2, flux_weight]. The fast suite checks the resulting Jacobians against central finite differences before these objectives are used in optimization examples.

Implementation map

Layer

Source

Responsibility

Quasilinear weights

spectraxgk.quasilinear

phase/amplitude-invariant k_perp scale, heat and particle weights, and saturated outputs

Diagnostic kernels

spectraxgk.diagnostics

heat, particle, field-energy, volume-factor, and resolved flux contractions shared by linear and nonlinear paths

Runtime plumbing

spectraxgk.runtime, spectraxgk.runtime_artifacts

single-run and scan execution, TOML/executable overrides, JSON/CSV artifact writing

Input schema

spectraxgk.runtime_config, spectraxgk.io

[quasilinear] configuration and round-trip serialization

Calibration reports

spectraxgk.quasilinear_calibration

train/holdout/audit schemas, nonlinear-window ingestion, scale fitting, and report scoring

Plotting tools

tools/plot_quasilinear_spectrum.py and tools/plot_quasilinear_calibration.py

publication-facing spectrum and calibration figures

Differentiability gates

spectraxgk.autodiff_validation

finite-difference checks, covariance diagnostics, dense operator fixtures, and implicit isolated-eigenpair sensitivities

Algorithmic workflow

For one linear mode:

build grid, geometry, species, and linear cache
solve the linear eigenproblem or fit the late-time linear state
reconstruct phi from the eigenvector/state
compute kperp_eff2 from |phi|^2 weights
compute heat and particle flux contractions using runtime diagnostic kernels
divide by the requested amplitude normalization
optionally apply a named saturation rule
write summary JSON and species CSV artifacts

For a serial ky scan:

for requested ky in ky_values:
    select the closest grid mode
    run the single-mode linear solve
    compute quasilinear payload
    store requested ky and selected signed mode_ky
write *.scan.csv and *.quasilinear_spectrum.csv

For nonlinear calibration:

integrate or load nonlinear diagnostic CSV over a declared time window
compute late-window convergence metadata after transient removal
require finite mean, running-mean drift, block-bootstrap SEM, and provenance
integrate/sum the linear quasilinear spectrum
create train, holdout, or audit calibration points
optionally fit one multiplicative scale on train points only
score holdout points against explicit mean-relative-error and window gates

Numerics and differentiability

SPECTRAX-GK production linear solves remain matrix-free. Dense matrices are only materialized in tiny validation fixtures through spectraxgk.autodiff_validation.explicit_complex_operator_matrix().

Eigenvalue sensitivities use JAX derivatives of the matrix entries and the standard isolated-branch relation

\[\frac{\partial \lambda}{\partial p_i} = w^\dagger \frac{\partial \mathcal{L}}{\partial p_i} v, \qquad w^\dagger v = 1,\]

where v and w are right and left eigenvectors. Eigenfunction-dependent observables use the implicit perturbation system

\[\begin{split}\begin{bmatrix} \mathcal{L} - \lambda I & -v \\ w^\dagger & 0 \end{bmatrix} \begin{bmatrix} \partial_i v \\ \partial_i \lambda \end{bmatrix} = \begin{bmatrix} -(\partial_i \mathcal{L})v \\ 0 \end{bmatrix}.\end{split}\]

The gauge condition w^\dagger \partial_i v = 0 makes the derivative unique for phase-invariant observables. This path is now tested on a tiny SPECTRAX-GK linear-RHS fixture and compared against nearest-branch central finite differences. Direct JAX differentiation through non-Hermitian eigenvectors is still explicitly guarded because JAX does not provide that JVP; the implicit path is the supported validation route.

Validation gates

The fast test suite currently checks:

  • TOML and executable plumbing for [quasilinear];

  • phase and amplitude invariance of the linear weights;

  • explicit rejection of unvalidated electromagnetic channels;

  • artifact serialization for summary and species tables;

  • a small Krylov runtime smoke test.

  • autodiff-vs-finite-difference and tangent checks for the reduced mixing-length objective [gamma, kperp_eff2, flux_weight].

  • branch-isolated eigenvalue AD-vs-finite-difference checks, which are the lightweight gate used before differentiating full linear growth/frequency outputs.

  • a tiny dense SPECTRAX-GK linear-RHS fixture that materializes the otherwise matrix-free operator, disables the production custom-VJP field solve for forward-mode validation, and checks an isolated eigenvalue derivative against central finite differences.

  • an explicit guard showing that direct JAX differentiation through non-Hermitian eigenvectors is unsupported;

  • an implicit left/right eigenpair sensitivity gate for phase-invariant eigenfunction observables, including a tiny SPECTRAX-GK linear-RHS quasilinear-style objective checked against finite differences.

  • a fast promotion guardrail that scans the calibration/model-selection JSON reports, conservative documentation wording, and the manuscript-readiness quasilinear lane. A closed manuscript lane must remain explicitly scoped as a diagnostic/model-selection result and must list the guardrail artifact; it is not a calibrated absolute-flux claim.

Implicit sensitivity example

The user-facing example examples/theory_and_demos/quasilinear_implicit_sensitivity.py applies the implicit gate to a tiny Cyclone linear-RHS fixture. The differentiated observable is

\[\mathbf{y} = \left[ \gamma,\, \omega,\, k_{\perp,\mathrm{eff}}^2,\, \widehat{Q}_i,\, Q_i^{(\mathrm{ML})} \right],\]

where Q_i^(ML) is the uncalibrated mixing-length heat-flux proxy computed from the linear heat-flux weight. The parameter vector is [R/L_n, R/L_Ti]. The goal is not to claim nonlinear-flux prediction from this tiny fixture; the goal is to verify that a phase-invariant quasilinear observable can be differentiated through an isolated non-Hermitian eigenbranch without relying on unsupported JAX eigenvector derivatives.

python examples/theory_and_demos/quasilinear_implicit_sensitivity.py \
  --outdir docs/_static
Implicit quasilinear eigenpair sensitivity validation

The lower panels compare the implicit left/right derivative against central finite differences that follow the nearest isolated eigenvalue branch. The tracked artifact passes with maximum relative derivative error around 1.2e-2 and branch gap around 2.4e-1. Those values are stored in docs/_static/quasilinear_implicit_sensitivity.json so documentation figures and tests use the same audit payload.

The manuscript-level validation plan adds nonlinear calibration and holdout studies across axisymmetric and stellarator cases before making absolute transport-prediction claims. The model and calibration policy follows the quasilinear derivation and saturation-rule validation philosophy in [Stephens21] and [Parker23].

Calibration reports

Calibration artifacts should use spectraxgk.quasilinear_calibration so training, holdout, and audit points carry the same schema. A report is promoted to calibrated_absolute_flux only when it contains at least one training point, at least one holdout point, finite passed nonlinear late-window convergence metadata for every holdout, and the holdout mean-relative-error gate passes. The window metadata comes from spectraxgk.quasilinear_window or tools/check_nonlinear_window_convergence.py and records the transient cutoff, late-window mean/std, running-mean drift, block/bootstrap SEM, sample counts, and source-artifact provenance. Otherwise the claim is demoted to calibration_dataset or training_or_audit_only. This keeps README, docs, and manuscript figures from claiming absolute nonlinear transport prediction from an uncalibrated saturation rule.

Replicated nonlinear windows should additionally be checked with spectraxgk.quasilinear_window.nonlinear_window_ensemble_report before they are used as seed, initial-condition, or timestep-robust transport evidence. The ensemble gate consumes already-built nonlinear-window convergence reports, requires each input window to be promotion-ready by default, and checks the relative spread of late-window means plus the combined SEM across replicates. This is the lightweight metadata gate that lets the validation ladder state that several long nonlinear runs agree; it is not a substitute for actually running those long nonlinear simulations. The command-line artifact wrapper is tools/check_nonlinear_window_ensemble.py; it reads multiple window JSON reports and writes a JSON report plus an optional PNG summary for documentation or manuscript audit trails. For external-VMEC replicate campaigns the end-to-end extraction wrapper is tools/build_external_vmec_replicate_ensemble.py: it reads the finished *.out.nc files, extracts Diagnostics/HeatFlux_st into trace CSVs, writes transport-window summaries, convergence reports, readiness and ensemble JSON gates, and generates the publication-facing two-panel trace/uncertainty plot. This is the preferred path for new seed/timestep replicate evidence because it removes manual NetCDF-to-CSV and provenance-editing steps. The wrapper returns nonzero when readiness or ensemble gates fail. Diagnostic landscape scans may pass --allow-failed-gates to finish collecting all points, but the produced JSON still records the failed gates and those points remain excluded from promotion claims.

The first tracked external-VMEC replicate campaign applies that protocol to the admitted D-shaped nonlinear holdout. Six office-GPU runs were launched from the same n64 configuration family: t=150 startup runs and t=250 continuations for seeds 31 and 32 at dt=0.05, plus a timestep variant with seed 22 at dt=0.04. The late transport window t=[170,250] passes the readiness gate and the ensemble gate without relaxing the spread or uncertainty thresholds: the accepted means are about 18.8, 20.8, and 18.1 in gyro-Bohm units, the mean-relative spread is 0.141 against the 0.15 gate, and the combined SEM/mean is 0.054 against the 0.25 gate. This strengthens the D-shaped holdout as nonlinear model-development evidence; it still does not promote the current quasilinear model to an absolute saturated-flux predictor.

D-shaped external-VMEC nonlinear heat-flux replicate gate

The next independent external-VMEC replicate campaign uses the admitted circular tokamak holdout. The first t=450 pass was intentionally not promoted even though the seed/timestep ensemble spread was small (mean_rel_spread≈0.076): seed 31 failed terminal-window agreement (0.199 against the 0.15 gate), showing that the late-time heat flux was still drifting. Extending the same three replicas to t=700 and using the later t=[350,700] window closes the readiness and ensemble gates without relaxing thresholds. The accepted means are consistent across seed and timestep variants, with ensemble mean 18.97, mean-relative spread 0.035, and combined SEM/mean 0.043. This is the circular holdout’s replicated nonlinear-window artifact; the failed t=450 readiness result is retained as convergence-history evidence rather than a promoted figure.

Circular external-VMEC nonlinear heat-flux replicate gate

The report builder is intentionally strict. Every point in one report must use the report’s named saturation_rule; predicted fluxes, observed nonlinear window means, and optional window standard deviations must be finite; and window standard deviations must be non-negative. Spectrum integration similarly rejects missing columns, all-non-finite samples, invalid integration methods, and non-positive delta_ky widths. These checks are part of the validation surface: a quasilinear figure can be exploratory, but a calibration or candidate-model artifact cannot silently mix rules or admit non-converged, non-finite, or dimensionally ambiguous data.

Train and holdout points must also be tied to a passed nonlinear validation gate before they can be used in calibration. The audit tool tools/check_quasilinear_calibration_inputs.py enforces that rule by matching each point’s nonlinear_artifact to tracked nonlinear gate metadata. It passes for the current Cyclone, Cyclone Miller, HSX, W7-X, and D-shaped external-VMEC calibration inputs, and now includes the ITERModel, up-down asymmetric, and circular external-VMEC windows in the eight-case portfolio. It would fail if an exploratory or non-converged pilot such as the CTH-like external-VMEC feasibility trace were inserted as a train/holdout point. The companion audit tools/check_quasilinear_promotion_guardrails.py is the metadata promotion guard: it requires finite nonlinear window means and standard deviations, train/holdout artifact provenance, passed held-out gates before calibrated_absolute_flux promotion, and explicit documentation scope markers. It also audits the manuscript quasilinear model-development figure index: each tracked figure must have a JSON sidecar, a scoped non-absolute claim level, explicit failed-baseline or blocker metadata where relevant, and README/docs wording that does not promote a runtime/TOML absolute-flux predictor. The current tracked reports are therefore not a calibrated absolute-flux claim.

Quasilinear calibration inputs matched to passed nonlinear gates

Existing nonlinear window summaries can be converted into calibration points with calibration_point_from_nonlinear_window_summary when the summary points to either a diagnostics CSV or a SPECTRAX-GK runtime NetCDF file. CSV inputs use the t column and the selected heat-flux column, usually heat_flux. NetCDF inputs use Grids/time and map heat_flux to Diagnostics/HeatFlux_st; heat_flux_es, heat_flux_apar, and heat_flux_bpar map to the corresponding Diagnostics/HeatFlux*_st variables. Species are summed by default for NetCDF variables with shape (time, species); pass --species-index to the report builder when an ion-only or electron-only nonlinear target is needed. The helper uses the summary’s tmin/tmax window when present, otherwise the full finite time range, and records the mean and standard deviation of the selected heat-flux observable.

python tools/build_quasilinear_calibration_report.py \
  --points docs/_static/quasilinear_calibration_points.json \
  --out docs/_static/quasilinear_calibration_report.json \
  --saturation-rule mixing_length

The first tracked audit point maps the Cyclone quasilinear spectrum above to the long-window nonlinear Cyclone heat-flux diagnostic. It is intentionally an audit point, not a calibrated transport claim:

Cyclone quasilinear calibration audit against nonlinear heat flux

With C_sat = 1 the simple mixing-length rule underpredicts the absolute nonlinear heat flux by orders of magnitude. This is the expected outcome for an uncalibrated saturation rule and is precisely why the report remains at training_or_audit_only. A paper-level absolute-flux claim requires a documented training set, held-out nonlinear cases, and passed holdout gates.

The same report can also be generated directly from a quasilinear spectrum and a nonlinear gate summary:

python tools/build_quasilinear_calibration_report.py \
  --spectrum docs/_static/quasilinear_cyclone_spectrum_scan.quasilinear_spectrum.csv \
  --nonlinear-summary docs/_static/nonlinear_cyclone_gate_summary.json \
  --split audit \
  --case cyclone_long_window \
  --geometry cyclone \
  --electron-model adiabatic \
  --saturation-rule mixing_length \
  --out docs/_static/quasilinear_cyclone_calibration_audit_report.json

python tools/plot_quasilinear_calibration.py \
  --report docs/_static/quasilinear_cyclone_calibration_audit_report.json \
  --out docs/_static/quasilinear_cyclone_calibration_audit.png

Train/holdout transfer

The first geometry-transfer gate fits a single multiplicative heat-flux scale on the Cyclone long-window nonlinear diagnostic and holds out the Cyclone Miller nonlinear window. This is the minimal one-constant calibration expected of a simple mixing-length saturation rule: if it fails, the missing ingredient is not just a constant C_sat.

Quasilinear train/holdout calibration from Cyclone to Cyclone Miller

The tracked report is calibration_dataset and passed = false. The Cyclone-fitted scale is C_sat = 3839.966 for the current normalization, but the held-out Cyclone Miller error is much larger than the 0.35 mean relative gate. That failure is retained as a manuscript-facing result: it demonstrates that the implemented linear weights and nonlinear-window ingestion are working, while a transferable saturation model remains an open research task.

The manuscript-facing combined report broadens this test to eight electrostatic-compatible nonlinear windows. It fits Cyclone and external-VMEC ITERModel, then holds out Cyclone Miller, HSX, W7-X, D-shaped external VMEC, up-down asymmetric external VMEC, and circular external VMEC. The nonlinear input validation passes, but the one-constant model still fails with held-out mean relative error about 2.11. The simple saturation-rule sweep also fails: positive-growth mixing length is the least-bad simple rule at about 2.11 mean held-out relative error, while the training-mean null baseline is about 1.20. The accepted spectral_envelope_ridge candidate reaches about 0.295 mean relative error with interval coverage 7/8; keep it labeled as candidate_model_development_not_runtime_option until additional electrostatic holdouts, electromagnetic channels, and optimized-equilibrium audits close.

The report is generated with:

python tools/build_quasilinear_calibration_report.py \
  --points docs/_static/quasilinear_cyclone_miller_train_holdout_points.json \
  --fit-train-scale \
  --out docs/_static/quasilinear_cyclone_miller_train_holdout_report.json

python tools/plot_quasilinear_calibration.py \
  --report docs/_static/quasilinear_cyclone_miller_train_holdout_report.json \
  --out docs/_static/quasilinear_cyclone_miller_train_holdout.png

Non-axisymmetric HSX holdout

The first non-axisymmetric quasilinear calibration audit uses the same adiabatic-electron ITG setup as the tracked nonlinear window gate. The checked TOML points to a self-contained QHS VMEC deck generated locally by vmec_jax; exact HSX validation should override --vmec-file with the machine-specific benchmark WOUT:

cd examples/vmec
vmec_jax input.NuhrenbergZille_1988_QHS
cd ../..
spectraxgk scan-runtime-linear \
  --config examples/linear/non-axisymmetric/runtime_hsx_linear_quasilinear.toml \
  --ky-values 0.047619047619047616,0.09523809523809523,0.14285714285714285,0.19047619047619047,0.23809523809523808,0.2857142857142857 \
  --Nl 4 --Nm 8 --solver time --dt 0.005 --steps 400 \
  --quasilinear \
  --out docs/_static/quasilinear_hsx_spectrum_scan \
  --no-progress
HSX quasilinear spectrum

All scanned HSX branches in this short linear spectrum are stable under the current gamma_floor = 0 mixing-length rule, so the uncalibrated saturated heat-flux estimate is exactly zero even though the nonlinear HSX heat-flux window is finite. This is a useful negative result: it shows that the current one-constant mixing-length rule is not a transferable stellarator transport model and that branch coverage/saturation physics must be improved before absolute stellarator quasilinear-flux claims.

The combined Cyclone-train, Cyclone-Miller-holdout, and HSX-holdout report is generated with:

python tools/build_quasilinear_calibration_report.py \
  --points docs/_static/quasilinear_cyclone_miller_train_holdout_points.json \
  --spectrum docs/_static/quasilinear_hsx_spectrum_scan.quasilinear_spectrum.csv \
  --nonlinear-summary docs/_static/nonlinear_hsx_gate_summary.json \
  --split holdout \
  --case hsx_nonlinear_window \
  --geometry hsx \
  --electron-model adiabatic \
  --fit-train-scale \
  --out docs/_static/quasilinear_hsx_train_holdout_report.json

python tools/plot_quasilinear_calibration.py \
  --report docs/_static/quasilinear_hsx_train_holdout_report.json \
  --out docs/_static/quasilinear_hsx_train_holdout.png
Quasilinear train/holdout calibration including HSX

The report remains calibration_dataset and passed = false. In the absolute-flux panel, open markers denote non-positive quasilinear estimates that are plotted at the documented log-axis floor. The HSX point has a finite nonlinear heat-flux window mean but zero current mixing-length prediction, so the relative error is one by construction.

W7-X NetCDF nonlinear-window path

W7-X uses the same calibration machinery, but the tracked nonlinear window is a runtime NetCDF file rather than a diagnostics CSV. The report builder therefore uses the NetCDF ingestion path described above. A reproducible linear quasilinear spectrum should be generated from a VMEC source, not from an ignored local tools_out/*.eik.nc file:

cd examples/vmec
vmec_jax input.nfp3_QI_fixed_resolution_final
cd ../..
spectraxgk scan-runtime-linear \
  --config examples/linear/non-axisymmetric/runtime_w7x_linear_quasilinear_vmec.toml \
  --ky-values 0.047619047619047616,0.09523809523809523,0.14285714285714285,0.19047619047619047,0.23809523809523808,0.2857142857142857 \
  --Nl 4 --Nm 8 --solver time --dt 0.005 --steps 400 \
  --quasilinear \
  --out docs/_static/quasilinear_w7x_spectrum_scan \
  --no-progress

The bundled command above regenerates the demo spectrum from the shipped QI VMEC input deck after vmec_jax creates examples/vmec/wout_nfp3_QI_fixed_resolution_final.nc. Exact W7-X validation should use the same TOML with --vmec-file pointing to the machine-specific benchmark WOUT; the benchmark WOUT itself is not shipped in Git.

W7-X quasilinear spectrum

All six short-window W7-X linear branches in the tracked electrostatic adiabatic-electron scan are stable under the current gamma_floor = 0 rule. As for HSX, this makes the uncalibrated saturated mixing-length flux zero even though the nonlinear heat-flux window is finite. The W7-X nonlinear NetCDF window is added to the same train/holdout report with:

python tools/build_quasilinear_calibration_report.py \
  --points docs/_static/quasilinear_cyclone_miller_train_holdout_points.json \
  --spectrum docs/_static/quasilinear_w7x_spectrum_scan.quasilinear_spectrum.csv \
  --nonlinear-summary docs/_static/nonlinear_w7x_gate_summary.json \
  --split holdout \
  --case w7x_nonlinear_window \
  --geometry w7x \
  --electron-model adiabatic \
  --fit-train-scale \
  --out docs/_static/quasilinear_w7x_train_holdout_report.json
Quasilinear train/holdout calibration including W7-X

The report remains calibration_dataset and passed = false. The Cyclone-fitted one-constant rule overpredicts Cyclone Miller, while the W7-X stable branches underpredict the finite nonlinear window by construction. This is retained as a negative absolute-flux result and should not be presented as a validated W7-X transport model.

The manuscript-facing combined holdout panel now puts two training geometries (Cyclone and the admitted external-VMEC ITERModel case) together with six held-out nonlinear windows: Cyclone Miller, HSX, W7-X, the admitted D-shaped external-VMEC case, the admitted up-down asymmetric external-VMEC case, and the admitted circular external-VMEC case.

Combined quasilinear train/holdout calibration including stellarator and external VMEC holdouts

This combined report is also calibration_dataset and passed = false. It is the clearest current figure for the absolute-flux story: one-constant mixing length does not transfer across the present tokamak, stellarator, and external-VMEC nonlinear windows. The fitted heat-flux scale uses only the two training points and still leaves the six holdouts at mean absolute relative error about 2.11, with the worst held-out error about 8.08. The external-VMEC points are included only after their high-grid convergence gates passed: D-shaped tokamak at t = 250, ITERModel at t = 350, up-down asymmetric tokamak at t = 450, and circular tokamak at t = 450. The result is useful precisely because it blocks premature absolute quasilinear transport claims and motivates the next saturation-model sweep.

Saturation-rule sweep

The first saturation-rule sweep compares three one-scalar intensity rules using the same train/holdout split: fit one multiplicative scale on the two training geometries and score all six holdouts. The tested rules are the current positive-growth mixing-length rule, the raw linear heat-flux weight, and an absolute-growth mixing-length diagnostic that gives stable branches nonzero intensity. The last rule is included only as a diagnostic stress test; it is not a validated physical saturation rule.

python tools/plot_quasilinear_saturation_rule_sweep.py \
  --workers 4 \
  --out docs/_static/quasilinear_saturation_rule_sweep.png

--workers parallelizes the independent case-row extraction while preserving the same ordered report as the serial run. It does not change the underlying quasilinear spectra, nonlinear window summaries, or validation gates.

Quasilinear saturation-rule sweep across Cyclone, Cyclone Miller, HSX, W7-X, and D-shaped external VMEC

All tested one-scalar rules fail the held-out absolute-flux gate. The current positive-growth mixing-length rule is still the best of the three, but its holdout mean absolute relative error is about 2.11. The raw linear-weight rule is worse at about 2.68, and the absolute-growth diagnostic is worse again at about 3.32. The figure also reports a training-mean null baseline; that null gives holdout mean relative error about 1.20. It is not a quasilinear model, but it is a necessary reviewer check: no calibrated saturation rule should be promoted unless it beats this null baseline as well as the linear-weight baseline. The JSON companion carries the same promotion_gate and currently has no accepted rules. This narrows the next research task: the admitted external-VMEC cases strengthen the negative transfer evidence, but absolute-flux prediction still needs a richer saturation/intensity model than any one-scalar fit tested here.

Shape-aware saturation diagnostic

The next diagnostic tests whether the missing information can be captured by a single low-dimensional shape envelope. It first forms the normalized nonlinear to quasilinear spectrum-shape ratio,

\[R_j(k_y) = \frac{P^{\mathrm{NL}}_j(k_y)} {P^{\mathrm{QL}}_j(k_y) + \epsilon}, \qquad P(k_y) = \frac{Q(k_y)}{\sum_{k_y} Q(k_y)},\]

and fits a shared exponent with per-case intercepts,

\[\log R_j(k_y) = a_j + p \log\left(\frac{k_y}{k_{y,\mathrm{ref}}}\right) + \epsilon_j .\]

The held-out prediction uses only training nonlinear shapes, training scalar fluxes, and the held-out linear spectrum. It then fits the absolute heat-flux scale on the training cases and scores the held-out nonlinear window. The tracked figure uses --passed-shape-only for the exponent fit, so the failed Cyclone shape gate does not contaminate the shape correction used for the other geometries.

python tools/plot_quasilinear_shape_aware_saturation.py \
  --passed-shape-only \
  --out docs/_static/quasilinear_shape_aware_saturation.png
Shape-aware quasilinear saturation diagnostic

This is a useful negative result. The shape-aware power law gives mean leave-one-geometry-out absolute relative error about 0.664, while the linear-weight baseline gives about 0.624. Both fail the 0.35 absolute transport gate, and the shape-aware model does not improve the mean held-out score. The figure also includes a deliberately simple training-mean null baseline; it gives mean relative error about 0.170 on this small dataset because the archived nonlinear windows have similar absolute heat-flux levels. That null baseline is not a quasilinear model, but it is the right reviewer check: a calibrated saturation model should beat this baseline before being used for absolute transport or optimization claims. This closes the one-exponent saturation-envelope test and motivates a richer calibrated model with branch/state features, uncertainty diagnostics, and electromagnetic extensions. Its JSON companion also carries promotion_gate.passed = false so the rejected model cannot be accidentally promoted by downstream scripts.

Candidate uncertainty gate

The candidate uncertainty gate adds prediction intervals to the same leave-one-geometry-out protocol. For each held-out geometry, the candidate is calibrated on the remaining cases, the training log-residuals define a 95% prediction interval, and the held-out point is scored against the nonlinear heat-flux window. A candidate is promoted only if it:

  • passes the 0.35 mean-relative transport gate;

  • beats the training-mean null baseline;

  • beats the linear-weight baseline when it is a new non-baseline model;

  • reaches the interval-coverage gate;

  • passes candidate-specific eligibility checks such as minimum training-set size relative to the number of fitted parameters and matrix conditioning.

python tools/plot_quasilinear_candidate_uncertainty.py \
  --workers 4 \
  --out docs/_static/quasilinear_candidate_uncertainty.png

--workers parallelizes the leave-one-geometry-out holdout rows. The JSON report records the worker count and the identity contract; the numerical acceptance remains the same as the serial report.

Quasilinear candidate uncertainty gate

The richer spectral_envelope_ridge candidate is now accepted on the current eight-case electrostatic portfolio. It uses only two linear-spectrum envelope features, the positive-growth k_y centroid and the heat-flux-weighted k_y width, in a three-parameter log-linear ridge fit (intercept plus two features). On leave-one-geometry-out scoring it reaches mean relative error about 0.295 with interval coverage 7/8. That beats both the training-mean null baseline (about 0.900 on this eight-case leave-one-out metric) and the calibrated linear-weight baseline (about 0.947). The legacy one-scalar rules remain rejected, and the broader four-feature linear_state_ridge candidate remains ineligible because its five fitted parameters still exceed the current training-volume gate. This is the intended research posture: accept the smallest candidate that passes held-out skill and coverage, while retaining the simpler failed models as reviewer-facing negative controls.

All three model-development reports above now carry an input_validation block. It is generated from the nonlinear summary gates before model fitting, so these figures can only be regenerated from nonlinear windows that already passed their validation gates. This is intentionally stricter than a finite-run check: exploratory external-VMEC pilots remain useful for planning, but they cannot enter these quasilinear model diagnostics until their convergence and validation gates pass.

Dataset-sufficiency gate

The current electrostatic-compatible dataset is large enough to reject simple saturation-rule hypotheses and to promote a bounded richer candidate, but it is still not large enough to justify every higher-parameter model class. The dataset-sufficiency gate makes that reviewer-facing scope explicit before any model fit is attempted. It requires:

  • nonlinear input summaries that have already passed their validation gates;

  • at least six electrostatic-compatible calibration cases;

  • at least two explicit training geometries and three held-out geometries;

  • enough leave-one-out training cases per fitted parameter for richer candidates;

  • passed downstream saturation-rule and uncertainty/skill gates.

python tools/plot_quasilinear_dataset_sufficiency.py \
  --out docs/_static/quasilinear_dataset_sufficiency.png
Quasilinear dataset-sufficiency promotion gate

The tracked gate now passes for the current eight-case electrostatic dataset. There are eight admitted cases, two explicit training geometries, and six held-out geometries. That is enough data volume for the one-parameter linear-weight candidate, the two-parameter shape-power-law candidate, and the three-parameter spectral_envelope_ridge candidate, though not yet for the five-parameter linear_state_ridge model. Promotion is therefore scoped: the accepted candidate is the reduced spectral-envelope model documented above, while higher-parameter candidates remain blocked by the train-to-parameter ratio gate. KBM is still listed as a validated but excluded nonlinear case because the present quasilinear diagnostics are electrostatic; electromagnetic quasilinear field-channel normalization and calibration remain separate future work.

Model-selection status

The model-selection status combines the dataset-sufficiency gate, candidate uncertainty gate, and all tracked train/holdout calibration reports into one claim-boundary artifact. It is intentionally not another fit. It answers the reviewer-facing question: is there a positive scoped model-selection result, and are we still avoiding an absolute-flux overclaim?

python tools/plot_quasilinear_model_selection_status.py \
  --optimized-equilibrium-nonlinear-audit docs/_static/production_nonlinear_optimization_guard.json \
  --require-optimized-equilibrium-nonlinear-audit \
  --out docs/_static/quasilinear_model_selection_status.png
Quasilinear model-selection status and claim-boundary guardrails

The current status passes. The accepted spectral_envelope_ridge candidate has leave-one-geometry-out mean relative error about 0.295 and interval coverage 7/8. It beats both the training-mean null baseline (0.900) and the calibrated linear-weight baseline (0.947), while every tracked one-constant train/holdout calibration report remains demoted to calibration_dataset. The status also consumes the selected optimized-QA nonlinear audit, but only as scoped transport evidence for that audited equilibrium. This closes a scoped model-selection lane for the manuscript. It does not promote a runtime/TOML absolute-flux predictor, a universal saturation rule, or a nonlinear turbulence-gradient optimization claim.

The companion claim-boundary artifact docs/_static/nonlinear_turbulence_gradient_evidence_status.json is deliberately stricter: it records that replicated long-window transport evidence is present, while the current nonlinear-gradient artifact is a long-window production candidate that still fails closed on plus-state transport-window spread and propagated gradient uncertainty. The current tracked artifact is the QA/ESS ZBS(1,0) 7.5% bracket: all twelve t=900 nonlinear outputs pass the t=[450,900] output gates, the response is resolved (response_fraction = 0.0319), and the finite-difference bracket is local (fd_asymmetry_rel = 0.044), but the plus ensemble spread is 0.196 > 0.15 and gradient_uncertainty_rel = 1.81 > 0.5. It is therefore not promoted as production turbulence-gradient evidence.

Holdout-gap report

The holdout-gap report is the reviewer-facing companion to the model-selection status. It answers a different question: which currently tracked nonlinear windows are admitted, which candidate windows are excluded, and what exact data product is needed before absolute-flux promotion can be reconsidered?

python tools/build_quasilinear_holdout_gap_report.py \
  --out docs/_static/quasilinear_holdout_gap_report.png
Quasilinear holdout gap report and absolute-flux promotion blocker

The current report admits six holdouts and two training references, but it keeps absolute_flux_promoted = false because the aggregate held-out absolute-flux error remains about 2.11 against the 0.35 gate. The accepted spectral_envelope_ridge candidate remains a scoped model-selection result. Its mean leave-one-geometry-out relative error is about 0.295, but that number is not a saturated absolute-flux promotion because it comes from the candidate-selection uncertainty report rather than a passed absolute train/holdout calibration artifact.

The JSON sidecar now carries an explicit absolute_flux_promotion_requirements block. For the current frozen artifacts it records an error factor of about 6.04 over the absolute-flux gate, with Cyclone Miller as the worst admitted holdout (Q_i = 4.26 observed versus 38.7 predicted). Before absolute-flux promotion can even be reconsidered, the report requires at least three additional independent passed holdouts, at least one additional external-VMEC holdout family, and at least one non-axisymmetric external-VMEC holdout family. These are evidence requirements, not automatic promotion criteria; any future absolute model must still pass the held-out transport-error gate, prediction interval/skill gates, and provenance checks on the same frozen case ledger.

The report tracks 13 excluded candidates. The shaped-tokamak pressure candidate is one of those exclusions: it is finite and late-window stable at t = 450, but its 48x48x32 and 64x64x40 heat-flux windows differ by about 0.306, above the 0.15 grid-agreement gate. The report currently identifies the external-VMEC family as the highest-leverage next source because an ITERModel window already passes as training data and the nearest tracked holdout gap is a near-miss in the same family. The new ITERModel_reference same-family audit passes at t = 450, but it is classified as reproducibility evidence rather than an independent holdout because that family is already consumed by a training reference.

External-VMEC next-holdout runbook

The gap report intentionally stops at metadata ranking. The launch runbook turns that ranking plus the linear external-VMEC candidate screen into a reproducible next-run contract. It does not run simulations and does not promote an absolute-flux model; it records which nonlinear runs should be launched next, which grids and horizons should be used, and which acceptance gate must pass before a new point can enter the calibration set.

python tools/build_external_vmec_holdout_runbook.py \
  --out docs/_static/external_vmec_next_holdout_runbook.png
External-VMEC nonlinear holdout launch runbook

The current runbook is intentionally blocked for unchanged replays. The ITERModel preferred-family audit has now passed, so replaying the same wout_ITERModel_reference.nc ladder would not create independent holdout leverage. The unchanged shaped-tokamak pressure case is also demoted because its tracked t = 450 high-grid convergence gate failed. Families with a recent failed external-VMEC convergence gate are demoted until the rerun protocol is materially changed, for example by increasing resolution or changing the candidate. The generated JSON sidecar includes replayable write_external_vmec_holdout_configs.py commands only when an admissible launch target exists, the recommended high-grid horizons, and the fail-closed acceptance requirements: split = holdout, passed grid/window convergence, a post-transient transport window, and independence from the training reference.

VMEC equilibrium portfolio for future holdouts

The next quasilinear promotion attempt needs more matched nonlinear holdout windows, not more fit parameters. The local vmec_jax checkout includes a useful portfolio of small VMEC equilibria that can seed those future linear scans and nonlinear validation runs without adding the VMEC files to the SPECTRAX-GK repository. The inventory tool records file sizes, checksums, nfp, resolution, aspect ratio, edge rotational transform, beta, and a selection score for follow-up cases:

VMEC_JAX_ROOT=/path/to/vmec_jax
python tools/plot_vmec_jax_equilibrium_inventory.py \
  --data-dir "$VMEC_JAX_ROOT/examples/data" \
  --out docs/_static/vmec_jax_equilibrium_inventory.png
vmec_jax equilibrium inventory for future validation holdouts

The current inventory finds 11 local VMEC equilibria. The best immediate linear/nonlinear holdout candidates are wout_li383_low_res.nc, wout_QI_stel_seed_3127.nc, wout_nfp4_QH_warm_start.nc, wout_cth_like_fixed_bdy.nc, wout_shaped_tokamak_pressure.nc, wout_circular_tokamak.nc, wout_DSHAPE.nc, and wout_purely_toroidal_field.nc. The wout_LandremanPaul2021_QA_lowres.nc fixture is deliberately deferred by the inventory gate because its VMEC reference scale metadata are degenerate (Aminor_p = Rmajor_p = aspect = volume = 0), which is not a valid input to the current VMEC-to-EIK runtime path.

The first bounded smoke scans use ky = 0.10, 0.20, Nl = 2, Nm = 4, dt = 0.005, and 80 explicit time steps to check only that the geometry path, linear solve, and quasilinear-feature writer are finite. They are intentionally short and should not be interpreted as optimized physics scans:

VMEC fixture

smoke result

sampled growth rates

wout_li383_low_res.nc

finite stable branches

-0.0258, -0.0297

wout_nfp4_QH_warm_start.nc

finite stable branches

-0.0243, -0.0186

wout_cth_like_fixed_bdy.nc

finite stable branches

-0.0230, -0.0282

wout_shaped_tokamak_pressure.nc

finite stable branches

-0.0669, -0.0562

These are not accepted quasilinear transport calibration points yet. Each candidate must first get a reproducible production-resolution SPECTRAX-GK linear quasilinear scan, a matched nonlinear heat-flux window, and a passed nonlinear comparison/physics gate before entering the leave-one-out calibration reports above.

The first full-ky follow-up kept the same W7-X-style VMEC runtime TOML but used Nl = 4, Nm = 8, and 400 explicit time steps over the six-point stellarator ky grid used elsewhere in this documentation. Li383 remains stable over that short linear scan, with gamma decreasing from about -0.022 to -0.078. The shaped-tokamak fixture is also stable over the same grid, with gamma increasing from about -0.080 to -0.0186 but not crossing zero. The nfp4 QH and CTH-like fixtures are more useful for the next validation lane: QH reaches gamma = 0.0328 at ky = 0.2857, while CTH-like reaches gamma = 0.0488 at the same sampled ky. The figures below are still linear-feasibility artifacts only; they motivate matched nonlinear windows but do not validate an absolute quasilinear saturation rule.

External nfp4 QH VMEC linear quasilinear feasibility spectrum External CTH-like VMEC linear quasilinear feasibility spectrum

The QH convergence failure triggered a broader candidate screen before any additional nonlinear promotion. The five-point screen uses ky = 0.095, 0.190, 0.286, 0.381, 0.476, Nx = Ny = 48, Nz = 32, Nl = 4, Nm = 8, and 400 explicit RK4 steps. Among the finite external-VMEC candidates, wout_DSHAPE.nc is the strongest unstable branch with gamma = 0.096 at ky = 0.476. Circular tokamak and ITER-model fixtures are close behind, while QI/QA/QH reference fixtures are stable or fail the current geometry screen. The screen output is tracked as docs/_static/external_vmec_candidate_linear_screen.csv.

The refreshed local portfolio adds three explicit outcomes to that screen. The finite-beta wout_li383_low_res.nc branch stays stable over ky = 0.095 to 0.476 under the same adiabatic-electron ITG screen. The new wout_QI_stel_seed_3127.nc branch is finite and only weakly unstable: the refined low-ky scan peaks at gamma≈3.8e-3 near ky≈0.143, and a Krylov check confirms the lowest-ky branch remains near marginality. The current runbook therefore treats this as QI/seed-robust feasibility evidence, not as a nonlinear transport launch target. A separate wout_basic_non_stellsym_simsopt.nc attempt fails the present VMEC flux-tube cut contract before time integration, so it is tracked as a geometry-contract failure rather than a physics result.

QI seed VMEC near-marginal linear quasilinear spectrum

The companion branch-refinement gate compares the same time-propagated branch with a Krylov check at the lowest unstable sampled mode. Finite rows, contiguous positive low-ky support, and Krylov consistency pass, but the launch-growth subgate fails because max(gamma)≈3.8e-3 is below the 0.02 nonlinear-launch threshold. This is the intended outcome: the result is useful QI branch-continuation evidence, not a nonlinear transport validation or absolute-flux calibration point.

QI seed branch-refinement gate

Nonlinear follow-up configs for these external VMEC candidates should be generated with tools/write_external_vmec_holdout_configs.py rather than by hand. The standard command writes matched 48x48x32 and 64x64x40 TOMLs for a t = 150 initial run and a t = 250 restart continuation, plus a JSON manifest containing the launch commands and restart-copy commands. This keeps every candidate on the same ITG/adiabatic-electron physics, dissipation, sampling, and output convention before the convergence gate decides whether the case is admissible. The next promotion step is replicated nonlinear transport evidence: use --seed-variant for distinct randomized initial conditions and --dt-variant for fixed-step sensitivity checks. The generated TOMLs carry explicit seed/timestep metadata so tools/check_nonlinear_window_ensemble_readiness.py can fail closed until each admitted case has at least two passed seed and timestep windows.

External D-shaped tokamak VMEC linear quasilinear feasibility spectrum

D-shaped tokamak nonlinear pilots were then run on the office GPUs with the same ITG/adiabatic-electron physics as the QH and CTH-like pilots. The t = 150 low-to-mid-grid gate passes immediately: Nx = Ny = 32, Nz = 24 and Nx = Ny = 48, Nz = 32 differ by only 0.039 on the common late-window mean and 0.050 on the least-trending-window mean. The t = 150 mid-to-high-grid gate is close but not acceptable, with 0.201 common-window and 0.262 least-window relative differences. This is exactly why the time-window check is required before promoting a run.

Extending the 48x48x32 and 64x64x40 runs from t = 150 to t = 250 closes the gate. The common-window means are about 16.1 and 18.5 with symmetric relative difference 0.139; the independently selected least-trending-window means are about 15.9 and 17.7 with relative difference 0.108. Both are below the 0.15 threshold, and the trend/CV/sample-count gates also pass. D-shaped tokamak is therefore the first external-VMEC nonlinear transport holdout from this campaign admitted into the quasilinear calibration report. Admission does not promote the current absolute-flux model: the Cyclone-trained one-constant mixing-length estimate overpredicts this D-shaped holdout by about two orders of magnitude.

External D-shaped tokamak VMEC nonlinear low-to-mid-grid convergence gate External D-shaped tokamak VMEC nonlinear t250 high-grid convergence gate

The circular-tokamak follow-up is now a useful example of why bounded extension ladders must be explicit. At t = 150 the 48x48x32 to 64x64x40 pair had excellent grid agreement but still failed the common and least-window trend gates. Extending both runs to t = 250 removed the trend issue, but the common-window coefficient of variation rose to about 0.229 and the common/least-window grid differences rose to about 0.180 and 0.307. The case was therefore still excluded at that stage. A final, pre-declared extension to t = 450 closes the high-grid gate: the common-window heat-flux means are about 19.18 and 19.43, the common-window symmetric relative difference is 0.0128, the least-trending-window difference is 0.0468, and all trend/CV/sample-count gates pass. Circular tokamak is therefore admitted as an independent held-out external-VMEC transport window, while the failed shorter gates remain tracked as useful convergence-history artifacts.

External circular tokamak VMEC linear quasilinear feasibility spectrum External circular tokamak VMEC nonlinear t450 high-grid convergence gate External circular tokamak VMEC seed/timestep replicate gate

The shaped-tokamak pressure candidate was then run through the same 48x48x32 / 64x64x40 ladder to t = 450. This is a scientifically useful negative result rather than a calibration point. Both traces are finite and their late windows are statistically interpretable: the 48x48x32 common-window heat flux is about 11.45 with relative slope 6.3e-5 per time unit, and the 64x64x40 common-window heat flux is about 8.41 with relative slope -9.3e-4 per time unit. The pair fails only the grid-agreement gate: the common-window and least-window symmetric relative heat-flux differences are both about 0.306. The candidate stays excluded from quasilinear calibration until a higher-resolution or otherwise modified protocol demonstrates grid convergence.

External shaped-tokamak pressure VMEC nonlinear t450 high-grid convergence gate

The next screened unstable axisymmetric external-VMEC candidate, wout_ITERModel_reference.nc, does close the gate after one bounded extension ladder. The t = 150 pair was already close: the common-window grid difference was only about 0.073, but the 64x64x40 trace was still drifting on the common window and the least-window difference was about 0.164, slightly above the 0.15 threshold. Extending the same 48x48x32 and 64x64x40 runs first to t = 250 and then to t = 350 closes the gate cleanly. At t = 350 the common-window heat-flux means are about 22.41 and 22.05, the common-window symmetric relative difference is only 0.0165, the least-window difference is 0.1415, and all trend/CV/sample-count gates pass. ITERModel is therefore admitted as the second external-VMEC nonlinear transport holdout from this campaign. As with D-shaped tokamak, this strengthens the negative-transfer evidence for the current one-constant mixing-length model rather than rescuing it: the observed heat flux is about 22.0 while the uncalibrated quasilinear prediction is only about 0.389.

External ITERModel VMEC linear quasilinear feasibility spectrum External ITERModel VMEC nonlinear t350 high-grid convergence gate

The same ITERModel family was then rerun as an independent t = 450 audit using the corrected restart ladder. The audit passes the high-grid gate: the 48x48x32 and 64x64x40 common-window heat-flux means are about 21.31 and 20.14, with common-window and least-window symmetric grid differences 0.056 and 0.055. This is useful reproducibility evidence for the training reference, but it is not admitted as a new quasilinear holdout because it is not independent of the training-family geometry.

External ITERModel VMEC same-family audit t450 high-grid convergence gate

The external-VMEC runbook is now deliberately fail-closed. It requires a screened linear growth rate of at least gamma = 0.02 before it writes any nonlinear launch command, and it still requires a matched post-transient transport window plus passed grid/window convergence before a point can enter calibration. This blocks three common failure modes: rerunning a same-family training audit as if it were independent, replaying a family with a recent failed convergence gate unchanged, and launching expensive nonlinear simulations from near-marginal linear branches such as the present QI seed.

The next screened unstable external-VMEC tokamak candidate, wout_up_down_asymmetric_tokamak_reference.nc, also closes after a bounded extension ladder. At t = 150 the grid difference already passed (0.138), but the 64x64x40 common-window trend was still too large (7.32e-3 per time unit). Extending both runs to t = 250 reduced the common-window relative difference to 0.0411 and the least-window difference to 0.0499, but the common-window trend on the lower grid was still slightly above threshold (2.78e-3 versus 2.0e-3). A final extension to t = 450 closes the gate cleanly: the common-window heat-flux means are about 7.43 and 7.76, the common-window symmetric relative difference is 0.0435, the least-window difference is 0.0242, and both common and least-window trend/CV/sample-count gates pass. This case is now the third admitted external-VMEC nonlinear transport holdout in the tracked stellarator/tokamak calibration portfolio.

External up-down asymmetric tokamak VMEC linear quasilinear feasibility spectrum External up-down asymmetric tokamak VMEC nonlinear t450 high-grid convergence gate

A reduced-grid nonlinear QH pilot has also been run locally at Nx = Ny = 32, Nz = 24, Nl = 4, Nm = 8, and dt = 0.05 using the same VMEC fixture. The original t = 20 trace was intentionally not promoted: its late-half mean heat flux was only about 1.78e-4 and still growing. The lane has now been extended from the saved nonlinear state to t = 150. The longer trace reaches a meaningful post-transient heat-flux level: the least-trending window is t = 77.55 to 150.00, with mean heat flux about 19.6, standard deviation about 1.14, and relative linear trend about -3.25e-4 per time unit. This closes the specific concern that the QH pilot was only measuring a startup/noise-floor heat flux. It is still a reduced-grid feasibility result, not a calibrated transport holdout, until a grid/window convergence gate passes.

External nfp4 QH VMEC nonlinear feasibility pilot

A higher-grid QH companion run at Nx = Ny = 48 and Nz = 32 was then run on the office GPU to the same t = 150 horizon. It is finite and has a flat late trace, but the late heat-flux level changes materially: the common late-window mean is about 11.6 instead of 19.8, and the independently selected least-trending means are about 12.0 instead of 19.6. The resulting symmetric relative grid differences are about 0.523 on the common window and 0.480 on the least-trending windows, both above the 0.15 gate.

The follow-on Nx = Ny = 64 and Nz = 40 run is also finite to t = 150, but it moves the transport level again instead of confirming the 48x48x32 point: the common-window mean is about 6.0 and the least-trending mean is about 5.8. The mid-to-high-grid symmetric relative differences are about 0.630 and 0.704. QH is therefore a useful negative convergence result, not a new quasilinear calibration holdout.

Higher-grid external nfp4 QH VMEC nonlinear feasibility pilot External nfp4 QH VMEC nonlinear grid convergence gate High-grid external nfp4 QH VMEC nonlinear feasibility pilot External nfp4 QH VMEC nonlinear high-grid convergence gate

The same reduced-grid protocol was then applied to the CTH-like fixture and extended on the office GPU to t = 150. The run remains finite and develops a clear late nonlinear state: the least-trending tracked window is t = 75.05 to 150.00 with mean heat flux about 23.1, standard deviation about 1.79, and relative linear trend about 1.2e-3 per time unit. This is the strongest current external-VMEC nonlinear candidate. It is still kept as a feasibility pilot, not a transport-calibration holdout, because no external-VMEC nonlinear acceptance gate has been defined for this fixture and no independent reference or production-resolution convergence check has passed yet.

External CTH-like VMEC nonlinear feasibility pilot

The first bounded grid check repeats the same run at Nx = Ny = 48 and Nz = 32. It is also finite to t = 150 and has a flatter late trace, but the transport level changes materially: the common t = 75.05 to 150.00 window has mean heat flux about 12.8 rather than 23.1, and the least-trending t = 120.05 to 150.00 window has mean heat flux about 14.5. This is a useful negative convergence result. CTH-like should not be used as a quasilinear calibration holdout until the grid, hypercollision, and window-selection protocol are frozen and the production-resolution comparison passes a case-specific tolerance.

Higher-grid external CTH-like VMEC nonlinear feasibility pilot

The explicit convergence gate follows the same evidence chain used in nonlinear gyrokinetic benchmark papers: time traces and saturated heat-flux windows are compared, and the candidate is not promoted unless the heat flux is robust to resolution and window choice. This mirrors the Cyclone and W7-X time-trace/convergence practice in [Dimits00] and [GX], while the stellarator-domain sensitivity documented by [Sanchez21] motivates keeping external-VMEC cases behind a conservative gate until flux-tube and resolution choices are fixed. The current CTH-like pair fails the common-window stationarity and grid-refinement requirements: the common-window symmetric relative heat-flux difference is about 0.571 and the least-trending-window difference is about 0.453, both above the 0.15 production threshold. That threshold is intentionally strict enough to match the order of nonlinear heat-flux convergence tolerances reported for Laguerre-Hermite gyrokinetic calculations in [GX]. Long turbulent time series can still be physically informative, as emphasized by W7-X heat-flux time-series analyses [Papadopoulos23], but they are not calibration data until this gate passes.

External CTH-like VMEC nonlinear grid convergence gate

The nonlinear time-horizon audit below is a guardrail for the manuscript and documentation. It classifies archived heat-flux artifacts by their actual time coverage and claim level. The long matched nonlinear gates for Cyclone, Cyclone Miller, KBM, W7-X, and HSX pass the current release comparison envelopes. D-shaped tokamak passes the external-VMEC t = 250 high-grid convergence and replicated seed/timestep gates, while circular tokamak passes the external-VMEC t = 450 high-grid gate and the longer t = 700 replicated seed/timestep gate. The QH and CTH-like external-VMEC traces are long feasibility pilots that still need convergence gates. The compact finite-difference audits remain startup plumbing checks, and the differentiable nonlinear-window optimization examples remain reduced-envelope estimators rather than production nonlinear transport averages.

Nonlinear transport time-horizon audit

The normalized W7-X spectrum-shape gate does pass when the linear heat-flux-weight distribution is compared with the resolved nonlinear HeatFlux_kyst spectrum from the NetCDF output:

python tools/plot_quasilinear_spectrum_shape_gate.py \
  --spectrum docs/_static/quasilinear_w7x_spectrum_scan.quasilinear_spectrum.csv \
  --nonlinear tools_out/final_nonlinear_audit/w7x_spectrax_current_adaptive_t200.out.nc \
  --out docs/_static/quasilinear_w7x_spectrum_shape_gate.png \
  --ql-column heat_flux_weight_total \
  --nonlinear-variable Diagnostics/HeatFlux_kyst \
  --tv-gate 0.2 \
  --cosine-gate 0.95 \
  --title "W7-X quasilinear/nonlinear ky-spectrum shape gate"
W7-X quasilinear and nonlinear ky-spectrum shape gate

The tracked W7-X shape gate passes with total-variation distance about 0.056 and cosine similarity about 0.992. This supports the linear-spectrum shape diagnostic for W7-X under the current setup, while the absolute saturated-flux model remains rejected by the train/holdout report.

The same HSX artifacts also close the first real spectrum-shape gate. This gate does not use the saturated flux, because the current stable-branch mixing-length rule would erase the spectrum. Instead it compares the normalized linear heat-flux-weight spectrum against the normalized nonlinear HeatFlux_kyst spectrum averaged over the resolved nonlinear diagnostics:

python tools/plot_quasilinear_spectrum_shape_gate.py \
  --spectrum docs/_static/quasilinear_hsx_spectrum_scan.quasilinear_spectrum.csv \
  --nonlinear tools_out/final_nonlinear_audit/hsx_nonlinear_t50.out.nc \
  --out docs/_static/quasilinear_hsx_spectrum_shape_gate.png \
  --ql-column heat_flux_weight_total \
  --nonlinear-variable Diagnostics/HeatFlux_kyst \
  --time-max 49.2 \
  --tv-gate 0.2 \
  --cosine-gate 0.95 \
  --title "HSX quasilinear/nonlinear ky-spectrum shape gate"
HSX quasilinear and nonlinear ky-spectrum shape gate

The tracked HSX shape gate passes with total-variation distance about 0.11 and cosine similarity about 0.97. This supports the linear spectrum-shape diagnostic while still rejecting any absolute saturated-flux claim from the current uncalibrated rule.

Axisymmetric spectrum-shape gates

The same spectrum-shape extraction is also tracked for the electrostatic axisymmetric adiabatic-electron nonlinear windows. These gates compare only the normalized ky distribution of the linear heat-flux weight against the resolved nonlinear HeatFlux_kyst spectrum. They do not test the absolute mixing-length heat-flux level.

Cyclone Miller quasilinear and nonlinear ky-spectrum shape gate

Cyclone Miller passes the initial shape gate with total-variation distance about 0.094 and cosine similarity about 0.983. This is a useful positive gate: the linear heat-flux-weight spectrum and the resolved nonlinear heat-flux spectrum place comparable weight across the scanned ky range under the current window.

Cyclone quasilinear and nonlinear ky-spectrum shape gate

The long-window Cyclone shape gate is intentionally retained as a failed gate: it gives total-variation distance about 0.215 and cosine similarity about 0.896 against the current TV <= 0.2 and cosine >= 0.95 criteria. The mismatch is concentrated in the low- and high-ky tails, which points to a saturation/intensity-model limitation or a window/branch-selection issue rather than a failed file-ingestion path. This is a paper-facing negative result and should guide the next quasilinear saturation-model sweep.

KBM is not included in the current spectrum-shape quasilinear gate because the tracked KBM validation lane is electromagnetic, while the implemented quasilinear diagnostic currently validates only electrostatic field channels. KBM should enter this section only after electromagnetic quasilinear weights for A_parallel and B_parallel have independent normalization and finite-difference/linear-diagnostic gates.