Feat/cmip7 awiesm3 veg hr #266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
JanStreffing
wants to merge
381
commits into
prep-release
from
feat/cmip7-awiesm3-veg-hr
4db542e
pycmor CMIP7 global attributes: emit CV-valid values
JanStreffing a430fde
pycmor output: emit external_variables for cell_measures (CF 7.2)
JanStreffing d5e3ed5
pycmor std_lib: reusable cell_measures steps + fx FrozenPipelines
JanStreffing 9b191ee
std_lib cell_measures: AreacellaFxPipeline reads model output, not mesh
JanStreffing 266e472
configs: migrate areacello/areacella pipelines to std_lib FrozenPipel…
JanStreffing b505886
HR/LR yaml sync + repoint for test runs
JanStreffing 0a1f79b
HR core_atm sbatch wrapper + CAP7 data volume estimate
JanStreffing 9bff49e
std_lib + configs: CMIP7 fx/flag/unstructured-grid fixes
JanStreffing 0700690
timeaverage: default flox engine to numpy, override knob
JanStreffing 975ea11
std_lib: pluggable netCDF codec + write scheduler knobs
JanStreffing 1fba182
pycmor output: three HR-atmos fixes (dask _FillValue, XIOS bounds, co…
JanStreffing 00478b0
bounds recovery: re-read XIOS bounds from input files at save time
JanStreffing 29dc348
bounds variables: drop the spurious ``units`` attribute
JanStreffing 8443aeb
bit-level quantization: default BitGroom-5, skip bounds / coord vars
JanStreffing 6959c2e
add wip verification
JanStreffing a61be16
add full hr test runner script
JanStreffing 0c31f9b
update HR yamls with performance optims and removal of double pipelin…
JanStreffing 09c48bc
env checker to ensure we run on env with hdf5 and nc from hpc
JanStreffing ae3a92f
add step rechunk_time
JanStreffing a6b229e
update all example scripts for lr and hr benchmarking
JanStreffing 743aee7
script:// step paths: support env-var expansion, migrate yamls to $PY…
JanStreffing 1927d89
feat(land): add 75 LPJ-GUESS CAP7/VEG/EXTRA rules and depth/pool loaders
39afd78
ocean: rewire uos / vos onto unod_sfc / vnod_sfc, drop 3hr tauuo / ta…
JanStreffing fb695bd
oifs: sync xios xml from esm_tools (file_def, context_ifs, grid_def)
JanStreffing 57e4add
oifs: drop redundant + unrequested OIFS streams; retarget pycmor rule…
JanStreffing 6475250
extra_land: drop old composite c3PftFrac rule + lpjg_monthly_sum_pipe…
JanStreffing 47b1154
extra_land: drop 5 more old PFT-fraction rules colliding with the LPJ…
JanStreffing 65945ef
inherit: switch netcdf_write_scheduler from threads to synchronous on…
JanStreffing a0866d4
ocean: implement compute_msftm_density / compute_msftmmpa_{depth,dens…
JanStreffing afedbab
doc: add sanity-check ranges for 53 newly-active HR rules
JanStreffing d1ce806
cap7_aerosol: cmor 1 model year of GHG, not 273 forcing years
JanStreffing 59cfa1e
xios: drop duplicate _<suffix>_ infix from OIFS output filenames
JanStreffing 0b3501d
xios dedup followups: 11 second_input_pattern + 3 extra_land per-var …
JanStreffing 08d09c2
extra_land: fix singular pipeline: -> pipelines: on 6 LPJ PFT rules
JanStreffing f1d4c9d
tooling + cap7_atm fixes
JanStreffing cd5f5d2
fix: pipeline-key typos, prefix typos, generic-pattern updates across…
JanStreffing a103070
fix: 2 quoted-pattern dups + lrcs_land static per-var
JanStreffing 0551d1b
fix: FESOM *_file: regex-form -> glob-form + repoint glob support
JanStreffing 48fa47f
lrcs_seaice: replace regrid_regular_to_fesom with regrid_oifs_to_fesom
JanStreffing 2508561
naming: drop cap7/cmip7 from XIOS file_ids; merge collisions
JanStreffing ff183a1
examples LR test yamls: repoint to LR_run_test3
JanStreffing f3b45d5
tools: add sanity-check walker + report generators
JanStreffing 8e99422
issues_y1587_sanity.md: refresh with all 695/695 files
JanStreffing df15285
fix unit handling and compute bugs in fesom-derived rules
JanStreffing f98e394
sanity_check_ranges: widen bounds with literature support
JanStreffing 74a1616
disable fesom diagnostics that are dead under linfs ALE
JanStreffing a37195d
zg: do divide-by-g in xios, drop rule-side scale_factor
JanStreffing f61f01b
slab-loop save path + memory-pressure bench artifacts
JanStreffing 8aa8eb3
HANDOFF: update with pyconcat retry numbers + revised throughput math
JanStreffing a65cd13
HANDOFF: append mode is the recommended default
JanStreffing ebecb51
slab loop: fix multi-time-axis append, add chunk-count guard, input f…
JanStreffing f8bc0ae
slab loop: detect aux time dims by name/CF attrs, not size match
JanStreffing cd8341f
slab loop: skip when aux time dim has size != primary time
JanStreffing 3174c79
slab loop: tighten multi-time-axis guard to also skip parallel-axis case
JanStreffing 5d90ccd
slab loop: rebalance slab boundaries + override time-axis chunksize
JanStreffing 1c7f2d7
Revert slab-loop save path: append-mode silent truncation
JanStreffing 7338ba7
HANDOFF: investigation closed with negative result + cgroup-v2 watchdog
JanStreffing 0e76600
recipe: fix rtmt_mon formula and snd_mon time-coord alignment
JanStreffing 2cc43f6
README: align compute_rtmt docstring with the corrected formula
JanStreffing 1cb45e4
Round 1 closed (negative): h5netcdf and inline_array both regress
JanStreffing 3169ce5
Round 2 closed (negative): Prefect task collapse no measurable win
JanStreffing 39a0c79
Round 3 audit: all four candidates dead or unverifiable
JanStreffing a6ff822
Round 4: contention sweep on mini-cap7 finds 3x4x48 as new default
JanStreffing d280114
fix _HLGExprSequence pickle bug in save under prefect-dask
JanStreffing 8d5340d
Add save_dataset heartbeat + design proposal for subflow deadlock fix
JanStreffing a41103a
HR submit: new default 4x4x16 + collapse (27% wall reduction)
JanStreffing e8aa226
Production tuning: thread/mem-limit pass-through + parallel cleanup +…
JanStreffing 1d2e1b1
fix cap7_atm rule inputs: add 2d to file_def, year-lock secondary pat…
JanStreffing fcd43cc
Eliminate parent×subflow deadlock; fix gate-A infra flakes
JanStreffing 6773ea5
Throttle parent submission to W*TPW via as_completed rolling window
JanStreffing 90f382f
Throttle parent submission also in _parallel_process_prefect
JanStreffing a30a963
DESIGN_PROPOSAL §10.6: gate-A v2 final + new failure mode + throttle
JanStreffing 8046000
Add CLI overrides for run-root, year range, mesh, output, slurm memory
JanStreffing 55bb37e
Fix CLI-override regressions vs repoint_hr_year.py (R1, R2)
JanStreffing 961c492
Migrate *_file: literal globs to *_path:/*_pattern: form
JanStreffing 42e8a6a
Recipe fixes post-CLI migration: F1+F3+F4+F5+F6 (cli5 545/11)
JanStreffing 6ef5eca
sifllattop / siflsenstop: switch to atmos_1h_sfc_hfls/hfss
JanStreffing 53eb778
mask_where_no_seaice: fix duplicate-time + bounded-memory broadcast
JanStreffing 7091733
sanity_check: add HTML report generator + Test_06 results
JanStreffing 5ba2f33
sanity_check: add per-variable map plots to HTML report
JanStreffing b885674
sanity_check/build_maps: fix dim detection + file picker
JanStreffing 46058d3
sanity_check/build_html_report: long names + description, fix PICONTR…
JanStreffing 9bdd870
sanity_check/build_maps: 3-panel maps + level-name fix
JanStreffing 54c7b28
sanity_check: split PICONTROL_NONZERO from PHYS_NEG_VALUES
JanStreffing 932d9a0
sanity_check: time-series plots for hemispheric/global scalars
JanStreffing cb0de4c
sanity_check html: widen layout so 3-panel maps render large
JanStreffing 799d9ab
fix snm sign / vsfcorr fill / siarea unit + handoff doc
JanStreffing 7aea762
build_maps: walk through level dim to find a non-NaN slice
JanStreffing 5170e3c
sanity_check: order-of-magnitude FAIL + Baltic-aware salinity bounds
JanStreffing 0644699
build_maps scatter: wrap lon to -180..180 before plotting
JanStreffing 803a150
sanity_check: PICONTROL classifier narrowed to anthropogenic-flux only
JanStreffing d72c0b2
Apply rho_0-family scale to depth-integrated ocean rules
JanStreffing 8bbeb68
sanity_check maps: regen with scatter + lon-wrap (333/335)
JanStreffing 7e5e3c4
sanity_check + dsn: EXTREME_OUTLIER tier; fix dsn double-conversion bug
JanStreffing 9fec294
fix sea-ice conductive/turbulent flux sign convention
JanStreffing c36726a
sanity_check: scale-too-small check (observed range << expected envel…
JanStreffing 1bcce1d
sanity_check: tighten scale-too-small threshold to 10x
JanStreffing 663401c
sanity_check html: rename Ice / move landIce -> Land page
JanStreffing a766437
reports: re-walk Test_06_cli_y1587_v7 after cli9 rule fixes
JanStreffing 1db0265
handoff: test06_cli9 results + 5 follow-up items
JanStreffing e0e93ee
sanity_check: per-file diagnosis + hemisphere-integral unit-conversion
JanStreffing 8005217
sanity_check index: report both per-variable and per-file counts
JanStreffing a53979f
sanity_check: per-file cards (one card per .nc file)
JanStreffing 2a789b7
sanity_check: per-file maps + compute-node SLURM script
JanStreffing 5e255a3
sanity_check html: re-render with per-file maps from compute node
JanStreffing ca6068a
sanity_check: widen areacella for HR, fix ua mean, skip checks for wi…
JanStreffing 08ad0ad
reports: refresh 17 stale Test_06 entries + reclassify areacella/ua
JanStreffing c07a81a
sanity_check ranges: widen fFire/fFireAll/fFireNat bounds
JanStreffing d90fea5
sanity_check: cadence-aware bounds (var_<cadence> fallback to plain var)
JanStreffing 563248e
sanity_check ranges: widen HR DGVM carbon-pool bounds with literature…
JanStreffing ad63293
sanity_check ranges: widen difvho/difvso bounds to include convective Kv
JanStreffing 244a6e1
sanity_check ranges: widen siflcondtop and siflfwbot for HR thin-ice …
JanStreffing 8507e69
hfbasin: replace per-element-area approximation with tripyview edge-c…
JanStreffing 9a6aec5
save_dataset: tmpfs-staged atomic write (Option A + A.5)
JanStreffing 87bc1fe
save_dataset: watchdog timeout + retry (Option E)
JanStreffing 7d49257
sisnmass NH/SH: apply rho_snow scaling before unit conversion
JanStreffing 9bf31f1
integrate_over_hemisphere: replace fancy isel with mask-and-multiply
JanStreffing 9ab2cd2
save_dataset watchdog: clarify the log message is informational
JanStreffing 17a4cf6
save_dataset: watchdog no longer raises -- diagnostic only
JanStreffing dbf3a08
sltbasin: port to tripyview edge-crossing integration
JanStreffing 6a5c67c
sanity_check ranges: hfx/hfy bounds reflect per-cell not basin-integr…
JanStreffing d6233d0
lrcs_seaice failure fixes: pipeline throttle_group + secondary-mf lru…
JanStreffing 3604c53
save_dataset: move compute off driver onto cluster workers (Fix #3)
JanStreffing 341ddee
shard isolation: SLURM-level per-shard yamls, 1 array per tier
JanStreffing 5973a3f
shard runner: 512G cgroup, fix3-off default, graph-size instrumentation
JanStreffing 82aea52
runner+submitter: per-tier --mem override, default back to --mem=0
JanStreffing e452940
submitter: per-tier Fix #3 override via SHARD_FIX3 env
JanStreffing fb639fa
_safe_to_netcdf: retry transient compute errors
JanStreffing cfb9320
_safe_to_netcdf: gc.collect + malloc_trim between saves to fight glib…
JanStreffing c855236
submitter: lrcs_seaice gets Fix #3 ON; LRCS_SEAICE_MEM env knob
JanStreffing cd99b57
Revert "submitter: lrcs_seaice gets Fix #3 ON; LRCS_SEAICE_MEM env knob"
JanStreffing a426963
Revert "_safe_to_netcdf: gc.collect + malloc_trim between saves to fi…
JanStreffing 7e2f49f
Revert "_safe_to_netcdf: retry transient compute errors"
JanStreffing 76b1d47
_process_rule: @task(retries=3, retry_delay_seconds=30)
JanStreffing ea564dd
_safe_to_netcdf: gc.collect + malloc_trim between saves
JanStreffing 0c4aff1
submitter: lrcs_seaice gets 6h walltime; others stay 3h
JanStreffing 821f0eb
Revert "_process_rule: @task(retries=3, retry_delay_seconds=30)"
JanStreffing 8815440
_process_rule: manual whole-rule retry on transient dask errors
JanStreffing 2d705fa
submitter: extra_atm runs with N_WORKERS=3 (was 4)
JanStreffing 5c955bf
lrcs_seaice runs under jemalloc to bound malloc fragmentation
JanStreffing 9dfcf24
submitter: jemalloc on for all tiers (was lrcs_seaice only)
JanStreffing cc19fc6
submitter: cap7_atm switches to SHARD_FIX3=off (regression A/B)
JanStreffing 338ec44
submitter: extra_atm flips to fix3=off; drop tier_workers override
JanStreffing 7900daa
load_lpjguess_*: vectorize df.iterrows() to fix dask heartbeat deadlock
JanStreffing fba441c
submitter: extra_atm joins lrcs_seaice on --mem=512G
JanStreffing 9f0e520
filecache: vectorize timestamp parsing in select_range/validate_range…
JanStreffing 9ba14cb
submitter: drop lrcs_seaice 6h walltime override, use global 3h
JanStreffing d7d375e
submitter/runner: SHARD_DRS=on emits full CMIP DRS output tree
JanStreffing fc0d215
sanity_check: fix per-file bounds display + recurse into nested cmori…
JanStreffing 900c575
sanity_check: report fixes + cli37 report; transport: kg/s edge width
JanStreffing 5667ab2
seaice: mask hxy-si rules, fix sisaltmass and sidmasstran[xy], attach…
JanStreffing d4ce88d
sanity_check: derive realm from file's :realm attr, split cards per r…
JanStreffing 775a1b0
cap7_land: disable 8 LPJ-GUESS rules that duplicated core compounds
JanStreffing e7306e2
ocean: fix hfds sign, msftbarot equator band, wmo vertical area, hfba…
JanStreffing e332044
sanity_check_ranges: tune difmxylo, sfdsi bounds (cli37 Christian rev…
JanStreffing b706941
cap7_land: add vegHeightGrass_mon rule (hxy-grs)
f87e49a
sanity_check_ranges: hfds bounds for post-sign-flip recipe
JanStreffing 80fe7db
sanity_check_ranges: widen hfbasin to ±10 PW for HR monthly extremes
JanStreffing 8b19d2c
sanity_check_ranges: relax 5 land bounds per Laszlo/Christian round 2
JanStreffing 51789a6
veg/cap7 land: clip-noise + clip-floor steps + treeFrac total yearly …
JanStreffing 2fd7241
veg_land sanity-check round 1: plan + D4 handoff + reviews
JanStreffing c88c0e2
wo: interface→midpoint averaging fixes "clean top, noisy below" pattern
JanStreffing 470641b
cap7_land: add yearly rules for per-PFT treeFrac{BdlDcd,BdlEvg,NdlDcd…
465afa4
sanity_check_ranges: loosen cli37 piControl bounds for fAnthDisturb, …
1dbb5e7
xios cmip7 RH: wire send_cmip7_rh.F90 outputs into field_def/file_def
JanStreffing 7557ac4
lrcs_seaice: remove redundant mask_where_no_seaice from 13 FESOM-nati…
JanStreffing 35735fc
CMIP7 DReq: bump v1.2.2.2 -> v1.2.2.3
JanStreffing d044b36
veg_land: drop 4 PFT-yearly treeFrac rules — not in CMIP7 data request
JanStreffing 07390ec
submit_hr_year_shards: extra_land shard size 20 → 5
JanStreffing 5a61ad3
lrcs_seaice, veg_land: force serial rule submission via throttle_group
JanStreffing 3299a99
throttle_caps: yaml path dead, use PYCMOR_THROTTLE_CAPS env var instead
JanStreffing 9d4579c
cmorizer: rule.throttle_group as fallback for unpipelined rules
JanStreffing 4fdade1
fesom-ingest tiers: tighten input patterns to native-only (\d{4})
JanStreffing 3d47d27
launcher: optional gr-grid variant via WITH_GR=yes
JanStreffing 1a3b3e6
core_seaice, cap7_seaice, veg_seaice: serial throttle
JanStreffing daef954
submit_hr_year_shards: fix WITH_GR fesom detection + gr throttle
JanStreffing 517aeae
generate_gr_yaml: drop rules using FESOM-mesh-dependent pipelines
JanStreffing 5779fcb
generate_gr_yaml: expand mesh-pipeline filter + drop rules with no gr…
JanStreffing 5ea55d8
generate_gr_yaml: fix step substrings, add pipeline-name filter
JanStreffing 711c014
submit_hr_year_shards: trial 256G for lrcs_seaice & extra_atm (cli49)
JanStreffing f3bd753
submit_hr_year_shards: split cli49 outcome — lrcs_seaice 256G, extra_…
JanStreffing 0454145
fix year-filter regression: handle \d{4} placeholder in FESOM patterns
JanStreffing 2a0045c
core_atm: serial throttle (parallel-save HDF5 lock wedge)
JanStreffing ed22f11
core_atm: disable compression on 7 daily 3D p19 rules
JanStreffing c9a1057
core_atm: shard_size=1 (one rule per SLURM task)
JanStreffing 8d40b4c
core_atm: shard_size 1 → 3 (cost recovery while staying below stall f…
JanStreffing 3351f36
core_atm: re-enable compression on 7 daily 3D pl rules
JanStreffing 2c65522
CMIP7 QC compliance: bounds attrs, units_metadata, calendar, cmor_ver…
JanStreffing 69195e7
Add per-shard compliance-checker QC step
JanStreffing 675fdfe
examples: CMIP7 CV casing, parent_experiment_id, HR source_id and gri…
JanStreffing 6a1eb8b
LPJ-GUESS: slice loaded data to the requested year range
JanStreffing 3c2de88
slice_to_rule_year_range: strip serialisation attrs from time coord
JanStreffing eb9eeb4
slice_to_rule_year_range: nuke both attrs and encoding on time
JanStreffing 424708f
LPJ-GUESS year filter: move from pipeline step into the loaders
JanStreffing e82f6f8
files.py: stop writing time:calendar to attrs (xarray refuses double-…
JanStreffing 42c51ee
load_mfdataset: trim data-level year boundary spill
JanStreffing e98c25a
gather_inputs: import numpy (was missing for the year-filter np.fromi…
JanStreffing b94c3ea
_align_time_to: tolerate 1-2 stamp boundary-spill trim
JanStreffing 6da4bf6
qc: add cmip7repack pre-step, qc_ignore_codes allowlist, [qc] extras
JanStreffing 87f4733
examples: institution description matches WCRP-universe CV
JanStreffing e89c75f
qc: roll run_compliance_checker into every custom pipeline + DefaultP…
JanStreffing cda9ea4
examples: enable_output_subdirs: true on remaining 33 CMIP7 configs
JanStreffing 21b3fd1
run_hr_shard: add inactivity watchdog to catch cluster wedges
JanStreffing f30a9e9
qc: walk DRS subdirs and use compound_name parts to find rule files
JanStreffing 39080c9
examples: bump CMIP7 metadata v1.2.2.2 -> v1.2.2.3 + lowercase region
JanStreffing 1f6ae70
qc: enable qc_repack + VAR004 ignore on seaice, defensive --MODEL drop
JanStreffing ad11a87
run_hr_shard: bump dask close/process-close timeouts to 60s
JanStreffing 222e666
lrcs_ocean: throttle save_dataset to serial:1
JanStreffing 99b08e1
qc: pin esgvoc>=4.1 + document post-install database init step
JanStreffing abae4dd
qc + bounds: promote FESOM coord to mesh precision, strip _Quantize a…
JanStreffing 38161f3
examples: bump CMIP7 metadata v1.2.2.3 -> v1.2.2.4
JanStreffing b7c038e
qc: enable run_compliance_checker on all 17 tier YAMLs
JanStreffing e723668
examples: post-EMD-merge cleanup + description audit
JanStreffing 94ee9bd
qc: wire up wcrp_cmip7 suite + fix 4 HIGH-priority CF findings
JanStreffing 4594715
qc: CMIP7-CV-compliant global attributes + DRS layout
JanStreffing 7195cc9
qc: restore institution attr in inherit blocks
JanStreffing 70bd8af
qc: fix 3 wcrp findings surfaced at scale in cli67
JanStreffing e1198a0
time_bounds: ship as DefaultPipeline step + loosen month detection
JanStreffing 553c1c3
qc: align time coord to midpoint(time_bnds) so wcrp TIME001 passes
JanStreffing fdf444f
qc: prefer mesh.nc polygon bnds + promote lat/lon to float64
JanStreffing a490c5d
status: HR seaice end-to-end at zero findings; cc-plugin PR#46 + WCRP…
JanStreffing a705361
qc: drop time on fx, clean stale bounds attrs, daily bnds, schema check
JanStreffing 79377f6
dim_mapping: detect OIFS model_levels as model_level, not longitude
JanStreffing a4900a3
qc: yearly bnds + midpoint + 8-digit filename token
JanStreffing 0bdbcd7
slurm: refresh esgvoc CV cache on compute nodes before pycmor imports
JanStreffing 8c4ccc0
qc: force canonical time encoding at every save site (cli69 fix)
JanStreffing 21798be
qc: strip CF-inheritable attrs from existing time_bnds (E5 cli69 fix)
JanStreffing cf8f21b
examples: minimal smoke configs for cli69 fix targets
JanStreffing 51e8509
qc: time coord requires standard_name+axis; bnds must drop _FillValue
JanStreffing 3c1aa73
qc: chunking encoder also drops _FillValue from coords (cf §7.1 bnds)
JanStreffing 3be99c0
qc: collapse CMOR axis IDs time1/2/3 to file dim 'time' (TIME003 LUT)
JanStreffing 4b3e24b
recipes: lai_mon compound is .mon.glb, not .day.glb (TIME001 lai_day)
JanStreffing 59a098f
qc: tpt time, time_bnds re-attach in save path, retain cell_measures=''
JanStreffing 560e5c7
qc: clear cli71 remaining HIGH (sob, basin, cfc/ch4/n2o, vsfcorr units)
JanStreffing 7ebab27
qc: enforce 4 MiB chunk floor in dask-aligned encoder (FILE004d)
JanStreffing 7903756
qc: mirror wcrp TIME001 use_midpoint logic + preserve region case
JanStreffing 7dc5e1c
qc: auto-detect XIOS time_counter in load_mfdataset + secondary loader
JanStreffing 2ee1fe5
qc: include region in _rule_files prefix to stop cross-rule file mixing
JanStreffing f095641
qc: sweep stale cmip7repack intermediates before/after each run
JanStreffing 6c0a3c5
qc: CMIP7 tpt cell_methods + recipe brandings match the OIFS XIOS ops
JanStreffing 59ca5b5
recipes: document ts_3hr + ps_1hr_south30 as production gaps
JanStreffing 0f6077f
qc: tob units_metadata + --MODEL cell_measures realm substitution
JanStreffing 4c22fb6
qc: dateline-normalise lon_bnds vertices around the centroid (cf §7.1)
JanStreffing 3e09122
qc: 6-digit sub-daily token + OIFS-XIOS half-step bnds + .first() eng…
JanStreffing 3bd1684
qc: cli76 fixes — fx save, DReq v1.2.2.4, script:// NO_CACHE, sisnhc …
JanStreffing a80f059
qc: fx save branch also strips unlimited_dims kwarg
JanStreffing ba90491
qc: canonicalise units '1e-3'/'1e-03' -> '1E-03' on the output array
JanStreffing 126f4d7
qc: cli79 fixes — msftm coord attrs, §7.2 cell_measures, soil-layer §7.1
JanStreffing 23d93ee
recipes: switch netcdf codec from blosc_zstd back to zlib (default)
JanStreffing 4334397
qc: vertical bnds shape + time_bnds long_name (cli81 follow-up)
JanStreffing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,242 @@ | ||
| # Design proposal: drop literal-glob `*_file:` in favor of `*_pattern:` form | ||
|
|
||
| ## Context | ||
|
|
||
| R1 of [PLAN_cli_override_regressions.md](PLAN_cli_override_regressions.md) | ||
| added an `*` → `<year>` expansion inside `apply_overrides` so existing yaml | ||
| entries like | ||
|
|
||
| ```yaml | ||
| aice_file: /work/.../outdata/fesom/a_ice.fesom.*.nc | ||
| ``` | ||
|
|
||
| still resolve to a real file after we removed `repoint_hr_year.py`. The | ||
| expansion is a workaround, not a fix: it recreates repoint's regex rewrite | ||
| inside the CLI layer instead of removing the underlying yaml smell. | ||
|
|
||
| This proposal: migrate the roughly 10 affected rule entries (audited below) to | ||
| the regex-pattern form that secondary inputs already use, and remove R1. | ||
|
|
||
| ### Forcing function: 1700-year cmorization in multi-year chunks | ||
|
|
||
| Upcoming workload is to cmorize **1700 simulation years**, processed in | ||
| chunks of multiple years per pycmor run (not one-year-at-a-time). With | ||
| FESOM's typical one-file-per-year naming (`<var>.fesom.<year>.nc`, | ||
| verified at e.g. | ||
| `/work/ab0246/a270092/runtime/fesom-2.7/ice_strength/run_19600101-19601231/` | ||
| and the existing y1587 archive), each chunk needs `open_mfdataset` over | ||
| N files. | ||
|
|
||
| **R1's literal-glob form structurally cannot represent this.** It | ||
| requires `--year-start == --year-end` and raises `OverrideError` | ||
| otherwise. The 1700-year run hits that error on the first chunk that | ||
| spans more than one year — which is most of them. The migration is a | ||
| hard prerequisite for the upcoming workload, not a stylistic cleanup. | ||
|
|
||
| --- | ||
|
|
||
| ## Does globbing affect years? Yes: it locks rules to a single year | ||
|
|
||
| The R1 expansion takes `--year-start` and substitutes it for the literal | ||
| `*` in `*_file:` values. That's the **only** year handling these rules | ||
| get. Three properties fall out of that: | ||
|
|
||
| 1. **Year is injected, not detected.** The expansion blindly writes the | ||
| CLI year into the filename. There's no check that the resulting file | ||
| actually exists; runtime is the first place a typo or missing file | ||
| surfaces. | ||
|
|
||
| 2. **Multi-year is structurally impossible.** `xr.open_dataset(literal_path)` | ||
| takes one file. R1 raises `OverrideError` for `--year-start != | ||
| --year-end` because there's no way to expand a single literal `*` to | ||
| multiple files inside a single string. Users hitting a multi-year | ||
| range get a migration message — but that migration is exactly what | ||
| this proposal does. | ||
|
|
||
| 3. **`skip_input_year_filter` does nothing for `*_file:` consumers.** R2 | ||
| gates `_filter_files_by_year_range` at both call sites, but `*_file:` | ||
| resolution doesn't go through either path — it's a literal `open_dataset` | ||
| call. Centennial-forcing rules can't use a `*_file:` form. | ||
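The three properties above can be illustrated with a minimal sketch of an R1-style expansion. This is not the code in `overrides.py`: the function name `expand_star_to_year`, the regex, the local `OverrideError` stand-in, and the `/data/...` paths are all assumptions made for illustration.

```python
import re


class OverrideError(ValueError):
    """Stand-in for the error type named in the text (hypothetical definition)."""


# Hypothetical pattern: a FESOM-style filename with a literal '*' in the year slot.
_STAR_YEAR_RE = re.compile(r"\.fesom\.\*\.nc$")


def expand_star_to_year(rule: dict, year_start: int, year_end: int) -> dict:
    """Return a copy of `rule` with '*' replaced by the year in *_file values."""
    out = dict(rule)
    for key, value in rule.items():
        if not (key.endswith("_file") and isinstance(value, str)):
            continue
        if not _STAR_YEAR_RE.search(value):
            continue  # static files (no literal '*') are left alone
        if year_start != year_end:
            # One string cannot name several files, so a range is rejected.
            raise OverrideError(f"{key}: literal-glob form supports a single year only")
        # The year is injected blindly; nothing checks that the file exists.
        out[key] = value.replace("*", str(year_start), 1)
    return out


# Hypothetical rule dict with one year-dependent and one static entry.
rule = {
    "aice_file": "/data/outdata/fesom/a_ice.fesom.*.nc",
    "grid_file": "/data/mesh/fesom.mesh.diag.nc",
}
expanded = expand_star_to_year(rule, 1587, 1587)
```

A single-year call rewrites only `aice_file`; any call with `year_start != year_end` raises before anything is opened, which is the structural limit described in property 2.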
|
|
||
| Compare with the pattern form: | ||
|
|
||
| ```yaml | ||
| aice_path: /work/.../outdata/fesom | ||
| aice_pattern: a_ice\.fesom\..*\.nc | ||
| aice_variable: a_ice | ||
| ``` | ||
|
|
||
| - File list comes from the directory + regex. | ||
| - `filter_files_by_year_range` narrows by `year_start`/`year_end` — | ||
| **range, not equality**. | ||
| - `open_mfdataset` handles 1+ files transparently. | ||
| - `skip_input_year_filter: true` opts out cleanly. | ||
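For contrast, the pattern-form behavior listed above can be sketched with the standard library only. The helper names `list_matching` and `filter_by_year_range` are hypothetical stand-ins, not the functions in the codebase, and the sketch assumes the year is the 4-digit token immediately before `.nc` and that the regex must match the whole filename.

```python
import re
from pathlib import Path
from tempfile import TemporaryDirectory

# Assumption: the year is the 4-digit token right before ".nc".
_YEAR_RE = re.compile(r"\.(\d{4})\.nc$")


def list_matching(directory, pattern):
    """Return files in `directory` whose full name matches the regex `pattern`."""
    rx = re.compile(pattern)
    return sorted(p for p in directory.iterdir() if rx.fullmatch(p.name))


def filter_by_year_range(files, year_start, year_end, skip=False):
    """Keep files whose filename year lies in [year_start, year_end]."""
    if skip:  # mirrors the idea of `skip_input_year_filter: true`
        return list(files)
    kept = []
    for p in files:
        m = _YEAR_RE.search(p.name)
        if m and year_start <= int(m.group(1)) <= year_end:
            kept.append(p)
    return kept


with TemporaryDirectory() as tmp:
    root = Path(tmp)
    for year in range(1900, 1904):
        (root / f"a_ice.fesom.{year}.nc").touch()
    (root / "vice.fesom.1900.nc").touch()  # different variable, must not match

    candidates = list_matching(root, r"a_ice\.fesom\..*\.nc")
    selected = [p.name for p in filter_by_year_range(candidates, 1901, 1902)]
    unfiltered = [p.name for p in filter_by_year_range(candidates, 1901, 1902, skip=True)]
```

The resulting list can hold one file or many, which is why the pattern form pairs naturally with `open_mfdataset`.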
|
|
||
| So globbing in `*_file:` form is a year-mangler in disguise. Removing it | ||
| removes a hidden coupling between yaml syntax and CLI semantics. | ||
|
|
||
| --- | ||
|
|
||
| ## Affected entries (audit) | ||
|
|
||
| ``` | ||
| $ grep -rn 'fesom\.\*\.nc' awi-esm3-veg-hr-variables/ | grep '_file:' | ||
| ``` | ||
|
|
||
| | Tier | Rule | Key(s) | | ||
| |---|---|---| | ||
| | `lrcs_seaice` | sispeed | `aice_file`, `vice_file`, V-component file | | ||
| | `lrcs_seaice` | sidmasstranx, sidmasstrany | `aice_file`, `vice_file` | | ||
| | `lrcs_seaice` | sistressave, sistressmax | `aice_file` (or similar) | | ||
| | `lrcs_seaice` | siflcondtop, sifb, sihc | (single fesom file each) | | ||
| | `lrcs_seaice` | simpeffconc | (single fesom file) | | ||
| | `lrcs_seaice` | sispeed_day | per-day equivalent | | ||
| | `core_ocean` | zostoga | (fesom 3D file) | | ||
|
|
||
| Exact key names per rule need a finer audit before migration. Static-mesh | ||
| keys (`grid_file`, `basin_mask_file`) and any `*_file:` value without a | ||
| literal `*` are unaffected. | ||
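For the finer per-rule audit, a small script can report the exact key names instead of whole matching lines. The following is a sketch under assumptions: it scans yaml text line by line rather than parsing it, and the sample rule layout and `/data/...` paths are invented for illustration.

```python
import re

# Matches "<name>_file: <value>" where the value contains a literal '*'.
_GLOB_FILE_RE = re.compile(r"^\s*(?:-\s*)?(\w+_file):\s*(\S*\*\S*)\s*$")


def find_glob_file_keys(yaml_text):
    """Return (line number, key, value) for every *_file entry with a literal '*'."""
    hits = []
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        m = _GLOB_FILE_RE.match(line)
        if m:
            hits.append((lineno, m.group(1), m.group(2)))
    return hits


# Hypothetical yaml fragment; the real rule layout may differ.
sample = """\
rules:
  - name: sispeed
    aice_file: /data/outdata/fesom/a_ice.fesom.*.nc
    grid_file: /data/mesh/mesh.nc
    vice_path: /data/outdata/fesom
    vice_pattern: vice\\.fesom\\..*\\.nc
"""
hits = find_glob_file_keys(sample)
```

Static entries such as `grid_file` and already-migrated `*_pattern:` keys are not reported, matching the "unaffected" cases named above.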
|
|
||
| --- | ||
|
|
||
| ## Migration mechanics | ||
|
|
||
| ### Yaml side | ||
|
|
||
| For each entry: | ||
|
|
||
| ```yaml | ||
| # before | ||
| aice_file: /work/.../outdata/fesom/a_ice.fesom.*.nc | ||
| aice_variable: a_ice | ||
| ``` | ||
|
|
||
| ```yaml | ||
| # after | ||
| aice_path: /work/.../outdata/fesom | ||
| aice_pattern: a_ice\.fesom\..*\.nc | ||
| aice_variable: a_ice | ||
| ``` | ||
|
|
||
| The key triplet matches `_load_secondary_mf`'s convention | ||
| ([custom_steps.py:2153](examples/custom_steps.py#L2153)). | ||
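The yaml rewrite is mechanical enough to sketch as a helper. `split_glob_file` is a hypothetical name, not an existing function in the repository, and the `/data/...` path is a placeholder; the `*_variable` key is left untouched by the migration and so does not appear here.

```python
import posixpath
import re


def split_glob_file(prefix, file_value):
    """Turn '<prefix>_file: dir/name.*.nc' into <prefix>_path / <prefix>_pattern."""
    directory, basename = posixpath.split(file_value)
    # Escape regex metacharacters in the literal parts, then turn each '*' into '.*'.
    pattern = ".*".join(re.escape(part) for part in basename.split("*"))
    return {f"{prefix}_path": directory, f"{prefix}_pattern": pattern}


migrated = split_glob_file("aice", "/data/outdata/fesom/a_ice.fesom.*.nc")
```

Escaping the literal dots matters: an unescaped `.` in the pattern would also match unrelated characters in a filename.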
|
|
||
| ### Step function side | ||
|
|
||
| For each custom step that reads a `*_file:` attribute: | ||
|
|
||
| ```python | ||
| # before | ||
| ds = xr.open_dataset(rule.aice_file, use_cftime=True) | ||
| aice = ds[rule.get("aice_variable", "a_ice")] | ||
| ``` | ||
|
|
||
| ```python | ||
| # after | ||
| aice = _load_secondary_mf(rule, "aice_path", "aice_pattern", "aice_variable") | ||
| ``` | ||
|
|
||
| `_load_secondary_mf` already: | ||
| - regex-matches files in the directory; | ||
| - year-filters via `filter_files_by_year_range` (with the | ||
| `skip_input_year_filter` opt-out from R2); | ||
| - opens via `open_mfdataset` (handles 1+ files); | ||
| - renames `time_counter` → `time` if requested; | ||
| - drops residual XIOS time bounds vars; | ||
| - selects the variable by name or auto-picks. | ||
|
|
||
| Most call sites that read `*_file:` do those steps manually anyway — | ||
| this consolidates them. | ||
|
|
||
| ### CLI override side | ||
|
|
||
| Remove R1 entirely: | ||
|
|
||
| - delete `_FESOM_FILE_RE` and `_expand_year_in_file_keys` from | ||
| [overrides.py](src/pycmor/core/overrides.py); | ||
| - delete the `if ov.year_start == ov.year_end` / `else` block in | ||
| `apply_overrides`; | ||
| - delete the R1-specific tests in | ||
| [test_overrides.py](tests/unit/test_overrides.py). | ||
|
|
||
| R2's `skip_input_year_filter` plumbing stays — it serves the centennial- | ||
| forcing rules independent of this migration. | ||
|
|
||
| --- | ||
|
|
||
| ## Scope of changes | ||
|
|
||
| | Component | Change | | ||
| |---|---| | ||
| | Yamls in `awi-esm3-veg-hr-variables/` | ~10 rule entries across 2 tiers (lrcs_seaice + core_ocean) | | ||
| | `examples/custom_steps.py` | ~10 custom step functions edited to call `_load_secondary_mf` | | ||
| | `src/pycmor/core/overrides.py` | net deletion — `_expand_year_in_file_keys`, `_FESOM_FILE_RE`, multi-year `OverrideError`, the entire R1 block in `apply_overrides` | | ||
| | `tests/unit/test_overrides.py` | drop R1 tests; R2 tests stay | | ||
| | `PLAN_cli_override_regressions.md` | mark R1 superseded | | ||
|
|
||
| --- | ||
|
|
||
| ## Trade-offs vs the workaround | ||
|
|
||
| | | R1 workaround | Proposed migration | | ||
| |---|---|---| | ||
| | Multi-year support | **impossible (raises OverrideError)** — blocks the 1700-year chunked run | works via `open_mfdataset` | | ||
| | Year filter is range-aware | no (single year only) | yes | | ||
| | Centennial-forcing opt-out | not applicable | works via `skip_input_year_filter` | | ||
| | Year/path coupling lives in | apply_overrides regex | rule yaml + helper | | ||
| | Net code in CLI override layer | grew by ~40 lines | shrinks by ~40 lines | | ||
|
|
||
| --- | ||
|
|
||
| ## Recommendation | ||
|
|
||
| Do the migration. R1 was the right call as a hot-fix to unblock the y1587 | ||
| single-year run, but the upcoming 1700-year chunked workload makes it a | ||
| blocker. The migration: | ||
|
|
||
| - unifies all secondary-input handling on one helper (`_load_secondary_mf`), | ||
| - removes a code path from the CLI override layer that had to know about | ||
| FESOM filename conventions, | ||
| - shrinks `apply_overrides` by ~40 lines, | ||
| - enables multi-year ranges (the 1700-year chunked case), | ||
| - preserves R2's `skip_input_year_filter` semantics for centennial inputs. | ||
|
|
||
| There's no value in deferring. R1 stays only as long as nothing needs | ||
| multi-year secondary inputs. | ||
|
|
||
| --- | ||
|
|
||
| ## Open questions — resolved | ||
|
|
||
| ### Q1: Per-rule key audit | ||
|
|
||
| To do during migration with | ||
| `grep -rnE '_file:.*fesom\.\*\.nc' awi-esm3-veg-hr-variables/` (recursive, since the target is a directory). No upfront | ||
| input needed. | ||
|
|
||
| ### Q2: `open_mfdataset` smoke test — passed | ||
|
|
||
| Tested against | ||
| `/work/bb1469/a270092/runtime/awiesm3-develop/after_lpjg_spinup_work_01/outdata/fesom/a_ice.fesom.{1900..1903}.nc`: | ||
|
|
||
| | Call | Result | Time | | ||
| |---|---|---| | ||
| | `xr.open_dataset(one_file)` | `time=12, nod2=126858` | 176 ms | | ||
| | `xr.open_mfdataset([one_file])` | `time=12, nod2=126858` | 22 ms | | ||
| | `xr.open_mfdataset(four_files)` | `time=48, nod2=126858` (concatenated correctly) | 260 ms | | ||
|
|
||
| In this test, single-file `open_mfdataset` was **faster** than `open_dataset` | ||
| (22 ms vs 176 ms, attributed to lazy loading); multi-file concatenates correctly along `time`. No | ||
| behavior regression for the migration. The 1700-year chunked workload (the | ||
| original concern) gets the right shape automatically. | ||
|
|
||
| ### Q3: Promote `_load_secondary_mf` to `pycmor.std_lib`? | ||
|
|
||
| **No.** Audit of all callers | ||
| (`grep -rn '_load_secondary_mf' --include='*.py'`) shows every caller | ||
| is inside `examples/custom_steps.py` itself. No external consumers, no | ||
| yaml indirection that imports it from a stable path. User confirmed | ||
| `custom_steps.py` is user-owned and free to modify. | ||
|
|
||
| Keep it private. If a future project outside this codebase wants the | ||
| helper, that's the trigger to promote — premature now. |
Delete before merge