[ hub_ops digital twin · v3.0 · rigor-applied ]

Hub Operations Digital Shadow — a proposed multi-layer model, paired with validated findings.

A view of the Chelmsford Twilight sort as the projection of a much richer dynamical system onto the observable iGate/SOR subspace. The richer model — the ~300-dimensional EKF — is a proposed architecture, never instantiated. No performance number is claimed for it. What is claimed is a small set of operational findings drawn from real corpus data.

// abstract

Branch A (v1.4) models the sort as a closed iGate/SOR subsystem. Branch B asks the complementary question: if all relevant upstream data existed in real time, what would the complete model look like? The answer is a multi-layer dynamical system — external arrival, environmental, physical-infrastructure, package-composition, and human-factors layers, plus outward CURE/SEAS extensions — of approximate dimension 300, which the architecture proposes to track with an Extended Kalman Filter. This EKF has not been instantiated, fit, or run. No transition functions, observation models, Jacobians, or noise covariances are specified numerically; accordingly the paper makes no performance claim for it — no RMSE figures, no per-sensor value estimates, no accuracy numbers anywhere. The model is named a digital shadow, not a twin: an unbuilt model cannot honestly be called a running, synchronized twin. What v3.0 does contribute, on real data, is the validated findings of §11–§16 and the constants pinned to metric_contract.json.

/ 01

The idealized model

Every quantity v1.4 predicts — PPH trajectories, staffing needs, sort-quality scores — is determined not only by what happens inside the building but by what is approaching it. A feeder delayed on I-495 by a winter storm delivers its pieces late, the inbound ramp compresses, induction spikes, jam probability rises. The closed-system model sees none of this until the packages are on the belt; it infers external dynamics from their internal consequences, after the fact. That is not a failure of v1.4 — under present sensor coverage, the closed-system model is the correct model for the data that exist.

v3.0 develops the complementary view as a proposed, unbuilt digital shadow: a continuously-updated high-fidelity simulation that would ingest sensor data across layers that feed causally into sort dynamics before they ever surface in iGate. A digital twin is a running, continuously-synchronized model of a physical system. The model in this paper has never been instantiated. Calling it a twin would overstate it — so it is named a shadow, a model that would shadow the physical sort once built. The filename stays in the digital_twin lineage for archival continuity; the body language is "shadow" / "proposed model" throughout.

Informally, v1.4 is the closed iGate/SOR subsystem and the proposed model is a superset. It is tempting to say v1.4 is "the marginal" of the full filter, and earlier versions stated that as a theorem. That theorem is retracted (§05); the relation survives only as a conjecture under stated, unverified assumptions. v3.0 does not revise v1.4; it positions v1.4 as the validated, deployed model and the multi-layer shadow as a proposal.

/ 02

The proposed ~300-dim EKF

The ~300-dimensional EKF "digital twin" was never built, fit, or run. What follows is a proposed architecture — a design target, not a validated implementation. The concrete specification of state-transition functions, observation models, Jacobians, and noise covariances is not provided and has not been calibrated. No performance number is claimed for it anywhere: there are no RMSE figures, no per-sensor value estimates, and no accuracy numbers in this paper.

The complete state at time $t$ stacks the latent upstream layers on top of the v1.4 operational state $\mathbf{x}_\text{ops}(t)$, which is its observable projection onto the iGate/SOR subspace:

// design — proposed full state vector (not instantiated) $$\mathbf{X}(t) = \bigl[\mathbf{x}_\text{ext},\, \mathbf{x}_\text{env},\, \mathbf{x}_\text{infra},\, \mathbf{x}_\text{pkg},\, \mathbf{x}_\text{human},\, \mathbf{x}_\text{CURE},\, \mathbf{x}_\text{SEAS},\, \mathbf{x}_\text{ops}\bigr]$$

With Chelmsford parameters (≈20 feeders, ≈30 belt segments, ≈120 chutes, ≈8 zones, ≈40 sorters, ≈100 destination lanes, three CURE×ZIP aggregation zones), the dimension count sums to approximately 300. This is a back-of-envelope sum of proposed layer sizes, not an instantiated state vector — no 300 named, ordered components are written down, and no filter has been run on them. The architecture proposes to track this state with an Extended Kalman Filter; the recursion below is the textbook EKF, reproduced for completeness as design, not result:

// design — proposed EKF recursion (terms unspecified numerically) $$ \hat{\mathbf{X}}(t{+}\delta t \mid t) = \mathbf{f}\bigl(\hat{\mathbf{X}}(t \mid t), \mathbf{u}(t)\bigr), \qquad \mathbf{P}(t{+}\delta t \mid t) = \mathbf{F}_t\, \mathbf{P}(t \mid t)\, \mathbf{F}_t^\top + \mathbf{Q}_t $$ $$ \mathbf{K}(t) = \mathbf{P}\,\mathbf{H}_t^\top \bigl(\mathbf{H}_t \mathbf{P}\, \mathbf{H}_t^\top + \mathbf{R}(t)\bigr)^{-1}, \quad \hat{\mathbf{X}}(t \mid t) = \hat{\mathbf{X}}(t \mid t{-}1) + \mathbf{K}(t)\bigl(\mathbf{z}(t) - \mathbf{h}(\hat{\mathbf{X}})\bigr) $$

None of $\mathbf{f}, \mathbf{F}_t, \mathbf{Q}_t, \mathbf{h}, \mathbf{H}_t, \mathbf{R}(t)$ is specified. The migration roadmap from v1.4 to the full filter lists, for each candidate increment, only the data source it would require and a qualitative complexity tier (SOR push → minimal; belt telemetry → low; ORION manifests / feeder GPS → medium; weather + traffic + full EKF → high). It contains no RMSE column: the earlier 5–8% … 20–30% figures were invented — no EKF was built, so no RMSE could have been measured — and have been removed.

/ 03

Observable layers

The proposed shadow adds five upstream layers that feed causally into sort dynamics before they appear in iGate, plus three outward layers. Each row names the data source it would require — none of these sensors is currently integrated. Where a layer is unbuilt, the paper says so; it does not claim the layer's value.

layer	proposed data source	causal role · status
External arrival	Feeder GPS, P-car GPS, ORION manifests	Volume, timing, phasing of inbound flow — not integrated
Environmental	NOAA / NWS, HERE / Waze, DOT incident feeds	Feeder ETA modulation, road delay — NOAA pull is cheapest candidate
Physical infra	Belt-speed sensors, induction counters, chute load cells	Throughput ceiling, jam probability, back-pressure — not integrated
Pkg composition	SLIC manifests, vision dims, smalls-bag scans	Belt-capacity consumption, zone load — not integrated
Human factors	Time-on-task, break records, experience history	Effective PPH vs fatigue / experience — not integrated
CURE outbound	CURE Main Data export ($t{+}24$h)	Dock utilization → loading bottleneck — validated findings §11
SEAS comp.	SEAS Hub / Employee Summary	Smalls composition, misload / LIB counts — validated findings §12
LIB surface	SEAS LIB roll-up	Sort-end correction signal — proposed correction only

Because CURE/SEAS data arrive at $t{+}24$h — incompatible with a 15-minute synchronous update — they would feed correction of the previous sort's trajectory, not real-time state, until asynchronous schemes (out-of-sequence measurement, fixed-lag smoothers) are implemented. The induction ceiling is the cleanest single-sensor target: an induction-rate counter plus a main-belt-speed feed would let the tracker compute the buffer clear-time as a sort-end push estimate — a proposed capability, not a deployed one.

/ 04

Operational constants

The validated findings — the part of this paper that rests on real data — are governed by a single source of truth, metric_contract.json. Four constants anchor the model; all carry stated uncertainty.

Weekly geometric volume decay. Full 254-sort corpus recalibration; Friday ≈ 93% of Monday.

gamma_weekly

γ = 0.982 ± 0.010

Zone 9-12 share — a DOW-indexed vector on the SEAS basis, not a flat invariant.

kappa(DOW)

Mon 0.413 · Fri 0.322

PD-belt share of total iGate Hub volume — a volume share, not a correlation.

rho_PD/Hub

ρ = 0.509 ± 0.025

Building PPH; cost-per-piece $c = 1/\Pi$. Paid hours = volume ÷ plan PPH.

Pi_SOR · c

Π ≈ 120.3 · c = 1/Π

γ = 0.982. v2.1 stated γ ≈ 0.938 from the single steepest week; the 6-week mean was 0.958; the full 254-sort corpus recalibration gives the canonical γ = 0.982 ± 0.010 (Fri/Mon ratio 0.9297; $\gamma = 0.9297^{1/4}$). Each step widened the window — the full year is gentler than any short window. At γ = 0.982, $\hat{V}_\text{Fri}/V_\text{Mon} = 0.982^4 \approx 0.93$: Friday is not meaningfully light, so the old "Friday looks light — cut a head" advice (built on the stale steep γ) is operationally wrong and corrected here.

κ as a DOW vector. The Zone 9-12 share (coordinator's Zone 3 = PD-9…PD-12) is not a flat invariant. On the SEAS basis it runs Mon 0.413 · Tue 0.403 · Wed 0.374 · Thu 0.357 · Fri 0.322 (full-year mean 0.368 ± 0.014). The iGate basis sits ≈0.04–0.06 lower at every phase (≈0.31–0.33 — this is what the historical "0.311" measured on a 5-sort window). The two bases differ by the CCHIL → PD-09 attribution rule — SEAS bundles all CCHIL volume onto the PD-09 row while iGate distributes by physical scanner location — not by phase scope; that mechanism is stated as a hypothesis still pending the closing arithmetic. The ≈9 pp Mon→Fri swing means zone-share forecasts must use the DOW value, not a flat constant.

/ 05

Rigor & open problems

v3.0 reconciles the audited canonical head, the κ DOW-vector revision, and the DeepSeek revision (which first renamed twin → shadow) into one authoritative head, under a conservative theorem-handling decision: unproven flagship claims are retracted or demoted with a one-line reason, not re-proven; the unbuilt model is re-framed, not built.

What changed: the ~300-dim EKF was relabeled a proposed architecture and every invented performance number was removed (the migration-table RMSE-reduction percentages, the "value of each new sensor stream" quantification, all implied per-sensor RMSE deltas). The "missing sensors" unfalsifiability deflection was deleted — the old closing line ("the gap is not a gap in understanding, it is a gap in sensor coverage; the understanding is already complete") is replaced by a falsifiable target: the model earns the name "validated" only if an instantiated filter, on held-out sorts it was not fit to, beats the current DOP single-point projection by a pre-registered margin with reported error bars. Theorem 8.1 (EKF marginalization) was retracted — marginalizing latent states out of a coupled nonlinear EKF does not in general yield a Kalman filter on the sub-block with merely inflated $Q$, and the old "proof" defined its own noise term as itself; it survives only as Conjecture 8.1 under explicit, unverified hypotheses with no derivation attempted. The N=2–3 autoresearch appendix was excised in full; mislabeled "theorems" (volume forecast, delay cascade, effective PPH, LIB decay, the ∂SQS/∂λ threshold) were demoted to definitions / identities / observations; constants were pinned to metric_contract.json.

Honestly disclosed limits, carried forward: the proposed EKF has no instantiation — either a minimal filter is built and validated on held-out sorts, or the model remains a design proposal, and this paper takes the latter, honest stance. The CCHIL level-shift mechanism is a hypothesis (the redistribute-on-same-sorts arithmetic has not been performed; the two bases were measured on different corpora/periods). The §5.3 jam-hazard parameters and the Weibull duration form are unestimated placeholders. The heat-degradation slope is an office-cognitive-work figure flagged for domain transfer. The session-log dates are placeholders pending a true collection log.

Rigor audit applied 2026-05-31. Supersedes v2.13, v2.13_kappa-conformance, and digital_shadow_v3.0-ds, reconciled into one head. The ~300-dim EKF is a PROPOSED, UNBUILT architecture — no RMSE, no per-sensor value, no accuracy figure is claimed for it. Constants conformed to metric_contract.json (γ = 0.982; κ_Z3 DOW-vector on SEAS basis; ρ = 0.509; Π_SOR ≈ 120.3, c = 1/Π). This is an internal operations-research and career brief, not a peer-reviewed paper.