From Observation to Intelligence — CHEMA Sort Intelligence Whitepaper v2.2

CHEMA Sort Intelligence Whitepaper · v2.2

From Observation
to Intelligence

A data-driven framework for sort-floor supervision

Author

Rafael Almeida

Facility

CHEMA Twilight Hub

Corpus

613 sorts · Dec 2023–May 2026

Published

May 2026

Abstract

This paper presents Sort Intelligence Master — an empirical analytics framework built on 613 sorts of operational data from the CHEMA Twilight Hub, spanning December 2023 through May 2026. The system derives DOW-indexed zone load coefficients (k_Z), a weekly PPH decay constant (γ = 0.958), and a four-phase operational model from seven heterogeneous data sources. It is implemented as a browser-native tool requiring no installation or IT support. The framework converts pre-sort labor allocation from floor intuition to a quantified, fidelity-corrected decision brief that takes under two minutes to generate.

Section 1

Operational Context

CHEMA Twilight Hub operates Monday through Friday, 18:00–22:00. At peak induction, the facility processes 12,000–28,000 packages per hour across 38 active outbound bays organized into three zones: Zone 1 (PD-01 to PD-04), Zone 2 (PD-05 to PD-08), and Zone 3 (PD-09 to PD-12). Approximately 18–28 loaders staff each zone during diagnostic phases; total outbound headcount ranges from 54 to 84 depending on day of week and volume forecast.

The supervisor's core instruments — iGate ESR scan reports, Schedule of Record (SOR-T) headcount exports, SEAS service exception reports, and CURE destination load files — are generated nightly but have historically been consumed in isolation, each providing a partial view of the operation. The analytical work described in this paper connects them.

Three-zone structure. Z1 = PD-01 to PD-04 (northeast corridor). Z2 = PD-05 to PD-08 (southeast + L.L. Bean). Z3 = PD-09 to PD-12 (southern New England + outliers). Each zone has a distinct freight composition that varies structurally by day of week — not by operational preference.

Section 2

Development Arc

Four tools were built sequentially over approximately two weeks, each building on the data infrastructure established by the previous. The development occurred alongside daily operations — no dedicated time was allocated, no project was opened, no budget was requested.

KPI Dashboard → Apr 28, 2026
Live-Sort Tracker v1.0 → May 6, 2026
Label Training Certification (LTC) → May 11, 2026
Sort Intelligence Master → May 12, 2026

The corpus accumulation began with the facility's first available SOR export in December 2023 and extends through May 2026. 29 of 30 validation sorts reconcile within rounding tolerance between SOR volume and SEAS reported totals. Three sorts exhibit scan efficiency above 100%, which refutes the prior assumption that iGate scan totals must be bounded by SOR reported volume — a correction that propagates into all downstream fidelity calculations.

Section 3

The Fidelity Problem

Raw iGate ESR data carries three structural distortions that, left uncorrected, produce unreliable belt-level PPH signals. The fidelity cascade module addresses each.

Mechanism 1 — Bay Attribution Error

iGate attributes scans to the ULD bay the employee scanned on entry. When a loader moves between bays — either as a borrow from a coordinator or through unauthorized migration — their scans continue crediting the original bay until a new ULD scan is recorded. A loader physically working PD-11 but last-scanned into PD-09 shows as PD-09 PPH. The affected bay gains phantom scan credit; the actual bay shows an artificial deficit.

Mechanism 2 — Shared Equipment

In high-volume phases, loaders occasionally use a co-worker's scan gun or scan from a shared device. Scans recorded under an incorrect employee ID create attribution that the PPH system assigns to the wrong person and, by extension, the wrong belt segment. This is not detectable from iGate output alone — it requires cross-referencing SOR headcount assignments against scan ID distribution.

Mechanism 3 — Synthetic IDs

DWS machines (IDs 1–40), return belt systems (IDs 101–104), and bulk belt systems (IDs 401–404) generate automated scan events that appear in the ESR employee summary. Unfiltered, these inflate per-employee averages and distort zone-level totals. The fidelity cascade applies an ID taxonomy filter before any PPH or load coefficient computation.

Fidelity score. Each sort receives a per-sort fidelity score (0.0–1.0) based on the proportion of scans attributable to correctly-positioned human loaders with verified ID discipline. Sorts below threshold are down-weighted in k_Z and γ computations.

Section 4

Phase Ladder & k_Z(DOW)

A sort-night is not uniform. PPH varies structurally across four phases driven by freight volume, package geometry, and labor fatigue. Treating all-night data as a single diagnostic baseline produces a systematically biased signal.

Phase 0

Pre-Sort Setup

0% → ~15% completion

Trailer positioning, ULD scans, equipment staging. PPH is below diagnostic baseline — expected and irrelevant to staffing assessment.

Phase 1

Ramp

~15% → ~35% completion

Volume flow increasing. PPH climbing toward diagnostic ceiling. Staffing decisions made here shape Phase 2 outcomes.

Phase 2

Diagnostic Window

~35% → ~75% completion

Full-flow steady state. The only phase from which reliable PPH baselines should be drawn. k_Z coefficients are computed from this phase exclusively.

Phase 3

Wind-Down

>~75% completion

Volume decreasing geometrically. PPH decline is structural — not underperformance. Phase 3 discount prevents phantom pressure calls in the sort's final 40 minutes.

Zone Load Coefficient k_Z(DOW)

The zone-level load coefficient k_Z(DOW) measures each zone's share of total hub scan volume during the diagnostic phase (Phase 2), indexed by day of week. The three zone coefficients are zero-sum: k_Z1 + k_Z2 + k_Z3 ≈ 1.0. Values below are derived from 29 sorts of ESR data (weeks 13–18, 2026).

Day	k_Z3 (Phase 2)	Std Error	Z3 Share	Z1+Z2 Combined
Monday	0.413	±0.012	~41%	~59%
Tuesday	0.395	±0.015	~40%	~60%
Wednesday	0.385	±0.014	~39%	~61%
Thursday	0.380	±0.013	~38%	~62%
Friday	0.322	±0.009	~32%	~68%

Monday–Friday k_Z3 visualization — Zone 3 (terra) vs Z1+Z2 (ink) share of hub scan volume:

Monday

0.413

Tuesday

0.395

Wednesday

0.385

Thursday

0.380

Friday

0.322

The Monday–Friday swing. The difference between k_Z3(Monday) = 0.413 and k_Z3(Friday) = 0.322 is a 28% structural difference in Zone 3's proportion of hub volume — driven entirely by what the freight network routes on those days, not by anything the facility controls. A supervisor who staffs Monday's Z3 the same way they staffed Friday's will be short by the equivalent of 5–6 loaders' worth of loading capacity before the first package reaches the belt.

Reading k alongside γ

k answers where tonight's labor needs to be. γ = 0.958 answers what result to expect from it. If Zone 3 averaged 320 PPH last week, the baseline expectation this week — absent any other change — is approximately 306 (320 × 0.958), about 4.2% lower from structural decay. A reading of 280 entering Phase 2 is 1.5 color bands below the expected trajectory — warranting a scan compliance and borrow/loan attribution review before the sort progresses further.

Section 5

The Python Runtime — Deployment Plan

Every tool described in this paper runs in a standard web browser. No Python is required to use them. They read files via the browser's FileReader API and store data in IndexedDB — no installation, no network dependency, no IT ticket to deploy an update. Python serves one distinct offline purpose: ingesting raw weekly export files into the corpus and running batch analytical modules against the full sort history.

Deployment Sequence

Action

Time

Dependency

Install Python via MS Store

5 min

IT approval

Run corpus_builder.py on existing Raw_Import folder

2–3 min

Phase 1

Verify 613-sort corpus integrity (row counts, date range)

5 min

Phase 2

Run weekly update after next sort week export

2–3 min recurring

Phase 3

Module Stack

corpus_builder.py

Idempotent ingestion pipeline. Drop new weekly SOR, CURE, and SEAS exports into Raw_Import and run one command. Atomic write pattern prevents corpus corruption on interruption.
fidelity_cascade.py

Cross-references SOR headcount against iGate scan attribution. Identifies scanner ID mismatches and positional mistogles. Computes a per-sort fidelity score (0.0–1.0) and flags affected belts.
emergence_gap.py

Computes predicted Phase 2 PPH per belt from k_Z(DOW) × hub scan volume ÷ belt headcount, then compares against actual iGate Phase 2 PPH. A persistent negative gap signals a structural problem.
staffing_model.py

Inputs: day of week, phase, planned volume, historical k values with standard errors. Output: zone-by-zone staffing recommendation with confidence interval and Phase 3 discount flag.
cure_pressure.py

Maps CURE destination loads onto the PD belt serving each destination. Flags trailers running above 85% utilization and adjusts zone staffing recommendations accordingly.
trend_engine.py

Applies the γ = 0.958 weekly decay model to rolling PPH baselines. Detects drift below expected trajectory and generates a plain-language pre-sort alert with numeric summary.

Data security. Every script reads from and writes to the local project folder on the work laptop. No data leaves the machine. The Python interpreter requires no network access after installation. The files processed are identical to those already opened in Excel at the facility daily. Python introduces no new data exposure surface.

Section 6

Replication and Scaling Potential

These tools were built for CHEMA Twilight. The architecture is intentionally generic. Every UPS hub running the same SOR/TMS, iGate/ESR, SEAS, and CURE exports produces identically structured files. The corpus builder requires exactly one configuration change per facility: the SLIC number. Every other component — phase engine, fidelity cascade, staffing model, UI tabs, PPH palette — is facility-agnostic and requires no modification.

A district deploying this stack would maintain one local corpus instance per hub, managed by the on-site FTS coordinator running corpus_builder.py after each export week. A district-level aggregation script pulls all facility corpora into a single district JSON. A district manager reviewing the trend engine's output before the week begins can identify which facilities are approaching PPH threshold risk and which have staffing capacity available to share.

The path from CHEMA to a district pilot: Python approval at one facility, one quarter of documented results, and a shared file folder for district-level aggregation. The tools are already built. The data is already being collected. The only unlock is the runtime.

Section 7

The Supervisor as Agent: Signal, Decision, Outcome

Every operational system produces data. The question is whether anyone is positioned to use it. At CHEMA Twilight, the supervisor is the decision node between the system's signals and its outcomes. The quality of those interventions depends on the quality of the signal the supervisor reads — grounded in 613 sorts of measured history, indexed by day of week, segmented by sort phase, and corrected for the three fidelity distortions that make raw iGate data an unreliable guide.

7.1 Upstream — Inbound as the Leading Signal

The inbound operation is the earliest available indicator of what the sort will demand from outbound. Door turn rate during Phase 0 and Phase 1 signals freight composition arriving in real time: fast turns indicate flowable cube; slow turns indicate bulk irregular. The Phase Engine makes this explicit — enter the current fractional completion and day of week, receive phase identification, k_Z-implied load distribution across all three zones, and a staffing recommendation calibrated to the freight structure the historical record says today's sort should carry.

7.2 Midfield — The Belt as the Decision Surface

The outbound belts are where labor allocation becomes service outcome. Every correct staffing decision reduces misloads, reduces LIB events, and reduces overgood and damage. Phase 3 discount distinguishes genuine understaffing from geometrically exhausted volume — a distinction no existing UPS reporting system makes.

7.3 Downstream — Service as the Measured Outcome

Every package that makes its correct trailer, in the correct load position, without damage, is a package that delivers on time. LTC certifies that the employee scanning the package is the employee the system credits. The fidelity cascade detects where that chain has broken. The Phase Engine prevents valid Phase 2 PPH from being contaminated by Phase 3 wind-down data. The staffing model translates the clean signal into an actionable recommendation.

7.4 Safety — The Unseen KPI

PPH and service metrics have dashboards. Safety does not — not at the granularity that matters. Phase 3 discount is, among other things, a safety mechanism. When the system correctly identifies that a belt's PPH has declined because the sort is winding down — not because the team needs to go harder — the supervisor does not issue a pressure call on a ghost signal. Injuries that happen in the last 20 minutes of a sort, when fatigue is highest and the temptation to read low PPH as underperformance is greatest, become avoidable.

Data-oriented supervision does not remove the supervisor's authority. It removes the conditions under which that authority gets exercised against a population of employees based on corrupted inputs.

7.5 LTC as the Human Link

Label Training Certification is the only tool in this program that directly addresses employee behavior rather than analytical infrastructure. UPS's On-Job Supervision system (GEMS #1760) specifies 111 methods for sorters, 87 for pick-offs, 168 for loaders — each must be demonstrated at production rate, passing threshold 95%, recertified every six months. OJS certifies how to execute the role; LTC certifies where to send the package.

Three layers of attribution:

Layer 1 — Sort aisle sorters: route by SLIC and exception to color-coded belt. No scan. Compliance shows in missort rate. LTC certifies sort chart knowledge.

Layer 2 — Pick-off employees: secondary sort by ZIP exception within SLIC group to chute 1, chute 2, or let-ride. No scan. LTC certifies zip code exception routing.

Layer 3 — Loaders inside trailers: scan ULD bay-door label before entry, then scan each package. This is the only source of all iGate PPH attribution. LTC certifies scan discipline: own employee scanning ID, correct ULD scan before entry, individual package scanning. When this layer fails, the PPH signal the coordinator receives does not match the work being done.

Section 8

Career Development and Organizational Value

These tools were built in approximately two weeks, by a single sort floor supervisor, using data that was already there. No budget was requested, no project was opened, no timeline was assigned. The development happened alongside daily operations.

Data Engineering

Idempotent ETL across 7 heterogeneous sources

Tracker JSON exports, SOR XLSX files, iGate/ESR reports, SEAS service exception files, CURE destination load files — atomic write safety, schema versioning, 81 MB unified corpus covering 613 sorts.

Statistical Modeling

Empirical derivation with honest uncertainty

DOW-indexed load coefficients, weekly PPH decay constant γ = 0.958, correlation coefficient — all from operational data, all with confidence intervals transparent about sample size limits.

Systems Theory

Four-branch research program

Hub as a purposeful system with adaptive agents, distortion mechanisms, and feedback loops — documented as internal whitepapers at research-paper quality across four concurrent branches.

Software Development

Four production tools in active daily use

Browser-native architecture requiring zero IT support for any deployment or update. Tracker v4.0: 6,200+ lines, 40+ versions, full lifecycle from problem identification to iteration.

Technical Writing

Research-grade documentation

Spanning floor-level instruction, mathematical derivation, and executive-level briefing — across the same operational domain, written for audiences ranging from loaders to district management.

Operational Judgment

Derived from direct supervisory experience

The phase ladder, k staffing model, fidelity cascade, and borrow/loan detection were derived from direct supervisory experience and validated against measured data — not from external research.

The ask for this presentation is narrow: Python runtime on one laptop, for one supervisor, to run scripts that process files the facility already generates every day. The output is better staffing decisions, earlier identification of performance problems, and a replicable model any FTS coordinator in the district could be trained to operate in a single session.

Section 9

Conclusion

A PPH target always set near the performance ceiling. A workforce making real-time decisions with imperfect signals. A service standard that depends on both. That tension is manageable — not by removing the target or reducing the supervisor's responsibility, but by improving the quality of the information available when decisions are made.

"The KPI connections are direct. Clean scanner attribution → accurate belt-level PPH → correct staffing → fewer misloads, fewer LIB events, less overgood and damage. Phase-segmented analysis → no wind-down data contaminating diagnostic baselines → no phantom pressure calls in the sort's final 40 minutes. Zone-general k_Z(DOW) staffing → every zone allocated from its own empirically-derived freight share."

From Section 9 — CHEMA Sort Intelligence Whitepaper v2.2

The tools are built. The corpus is queryable. Every browser-based component runs today within network policy. The one remaining capability gap is a Python runtime on one work laptop.

That approval is the difference between a supervisor who observes the sort and a supervisor who understands it.

Section 10

References

Internal documents reference the CHEMA Analytics Program research corpus. External references are standard academic and open-source citations.

[A]Almeida, R. (2026). Hub Operations Mathematical Framework, v1.8. whitepapers/hub_ops_mathematical_framework_v1.8.md
[B]Almeida, R. (2026). Hub Operations Digital Twin, v2.3. whitepapers/hub_ops_digital_twin_v2.3.md
[C]Almeida, R. (2026). Purposeful Systems in Hub Operations, v3.9. whitepapers/hub_ops_purposeful_systems_v3.9.md
[D]Almeida, R. (2026). Unified Framework for Hub Operations Analytics, v4.3. whitepapers/hub_ops_unified_framework_v4.3.md
[OJS]UPS Corporate Industrial Engineering. Hub Management Job Methods Certification, GEMS Code #1760. Passing threshold 95%; recertification every 6 months.
[9]openpyxl Project Contributors. (2024). openpyxl — A Python library to read/write Excel 2010 xlsx/xlsm files. openpyxl.readthedocs.io
[10]Hunter, J. D. (2007). Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering, 9(3), 90–95.
[14]Ackoff, R. L. (1971). Towards a System of Systems Concepts. Management Science, 17(11), 661–671.