Multi-window CT encoding, and what actually made Atlas generalize
Calcified stones, inflamed bowel, and vessels each want a different CT window. Encoding three windows into three channels did not raise the internal score, but it helped the model generalize to an external cohort.

One window is always a compromise
A radiologist does not read an abdominal CT through a single window. Calcified stones are conspicuous on a bone window, inflammatory change reads best on a soft-tissue window, and vascular or parenchymal findings want a narrower contrast-sensitive window. A model fed one fixed grayscale window inherits that compromise.
Three windows, three channels
Atlas converts raw Hounsfield Units into a three-channel image: a soft-tissue window in the red channel for organ boundaries and inflammation, a bone and stone window in the green channel for calcifications, and an angio and liver window in the blue channel for vascular and contrast-sensitive structures. This gives a standard detector a radiologically motivated input without changing its architecture, extending an idea used in brain CT [1] and related abdominal RGB-superposition work [3].
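A minimal sketch of this kind of encoding, assuming NumPy and a 2D slice already expressed in Hounsfield Units. The window centers and widths below are illustrative placeholders chosen for this example, not the values Atlas uses, and the names `WINDOWS`, `apply_window`, and `encode_multi_window` are hypothetical.

```python
import numpy as np

# (center, width) in Hounsfield Units, in channel order R, G, B.
# ASSUMPTION: illustrative presets only; the actual Atlas windows are not
# stated in this post.
WINDOWS = (
    (40.0, 400.0),    # red:   soft-tissue window
    (400.0, 1800.0),  # green: bone / stone window
    (100.0, 200.0),   # blue:  angio / liver window
)


def apply_window(hu, center, width):
    """Clip HU values to one window and rescale linearly to [0, 1]."""
    lo = center - width / 2.0
    hi = center + width / 2.0
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)


def encode_multi_window(hu_slice, windows=WINDOWS):
    """Stack one windowed copy of the slice per channel: (H, W) -> (H, W, 3)."""
    hu = np.asarray(hu_slice, dtype=np.float32)
    channels = [apply_window(hu, center, width) for center, width in windows]
    return np.stack(channels, axis=-1)


if __name__ == "__main__":
    # Toy 2x2 "slice": air, soft tissue, dense calcification, metal-like value.
    toy = np.array([[-1000.0, 40.0], [400.0, 3000.0]])
    rgb = encode_multi_window(toy)
    print(rgb.shape)  # (2, 2, 3)
```

Each channel sees the same anatomy at a different contrast range, so a value that saturates in one window can still be resolved in another. The output is a float array in [0, 1]; scaling to 8-bit or applying detector-specific normalization would be a separate step.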
The ablation: internal parity, external gain
We trained an otherwise identical single-window grayscale model on the same patient-level splits. Internally, the two were a near tie: macro AUROC 0.941 versus 0.937, with per-class differences in both directions and none statistically significant. The interesting result was on transfer. On the external cohort, multi-window encoding gave a small but significant macro-AUROC gain, concentrated where physics predicts it should be: kidney stones improved by 0.114 AUROC (p = 0.0002) and pancreatitis by 0.035 (p = 0.021).
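One way a comparison like this could be run is a paired bootstrap on the macro-AUROC difference, sketched below with NumPy. This is an assumption for illustration: the post does not say which significance test was used, and the function names are hypothetical. The sketch assumes one row per patient, so resampling rows is resampling patients, and both models are scored on the same resampled rows.

```python
import numpy as np


def auroc(y_true, scores):
    """Mann-Whitney AUROC: P(pos score > neg score), ties counted as 1/2.

    Uses all positive/negative pairs, which is fine for a sketch but
    memory-hungry for very large cohorts.
    """
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y], s[~y]
    if pos.size == 0 or neg.size == 0:
        return float("nan")  # undefined when one class is absent
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def macro_auroc(Y, S):
    """Unweighted mean of per-class AUROC; Y and S have shape (n, n_classes)."""
    Y, S = np.asarray(Y), np.asarray(S)
    return float(np.mean([auroc(Y[:, k], S[:, k]) for k in range(Y.shape[1])]))


def paired_bootstrap_delta(Y, S_a, S_b, n_boot=2000, seed=0):
    """Bootstrap the macro-AUROC difference (model A minus model B).

    Returns the observed difference, a 95% percentile interval, and a
    two-sided bootstrap p-value estimate.
    """
    Y, S_a, S_b = np.asarray(Y), np.asarray(S_a), np.asarray(S_b)
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    observed = macro_auroc(Y, S_a) - macro_auroc(Y, S_b)
    deltas = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # same rows for both models
        d = macro_auroc(Y[idx], S_a[idx]) - macro_auroc(Y[idx], S_b[idx])
        if not np.isnan(d):  # skip resamples that lost a class entirely
            deltas.append(d)
    deltas = np.array(deltas)
    lo, hi = np.percentile(deltas, [2.5, 97.5])
    p = min(1.0, 2.0 * min((deltas <= 0).mean(), (deltas >= 0).mean()))
    return observed, (float(lo), float(hi)), float(p)
```

Pairing matters here: because both models are evaluated on the same resampled patients, case-difficulty variance cancels in the difference, which makes a small gap easier to detect than comparing two independent confidence intervals.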
Robustness is the point, not raw accuracy
Because AUROC is threshold-independent, that external gain is not an artifact of operating-point selection. It suggests the multi-window representation carries information that survives a change of scanner fleet, reconstruction kernel, and population. Models that look identical in-distribution can diverge sharply out-of-distribution [2], so the encoding choice that buys cross-site stability is worth more than one that nudges an internal leaderboard.
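The threshold-independence claim can be made explicit with the standard pairwise definition of AUROC. The notation is introduced here for illustration: $s$ is the model's score function, $X^{+}$ and $X^{-}$ are a randomly drawn positive and negative case, and $g$ is any strictly increasing rescaling of the scores.

```latex
\mathrm{AUROC}(s)
  = \Pr\!\big[s(X^{+}) > s(X^{-})\big]
  + \tfrac{1}{2}\,\Pr\!\big[s(X^{+}) = s(X^{-})\big]

% For strictly increasing g, g(a) > g(b) \iff a > b and g(a) = g(b) \iff a = b, so
\mathrm{AUROC}(g \circ s) = \mathrm{AUROC}(s)
```

No decision threshold appears in the definition, and any monotone recalibration of the scores leaves it unchanged, so an AUROC difference between two encodings reflects how they rank cases rather than where a cutoff was placed.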
References
[1] Wang X, et al. Deep learning for acute intracranial hemorrhage on head CT. NeuroImage: Clinical. 2021;32:102785.
[2] Zech JR, et al. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs. PLoS Med. 2018;15(11):e1002683.
[3] Lee GP, et al. Disease classification in abdominal CT through RGB superposition methods. Comput Math Methods Med. 2023;2023:7714483.