Method · May 20, 2026 · 5 min read

Multi-window CT encoding, and what actually made Atlas generalize

Calcified stones, inflamed bowel, and vessels each want a different CT window. Encoding three windows into three channels did not raise the internal score, but it helped the model travel.

[Figure: Cross-sectional CT imaging detail]

One window is always a compromise

A radiologist does not read an abdominal CT through a single window. Calcified stones are conspicuous on a bone window, inflammatory change reads best on a soft-tissue window, and vascular or parenchymal findings want a narrower contrast-sensitive window. A model fed one fixed grayscale window inherits that compromise.

Three windows, three channels

Atlas converts raw Hounsfield Units into a three-channel image: a soft-tissue window in the red channel for organ boundaries and inflammation, a bone and stone window in the green channel for calcifications, and an angio and liver window in the blue channel for vascular and contrast-sensitive structures. This gives a standard detector a radiologically motivated input without changing its architecture, extending an idea used in brain CT [1] and related abdominal RGB-superposition work [3].
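The encoding described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the Atlas implementation: the article does not give window levels or widths, so the values in `WINDOWS` are assumed, commonly used presets, and the names `WINDOWS`, `apply_window`, and `encode_multi_window` are hypothetical.

```python
import numpy as np

# Illustrative (level, width) pairs in Hounsfield Units.
# ASSUMPTION: typical presets chosen for this sketch, not the values Atlas uses.
WINDOWS = {
    "soft_tissue": (40.0, 400.0),    # red channel: organ boundaries, inflammation
    "bone_stone": (400.0, 1800.0),   # green channel: calcifications
    "angio_liver": (100.0, 300.0),   # blue channel: vascular, contrast-sensitive
}
CHANNEL_ORDER = ("soft_tissue", "bone_stone", "angio_liver")


def apply_window(hu, level, width):
    """Clip HU to [level - width/2, level + width/2] and rescale to [0, 1]."""
    lo = level - width / 2.0
    hi = level + width / 2.0
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)


def encode_multi_window(hu):
    """Turn a 2D slice of HU values into an (H, W, 3) uint8 image."""
    hu = np.asarray(hu, dtype=np.float32)
    channels = [apply_window(hu, *WINDOWS[name]) for name in CHANNEL_ORDER]
    rgb = np.stack(channels, axis=-1)
    return np.round(rgb * 255.0).astype(np.uint8)
```

With these assumed presets, a 400 HU voxel saturates the soft-tissue and angio/liver channels but sits mid-range in the bone and stone channel, which is the kind of contrast a single window would clip away. The input is expected to be in HU already, that is, after the DICOM rescale slope and intercept have been applied.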

The ablation: internal parity, external gain

We trained an otherwise identical single-window grayscale model on the same patient-level splits. Internally, the two were a near tie: macro AUROC 0.941 versus 0.937, with per-class differences in both directions and none statistically significant. The interesting result was on transfer. On the external cohort, multi-window encoding gave a small but statistically significant macro-AUROC gain, concentrated where physics predicts it should be: kidney stones improved by 0.114 AUROC (p = 0.0002) and pancreatitis by 0.035 (p = 0.021).
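One way to run this kind of paired comparison is sketched below. The article does not say which significance test was used, so the paired bootstrap here is an assumption, as are the function names and the premise that each row of the arrays is one patient. Both models are scored on the same resampled patients, so the difference in macro AUROC is compared case for case.

```python
import numpy as np


def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Uses O(n_pos * n_neg) memory, fine for a sketch.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)


def macro_auroc(y_true, scores):
    """Unweighted mean of per-class AUROC; inputs have shape (n_patients, n_classes)."""
    y_true, scores = np.asarray(y_true), np.asarray(scores)
    return float(np.mean([auroc(y_true[:, c], scores[:, c])
                          for c in range(y_true.shape[1])]))


def paired_bootstrap_delta(y_true, scores_a, scores_b, n_boot=2000, seed=0):
    """Macro-AUROC difference (a minus b), 95% percentile CI, two-sided p-value.

    ASSUMPTION: one row per patient, so resampling rows resamples patients.
    """
    y_true = np.asarray(y_true)
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    rng = np.random.default_rng(seed)
    n = y_true.shape[0]
    observed = macro_auroc(y_true, scores_a) - macro_auroc(y_true, scores_b)
    deltas = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        yb = y_true[idx]
        n_pos = yb.sum(axis=0)
        # Skip resamples where some class lost all positives or all negatives.
        if np.any(n_pos == 0) or np.any(n_pos == n):
            continue
        deltas.append(macro_auroc(yb, scores_a[idx]) - macro_auroc(yb, scores_b[idx]))
    deltas = np.asarray(deltas)
    lo, hi = np.percentile(deltas, [2.5, 97.5])
    # Two-sided p-value with a +1 correction so it is never exactly zero.
    tail = min(int(np.sum(deltas <= 0)), int(np.sum(deltas >= 0)))
    p = min(1.0, 2.0 * (tail + 1) / (deltas.size + 1))
    return observed, (float(lo), float(hi)), float(p)
```

Calling `paired_bootstrap_delta(y, scores_multi, scores_single)` on a cohort returns the observed gain, its interval, and a p-value; passing a single class column of shape `(n, 1)` gives the per-class version of the same test.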

Robustness is the point, not raw accuracy

Because AUROC is threshold-independent, that external gain is not an artifact of operating-point selection. It suggests the multi-window representation carries information that survives a change of scanner fleet, reconstruction kernel, and population. Models that look identical in-distribution can diverge sharply out-of-distribution [2], so the encoding choice that buys cross-site stability is worth more than one that nudges an internal leaderboard.
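The threshold-independence claim follows from the ranking form of AUROC, a standard identity rather than anything specific to Atlas. For a score function $s$, a randomly drawn positive case $X^{+}$, and a randomly drawn negative case $X^{-}$:

```latex
\mathrm{AUROC}(s) = \Pr\big(s(X^{+}) > s(X^{-})\big) + \tfrac{1}{2}\,\Pr\big(s(X^{+}) = s(X^{-})\big),
\qquad
\mathrm{AUROC}(g \circ s) = \mathrm{AUROC}(s) \quad \text{for any strictly increasing } g .
```

No decision threshold appears in the definition, and any strictly increasing rescaling of the scores leaves the value unchanged. Macro AUROC is the unweighted mean of this quantity over classes, so it inherits the same property.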

References

  1. Wang X, et al. Deep learning for acute intracranial hemorrhage on head CT. NeuroImage: Clinical. 2021;32:102785.
  2. Zech JR, et al. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs. PLoS Med. 2018;15(11):e1002683.
  3. Lee GP, et al. Disease classification in abdominal CT through RGB superposition methods. Comput Math Methods Med. 2023;2023:7714483.