Spatially Aware Transformer Networks for Contextual Prediction of Diabetic
Nephropathy Progression from Whole Slide Images.
Authors Shickel B, Lucarelli N, Rao AS, Yun D, Moon KC, Han SS, Sarder P
Submitted By Submitted Externally on 10/16/2023
Status Published
Journal medRxiv : the preprint server for health sciences
Year 2023
Date Published 2/1/2023
Volume : Pages Not Specified : Not Specified
PubMed Reference 36865174
Abstract Diabetic nephropathy (DN) in the context of type 2 diabetes is the leading cause
of end-stage renal disease (ESRD) in the United States. DN is graded based on
glomerular morphology and has a spatially heterogeneous presentation in kidney
biopsies that complicates pathologists’ predictions of disease progression.
Artificial intelligence and deep learning methods for pathology have shown
promise for quantitative pathological evaluation and clinical trajectory
estimation, but they often fail to capture the large-scale spatial anatomy and
relationships found in whole slide images (WSIs). In this study, we present a
transformer-based, multi-stage ESRD prediction framework built upon nonlinear
dimensionality reduction, relative Euclidean pixel distance embeddings between
every pair of observable glomeruli, and a corresponding spatial self-attention
mechanism for a robust contextual representation. We developed a deep
transformer network for encoding WSIs and predicting future ESRD using a dataset
of 56 kidney biopsy WSIs from DN patients at Seoul National University Hospital.
Using a leave-one-out cross-validation scheme, our modified transformer
framework outperformed RNNs, XGBoost, and logistic regression baseline models,
and resulted in an area under the receiver operating characteristic curve (AUC)
of 0.97 (95% CI: 0.90-1.00) for predicting two-year ESRD, compared with an AUC
of 0.86 (95% CI: 0.66-0.99) without our relative distance embedding, and an AUC
of 0.76 (95% CI: 0.59-0.92) without a denoising autoencoder module. While the
variability and limited generalizability that come with small sample sizes remain
challenging, our distance-based embedding approach and overfitting mitigation
techniques yielded results that suggest opportunities for future spatially aware
WSI research using limited pathology datasets.
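
Illustrative sketch (not the authors' implementation): the abstract describes relative Euclidean pixel distances between every pair of glomeruli feeding a spatial self-attention mechanism, but does not specify how the distances are embedded. The Python below is a minimal example of the general idea under stated assumptions: the function names, the `scale` parameter, and the simple additive penalty `-distance / scale` are hypothetical stand-ins for the learned embedding, and queries, keys, and values are all taken to be the raw glomerulus feature vectors with no learned projections.

```python
import math


def pairwise_distances(centroids):
    """Euclidean pixel distance between every pair of (x, y) glomerulus centroids."""
    return [[math.dist(a, b) for b in centroids] for a in centroids]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_self_attention(features, centroids, scale=1000.0):
    """Single-head self-attention with a distance penalty on the logits.

    Assumption: the spatial term is -distance / scale, so distant glomeruli
    receive less attention. The paper's actual embedding may differ.
    """
    d = len(features[0])
    dist = pairwise_distances(centroids)
    outputs = []
    for i, query in enumerate(features):
        # Scaled dot-product similarity minus the pairwise distance penalty.
        logits = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d) - dist[i][j] / scale
            for j, key in enumerate(features)
        ]
        weights = softmax(logits)
        # Each output is an attention-weighted average of all feature vectors.
        outputs.append(
            [sum(weights[j] * features[j][c] for j in range(len(features))) for c in range(d)]
        )
    return outputs
```

With a very small `scale`, the penalty dominates and each glomerulus attends almost only to itself; with a very large `scale`, the spatial term vanishes and the layer reduces to ordinary dot-product attention.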