DOI: 10.1158/1535-7163.targ-23-b010 ISSN: 1538-8514

Abstract B010: Spatially-resolved prediction of gene expression signatures in H&E whole slide images using additive multiple instance learning models

Miles Markey, Juhyun Kim, Zvi Goldstein, Ylaine Gerardin, Jacqueline Brosnan-Cashman, Syed Ashar Javed, Dinkar Juyal, Harshith Padigela, Limin Yu, Bahar Rahsepar, John Abel, Stephanie Hennek, Archit Khosla, Chintan Parmar, Amaro Taylor-Weiner

Abstract

Background: Newly developed molecular technologies, such as spatial multiplexed assays and single-cell sequencing, have provided increased resolution and output for tumor analysis. However, these assays are often cost-prohibitive, which makes them impractical for routine detection of clinical biomarkers. In contrast, hematoxylin and eosin (H&E) staining is routine for cancer diagnostics but does not provide molecular information, potentially limiting its utility in the targeted therapy era. Machine learning models could augment the information revealed by H&E, potentially allowing molecular information to be inferred. Here, we describe a novel approach to predict gene expression signatures (GES) in H&E-stained whole slide images (WSI) using an additive multiple instance learning (aMIL) end-to-end model (1). We present results in breast cancer, predicting spatially resolved levels of a TGFb GES, a proposed biomarker for TGFb antagonists and immunotherapy.

Methods: H&E-stained WSI from the TCGA BRCA cohort (N=1090) were split into training (60%), validation (20%), and test (20%) sets. TGFb-CAF GES (2) were computed, and the median expression cut-off in the training data was used to define “high” and “low” TGFb-CAF levels. aMIL models were optimized on the training data for the binary classification of TGFb-CAF levels. Top-performing model iterations were compared on the validation set, and the optimal model was deployed on the held-out test set. aMIL heatmaps were merged with PathExplore tumor microenvironment (TME) model heatmaps to characterize cell, tissue, and nuclear spatial distributions and morphology in terms of human interpretable features (HIFs). HIFs were extracted from high-importance patches (top 25% of aMIL scores) for both TGFb-CAF-high and -low predictions.
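
The data-handling steps in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`split_indices`, `binarize_by_training_median`, `top_fraction_patches`), the random shuffling, the fixed seed, and the rule that a score exactly at the cut-off counts as "low" are all assumptions made for illustration. The sketch shows a 60/20/20 split, a median cut-off computed on training samples only, and selection of the top 25% of patches by score.

```python
import random
import statistics


def split_indices(n, seed=0, fractions=(0.6, 0.2, 0.2)):
    """Shuffle sample indices and split them into train/validation/test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    # Whatever remains after train and validation becomes the test set.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def binarize_by_training_median(scores, train_idx):
    """Label samples 1 ("high") or 0 ("low") using the training-set median only.

    Assumption: a score equal to the cut-off is labeled "low".
    """
    cutoff = statistics.median(scores[i] for i in train_idx)
    labels = [1 if s > cutoff else 0 for s in scores]
    return labels, cutoff


def top_fraction_patches(patch_scores, fraction=0.25):
    """Return indices of the highest-scoring patches (top `fraction` by score)."""
    k = max(1, round(fraction * len(patch_scores)))
    order = sorted(range(len(patch_scores)),
                   key=lambda i: patch_scores[i], reverse=True)
    return order[:k]
```

Computing the cut-off from training samples alone keeps validation and test labels from influencing the threshold, which is the reason the sketch passes `train_idx` explicitly.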

Results: Our model accurately predicted TGFb-CAF-high vs. -low BRCA samples (test AUROC=0.80). Also, model deployment on WSI provided interpretable heatmaps depicting TGFb-CAF predictions in tissue, providing spatial resolution to TGFb-CAF expression. Patches contributing most to TGFb-CAF-high prediction were enriched for cancer stroma, as well as cancer-infiltrating and stromal fibroblasts. Furthermore, significant differences in HIFs relating to fibroblast nucleus size and lymphocyte nucleus shape were observed between patches contributing most to TGFb-CAF-high and -low predictions.
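
For readers less familiar with the reported metric, AUROC can be read as the probability that a randomly chosen "high" sample receives a higher model score than a randomly chosen "low" sample. The sketch below computes it by direct pairwise comparison; the function name `auroc` and the toy inputs in the usage note are illustrative assumptions, not data from the study.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

With toy labels `[1, 1, 0, 0, 1]` and scores `[0.9, 0.4, 0.5, 0.1, 0.8]`, five of the six positive-negative pairs are ordered correctly, so the function returns 5/6.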

Conclusions: We have developed a method to predict GES with spatial resolution in H&E-stained WSI. aMIL models provide exact marginal contributions of each patch towards every class prediction, allowing downstream analysis of tissue, cell, and nuclear features and providing biological interpretability not found in typical black-box models. The ability of our method to detect GES in H&E-stained WSI allows complex molecular information to be detected in routine clinical specimens with spatial specificity, providing a means for GES to potentially be realized as clinical biomarkers.
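
The additive property mentioned in the Conclusions can be sketched as follows, loosely following the formulation in reference 1: attention weights scale each patch embedding, a patch-level classifier scores each scaled embedding, and the slide-level logits are the sum of those per-patch scores. This is a toy illustration, not the authors' model: the function `additive_mil_forward`, the single tanh hidden layer, the absence of bias terms, and all weights are assumptions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def additive_mil_forward(patches, attn_w, hidden_w, out_w):
    """Return (bag_logits, per_patch_contributions) for one slide.

    patches:  list of patch embeddings (each a list of floats)
    attn_w:   attention vector, one weight per embedding dimension
    hidden_w: hidden-layer weight rows of the patch-level classifier
    out_w:    output weight rows, one row per class
    """
    # Attention weights over patches sum to 1.
    alphas = softmax([dot(attn_w, h) for h in patches])
    contributions = []
    for alpha, h in zip(alphas, patches):
        scaled = [alpha * x for x in h]
        # The nonlinear classifier is applied to each patch separately.
        hidden = [math.tanh(dot(row, scaled)) for row in hidden_w]
        contributions.append([dot(row, hidden) for row in out_w])
    # Bag logits are the plain sum of per-patch class contributions.
    n_classes = len(out_w)
    bag_logits = [sum(c[k] for c in contributions) for k in range(n_classes)]
    return bag_logits, contributions
```

Because the classifier is applied before summation rather than after pooling, each entry of `contributions` is that patch's exact share of the slide-level logit for each class, which is what allows the per-patch values to be drawn directly as a heatmap.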

References: 1) Javed, SA, et al. Adv Neural Inf Process Syst. 2022 35: 20689-702; 2) Krishnamurthy, AT, et al. Nature. 2022 611:148-54.

Citation Format: Miles Markey, Juhyun Kim, Zvi Goldstein, Ylaine Gerardin, Jacqueline Brosnan-Cashman, Syed Ashar Javed, Dinkar Juyal, Harshith Padigela, Limin Yu, Bahar Rahsepar, John Abel, Stephanie Hennek, Archit Khosla, Chintan Parmar, Amaro Taylor-Weiner. Spatially-resolved prediction of gene expression signatures in H&E whole slide images using additive multiple instance learning models [abstract]. In: Proceedings of the AACR-NCI-EORTC Virtual International Conference on Molecular Targets and Cancer Therapeutics; 2023 Oct 11-15; Boston, MA. Philadelphia (PA): AACR; Mol Cancer Ther 2023;22(12 Suppl):Abstract nr B010.
