Academic Paper

Immune infiltrates in breast cancer : clinical significance from histopathology to prognosis
Document Type
Electronic Thesis or Dissertation
Source
Subject
Breast Cancer
Deep Learning
Digital Pathology
Machine Learning
Prognostication
Survival Analysis
Tumour Infiltrating Lymphocytes
Language
English
Abstract
Although breast cancer has traditionally been regarded as non-immunogenic, evidence in recent years has increasingly shown that patient immune responses play a central role in prognosis. Overall, increased tumour infiltrating lymphocyte (TIL) counts are associated with better outcomes. However, the prognostic significance of TILs varies by TIL type, with CD8+ (cytotoxic T-cells) TILs generally being associated with better outcomes, and both FOXP3+ (T-regulatory cells) and CD163+ (M2 macrophages) TILs being associated with worse outcomes. The prognostic literature on FOXP3+ and CD163+ TILs is nonetheless considerably heterogeneous, with studies conflicting on both the direction and statistical significance of the association. Moreover, there is a paucity of information on the significance of CD20+ (B-cells) TILs. To better elucidate the individual prognostic associations of the above TILs, and to study the prognostic interplay between them, this thesis aims to evaluate the association between breast cancer-specific survival (BCSS) and CD8+, FOXP3+, CD20+, and CD163+ TILs, both individually and in combination. To accomplish this, the first aim was to develop an algorithm capable of quickly and accurately scoring the TILs in the Breast Cancer Association Consortium's (BCAC) tissue microarray (TMA) dataset. TMAs are cassettes designed to efficiently organise cylindrical tumour samples (cores) from a large number of patients, so that their horizontal sections can be stained for a variety of targets. BCAC's TMA dataset is composed of 137,181 core images from 18,088 patients, each stained for either CD8, FOXP3, CD20, or CD163. To score the BCAC dataset, I developed two separate machine learning algorithms, one based on a Random Forest and the other built in Halo, a proprietary digital pathology platform widely used in the clinic.
Both models compare favourably with pathologist-generated overall CD8+ TIL counts, with Cohen's weighted kappa scores of 0.8 and 0.81 for the custom and Halo algorithms, respectively. However, owing to substantial time constraints during the PhD, development of the tissue segmentation (tumour, stroma, artefact, glass) component of the custom algorithm was cut short. As a result, the custom algorithm performed markedly worse than the Halo algorithm at the compartment level, with tumour-specific kappas of 0.49 and 0.7, and stroma-specific kappas of 0.6 and 0.74, for the custom and Halo algorithms, respectively. Therefore, my focus shifted to the Halo algorithm, which alone underwent pathologist-led training and validation across all markers and studies. During expert validation of Halo, two pathologists evaluated the algorithm's TIL and tumour/stroma segmentation across 100 randomly selected images (25 per marker). For each image, the pathologists reached consensus, passing or failing the TIL and tissue segmentation separately, based on the accuracy of the predictions (extent of under- and over-prediction, fit of segmentation masks to objects of interest, etc.). Ultimately, the TIL and tissue segmentation components of Halo passed in 98% and 85% of the images, respectively. Halo then underwent quantitative validation on tissue annotations for the 100-image set, receiving an average F1 score of 0.90 across all markers and tissue segmentation categories. With model development complete, I proceeded to my second aim and primary objective: the prognostication of each TIL marker. I began by scoring the entirety of the BCAC dataset with Halo. TIL scores were calculated in the form of compartment-specific percents (i.e., the compartment area covered by TILs divided by the total compartment area). Scores were calculated for the tumour and stroma compartments separately, as well as for the tissue overall. Once scored, I merged the results with the available clinical data.
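The two quantities described above, Cohen's weighted kappa and the compartment-specific TIL percent, can be illustrated with a minimal Python sketch. This is not the thesis code. The function names, the binning of TIL scores into ordinal categories, the choice of linear versus quadratic weights, and the definition of "overall" as tumour plus stroma are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter


def weighted_kappa(scores_a, scores_b, n_categories, power=1):
    """Cohen's weighted kappa for two raters scoring ordinal categories 0..n_categories-1.

    power=1 uses linear disagreement weights, power=2 quadratic ones.
    (Which weighting the thesis used is not stated; this is an assumption.)
    """
    if len(scores_a) != len(scores_b) or len(scores_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    if n_categories < 2:
        raise ValueError("at least two categories are required")
    n = len(scores_a)
    span = n_categories - 1

    def weight(i, j):
        # Disagreement weight: 0 for identical categories, 1 for the extremes.
        return (abs(i - j) / span) ** power

    pairs = Counter(zip(scores_a, scores_b))
    totals_a, totals_b = Counter(scores_a), Counter(scores_b)
    # Weighted disagreement actually observed between the two raters.
    observed = sum(weight(i, j) * count / n for (i, j), count in pairs.items())
    # Weighted disagreement expected by chance from the raters' marginal totals.
    expected = sum(
        weight(i, j) * totals_a[i] * totals_b[j] / n ** 2
        for i in range(n_categories)
        for j in range(n_categories)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one shared category")
    return 1.0 - observed / expected


def til_percent(til_area, compartment_area):
    """Percent of a compartment's area covered by TILs."""
    if compartment_area <= 0:
        raise ValueError("compartment area must be positive")
    return 100.0 * til_area / compartment_area


def compartment_til_percents(tumour_til, tumour_area, stroma_til, stroma_area):
    """Tumour, stroma and overall TIL percents for one core.

    Assumption: 'overall' pools the tumour and stroma compartments.
    """
    return {
        "tumour": til_percent(tumour_til, tumour_area),
        "stroma": til_percent(stroma_til, stroma_area),
        "overall": til_percent(tumour_til + stroma_til, tumour_area + stroma_area),
    }
```

With hypothetical areas, `compartment_til_percents(2.0, 100.0, 30.0, 300.0)` yields a tumour percent of 2, a stroma percent of 10 and an overall percent of 8. The overall value is an area-weighted mix of the two compartments, not their simple average.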
Patients missing survival data, ER status, age, tumour grade, tumour diameter (mm), or number of metastasised nodes were then excluded. I conducted ER-stratified sensitivity analyses to establish cutoffs for artefact percent, my primary QC metric, which measured the total tissue area covered by damage, detritus, or non-specific staining. Ultimately, cutoffs of 25% for ER+ samples and 95% for ER- samples were found to give the best balance of quality and sample size for each stratum. These cutoffs were applied only in unimetric survival analyses (Cox regression using only one TIL type in one compartment), as the multimetric analyses suffered considerable missingness given that not every patient had IHC-stained TMAs for each marker. Beginning with the unimetric analyses, CD8 had 7,845 ER+ samples and 2,885 ER- samples; FOXP3 had 7,830 ER+ samples and 2,785 ER- samples; CD20 had 8,070 ER+ samples and 2,834 ER- samples; and CD163 had 7,901 ER+ samples and 2,815 ER- samples. For the multimetric analyses, there were 7,774 ER+ samples and 2,591 ER- samples. My ER-stratified unimetric and multimetric Cox regressions were then performed for each marker, TIL metric, and compartment combination (e.g., CD8 tumoural percent). For brevity, here I will only outline my major findings, which derive from the fully adjusted analyses (age, grade, TVC(grade) [TVC: time-varying coefficient; included to account for proportional hazards violations], tumour diameter, and number of metastasised nodes). I found statistically significant hazard ratios (HR) with protective effects across ER and compartment strata in my CD8 unimetric analyses (overall percent: ER+ HR (95% CI) = 0.92 (0.87, 0.97), ER- HR = 0.89 (0.84, 0.95); tumoural percent: ER+ HR = 0.82 (0.74, 0.91), ER- HR = 0.85 (0.78, 0.93); stromal percent: ER+ HR = 0.94 (0.91, 0.98), ER- HR = 0.93 (0.89, 0.97)).
For my CD8 multimetric analyses, statistical significance was maintained across all ER- compartments, as well as ER+ tumoural percent (overall percent: ER- HR = 0.92 (0.86, 0.99); tumoural percent: ER+ HR = 0.89 (0.82, 0.98), ER- HR = 0.91 (0.82, 0.99); stromal percent: ER- HR = 0.95 (0.91, 0.99)). For FOXP3, in my unimetric analyses, statistically significant protective effects were observed for both ER+ and ER- overall percent (ER+ HR = 0.63 (0.41, 0.98), ER- HR = 0.56 (0.38, 0.82)), as well as for ER- tumoural percent (HR = 0.56 (0.34, 0.93)) and ER- stromal percent (HR = 0.70 (0.53, 0.92)). However, all statistical significance was lost in the multimetric analyses. For CD20, I found protective effects for all ER+ unimetric analyses (overall percent: HR = 0.93 (0.90, 0.97); tumoural percent: HR = 0.84 (0.74, 0.95); stromal percent: HR = 0.94 (0.91, 0.97)). For multimetric analyses, I found statistically significant protective effects for ER+ overall percent (HR = 0.96 (0.93, 0.99)) and ER+ stromal percent (HR = 0.96 (0.94, 0.99)). No statistically significant associations were found for ER- breast cancers. Finally, for CD163, statistically significant protective effects were found for all ER- unimetric (overall percent: HR = 0.96 (0.94, 0.98); tumoural percent: HR = 0.92 (0.88, 0.96); stromal percent: HR = 0.97 (0.95, 0.99)) and multimetric analyses (overall percent: HR = 0.97 (0.95, 0.99); tumoural percent: HR = 0.94 (0.90, 0.99); stromal percent: HR = 0.98 (0.96, 0.99)). No ER+ associations were statistically significant. Together, these results suggest that each marker is individually prognostically significant, with CD8 and CD20 being protective in ER+ breast cancers, and CD8, FOXP3, and CD163 being protective in ER- breast cancers. Furthermore, all markers maintain both the direction of association and the majority of their statistical significance when analysed in concert, with the exception of FOXP3.
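As a reading aid for the hazard ratios above, the following Python sketch shows how a Cox log-hazard coefficient and its standard error map to an HR with a 95% confidence interval, and why an interval lying entirely below 1 is read as a statistically significant protective effect. It assumes Wald-type intervals, which the abstract does not confirm. The function names are hypothetical, and the standard error is only approximated from the reported, rounded interval.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def hazard_ratio_with_ci(beta, se, z=Z_95):
    """Convert a Cox coefficient (log hazard ratio) and its SE to HR and Wald CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def log_scale_se_from_ci(lower, upper, z=Z_95):
    """Approximate the SE of log(HR) from a reported CI (assumes a Wald interval)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def excludes_null(lower, upper):
    """True when the CI does not contain HR = 1, i.e. significant at the 5% level."""
    return upper < 1.0 or lower > 1.0


# Values reported in the abstract: CD8 tumoural percent, ER+, unimetric analysis.
reported_hr, reported_lower, reported_upper = 0.82, 0.74, 0.91

beta = math.log(reported_hr)                               # log hazard ratio
se = log_scale_se_from_ci(reported_lower, reported_upper)  # approximate, from rounded CI
hr, lower, upper = hazard_ratio_with_ci(beta, se)

print(f"HR = {hr:.2f} ({lower:.2f}, {upper:.2f})")
print("protective and significant:", hr < 1.0 and excludes_null(lower, upper))
```

The round trip only approximately reproduces the reported interval, because the published values are rounded to two decimals.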
