Academic Paper

Prediction of Upstaged Ductal Carcinoma In Situ Using Forced Labeling and Domain Adaptation
Document Type
Periodical
Source
IEEE Transactions on Biomedical Engineering (IEEE Trans. Biomed. Eng.). 67(6):1565-1572, Jun. 2020
Subject
Bioengineering
Computing and Processing
Components, Circuits, Devices and Systems
Communication, Networking and Broadcast Technologies
Feature extraction
Adaptation models
Breast cancer
Ducts
Biopsy
domain adaptation
forced labeling
ductal carcinoma in situ
mammographic features
Language
English
ISSN
0018-9294
1558-2531
Abstract
Objective: This study uses adjunctive classes to improve a predictive model whose performance is limited by three common problems: a small number of primary cases, high feature dimensionality, and poor class separability. Specifically, the clinical task is to use mammographic features to predict whether ductal carcinoma in situ (DCIS) identified at needle core biopsy will later be upstaged, i.e., shown to contain invasive breast cancer.
Methods: To improve the prediction of pure DCIS (negative) versus upstaged DCIS (positive) cases, this study considers the adjunctive roles of two related classes: atypical ductal hyperplasia (ADH), a non-cancerous breast abnormality, and invasive ductal carcinoma (IDC). A total of 113 computer-vision-based mammographic features were extracted from each case. To improve on the baseline Model A's classification of pure vs. upstaged DCIS, we designed three strategies (Models B, C, and D) that embed features or inputs in different ways.
Results: In ROC analysis, the baseline Model A achieved an AUC of 0.614 (95% CI, 0.496-0.733). All three new models outperformed the baseline; domain adaptation (Model D) performed best, with an AUC of 0.697 (95% CI, 0.595-0.797).
Conclusion: We improved the prediction of DCIS upstaging by embedding two related pathology classes in different training phases.
Significance: All three strategies for embedding related-class data outperformed the baseline model, demonstrating both feature similarities among these classes and the potential for improving classification by using other related classes.
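The results above are reported as AUC values with 95% confidence intervals. The abstract does not say how those intervals were computed, so the following is only a minimal sketch of one common approach, a percentile bootstrap over cases. The function names `auc` and `bootstrap_ci`, the resampling settings, and the toy labels and scores are all illustrative assumptions, not the authors' code or data.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    labels: 1 = upstaged DCIS (positive), 0 = pure DCIS (negative).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up toy data, for illustration only.
    toy_labels = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]
    toy_scores = [0.1, 0.3, 0.35, 0.4, 0.45, 0.5, 0.6, 0.65, 0.8, 0.9]
    print("AUC:", auc(toy_labels, toy_scores))
    print("95% CI:", bootstrap_ci(toy_labels, toy_scores))
```

With only a few positive cases, intervals from such a procedure tend to be wide, which is consistent with the overlapping intervals reported for Models A and D.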