Academic Paper

Deep Learning-Based Gleason Grading of Prostate Cancer From Histopathology Images—Role of Multiscale Decision Aggregation and Data Augmentation
Document Type
Periodical
Source
IEEE Journal of Biomedical and Health Informatics (IEEE J. Biomed. Health Inform.). 24(5):1413-1426, May 2020
Subject
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Signal Processing and Analysis
Deep learning
Training data
Training
Glands
Biomedical imaging
Informatics
Principal component analysis
Prostate cancer
Gleason grading
histopathology
deep learning
Language
ISSN
2168-2194
2168-2208
Abstract
Visual inspection of histopathology images of stained biopsy tissue by expert pathologists is the standard method for grading prostate cancer (PCa). However, this process is time-consuming and subject to high inter-observer variability. Machine learning-based methods have the potential to increase the throughput of large volumes of slides while decreasing variability, but they are not easy to develop because they require substantial amounts of labeled training data. In this paper, we propose a deep learning-based classification technique and data augmentation methods for accurate grading of PCa in histopathology images in the presence of limited data. Our method combines the predictions of three separate convolutional neural networks (CNNs) that work with different patch sizes. This enables our method to take advantage of both the greater amount of contextual information in larger patches and the greater quantity of smaller patches in the labeled training data. The predictions produced by the three CNNs are combined using a logistic regression model, which is trained separately after the CNN training. To effectively train our models, we propose new data augmentation methods and empirically study their effects on the classification accuracy. The proposed method achieves an accuracy of 92% in classifying cancerous patches versus benign patches and an accuracy of 86% in distinguishing low-grade (i.e., Gleason grade 3) from high-grade (i.e., Gleason grades 4 and 5) patches. The agreement level of our automatic grading method with expert pathologists is within the range of agreement between pathologists. Our experiments indicate that data augmentation is necessary for achieving expert-level performance with deep learning-based methods. A combination of image-space augmentation and feature-space augmentation leads to the best results.
Our study shows that well-designed and properly trained deep learning models can achieve PCa Gleason grading accuracy that is comparable to an expert pathologist.
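
The multiscale decision aggregation described in the abstract (three CNN predictions combined by a separately trained logistic regression) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a binary benign-versus-cancerous task, and the function `simulate_cnn_scores`, its noise levels, and the dataset sizes are hypothetical stand-ins for the outputs of three already trained CNNs operating on different patch sizes. The logistic regression is fitted with plain batch gradient descent so that the example needs only the Python standard library.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_cnn_scores(label, rng, noise=(0.35, 0.30, 0.25)):
    """Hypothetical stand-in for three trained CNNs, one per patch size.

    Each entry is a noisy probability that the region is cancerous,
    clipped to [0, 1]. Real inputs would be the CNN output probabilities.
    """
    center = 0.7 if label == 1 else 0.3
    return [min(1.0, max(0.0, rng.gauss(center, s))) for s in noise]


def make_dataset(n, rng):
    """Build a balanced set of benign (0) and cancerous (1) examples."""
    labels = [i % 2 for i in range(n)]
    features = [simulate_cnn_scores(y, rng) for y in labels]
    return features, labels


def predict(weights, bias, x):
    """Aggregated probability of 'cancerous' from the three CNN scores."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit_logistic_regression(features, labels, lr=1.0, epochs=500):
    """Fit the aggregation weights by batch gradient descent on the log loss."""
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = predict(weights, bias, x) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(weights, bias, features, labels):
    """Fraction of examples classified correctly at a 0.5 threshold."""
    hits = sum(
        (predict(weights, bias, x) >= 0.5) == (y == 1)
        for x, y in zip(features, labels)
    )
    return hits / len(labels)


# The CNNs are treated as fixed; only the aggregator is trained here,
# mirroring the "trained separately after the CNN training" step.
rng = random.Random(0)
train_x, train_y = make_dataset(300, rng)
test_x, test_y = make_dataset(200, rng)
weights, bias = fit_logistic_regression(train_x, train_y)
test_accuracy = accuracy(weights, bias, test_x, test_y)
print("aggregation weights:", [round(w, 2) for w in weights])
print("held-out accuracy:", round(test_accuracy, 3))
```

In this toy setup the learned weights indicate how much each simulated scale contributes to the final decision. With real data, the three input scores would come from the trained CNNs, and a multi-class variant would be needed to separate benign, low-grade, and high-grade patches.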