Academic Article

Attention Mechanism Guided SE + ResNet-H Model for Gastrointestinal Endoscopy Image Classification
Document Type
Periodical
Source
IEEE Transactions on Instrumentation and Measurement (IEEE Trans. Instrum. Meas.), vol. 73, pp. 1-13, 2024
Subject
Power, Energy and Industry Applications
Components, Circuits, Devices and Systems
Residual neural networks
Diseases
Solid modeling
Gastrointestinal tract
Endoscopes
Lesions
Image classification
Attention mechanism
deep learning
transfer learning
wireless capsule endoscopy (WCE) image classification
wireless endoscopy
Language
ISSN
0018-9456
1557-9662
Abstract
Wireless capsule endoscopy (WCE) allows safe and painless screening and diagnosis of a patient’s gastrointestinal (GI) tract, including areas inaccessible to traditional endoscopy. However, a WCE examination produces a large number of images that require significant time and expertise to review, and the accuracy of manual examination by experts is limited. To address these problems, a novel squeeze-and-excitation (SE) + residual neural network (ResNet)-H GI lesion recognition model is proposed to automate the screening and detection process. An attention mechanism is added to the original ResNet50 model to form a new attention mechanism + ResNet50 model, which is then combined with transfer learning to classify WCE images. Experiments with different combinations of attention-mechanism positions and types show that the average detection accuracy of the optimized model reaches up to 97.84% across all types of GI images. Replacing the cross-entropy loss function with large-margin cosine loss raised the detection accuracy to 98.47% in a comparative analysis across the different types of GI images. Furthermore, gradient-weighted class activation mapping (Grad-CAM) results indicate that the improved model captures the focal area well and compares favorably with the original ResNet50 model. The proposed approach can potentially assist endoscopists in detecting and examining GI diseases during endoscopy.
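The two building blocks named in the abstract, squeeze-and-excitation channel attention and large-margin cosine loss, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the record does not describe the SE + ResNet-H architecture, where the attention modules are placed, or any hyperparameters. The function names `se_block` and `lmcl_loss`, the bias-free fully connected layers, and the default scale `s=30.0` and margin `m=0.35` are all assumptions chosen for illustration.

```python
import math


def se_block(feature_map, w1, w2):
    """Squeeze-and-excitation over a C x H x W feature map (nested lists).

    w1 has shape (C/r) x C and w2 has shape C x (C/r); both are
    hypothetical weights, and biases are omitted for brevity.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
         for ch in feature_map]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: fully connected layer with sigmoid gate per channel.
    gates = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
             for row in w2]
    # Scale: reweight every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for ch, g in zip(feature_map, gates)]
    return scaled, gates


def lmcl_loss(feature, class_weights, label, s=30.0, m=0.35):
    """Large-margin cosine loss for one sample (s and m are illustrative)."""
    def cosine(a, b):
        dot = sum(p * q for p, q in zip(a, b))
        norm_a = math.sqrt(sum(p * p for p in a))
        norm_b = math.sqrt(sum(q * q for q in b))
        return dot / (norm_a * norm_b)

    # Logits are scaled cosine similarities to each class weight vector.
    logits = [s * cosine(feature, w) for w in class_weights]
    # The margin is subtracted from the true-class cosine only.
    logits[label] -= s * m
    # Negative log-softmax of the true class, computed stably.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(v - top) for v in logits))
    return log_sum - logits[label]
```

Because the loss depends only on cosine similarity, rescaling the feature vector leaves it unchanged, and the margin `m` makes the true class harder to satisfy than under plain softmax cross-entropy on the same cosine logits.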