Journal Article

Slim UNETR: Scale Hybrid Transformers to Efficient 3D Medical Image Segmentation Under Limited Computational Resources
Document Type
Periodical
Source
IEEE Transactions on Medical Imaging, 43(3):994-1005, Mar. 2024
Subject
Bioengineering
Computing and Processing
Biomedical imaging
Transformers
Image segmentation
Task analysis
Computational modeling
Three-dimensional displays
Solid modeling
3D medical segmentation
lightweight
medical image analysis
resource-limited application
Language
English
ISSN
0278-0062 (Print)
1558-254X (Electronic)
Abstract
Hybrid transformer-based segmentation approaches have shown great promise in medical image analysis. However, they typically require considerable computational power and resources during both training and inference, posing a challenge for the resource-limited settings common in medical applications. To address this issue, we present an innovative framework called Slim UNETR, designed to balance accuracy and efficiency by leveraging the advantages of both convolutional neural networks and transformers. Our method features the Slim UNETR Block as its core component, which enables efficient information exchange through decomposition of the self-attention mechanism and cost-effective representation aggregation. Additionally, we use throughput as an efficiency indicator to provide feedback on model resource consumption. Our experiments demonstrate that Slim UNETR outperforms state-of-the-art models in accuracy, model size, and efficiency when deployed on resource-constrained devices. Remarkably, Slim UNETR achieves 92.44% Dice accuracy on BraTS2021 while being 34.6x smaller and 13.4x faster at inference than Swin UNETR. Code: https://github.com/aigzhusmart/Slim-UNETR
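
The abstract uses throughput (volumes processed per second) as its efficiency indicator. Below is a minimal, generic PyTorch sketch of how such a measurement is commonly taken; the function name, the 4-channel BraTS-style input shape, and the warm-up/iteration counts are illustrative assumptions, not the authors' benchmarking code.

import time
import torch

def measure_throughput(model, input_shape=(1, 4, 128, 128, 128),
                       warmup=10, iters=50, device="cuda"):
    """Return inference throughput in volumes per second.

    input_shape is a placeholder: batch of 1, 4 MRI modalities
    (BraTS-style), 128^3 voxels.
    """
    model = model.to(device).eval()
    x = torch.randn(*input_shape, device=device)
    with torch.no_grad():
        # Warm-up passes stabilize GPU clocks and fill caches
        for _ in range(warmup):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()  # wait for queued GPU work
        start = time.time()
        for _ in range(iters):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
        elapsed = time.time() - start
    return iters * input_shape[0] / elapsed

Synchronizing before and after the timed loop matters on GPU: CUDA kernel launches are asynchronous, so without it the timer would measure launch overhead rather than actual compute.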