Academic Paper

End-to-end Multiple Instance Learning with Gradient Accumulation
Document Type
Conference
Source
2022 IEEE International Conference on Big Data (Big Data), pp. 2742-2746, Dec. 2022
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Engineering Profession
Geoscience
Robotics and Control Systems
Signal Processing and Analysis
Training
Solid modeling
Graphics processing units
Big Data
Data models
Multiple Instance Learning
deep learning
memory management
big data
interpretability
Language
English
Abstract
Being able to learn from weakly labeled data and to provide interpretability are two of the main reasons why attention-based deep multiple instance learning (ABMIL) methods have become particularly popular for the classification of histopathological images. Such image data usually come in the form of gigapixel-sized whole-slide images (WSIs) that are cropped into smaller patches (instances). However, the sheer volume of the data poses a practical big data challenge: all the instances from one WSI cannot fit into the GPU memory of conventional deep-learning setups. Existing solutions compromise training by relying on pre-trained models, strategic selection of instances, sub-sampling, or self-supervised pre-training. We propose a training strategy based on gradient accumulation that enables direct end-to-end training of ABMIL models without being limited by GPU memory. We conduct experiments on both QMNIST and Imagenette to investigate performance and training time, and compare against the conventional memory-expensive baseline as well as a recent sampling-based approach. This memory-efficient approach, although slower, reaches performance indistinguishable from that of the memory-expensive baseline.
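
For illustration, the following is a minimal, hypothetical PyTorch sketch of gradient accumulation applied to an attention-based MIL setting: a toy model is trained one bag at a time and gradients are summed over several bags before each optimizer update. The model (TinyABMIL), dummy data, and hyperparameters are placeholder assumptions and do not reproduce the authors' exact per-instance accumulation procedure.

import torch
import torch.nn as nn

class TinyABMIL(nn.Module):
    """Toy attention-based MIL model: per-instance encoder + attention pooling."""
    def __init__(self, in_dim=64, hid=32, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hid), nn.ReLU())
        self.attention = nn.Linear(hid, 1)
        self.classifier = nn.Linear(hid, n_classes)

    def forward(self, bag):                        # bag: (num_instances, in_dim)
        h = self.encoder(bag)                      # per-instance embeddings
        a = torch.softmax(self.attention(h), 0)    # attention weights over instances
        z = (a * h).sum(0)                         # attention-weighted bag embedding
        return self.classifier(z)                  # bag-level logits

model = TinyABMIL()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Dummy bags with a variable number of instances and one label per bag (placeholders).
bags = [torch.randn(torch.randint(5, 20, (1,)).item(), 64) for _ in range(16)]
labels = [torch.randint(0, 2, (1,)).item() for _ in range(16)]

accum_steps = 4                                    # bags per optimizer update
optimizer.zero_grad()
for step, (bag, label) in enumerate(zip(bags, labels)):
    logits = model(bag)                            # forward one bag at a time
    loss = criterion(logits.unsqueeze(0), torch.tensor([label])) / accum_steps
    loss.backward()                                # gradients accumulate in .grad buffers
    if (step + 1) % accum_steps == 0:
        optimizer.step()                           # one update per accumulated group
        optimizer.zero_grad()

Only one bag is held in memory per backward pass, while the scaled losses make the accumulated gradient equivalent to that of a larger batch; the paper extends this idea so that even a single gigapixel WSI need not fit in GPU memory at once.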