학술논문

Impact of Noisy Labels on Dental Deep Learning—Calculus Detection on Bitewing Radiographs
Document Type
article
Source
Journal of Clinical Medicine, Vol 12, Iss 9, p 3058 (2023)
Subject
artificial intelligence
machine learning
deep learning
computer vision
convolutional neural networks
calculus
Medicine
Language
English
ISSN
2077-0383
Abstract
Supervised deep learning requires labelled data. On medical images, data is often labelled inconsistently (e.g., too large) with varying accuracies. We aimed to assess the impact of such label noise on dental calculus detection on bitewing radiographs. On 2584 bitewings calculus was accurately labeled using bounding boxes (BBs) and artificially increased and decreased stepwise, resulting in 30 consistently and 9 inconsistently noisy datasets. An object detection network (YOLOv5) was trained on each dataset and evaluated on noisy and accurate test data. Training on accurately labeled data yielded an mAP50: 0.77 (SD: 0.01). When trained on consistently too small BBs model performance significantly decreased on accurate and noisy test data. Model performance trained on consistently too large BBs decreased immediately on accurate test data (e.g., 200% BBs: mAP50: 0.24; SD: 0.05; p < 0.05), but only after drastically increasing BBs on noisy test data (e.g., 70,000%: mAP50: 0.75; SD: 0.01; p < 0.05). Models trained on inconsistent BB sizes showed a significant decrease of performance when deviating 20% or more from the original when tested on noisy data (mAP50: 0.74; SD: 0.02; p < 0.05), or 30% or more when tested on accurate data (mAP50: 0.76; SD: 0.01; p < 0.05). In conclusion, accurate predictions need accurate labeled data in the training process. Testing on noisy data may disguise the effects of noisy training data. Researchers should be aware of the relevance of accurately annotated data, especially when testing model performances.