Academic Paper

Towards Reliable Dermatology Evaluation Benchmarks

Document Type

Working Paper

Author

Gröger, Fabian; Lionetti, Simone; Gottfrois, Philippe; Gonzalez-Jimenez, Alvaro; Groh, Matthew; Daneshjou, Roxana; Labelling Consortium; Navarini, Alexander A.; Pouly, Marc

Source

Proceedings of the 3rd Machine Learning for Health Symposium, PMLR 225:101-128, 2023

Subject

Computer Science - Computer Vision and Pattern Recognition
Computer Science - Artificial Intelligence

Abstract

Benchmark datasets for digital dermatology unwittingly contain inaccuracies that reduce trust in model performance estimates. We propose a resource-efficient data-cleaning protocol to identify issues that escaped previous curation. The protocol leverages an existing algorithmic cleaning strategy and is followed by a confirmation process terminated by an intuitive stopping criterion. Based on confirmation by multiple dermatologists, we remove irrelevant samples and near duplicates and estimate the percentage of label errors in six dermatology image datasets for model evaluation promoted by the International Skin Imaging Collaboration. Along with this paper, we publish revised file lists for each dataset which should be used for model evaluation. Our work paves the way for more trustworthy performance assessment in digital dermatology.
Comment: Link to the revised file lists: https://github.com/Digital-Dermatology/SelfClean-Revised-Benchmarks

Online Access

Open Access (arXiv)
