Academic Article

Machine Learning for Technical Debt Identification

Document Type

Periodical

Author

Tsoukalas, D.; Mittas, N.; Chatzigeorgiou, A.; Kehagias, D.; Ampatzoglou, A.; Amanatidis, T.; Angelis, L.

Source

IEEE Transactions on Software Engineering (IEEE Trans. Software Eng.), 48(12):4892-4906, Dec. 2022

Subject

Computing and Processing
Tools
Software
Java
Radio frequency
Codes
Support vector machines
Benchmark testing
Machine learning
metrics/measurement
quality analysis and evaluation
software maintenance

Language

ISSN

0098-5589
1939-3520
2326-3881

Abstract

Technical Debt (TD) is a successful metaphor for conveying the consequences of software inefficiencies and their elimination to both technical and non-technical stakeholders, primarily due to its monetary nature. The identification and quantification of TD rely heavily on a handful of sophisticated tools that check for violations of certain predefined rules, usually through static analysis. Different tools produce divergent TD estimates, calling into question the reliability of findings derived from a single tool. To alleviate this issue, we use 18 metrics pertaining to the source code, repository activity, issue tracking, refactorings, duplication, and commenting rates of each class as features for statistical and Machine Learning models, so as to classify each class as High-TD or not. As a benchmark, we exploit 18,857 classes obtained from 25 Java projects, whose high levels of TD have been confirmed by three leading tools. The findings indicate that it is feasible to identify TD issues with sufficient accuracy and reasonable effort: a subset of superior classifiers achieved an F$_2$-measure score of approximately 0.79 with an associated Module Inspection ratio of approximately 0.10. Based on the results, a tool prototype for automatically assessing the TD of Java projects has been implemented.
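
The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. The F$_2$-measure follows the standard F-beta formula with beta = 2, which weights recall more heavily than precision. The abstract does not define the Module Inspection ratio, so the sketch assumes it is the fraction of all classes that a model flags as High-TD (and that would therefore need manual inspection). The function names and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score from confusion-matrix counts.

    With beta = 2 this is the F2-measure, which favours recall over precision.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def module_inspection_ratio(tp: int, fp: int, fn: int, tn: int) -> float:
    """ASSUMED definition: share of all modules predicted as High-TD."""
    total = tp + fp + fn + tn
    return (tp + fp) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts for 1,000 classes (illustration only, not paper data).
    tp, fp, fn, tn = 80, 40, 20, 860
    print(f"F2-measure: {f_beta(tp, fp, fn):.3f}")
    print(f"Module Inspection ratio: {module_inspection_ratio(tp, fp, fn, tn):.3f}")
```

Under this reading, a high F$_2$-measure paired with a low Module Inspection ratio means most High-TD classes are caught while only a small share of the code base has to be reviewed.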
