Academic Paper

Prediction Of Software Defects by Employing Optimized Deep Learning and Oversampling Approaches
Document Type
Conference
Source
2024 2nd International Conference on Computer, Communication and Control (IC4) Computer, Communication and Control (IC4), 2024 2nd International Conference on. :1-5 Feb, 2024
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
General Topics for Engineers
Signal Processing and Analysis
Deep learning
Support vector machines
Costs
Source coding
Computational modeling
NASA
Software quality
Software Defect Prediction
Dataset balancing
Abstract Syntax Tree
Deep Learning
Language
English
Abstract
One of the most crucial methods for assessing software quality and cutting development costs is software defect prediction (SDP). Software defects can be predicted using data collected throughout the software lifecycle. Many SDP models are currently available; however, most samples exhibit class imbalance, and the models' predictive performance is often unsatisfactory. To address these problems, this research proposes an optimized deep learning (DL) model combined with an oversampling technique to predict software defects early. The proposed system comprises three phases: dataset balancing, source-code parsing, and classification. First, the dataset imbalance issue is resolved using k-means clustering combined with the synthetic minority oversampling technique (KMSMOTE). Next, the source code is parsed into tokens using an abstract syntax tree (AST). Finally, classification is performed with the Sand Cat Optimized Recurrent Neural Network (SCRNN), which labels each input as defective or non-defective; its hyperparameters are chosen via the Sand Cat Swarm Optimization (SCSO) algorithm, and the NIPUNA activation function mitigates the vanishing-gradient problem. Tests on the NASA dataset show that the system attains an average accuracy of 96.21%, confirming that the proposed model offers good prediction performance, lowers the cost of misclassification, and mitigates the effects of class imbalance.
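The abstract's first phase, KMSMOTE, clusters the minority class with k-means and then generates synthetic samples by interpolating between neighbors inside each cluster. The following is a minimal, self-contained sketch of that idea using only the Python standard library; the function names (`kmeans`, `kmsmote`) and all parameters are illustrative assumptions, not the paper's actual implementation (which would typically use a library such as imbalanced-learn).

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Naive k-means over feature vectors (lists of floats).

    Illustrative helper, not the paper's implementation: assigns each
    point to its nearest centroid and recomputes centroids `iters` times.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # index of the nearest centroid by squared Euclidean distance
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # keep the old centroid if a cluster emptied out
                centroids[i] = [sum(xs) / len(cl) for xs in zip(*cl)]
    return clusters

def kmsmote(minority, k=2, n_new=10, seed=0):
    """KMSMOTE-style oversampling sketch: cluster the minority class,
    then create synthetic samples by linear interpolation between two
    random samples drawn from the same cluster."""
    rng = random.Random(seed)
    # only clusters with at least two members can be interpolated within
    clusters = [c for c in kmeans(minority, k, seed=seed) if len(c) >= 2]
    synthetic = []
    for _ in range(n_new):
        cluster = rng.choice(clusters)
        a, b = rng.sample(cluster, 2)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic

minority = [[0.1, 0.2], [0.15, 0.25], [0.9, 0.8], [0.95, 0.85]]
new_samples = kmsmote(minority, k=2, n_new=5)
```

Because each synthetic point lies on a segment between two minority samples of the same cluster, the generated data stays inside the minority region instead of bleeding into majority-class territory, which is the usual motivation for combining k-means with SMOTE.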