Hybrid Feature Selection and Balancing Data Approach for Improved Software Defect Prediction

Balancing Class, Multi Filter Feature, Particle Swarm Optimization, Software Defect Prediction


March 4, 2025
March 16, 2025
April 23, 2025


Software Defect Prediction (SDP) plays a vital role in identifying defects within software modules. Accurate early detection of defects can reduce development costs and enhance software reliability, yet SDP remains a significant challenge in the software development lifecycle. This study employs Particle Swarm Optimization (PSO) and addresses several challenges associated with its application: noisy attributes, high-dimensional data, and imbalanced class distribution. To address these challenges, the study proposes a hybrid method that combines filter-based feature selection with class balancing. The feature selection process incorporates Chi-Square (CS), Correlation-Based Feature Selection (CFS), and Correlation Matrix-Based Feature Selection (CMFS), which have been shown to be effective in reducing noisy and redundant attributes. Additionally, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to mitigate class imbalance in the dataset. The K-Nearest Neighbors (KNN) algorithm serves as the classification model because of its simplicity, non-parametric nature, and suitability for the feature subsets produced. Classification performance is evaluated with the Area Under the Curve (AUC) metric, and differences between methods are assessed at a significance threshold of 0.05. The proposed method achieved an AUC of 0.872, demonstrating its effectiveness in enhancing predictive performance. It was also superior to the other combinations tested, with each reported value falling below the 0.05 threshold: PSO SMOTE (0.0043), PSO SMOTE CS (0.0091), PSO SMOTE CFS (0.0111), and PSO SMOTE CFS CMFS (0.0007). These findings indicate that the proposed method significantly enhances the efficiency and accuracy of PSO in software defect prediction tasks. The hybrid strategy shows strong potential as a robust solution for future research and application in predictive software quality assurance.
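
To make the balancing and evaluation steps named in the abstract more concrete, the following is a minimal sketch in plain Python of SMOTE-style oversampling, a KNN score, and a rank-based AUC. It is not the authors' implementation: the PSO search and the CS, CFS, and CMFS filters are omitted, the function names (smote, knn_score, auc) are hypothetical, and the toy data is invented purely for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples (SMOTE-style).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: euclidean(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


def knn_score(train_X, train_y, x, k=3):
    """Fraction of the k nearest training points labelled defective (1)."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: euclidean(pair[0], x))
    return sum(label for _, label in ranked[:k]) / k


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: six non-defective (0) and three defective (1) modules.
majority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3], [0.9, 0.8], [1.3, 1.0]]
minority = [[3.0, 3.0], [3.2, 2.9], [2.8, 3.1]]

# Oversample the minority class until both classes have the same size.
synthetic = smote(minority, n_new=len(majority) - len(minority))
balanced_X = majority + minority + synthetic
balanced_y = [0] * len(majority) + [1] * (len(minority) + len(synthetic))

# Score held-out points with KNN and summarise the ranking with AUC.
test_X = [[1.0, 1.1], [3.1, 3.0], [0.9, 1.0], [2.9, 3.0]]
test_y = [0, 1, 0, 1]
scores = [knn_score(balanced_X, balanced_y, x) for x in test_X]
print("AUC on toy data:", auc(scores, test_y))
```

In this sketch the oversampling is applied only to the training data, so the held-out points used for the AUC remain real samples; whether the study follows the same split is not stated in the abstract.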

How to Cite

Febrian, M. M., Saputro, S. W., Saragih, T. H., Abadi, F., & Herteno, R. (2025). Hybrid Feature Selection and Balancing Data Approach for Improved Software Defect Prediction. Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics, 7(2), 232-244. https://doi.org/10.35882/ijeeemi.v7i2.67
